rdb_parallel execution.pdf

42
Extreme Performance with Oracle Database 11g and In-Memory Parallel Execution Jean-Pierre Dijcks Data Warehouse Product Management

Upload: fil-mango

Post on 03-Dec-2015

219 views

Category:

Documents


0 download

TRANSCRIPT

Extreme Performance with Oracle Database 11g and In-Memory Parallel Execution

Jean-Pierre DijcksData Warehouse Product Management

<Insert Picture Here>

Agenda

• Introduction to Parallel Execution

• Automatic Degree Of Parallelism

• Parallel Statement Queuing

• In Memory Parallel Execution

• Summary

<Insert Picture Here>

Introduction to Parallel Execution

How Parallel Execution works

User connects to the database

User

Background process is spawned

When user issues a parallel SQL statement the background process becomes the Query Coordinator

QC gets parallel servers from global pool and distributes the work to them

Parallel servers - individual sessions that perform work in parallel Allocated from a pool of globally available parallel server processes & assigned to a given operation

Parallel servers communicate among themselves & the QC using messages that are passed via memory buffers in the shared pool

Parallel Servers Parallel Servers do majority of the workdo majority of the work

Query CoordinatorQuery Coordinator

Parallel Execution Plan

IDID OperationOperation Name Name TQTQ ININ--OUTOUT PQ PQ DistributionDistribution

0 SELECT STATEMENT

1 PX COORDINATOR

2 PX SEND QC {RANDOM} Q1,01 P->S

3 HASH JOIN Q1,01 PCWP

4 PX RECEIVE Q1,01 PCWP

5 PX SEND BROADCAST Q1,01 P->P BROADCAST

6 PX BLOCK ITERATOR Q1,01 PCWP

7 TABLE ACCESS FULL CUSTOMERS Q1,01 PCWP

8 PX BLOCK ITERATOR Q1,01 PCWP

9 TABLE ACCESS FULL SALES Q1,01 PCWP

SELECT c.cust_name, s.purchase_date, s.amount

FROM sales s, customers c

WHERE s.cust_id = c.cust_id;

Parallel Execution of a Query

SELECT c.cust_name, s.date, s.amount

FROM sales s, customers c

WHERE s.cust_id = c.cust_id;

Producers

Consumers

ProducersProducersProducersProducers

ConsumersConsumersConsumersConsumers Query CoordinatorQuery Coordinator

Producers and Consumer in the execution plan

IDID OperationOperation Name Name TQTQ ININ--OUTOUT PQ PQ DistributionDistribution

0 SELECT STATEMENT

1 PX COORDINATOR

2 PX SEND QC {RANDOM} Q1,02 P->S

3 HASH JOIN Q1,02 PCWP

4 PX RECEIVE Q1,02 PCWP

5 PX SEND HASH Q1,00 P->P

6 PX BLOCK ITERATOR Q1,00 PCWP

7 TABLE ACCESS FULL CUSTOMERS Q1,00 PCWP

8 PX RECEIVE Q1,02 PCWP

9 PX SEND HASH Q1,01 P->P

10 PX BLOCK ITERATOR Q1,01 PCWP

11 TABLE ACCESS FULL SALES Q1,01 PCWP

Parallel Execution of a Scan

• Data is divided into Granules – block range or partition

• Each Parallel Server is assigned one or more Granules

• No two Parallel Servers ever contend for the same Granule

• Granules are assigned so that the load is balanced across all Parallel Servers

• Dynamic Granules chosen by the optimizer

• Granule decision is visible in execution plan

Full scan of the sales table

PQ 1

PQ 2

PQ 3

Identifying Granules of Parallelism during scans in the plan

Enabling Parallel Execution

There are three ways to enable parallel Execution1. Enable the table(s) for parallel execution:

alter table sales parallel ;alter table customers parallel ;

2. Use a parallel hintselect /*+ parallel(c) parallel(s) */c.state_province, sum(s.amount) revenuefrom customers c, sales swhere s.customer_id = c.idand s.purchase_date=to_date('01-JAN-2007','DD-MON-YYYY')and c.country = 'United States'group by c.state_province;

3. Use alter session force parallel query ;

Controlling Parallel Execution on RAC

Use RAC ServicesCreate two servicesSrvctl add service –d database_name

-s ETL-r sid1, sid2

Srvctl add service –d database_name-s AHOC-r sid3, sid4

ETL Ad-Hoc queries

Note:New Parameter to force a parallel statement to run on just node the query was issued on called PARALLEL_FORCE_LOCAL default FALSE

How could we enhance Parallel Execution?

Current Issues• Difficult to determine ideal DOP for each table without manual tuning• One DOP does not fit all queries touching an object• Not enough PX server processes can result in statement running serial• Too many PX server processes can thrash the system• Only uses IO resources

Solution• Oracle automatically decides if a statement

–Executes in parallel or not and what DOP it will use

–Can execute immediately or will be queued

–Will take advantage of aggregated cluster memory or not

<Insert Picture Here>

Automatic Degree of Parallelism

Automatic Degree of Parallelism

Business Requirement• Parallelism is completely manual

– Tuning typically required to determine ideal DOP– Generally reserved for well-defined workload(large SQL)– One DOP does not fit all queries touching an object

Solution• Oracle automatically decides if a statement

– Executes in parallel or not – What DOP the statement will use

Automatic Degree of Parallelism Auto DOP

• Statement with elapse times less than the threshold go serial

• Statement with elapse times greater than threshold are candidates for Parallel

• Optimizer derives the DOP for the statement based on resource requirements for all scans operations

• Applies to all types of statements Query, DML, or DDL• Explain plan has been enhanced to show DOP selected

How Auto Degree of Parallelism works

SQL statement

Statement is hard parsedAnd optimizer determines the execution plan

Statement executes in parallel

Actual DOP = MIN(PARALLEL_DEGREE_LIMIT, ideal DOP)

Statement executes serially

If estimated time less than threshold

Optimizer determines ideal DOP

If estimated time greater than threshold

Controlling Auto DOP

• Controlled by two init.ora parameters:– PARALLEL_DEGREE_POLICY

• Controls whether or not auto DOP will be used• Default is MANUAL which means no Auto DOP• Set to AUTO to enable auto DOP

– PARALLEL_MIN_TIME_THRESHOLD• Controls which statements are candidate for parallelism• Default is 10 seconds

– PARALLEL_DEGREE_LIMIT• Controls maximum DOP that can be used• Default is CPU meaning DEFAULT DOP

Explain plan enhancement for Auto DOP

Plan hash value: 2489314924ID Operation Name Rows Bytes Cost Time Pstart Pstop

0 Select Statement 96000 9889 5 00:00:01

1 PX COORDINATOR

2 PX SEND QC (RANDOM)

:TQ100 96000 9889 5 00:00:01

3 PX BLOCK ITERATOR 96000 9889 5 00:00:01

4 Table Access Full Sales 96000 9889 5 00:00:01 1 16

PLAN_TABLE_OUTPUT

Note- Computed Degree of Parallelism is 16 because of parallel threshold

<Insert Picture Here>

Parallel Statement Queuing

Parallel Statement Queuing

Business Requirement• With the introduction of Auto DOP

– More statements will run in parallel – Possible to exhaust all parallel execution server processes– Potential system thrashing due to too many processes

Solution• Parallel Statement Queuing

– Oracle automatically decides if a statement can execute immediately or not

– Prevents serializing parallel queries when parallel servers are not available

– Prevents system thrashing

Parallel Statement Queuing

• When a parallel statement starts checks if PX servers are available– Let it run if there are enough PX servers available– Queue the statement if there are not enough PX servers available

• Monitors RAC-wide availability of PX servers– Adaptive to dynamic environments

• Services– The service your session belongs to determines the limits on

queuing • Cluster reconfiguration

– Queue is aware of nodes leaving and joining the cluster and adjusts the limits accordingly

How Parallel Statement Queuing Works

SQL statements

Statement is parsedand oracle automatically determines DOP

If enough parallel servers available execute immediately

If not enough parallel servers available queue the statement

128163264

8

FIFO Queue

When the required number of parallel servers become available the first stmt on the queue is dequeued and executed

128

163264

Controlling Parallel Statement Queuing

• Enabled when PARALLEL_DEGREE_POLICY is set to AUTO• The Statement queue is enforced with a strict FIFO policy• PARALLEL_SERVER_TARGET indicates how many PX servers are

available to run queries before queuing kicks-in– Default values 4 X PARALLEL_THREADS_PER_CPU X CPU_COUNT– This a soft limit and does not replace PARALEL_MAX_SERVERS

PX server 1PX server 1-- 64 available 64 available

to run queries before to run queries before

queuing kicks inqueuing kicks in

Total PX servers availableTotal PX servers available

88

6464

160160

On an 8 CPU system

CPU Count

Parallel Server Target

Parallel Max Server

Controlling Parallel Statement Queuing

• Two new hints– To by-passes parallel statement queuing

SELECT /*+ NO_STMT_QUEUING */ col1 FROM foo;– To delay statement execution until resources are available without having

PARALLEL_DEGREE_POLICY is set to AUTOSELECT /*+ STMT_QUEUING */ col1 FROM foo;

• V$SQL_PLAN_MONITOR has a new status value for SQL that is queuedSELECT s.sql_id, s.sql_textFROM v$SQL_MONITOR m, v$SQL sWHERE m.status='QUEUED’AND m.sql_id = s.sql_id;

• Two new wait events– PX Queuing: Statement queue

• Indicates the first query in the queue– ENQ JX SQL statement queue

• All other queries in the queue wait on this enqueue

Monitoring Statement Queuing in EM

Awaiting screen shot from EMClock

symbol indicated a queued statement

Click on the SQL ID for more info

Monitoring Statement Queuing in EM

Wait event indicates stmt is at the head of the queue

Monitoring Statement Queuing in EM

Wait event indicates stmt is queued

<Insert Picture Here>

In-Memory Parallel Execution

In-Memory Parallel Execution

Business Requirement• Traditionally Parallel Execution takes advantage of

the IO capacity of a system• Disk speeds are not keeping up with Moore’s law

while CPU and Memory are

Solution• In-Memory Parallel Execution harness the memory

capacity of the entire system• Scan data nearly 10 X faster than scanning from disk

Prior to Oracle Database 11gR2 Parallel Execution and the buffer cache

Prior to Oracle Database 11gR2 Parallel Execution and the buffer cache

How In-Memory Parallel Execution Works

• Detect if the object fits in the aggregated buffer cache of the cluster– If so, distribute & affinitizes the blocks among the nodes and make

PQ aware of the affinity– If not, continue to by-pass the buffer cache and read directly from

disk

• Subsequent access to the object will be conducted only by PX servers on the node to each the data was affinitized

In-Memory Parallel Execution

How In-Memory Parallel Execution Works in detail

• Decision to use the buffer cache is based on set of heuristics including– Ratio between buffer cache size and object size– Frequency at which the object is accessed– How much the object changes between accesses

• In RAC fragments of the object are affinitized in the buffer cache on each of the active instances– Affinity is based on FileNumber and ExtentNumber unless hash

partitioned– Automatically prevents multiple instances reading the same data

from disk– Only PX process on the same RAC node can access each of the

fragments of the object

How In-Memory Parallel Execution Works

SQL statement

Determine the size of the table being looked at

Table is extremely Large

Always use direct read from disk

Table is a good candidate for In-Memory Parallel Execution

Fragments of Table are read into each node’s buffer cache

Only parallel server on the same RAC node will access each fragment

Read into the buffer cache on any node

Table is extremely small

Controlling In-Memory Parallel Execution

• Controlled by PARALLEL_DEGREE_POLICY– Active only when set to AUTO– No way to turn it off

<Insert Picture Here>

Summary

When to Introduce Auto Parallel Execution

PARALLEL_DEGREE_POLICY has three possible modes• Manual

– As before, DBA must manually specify all aspects of parallelism– No AUTO DOP, Stmt Queuing, In-Memory Parallel Execution– Useful for well-understood existing applications

• Limited– Restricted AUTO DOP for queries with tables decorated with default

PARALLEL – No Stmt Queuing, In-Memory Parallel Execution – Useful in a mixed-world environment when a limited number of

statements would benefit from parallel execution• Auto

– All qualified statements subject to executing in parallel– Statements can be queued– IN-memory PQ available– Useful when deploying new applications in 11g that would benefit from

parallel execution

New Parallel Execution Init.ora Parameters

Parameter Value Description

PARALLEL_DEGREE_LIMIT CPU Max DOP that can be selected with AUTO DOP

PARALLEL_DEGREE_POLICY MANUAL Specifies if AUTO DOP, Queuing, & In-memory PE will be enabled

PARALLEL_FORCE_LOCAL FALSE Restricts parallel server processes to the node where query is issued

PARALLEL_MIN_TIME_THRESHOLD AUTO Specifies min execution time a statement should have before AUTO DOP will kick in

PARALLEL_SERVERS_TARGET 4*CPU_COUNT* PARALLEL_THREAD

S_PER_CPU * ACTIVE_INSTANCES

Specifies # of parallel processes allowed to run parallel stmts before queuing will be use

PARALLEL_EXECUTION_MESSAGE_SIZE 16KB Specifies size of the message buffers used for communication

For More Information

search.oracle.com

Parallel Execution

Orhttp://www.oracle.com/technology/products/bi/db/11g/pdf/twp_parallel_execution_fundamentals_11gr2.pdf

Q & A