managing statistics for optimal query performance

119
Managing Statistics for Optimal Query Performance Karen Morton [email protected] OOW 2009 2009 October 13 1:00pm-2:00pm Moscone South Room 305

Upload: karen-morton

Post on 27-Jan-2015

4.202 views

Category:

Technology


0 download

DESCRIPTION

Half the battle of writing good SQL is in understanding how the Oracle query optimizer analyzes your code and applies statistics in order to derive the “best” execution plan. The other half of the battle is successfully applying that knowledge to the databases that you manage. The optimizer uses statistics as input to develop query execution plans, and so these statistics are the foundation of good plans. If the statistics supplied aren’t representative of your actual data, you can expect bad plans. However, if the statistics are representative of your data, then the optimizer will probably choose an optimal plan.

TRANSCRIPT

Page 1: Managing Statistics for Optimal Query Performance

Managing Statistics for Optimal Query Performance

Karen [email protected]

OOW 2009

2009 October 13

1:00pm-2:00pm

Moscone South Room 305

Page 2: Managing Statistics for Optimal Query Performance

Your speaker…

•Karen Morton

– Sr. Principal Database Engineer

– Educator, DBA, developer, consultant, researcher, author, speaker, …

•Come see me…

– karenmorton.blogspot.com

– An Oracle user group near you

Page 3: Managing Statistics for Optimal Query Performance

Mathor

Magic ?

Page 4: Managing Statistics for Optimal Query Performance

“I accept no responsibility forstatistics, which are a form of magic beyond my comprehension.”

— Robertson Davies

Page 5: Managing Statistics for Optimal Query Performance

SQL>desc deck

Name Null? Type

------------- -------- -------------

SUIT NOT NULL VARCHAR2(10)

CARD VARCHAR2(10)

COLOR VARCHAR2(5)

FACEVAL NOT NULL NUMBER(2)

Page 6: Managing Statistics for Optimal Query Performance

Table: DECK

Statistic Current value

--------------- -------------------

# rows 52

Blocks 5

Avg Row Len 20

Degree 1

Sample Size 52

Column Name NDV Nulls # Nulls Density Length Low Value High Value

----------- --- ----- ------- ------- ------ ---------- -----------

SUIT 4 N 0 .250000 8 Clubs Spades

CARD 13 Y 0 .076923 5 Ace Two

COLOR 2 Y 0 .500000 5 Black Red

FACEVAL 13 N 0 .076923 3 1 13

Index Name Col# Column Name Unique? Height Leaf Blks Distinct Keys

-------------- ----- ------------ ------- ------ ---------- -------------

DECK_PK 1 SUIT Y 1 1 52

2 FACEVAL

DECK_CARD_IDX 1 CARD N 1 1 13

DECK_COLOR_IDX 1 COLOR N 1 1 2

Page 7: Managing Statistics for Optimal Query Performance

Cardinality

The estimated number of rows

a query is expected to return.

number of rows in table

x

predicate selectivity

Page 8: Managing Statistics for Optimal Query Performance

select *

from deck

order by suit, faceval ;

Cardinality

52 x 1 = 52

Page 9: Managing Statistics for Optimal Query Performance

SQL>select * from deck order by suit, faceval ;

52 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 3142028678

----------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost |

----------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 52 | 1040 | 2|

| 1 | TABLE ACCESS BY INDEX ROWID| DECK | 52 | 1040 | 2|

| 2 | INDEX FULL SCAN | DECK_PK | 52 | | 1|

----------------------------------------------------------------------

*

Page 10: Managing Statistics for Optimal Query Performance

select *

from deck

where color = 'Black' ;

Cardinality

52 x 1/2 = 26

Page 11: Managing Statistics for Optimal Query Performance

SQL>select * from deck where color = 'Black' ;

26 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 1366616955

----------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost |

----------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 26 | 520 | 2|

| 1 | TABLE ACCESS BY INDEX ROWID| DECK | 26 | 520 | 2|

|* 2 | INDEX RANGE SCAN | DECK_COLOR_ID | 26 | | 1|

----------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - access("COLOR"='Black')

Page 12: Managing Statistics for Optimal Query Performance

select *

from deck

where card = 'Ace'

and suit = 'Spades' ;

Cardinality

52 x 1/13 x 1/4 = 1

Page 13: Managing Statistics for Optimal Query Performance

SQL>select *

2 from deck

3 where card = 'Ace'

4 and suit = 'Spades' ;

1 row selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 2030372774

----------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost |

----------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 1 | 20 | 2|

|* 1 | TABLE ACCESS BY INDEX ROWID| DECK | 1 | 20 | 2|

|* 2 | INDEX RANGE SCAN | DECK_CARD_IDX | 4 | | 1|

----------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

1 - filter("SUIT"='Spades')

2 - access("CARD"='Ace')

Page 14: Managing Statistics for Optimal Query Performance

select *

from deck

where faceval > 10 ;

52 x

High Value - Predicate Value

High Value - Low Value

(13 – 10)

(13 – 1)

Cardinality

= 13

Page 15: Managing Statistics for Optimal Query Performance

SQL>select *

2 from deck

3 where faceval > 10 ;

12 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 1303963799

---------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost |

---------------------------------------------------------

| 0 | SELECT STATEMENT | | 13 | 260 | 3|

|* 1 | TABLE ACCESS FULL| DECK | 13 | 260 | 3|

---------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

1 - filter("FACEVAL">10)

Page 16: Managing Statistics for Optimal Query Performance

select *

from deck

where card = 'Ace' ;

Cardinality

52 x 1/13 = 4

Page 17: Managing Statistics for Optimal Query Performance

SQL>select * from deck where card = :b1 ;

4 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 2030372774

----------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost |

----------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 4 | 80 | 2|

| 1 | TABLE ACCESS BY INDEX ROWID| DECK | 4 | 80 | 2|

|* 2 | INDEX RANGE SCAN | DECK_CARD_IDX | 4 | | 1|

----------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - access("CARD"=:B1)

Page 18: Managing Statistics for Optimal Query Performance

Mathor

Magic ?

Maybe it's a little bit of both!

Page 19: Managing Statistics for Optimal Query Performance

What's the best methodfor collecting statistics?

Page 20: Managing Statistics for Optimal Query Performance

It depends.

Page 21: Managing Statistics for Optimal Query Performance

Statistics that don't reasonably describe your data

Page 22: Managing Statistics for Optimal Query Performance

…lead to poor cardinality estimates

Page 23: Managing Statistics for Optimal Query Performance

…which leads to poor access path selection

Page 24: Managing Statistics for Optimal Query Performance

…which leads to poor join method selection

Page 25: Managing Statistics for Optimal Query Performance

…which leads to poor join order selection

Page 26: Managing Statistics for Optimal Query Performance

…which leads to poor SQL execution times.

Page 27: Managing Statistics for Optimal Query Performance

Statistics matter!

Page 28: Managing Statistics for Optimal Query Performance

Automatic

Manual

vs

Page 29: Managing Statistics for Optimal Query Performance

AutomaticCollections

Page 30: Managing Statistics for Optimal Query Performance

Objects must changeby at least 10%

Page 31: Managing Statistics for Optimal Query Performance

Collection scheduledduring nightly

maintenance window

Page 32: Managing Statistics for Optimal Query Performance

dbms_statsgather_database_stats_job_proc

Page 33: Managing Statistics for Optimal Query Performance

Prioritizes collectionin order by objects

which most need updating

Page 34: Managing Statistics for Optimal Query Performance

Most functional whendata changes at a slow

to moderate rate

Page 35: Managing Statistics for Optimal Query Performance

Volatile tables andlarge bulk loads

are good candidatesfor manual collection

Page 36: Managing Statistics for Optimal Query Performance

Automaticdoesn't mean

accurate(for your data)

Page 37: Managing Statistics for Optimal Query Performance

AutomaticCollectionDefaults

Page 38: Managing Statistics for Optimal Query Performance

SQL>exec dbms_stats.gather_table_stats (ownname=>?, tabname=>?) ;

partname NULL

cascade DBMS_STATS.AUTO_CASCADE

estimate_percent DBMS_STATS.AUTO_SAMPLE_SIZE

stattab NULL

block_sample FALSE

statid NULL

method_opt FOR ALL COLUMNS SIZE AUTO

statown NULL

degree 1 or value based on number of CPUs

and initialization parameters

force FALSE

granularity AUTO (value is based on partitioning type)

no_invalidate DBMS_STATS.AUTO_INVALIDATE

Page 39: Managing Statistics for Optimal Query Performance

cascade=>

AUTO_CASCADE

Allow Oracle to determinewhether or not to

gather index statistics

Page 40: Managing Statistics for Optimal Query Performance

estimate_percent=>

AUTO_SAMPLE_SIZE

Allow Oracle to determinesample size

Page 41: Managing Statistics for Optimal Query Performance

method_opt=>

FOR ALL COLUMNS

SIZE AUTO

Allow Oracle to determine when to gather histogram statistics

SYS.COL_USAGE$

Page 42: Managing Statistics for Optimal Query Performance

no_invalidate=>

AUTO_INVALIDATE

Allow Oracle to determine when to invalidate dependent cursors

Page 43: Managing Statistics for Optimal Query Performance

Goal

Collect statistics that are"good enough" to meet

most needs most of the time.

Page 44: Managing Statistics for Optimal Query Performance

Say you were standing with onefoot in the oven and one foot inan ice bucket. According to the percentage people, you would

be perfectly comfortable.– Bobby Bragan

Page 45: Managing Statistics for Optimal Query Performance

Collections

Manual

Page 46: Managing Statistics for Optimal Query Performance

dbms_statsgather_*_stats

* = database, schema, table, index, etc.

Page 47: Managing Statistics for Optimal Query Performance

Is it common for your users to get slammed with performance problems shortly after statistics are updated?

Page 48: Managing Statistics for Optimal Query Performance

Does performance decline before a 10% data change occurs?

Page 49: Managing Statistics for Optimal Query Performance

Do low and high values for a column change significantly between automatic collections?

Page 50: Managing Statistics for Optimal Query Performance

Does your application performance seem "sensitive" to changing user counts as well as data volume changes?

Page 51: Managing Statistics for Optimal Query Performance

If you answered "Yes"to one or more ofthese questions...

Page 52: Managing Statistics for Optimal Query Performance

your application'sunique needs may

be best served withmanual collection.

Page 53: Managing Statistics for Optimal Query Performance

Test. Test. Test.

Page 54: Managing Statistics for Optimal Query Performance

DynamicSampling

Page 55: Managing Statistics for Optimal Query Performance

optimizer_dynamic_samplingparameter

dynamic_samplinghint

Page 56: Managing Statistics for Optimal Query Performance

SQL>create table depend_test as

2 select mod(num, 100) c1,

3 mod(num, 100) c2,

4 mod(num, 75) c3,

5 mod(num, 30) c4

6 from (select level num from dual

7 connect by level <= 10001);

Table created.

SQL>exec dbms_stats.gather_table_stats( user, 'depend_test',

estimate_percent => null, method_opt => 'for all columns size

1');

PL/SQL procedure successfully completed.

Page 57: Managing Statistics for Optimal Query Performance

Statistic Current value

--------------- --------------

# rows 10001

Blocks 28

Avg Row Len 11

Sample Size 10001

Monitoring YES

Column NDV Density AvgLen Histogram LowVal HighVal

------- --- ------- ------ --------- ------ -------

C1 100 .010000 3 NONE (1) 0 99

C2 100 .010000 3 NONE (1) 0 99

C3 75 .013333 3 NONE (1) 0 74

C4 30 .033333 3 NONE (1) 0 29

Page 58: Managing Statistics for Optimal Query Performance

SQL>set autotrace traceonly explain

SQL>select count(*) from depend_test where c1 = 10;

Execution Plan

----------------------------------------------------------

Plan hash value: 3984367388

----------------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |

----------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 1 | 3 | 8 (0)| 00:00:01 |

| 1 | SORT AGGREGATE | | 1 | 3 | | |

|* 2 | TABLE ACCESS FULL| DEPEND_TEST | 100 | 300 | 8 (0)| 00:00:01 |

----------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - filter("C1"=10)

Page 59: Managing Statistics for Optimal Query Performance

SQL>set autotrace traceonly explain

SQL>select count(*) from depend_test where c1 = 10 and c2 = 10;

Execution Plan

----------------------------------------------------------

Plan hash value: 3984367388

----------------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |

----------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 1 | 6 | 8 (0)| 00:00:01 |

| 1 | SORT AGGREGATE | | 1 | 6 | | |

|* 2 | TABLE ACCESS FULL| DEPEND_TEST | 1 | 6 | 8 (0)| 00:00:01 |

----------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - filter("C1"=10 AND "C2"=10)

Page 60: Managing Statistics for Optimal Query Performance

SQL>set autotrace traceonly explain

SQL>select /*+ dynamic_sampling (4) */ count(*)

2 from depend_test where c1 = 10 and c2 = 10;

Execution Plan

----------------------------------------------------------

Plan hash value: 3984367388

----------------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |

----------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 1 | 6 | 8 (0)| 00:00:01 |

| 1 | SORT AGGREGATE | | 1 | 6 | | |

|* 2 | TABLE ACCESS FULL| DEPEND_TEST | 100 | 600 | 8 (0)| 00:00:01 |

----------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - filter("C1"=10 AND "C2"=10)

Note

-----

- dynamic sampling used for this statement

Page 61: Managing Statistics for Optimal Query Performance

11g Extended Statistics

Page 62: Managing Statistics for Optimal Query Performance

SQL> select dbms_stats.create_extended_stats(ownname=>user,

2 tabname => 'DEPEND_TEST',

3 extension => '(c1, c2)' ) AS c1_c2_correlation

4 from dual ;

C1_C2_CORRELATION

-------------------------------------------------------------

SYS_STUF3GLKIOP5F4B0BTTCFTMX0W

SQL> exec dbms_stats.gather_table_stats( user, 'depend_test');

PL/SQL procedure successfully completed.

Page 63: Managing Statistics for Optimal Query Performance

SQL> set autotrace traceonly explain

SQL> select count(*) from depend_test where c1 = 10 and c2 = 10;

Execution Plan

----------------------------------------------------------

Plan hash value: 3984367388

----------------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |

----------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 1 | 6 | 9 (0)| 00:00:01 |

| 1 | SORT AGGREGATE | | 1 | 6 | | |

|* 2 | TABLE ACCESS FULL| DEPEND_TEST | 100 | 600 | 9 (0)| 00:00:01 |

----------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - filter("C1"=10 AND "C2"=10)

Page 64: Managing Statistics for Optimal Query Performance

SettingStatistics

Page 65: Managing Statistics for Optimal Query Performance

dbms_statsset_column_statsset_index_statsset_table_stats

Page 66: Managing Statistics for Optimal Query Performance

It's OK.

Really.

Page 67: Managing Statistics for Optimal Query Performance

Why guess(i.e. gather stats)when you know!

Page 68: Managing Statistics for Optimal Query Performance

CommonPerformance

Problems

Page 69: Managing Statistics for Optimal Query Performance

3 areas where

non-representativestatistics cause problems

Page 70: Managing Statistics for Optimal Query Performance

Data Skew

1

Page 71: Managing Statistics for Optimal Query Performance

The optimizer assumesuniform distribution

of column values.

Page 72: Managing Statistics for Optimal Query Performance

Color column - uniform distribution

Page 73: Managing Statistics for Optimal Query Performance

Color column – skewed distribution

Page 74: Managing Statistics for Optimal Query Performance

Data skew must beidentified witha histogram.

Page 75: Managing Statistics for Optimal Query Performance

Table: obj_tab

Statistic Current value

--------------- --------------

# rows 1601874

Blocks 22321

Avg Row Len 94

Sample Size 1601874

Monitoring YES

Column: object_type (has 36 distinct values)

OBJECT_TYPE PCT_TOTAL

------------------------------- ---------

WINDOW GROUP - PROGRAM .00-.02

EVALUATION CONTEXT - XML SCHEMA .03-.05

OPERATOR - PROCEDURE .11-.17

LIBRARY - TYPE BODY .30-.35

FUNCTION - INDEX PARTITION .54-.64

JAVA RESOURCE - PACKAGE 1.54-1.69

TABLE - VIEW 3.44-7.35

JAVA CLASS 32.80

SYNONYM 40.01

100% Statistics

FOR ALL COLUMNS SIZE 1

Page 76: Managing Statistics for Optimal Query Performance

PLAN_TABLE_OUTPUT

--------------------------------------------------------------------------------

SQL_ID 16yy3p8sstr28, child number 0

-------------------------------------

select /*+ gather_plan_statistics */ owner, object_name, object_type, object_id,

status from obj_tab where object_type = 'PROCEDURE'

Plan hash value: 2862749165

--------------------------------------------------------------------------------

| Id | Operation | Name | E-Rows | A-Rows | Buffers |

--------------------------------------------------------------------------------

| 1 | TABLE ACCESS BY INDEX ROWID| OBJ_TAB | 44497 | 2720 | 1237 |

|* 2 | INDEX RANGE SCAN | OBJ_TYPE_IDX | 44497 | 2720 | 193 |

--------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - access("OBJECT_TYPE"='PROCEDURE')

R = .06 seconds

E-Rows = 1/36 x 1,601,874

Page 77: Managing Statistics for Optimal Query Performance

PLAN_TABLE_OUTPUT

--------------------------------------------------------------------------------

SQL_ID 9u6ppkh5mhr8v, child number 0

-------------------------------------

select /*+ gather_plan_statistics */ owner, object_name, object_type, object_id,

status from obj_tab where object_type = 'SYNONYM'

Plan hash value: 2862749165

--------------------------------------------------------------------------------

| Id | Operation | Name | E-Rows | A-Rows | Buffers |

--------------------------------------------------------------------------------

| 1 | TABLE ACCESS BY INDEX ROWID| OBJ_TAB | 44497 | 640K| 104K|

|* 2 | INDEX RANGE SCAN | OBJ_TYPE_IDX | 44497 | 640K| 44082 |

--------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - access("OBJECT_TYPE"='SYNONYM')

R = 14.25 seconds

E-Rows = 1/36 x 1,601,874

Page 78: Managing Statistics for Optimal Query Performance

Re-collect statistics

100%

FOR ALL COLUMNS SIZE AUTO

Page 79: Managing Statistics for Optimal Query Performance

PLAN_TABLE_OUTPUT

--------------------------------------------------------------------------------

SQL_ID 16yy3p8sstr28, child number 0

-------------------------------------

select /*+ gather_plan_statistics */ owner, object_name, object_type, object_id,

status from obj_tab where object_type = 'PROCEDURE'

Plan hash value: 2862749165

--------------------------------------------------------------------------------

| Id | Operation | Name | E-Rows | A-Rows | Buffers |

--------------------------------------------------------------------------------

| 1 | TABLE ACCESS BY INDEX ROWID| OBJ_TAB | 2720 | 2720 | 1237 |

|* 2 | INDEX RANGE SCAN | OBJ_TYPE_IDX | 2720 | 2720 | 193 |

--------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - access("OBJECT_TYPE"='PROCEDURE')

R = .07 seconds

E-Rows = histogram x 1,601,874

Page 80: Managing Statistics for Optimal Query Performance

PLAN_TABLE_OUTPUT

--------------------------------------------------------------------------------

SQL_ID 9u6ppkh5mhr8v, child number 0

-------------------------------------

select /*+ gather_plan_statistics */ owner, object_name, object_type, object_id,

status from obj_tab where object_type = 'SYNONYM'

Plan hash value: 2748991475

-----------------------------------------------------------------

| Id | Operation | Name | E-Rows | A-Rows | Buffers |

-----------------------------------------------------------------

|* 1 | TABLE ACCESS FULL| OBJ_TAB | 640K| 640K| 64263 |

-----------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

1 - filter("OBJECT_TYPE"='SYNONYM')

R = 3.36 seconds

E-Rows = histogram x 1,601,874

vs 14.25 seconds

Page 81: Managing Statistics for Optimal Query Performance

Histograms areimportant for more

reasons than justhelping determine

the access method.

Page 82: Managing Statistics for Optimal Query Performance

2

Bind Peeking

Page 83: Managing Statistics for Optimal Query Performance

During hard parse, the optimizer"peeks" at the bind value and

uses it to determine the execution plan.

Page 84: Managing Statistics for Optimal Query Performance

But, what if yourdata is skewed?

11g

Page 85: Managing Statistics for Optimal Query Performance

SQL> variable objtype varchar2(19)

SQL> exec :objtype := 'PROCEDURE';

PL/SQL procedure successfully completed.

SQL> select /*+ gather_plan_statistics */ count(*) ct

2 from big_tab

3 where object_type = :objtype ;

CT

---------------

4416

1 row selected.

SQL>

SQL> select * from table

(dbms_xplan.display_cursor('211078a9adzak',0,'ALLSTATS LAST'));

Page 86: Managing Statistics for Optimal Query Performance

PLAN_TABLE_OUTPUT

---------------------------------------

SQL_ID 211078a9adzak, child number 0

-------------------------------------

select /*+ gather_plan_statistics */ count(*) ct from big_tab where

object_type = :objtype

Plan hash value: 154074842

-------------------------------------------------------------------------

| Id | Operation | Name | E-Rows | A-Rows | Buffers |

-------------------------------------------------------------------------

| 1 | SORT AGGREGATE | | 1 | 1 | 16 |

|* 2 | INDEX RANGE SCAN| BIG_OBJTYPE_IDX | 4416 | 4416 | 16 |

-------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - access("OBJECT_TYPE"=:OBJTYPE)

Page 87: Managing Statistics for Optimal Query Performance

SQL> select child_number, executions, buffer_gets,

2 is_bind_sensitive, is_bind_aware, is_shareable

3 from v$sql where sql_id = '211078a9adzak' ;

CHILD_NUMBER = 0

EXECUTIONS = 1

BUFFER_GETS = 16

IS_BIND_SENSITIVE = N

IS_BIND_AWARE = N

IS_SHAREABLE = Y

Page 88: Managing Statistics for Optimal Query Performance

SQL> variable objtype varchar2(19)

SQL> exec :objtype := 'SYNONYM';

PL/SQL procedure successfully completed.

SQL> select /*+ gather_plan_statistics */ count(*) ct

2 from big_tab

3 where object_type = :objtype ;

CT

----------------

854176

1 row selected.

SQL>

SQL> select * from table

(dbms_xplan.display_cursor('211078a9adzak',0,'ALLSTATS LAST'));

Page 89: Managing Statistics for Optimal Query Performance

PLAN_TABLE_OUTPUT

---------------------------------------

SQL_ID 211078a9adzak, child number 0

-------------------------------------

select /*+ gather_plan_statistics */ count(*) ct from big_tab where

object_type = :objtype

Plan hash value: 154074842

-------------------------------------------------------------------------

| Id | Operation | Name | E-Rows | A-Rows | Buffers |

-------------------------------------------------------------------------

| 1 | SORT AGGREGATE | | 1 | 1 | 2263 |

|* 2 | INDEX RANGE SCAN| BIG_OBJTYPE_IDX | 4416 | 854K | 2263 |

-------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - access("OBJECT_TYPE"=:OBJTYPE)

Page 90: Managing Statistics for Optimal Query Performance

SQL> select child_number, executions, buffer_gets,

2 is_bind_sensitive, is_bind_aware, is_shareable

3 from v$sql where sql_id = '211078a9adzak' ;

CHILD_NUMBER = 0

EXECUTIONS = 2

BUFFER_GETS = 2279 (2263 + 16)

IS_BIND_SENSITIVE = Y

IS_BIND_AWARE = N

IS_SHAREABLE = Y

Page 91: Managing Statistics for Optimal Query Performance

SQL> variable objtype varchar2(19)

SQL> exec :objtype := 'SYNONYM';

PL/SQL procedure successfully completed.

PLAN_TABLE_OUTPUT

---------------------------------------

SQL_ID 211078a9adzak, child number 1

-------------------------------------

select /*+ gather_plan_statistics */ count(*) ct from big_tab where

object_type = :objtype

Plan hash value: 1315022418

-----------------------------------------------------------------------------

| Id | Operation | Name | E-Rows | A-Rows | Buffers |

-----------------------------------------------------------------------------

| 1 | SORT AGGREGATE | | 1 | 1 | 6016 |

|* 2 | INDEX FAST FULL SCAN| BIG_OBJTYPE_IDX | 854K | 854K | 6016 |

-----------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - access("OBJECT_TYPE"=:OBJTYPE)

Page 92: Managing Statistics for Optimal Query Performance

SQL> select child_number, executions, buffer_gets,

2 is_bind_sensitive, is_bind_aware, is_shareable

3 from v$sql where sql_id = '211078a9adzak' ;

CHILD_NUMBER = 0

EXECUTIONS = 2

BUFFER_GETS = 2279

IS_BIND_SENSITIVE = Y

IS_BIND_AWARE = N

IS_SHAREABLE = N

CHILD_NUMBER = 1

EXECUTIONS = 1

BUFFER_GETS = 6016

IS_BIND_SENSITIVE = Y

IS_BIND_AWARE = Y

IS_SHAREABLE = Y

Page 93: Managing Statistics for Optimal Query Performance

SQL> variable objtype varchar2(19)

SQL> exec :objtype := 'PROCEDURE';

PL/SQL procedure successfully completed.

PLAN_TABLE_OUTPUT

---------------------------------------

SQL_ID 211078a9adzak, child number 2

-------------------------------------

select /*+ gather_plan_statistics */ count(*) ct from big_tab where

object_type = :objtype

Plan hash value: 154074842

-------------------------------------------------------------------------

| Id | Operation | Name | E-Rows | A-Rows | Buffers |

-------------------------------------------------------------------------

| 1 | SORT AGGREGATE | | 1 | 1 | 16 |

|* 2 | INDEX RANGE SCAN| BIG_OBJTYPE_IDX | 4416 | 4416 | 16 |

-------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - access("OBJECT_TYPE"=:OBJTYPE)

Page 94: Managing Statistics for Optimal Query Performance

SQL> select child_number, executions, buffer_gets,

2 is_bind_sensitive, is_bind_aware, is_shareable

3 from v$sql where sql_id = '211078a9adzak' ;

CHILD_NUMBER = 0

EXECUTIONS = 2

BUFFER_GETS = 2279

IS_BIND_SENSITIVE = Y

IS_BIND_AWARE = N

IS_SHAREABLE = N

CHILD_NUMBER = 1

EXECUTIONS = 1

BUFFER_GETS = 6016

IS_BIND_SENSITIVE = Y

IS_BIND_AWARE = Y

IS_SHAREABLE = Y

CHILD_NUMBER = 2

EXECUTIONS = 1

BUFFER_GETS = 16

IS_BIND_SENSITIVE = Y

IS_BIND_AWARE = Y

IS_SHAREABLE = Y

Page 95: Managing Statistics for Optimal Query Performance

10g will create only 1 plan.

11g will create plans as neededto cover data skew.

Page 96: Managing Statistics for Optimal Query Performance

Handling bind peekingis more of a coding issue

than a statistics issue.

Page 97: Managing Statistics for Optimal Query Performance

Incorrect

High and Low

Values

3

Page 98: Managing Statistics for Optimal Query Performance

To derive the cardinality estimate for range predicates,

the optimizer uses the low and high value statistics.

Page 99: Managing Statistics for Optimal Query Performance

Table: hi_lo_t

Statistic Current value

--------------- ---------------------

# rows 100000

Blocks 180

Avg Row Len 7

Sample Size 100000

Monitoring YES

Column NDV Nulls Density AvgLen Histogram LowVal HighVal

------- ------ ----- ------- ------ --------- ------ -------

A 100000 N .000010 5 NONE (1) 10 100009

B 10 Y .100000 3 NONE (1) 9 18

100% Statistics

FOR ALL COLUMNS SIZE 1

Page 100: Managing Statistics for Optimal Query Performance

select count(a)

from hi_lo_t

where b < 11 ;

11 – 9

18 – 9

100000 rows x .22222 = 22222

Predicate value – Low value

High value – Low value( )

Page 101: Managing Statistics for Optimal Query Performance

select count(a) from hi_lo_t where b < 11

Plan hash value: 3307858660

------------------------------------------------------------------

| Id | Operation | Name | E-Rows | A-Rows | Buffers |

------------------------------------------------------------------

| 1 | SORT AGGREGATE | | 1 | 1 | 184 |

|* 2 | TABLE ACCESS FULL| HI_LO_T | 22222 | 20000 | 184 |

------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - filter("B"<11)

Page 102: Managing Statistics for Optimal Query Performance

select count(a)

from hi_lo_t

where b < 4 ;

4 – 9

18 – 9

100000 rows x .04444 = 4444

Predicate value – Low value

High value – Low value( ).10 x 1 +

Page 103: Managing Statistics for Optimal Query Performance

select count(a) from hi_lo_t where b < 4

Plan hash value: 3307858660

------------------------------------------------------------------

| Id | Operation | Name | E-Rows | A-Rows | Buffers |

------------------------------------------------------------------

| 1 | SORT AGGREGATE | | 1 | 1 | 184 |

|* 2 | TABLE ACCESS FULL| HI_LO_T | 4444 | 0 | 184 |

------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - filter("B"<4)

Page 104: Managing Statistics for Optimal Query Performance

METHOD_OPT=>

'FOR ALL INDEXED COLUMNS'

Be cautious of using this!

Page 105: Managing Statistics for Optimal Query Performance

If column is not indexed,no statistics are collected.

Page 106: Managing Statistics for Optimal Query Performance

select count(a) from hi_lo_t where b = 12

Plan hash value: 3307858660

------------------------------------------------------------------

| Id | Operation | Name | E-Rows | A-Rows | Buffers |

------------------------------------------------------------------

| 1 | SORT AGGREGATE | | 1 | 1 | 184 |

|* 2 | TABLE ACCESS FULL| HI_LO_T | 1000 | 10000 | 184 |

------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - filter("B"=12)

Without statistics, a 10% default is used.

Page 107: Managing Statistics for Optimal Query Performance

Result:

Cardinality estimates that are orders of magnitude "off"

Page 108: Managing Statistics for Optimal Query Performance

Conclusion

Page 109: Managing Statistics for Optimal Query Performance

Why guess when you can know.

Page 110: Managing Statistics for Optimal Query Performance

Thoroughly test anddocument your

statistics collectionstrategy.

Page 111: Managing Statistics for Optimal Query Performance

Check default optionsparticularly when

upgrading.

Things change.

Page 112: Managing Statistics for Optimal Query Performance

Regularly checkstatistics and compareto previous collections

for any anomalies.

10.2.0.4 and abovedbms_stats.diff_table_stats_*

Page 113: Managing Statistics for Optimal Query Performance

Don't ignoreyour data.

Page 114: Managing Statistics for Optimal Query Performance

There is nosingle strategy

that works bestfor everyone.

Page 115: Managing Statistics for Optimal Query Performance

Statistics must reasonablyrepresent your actual data.

Page 116: Managing Statistics for Optimal Query Performance

Understanding basicoptimizer statistics

computations is key.

Page 117: Managing Statistics for Optimal Query Performance

The more you know,the more likely you

are to succeed.

Page 118: Managing Statistics for Optimal Query Performance

Thank You!

Page 119: Managing Statistics for Optimal Query Performance

Q U E S T I O N S

A N S W E R S