www.db-nemec.com j.nemecare your statistics bad enough?1 verify the effectiveness of gathering...

33
J.Nemec Are Your Statistics Bad Enough? 1 www.db-nemec.com Are Your Statistics Bad Enough? Verify the effectiveness of gathering optimizer statistics Jaromir D.B. Nemec UKOUG 2005 27.10.2005

Upload: martha-turner

Post on 29-Jan-2016

218 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Www.db-nemec.com J.NemecAre Your Statistics Bad Enough?1 Verify the effectiveness of gathering optimizer statistics Jaromir D.B. Nemec UKOUG 2005 27.10.2005

J.Nemec Are Your Statistics Bad Enough? 1

www.db-nemec.com

Are Your Statistics Bad Enough?

Verify the effectiveness of gathering optimizer statistics

Jaromir D.B. Nemec

UKOUG 2005

27.10.2005

Page 2: Www.db-nemec.com J.NemecAre Your Statistics Bad Enough?1 Verify the effectiveness of gathering optimizer statistics Jaromir D.B. Nemec UKOUG 2005 27.10.2005

J.Nemec Are Your Statistics Bad Enough? 2

www.db-nemec.com

Gathering Statistics

The way from ANALYZE TABLE to DBMS_STATS

• Increasing number of parameters and options • changing defaults • CBO more intelligent• a “small change” can have a big effect

. . . and all that stuff only to get some redundant information

Page 3: Www.db-nemec.com J.NemecAre Your Statistics Bad Enough?1 Verify the effectiveness of gathering optimizer statistics Jaromir D.B. Nemec UKOUG 2005 27.10.2005

J.Nemec Are Your Statistics Bad Enough? 3

www.db-nemec.com

Complexity Trap

In the chain of dependent objects the quantity is increasing

Page 4: Www.db-nemec.com J.NemecAre Your Statistics Bad Enough?1 Verify the effectiveness of gathering optimizer statistics Jaromir D.B. Nemec UKOUG 2005 27.10.2005

J.Nemec Are Your Statistics Bad Enough? 4

www.db-nemec.com

Are My Statistics Adequate?

Way 1 - observe (they are not if you encounter problems)

This is reactive approach

Way 2 - test it (define test cases and check them)

define invariant the invariant must respect the dynamic of the database!

Page 5: Www.db-nemec.com J.NemecAre Your Statistics Bad Enough?1 Verify the effectiveness of gathering optimizer statistics Jaromir D.B. Nemec UKOUG 2005 27.10.2005

J.Nemec Are Your Statistics Bad Enough? 5

www.db-nemec.com

Basis Scenario

Motto: fix the problem in its origin

Limit the scope to the basic table access only (full scan, index access)

This reduces the complexity significantly andcan expose the real origin of problems in the execution plan

Page 6: Www.db-nemec.com J.NemecAre Your Statistics Bad Enough?1 Verify the effectiveness of gathering optimizer statistics Jaromir D.B. Nemec UKOUG 2005 27.10.2005

J.Nemec Are Your Statistics Bad Enough? 6

www.db-nemec.com

Simple Exampleselect * from ta, tb where ta.rand = tb.rand andta.norm = 141 and ta.rand <= 770 and(tb.norm = 1 or tb.norm = -1)

---------------------------------------------------------| Id | Operation | Name | Rows |---------------------------------------------------------| 0 | SELECT STATEMENT | | 65073 ||* 1 | HASH JOIN | | 65073 ||* 2 | TABLE ACCESS BY INDEX ROWID| TA | 2771 ||* 3 | INDEX RANGE SCAN | TA_INDEX | 3598 ||* 4 | TABLE ACCESS FULL | TB | 23580 |--------------------------------------------------------- Predicate Information (identified by operation id):--------------------------------------------------- 1 - access("TA"."RAND"="TB"."RAND") 2 - filter("TA"."RAND"<=770) 3 - access("TA"."NORM"=141) 4 - filter(("TB"."NORM"=(-1) OR "TB"."NORM"=1) AND "TB"."RAND"<=770)

Page 7: Www.db-nemec.com J.NemecAre Your Statistics Bad Enough?1 Verify the effectiveness of gathering optimizer statistics Jaromir D.B. Nemec UKOUG 2005 27.10.2005

J.Nemec Are Your Statistics Bad Enough? 7

www.db-nemec.com

Verify it! Re-estimate

For each table access• Get owner, object name and alias• Option FULL – get filter predicate• Option (b-tree) INDEX – get access and filter predicate on table and index line• enclose in brackets and connect with AND• form a re-estimation query• execute the query

Page 8: Www.db-nemec.com J.NemecAre Your Statistics Bad Enough?1 Verify the effectiveness of gathering optimizer statistics Jaromir D.B. Nemec UKOUG 2005 27.10.2005

J.Nemec Are Your Statistics Bad Enough? 8

www.db-nemec.com

Re-estimation – how does it work?

Input from execution plan|* 2 | TABLE ACCESS BY INDEX ROWID| TA | 2771 ||* 3 | INDEX RANGE SCAN | TA_INDEX | 3598 ||* 4 | TABLE ACCESS FULL | TB | 23580 |--------------------------------------------------------- 2 - filter("TA"."RAND"<=770) 3 - access("TA"."NORM"=141) 4 - filter(("TB"."NORM"=(-1) OR "TB"."NORM"=1) AND "TB"."RAND"<=770)

Generated re-estimation queryselect count(*) from "TB" where (("TB"."NORM"=(-1) OR "TB"."NORM"=1) AND "TB"."RAND"<=770); -- gives 60317 rows

select count(*) from "TA" where ("TA"."NORM"=141) and ("TA"."RAND"<=770); -- gives 1 row

Run the re-estimation query to get the real cardinality, compare it with the original estimation

Page 9: Www.db-nemec.com J.NemecAre Your Statistics Bad Enough?1 Verify the effectiveness of gathering optimizer statistics Jaromir D.B. Nemec UKOUG 2005 27.10.2005

J.Nemec Are Your Statistics Bad Enough? 9

www.db-nemec.com

Re-estimation – Summary

Run the re-estimation query to get the real cardinality, compare it with the original estimation

Query TA Query TBEstimation 2771 23580real cardinality 1 60317

Note, that the TA access cardinality is overestimated The TB access cardinality is underestimated

Page 10: Www.db-nemec.com J.NemecAre Your Statistics Bad Enough?1 Verify the effectiveness of gathering optimizer statistics Jaromir D.B. Nemec UKOUG 2005 27.10.2005

J.Nemec Are Your Statistics Bad Enough? 10

www.db-nemec.com

Re-estimation Basic Idea

The simplification is based on the assumption that all problems in the execution plan are caused directly or indirectly by the cardinality assumption in the base table access.

Examples

• Cardinality underestimation can lead to NL join with big inner table• Cardinality overestimation can switch a NL join to a SORT MERGE join • improper assumption of cardinality 1 can lead to CARTESIAN join

Page 11: Www.db-nemec.com J.NemecAre Your Statistics Bad Enough?1 Verify the effectiveness of gathering optimizer statistics Jaromir D.B. Nemec UKOUG 2005 27.10.2005

J.Nemec Are Your Statistics Bad Enough? 11

www.db-nemec.com

Access and Filter Predicates

Access Predicate – controls the index accessFilter Predicate – additional filter condition of index of table Availability of predicates

Oracle Version

Predicates available

8 on demand only (in trace)

9 current status (for cached cursors) V$SQL_PLAN

10 historical overview for a controlled period of time – AWR (view DBA_HIST_SQL_PLAN )

Page 12: Www.db-nemec.com J.NemecAre Your Statistics Bad Enough?1 Verify the effectiveness of gathering optimizer statistics Jaromir D.B. Nemec UKOUG 2005 27.10.2005

J.Nemec Are Your Statistics Bad Enough? 12

www.db-nemec.com

Hypothesis

By selecting a limited set of objects and corresponding predicates we can try to find an answer to some fundamental questions: • are my statistics precise enough?• are my statistics up-to-date?

• does a histogram help?• is the sample size OK?• is dynamic sampling relevant?• is the partition design suited for the application?• ready for migration?

Page 13: Www.db-nemec.com J.NemecAre Your Statistics Bad Enough?1 Verify the effectiveness of gathering optimizer statistics Jaromir D.B. Nemec UKOUG 2005 27.10.2005

J.Nemec Are Your Statistics Bad Enough? 13

www.db-nemec.com

Are My Statistics Precise Enough?

Comparing the estimated and re-estimated cardinality we obtain the relative and absolute difference

Predicate - where (("TB"."NORM"=(-1) OR "TB"."NORM"=1) AND "TB"."RAND"<=770)

ACCE EST_CARD RE_EST_CARD ABS_DIFF REL_DIFF---- ---------- ----------- ---------- ----------same 23580 60317 36737 61Predicate - where ("TA"."NORM"=141) and ("TA"."RAND"<=770)

ACCE EST_CARD RE_EST_CARD ABS_DIFF REL_DIFF ---- ---------- ----------- ---------- ----------same 2771 1 -2770 -277000

Page 14: Www.db-nemec.com J.NemecAre Your Statistics Bad Enough?1 Verify the effectiveness of gathering optimizer statistics Jaromir D.B. Nemec UKOUG 2005 27.10.2005

J.Nemec Are Your Statistics Bad Enough? 14

www.db-nemec.com

Are My Statistics Up-to-date?

By observing the historical view we obtain the dynamic aspect of predicate selectivity

0

5000

10000

15000

20000

25000

30000

35000

40000

45000

01.0

1.20

05

15.0

1.20

05

29.0

1.20

05

12.0

2.20

05

26.0

2.20

05

12.0

3.20

05

26.0

3.20

05

09.0

4.20

05

23.0

4.20

05

time

card

inal

ity

est_card

re_est_card

Page 15: Www.db-nemec.com J.NemecAre Your Statistics Bad Enough?1 Verify the effectiveness of gathering optimizer statistics Jaromir D.B. Nemec UKOUG 2005 27.10.2005

J.Nemec Are Your Statistics Bad Enough? 15

www.db-nemec.com

Processing

Two basic steps are extracting and processing of the predicate information for execution plans

exampleadministrativetask

operativetask

Page 16: Www.db-nemec.com J.NemecAre Your Statistics Bad Enough?1 Verify the effectiveness of gathering optimizer statistics Jaromir D.B. Nemec UKOUG 2005 27.10.2005

J.Nemec Are Your Statistics Bad Enough? 16

www.db-nemec.com

Simple Model

Page 17: Www.db-nemec.com J.NemecAre Your Statistics Bad Enough?1 Verify the effectiveness of gathering optimizer statistics Jaromir D.B. Nemec UKOUG 2005 27.10.2005

J.Nemec Are Your Statistics Bad Enough? 17

www.db-nemec.com

Does a Histogram Help? Histogram candidates:

• column is used in predicate• bad estimation

• access with different values• real cardinality is in wide range

Test the influence of histogram size on the precision of selectivity estimation using re-estimation queries.

Page 18: Www.db-nemec.com J.NemecAre Your Statistics Bad Enough?1 Verify the effectiveness of gathering optimizer statistics Jaromir D.B. Nemec UKOUG 2005 27.10.2005

J.Nemec Are Your Statistics Bad Enough? 18

www.db-nemec.com

Histogram / 2

Without a histogram (SIZE=1) the cardinality is underestimated (average).If the histogram size is higher than the number of distinct values the estimation is very precise.For columns with a high number of distinct values the estimation remains imprecise.

Target precision

Predicate with High Cardinlity

0,00

0,20

0,40

0,60

0,80

1,00

1,20

1,40

1 10 50 100 254

histogram size

num dist 29

num dist 265

num dist 2369

Page 19: Www.db-nemec.com J.NemecAre Your Statistics Bad Enough?1 Verify the effectiveness of gathering optimizer statistics Jaromir D.B. Nemec UKOUG 2005 27.10.2005

J.Nemec Are Your Statistics Bad Enough? 19

www.db-nemec.com

Sample SizeRe-estimate a predicate on statistics gathered with different estimate_percent parameter

observe the influence of the sample size

Possible results:• relatively small percentage returns good results• when increasing estimate percent the error is reduced• there is marginal effect of the sample size on the precision

Q: is there one optimal single sample size for an object?

Page 20: Www.db-nemec.com J.NemecAre Your Statistics Bad Enough?1 Verify the effectiveness of gathering optimizer statistics Jaromir D.B. Nemec UKOUG 2005 27.10.2005

J.Nemec Are Your Statistics Bad Enough? 20

www.db-nemec.com

Re-estimation and Dynamic Sampling

• add a dynamic sampling hint to the re-estimation query• trace with 10053 event

select /*+ dynamic_sampling(tc,2) */ * from tc where norm300 = 0;

Generated DS query

SELECT /* some hints */ NVL(SUM(C1),0), NVL(SUM(C2),0) FROM (SELECT /*+ FULL("TC") some other hints */ 1 AS C1, CASE WHEN "TC"."NORM300"=0 THEN 1 ELSE 0 END AS C2 FROM "TC" SAMPLE BLOCK (1.076923 , 1) SEED (1) "TC") SAMPLESUB

Total rows in sample

rows satisfying the predicate

Sample size controlled by level

Page 21: Www.db-nemec.com J.NemecAre Your Statistics Bad Enough?1 Verify the effectiveness of gathering optimizer statistics Jaromir D.B. Nemec UKOUG 2005 27.10.2005

J.Nemec Are Your Statistics Bad Enough? 21

www.db-nemec.com

Re-estimation and Dynamic Sampling / 2

Results of DS query

level : 2sample pct. : 50.806452actual sample size : 54147filtered sample card. : 6018orig. card. : 111358single table dynamic sel. est. : 0.11114189

DS blocks / table blocks

Total rows in sample

rows satisfying the predicate

Computed selectivityfiltered / actual rows

(= satisfying / total rows)

Page 22: Www.db-nemec.com J.NemecAre Your Statistics Bad Enough?1 Verify the effectiveness of gathering optimizer statistics Jaromir D.B. Nemec UKOUG 2005 27.10.2005

J.Nemec Are Your Statistics Bad Enough? 22

www.db-nemec.com

Re-estimation and Dynamic Sampling / 3

Dynamic sampling has a neat positive side effect• Level 10 performs exactly the re-estimation no need to execute the re-estimation query, parsing with DS level 10 returns exact result

• Lower levels of DS are relatively cheap and can give some hints

Page 23: Www.db-nemec.com J.NemecAre Your Statistics Bad Enough?1 Verify the effectiveness of gathering optimizer statistics Jaromir D.B. Nemec UKOUG 2005 27.10.2005

J.Nemec Are Your Statistics Bad Enough? 23

www.db-nemec.com

Dynamic Sampling – Levels

Dynamic sampling levels are absolute (except for level 10)

Even for a moderate table the estimate percent of DS could be too lowDS uses block sampling only

table size level 1 level 2 level 5 level 9 level 10DS blocks 32 64 512 8192 all1MB 25,00% 50,00% 100,00% 100,00% 100,00%10 MB 2,50% 5,00% 40,00% 100,00% 100,00%100 MB 0,25% 0,50% 4,00% 64,00% 100,00%1GB 0,02% 0,05% 0,39% 6,25% 100,00%

Block size 8KB

Page 24: Www.db-nemec.com J.NemecAre Your Statistics Bad Enough?1 Verify the effectiveness of gathering optimizer statistics Jaromir D.B. Nemec UKOUG 2005 27.10.2005

J.Nemec Are Your Statistics Bad Enough? 24

www.db-nemec.com

Re-estimation – Final Approach

For a selected predicate• build re-estimation query• parse it with different dynamic sampling hints (e.g. 0, 2, 5 and 10)• peek in the PLAN_TABLE and get the cardinality and access option

Predicate where part_key = 3 and GROW_SHIF > 0 and shif = 0ACCE EST_CARD EST_DS2_CARD EST_DS5_CARD RE_EST_CARD DIFF DIFF_PREC---- ---------- ------------ ------------ ----------- ---------- ----------same 7982 94 116 89 -7893 -8869

Predicate where part_key = 2 and GROW = 100ACCE EST_CARD EST_DS2_CARD EST_DS5_CARD RE_EST_CARD DIFF DIFF_PREC---- ---------- ------------ ------------ ----------- ---------- ----------same 4281 13 2 0 -4281 -428100

Page 25: Www.db-nemec.com J.NemecAre Your Statistics Bad Enough?1 Verify the effectiveness of gathering optimizer statistics Jaromir D.B. Nemec UKOUG 2005 27.10.2005

J.Nemec Are Your Statistics Bad Enough? 25

www.db-nemec.com

Estimation of Lower Levels of Dynamic Sampling

Two extreme cases can be observed while re-estimation on lower levels of DS.• Dynamic sampling is extremely precise even on lower levels• Dynamic sampling doesn’t “work” until a high level

Real Cardinality

Estimated Cardinality

DS Level 0 5 10

Page 26: Www.db-nemec.com J.NemecAre Your Statistics Bad Enough?1 Verify the effectiveness of gathering optimizer statistics Jaromir D.B. Nemec UKOUG 2005 27.10.2005

J.Nemec Are Your Statistics Bad Enough? 26

www.db-nemec.com

Evolution of Dynamic Sampling

Consider that dynamic sampling itself is subject of evolution

dynamic sampling on a predicate with index:

9i rel 2 – FULL TABLE scan with SAMPLE clause

10g rel 1 – INDEX ACCESS with ROWNUM <= 2500

10g rel 2 – FULL TABLE scan with SAMPLE clause

Page 27: Www.db-nemec.com J.NemecAre Your Statistics Bad Enough?1 Verify the effectiveness of gathering optimizer statistics Jaromir D.B. Nemec UKOUG 2005 27.10.2005

J.Nemec Are Your Statistics Bad Enough? 27

www.db-nemec.com

Re-estimation and Partitioning

• Constant pstart and pstop values • Add predicates for range and list partitions / subpartitions

--------------------------------- -----------------| Operation | Name | | Pstart| Pstop |--------------------------------- -----------------| SELECT STATEMENT | | | | || PARTITION RANGE SINGLE| | | 2 | 2 || PARTITION LIST SINGLE| | | 3 | 3 || TABLE ACCESS FULL | T2S | | 6 | 6 |---------------------------------- ----------------

• Verify granularity parameter• Verify partition pruning

Page 28: Www.db-nemec.com J.NemecAre Your Statistics Bad Enough?1 Verify the effectiveness of gathering optimizer statistics Jaromir D.B. Nemec UKOUG 2005 27.10.2005

J.Nemec Are Your Statistics Bad Enough? 28

www.db-nemec.com

Predicates in Parallel Plans

For parallel execution plans no additional processing by generating re-estimation queries is required

| Id | Operation | Name | Rows |----------------------------------------------------| 0 | SELECT STATEMENT | | 82647 || 1 | PX COORDINATOR | | || 2 | PX SEND QC (RANDOM) | :TQ10001 | 82647 ||* 3 | HASH JOIN | | 82647 || 4 | PX BLOCK ITERATOR | | 3519 ||* 5 | TABLE ACCESS FULL | TA | 3519 || 6 | BUFFER SORT | | || 7 | PX RECEIVE | | 23580 || 8 | PX SEND BROADCAST | :TQ10000 | 23580 |

|* 9 | TABLE ACCESS FULL| TB | 23580 |

Page 29: Www.db-nemec.com J.NemecAre Your Statistics Bad Enough?1 Verify the effectiveness of gathering optimizer statistics Jaromir D.B. Nemec UKOUG 2005 27.10.2005

J.Nemec Are Your Statistics Bad Enough? 29

www.db-nemec.com

Bitmap Index Access

Generating of re-estimation queries for bitmap index access must respect the access logic1 | TABLE ACCESS BY INDEX ROWID | T1 | 13 |2 | BITMAP CONVERSION TO ROWIDS| | |3 | BITMAP AND | | |4 | BITMAP INDEX SINGLE VALUE| IND2_T1 | |5 | BITMAP INDEX SINGLE VALUE| IND1_T1 | | Predicate Information (identified by operation id):--------------------------------------------------- 4 - access("T1"."GROUP2_ID"=28) 5 - access("T1"."STATUS"='X')

Generated predicate("T1"."GROUP2_ID"=28) AND ("T1"."STATUS"='X')

The bitmap index cardinality is missing in the execution plan

Page 30: Www.db-nemec.com J.NemecAre Your Statistics Bad Enough?1 Verify the effectiveness of gathering optimizer statistics Jaromir D.B. Nemec UKOUG 2005 27.10.2005

J.Nemec Are Your Statistics Bad Enough? 30

www.db-nemec.com

Ready for Upgrade or Crash?

How can the behavior of an application in migrated environment be predicted?

A sophisticated but expensive method is to compare execution plans

Re-estimation method can be used as a lightweight timely warning system

Page 31: Www.db-nemec.com J.NemecAre Your Statistics Bad Enough?1 Verify the effectiveness of gathering optimizer statistics Jaromir D.B. Nemec UKOUG 2005 27.10.2005

J.Nemec Are Your Statistics Bad Enough? 31

www.db-nemec.com

Some Limitations

Limited usage on predicates with bind variables and parse time partition pruningPredicates stored in ACCESS_PREDICATES and FILTER_PREDICATES are not always completeA single predicate is limited to the length of 4000 bytes

Further Work

Weighted function for the result of re-estimationRelevance to SQL profile

Page 32: Www.db-nemec.com J.NemecAre Your Statistics Bad Enough?1 Verify the effectiveness of gathering optimizer statistics Jaromir D.B. Nemec UKOUG 2005 27.10.2005

J.Nemec Are Your Statistics Bad Enough? 32

www.db-nemec.com

Summary

• Method to verify the “status” of database statistics• Introduction of “predicate mining”• A green bar for database statistics• Dynamic sampling may be used as a verification method

Page 33: Www.db-nemec.com J.NemecAre Your Statistics Bad Enough?1 Verify the effectiveness of gathering optimizer statistics Jaromir D.B. Nemec UKOUG 2005 27.10.2005

J.Nemec Are Your Statistics Bad Enough? 33

www.db-nemec.com

References

Christian Antognini, CBO: A Configuration Roadmap, Hotsos Symposium 2005Wolfgang Breitling, Using DBMS_STATS in Access Path Optimization, UKOUG 2004Jonathan Lewis, Strategies for Statistics, UKOUG 2004

metalink72539.1 Interpreting Histogram Information 114671.1 Gathering Statistics for the Cost Based Optimizer 236935.1 Global statistics - An Explanation