DB2 Application Tuning

Upload: srnathan19849169

Post on 08-Apr-2018

  • 8/7/2019 DB2 Application Tuning

    1/38

    2009 Wipro Ltd - Confidential

    DB2 SQL TUNING

    Topics

    - General Tuning Recommendations
    - Predicate Evaluation
    - Using DB2 EXPLAIN
    - Different Access Types
    - Join Methods
    - DB2 Data and Index Page Prefetch
    - Sorting of Data and RIDs
    - Special Techniques to Influence the Access Path

    General Recommendation

    Make sure that:
    - Queries are simple
    - Unused rows and columns are not fetched
    - There is no unnecessary ORDER BY or GROUP BY clause
    - Lock duration is minimized
    - There are no redundant predicates

    Are there subqueries in a query?

    If efficient indexes are available on the tables in the subquery, then a correlated subquery is likely to be the most efficient kind of subquery. If no efficient indexes are available on the tables in the subquery, then a noncorrelated subquery is likely to perform better. If multiple subqueries are in any parent query, make sure that the subqueries are ordered in the most efficient manner.

    Example: Assume that MAIN_TABLE has 1000 rows:

    SELECT * FROM MAIN_TABLE
     WHERE TYPE IN (subquery 1)
       AND PARTS IN (subquery 2);
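    To make the two kinds of subquery concrete, here is a minimal sketch. The tables ORDERS(ORDERNO, CUSTNO) and CUSTOMER(CUSTNO, REGION) are hypothetical and do not come from the slides; both statements return the orders placed by customers in one region.

    ```sql
    -- Noncorrelated: the subquery never references the outer table,
    -- so it can be evaluated once and its result reused.
    SELECT O.ORDERNO
      FROM ORDERS O
     WHERE O.CUSTNO IN (SELECT C.CUSTNO
                          FROM CUSTOMER C
                         WHERE C.REGION = 'WEST');

    -- Correlated: the subquery references O.CUSTNO from the outer query,
    -- so it is re-evaluated per outer row. An index on CUSTOMER(CUSTNO)
    -- (assumed here) is what makes this form attractive.
    SELECT O.ORDERNO
      FROM ORDERS O
     WHERE EXISTS (SELECT 1
                     FROM CUSTOMER C
                    WHERE C.CUSTNO = O.CUSTNO
                      AND C.REGION = 'WEST');
    ```

    Which form is faster depends on the available indexes and on whether DB2 transforms the subquery into a join, so it is worth checking both with EXPLAIN.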

    Continued

    Assume that subquery 1 and subquery 2 are the same type of subquery (either correlated or noncorrelated) and that the subqueries are stage 2. DB2 then evaluates the subquery predicates in the order in which they appear in the WHERE clause. Subquery 1 rejects 10% of the total rows, and subquery 2 rejects 80% of the total rows.

    The predicate in subquery 1 (referred to as P1) is evaluated 1000 times, and the predicate in subquery 2 (referred to as P2) is evaluated 900 times, for a total of 1900 predicate checks. However, if the order of the subquery predicates is reversed, P2 is evaluated 1000 times, but P1 is evaluated only 200 times, for a total of 1200 predicate checks.

    Coding P2 before P1 appears to be more efficient if P1 and P2 take an equal amount of time to execute. However, if P1 is 100 times faster to evaluate than P2, then coding subquery 1 first might be advisable.

    Because subquery predicates can potentially be thousands of times more processor- and I/O-intensive than all other predicates, the order of subquery predicates is particularly important.

    Regardless of coding order, DB2 performs noncorrelated subquery predicates before correlated subquery predicates, unless the subquery is transformed into a join.
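    The trade-off between the two orderings can be written out as a short worked comparison. The symbols c_1 and c_2 (the cost of one evaluation of P1 and of P2) are introduced here for illustration only and are not part of the original slides.

    ```latex
    \begin{align*}
    \text{P1 coded first:}\quad & 1000\,c_1 + 900\,c_2 \\
    \text{P2 coded first:}\quad & 1000\,c_2 + 200\,c_1 \\
    \text{P1 first is cheaper when}\quad
      & 1000\,c_1 + 900\,c_2 < 1000\,c_2 + 200\,c_1
        \;\Longleftrightarrow\; 800\,c_1 < 100\,c_2
        \;\Longleftrightarrow\; c_2 > 8\,c_1
    \end{align*}
    ```

    With equal costs (c_1 = c_2) coding P2 first wins, 1200 checks against 1900. If P1 is 100 times faster (c_2 = 100 c_1), coding P1 first costs 91,000 c_1 against 100,200 c_1, which is why the cheaper predicate can be worth coding first even though it filters less.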

    Does the query involve aggregate functions?

    If a query involves aggregate functions, make sure that they are coded as simply as possible; this increases the chances that they will be evaluated when the data is retrieved, rather than afterward. In general, an aggregate function performs best when evaluated during data access and next best when evaluated during DB2 sort. Least preferable is to have an aggregate function evaluated after the data has been retrieved.

    Continued

    An aggregate function can be evaluated when the data is retrieved if the following conditions are met:
    - No sort is needed for GROUP BY. Check this in the EXPLAIN output.
    - No stage 2 (residual) predicates exist.
    - No distinct set functions exist, such as COUNT(DISTINCT C1).
    - If the query is a join, all set functions must be on the last table joined.
    - All aggregate functions must be on single columns with no arithmetic expressions.
    - The aggregate function is not one of the following: STDDEV, STDDEV_SAMP, VAR, VAR_SAMP.

    Does a query have an input variable in the predicate?

    When host variables or parameter markers are used in a query, the actual values are not known when you bind the package or plan that contains the query. DB2 therefore uses a default filter factor to determine the best access path for the SQL statement.

    Does a query have a problem with column correlation?

    Two columns in a table are said to be correlated if the values in the columns do not vary independently. DB2 might not determine the best access path when your queries include correlated columns.
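    A small sketch of what correlated columns look like and one way to test for them. The table CREWINFO and its columns are hypothetical and used only for illustration.

    ```sql
    -- Hypothetical table: CREWINFO(CITY, STATE, DEPTNO, ...).
    -- CITY and STATE are correlated: knowing the city almost fixes the state,
    -- so the two predicates filter far less than their product suggests.
    SELECT *
      FROM CREWINFO
     WHERE CITY  = 'FRESNO'
       AND STATE = 'CA';

    -- Check 1: distinct values of each column on its own.
    SELECT COUNT(DISTINCT CITY)  AS CITYCOUNT,
           COUNT(DISTINCT STATE) AS STATECOUNT
      FROM CREWINFO;

    -- Check 2: distinct (CITY, STATE) pairs that actually occur.
    SELECT COUNT(*) AS PAIRCOUNT
      FROM (SELECT DISTINCT CITY, STATE
              FROM CREWINFO) AS V1;
    ```

    If PAIRCOUNT is much smaller than CITYCOUNT multiplied by STATECOUNT, the columns are likely correlated, and a filter factor estimated by treating them as independent will probably be too optimistic.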

    Can a query be written to use a noncolumn expression?

    The following predicate combines a column, SALARY, with values that are not from columns on one side of the operator:

    WHERE SALARY + (:hv1 * SALARY) > 50000

    If you rewrite the predicate in the following way, DB2 can evaluate it more efficiently:

    WHERE SALARY > 50000/(1 + :hv1)

    In the second form, the column is by itself on one side of the operator, and all the other values are on the other side of the operator. The expression on the right is called a noncolumn expression. DB2 can evaluate many predicates with noncolumn expressions at an earlier stage of processing called stage 1, so the queries take less time to run.

    Can materialized query tables help your query performance?

    Dynamic queries that operate on very large amounts of data and involve multiple joins might take a long time to run. One way to improve the performance of these queries is to generate the results of all or parts of the queries in advance, and store the results in materialized query tables.

    Materialized query tables are user-created tables. Depending on how the tables are defined, they are user-maintained or system-maintained. If you have set subsystem parameters, or an application sets special registers, to tell DB2 to use materialized query tables, then when DB2 executes a dynamic query it uses the contents of applicable materialized query tables if it finds a performance advantage in doing so.
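    As a rough illustration, a system-maintained materialized query table might be defined and enabled as follows. The base table SALES(REGION, AMOUNT) and the name SALES_SUMMARY are hypothetical; the special-register settings shown are one possible configuration, not a recommendation from the slides.

    ```sql
    -- Precompute an aggregate over a (hypothetical) large base table.
    CREATE TABLE SALES_SUMMARY (REGION, TOTAL_AMOUNT) AS
      (SELECT REGION, SUM(AMOUNT)
         FROM SALES
        GROUP BY REGION)
      DATA INITIALLY DEFERRED
      REFRESH DEFERRED
      MAINTAINED BY SYSTEM
      ENABLE QUERY OPTIMIZATION;

    -- Populate (and later re-populate) the materialized query table.
    REFRESH TABLE SALES_SUMMARY;

    -- Allow dynamic queries in this session to be rewritten to use it.
    SET CURRENT REFRESH AGE ANY;
    SET CURRENT MAINTAINED TABLE TYPES FOR OPTIMIZATION SYSTEM;
    ```

    With these settings, a dynamic query such as SELECT REGION, SUM(AMOUNT) FROM SALES GROUP BY REGION becomes a candidate for rewrite against SALES_SUMMARY; whether DB2 actually does so is a cost-based decision.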

    Does the query contain encrypted data?

    Encryption and decryption can degrade the performance of some queries. Encryption, by its nature, degrades the performance of most SQL statements. Decryption requires extra processing, and encrypted data requires more space in DB2. If a predicate requires decryption, the predicate is a stage 2 predicate, which can degrade performance. To minimize performance degradation, use encryption only in cases that require encryption.

    Creating indexes on encrypted data can improve performance in some cases. Exact matches and joins of encrypted data (if both tables use the same encryption key to encrypt the same data) can use the indexes that you create. Because encrypted data is binary data, range checking of encrypted data requires table space scans. Range checking requires all the row values for a column to be decrypted. Therefore, range checking should be avoided, or at least tuned appropriately.

    CREATE TABLE EMP (EMPNO VARCHAR(48) FOR BIT DATA, NAME VARCHAR(48));
    CREATE TABLE EMPPROJ (EMPNO VARCHAR(48) FOR BIT DATA, PROJECTNAME VARCHAR(48));
    CREATE INDEX IXEMPPRJ ON EMPPROJ (EMPNO);

    Continued

    Poor performance:

    SELECT A.NAME, DECRYPT_CHAR(A.EMPNO)
      FROM EMP A, EMPPROJ B
     WHERE DECRYPT_CHAR(A.EMPNO) = DECRYPT_CHAR(B.EMPNO)
       AND B.PROJECTNAME = 'UDDI Project';

    SELECT PROJECTNAME
      FROM EMPPROJ
     WHERE DECRYPT_CHAR(EMPNO) = 'A7513';

    Good performance:

    SELECT A.NAME, DECRYPT_CHAR(A.EMPNO)
      FROM EMP A, EMPPROJ B
     WHERE A.EMPNO = B.EMPNO
       AND B.PROJECTNAME = 'UDDI Project';

    SELECT PROJECTNAME
      FROM EMPPROJ
     WHERE EMPNO = ENCRYPT('A7513');

    General Recommendation

    - Try to use indexable predicates wherever possible
    - Use a correlated subquery only if efficient indexes are available on the tables in the subquery
    - If there are multiple subqueries, make sure that they are ordered in an efficient manner

    Order of Predicate Evaluation

    Predicates are evaluated in the following sequence:
    1. Indexable matching predicates (evaluated on the index page)
    2. Indexable non-matching predicates, that is, index screening (evaluated on the index page)
    3. Other stage 1 predicates (evaluated on the data page)
    4. Finally, stage 2 predicates (evaluated after data page access)
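    The four steps can be illustrated on a single statement. This is a sketch under assumptions: the index IX_T1 and the columns C4 and C5 are hypothetical, and the stage classification of the last predicate can vary by DB2 version.

    ```sql
    -- Hypothetical index: CREATE INDEX IX_T1 ON T1 (C1, C2, C3);
    SELECT *
      FROM T1
     WHERE C1 = 10         -- 1. matching: predicate on the leading index column
       AND C3 = 5          -- 2. index screening: C3 is in the index, but C2 is not restricted
       AND C4 = 'X'        -- 3. other stage 1: C4 is not in the index, checked on the data page
       AND C5 * 2 > 100;   -- 4. typically stage 2: arithmetic applied to the column
    ```

    Under these assumptions EXPLAIN would be expected to report MATCHCOLS = 1 for IX_T1, since only the predicate on C1 can be used as a matching predicate.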

    Definition: Predicates are found in the WHERE, HAVING, or ON clauses of SQL statements; they describe attributes of data. They are usually based on the columns of a table and either qualify rows (through an index) or reject rows (returned by a scan) when the table is accessed. The resulting qualified or rejected rows are independent of the access path chosen for that table.

    Example: The following query has three predicates: an equal predicate on C1, a BETWEEN predicate on C2, and a NOT LIKE predicate on C3.

    SELECT * FROM T1
     WHERE C1 = 10
       AND C2 BETWEEN 10 AND 20
       AND C3 NOT LIKE 'A%'

    Properties of predicates

    Predicates in a HAVING clause are not used when selecting access paths; hence, the term 'predicate' means a predicate after WHERE or ON.

    A predicate influences the selection of an access path because of:
    1. Its type
    2. Whether it is indexable
    3. Whether it is stage 1 or stage 2
    4. Whether it contains a ROWID column

    Continued

    Simple or compound
    A compound predicate is the result of two predicates, whether simple or compound, connected together by AND or OR Boolean operators. All others are simple.

    Local or join
    Local predicates reference only one table. They are local to the table and restrict the number of rows returned for that table. Join predicates involve more than one table or a correlated reference. They determine the way rows are joined from two or more tables.

    Boolean term
    Any predicate that is not contained by a compound OR predicate structure is a Boolean term. If a Boolean term is evaluated false for a particular row, the whole WHERE clause is evaluated false for that row.

    DB2 EXPLAIN AND TUNING

    EXPLAIN is a monitoring tool that produces information about a plan, package, or SQL statement when it is bound. The output appears in a user-supplied table called PLAN_TABLE.

    It helps you to do the following:
    - Design databases, indexes, and application programs
    - Determine when to rebind an application
    - Determine the access path chosen for a query
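    A minimal sketch of explaining a single statement and reading the result back. It assumes a PLAN_TABLE already exists under the current SQL ID; the QUERYNO value 100 is an arbitrary tag chosen for this example, and T1 is the sample table used earlier in these slides.

    ```sql
    -- Populate PLAN_TABLE for one statement without executing it.
    EXPLAIN PLAN SET QUERYNO = 100 FOR
      SELECT *
        FROM T1
       WHERE C1 = 10
         AND C2 BETWEEN 10 AND 20;

    -- Read back the access path rows for that statement.
    SELECT QUERYNO, QBLOCKNO, PLANNO, METHOD, TNAME,
           ACCESSTYPE, MATCHCOLS, ACCESSNAME, INDEXONLY, PREFETCH
      FROM PLAN_TABLE
     WHERE QUERYNO = 100
     ORDER BY QBLOCKNO, PLANNO, MIXOPSEQ;
    ```

    For static SQL, binding the plan or package with EXPLAIN(YES) fills the same table for every statement in the program.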

    DB2 EXPLAIN OUTPUT

    - EXPLAIN output is stored in PLAN_TABLE
    - Each plan is identified by the APPLNAME column
    - Filter factor: percentage of rows selected
    - Indexes used
    - Cluster ratio: percentage of indexed rows in sequence with data rows

    Type of Access

    - Tablespace scan
    - Index scan
        - Index-only access (INDEXONLY = Y)
        - Multiple index scan (ACCESSTYPE = M, MI, MU, MX)
        - Matching index scan (MATCHCOLS > 0)
        - Non-matching index scan (MATCHCOLS = 0)
        - One-fetch access (ACCESSTYPE = I1)

    Tablespace scan (ACCESSTYPE=R)

    Chosen when:
    - A huge number of rows is returned
    - The available indexes have a low cluster ratio
    - No index is available

    Sequential prefetch is used (PREFETCH = S).

    Using Index

    - Define indexes based on how you want to access the data
    - A properly defined (highly clustered) index can avoid a sort
    - Sometimes, using an index makes the query more costly. In such cases, discourage the use of such indexes

    Matching Index Scan

    - Provides the best filtering possible
    - Predicates are specified on either the leading or all index key columns
    - MATCHCOLS gives the number of matching columns
    - If there is more than one index, DB2 will choose the one with the best filter factor

    Non-Matching Index scan

    - Also called index screening
    - Used when the predicates are not on the first few columns of the index, but at least one predicate is on an indexed column
    - Filters index pages
    - MATCHCOLS = 0 and ACCESSTYPE = I

    One Fetch Access

    Used when a query can return the needed row in one step of page access. Conditions:
    - Only one table in the query
    - MIN or MAX column functions
    - No GROUP BY
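    Two statements that would be candidates for one-fetch access, as a sketch. The ascending index IX_T1_C1C2 is hypothetical; whether DB2 actually reports ACCESSTYPE = I1 should be confirmed in PLAN_TABLE.

    ```sql
    -- Hypothetical index: CREATE INDEX IX_T1_C1C2 ON T1 (C1, C2);

    -- The smallest C1 is the first key in the index: one index probe.
    SELECT MIN(C1)
      FROM T1;

    -- Within C1 = 5 the keys are ordered by C2, so the minimum C2
    -- is the first entry of that key range.
    SELECT MIN(C2)
      FROM T1
     WHERE C1 = 5;
    ```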

    Index Only access

    - Used when the required data can be taken from the index pages, with no need to access the data pages
    - Very efficient
    - ACCESSTYPE = I and INDEXONLY = Y
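    A short sketch of the difference between a statement that can be satisfied from the index alone and one that cannot. The index IX_T1_C1C2 is hypothetical.

    ```sql
    -- Hypothetical index: CREATE INDEX IX_T1_C1C2 ON T1 (C1, C2);

    -- Every referenced column (C1, C2) is in the index, so DB2 can
    -- answer from index pages only: a candidate for INDEXONLY = Y.
    SELECT C2
      FROM T1
     WHERE C1 = 10;

    -- C3 is not in the index, so the data pages must be read as well:
    -- INDEXONLY = N is expected even if the same index is used.
    SELECT C2, C3
      FROM T1
     WHERE C1 = 10;
    ```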

    JOIN

    - A join retrieves rows from more than one table and combines them
    - At the application (SQL) level, joins are coded as inner join, left outer join, right outer join, and full outer join
    - Internally, DB2 uses three join methods: nested loop join, merge scan join, and hybrid join

    Nested Loop Join (METHOD =1)

    [Figure: nested loop join diagram showing sample rows of the outer table, the inner table, and the joined result]

    Nested Loop Join (Method = 1)

    Nested loop join is efficient when:
    - The outer table is small
    - The number of data pages accessed in the inner table is also small
    - A highly clustered index is available on the join columns of the inner table
    - Filtering on both tables (outer and inner) is high

    Merge Scan Join (Method = 2)

    [Figure: merge scan join diagram showing sample rows of the outer table, the inner table, and the joined result; both tables are pre-sorted]

    Merge Scan Join (Method = 2)

    Merge scan join is used when:
    - The number of qualifying rows in the inner and outer tables is large and the join predicates do not provide much filtering
    - The tables are large and have no indexes with matching columns

    Hybrid Join (Method = 4)

    [Figure: hybrid join diagram showing the pre-sorted outer table, the inner table index with RIDs, the intermediate table of outer rows paired with RIDs, and the joined result]

    Hybrid Join(Method=4)

    Hybrid join is often used when a non-clustered index is available on the join column of the inner table and there are duplicate qualifying rows in the outer table.

    Hybrid join handles duplicates in the outer table efficiently, because the inner table is scanned only once for each set of duplicate values.
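    To see which of the three join methods DB2 chose, the METHOD column of PLAN_TABLE can be decoded. This is a sketch: the tables ORDERS and CUSTOMER and the QUERYNO value 200 are hypothetical, and a PLAN_TABLE is assumed to exist under the current SQL ID.

    ```sql
    EXPLAIN PLAN SET QUERYNO = 200 FOR
      SELECT O.ORDERNO, C.REGION
        FROM ORDERS O
             INNER JOIN CUSTOMER C
                ON C.CUSTNO = O.CUSTNO;

    -- One row per table accessed; PLANNO gives the join sequence.
    SELECT QBLOCKNO, PLANNO, TNAME,
           CASE METHOD
             WHEN 0 THEN 'First table accessed'
             WHEN 1 THEN 'Nested loop join'
             WHEN 2 THEN 'Merge scan join'
             WHEN 3 THEN 'Sort only'
             WHEN 4 THEN 'Hybrid join'
           END AS JOIN_METHOD,
           ACCESSTYPE, ACCESSNAME, PREFETCH
      FROM PLAN_TABLE
     WHERE QUERYNO = 200
     ORDER BY QBLOCKNO, PLANNO;
    ```

    The row with METHOD = 0 is the outer (first) table; the row that follows shows how the next table was joined to it.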

    Sequential Prefetch (Prefetch=S)

    - Sequential prefetch reads a sequential set of pages.
    - The maximum number of pages read by a request issued from an application program is determined by the size of the buffer pool used.
    - Sequential prefetch is generally used for a table space scan.
    - For an index scan that accesses 8 or more consecutive data pages, DB2 requests sequential prefetch at bind time. The index must have a cluster ratio of 80% or above.

    List Sequential Prefetch (PREFETCH=L)

    List sequential prefetch reads a set of data pages determined by a list of RIDs taken from an index. It is used:
    - Usually with a single index that has a cluster ratio lower than 80%.
    - Sometimes on indexes with a high cluster ratio, if the amount of data to be accessed is too small to make sequential prefetch efficient, but large enough to require more than one regular read.
    - Always to access data by multiple index access or hybrid join.

    Sequential Detection

    If DB2 does not choose prefetch at bind time, it can sometimes do so at execution time. This method is called sequential detection.

    If a table is accessed repeatedly using the same statement (for example, SQL in a do-while loop), the data or index leaf pages of the table can be accessed sequentially.

    DB2 can use this technique if it did not choose sequential prefetch at bind time because of an inaccurate estimate of the number of pages to be accessed.

    Sorting of data

    - A sort can happen on a new table or on the composite table.
    - A sort is required by an ORDER BY or GROUP BY clause (SORTC_GROUPBY/SORTC_ORDERBY = Y).
    - A sort is required to remove duplicates when DISTINCT or UNION is used (SORTC_UNIQ = Y).
    - For nested loop join and hybrid join, the composite table might be sorted; for merge scan join, both tables might be sorted to make the join efficient (SORTN_JOIN/SORTC_JOIN = Y).

    Sorting of data

    - A sort is needed for subquery processing. The result of the subquery is sorted and put into a work file for later reference by the parent query.
    - DB2 sorts RIDs into ascending page number order in order to perform list prefetch. This sort is very fast and is done entirely in memory.
    - If a sort is required during cursor processing, it is done during OPEN CURSOR. If the cursor is closed and opened again, the sort is performed again.
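    The sort indicators mentioned on these slides can be read from PLAN_TABLE for an explained statement. A sketch, assuming the statement was explained with the hypothetical tag QUERYNO = 300:

    ```sql
    -- SORTN_* flags refer to a sort of the new (inner) table,
    -- SORTC_* flags to a sort of the composite table.
    SELECT QBLOCKNO, PLANNO, METHOD, TNAME,
           SORTN_UNIQ, SORTN_JOIN, SORTN_ORDERBY, SORTN_GROUPBY,
           SORTC_UNIQ, SORTC_JOIN, SORTC_ORDERBY, SORTC_GROUPBY
      FROM PLAN_TABLE
     WHERE QUERYNO = 300
     ORDER BY QBLOCKNO, PLANNO;
    ```

    Any column showing 'Y' marks a step where DB2 sorts; a row with METHOD = 3 is an additional sort step that accesses no new table.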

    Some Special Techniques

    - OPTIMIZE FOR n ROWS
    - Reducing the number of matching columns for an index scan
    - Adding extra local predicates
    - Changing an inner join to an outer join
    - Updating catalog statistics
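    As an illustration of the first technique, the OPTIMIZE FOR n ROWS clause tells DB2 that the application intends to fetch only a few rows. The table EMPLOYEE and its columns are hypothetical.

    ```sql
    -- The application shows only the 20 highest salaries, so ask DB2
    -- to favor an access path that returns the first rows quickly
    -- (for example, an index on SALARY that avoids a sort).
    SELECT LASTNAME, FIRSTNAME, EMPNO, SALARY
      FROM EMPLOYEE
     ORDER BY SALARY DESC
     OPTIMIZE FOR 20 ROWS;
    ```

    The clause does not limit the result set; all qualifying rows can still be fetched, but fetching far more than n rows may be slower than it would be without the clause.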

    Risks

    - There is no golden rule for DB2 SQL tuning.
    - Wrong analysis of performance data and access method information may lead to more performance overhead.
    - While tuning SQL in a test environment, keep in mind that the amount of data and the DB2 subsystem setup are not the same as in production.
    - A person with good knowledge of DB2 should be involved in the tuning activity.
