performance tuning for developers and dba

Upload: kotmani

Post on 02-Jun-2018


  • 8/10/2019 Performance Tuning for Developers and DBA


    Oct 5th 2009 4pm

    Platform: z/OS

    Kurt Struyf

    Competence Partners

    Session: E03

    Practical SQL performance tuning,

    for developers and DBA


    Agenda

    - One SQL, one access path
    - Index, stage1, stage2
    - Sort impact
    - SQL examples of suboptimal coding and its improvements
    - Access path fields in the plan_table
    - Other CPU saving techniques


    Static SQL: One SQL = One access path

    SELECT ... FROM ... WHERE NAME BETWEEN :HV1-LOW AND :HV1-HIGH
    AND FIRSTNAME BETWEEN :HV2-LOW AND :HV2-HIGH
    AND BIRTHDATE BETWEEN :HV3-LOW AND :HV3-HIGH
    AND ZIPCODE BETWEEN :HV4-LOW AND :HV4-HIGH

    Our table has 4 indexes:

    IX1 on NAME
    IX2 on FIRSTNAME
    IX3 on BIRTHDATE
    IX4 on ZIPCODE

    AT BIND TIME: DB2 chooses IX3

    AT RUN TIME: the user only fills out a value for ZIPCODE

    AT RUN TIME: DB2 uses IX3, which doesn't filter anything

    ONE SQL = ONE access path

    DB2 determines for each SQL statement the best way to resolve the query. The result
    of this calculation is the access path. For a static SQL statement, the access path
    is chosen at bind time. As a general rule, once DB2 has chosen it, it won't change
    this access strategy at execution time, even if another access path would have been
    better for the values supplied at run time. This somewhat simplifies the truth, but
    it is accurate in most cases.

    The slide illustrates this with a very simple static query.

    Our table has 4 indexes, and our select has a BETWEEN on all 4 columns. If the user
    fills in nothing for a column, its two host variables hold low value and high value.
    If the user provides a value, both host variables for that column hold that value.

    At bind time, DB2 chooses IX3 as the best possible access path, given the parameters
    known at that time.

    If at execution time our user doesn't fill out a value for BIRTHDATE but does provide
    a value for ZIPCODE, DB2 doesn't change its access path to IX4. It still uses IX3,
    which doesn't filter anything.

    We'll explain more later. The purpose here is just to show that DB2 chooses one
    access path and sticks to it. This access path can be cheap or more expensive, but
    DB2 estimates that, within the parameters known at bind time, it is the cheapest.
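To make the run-time situation concrete, here is a sketch of the slide's query written out as a cursor. The table name CLIENT, the cursor name CSR1, the select list and the example zip code are assumptions for illustration only; the slide elides them.

```sql
-- Hypothetical cursor; table name and select list are assumed.
EXEC SQL
  DECLARE CSR1 CURSOR FOR
  SELECT NAME, FIRSTNAME, BIRTHDATE, ZIPCODE
    FROM CLIENT
   WHERE NAME      BETWEEN :HV1-LOW AND :HV1-HIGH
     AND FIRSTNAME BETWEEN :HV2-LOW AND :HV2-HIGH
     AND BIRTHDATE BETWEEN :HV3-LOW AND :HV3-HIGH
     AND ZIPCODE   BETWEEN :HV4-LOW AND :HV4-HIGH
END-EXEC

-- Host variable contents when the user fills in only a zip code
-- (the value '2000' is an assumed example):
--   :HV1-LOW = low value   :HV1-HIGH = high value   -> NAME not filtered
--   :HV2-LOW = low value   :HV2-HIGH = high value   -> FIRSTNAME not filtered
--   :HV3-LOW = low value   :HV3-HIGH = high value   -> BIRTHDATE not filtered
--   :HV4-LOW = '2000'      :HV4-HIGH = '2000'       -> only real filter
-- The statement was bound with IX3 (BIRTHDATE), so the access still goes
-- through IX3, whose BETWEEN now spans every possible date.
```

With these values the only selective predicate is the one on ZIPCODE, yet the bound access path never touches IX4.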


    Matching Columns

    Matching columns is an indication of how well an index is used:
    - more matching columns = better index use
    - always start with the first index column
    - on = and on one IN you can continue

    Example: Index on (Name, Clientno, Salarycode)

    Predicate                                                        Matching
    ---------------------------------------------------------------  --------
    1. Name = Smith AND Clientno = 20 AND Salarycode = 56
    2. Name = Smith OR Clientno > 20 AND Salarycode = 56
    3. Name IN (Smith, Doe) AND Clientno > 20 AND Salarycode = 56
    4. Name IN (Smith, Doe) AND Salarycode > 0
    5. Name <> Smith AND Clientno = 56
    6. Clientno = 56 AND Salarycode = 0
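The three rules above can be applied to some of the numbered predicates as follows. This is a sketch: the table name CLIENT and index name IXCLIENT are hypothetical, and the MATCHCOLS values in the comments are what the rules predict if DB2 picks this index, not output taken from the presentation.

```sql
-- Hypothetical table and index name; the column list comes from the slide.
CREATE INDEX IXCLIENT ON CLIENT (NAME, CLIENTNO, SALARYCODE);

-- Predicate 1: equal predicates on all three leading columns.
-- Expected MATCHCOLS = 3.
SELECT NAME FROM CLIENT
 WHERE NAME = 'Smith' AND CLIENTNO = 20 AND SALARYCODE = 56;

-- Predicate 3: one IN lets matching continue, the range on CLIENTNO is the
-- last matching column, SALARYCODE can only be screened.
-- Expected MATCHCOLS = 2.
SELECT NAME FROM CLIENT
 WHERE NAME IN ('Smith', 'Doe') AND CLIENTNO > 20 AND SALARYCODE = 56;

-- Predicate 4: CLIENTNO is skipped, so matching stops after NAME.
-- Expected MATCHCOLS = 1.
SELECT NAME FROM CLIENT
 WHERE NAME IN ('Smith', 'Doe') AND SALARYCODE > 0;

-- Predicate 6: the first index column is missing.
-- Expected MATCHCOLS = 0 (at best a non-matching index scan).
SELECT NAME FROM CLIENT
 WHERE CLIENTNO = 56 AND SALARYCODE = 0;
```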


    Stage 1: Keep it positive and simple, but no index!

    =  : equal to
    >  : larger than
    <  : smaller than
    >= : larger than or equal to


    Stage 2: All the rest!!

    - All functions, such as SUBSTR, CONCAT, CHAR()
    - Mismatching data types: Colchar_6 = 1234567
    - Host variable checking: AND :HV1 = 5
    - Decryption
    - Current date between col1 and col2
    - Sorting

    (Diagram: DB2 components RDS and DM, with the levels Index, Stage1 and Stage2)

    All functions by definition require more processing power than the data manager
    can provide, so they are resolved in stage 2. These functions also include any
    arithmetic on a column, such as adding or subtracting.

    Mismatching data types are a bit more complex. As a general rule of thumb, when the
    data type of the host variable doesn't match the data type of the column, the
    predicate is stage 2. This is cutting it a bit short: it is more correct to say that
    the predicate is stage 2 if the host variable is bigger than the column data type.
    Many exceptions exist, but it is best to use the correct data type.

    Host variable checking is done in stage 2. It should NEVER be done in SQL and
    should always be done in COBOL.
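As an illustration of the notes above, the following sketch rewrites two typical stage 2 predicates so that the column stands alone on one side. Table CLIENT and columns NAME and SALARY are hypothetical, and whether a given predicate is stage 1 or stage 2 depends on the DB2 version, so the rewritten form should be checked with EXPLAIN.

```sql
-- Function on the column: stage 2 according to the slide.
SELECT NAME FROM CLIENT WHERE SUBSTR(NAME, 1, 1) = 'S';

-- Same rows for non-empty names, with the column left untouched.
SELECT NAME FROM CLIENT WHERE NAME LIKE 'S%';

-- Arithmetic on the column: stage 2 according to the notes.
SELECT NAME FROM CLIENT WHERE SALARY + 100 > :HV1;

-- Arithmetic moved to the host variable side of the comparison.
SELECT NAME FROM CLIENT WHERE SALARY > :HV1 - 100;
```

The pure host variable check (AND :HV1 = 5) has no SQL rewrite: per the notes, the program tests the variable before deciding whether to run the statement at all.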


    In COBOL checking ("stage 3"): NEVER (ab)use this!

    All DB2 columns that CAN be checked in SQL SHOULD be checked in SQL

    So BETTER a Stage2 predicate than NO predicate

    (Diagram: DB2 components RDS and DM, with the levels Index, Stage1 and Stage2, followed by the check IN COBOL)

    Saying that stage 2 predicates are expensive doesn't mean that you should never
    use them. If the only way to write the predicate on a COLUMN is as a stage 2
    predicate, you should write it as stage 2 rather than pass the row on to COBOL and
    check it there, which is obviously even more expensive. If such a thing as "stage 3"
    existed, this would be it.


    Index, Stage1, Stage2

    (Diagram: DB2 components RDS and DM, with the levels Index, Stage1 and Stage2)

    This time around you should understand this slide: there are more and less
    expensive ways of writing a query, depending on where DB2 can resolve its WHERE
    predicates, how many rows are filtered out as early as possible (in the index), and
    how many are carried on to stage 1 or even stage 2.


    SQL processing

    DM (stage1)
    1) matching index predicates (when the index is accessed)
    2) other indexable stage 1 predicates (index screening)
    3) non indexable stage 1 predicates on index pages
    4) stage 1 predicates on the data
    5) rows passed to RDS

    RDS (stage2)
    1) stage 2 predicates
    2) sort

    Selected rows passed to the user

    DB2 always resolves its WHERE predicates in the same order:

    First, it resolves the matching index predicates, in the sequence of the index columns.
    Second, it resolves all the screening predicates in the index.
    Third, it resolves all non-indexable stage 1 predicates that can be resolved in the index pages.
    Fourth, it resolves all stage 1 predicates on the data.
    Then all stage 2 predicates are resolved, and lastly all returning rows are sorted.


    Order of evaluating predicates

    Within each of the above non index steps :

    1) all equal predicates

    2) all range predicates and col IS NOT NULL

    3) all other predicates

    Within each of the above sub-steps:

    the order in which they appear

    Within each of the non-index steps of the previous slide, the same logic is followed.
    E.g. for step 4, stage 1 on the data pages:

    First, DB2 resolves all equal predicates.
    Second, all range predicates.
    Third, all the rest (e.g. not equal to).

    Within each sub-step, the order in the SQL statement is followed. For example, if
    two equal predicates have to be resolved in the data pages, DB2 takes their physical
    sequence in the SQL statement to determine the order in which to resolve them.

    We'll explain with a little example on the next slide.


    Example

    SELECT *
    FROM MYTABLE
    WHERE C1 > ?           1  index
    AND :HV <> 5           6  stage2
    AND C5 = ?             3  stage1
    AND C4 = ?             4  stage1
    AND DATE(C2) < ?       5  stage2
    AND C3 = ?             2  index
    ORDER BY C2            7  stage2

    INDEX (C1, C3)


    Select *

    SELECT * is almost never to be used: SELECT ONLY the COLUMNS that are needed!

    Reason:
    - Program maintenance
    - CPU cost per extra column
    - SORT file becomes bigger
    - Maybe not index only


    Select *

    Even for:   where exists (select *)
    Better:     where exists (select 1)

                Select col5, ... where col5 = AB
    Better:     Select AB ... where col5 = AB
    Best:       Select ... where col5 = AB

                Select col1, col2 order by col2
    Better:     Select col1 order by col2    (if col2 is there just for the order by)
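The three tips on this slide could look as follows when written out in full. Tables CLIENT, ORDERS and MYTABLE and their columns are hypothetical names chosen for this sketch.

```sql
-- Existence test: the select list of the subquery is never returned,
-- so a constant is enough.
SELECT C.CLIENTNO
  FROM CLIENT C
 WHERE EXISTS (SELECT 1
                 FROM ORDERS O
                WHERE O.CLIENTNO = C.CLIENTNO);

-- COL5 is fixed to 'AB' by the predicate, so the program already knows
-- its value and does not need to fetch it.
SELECT COL1, COL2
  FROM MYTABLE
 WHERE COL5 = 'AB';

-- A column used only for ordering does not have to be in the select list.
SELECT COL1
  FROM MYTABLE
 ORDER BY COL2;
```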


    Other easy improvements

    :hv between col1 and col2    ->  col1 <= :hv and col2 >= :hv

    ... 0

    COL <> :hv                   ->  COL in ( , , , , , )

    COL not ... 5


    Other easy improvements

    SELECT DISTINCT COL1, COL2, COUNT(C1)
    FROM TABLE
    WHERE ...

    Always results in extra SORT

    SELECT COL1, COL2, COUNT(C1)
    FROM TABLE
    WHERE ...
    GROUP BY COL1, COL2

    Same results, SORT can be avoided

    V9

    Before version 9, although the two queries are logically alike, there was a clear
    difference between them. Using DISTINCT would always result in an extra sort,
    whereas the second query could avoid the sort with adequate indexing. For instance,
    an index on COL1, COL2 would have avoided a sort in the second query.

    Since version 9, a query with the DISTINCT clause can also avoid the extra sort.

    Another important change is that since V9 an index on COL2, COL1 can also be used to
    avoid an extra sort. That of course means that the sequence of your result set can
    change, so an ORDER BY clause should be included if you want to guarantee the V8
    sequence.
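A minimal sketch of the comparison described in these notes, without the COUNT column so that both statements are valid on their own. The table name MYTABLE, the index name and the predicate are assumptions.

```sql
-- Hypothetical index that can deliver the rows in COL1, COL2 sequence.
CREATE INDEX IX_COL1_COL2 ON MYTABLE (COL1, COL2);

-- Before V9 this form always sorted to remove duplicates.
SELECT DISTINCT COL1, COL2
  FROM MYTABLE
 WHERE COL1 > :HV1;

-- Logically the same result; with the index above the sort can be avoided.
SELECT COL1, COL2
  FROM MYTABLE
 WHERE COL1 > :HV1
 GROUP BY COL1, COL2;

-- If the program depends on the sequence of the rows, say so explicitly
-- instead of relying on the index DB2 happens to use.
SELECT COL1, COL2
  FROM MYTABLE
 WHERE COL1 > :HV1
 GROUP BY COL1, COL2
 ORDER BY COL1, COL2;
```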


    More easy improvements

    Col1 = A or Col1 = B                  ->  Col1 in (A, B)

    Col1>= :hv1 and COL1= :hv1 AND

    Col1 = :hv1 or                        ->  (col1 = :hv1 or
    Col1 > :hv1 and Col2 = :hv2                col1 > :hv1 and col2 = :hv2)

    Col1 = :hva (always 5)                ->  Col1 = 5

    :hv = 5                               ->  IN COBOL !!!


    Even more easy improvements

    Col1 not between 10 and 50    ->  col1 < 10
                                      union all
                                      col1 > 50

    Existence checking            ->  select 1
                                      from table
                                      where col1 = :hv
                                      fetch first 1 row only

    Col1 not in (A, B, C)         ->  if possible, Col1 in (the rest):
                                      will be cheaper even when the list is bigger
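The first two rewrites on this slide, written out as complete statements. This is a sketch: MYTABLE, its columns and the host variable :HV-FOUND are hypothetical names.

```sql
-- NOT BETWEEN split into two range predicates. The two ranges cannot
-- overlap, so UNION ALL returns no duplicates and needs no sort.
SELECT COL1, COL2
  FROM MYTABLE
 WHERE COL1 < 10
UNION ALL
SELECT COL1, COL2
  FROM MYTABLE
 WHERE COL1 > 50;

-- Existence check: stop after the first qualifying row.
-- SQLCODE +100 means that no such row exists.
SELECT 1
  INTO :HV-FOUND
  FROM MYTABLE
 WHERE COL1 = :HV1
 FETCH FIRST 1 ROW ONLY;
```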


    Determine Access Path

    - Optimization Service Center
      Newest generation of Visual Explain
    - Plan_table
      See next slide
      Might require some exercise
      Not everything in it
    - DSN_STATEMNT_TABLE
      Contains the cost columns


    DB2 Plan_table

    SELECT QBLOCKNO, PROGNAME, PLANNO, METHOD,
           TNAME, ACCESSTYPE, MATCHCOLS, ACCESSNAME, INDEXONLY, PREFETCH
    FROM PLAN_TABLE WHERE QUERYNO = 30303
    ORDER BY QBLOCKNO, PLANNO ;

    QBLOCKNO  PROGNAME  PLANNO  METHOD  TNAME    ACCESSTYPE  MATCHCOLS  ACCESSNAME  INDEXONLY
    --------  --------  ------  ------  -------  ----------  ---------  ----------  ---------
           1  DSNESM68       1       0  AATEHA1  I                   2  AAX0EHA1    N
           1  DSNESM68       2       1  AATEHB1  I                   2  AAX0EHB1    N
           1  DSNESM68       3       3                               0              N

    Qblockno: indicates the number of blocks necessary to resolve the query
              General rule: more blocks = less performing

    Progname: represents the program/package name


    Access path: planno, method

    Planno: the number of steps AND the sequence in which a query is resolved
            General rule: more steps = less performing

    Method: expresses what kind of access is done
            0 : First access
            1 : Nested Loop Join
            3 : extra sort needed

    Tname: name of the table to be accessed

    Access type: how the data is accessed
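Rows like the ones shown for QUERYNO 30303 get into the plan_table through EXPLAIN. The sketch below uses the table names from the sample output, but the column names and the join itself are assumptions; it is not the statement the presenter explained.

```sql
-- Ask DB2 to write the access path of one statement to PLAN_TABLE
-- under a query number of our own choosing.
EXPLAIN PLAN SET QUERYNO = 30303 FOR
SELECT A.COL1, B.COL2
  FROM AATEHA1 A, AATEHB1 B
 WHERE A.COL1 = B.COL1
 ORDER BY B.COL2;

-- Read the steps back in execution order.
SELECT QBLOCKNO, PLANNO, METHOD, TNAME, ACCESSTYPE, MATCHCOLS,
       ACCESSNAME, INDEXONLY
  FROM PLAN_TABLE
 WHERE QUERYNO = 30303
 ORDER BY QBLOCKNO, PLANNO;
```

For statements in a program, binding the package with EXPLAIN(YES) fills the same table without a separate EXPLAIN statement.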


    DSN_STATEMNT_TABLE

    Amongst others:

    COST_CATEGORY:
      A: Indicates that DB2 had enough information to make a cost estimate without using default values.
      B: Indicates that some condition exists for which DB2 was forced to use default values.

    PROCMS: The estimated processor cost, in milliseconds, for the SQL statement

    PROCSU: The estimated processor cost, in service units, for the SQL statement
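The cost columns can be read back with a query along these lines. Note that in DB2 for z/OS the table is spelled DSN_STATEMNT_TABLE; the query number is simply the one used in the plan_table example and is an assumption here.

```sql
-- Cost estimate for one explained statement.
-- COST_CATEGORY 'B' means the figures rest on default values.
SELECT QUERYNO, PROGNAME, COST_CATEGORY, PROCMS, PROCSU
  FROM DSN_STATEMNT_TABLE
 WHERE QUERYNO = 30303;
```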


    DSN_PREDICAT_TABLE

    - Contains all predicates and how they are used
    - Extremely useful for index design
    - Replaces the old spreadsheet technique


    Access Path Follow Up

    (Diagram with the labels: Specific plan_tables, Identify every query using QUERYNO, New binds, General plan_tables, Transfer, LAN, Excel, Changes, Email, Insert)

    It is also best to set up an automated way of following up your access path changes
    and notifying your DBA and the responsible developers.


    Multi Row fetch

    - Technique to save up to 60% of DB2 CPU
    - Easy to use
    - Fetches a rowset into an array
    - Program can control the size of the rowset
      !! due to compiler limits !!
      elementary item: max. 16 MB
      complete working storage: max. 128 MB


    Multi Row Fetch

    To be able to use this, the cursor should be DECLAREd for rowset positioning, for example:

    EXEC SQL
      DECLARE cursor-name CURSOR
      WITH ROWSET POSITIONING FOR
      SELECT column1
           , column2
      FROM table-name;
    END-EXEC

    instead of

    EXEC SQL
      DECLARE cursor-name CURSOR FOR
      SELECT column1
           , column2
      FROM table-name;
    END-EXEC

    Then you can FETCH multiple rows at a time from the cursor


    Multi Row Fetch

    On the FETCH statement the number of rows requested can be specified, for example:

    EXEC SQL
      FETCH NEXT ROWSET FROM cursor-name
      FOR :rowset-size ROWS
      INTO ...
    END-EXEC

    instead of

    EXEC SQL
      FETCH cursor-name
      INTO ...
    END-EXEC

    The rowset size can be defined as a constant or a variable, for example:

    01 rowset-size PIC S9(09) COMP-5.


    Multi Row fetch

    - Do not use single and multiple row fetch for the same cursor in one program
    - Be aware of compiler limits
      elementary item: max. 16 MB
      complete working storage: max. 128 MB
    - Last FETCH on a rowset can be incomplete


    Multi Row Fetch

    Performance results may differ, depending on the number of columns and their data type, but mainly:
    < 5 rows      : poor performance (worse than before)
    10 - 100 rows : best performance
    > 100 rows    : no improvement anymore (same as 10 - 100 rows)

    The following data is based upon treatment of 1 million rows (in seconds CPU):

                                  Via row   Via rowset   Gain on DB2 in CPU seconds
    FETCH                              16            6   10 (-60%)
    FETCH + UPDATE via row             76           66   10 (-15%)
    FETCH + UPDATE via rowset          76           60   16 (-35%)

    Gain of 10 seconds of CPU per one million rows when using rowset pointers


    Sequences

    Easy, fast and cheap way to generate unique numbers if:
    - Holes are allowed
    - The order isn't important

    Use: the "next value for yy.xxxxxxxx" statement

    BASIC SYNTAX:

    CREATE SEQUENCE yy.xxxxxxxx
      START WITH 1
      INCREMENT BY 1
      NO MINVALUE
      NO MAXVALUE
      NO CYCLE
      CACHE 200;
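A short sketch of how such a sequence might be used once it exists. The sequence name MYSCHEMA.ORDER_SEQ, the table ORDERS and the host variables are hypothetical stand-ins for the yy.xxxxxxxx placeholder above.

```sql
-- Draw the next number directly inside the INSERT.
INSERT INTO ORDERS (ORDERNO, CLIENTNO)
VALUES (NEXT VALUE FOR MYSCHEMA.ORDER_SEQ, :HV-CLIENTNO);

-- Or fetch the number first when the program needs it for later statements.
SELECT NEXT VALUE FOR MYSCHEMA.ORDER_SEQ
  INTO :HV-ORDERNO
  FROM SYSIBM.SYSDUMMY1;
```

With CACHE 200, numbers that were cached but never handed out can be lost, which is one reason the slide requires that holes are allowed.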


    Sequences

    (Chart: effect of concurrency on elapsed time; duration against number of jobs (1 to 3), own table versus sequence object)

    (Chart: effect on CPU usage; CPU against number of jobs (1 to 3), own table versus sequence object)

    - Better response times
    - Less CPU needed


    Questions ?

    [email protected]
