who just killed my db2 gelb2003b

Upload: karla-aguirre

Post on 14-Apr-2018


  • 7/27/2019 Who Just Killed My Db2 Gelb2003b

    1/52

    Who Just Killed My DB2?!

    Ivan L. Gelb

    Gelb Information Systems Corp.

    Email: [email protected]

    Phone: 732-303-1333

    @CMGA 2003 - Sydney

    DB2 for z/OS and S/390

    Ensuring optimum DB2 service levels in the OS/390 environment is challenging because of the
    not-so-obvious dependencies between the many subsystems. Performance biases introduced by system
    tuners in z/OS - OS/390, CICS, and DB2 can dramatically affect the complex's service levels and
    total effective capacity. This presentation will describe how to focus DB2 environment tuning
    projects while ensuring that the interdependent areas of z/OS, CICS, TSO, and batch are optimized.
    Attendees will learn how to avoid being caught in unproductive finger-pointing sessions by (a)
    ensuring that subsystems are tuned with proper bias, (b) monitoring performance metrics that
    indicate the true illness or wellness of the complex, and (c) knowing what measures are available
    once the source of a service level problem is identified.


    2003 GIS Corp . - www.gelbis.com 2

    Agenda

    Basics of Performance Tuning

    DB2 Point of View

    z/OS - OS/390 Point of View

    CICS Point of View

    Pointing in the Right Directions

    We will focus on the latest system versions: z/OS 1.4, CICS 2.2, DB2 6 - 8. The first four bullet
    points will show where, how, and what to look for, and the last point will show six examples that
    always cause contention among staff in different system and application areas. Following is our
    session outline.

    Basics of Performance Tuning: (1) Is anyone complaining? (2) Where to begin?

    DB2 Point of View: (1) Eliminate bottlenecks; (2) Protect loved ones

    z/OS - OS/390 Point of View: (1) Allocate resources: CPU, I/O, storage; (2) Definitions of relative
    priorities and their effects; (3) Protect loved ones; (4) Who used how much of what, and for how long

    CICS Point of View: (1) Eliminate bottlenecks; (2) Protect loved ones; (3) Create throttles

    Pointing in the Right Directions (3 CPU and 2 I/O cases): (1) Starved for CPU; (2) CICS region
    saturated; (3) DPMODE = What? (4) Buffer Pools; (5) High I/O Service Time


    Basics of Performance Tuning

    Is anyone complaining?

    Do YOU want to prevent complaints?

    Where to begin?

    Ask these questions

    TRADEMARKS

    The following are trade or service marks of the IBM Corporation: CICS, CICS TS, CICSPlex,
    DB2, IBM, MVS, OS/390, z/OS, Sysplex, Parallel Sysplex. Any omissions are purely unintended.

    MOAD - MOTHER OF ALL DISCLAIMERS

    All of the information in this document is tried and true. However, this fact alone cannot guarantee
    that you can get the same results at your place and with your skills. In fact, some of this advice can
    be hurtful if it is misused or misunderstood. As with all kinds of analysis, anything you may hear
    or read can be understood and misunderstood in many ways that may seem contradictory to you. In
    this regard, a further and associated contradictory element requires considerable systems analysis
    and trade-off studies to arrive at the structural design, based on rigorous system engineering
    concepts. By combining advice and certain experiences, any fully integrated performance test
    program is weakly equivalent to any subsystem compatibility testing designed to eschew
    obfuscation. Gelb Information Systems Corporation, Ivan Gelb, and anyone found anywhere
    assume no responsibility for this information's accuracy, completeness, or suitability for any
    purpose. Anyone attempting to adapt these techniques to their own environment anywhere does so
    completely at their own risk. ;-)


    Is Anyone Complaining?

    If YES, why?

    Some service is slow and/or failing to meet service level

    objectives (be sure that objectives are rational!)

    Response time is OK, but not enough is completed

    For either case:

    Total Delays = Response Times - Measured Service Times

    What is the Total Delays / Total Response Time ratio? < 0.50 is OK,

    but at that level delays up to double the response time relative to service time

    The ratio acceptable to your site will depend on service level

    goals.
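
    The ratio on this slide can be sketched in a few lines of Python. This is only an illustration: the function name and the sample times are hypothetical, and the 0.50 threshold is the rule of thumb from the slide, which your own service level goals may tighten or relax.

```python
def delay_ratio(response_time: float, service_time: float) -> float:
    """Return Total Delays / Total Response Time for one workload."""
    if response_time <= 0:
        raise ValueError("response_time must be positive")
    total_delays = response_time - service_time  # Total Delays = Response - Service
    return total_delays / response_time


# Hypothetical measurements, in seconds.
ratio = delay_ratio(response_time=0.8, service_time=0.5)
print(f"delay ratio = {ratio:.3f}")
print("OK" if ratio < 0.50 else "investigate delays")
```

    A ratio of exactly 0.50 means the work waits as long as it is serviced, so response time is twice the service time.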

    First, we should not be waiting for complaints, even though they help justify the existence of our job.
    Be effectively proactive, and they may wonder why tuning is needed.

    Being proactive includes at least the following minimum activities:

    - Profiles of all workloads, so you can tell if behavior changes,

    - Tracking of past and future changes in system software, hardware, and applications, and

    - Profiles of utilization by business units of work, which is the most effective way to do all this.


    Do YOU Want to Prevent Complaints?

    We consider WAITING for

    complaints a CLS (Career

    Limiting Strategy).

    Your action plan should
    prevent complaints.

    If you do your job really well,
    some may wonder if you are
    needed.

    Whether you wait for or prevent complaints, the plan of action is the same.

    [Flowchart boxes: Measure; Performance OK?; Any Ideas?; Adjust 1 Thing!]

    This is a bit of philosophy here.

    The practical issue is that for well managed shops the best practice is to work on prevention of

    complaints.

    Performance management is a cyclical activity.


    Where to begin - 1

    Evaluate performance:

    Inside DB2

    Outside DB2

    Inside DB2 (not our session's focus)

    Obtain / establish Service Level Agreements (SLAs) for business critical work.

    Without SLAs, identify what is a reasonable response time given
    the service time profile of the workload.

    Collect accounting trace classes 2 and 3 for the best information on
    externally caused delays

    Class 1 and 3 accounting traces reveal the task's complete
    activity (we will review a summary report from DB2 PM)

    A DB2 performance evaluation should always start from the inside out. Simply put, you do not want to be
    found with your homework not done.

    In our quest for who killed your DB2 performance, we are going to look at reports that show
    both types of measurements: indicators of problems inside DB2, and indicators that factors outside
    DB2 are the cause of degradation.

    This session will drive towards identifying situations where the outside factors are causing the

    misery.


    Where to begin - 2

    Outside DB2 analysis focuses on eliminating factors that

    degrade DB2's performance

    Search for hints inside DB2 that outside factors may be

    cause of problems:

    Wait for

    CPU due to higher priority work

    CPU due to LPAR management of weights

    Central storage

    Long basic I/O service times

    > 3 msec for cached writes

    > 5 msec for cached reads

    Long non-service time components of I/O service:

    PEND, Disconnect, Device Busy, Control Unit Busy
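
    As a small illustration of the basic I/O service time thresholds listed on this slide, the check below flags samples that exceed them. The dictionary keys, the function name, and the sample values are hypothetical; only the 3 msec and 5 msec limits come from the slide.

```python
# Thresholds from the slide, in milliseconds.
THRESHOLD_MS = {"cached_write": 3.0, "cached_read": 5.0}


def is_long_service_time(io_kind: str, service_ms: float) -> bool:
    """True when the measured service time exceeds the slide's threshold."""
    return service_ms > THRESHOLD_MS[io_kind]


# Hypothetical (kind, service time in msec) samples.
samples = [("cached_write", 2.1), ("cached_read", 7.4)]
flagged = [s for s in samples if is_long_service_time(*s)]
print(flagged)
```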

    As you can see from the list on the slide, delays come in many flavors.

    We will show reports and recommend where to look to isolate causes for the delays.


    Where to begin - 3

    Prepare profiles of resource utilization by workload:

    CPU, I/O, processing parallelism, processor storage, network time

    Any latent demand?

    If YES, what makes it up? Work you HAVE to care about or NOT?

    Easy to tell if there is some latent demand - see the RMF CPU Activity report

    Harder to tell who is waiting for CPU

    Where to look?

    RMF Workload, CPU, device, storage activity reports

    SMF file activity reports

    CICS & DB2 subsystems reports

    CICS & DB2 subsystem traces are the last resort: the most time
    consuming to analyze, but also the most trustworthy

    This is an outline of where to begin analysis. Samples of how this is done follow later in
    the presentation.


    Ask These Questions - 1

    What is your class 2 elapsed time (C2ET)?

    How much of C2ET is:

    Wait for CPU dispatch?

    MVS Overhead?

    I/O related wait?

    System page-ins?

    Which DASD devices are not providing good performance?


    Ask These Questions - 2

    Other classic questions you should not forget to ask, but which will not be discussed in much detail in our session today:

    What is your system I/O rate/Second for:

    Each pool?

    Critical objects?

    What is your average Synch I/O elapsed time?

    What are your buffer pool hit ratios?

    Is your SQL coding effective?


    DB2 Point of View

    Eliminate bottlenecks

    Protect loved ones

    Identify Suspect Areas


    Eliminate Bottlenecks

    Start with trace or detailed performance report of a task with

    performance problems and examine time spent using versus

    waiting for each resource:

    CPU

    Storage

    I/O activity

    DB2 locking activity

    Thread waits

    Application enqueues

    Network time for DB2 initiated external units of work

    DB2 DISPLAY command

    Shows thread waits or cases when a request is moved to a pool thread

    This is a checklist of what we are looking for in this analysis.


    Protect Loved Ones - 1

    Isolation is the best protection but not always possible

    Such protection can be implemented at many levels:

    WLM (z/OS Workload Manager) and IRD (Intelligent Resource

    Director) in the SYSPLEX

    WLM service policy coded relative priorities

    WLM CPU priority protection (new as of OS/390 Release 10)

    WLM Storage protection (new as of OS/390 Release 10)

    PR/SM LPAR weights

    WLM I/O priority protection

    Isolation of DB2 objects in buffer pools

    Isolation of DB2 objects on specific devices

    Protection of resources via WLM goal mode service definitions can be very effective.

    Some WLM service policy creativity can cause problems. Avoid:

    (A) more than 15 - 20 service classes,

    (B) complicated work classification rules - the fewest rules cause the least CPU overhead,

    (C) classification rules in the wrong order (ordering conditions from most to least likely is a MUST),

    (D) multiple periods per service class without careful consideration - the best use is for low importance
    work that can truly be decreased in importance as it demonstrates a higher and higher
    propensity to use CPU resources, and

    (E) specifying a non-achievable service class goal - for example, 90% < 1 second when you can see
    your average never reaching this goal.

    Next we will examine what WLM can do, what its protection looks like, and what we should
    expect as results.


    Identify Suspect Areas

    Source: DB2 PM Accounting Report (long version)

    Focus on the following times to identify where your work is processing or waiting:

    A. Lock/Latch time of DB2 + IRLM

    B. Synchronous I/O

    C. Class 2 CPU

    D. Other read I/O, made up from:

        a. Sequential prefetch

        b. List prefetch

        c. Sequential detection

        d. Synchronous I/O by another thread (different from this one)

    E. Other write suspensions

        a. Asynchronous write I/O

        b. Synchronous write I/O by another thread (different from this one)

    F. DB2 service task suspensions

        a. Wait for data set extend/delete/define task

        b. Wait for other service tasks

    G. Suspended for processing ARCHIVE LOG MODE(QUIESCE)

    H. Suspended for read from archive log tape

    I. Suspended waiting for a drain lock

    J. Suspended for release of an object from all claim holders

    K. Suspended for page latch - do you run RUNSTATS and COPY with SHRLEVEL(CHANGE)? This will cause this page latch contention.


    z/OS - OS/390 Point of View

    WLM's Resource Management

    Definitions of relative priorities

    Protect loved ones

    Who used what and how much

    Note:

    All sample RMF reports are from: SC33-7991 z/OS RMF

    Report Analysis (Version 1.4)

    All other samples are from various GIS projects


    WLM's Resource Management

    WLM managed resources:

    Processor (CPU)

    Storage controls

    Multi-programming levels (MPL)

    I/O priority

    Parallel I/O access volumes

    JES initiators

    DB2 stored procedure address spaces

    Websphere scalable address spaces


    Definitions of Relative Priorities

    WLM Priorities:

    Pre-defined service classes

    SYSTEM (fixed CPU DP=255)

    SYSSTC (fixed CPU DP=254)

    Importance 1 - 5

    Discretionary - when you DON'T NEED TO CARE!

    WLM goal types:

    Percentile response time

    Average response time

    Velocity - guarantees CPU access only, not priority

    IRLM should be in SYSSTC. DB2 inherits the other importance levels from the caller, so they can
    vary greatly.

    Please note that a velocity goal only guarantees access to the processor and not the CPU
    dispatching priority. If it is set too high, the work can never reach it. After a few attempts, WLM
    will give up on trying to help this workload.

    Highly recommended settings:

    1. IOQ=PRIORITY

    2. MSO coefficient of 0 or 0.0001 (the minimum possible)

    3. Equal CPU and SRB service definition coefficients


    Protect Loved Ones

    If WLM level < Release 10

    If service goal being met, NO guarantee that CPU access of

    your loved workloads will be protected from lower priority

    work

    DITTO for central storage - can cause pain due to paging

    New options since OS/390 Release 10:

    Identify service class as CPU Critical

    Identify service class as Storage Critical - you WILL HURT
    less important work and increase the system paging rate. You
    should be concerned about this in 64-bit mode systems that
    are storage poor! They page to disks!

    Watch out for I/O priority shifts caused by multi-period service classes

    What is a loved one? A workload that your business is willing to spend resources on to
    maintain and/or improve its quality of service. In other words, work that is most important for
    your business.

    WLM may do service policy adjustments every 10 seconds and resource adjustments
    every 2 seconds. This is not fast enough for critical online work. Once a lower priority task
    is moved above your loved one's CPU dispatch priority, the few seconds required to
    regain the higher CPU dispatching priority will cause missed service goals.

    The CPU CRITICAL attribute solves this problem. The priority of lower priority work will not be
    raised above higher importance work with this attribute set.

    STORAGE CRITICAL solves the problem of paging by protecting the working set of your
    favorite work.

    I/O priority shifts caused by multi-period service classes may increase or decrease the
    performance of a loved workload in unexpected ways. Be aware that this potential exists
    by identifying such service classes.


    Who Used What and How Much

    CPU Activity Reports

    LPAR Activity Reports

    I/O Device Activity Reports

    Workload Activity Reports


    CPU Activity


    Provides the only 100% accurate CPU utilization figures for all LPARs and for each LPAR individually.
    Use it in conjunction with workload activity measurements to establish CPU utilization capture
    ratios.

    Observe and consider:

    1. ONLINE TIME - less than 100% indicates a CPU being varied on- or offline. IRD or a manual
    process may cause this.

    2. LPAR BUSY % - what % of each allocated CPU this LPAR utilized. Less than 100% indicates
    possible capacity issues.

    3. MVS BUSY % - the LPAR's % CPU utilization. 100% should cause performance and capacity
    concerns if (a) anyone complains, and (b) critical workloads + SYSTEM make up 90-95%+ of
    the utilization

    4. QUEUE LENGTHS (%) - indicates how many others you may have to wait behind for CPU
    access

    5. IN READY - address spaces ready to run but no CPU available

    6. OUT READY - even worse than IN READY if the OUT-s are workloads you care about. See
    the workload activity reports to determine the victims
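
    A capture ratio, as mentioned in the notes above, compares the CPU time attributed to workloads with the total CPU busy time from the CPU Activity report. The sketch below shows one common way to compute it; the function name, workload names, and all numbers are hypothetical and not taken from any report in this presentation.

```python
def capture_ratio(workload_cpu_seconds: dict, cpu_busy_pct: float,
                  n_cpus: int, interval_seconds: float) -> float:
    """CPU seconds attributed to workloads divided by total CPU busy seconds."""
    total_busy_seconds = cpu_busy_pct / 100 * n_cpus * interval_seconds
    return sum(workload_cpu_seconds.values()) / total_busy_seconds


# Hypothetical 15-minute interval on a 4-way LPAR that was 80% busy.
workloads = {"CICS": 1200.0, "DB2": 900.0, "BATCH": 492.0}
ratio = capture_ratio(workloads, cpu_busy_pct=80.0, n_cpus=4, interval_seconds=900)
print(f"capture ratio = {ratio:.2f}")
```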


    CPU Activity - Processor Delays

    The processor delays report identifies who is delayed and by ABOUT how much.

    1. DLY % = (# of Delay Samples / # of Samples) * 100 - the % of time the task is delayed from getting
    CPU time

    2. USG % = (# of Using Samples / # of Samples) * 100 - the % of time the task is receiving CPU service

    3. Holding Job(s) - up to three tasks that contributed most to the delay
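
    The two sampling formulas above translate directly into code. The helper name and the sample counts below are hypothetical; the formula is the one given in the notes.

```python
def sample_pct(matching_samples: int, total_samples: int) -> float:
    """(# of matching samples / # of samples) * 100."""
    if total_samples <= 0:
        raise ValueError("total_samples must be positive")
    return 100 * matching_samples / total_samples


# Hypothetical job: 100 samples, 35 delayed for CPU, 20 using CPU.
dly_pct = sample_pct(35, 100)
usg_pct = sample_pct(20, 100)
print(f"DLY % = {dly_pct:.1f}, USG % = {usg_pct:.1f}")
```

    Because the values come from sampling, they are estimates, which is why the notes say "ABOUT how much".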


    LPAR Partition Data Report

    The Partition Data Report is from the RMF post processor. This is the single most useful place where
    we can see both defined and actual LPAR capacity.

    1. WGT - this LPAR's weight / total defined weight is the % SHARE this LPAR will be dispatched
    by PR/SM if it needs CPU service

    2. MSU DEF and ACT - defined and actual LPAR MSUs

    3. CAPPING DEF - the partition's capping option

    4. CAPPING WLM% - % of time WLM capped this LPAR

    5. LPAR MGT - LPAR management overhead

    To minimize LPAR overhead, try to define a ratio no greater than 2 logical CPUs per
    physical CPU. This ratio is calculated by adding the logical CPUs defined in all LPARs and
    dividing this total by the number of available physical CPUs.
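
    Both calculations described above, the weight share and the logical-to-physical CPU ratio, can be sketched as follows. The LPAR names, weights, and CPU counts are hypothetical.

```python
def weight_share_pct(weights: dict, lpar: str) -> float:
    """This LPAR's weight as a % of the total defined weight."""
    return 100 * weights[lpar] / sum(weights.values())


def logical_to_physical_ratio(logical_cpus: dict, physical_cpus: int) -> float:
    """Sum of logical CPUs defined in all LPARs divided by physical CPUs."""
    return sum(logical_cpus.values()) / physical_cpus


# Hypothetical three-LPAR machine with 8 physical CPUs.
weights = {"PROD": 600, "TEST": 300, "DEV": 100}
logical_cpus = {"PROD": 6, "TEST": 4, "DEV": 4}

share = weight_share_pct(weights, "PROD")
ratio = logical_to_physical_ratio(logical_cpus, physical_cpus=8)
print(f"PROD share = {share:.0f}%, logical:physical = {ratio:.2f}")
print("within guideline" if ratio <= 2 else "expect higher LPAR overhead")
```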


    LPAR Cluster Activity

    Summary of attributes and activity of LPARs. Note the 2 PLEXs on this report: this is
    a SYSPLEX-wide example.

    1. TOTAL% LBUSY - logical CPU busy

    2. TOTAL% PBUSY - physical CPU busy


    LPAR Activity from RMF III

    The CPC Capacity report allows online examination of the same information that RMF also
    records and that can be obtained with the post processor reports.

    1. MSU Def - defined capacity

    2. MSU Act - actual capacity

    3. Cap Def - defined capacity for variable Workload License Charging (vWLC)

    4. Proc Num - number of logical processors

    5. Logical Util %

    6. Physical Util %


    Processor Speed Issues

    A problem if expected speed is not delivered for your work

    Why?

    Cycle time - often the least reliable indicator

    MSU rate / hour - also not uniformly reliable

    MIPS rates are based on cycle times

    MIPS rates are based on various specific workloads

    IBM's Large Systems Performance Reference (LSPR)
    contains various rates for different types of workloads. For
    the latest results visit:

    http://www-1.ibm.com/servers/eserver/zseries/lspr/zSeries.html

    This is a side topic, but a very important one for what we are covering here.

    For example, if you just upgraded from a 125 MIPS CPU to a 250 MIPS CPU, you might expect
    CPU time to drop by 50%. There are many more reasons than the three major ones identified here
    why such expectations are not met.

    Differences in the profile of workloads can produce a swing of over 25% in the wrong direction for
    your workloads.

    There is not much you can do about this other than doing your homework to confirm that the decreased
    throughput is not due to performance tuning factors that needed adjusting when you changed
    hardware.


    I/O Device Activity - 1

    The DASD Activity report tells us all we need to know about a single volume.

    Possibly unproductive activity to watch out for:

    1. IOSQ TIME - IOS queue

    2. DPB DLY - director port delay

    3. DB DLY - delay due to device busy

    4. PEND TIME - pending

    5. DISC TIME - disconnect

    6. AVG NUMBER ALLOC - reveals how many files were open on the volume. Did you expect
    to be alone?


    I/O Device Activity - 2

    This report reveals where the I/O activity for any single DASD volume originates, and what
    the level of that activity is. This report often surprises. Just when you think you are alone - you
    are NOT.

    1. SMF SYS ID - produces one line for each system touching this volume

    2. % DEV RESV - device reserved by another system

    3. AVG NUMBER ALLOC - average number of files allocated in this interval


    I/O Device Activity - 3

    The device I/O activity delays report shows which devices delay a particular workload, and what
    the chief contributors to these delays are.

    1. DLY % - delay this job experienced

    2. USG % - using %

    3. CON % - connect %

    4. MAIN DELAY VOLUME(S) - % delay contributed by the top 4 volumes


    File Level I/O Details - 1

    From SMF type 42-6 records sorted by I/O rate

    Source: Joel Goldstein, Responsive Systems

    SMF type 42-6 records are the finest tool for analysis of I/O activity for any file component. The
    only downside to this record is that if activity is produced on multiple CPUs, you are best off if
    you merge these records into a single report. This will reveal the different points of view the
    same object produces on different SYSPLEX members.

    1. IO INTENSITY - calculated as the product of

    2. IO RESP - device response time, and

    3. IO COUNT - duhhhhhhh
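
    The I/O intensity ranking described above (response time multiplied by I/O count) might be sketched like this. The FileIo class, the data set names, and the numbers are all hypothetical; real input would come from your own SMF 42-6 reporting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileIo:
    dataset: str       # hypothetical data set name
    io_resp_ms: float  # IO RESP: device response time in msec
    io_count: int      # IO COUNT: number of I/Os in the interval

    @property
    def io_intensity(self) -> float:
        """IO INTENSITY = IO RESP * IO COUNT."""
        return self.io_resp_ms * self.io_count


records = [
    FileIo("HYPOTHET.TS01", 4.0, 12_000),
    FileIo("HYPOTHET.IX01", 12.5, 2_000),
    FileIo("HYPOTHET.TS02", 1.5, 20_000),
]

# Highest intensity first: these are the best tuning candidates.
ranked = sorted(records, key=lambda r: r.io_intensity, reverse=True)
for r in ranked:
    print(f"{r.dataset:<14} {r.io_intensity:>10.0f}")
```

    Note that the slowest file (HYPOTHET.IX01) is not the top candidate here; a moderately slow file with many more I/Os ranks higher.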


    File Level I/O Details - 2

    Source: Joel Goldstein, Responsive Systems

    From the SMF 42-6 records we can produce all the details about the physical I/O activity of any object on
    any volume. This helps us identify where to concentrate tuning activities.

    Such data, fed to some type of modeling tool, can be used to explore "what if" scenarios
    before the effort to make the change is expended.

    This sample of 42-6 reporting shows analysis details possible for an object. It can be viewed as:

    1. Part of all databases, or

    2. Within a single volume


    Source: Chris Baker, IBM

    RMF Workload Measurements

    Put simply, the BTE number is the TOR's point of view of the
    response time, while EXE is everything else.

    We could have drawn another box for an FOR; it would
    be a subset of EXE.

    Transactions that run in multiple regions will have multiple EXE lines.

    DB2 activity is issued from AORs to DB2.


    RMF Workload Activity - 1

                                  W O R K L O A D   A C T I V I T Y
    MVS/ESA     SYSPLEX WSC1         DATE 01/14/1997    INTERVAL 15.00.002    MODE = GOAL
    SP5.2.2     RPT VERSION 1.2.0    TIME 09.29.00
                POLICY ACTIVATION DATE/TIME 01/14/1997 06.50.04

    REPORT BY: POLICY=CICSHARE WORKLOAD=CICSWKLD SERVICE CLASS=CTRAN1 RESOURCE GROUP=*NONE PERIOD=1 IMPORTANCE=HIGHEST

    -TRANSACTIONS--    TRANSACTION TIME    HHH.MM.SS.TTT
    AVG        0.00    ACTUAL              000.00.00.187
    MPL        0.00    QUEUED              000.00.00.187
    ENDED      4363    EXECUTION           000.00.00.000
    END/SEC    4.85    STANDARD DEVIATION  000.00.01.423
    #SWAPS        0
    EXECUTD       0

    ------------------------RESPONSE TIME BREAKDOWN IN PERCENTAGE------------------------- ------STATE------
    SUB  P   TOTAL ACTIVE READY IDLE ---------------------WAITING FOR--------------------- SWITCHED TIME (%)
    TYPE                             LOCK  I/O CONV DIST LOCAL SYSPL REMOT TIMER PROD MISC LOCAL SYSPL REMOT
    CICS BTE   760   27.6  12.0  233  0.2  0.1  0.0  0.0   0.0   0.0   0.0   221 45.8  221   0.0   0.0   0.0

            ---RESPONSE TIME---   EX    PERF
            HH.MM.SS.TTT          VEL   INDX
    GOALS   00.00.00.500 AVG
    ACTUALS 00.00.00.187          N/A    0.4

    ----------RESPONSE TIME DISTRIBUTION----------
    ----TIME----      ---NUMBER TRANSACTIONS---   ----PERCENT----
    HH.MM.SS.TTT      BUCKETS     TOTAL           BUCKETS   TOTAL
    <  00.00.00.250      4109      4109              94.2    94.2
    ...
    >  00.00.01.000        42      4363               1.0     100

    Sample RMF post processor (ERBRMFPP) report with option SYSRPTS(WLMGL(SCPER))

    Source: Chris Baker, IBM

    This is the RMF workload activity report.

    You can make statements like 94% ran in less than a quarter of a second: this is the

    response time distribution from RMF post processor report.

    Also, ABOUT 45.8% of the time was spent in DB2.
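
    The figures quoted in these notes can be cross-checked from the report's own numbers. The sketch below recomputes the share of transactions in the first bucket and the performance index; it assumes the usual definition of the index for a response time goal (actual divided by goal), and the function names are illustrative only.

```python
def percent_within(bucket_count: int, ended: int) -> float:
    """Share of ended transactions that fell into a response time bucket."""
    return 100 * bucket_count / ended


def performance_index(actual_seconds: float, goal_seconds: float) -> float:
    """Assumed definition for response time goals: actual / goal."""
    return actual_seconds / goal_seconds


# Values from the sample report: 4109 of 4363 transactions ended in < 0.250 s;
# actual average 0.187 s against a 0.500 s average response time goal.
pct_fast = round(percent_within(4109, 4363), 1)
perf_indx = round(performance_index(0.187, 0.500), 1)
print(pct_fast, perf_indx)
```

    A performance index below 1 means the goal is being beaten, which matches the PERF INDX of 0.4 printed on the report.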


    Source: Chris Baker, IBM

    RMF Workload Activity - 2

    REPORT BY: POLICY=HPTSPOL1 WORKLOAD=PRODWKLD SERVICE CLASS=CICSHR RESOURCE GROUP=*NONE PERIOD=1 IMPORTANCE=HIGH

    -TRANSACTIONS--    TRANSACTION TIME    HHH.MM.SS.TTT
    AVG        0.00    ACTUAL              000.00.00.114
    MPL        0.00    QUEUED              000.00.00.036
    ENDED       216    EXECUTION           000.00.00.078
    END/SEC    0.24    STANDARD DEVIATION  000.00.00.270
    #SWAPS        0
    EXECUTD     216

    ------------------------RESPONSE TIME BREAKDOWN IN PERCENTAGE------------------------- ------STATE------
    SUB  P   TOTAL ACTIVE READY IDLE ---------------------WAITING FOR--------------------- SWITCHED TIME (%)
    TYPE                             LOCK  I/O CONV DIST LOCAL SYSPL REMOT TIMER PROD MISC LOCAL SYSPL REMOT
    CICS BTE  93.4   10.2   0.0  0.0  0.0  0.0 83.3  0.0   0.0   0.0   0.0   0.0  0.0  0.0  83.3   0.0   0.0
    CICS EXE  67.0   13.2   7.1  0.0  0.0  0.0  0.0  0.0   0.0   0.0   0.0   0.0 46.7  0.0   0.0   0.0   0.0

    This is a sample RMF post processor (ERBRMFPP) output with option SYSRPTS(WLMGL(SCPER))

    Now here is a slightly more detailed view, where you have both BTE and EXE lines.

    How safe are these numbers?

    Well, it is a response time breakdown in percentages.

    How is RMF finding this out? These numbers are vulnerable to all kinds of
    things because they come from CICS Performance Block (PB) sampling.

    If this is not enough, you can get the absolute numbers from CMF.

    PROD column on this report is the % of time CICS thinks this transaction is

    waiting for DB2 activity to complete.
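
    To read the breakdown rows programmatically, the values can be mapped onto column names in header order. This is only a sketch: the column list is an assumption based on the order of the report headers, the WAIT_ and SW_ prefixes are added here to keep the repeated LOCAL/SYSPL/REMOT names distinct, and the two rows are the BTE and EXE values from the sample report.

```python
# Assumed column order, taken from the report header (left to right).
COLUMNS = [
    "TOTAL", "ACTIVE", "READY", "IDLE",
    "LOCK", "I/O", "CONV", "DIST",
    "WAIT_LOCAL", "WAIT_SYSPL", "WAIT_REMOT",
    "TIMER", "PROD", "MISC",
    "SW_LOCAL", "SW_SYSPL", "SW_REMOT",
]


def parse_breakdown(values_text: str) -> dict:
    """Map one row of percentages onto the assumed column names."""
    values = [float(v) for v in values_text.split()]
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} values, got {len(values)}")
    return dict(zip(COLUMNS, values))


bte = parse_breakdown(
    "93.4 10.2 0.0 0.0 0.0 0.0 83.3 0.0 0.0 0.0 0.0 0.0 0.0 0.0 83.3 0.0 0.0")
exe = parse_breakdown(
    "67.0 13.2 7.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 46.7 0.0 0.0 0.0 0.0")

print(f"BTE waiting on conversation: {bte['CONV']}%")
print(f"EXE waiting on PROD (DB2):   {exe['PROD']}%")
```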


    RMF Workload Activity - 3

    Storage delays due to paging and swapping activity.

    DLY% - delay workload experienced

    % Delayed for OTHR includes delays due to VIO, cross-memory address space paging, and

    hiperspace paging in one total.


    CICS Point of View

    Eliminate bottlenecks

    Protect loved ones

    Create throttles

    CICS/DB2 Performance Improvements


    Eliminate Bottlenecks

    Do not hit limit conditions in CICS region

    MAXTASK

    TRANCLASS

    DB2 threads

    Applications enqueues

    MAXTASK is the user-defined maximum number of tasks the region will handle at any one time. You should
    set this just high enough so the region does not hit it. By setting this to a value that causes delays,
    you produce a throttle on the processing capacity, and therefore on the ability to consume resources,
    of this region.

    TRANCLASS is another place where you can create a throttle. This is the recommended way to keep
    certain resource hogs from dominating the entire region.

    DB2 threads should not cause any waits. Waiting for a thread increases the cost per unit of work.

    Application-produced enqueues are an area worthy of a dedicated presentation. Just look for signs
    that this is going on, and try to eliminate or minimize this type of waiting.


    Protect Loved Ones

    Protection can be implemented at many levels:

    WLM and IRD (Intelligent Resource Director) in the SYSPLEX

    WLM Workload relative priorities

    PR/SM LPAR weights

    WLM combined with IRD will shift resources within LPARs in a SYSPLEX to meet service
    objectives.

    WLM relative priorities will ensure that the work defined as most important gets all the available
    resources until the service objectives are met. Only then will lower priority work also get a chance
    to run.

    PR/SM LPAR weights can be static when defined by Operations, and they can be dynamically
    managed by IRD if in WLM goal mode.


    Create Throttles

    Most effective throttles are highest level MPL controls at

    various points:

    Maximum batch initiators

    Maximum tasks within a CICS region

    Transactions restricted by class

    Lower level throttles, or factors that in any way slow down the

    work once it begins, cause wasted resources.

    Just follow these recommendations for a well performing outcome.


    CICS Dispatcher StatisticsDISPATCHER STATISTICS

    Dispatcher Start Date and Time. . . . . . . : 11/24/2002 09:22:44.7563

    Address Space CPU Time. . . . . . . . . . . : 02:11:34.1901

    Address Space SRB Time. . . . . . . . . . . : 00:02:24.3700

    Peak number of dispatcher tasks . . . . . . : 149

    Peak ICV time (msec). . . . . . . . . . . . : 1000

    Peak ICVR time (msec) . . . . . . . . . . . : 150000

    Peak ICVTSD time (msec) . . . . . . . . . . : 250

    Peak PRTYAGE time (msec). . . . . . . . . . : 0

    Peak MRO (QR) Batching (MROBTCH) value. . . : 1

    Number of Excess TCB Scans. . . . . . . . . : 1030792M

    Excess TCB Scans - No TCB Detached. . . . . : 901943M

    Number of Excess TCBs Detached. . . . . . . : 222681M

    Average Excess TCBs Detached per Scan . . . : 0

    Number of CICS TCB MODEs. . . . . . . . . . : 13

    Number of CICS TCB POOLs. . . . . . . . . . : 3

    Notes:

    - Excess TCB scans and detaches increase unproductive overhead.

    - Tune the number of TCBs allocated to minimize overhead.

    CICS dispatcher statistics reveal the effects of excess TCBs allocated in the region.

    Reduce MAXOPENTCBS to reduce the excesses.
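    As a rough illustration of how the sample numbers above can be read, the following Python sketch computes what share of the excess TCB scans detached nothing. It assumes that the "No TCB Detached" count is a subset of the total excess scans and drops the "M" suffix, since only the ratios matter; the variable names are illustrative.

```python
# Values copied from the sample dispatcher statistics (the "M" suffix is dropped).
excess_tcb_scans = 1030792
scans_no_tcb_detached = 901943
excess_tcbs_detached = 222681


def fraction(part: int, whole: int) -> float:
    """Return part/whole, or 0.0 when there is nothing to divide by."""
    return part / whole if whole else 0.0


# Share of scans that found no TCB to detach (assumed to be pure overhead).
unproductive = fraction(scans_no_tcb_detached, excess_tcb_scans)

# Average number of TCBs detached per scan, before any rounding by the report.
detached_per_scan = fraction(excess_tcbs_detached, excess_tcb_scans)

print(f"Scans that detached nothing: {unproductive:.1%}")
print(f"TCBs detached per scan:      {detached_per_scan:.2f}")
```

    A high share of scans that detach nothing is one hint that more TCBs are allowed than the workload needs.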


    CICS DB2 Entry Statistics

    DB2ENTRY STATISTICS - REQUESTS

    DB2Entry  Call     Signon  Partial  Commit  Abort  Single  Thread  Thread  Thread
    Name      Count    Count   Signon   Count   Count  Phase   Reuse   Terms   Waits/Overfl
    AMD2      2730679  24238   8147     0       26     24222   23644   594     0
    MDI       0        0       0        0       0      0       0       0       0
    MDI1      0        0       0        0       0      0       0       0       0
    MDI2      0        0       0        0       0      0       0       0       0
    MNIF      1213     31      4        0       0      31      0       0       31
    MT1010MQ  43872    871     868      30      3      841     0       871     0
    MT4I      2814     22      15       68      0      4       0       0       22

    Note: Many repetitive lines deleted from here

    ______________________________________________________________________________________

    *TOTALS*  2778578  25162   9034     98      29     25098   23644   1465    53

    CICS DB2 Entry statistics show requests by DB2 entry.

    Minimize ABORT COUNT and THREAD WAITS/OVERFL.
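    A minimal Python sketch of how such a report could be screened for the two counters named above. The rows are copied from the sample report, keeping only the columns used; the `Db2Entry` type and the `needs_attention` helper are hypothetical names for illustration, and flagging any value above zero is an assumption rather than an IBM guideline.

```python
from typing import List, NamedTuple


class Db2Entry(NamedTuple):
    name: str
    calls: int
    aborts: int
    thread_reuse: int
    thread_waits: int  # the "Thread Waits/Overfl" column


# Non-zero rows copied from the sample DB2ENTRY report.
ENTRIES = [
    Db2Entry("AMD2", 2730679, 26, 23644, 0),
    Db2Entry("MNIF", 1213, 0, 0, 31),
    Db2Entry("MT1010MQ", 43872, 3, 0, 0),
    Db2Entry("MT4I", 2814, 0, 0, 22),
]


def needs_attention(entry: Db2Entry) -> List[str]:
    """Return the reasons, if any, why this DB2 entry deserves a closer look."""
    reasons = []
    if entry.aborts > 0:
        reasons.append(f"{entry.aborts} aborts")
    if entry.thread_waits > 0:
        reasons.append(f"{entry.thread_waits} thread waits/overflows")
    return reasons


flagged = {e.name: needs_attention(e) for e in ENTRIES if needs_attention(e)}
for name, reasons in flagged.items():
    print(f"{name}: {', '.join(reasons)}")
```

    In this sample, MNIF and MT4I are flagged for thread waits or overflows, while AMD2 and MT1010MQ are flagged for aborts.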


    CICS/DB2 Performance Improvements - 1

    O Save TCB switch costs with DB2 Version 6

    Application needs to be marked as Threadsafe

    CICS APIs used between DB2 calls must be Threadsafe

    O Percentage savings depend on application

    Savings based on the saved task switches and total path length

    O A switch from CICS's QR TCB to another TCB and back is about 4K instructions

    TCB switches for 25 SQLs cost 1 millisecond (ms) CPU on 100 MIPS CPUs

    Source: Geoff Sharman - with thanks to John Burgess, IBM, Hursley

    CTS 2.2 can have significant performance improvements for CICS applications

    making many calls to DB2.

    CICS applications normally run on the CICS QR TCB. When they make a call to DB2, the request is processed on another TCB associated with a DB2 thread. This requires a TCB mode switch. When the request is complete, the application resumes executing back on the QR TCB, and this requires another TCB mode switch.

    With CTS 2.2, if the application is marked as 'Threadsafe', the application code between the DB2 calls can continue running on the same TCB as the DB2 thread instead of switching back to the QR TCB. When this situation is exploited, 2 TCB mode switches can be saved for each DB2 call. This can yield potentially significant CPU savings for these applications.

    Source: Dave Raiman, IBM. As an example, the same application making 100 DB2 calls was run on both CICS 2.1 and CICS 2.2 on our 9672 X37 processor. When run on CICS 2.1, the application used 11.27 ms of CPU; when run on CICS 2.2, it took 8.7 ms of CPU.

    The results show that this particular application made a 22% CPU saving when

    migrated to CICS 2.2.

    Minimum V6 of DB2 required for these savings.
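    The figures quoted above can be checked with a few lines of arithmetic. This Python sketch uses only numbers stated on the slide and in the notes (4K instructions per switch pair, 25 SQL calls, 100 MIPS, and the 11.27 ms and 8.7 ms measurements); the variable names are illustrative.

```python
INSTRUCTIONS_PER_SWITCH_PAIR = 4_000  # QR TCB -> other TCB -> QR TCB, per the slide
SQL_CALLS = 25
CPU_MIPS = 100                        # millions of instructions per second

# CPU time spent purely on TCB switching for 25 SQL calls, in milliseconds.
instructions = SQL_CALLS * INSTRUCTIONS_PER_SWITCH_PAIR
switch_cpu_ms = instructions / (CPU_MIPS * 1_000_000) * 1_000

# Measured CPU per transaction (100 DB2 calls) before and after CICS 2.2.
cpu_before_ms = 11.27
cpu_after_ms = 8.7
saving = (cpu_before_ms - cpu_after_ms) / cpu_before_ms
saved_per_call_us = (cpu_before_ms - cpu_after_ms) / 100 * 1_000

print(f"Switch cost for {SQL_CALLS} SQL calls: {switch_cpu_ms:.2f} ms")
print(f"CPU saving after migration: {saving:.1%}")
print(f"CPU saved per DB2 call: {saved_per_call_us:.1f} microseconds")
```

    The first result reproduces the 1 ms figure from the slide, and the second is consistent with the roughly 22% saving reported for the measured application.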


    CICS/DB2 Performance Improvements - 2

    O How? With CSD program definition

    CONCURRENCY(THREADSAFE)

    O CICS control of all TCBs in region with MAXOPENTCBS in SIT

    If excessive TCBs, CPU time wasted on scanning them

    O DB2CONN TCBLIMIT defines number of L8 TCBs that can

    be connected to DB2

    O See CICS Application Programmer Reference manual

    Appendix L.

    CTS 2.2 and minimum V6 of DB2 required to get this benefit.

    Other than saving some CPU time used by TCB switches, the CICS QR TCB constraint is relieved by more processing running on the L8 TCBs. This can enable the recombining of regions that were split into multiple AORs (Application Owning Regions). Such merges can save valuable system resources: CPU, storage, disk space, I/O activity.

    How do you get this? (1) Automatically: SQL calls will switch to L8 TCBs, and the application will stay there until a non-threadsafe command is encountered. Thus, the SQL processing and some of the application processing are automatically shifted to L8 TCBs.

    (2) Specify in CSD program definition CONCURRENCY(THREADSAFE)

    attribute for such programs. If you lie, the programs will just keep switching and

    you are causing one extra TCB switch with your lying.

    MAXOPENTCBS in a CICS region controls the total of all TCBs within that region. The CICS region should be tuned to have enough of them so that waiting for a TCB is eliminated or minimized. Specifying many more than needed wastes CPU and storage (storage below 16 MB may be critical to you).

    DB2CONN TCBLIMIT specifies a subset of MAXOPENTCBS number to be used

    for L8 TCBs. Again there are statistics to show any wait that might be caused if not

    enough are specified.

    See Appendix L in CICS Application Programmer Reference manual for

    complete list of threadsafe commands. IBM is working on making more commands

    threadsafe ASAP.
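    The switching behavior described above can be approximated with a small Python model. This is a deliberately simplified sketch, not CICS internals: it assumes a task starts on the QR TCB, that every DB2 call runs on an L8 TCB, and that a threadsafe program stays on L8 until a non-threadsafe command is issued. It ignores task start and end processing and the extra switch mentioned above for wrongly marked programs. The command labels and the function name are hypothetical.

```python
from typing import Iterable

SQL = "SQL"  # a DB2 call
TS = "TS"    # a threadsafe CICS command
NTS = "NTS"  # a non-threadsafe CICS command


def count_tcb_switches(commands: Iterable[str], threadsafe: bool) -> int:
    """Count TCB mode switches for one task under the simplified model."""
    tcb = "QR"
    switches = 0
    for cmd in commands:
        if cmd == SQL:
            if tcb != "L8":
                tcb = "L8"  # DB2 requests are processed on an L8 TCB
                switches += 1
            if not threadsafe:
                tcb = "QR"  # quasi-reentrant program returns to QR after each call
                switches += 1
        elif cmd == NTS and tcb == "L8":
            tcb = "QR"      # a non-threadsafe command forces a switch back to QR
            switches += 1
        # Threadsafe commands run on whichever TCB the task is already using.
    return switches


print(count_tcb_switches([SQL] * 100, threadsafe=False))
print(count_tcb_switches([SQL] * 100, threadsafe=True))
print(count_tcb_switches([SQL, NTS] * 100, threadsafe=True))
```

    In this model, a program that issues a non-threadsafe command between every pair of DB2 calls gains nothing from being defined as threadsafe, which is why the commands used between DB2 calls matter as much as the program definition.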


    Pointing in the Right Directions

    3 CPU and 2 I/O examples

    Starved for CPU

    CICS region saturated

    DPMODE = What?

    Buffer Pools

    High I/O Service Time


    Starved for CPU

    Symptoms: CPU time in OK range, but wait for CPU

    is significant part of service time

    Where to look?

    RMF CPU Activity

    RMF LPAR Activity

    RMF Workload Activity

    DB2 trace and class 2 data

    What to look for?

    High LPAR utilization but low % of physical CPUs

    High LPAR management overhead

    High wait times in DB2 trace and accounting data

    Higher priority work causing starvation

    The measurements can be deceptive and cause misdirection of performance enhancing activities.

    CPU starvation is often caused by LPAR weight settings. These values provide a percentage-based distribution to each LPAR. No priority is involved, and the distribution is not enforced until the complex is 100% utilized.

    So, things can be fine as you approach 100% utilization, and then go to unacceptable levels very

    quickly.
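    The percentage-based distribution can be sketched as follows: each LPAR's share of the shared processors is its weight divided by the sum of all weights. The LPAR names and weight values below are hypothetical, and the function name is illustrative; real configurations also involve capping and logical processor counts, which this sketch ignores.

```python
from typing import Dict


def guaranteed_shares(weights: Dict[str, int]) -> Dict[str, float]:
    """Return each LPAR's share of the shared CPU capacity implied by its weight."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("the sum of LPAR weights must be positive")
    return {name: weight / total for name, weight in weights.items()}


# Hypothetical weights for three LPARs on one complex.
weights = {"PROD": 600, "TEST": 300, "DEV": 100}
shares = guaranteed_shares(weights)

for name, share in shares.items():
    print(f"{name}: {share:.0%} of shared CPU when the complex is fully utilized")
```

    Below full utilization an LPAR can use more than its share, which is why the squeeze only becomes visible as the complex approaches 100% busy.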


    CICS Region Saturated - 1

    Symptoms: CPU time in OK range, but wait for

    CPU dispatch is significant part of service time

    Where to look?

    CICS interval dispatcher statistics

    RMF Workload Activity

    What to look for?

    CICS region with CPU utilization of 85% or

    higher of a single CPU within a CEC

    CICS QR TCB utilization > 85%

    This problem is often missed because the symptoms are not tracked.

    Remember to add capture ratio to CPU time reported in the Workload Activity Report.

    See next page for a sample of CICS Dispatcher statistics.


    CICS Region Saturated - 2

    DISPATCHER STATISTICS (Note: Columns 2-5 deleted to improve legibility)

    TCB          MVS       Total Time       Total Time       Total CPU
    Mode  . . .  Waits     in MVS wait      Dispatched       Time / TCB
    QR    . . .  13051397  000-18:18:33.24  000-01:49:46.74  000-01:12:02.27
    RO    . . .  48658     000-20:05:12.28  000-00:02:46.27  000-00:01:00.80
    CO    . . .  0         000-00:00:00.00  000-00:00:00.00  000-00:00:00.00
    SZ    . . .  0         000-00:00:00.00  000-00:00:00.00  000-00:00:00.00
    RP    . . .  0         000-00:00:00.00  000-00:00:00.00  000-00:00:00.00
    FO    . . .  800       000-19:00:52.61  000-00:00:44.05  000-00:00:06.50
    SL    . . .  1         000-00:00:00.00  000-00:00:00.00  000-00:00:00.00
    SO    . . .  2         000-00:00:00.00  000-00:00:00.00  000-00:00:00.00
    S8    . . .  0         000-00:00:00.00  000-00:00:00.00  000-00:00:00.00
    D2    . . .  2419      000-20:18:01.28  000-00:00:03.26  000-00:00:00.43
    L8    . . .  16952578  007-03:07:31.31  000-05:36:18.48  000-01:13:35.37
    H8    . . .  0         000-00:00:00.00  000-00:00:00.00  000-00:00:00.00
    J8    . . .  0         000-00:00:00.00  000-00:00:00.00  000-00:00:00.00

    Total Time Dispatched much greater than Total CPU Time / TCB can be a sign of higher priority work causing CICS delays. If this is the case, then DB2 might either be the cause of these delays, or it may be degraded by the same degree or worse.

    Following is the list of TCBs used within CICS regions:

    QR = Quasi-reentrant (CICS system & applications); CO = Concurrent (VSAM); FO = File Owning (VSAM); RO = Resource Owning; RP = ONC/RPC; SL = Sockets Listener; SO = Sockets; SZ = FEPI; J8 = JavaVM; L8 = Open (used by DB2 Version 6, or later, as of CICS TS V2.2); S8 = Secure Sockets Layer (SSL)
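    The comparison of dispatched time with CPU time can be sketched in Python as follows. The two rows are copied from the sample report, and the `ddd-hh:mm:ss.ss` layout is assumed from how the values are printed; the function names are illustrative. A low ratio is only a hint, because time dispatched on a TCB can include waits other than CPU delay.

```python
def to_seconds(stamp: str) -> float:
    """Convert 'ddd-hh:mm:ss.ss', as printed in the report, to seconds."""
    days, clock = stamp.split("-")
    hours, minutes, seconds = clock.split(":")
    return int(days) * 86400 + int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def cpu_to_dispatch_ratio(dispatched: str, cpu: str) -> float:
    """Fraction of dispatched time during which the TCB mode was using CPU."""
    dispatched_s = to_seconds(dispatched)
    return to_seconds(cpu) / dispatched_s if dispatched_s else 0.0


# (TCB mode, Total Time Dispatched, Total CPU Time / TCB) from the sample report.
ROWS = [
    ("QR", "000-01:49:46.74", "000-01:12:02.27"),
    ("L8", "000-05:36:18.48", "000-01:13:35.37"),
]

ratios = {mode: cpu_to_dispatch_ratio(disp, cpu) for mode, disp, cpu in ROWS}
for mode, ratio in ratios.items():
    print(f"{mode}: CPU time is {ratio:.0%} of dispatched time")
```

    For the sample data, the QR TCB used CPU for about two thirds of its dispatched time, while the L8 TCBs used CPU for less than a quarter of theirs.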


    DPMODE = What?

    Option gone with initially shipped CICS 2.2, now back via maintenance. So what does this prove?

    DPMODE=HIGH

    Best for high volumes with little DB2 use as long as ample CPU capacity is available

    DPMODE=EQUAL (is/was CICS 2.2 default!)

    May(!) provide better performance for non-SQL transactions

    DPMODE=LOW

    Can provide more consistent service in some CPU-constrained situations; otherwise AVOID!

    Important Note: All 3 will work OK if non-CPU constrained in

    a multi-CP complex!

    This is a most difficult one! You may need to figure out via experimentation what works best for your particular workloads.

    The CICS and DB2 bigots will come and say one thing versus the other.

    None of these is the 100% answer for any one situation.

    So clearly the one you should not be using is LOW. On the other hand, it is interesting that I found that in some CPU-constrained situations it's a roller coaster ride with LOW, but that is not a situation that you should be in for your loved workloads.


    Buffer Pools - 1

    [Chart: C2 Elapsed, Wait, and C2 CPU per day for Feb. 5, 6, 7, 8, 9, and 12; vertical axis from 0 to 1.2]

    Increased BP3 from 2000 to 7000

    ITR Increased from 1 TPS to more than 8 TPS

    C2 elapsed decreased by more than 75%

    Source: Joel Goldstein, Responsive Systems

    Buffer Pools Always Matter! Tuning buffer pools is the single most productive area you can work on because:

    - You can make all the changes without dependencies on other staff; less politics is always a good thing.

    - The facts you need can be reported from many sources: RMF, SMF, DB2 stats, traces & accounting.

    - Benefits are easy to track and demonstrate.


    Buffer Pools - 2

    Buffer pool tuning is not a set and forget set of options! Ongoing tuning activities produce the best results!

    Top seven steps for buffer pool tuning:

    1. Isolate catalog in BP0

    2. Isolate sort work

    3. Isolate indexes from table spaces

    4. Isolate good buffer pool candidates from bad ones

    5. Isolate important work's data from the other work

    6. Group objects with similar attributes in the same pool

    7. Adjust buffer pool sizes to get most benefit for least storage used.

    Much was said about this in IDUG's history, so we won't do it again here. It is important to know that this is the most productive way to improve DB2 performance.

    Also, buffer pool tuning is not a set and forget set of options. Due to changes in the system

    hardware and software, various customization options in all DB2 related subsystems, and the

    applications, it must be an ongoing activity.


    High I/O Service Time

    Symptoms: Due to disagreement between two tools, much staff time was wasted

    ISV tool reported Sync IO time ~ 24 milliseconds (ms)

    RMF Average IO time ~ 14 milliseconds

    Where to look?

    RMF DASD Activity

    SMF type 42-6

    DB2 accounting records

    What to look for?

    Basic I/O service time for any volume

    I/O service time for DB2 objects being studied

    Source of all the delays within I/O service time

    Knowing where to look for basic information about disk I/O operations will ensure your success with such tuning activities.

    The SMF 42-6 records are an excellent place to start such an investigation. See the sample on the next page.


    I/O Service Profile from SMF 42 & RMF 74

    Sum of All Activity / CPU / File within the Sysplex

    Total Serv./  % of Current  Cumul. % of
    IO Intensity  Total         Current      DATA  VOL  VOL   FILE  VOL   ALL   RESP  CONN  DISC  PEND  IOSQ
    (mins)        Service       Service      NAME  SER  RATE  UTL   UTL   UTL   MS    MS    MS    MS    MS
    13.71         6.68%         6.68%        A     1    77.6  45.7  48.8  48.8  16.1  3.6   2.3   0.3   10
    6.81          3.32%         10.00%       B     2    40.3  22.7  23.1  23.1  6.5   3.6   2     0.3   0.6
    5.88          2.86%         12.86%       C     3    12.2  19.6  19.8  24.4  23.4  3.5   12.7  1     6.3
    4.89          2.38%         15.24%       D     4    36.4  16.3  23    23    6.7   1.2   3.3   0.3   1.9
    4.65          2.27%         17.51%       E     5    22.4  15.5  15.8  15.8  11.6  4     2.9   0.6   4.1
    4.53          2.21%         19.72%       F     6    29.5  15.1  15.3  15.3  8.6   2.2   2.9   0.3   3.2
    4.29          2.09%         21.81%       G     7    28    14.3  14.5  14.5  7.6   2.3   2.8   0.4   2
    4.29          2.09%         23.90%       H     8    6.4   14.3  14.7  14.7  26.9  2.8   19.5  0.5   4.1
    4.23          2.06%         25.96%       I     9    8.9   14.1  14.2  30.7  24.4  3.8   11.9  3.3   5.4
    4.20          2.05%         28.00%       J     10   8.9   14    14.2  31.2  26.5  3.8   11.9  3.7   7
    4.11          2.00%         30.01%       K     11   36.8  13.7  15.9  15.9  5.2   1.3   2.4   0.3   1.3

    Physical IO Activity Details by Sysplex Member

    VOL  VOL   PCT   RSP   IOQ   CON   DISC  PND   VOL   PCT   RSP   IOQ   CON   DISC  PND   VOL   PCT   RSP   IOQ   CON   DISC  PND
    SER  RT2   BY2   TM2   TM2   TM2   TM2   TM2   RT4   BY4   TM4   TM4   TM4   TM4   TM4   RT5   BY5   TM5   TM5   TM5   TM5   TM5
    1    78.7  48.8  16.5  9.9   3.6   2.6   0.4   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0
    2    0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   40.3  23.1  6.3   0.2   3.7   2.1   0.4
    3    0.0   0.0   67.4  0.0   32.7  7.9   26.9  1.0   2.0   26.7  0.5   3.6   15.3  7.3   12.2  19.8  22.0  4.7   3.6   12.7  1.0
    4    0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   72.5  23.0  5.6   2.1   1.3   1.9   0.4
    5    22.4  15.8  10.9  3.1   4.0   3.1   0.7   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0
    6    29.5  15.3  8.3   2.7   2.2   3.0   0.3   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0
    7    28.0  14.5  7.4   1.9   2.3   2.8   0.4   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0
    8    6.6   14.7  26.6  3.7   3.1   19.2  0.6   0.0   0.0   0.7   0.0   0.4   0.0   0.2   0.0   0.0   0.6   0.0   0.4   0.0   0.3
    9    0.0   0.0   0.0   0.0   0.0   0.0   0.0   5.8   8.4   22.7  2.8   3.8   10.6  5.5   8.9   14.2  23.6  4.4   3.9   12.0  3.4
    10   0.0   0.0   0.0   0.0   0.0   0.0   0.0   5.7   8.6   27.2  5.6   3.9   11.2  6.5   8.9   14.2  24.3  4.6   3.9   12.0  3.7
    11   61.3  15.9  4.0   1.1   1.1   1.5   0.4   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0

    Let's finish with the busiest sample report of them all. This one is produced by SAS code running against the merged results from SMF 42-6 and the RMF 74 records from a five member SYSPLEX. All the object names and volume serials were changed to protect the source of this report.

    Top section reports activity and performance for each object and by volume

    Bottom section merges the individual volumes performance view from each member of the

    SYSPLEX

    Why bother with such analysis? The % of CURRENT TOTAL SERVICE will quickly guide

    you to the most productive object for performance tuning. Saving a little on a very busy object can

    yield great results for the entire system. The other details in this report help us pinpoint the source

    of the I/O activity.
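    As one way to pinpoint the source of the delay within I/O service time, the following Python sketch finds the largest component of response time for a few objects from the top section of the report. It assumes that the last five values in each row are RESP, CONN, DISC, PEND, and IOSQ (in ms, in that order), and that response time is approximately the sum of the other four; the function name is illustrative.

```python
from typing import Dict, Tuple

COMPONENTS = ("CONN", "DISC", "PEND", "IOSQ")

# Data name -> (RESP, CONN, DISC, PEND, IOSQ) in ms, copied from the sample report.
ROWS: Dict[str, Tuple[float, float, float, float, float]] = {
    "A": (16.1, 3.6, 2.3, 0.3, 10.0),
    "C": (23.4, 3.5, 12.7, 1.0, 6.3),
    "H": (26.9, 2.8, 19.5, 0.5, 4.1),
}


def dominant_component(row: Tuple[float, ...]) -> Tuple[str, float]:
    """Return the largest response-time component and its share of RESP."""
    resp, *parts = row
    name, value = max(zip(COMPONENTS, parts), key=lambda pair: pair[1])
    return name, value / resp


for data_name, row in ROWS.items():
    name, share = dominant_component(row)
    print(f"{data_name}: {name} accounts for {share:.0%} of response time")
```

    For these rows, object A is dominated by IOSQ time (queuing within the system), while objects C and H are dominated by disconnect time, which points toward the storage subsystem rather than the host.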


    Speaker: Ivan Gelb 2003, Gelb Information

    Systems Corp. (GIS)

    E-mail: [email protected]

    Phone: 732-303-1333

    www.gelbis.com

    Your questions and comments are

    always welcome.

    Session Title: Who Just Killed My DB2?!

    Session: E8

    TRADEMARKS

    The following are trade or service marks of the IBM Corporation: CICS, CICS TS , CICSPlex,

    DB2, IBM, MVS, OS/390, z/OS, Parallel Sysplex. Any omissions are purely unintended.

    MOAD - MOTHER OF ALL DISCLAIMERS

    All of the information in this document is tried and true. However, this fact alone cannot guarantee

    that you can get the same results at your place and with your skills. In fact, some of this advice can

    be hurtful if it is misused and misunderstood. As with all kinds of analysis, anything you may hear

    or read can be understood and misunderstood in many ways that may seem contradictory to you. In

    this regard, a further and associated contradictory element requires considerable systems analysis

    and trade-off studies to arrive at the structural design, based on rigorous system engineering

    concepts. By combining advice and certain experiences, any fully integrated performance test program is weakly equivalent to any subsystem compatibility testing designed to eschew

    obfuscation. Gelb Information Systems Corporation, Ivan Gelb and any one found anywhere

    assume no responsibility for this information's accuracy, completeness or suitability for any