
Welcome to my presentation on DB2 Performance Databases

How to create your own DB2 performance database and then how you can use it!

ABSTRACT: Many vendors of DB2 performance analysis tools offer a "performance database" which consists of tables where you store DB2 trace output as processed by the analysis tool. Once loaded, you can use the power of SQL to easily mine this valuable information.


Welcome to my agenda

Manulife is where I work. We are called John Hancock in the USA.

Some shops have been running standard traces for years… but they never look at the data!

You should look at it… it is interesting.

My target audience is DBAs who don't have a PDB but have a DB2 monitor tool which allows the creation of a PDB.

I hope I am not saying things that are TOO obvious… it is obvious to me now. But it wasn't obvious in the beginning.

My goal for this presentation is to give the big picture of a PDB (not reference material… good reference material already exists). I want to help others understand the value of a PDB.

My first DB2 performance database was provided by CA Insight for DB2. It was fine. For reasons beyond my input or control, my shop switched from CA Insight for DB2 to Omegamon for DB2.

It turns out they both have "performance databases" <the names may vary>, but the structure and contents of the performance database tables were remarkably the same.

That makes sense (in hindsight): obviously both are built upon the same input source, the DB2 trace output (via SMF in my case).

Again, lots of good DB2 trace reference information exists. This is a quick summary of what you see after the -DIS TRACE DB2 command.

I found the IBM reference documentation on TRACE and IFCIDs to not be good at explaining the big picture. It is excellent reference documentation, but it took me a while to get to my still not-perfect understanding. I had to put the pieces together myself… I am still putting it together…

I am not telling you what traces to run! Most shops are already running traces. I just want

you to utilize your traces more! And that can be done by putting the trace output in a PDB

Accounting Traces are the best. They provide details on application activity. The accounting trace output is generated when an accounting event occurs. This is normally when a DB2 thread completes its activity (i.e. end of a CICS transaction… or a batch job step). But it is not always 1-to-1, even for CICS… (I have learned this to my surprise). And even more so… distributed DB2 activity (DRDA) is often summarized by DB2 according to the ACCUMACC value in DSNZPARM. This is probably important to set so you don't generate too much accounting output from high volume web applications. We have ACCUMACC set to 10… I am thinking of setting it to 100… What do others do? (I would like to know.)

There is an accounting trace output field that tells you how many distributed activities are summarized into this trace output… it is called PAR_

The differences between the PDB and PWH are explained in many places. It took me a while to really figure it out. A PDB is totally under my control and was relatively easy to set up. No expensive server to purchase or assign responsibility for. Hands-on type DBAs probably gravitate towards building a PDB.

… maybe with more experience I will switch to PWH

This link is probably redundant… it can be easily googled. Make sure you pick the OM XE for DB2 documentation for your particular version! For example… my shop is at V5.2 but V5.3 is already available (with some new PDB tables available… to be mentioned soon).

Reading the PDB reference documentation leads to the obvious question… to FILE or to SAVE. It is a good question.

It took me too long to figure out the difference between SAVE and FILE. Using SAVE produces the most detailed and non-aggregated data. In my world it was OK. Merely millions of records.

>> although I do have an interesting new high volume CICS application that is changing from VSAM to DB2. It will be interesting to see how much new data volume lands in my PDB… I may have to rethink my strategy…

There are many possible PDB tables. The most valuable and immediate accounting tables include DB2PMFACCT_GENERAL and DB2PMFACCT_PROGRAM (*)…

(*) was it a poor decision many years ago to call the package data table DB2PMFACCT_PROGRAM? I think yes… it should be "package" (I think)

The main important statistics table is DB2PM_STAT_GENERAL.

And then there are relatively new bonus tables for dynamic and static SQL statements. Fascinating info here… but it may be a bit expensive to constantly gather (i.e. run these traces)… so think about it.

The reference documentation is not explicit about HOW to build a PDB. What database name to use? Where to put the database?

And tablespaces? They are omitted in the provided DDL. You make your own… it isn't that hard. May as well use UTS PBG (although you could partition).

No indexes are suggested in the reference. Build indexes for your anticipated use. Non-unique indexes are OK.
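To make that concrete, here is a minimal sketch of the kind of DDL I mean. The database, tablespace, and index names are my own inventions for illustration, and the real table DDL comes from the supplied SAMP member… only the "TIMESTAMP" and CORRNAME columns below are real.

  -- A UTS PBG tablespace to hold one PDB table (names are hypothetical)
  CREATE TABLESPACE SPDBACTG IN DPDB
    MAXPARTITIONS 10      -- MAXPARTITIONS makes it partition-by-growth
    LOCKSIZE ANY;

  -- create DB2PMFACCT_GENERAL from the supplied DDL,
  -- adding IN DPDB.SPDBACTG

  -- a non-unique index on the columns I query by most often
  CREATE INDEX IPDBACTG1 ON DB2PMFACCT_GENERAL
    ("TIMESTAMP", CORRNAME);    -- TIMESTAMP must be delimited!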

In hindsight... my indexes are not the best. Not all are used all the time… but it is something.

The accounting tables contain lots of data (depending upon your retention rule). Using your indexes will help your query performance! I keep the index picture handy to remind myself… or I do a quick explain in Data Studio to confirm an index is used.

The indexes I built on the STAT tables.

Really… the STAT tables contain relatively little data (often one row per minute per DB2 subsystem), so even if you scan the whole table sometimes… it is not the end of the world.

The new PDB tables in V5.3 for deadlock/timeout will be useful. I anticipate using these tables to capture trends.

Sample JCL for processing SMF. It was not obvious in the reference documentation how to build the JCL… it was there… but not obvious (to me). Here is my example.

LOAD JCL is easy… it is a simple load (no RI in my PDB). Here is my sample JCL for the load.

You really need to think of a purge strategy. And then implement it as a regular job to purge/maintain.

(*) I have millions of rows every day. It is a lot, but not impossible.
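As a sketch, my idea of a purge is nothing fancier than a daily DELETE with a retention horizon… the 90 days here is just an example, not a recommendation.

  -- hypothetical daily purge step: keep roughly 90 days of detail
  DELETE FROM DB2PMFACCT_GENERAL
  WHERE "TIMESTAMP" < CURRENT TIMESTAMP - 90 DAYS;

A REORG with DISCARD is another way to achieve the same thing if the DELETE volume becomes painful.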

Making a PDB variant is a valuable tool for looking at long term trends!

I built my own (as described above). Very useful to see a popular CICS txn over time… or distributed server activity over time.

For my "OTHER" table, I summarized by batch job (CORRNAME)… including MAINPACK would have helped break out the batch job steps. In hindsight, that would have been a good idea...

After some years of experience… the grouping fields could be refined to be more general… but the above was a good start. In hindsight, if I built them new today they would be slightly different.

The field names are all based upon the source ACCT_GENERAL column names. I do not change the names to make them more obvious… it is best to use what IBM provided, for later reference and for others!

My only new field name is ACTIVITY_LOCAL_DT, which identifies the "date" for all the summarized GENERAL records.

And DB2PMF_ACCT_GENERAL_CNT is at the end of my tables… to tell me how many source GENERAL records went into this BYDAY summary.

>> for CICS and DRDA this CNT will be high… for batch it might be one!
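To make the idea concrete, here is a minimal sketch of how a CICS BYDAY table could be populated. The table name, the CONNECT_TYPE filter, and the CLASS2_CPU_TOTAL column are illustrative assumptions… take the real column names from your ACCT_GENERAL DDL.

  -- hypothetical daily summary insert for a CICS BYDAY table
  INSERT INTO MYPDB.ACCT_CICS_BYDAY
  SELECT DATE("TIMESTAMP")      AS ACTIVITY_LOCAL_DT,
         CORRNAME,                    -- the CICS txn tends to land here
         SUM(CLASS2_CPU_TOTAL)  AS SUM_CPU,  -- placeholder column name
         AVG(CLASS2_CPU_TOTAL)  AS AVG_CPU,
         MAX(CLASS2_CPU_TOTAL)  AS MAX_CPU,
         COUNT(*)               AS DB2PMF_ACCT_GENERAL_CNT
  FROM   DB2PMFACCT_GENERAL
  WHERE  DATE("TIMESTAMP") = CURRENT DATE - 1 DAY
    AND  CONNECT_TYPE = 'CICS'        -- placeholder filter
  GROUP BY DATE("TIMESTAMP"), CORRNAME;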

IBM PDB table column descriptions. You can assume you know the meaning of the field names… but sometimes it is good to check the reference document from IBM.

The quality of these data dictionary tables could be debated. But they do exist. Does anyone else really load this data into the tables? It is also debatable whether it is easier to use these data dictionary tables or to just use the source PDS member in the SAMP libraries… whatever works for you!

I hope the IBM DB2PM people regret creating a DB2 table with a field called "TIMESTAMP". In hindsight, that is not a good name.

REQ_LOCATION is the same as CLIENT_WSNAME if you connect directly to DB2 for z/OS and skip the DB2 Connect server.

Now the magic question! How to use the PDB? What to do with all this wonderful data you have now collected?

Remember… if you don't use it then why bother building it? Do something!

DB2 trace output is powerful information… and it is relatively easy to look at now that it is in the PDB!

Starting with STAT_GENERAL… what is interesting? Lots is interesting! 741 columns of data per interval (one minute).

Example 1 looks at CPU by DB2 address space by day…

Example 2 looks at locks and timeouts by day.
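In the spirit of example 1, a sketch… the interval timestamp and per-address-space CPU column names are placeholders (STAT_GENERAL has its own names for these), and it assumes the stat rows hold per-interval delta values.

  -- hypothetical daily roll-up of DB2 address space CPU
  SELECT DATE(INTERVAL_TSTAMP) AS STAT_DATE,  -- placeholder column
         SUM(MSTR_CPU)         AS MSTR_CPU,   -- placeholder columns
         SUM(DBM1_CPU)         AS DBM1_CPU,
         SUM(IRLM_CPU)         AS IRLM_CPU
  FROM   DB2PM_STAT_GENERAL
  GROUP BY DATE(INTERVAL_TSTAMP)
  ORDER BY 1;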

ACCT_GENERAL is the key accounting application trace output table.

Again… lots of columns of data for every accounting event.

Be careful of TIMESTAMP (did I already say that the column name TIMESTAMP is disappointing? Poor name. Anyway.)

CORRNAME to find a specific job (at least one accounting event per job step! Therefore possibly a few rows for the same job).

Lots of other great columns (zIIP… deadlocks, timeouts… class 1, class 2, class 3).

Automate a daily query to look at recently loaded data. Send an email if the query produces an "interesting" result! (any exception)
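A sketch of such a daily exception query… the deadlock and timeout column names are placeholders, but CORRNAME and "TIMESTAMP" are real (and yes, TIMESTAMP must be delimited).

  -- hypothetical: which jobs hit a deadlock or timeout yesterday?
  SELECT CORRNAME,
         COUNT(*)       AS THREADS,
         SUM(DEADLOCKS) AS DEADLOCKS,   -- placeholder column names
         SUM(TIMEOUTS)  AS TIMEOUTS
  FROM   DB2PMFACCT_GENERAL
  WHERE  DATE("TIMESTAMP") = CURRENT DATE - 1 DAY
  GROUP BY CORRNAME
  HAVING SUM(DEADLOCKS) + SUM(TIMEOUTS) > 0;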

ACCT_PROGRAM

Really about packages, not programs. It bothered me at first. But that is OK now. I have accepted the wrong name.

Interesting columns… CLASS7, CLASS8, SQL_STMTS_ISSUED (total SQL statements! Not broken down).

(SQL counts by SQL statement type are only populated if you have the appropriate trace turned on… I think accounting class(10))… I assume it is a bit expensive… but if it is important then gather it.
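For example, a sketch of a top-packages query… SQL_STMTS_ISSUED is real, but the package name and class 7 CPU columns are placeholders from memory, so check your DDL.

  -- hypothetical: top 20 packages by class 7 CPU yesterday
  SELECT PROGRAM_NAME,                  -- really the package name!
         SUM(CLASS7_CPU_TOTAL) AS CPU,  -- placeholder column name
         SUM(SQL_STMTS_ISSUED) AS SQL_STMTS
  FROM   DB2PMFACCT_PROGRAM
  WHERE  DATE("TIMESTAMP") = CURRENT DATE - 1 DAY
  GROUP BY PROGRAM_NAME
  ORDER BY 2 DESC
  FETCH FIRST 20 ROWS ONLY;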

My favorite: the ACCT_GENERAL BYDAY tables.

This is my attempt to extend the supplied PDB tables into something I use for long term trends.

Here are some examples of how I use my BYDAY tables…
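For instance, a single-transaction trend becomes a trivial query (the table name and SUM_CPU column are my own, as in the sketch earlier; the transaction id is hypothetical).

  -- one popular CICS transaction, day by day
  SELECT ACTIVITY_LOCAL_DT,
         DB2PMF_ACCT_GENERAL_CNT AS THREADS,
         SUM_CPU
  FROM   MYPDB.ACCT_CICS_BYDAY
  WHERE  CORRNAME = 'TXN1'     -- hypothetical transaction id
  ORDER BY ACTIVITY_LOCAL_DT;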

Dynamic SQL statement cache tables are new to my PDB.

But here are examples of some uses.

Who was in the cache a long time? Perhaps the query could be refined to sort by CPU… or execution count? Send an alert in some cases.

Which CICS transactions (from my BYDAY) uses lots of daily cpu?

Again, this could be refined to send a email/alert if necessary… when something odd

shows up!
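As a sketch against my BYDAY table (again, the table and SUM_CPU names are my own):

  -- which CICS transactions burned the most DB2 CPU yesterday?
  SELECT ACTIVITY_LOCAL_DT,
         CORRNAME,                  -- the CICS txn
         SUM_CPU,
         DB2PMF_ACCT_GENERAL_CNT    -- thread count behind the number
  FROM   MYPDB.ACCT_CICS_BYDAY
  WHERE  ACTIVITY_LOCAL_DT = CURRENT DATE - 1 DAY
  ORDER BY SUM_CPU DESC
  FETCH FIRST 10 ROWS ONLY;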

In my BYDAY tables I now have AVG_CPU and MAX_CPU columns at the end of the table… this is good for spotting widely varying performance by day… it helps you learn about the apps.
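A sketch of that variance check (the 10x threshold is arbitrary, and the table name is my own as above):

  -- hypothetical: whose worst thread was far slower than its average?
  SELECT CORRNAME, AVG_CPU, MAX_CPU
  FROM   MYPDB.ACCT_CICS_BYDAY
  WHERE  ACTIVITY_LOCAL_DT = CURRENT DATE - 1 DAY
    AND  MAX_CPU > 10 * AVG_CPU
  ORDER BY MAX_CPU DESC;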

I think an analytics accelerator would be a very interesting tool to use with the PDB. The PDB doesn't have to be too HUGE. So if you can load it into the analytics accelerator then you can query whatever way you want! No index required.

And would IBM Watson be helpful? Interesting idea here…

IBM Data Studio. Really… it is the best thing since sliced bread

Many places exist to find more information on PDB

As a bit of bonus information… here are my thoughts about how to send automated emails from data in the PDB.

If someone really wants to see my JCL to send emails then I can share a copy… it is basic and non-proprietary.

Honestly… this is my first presentation! (second time presenting it) I would appreciate your feedback!

Do you use a PDB today? If yes, then I would love to hear from you and hear about how you use it.

If no, then I would like to hear whether my presentation is inspiring you to consider such a repository. Do you see value in it?

Speaker BIO: Brian Laube has been a DB2 for z/OS application DBA for 18 years (15+ years with Manulife Financial and 3 years with Bell Canada International (now part of CGI)).

As an application DBA, my primary area of interest is DB2 application performance and understanding what is going on inside the application and finding ways to make it more efficient.
