DB2 Key Performance Metric Descriptions


DB2 OBJECTS & PERFORMANCE

Buffer Pool:

Memory allocated for DB2: the main memory that the database manager uses to cache table and index data pages as they are read from disk or modified. The database manager decides when to bring data from disk into the buffer pool. When old data is no longer being used, it can be written back out to disk.

Table Space:

A logical layer between the database and the physical tables that hold the data; it maps the logical database design to physical storage. There are two types of table spaces:

• (SMS) System Managed Space: the operating system's file system allocates and manages the space where the table is stored.

• (DMS) Database Managed Space: the database manager controls the storage space, in effect acting as a special-purpose file system.

Container:

A container allocates storage for the table space; it is the physical storage. It can be a directory name, a device, or a file name. All database and table data is assigned to table spaces. A single table space can span multiple containers, but each container can belong to ONLY one table space.

Extent:

An extent is a unit of space within a container of a table space. Database objects are stored in pages within DB2, and those pages are grouped into allocation units (extents).

[Figure: Container_0 of a table space, showing its extents and pages]


Page sizes:

Rows of table data are organized into blocks called pages. Four page sizes exist: 4K, 8K, 16K, and 32K.

In a page of table data, about 75 bytes are reserved for DB2; the remainder is used for user data.

As you increase the page size, the following items also increase:

• Maximum number of columns in the table

• Maximum row length

• Maximum table size

Big Block Reads:

A big-block read occurs when several pages (typically an extent) are retrieved in a single request. If the rows needed next are in pages of the extent already retrieved, no further physical I/O is required.

Sequential Pre-fetching:

The ability of the database manager to read pages in advance of those pages being referenced by a query. I/O servers perform the page reads.

Page Cleaning:

As pages are read and modified, they accumulate. Page cleaner tasks write out modified pages to

guarantee availability of buffer pool pages for use by read requests.

Table Descriptor:

The table descriptor provides information about the table, particularly the data definition from the

CREATE TABLE statement that created the object.

Catalog Table Space:

Catalog is where DB2 keeps all its metadata about database objects.

Temporary Table Spaces:

Space for the intermediate tables that DB2 builds while it determines the final result set.

Catalog Cache:

Holds table descriptors for tables, views, and aliases. A descriptor stores information about a table, view, or alias in a condensed internal format. When an SQL statement references a table, a table descriptor is inserted into the cache, so that subsequent SQL statements referencing that same table can use that descriptor and avoid reading from disk.


Package Cache: (see Package)

The package cache hit ratio tells you whether or not the package cache is being used effectively. If the

hit ratio is high (more than 0.8), the cache is performing well. A smaller ratio may indicate that the

package cache should be increased.
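As a rough illustration, the hit ratio can be derived from two counters: how often the cache was searched and how often a section had to be inserted because it was not found. The sketch below assumes the commonly used definition 1 - (inserts / lookups), which corresponds to the DB2 monitor elements usually named pkg_cache_lookups and pkg_cache_inserts. The function name and the sample values are hypothetical.

```python
def package_cache_hit_ratio(lookups: int, inserts: int) -> float:
    """Return the fraction of package cache lookups satisfied from the cache.

    Assumes the usual definition: 1 - (inserts / lookups), where an insert
    means a section was not found and had to be loaded into the cache.
    """
    if lookups <= 0:
        return 0.0  # no lookups yet, so there is nothing to measure
    return 1.0 - (inserts / lookups)


# Hypothetical counter values, for illustration only.
ratio = package_cache_hit_ratio(lookups=1000, inserts=150)
verdict = "ok" if ratio > 0.8 else "consider a larger package cache"
print(f"{ratio:.2f} -> {verdict}")
```

The 0.8 cut-off is the guideline quoted above, not a hard limit.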

The package and section information required for the execution of dynamic and static SQL statements

are placed in the package cache as required. This information is required whenever a dynamic or static

statement is being executed.

The package cache exists at a database level. This means that agents with similar environments can

share the benefits of another agent's work. For static SQL statements, this can mean avoiding catalog

access.

Hash Join:

In a hash join, one table (selected by the optimizer) is scanned and rows are copied into memory buffers

drawn from the sort heap allocation. The memory buffers are divided into partitions based on a hash

code computed from the columns of the join predicates. Rows of the other table involved in the join are

matched to rows from the first table by comparing the hash code. If the hash codes match, the actual

join predicate columns are compared.
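The build and probe steps described above can be sketched in a few lines. This is a simplified in-memory illustration, not DB2's implementation: it ignores sort heap limits, and the function name, table contents, and column names are all made up for the example.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, build_key, probe_key):
    """Join two lists of dicts on equality of build_key and probe_key."""
    # Build phase: scan one table and place its rows into hash partitions.
    partitions = defaultdict(list)
    for row in build_rows:
        partitions[hash(row[build_key])].append(row)

    # Probe phase: hash each row of the other table and compare the actual
    # join columns only for rows whose hash codes matched.
    joined = []
    for row in probe_rows:
        for candidate in partitions.get(hash(row[probe_key]), []):
            if candidate[build_key] == row[probe_key]:
                joined.append({**candidate, **row})
    return joined


# Hypothetical sample tables.
departments = [{"dept_id": 1, "dept": "Sales"}, {"dept_id": 2, "dept": "IT"}]
employees = [
    {"name": "Ann", "dept_ref": 2},
    {"name": "Bob", "dept_ref": 1},
    {"name": "Cy", "dept_ref": 3},  # no matching department
]
result = hash_join(departments, employees, "dept_id", "dept_ref")
```

The final equality check matters because two different key values can produce the same hash code.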

Consider the performance implications of coding your predicates in different ways.

Join Predicate:

A join predicate is applied to identify the records that shall be joined. If the predicate evaluates to True,

then the combined record occurs in the joined table; otherwise, it does not. The join predicate can be

any predicate supported by SQL, for example in WHERE and ON clauses.

A join is a composition of relations; it is a fundamental operation in relational algebra.

DPCs (Deferred Procedure Calls):

Interrupts that run at a lower priority than standard interrupts.

User Mode:

User mode is a restricted processing mode designed for applications, environment subsystems, and

integral subsystems.

Privileged Mode:

Privileged mode is designed for operating system components and allows direct access to hardware and memory.


Split I/O:

A split I/O may result from requesting data in a size that is too large to fit into a single I/O.

Logical & Physical:

• Logical Reads is the number of Logical I/O requests made by DB2 for the physical file (or table).

• Physical Reads is the actual number of Physical I/O operations performed to satisfy the Logical I/O

requests.

The values of physical disk counters are sums of the values of the logical disks (or partitions) into which

they are divided.

MDL (Memory Descriptor List) Read Hits:

Read requests to the file system cache that hit the cache, and therefore do not require disk access in order to provide memory access to the page.

Data Map Hits:

The percentage of data maps in the file system cache that could be resolved without having to retrieve a

page from the disk, because the page was already in physical memory.

Heap:

A logical grouping of memory that fulfills the needs of a particular component. For example, the utility

heap memory is used by DB2 utilities such as backup, restore, and load.

Indexes:

Indexes provide quick access to data and can enforce uniqueness on the rows in the table.

Threshold Trigger:

An event that occurs when the value of a performance variable exceeds or falls below a user-defined

threshold value. The action that occurs as a result of a threshold trigger can be:

• Logging information in an alert log file.

• Displaying information in an alert log window.

• Generating an audio alarm.

• Issuing a message window.

• Invoking a predefined command or program.
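A threshold trigger of this kind can be sketched as a small object that compares a value against user-defined bounds and runs an action when a bound is crossed. The class name, field names, and the list used as an alert log are hypothetical; a real monitoring tool would supply its own equivalents.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ThresholdTrigger:
    """Fires an action when a value leaves the range [low, high]."""
    name: str
    action: Callable[[str], None]
    low: Optional[float] = None    # fire when value < low
    high: Optional[float] = None   # fire when value > high

    def check(self, value: float) -> bool:
        below = self.low is not None and value < self.low
        above = self.high is not None and value > self.high
        if below or above:
            self.action(f"{self.name}: value {value} crossed threshold")
            return True
        return False


alert_log = []  # stands in for an alert log file
trigger = ThresholdTrigger("package cache hit ratio", alert_log.append, low=0.8)
trigger.check(0.92)   # within bounds, nothing logged
trigger.check(0.65)   # below the threshold, one entry logged
```

Passing the action as a callable lets the same trigger log, raise an alarm, or invoke a program.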


A database must have at least one buffer pool, and can have a number of buffer pools depending on the workload characteristics and the database page sizes used.

Using the Hidden Buffer pools:

When the main buffer pools are configured too large, it is possible that they will not fit into the addressable memory space. (We will talk about addressable memory later.) In that case DB2 cannot start the database with them, because a database must have at least one buffer pool. If the database is not started, you cannot connect to the database and change the buffer pool sizes. For this reason, DB2 pre-allocates four small hidden buffer pools, one for each page size. Should the main buffer pools fail to start, DB2 will start the database with the small buffer pools.

Sorting & Memory:

Sorting is required when no index satisfies the requested ordering of fetched rows, or the optimizer

determines that a sort is less expensive than an index scan.

There are two kinds of sorts: private sorts and shared sorts.

• Private sorts take place in an agent's private memory.

• Shared sorts take place in the database's shared memory.

The following formula calculates approximately how much memory the database shared memory set requires:

Database shared memory = (main buffer pools + 4 hidden buffer pools + database heap + utility heap + locklist + package cache + catalog cache) + (number of estore pages * 100 bytes) + approx. 10% overhead
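The formula can be turned into a small calculation. This sketch assumes that all component sizes are given in bytes and that the "approx. 10% overhead" applies to the sum of the other terms, which the formula does not state explicitly. The function name and all sample sizes are hypothetical.

```python
def database_shared_memory(main_bufferpools, hidden_bufferpools, database_heap,
                           utility_heap, locklist, package_cache, catalog_cache,
                           estore_pages=0, overhead=0.10):
    """Approximate database shared memory in bytes.

    All inputs are byte counts except estore_pages, which is a page count.
    Assumption: the overhead fraction is applied to the whole subtotal.
    """
    components = (main_bufferpools + hidden_bufferpools + database_heap
                  + utility_heap + locklist + package_cache + catalog_cache)
    subtotal = components + estore_pages * 100
    return subtotal * (1 + overhead)


MB = 1024 * 1024
# Hypothetical sizes, for illustration only.
total = database_shared_memory(
    main_bufferpools=400 * MB, hidden_bufferpools=1 * MB,
    database_heap=20 * MB, utility_heap=20 * MB,
    locklist=4 * MB, package_cache=8 * MB, catalog_cache=2 * MB,
)
print(f"about {total / MB:.1f} MB")
```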

Agent Private Memory:

Each DB2 agent process needs to acquire memory to perform work. It will use memory to optimize,

build and execute access plans on behalf of the application, to perform sorts, to record cursor

information such as location and state, to gather statistics, etc. Agent private memory is allocated for a

DB2 agent when the agent is assigned as the result of a connect request or a new SQL request in a

parallel environment.

When an agent becomes idle, it retains its agent private memory. This is designed to improve performance, because the agent will have its private memory ready when it is called again. If there are many idle agents and all of them retain their private memory, it is possible that the system runs out of memory. To avoid this, DB2 has a registry variable which limits the amount of memory each idle agent can retain.

Example:

All database requests are serviced by DB2 agents or subagents. For example, when an application connects to a database, a DB2 agent is assigned to it. When the application issues a database request, such as a SQL query, the agent performs all the tasks that are required to complete the query; it works on behalf of the application.

Each agent, or subagent, is considered a DB2 process, and it acquires a certain amount of memory to perform work. This memory is referred to as the agent private memory; it cannot be shared with any other agents.

Boundaries of Private & Shared Memory:

In addition to its own private memory, in which the agent performs its "private" tasks such as private sorts (using the sortheap), the agent also requires database-level resources such as the buffer pools, the locklist, and the log buffers. These resources are found within the database shared memory.

Everything within the database shared memory is shared by all DB2 agents or subagents connected to the same database. This memory set is therefore called shared memory, as opposed to private memory.

Example:

Agent x, connecting to database A, uses the resources within the database shared memory of database A. Now a second agent, agent y, also connects to database A. Agent y will share the database memory of database A with agent x. (Of course, both agent x and agent y have their own agent private memory, which is not shared.)

What is a package?

Basically, a package is a database object that contains optimized SQL. Each SQL statement must go through the DB2 optimizer before it can be executed. The optimizer generates a data access plan (which is used to locate the data when a query is executed), and the access plan is stored in a package. The package itself is stored in the system catalog if the SQL statement is a static statement coded in an application, or in the package cache if the SQL statement is a dynamic statement.

You can view the access plan for one or more SQL statements with Explain. It's possible to run an

embedded SQL application using only the package. When you precompile the application, a package is

created containing access plans for the SQL statements coded in the application.

That package is stored in the database that was used to precompile the application.

You can also precompile an application and have the information needed to build the package stored in an external file (a bind file), which can then be bound to any database you want to use the application with. The binding process generates the package and stores it in the database specified; this is referred to as deferred binding.

A program module that contains embedded dynamic SQL has an associated package and sections, but the sections act only as placeholders for SQL statements that are dynamically prepared.

Package Section or Section:

A section is a compiled form of a SQL statement. Every section corresponds to one statement. An

optimized access plan will be stored in a section.

[Figure: SQL passes through the Optimizer, which produces a Data Access Plan; the plan is stored in a Package in the System Catalog]


Dynamic Statements:

Dynamic SQL allows a programmer or end user to create a SQL statement's specifics at runtime and pass

the statement to the database. The database then returns data into the program variables, which are

bound at SQL runtime.

Static Statements:

A static SQL statement is written and not meant to be changed. Although static SQL statements can be

stored as files ready to be executed later or as stored procedures in the database, static SQL does not

quite offer the flexibility that is allowed with dynamic SQL. THINK: Stored Procedures

The problem with static SQL is that even though numerous queries may be available to the end user,

there is a good chance that none of these "canned" queries will satisfy the users' needs on every

occasion.

Comparing Dynamic vs. Static SQL Statements [1]

An application using dynamic SQL has a higher start-up (or initial) cost per SQL statement due to the

need to compile the SQL statements before using them.

Once compiled, the execution time for dynamic SQL compared to static SQL should be equivalent and,

in some cases, faster due to better access plans being chosen by the optimizer.

[1] http://publib.boulder.ibm.com/infocenter/db2luw/v8/index.jsp?topic=/com.ibm.db2.udb.doc/ad/c0005785.htm


Each time a dynamic statement is executed, the initial compilation cost becomes less of a factor. If

multiple users are running the same dynamic application with the same statements, only the first

application to issue the statement realizes the cost of statement compilation.
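The compile-once behavior can be illustrated with a toy cache keyed by statement text. This is only a sketch of the idea: the class and method names are invented, and the "plan" is a placeholder string rather than a real access plan.

```python
class PackageCacheSketch:
    """Toy statement cache keyed by SQL text; not DB2's real data structure."""

    def __init__(self):
        self._plans = {}
        self.compilations = 0

    def _compile(self, sql_text):
        # Stand-in for the optimizer: a real compiler would build an access plan.
        self.compilations += 1
        return f"plan for <{sql_text}>"

    def get_plan(self, sql_text):
        # Only a cache miss pays the compilation cost.
        if sql_text not in self._plans:
            self._plans[sql_text] = self._compile(sql_text)
        return self._plans[sql_text]


cache = PackageCacheSketch()
statement = "SELECT name FROM staff WHERE dept = ?"
for _user in range(5):          # five users issue the same statement
    cache.get_plan(statement)
cache.get_plan("SELECT COUNT(*) FROM staff")   # a different statement
```

Five executions of the same text cost one compilation here; only the second, different statement triggers another.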

Differences between Static & Dynamic:

Dynamic SQL is often used by ad hoc query tools, which allow a SQL statement to be created on-the-fly

by a user to satisfy the particular query requirements for that particular situation. After the statement

is customized according to the user's needs, the statement is sent to the database, checked for syntax

errors and privileges required to execute the statement, and compiled in the database where the

statement is carried out by the database server.

Although dynamic SQL provides more flexibility for the end user's query needs, the performance may

not compare to that of a stored procedure whose code has already been analyzed by the SQL

optimizer.

A call-level interface (CLI):

CLI is used to embed SQL code in a host program, such as ANSI C. It is one of the methods that allows a

programmer to embed SQL in different procedural programming languages. When using a call-level

interface, you simply pass the text of a SQL statement into a variable using the rules of the host

programming language.

You can execute the SQL statement in the host program through the use of the variable into which you

passed the SQL text.

Direct SQL:

Direct SQL is where a SQL statement is executed from some form of an interactive terminal. The SQL

results are returned directly to the terminal that issued the statement. Most of this book has focused on

direct SQL. Direct SQL is also referred to as interactive invocation or direct invocation.

Embedded SQL:

Embedded SQL is SQL code used within programs written in other languages, such as Pascal, FORTRAN, COBOL, and C. SQL

code is actually embedded in a host programming language, as discussed previously, with a call-level

interface.

Embedded SQL statements in host programming language codes are commonly preceded by EXEC SQL

and terminated by a semicolon in many cases. Other termination characters include END-EXEC and the

right parenthesis.


Deadlock:

A condition under which a transaction cannot proceed because it is dependent on exclusive resources that are locked by another transaction, which in turn is dependent on exclusive resources that are in use by the original transaction.
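The circular dependency in this definition can be detected by following "waits for" links until a transaction repeats. The sketch below assumes each transaction waits on at most one other transaction; the function name and transaction labels are hypothetical, and it does not represent DB2's own deadlock detector.

```python
def find_deadlock(waits_for):
    """Return a list of transactions forming a cycle, or None.

    waits_for maps each waiting transaction to the transaction that holds
    the exclusive resource it needs (one outstanding wait each).
    """
    for start in waits_for:
        seen = []
        current = start
        # Follow the chain of waits until it ends or revisits a transaction.
        while current in waits_for and current not in seen:
            seen.append(current)
            current = waits_for[current]
        if current in seen:
            return seen[seen.index(current):]
    return None


deadlocked = find_deadlock({"T1": "T2", "T2": "T1"})
not_deadlocked = find_deadlock({"T1": "T2", "T2": "T3"})
```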

Fenced vs. Not Fenced Resources:

A fenced resource executes in a separate process from the database agent. A not-fenced resource executes in the same process as the database agent.

[Figure: File Buffer Cache, showing a system buffer cache hit]

Buffer Cache / Hit Ratio / Read & Write Request

On a file read request, the file system first attempts to read the requested data from the buffer cache. If

the data is not already present in the buffer cache, it is read from disk and cached in the buffer cache.

Similarly, writes to a file are cached so that future reads can be satisfied without necessitating a disk

access, and to reduce the frequency of disk writes. The use of a file system buffer cache can be

extremely effective when the cache hit rate is high. It also enables the use of sequential read-ahead and

write-behind policies to reduce the frequency of physical disk I/Os.
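The read path described above (check the cache first, go to disk only on a miss) can be sketched as a small read-through cache. Least-recently-used eviction is an assumption made for this example, since the text does not name a replacement policy, and the class, the fake disk function, and the block numbers are hypothetical.

```python
from collections import OrderedDict


class BufferCache:
    """Toy read-through block cache with least-recently-used eviction."""

    def __init__(self, capacity, read_from_disk):
        self.capacity = capacity
        self.read_from_disk = read_from_disk   # callable: block number -> data
        self.blocks = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, block_no):
        if block_no in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(block_no)      # mark as recently used
            return self.blocks[block_no]
        self.misses += 1
        data = self.read_from_disk(block_no)       # physical I/O on a miss
        self.blocks[block_no] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)        # evict the oldest block
        return data

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


disk_reads = []


def fake_disk(block_no):
    disk_reads.append(block_no)
    return f"block-{block_no}"


cache = BufferCache(capacity=2, read_from_disk=fake_disk)
for block in [1, 2, 1, 3, 1, 2]:
    cache.read(block)
```

Only the misses reach the fake disk, which is the saving a high hit rate provides.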

Another benefit is in making file writes asynchronous, since the application can continue execution

without waiting for the disk write to complete. Figure 3 shows the sequence of actions for a write

request under cached I/O.

Note:

While the file system buffer cache improves I/O performance, it also consumes a significant portion of

system memory.


Dynamic SQL Processing

[Figure: a dynamic SQL statement is assembled and completed at runtime, then looked up in the global package cache. If an executable access plan already exists there, the compiler is not invoked. If no executable exists, the compiler (DB optimizer) is invoked to build the access plan, which is then used to reach the data in the table(s).]


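SQL Monitoring Diagram

[Figure: SQL passes through the DB optimizer, which produces an access plan against the table(s) and their data. The package, which is the executable form of the SQL (the access plan), is stored in the system catalog tables.]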

Access Paths

How to get the data

What is my strategy?

• Index Usage

• Sort Methods

• Lock Semantics

• Join Methods


Sorting:

Sorting arranges data that must be returned in some sequence or order.

DB2 attempts to perform the ordering via index usage. If an index can’t be used, a sort occurs.

A sort involves:

• Sort phase

o Overflowed – the data being sorted cannot fit entirely in the sort heap, so it overflows into temporary database tables

o Non-overflowed – the data fits in the sort heap; this performs better

• Return of the results of the sort phase

o Piped – the sorted information can be returned directly, without requiring a temp table to store a final, sorted list of data. This is better!

o Non-piped – the results require a temp table in order to be returned

Figure: Non-Piped returns

See SORTHEAP database configuration parameter
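The difference between an overflowed and a non-overflowed sort can be sketched as follows. This is a simplified model, not DB2's sort implementation: the sort heap capacity is expressed in rows rather than pages, in-memory lists stand in for temporary tables, and the function and parameter names are hypothetical.

```python
import heapq


def sort_with_heap_limit(rows, sort_heap_rows):
    """Sort rows, spilling sorted runs when they exceed the sort heap.

    Returns (sorted_rows, overflowed). sort_heap_rows is a hypothetical
    capacity expressed in rows rather than pages.
    """
    if len(rows) <= sort_heap_rows:
        return sorted(rows), False          # non-overflowed: fits in memory

    # Overflowed: sort one heap-sized run at a time. Each run stands in for
    # a temporary table that would be written to a temporary table space.
    temp_runs = [
        sorted(rows[i:i + sort_heap_rows])
        for i in range(0, len(rows), sort_heap_rows)
    ]
    # Merge the sorted runs into the final ordering.
    return list(heapq.merge(*temp_runs)), True


data = [42, 7, 19, 3, 88, 61, 5]
small_result, small_overflow = sort_with_heap_limit(data, sort_heap_rows=10)
big_result, big_overflow = sort_with_heap_limit(data, sort_heap_rows=3)
```

Both calls return the same ordering; the overflowed case simply does extra work writing and merging runs.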

Sort Heap:

Block of memory allocated each time a sort is performed.

[Figure: sort heap, sorted data, and temp tables]


Transactions:

A set of SQL statements (series of instructions) that execute in a single operation. A transaction is

completed when either an explicit COMMIT or ROLLBACK is encountered.
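A short example shows how COMMIT and ROLLBACK end a transaction. Python's built-in sqlite3 module is used here purely as a stand-in database so the example is self-contained; it is not DB2, and the table and values are made up.

```python
import sqlite3

# sqlite3 is used here only as a stand-in database; DB2 is not involved.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100), (2, 50)")
conn.commit()

# Transaction 1: both updates become permanent together at COMMIT.
conn.execute("UPDATE account SET balance = balance - 30 WHERE id = 1")
conn.execute("UPDATE account SET balance = balance + 30 WHERE id = 2")
conn.commit()

# Transaction 2: ROLLBACK discards every change made since the last COMMIT.
conn.execute("UPDATE account SET balance = 0 WHERE id = 1")
conn.rollback()

balances = dict(conn.execute("SELECT id, balance FROM account"))
conn.close()
```

After the rollback, the balances reflect only the committed transfer.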

Internal Commits:

The total number of commits initiated by the database manager

Locks:

The DB2 locking mechanism attempts to avoid resource conflicts while still providing full data integrity. Locks are released when the resource is no longer required, at the end of the transaction.

Lock Escalations:

The number of times that locks have been escalated from several row locks to a table lock.
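The idea of escalation, and of counting it, can be sketched with a toy lock table. The per-table row-lock limit used here is a simplifying assumption made for illustration; it is not how DB2 decides when to escalate, and all names are hypothetical.

```python
class LockManagerSketch:
    """Toy lock table that escalates row locks to a table lock."""

    def __init__(self, max_row_locks_per_table):
        self.max_row_locks = max_row_locks_per_table   # hypothetical limit
        self.row_locks = {}        # table name -> set of locked row ids
        self.table_locks = set()   # tables locked as a whole
        self.lock_escalations = 0  # the counter described above

    def lock_row(self, table, row_id):
        if table in self.table_locks:
            return                 # the table lock already covers every row
        rows = self.row_locks.setdefault(table, set())
        rows.add(row_id)
        if len(rows) > self.max_row_locks:
            # Escalate: trade many row locks for a single table lock.
            del self.row_locks[table]
            self.table_locks.add(table)
            self.lock_escalations += 1


locks = LockManagerSketch(max_row_locks_per_table=3)
for row_id in range(10):
    locks.lock_row("ORDERS", row_id)
locks.lock_row("STAFF", 1)
```

Escalation saves lock memory but reduces concurrency, which is why a rising escalation count is worth watching.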

Logs:

As transactions are processed, they are tracked within the log files. DB2 records all changes made by the statements that are issued in its logs. DB2 uses write-ahead logging to ensure that changes to the database will be applied: changes are written to the logs first, then later applied to the physical database tables.
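The log-first ordering can be sketched with a toy key-value store. In-memory structures stand in for the log files and the table data, and the class and method names are hypothetical; real write-ahead logging also involves forcing log records to disk, which is not modeled here.

```python
class WriteAheadLogSketch:
    """Toy key-value store that logs each change before applying it."""

    def __init__(self):
        self.log = []       # stands in for the log files on disk
        self.table = {}     # stands in for the physical table data
        self.applied = 0    # number of log records already applied

    def update(self, key, value):
        # The change is recorded in the log first; the table is updated later.
        self.log.append((key, value))

    def apply_log(self):
        # Replay the not-yet-applied log records in order.
        for key, value in self.log[self.applied:]:
            self.table[key] = value
        self.applied = len(self.log)


store = WriteAheadLogSketch()
store.update("row1", "a")
store.update("row1", "b")
before = dict(store.table)   # nothing has reached the table yet
store.apply_log()
```

Because every change is in the log before it reaches the table, the table contents can be rebuilt by replaying the log.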


.NET and DB2 Model
