
Best Practices for EMC Symmetrix® 8000 with IBM® DB2® Universal Database™

Karen Sullivan IBM Canada Ltd

IBM Toronto Lab

John Macdonald EMC Corporation

June 2003

Engineering White Paper

EMC Corporation and IBM Corporation


Copyright © 2003 EMC Corporation and IBM Corporation. All rights reserved.

EMC and IBM believe the information in this publication is accurate as of its publication date. The information is subject to change without notice.

THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS IS.” - NEITHER EMC CORPORATION NOR IBM CORPORATION MAKE ANY REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND BOTH SPECIFICALLY DISCLAIM IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

The furnishing of this document does not imply giving license to any IBM or EMC patents.

References in this document to IBM products, Programs, or Services do not imply that IBM intends to make these available in all countries in which IBM operates.

Use, copying, and distribution of any EMC or IBM software described in this publication requires an applicable software license.

IBM, AIX, DB2, DB2 Universal Database, and RS/6000 are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both.



Table of Contents

Introduction
Symmetrix Concepts and Definitions
    Hypervolumes
        Hypervolume Size
    Metavolumes
        Metavolume Size
        Meta Head and Meta Tail
    Types of Metavolumes
        Concatenated Metavolume
        Striped Metavolume
        Mirrored Metavolumes
    Channel Directors
DB2 UDB Concepts and Definitions
    Instances
    Databases
    Database Partitions
        Nodegroups
    Buffer Pools
    Tables
    Table Spaces
        System-Managed versus Database-Managed Table Spaces
        Containers
        Pages
        Extents
        Prefetch Size
    Prefetching
    Page Cleaners
Configuring a Symmetrix System
    Creating Metavolumes
        Metavolume Size
        Hypervolumes versus Physical Disks
        Striped versus Concatenated Metavolumes
        Stripe Size
    Channel Directors
Configuring the Operating System
    Multipathing with PowerPath
    Operating System Logical Volume Striping
Configuring DB2 UDB
    Table Space Container Configurations
        Shared Nothing
        Shared Everything
        JBOD
    Table Space Configuration
        Extent Size
        Prefetch Size
        Overhead and Transfer Rate
    Other Tuning Parameters
        I/O Servers
        DB2_PARALLEL_IO
        Multipage File Allocation
Understanding Existing Systems



Introduction

For every complex problem, there is a solution that is simple, neat, and wrong.

— H. L. Mencken

H. L. Mencken was a journalist whose observation, that simple solutions to complicated problems are not always the right ones, reflects the apprehension people sometimes feel when taking on new, complicated problems. Avoiding the solution that is neat and wrong is never simple; rather, it takes patience, planning, and experience. This paper gives a general overview of IBM DB2 Universal Database® (DB2 UDB) with database partitioning and the EMC Symmetrix® 8000 series (Symmetrix). It also supplies practical recommendations for implementing DB2 UDB for data warehouse applications running on EMC Symmetrix 8000 series storage servers. The information presented in this paper was compiled using DB2 UDB V7.2. However, unless otherwise noted, the concepts and methodologies remain the same for DB2 UDB V8.1.

The paper does not provide complete descriptions for DB2 UDB or the Symmetrix 8000 series products; refer to www.emc.com or www.ibm.com/db2 for additional product information.



Symmetrix Concepts and Definitions

The following is a brief summary of Symmetrix terminology and a discussion of the limits required to understand the contents of this white paper. Note that the configuration and capacity limits are microcode-level dependent and subject to change. For a complete description, review your EMC Symmetrix Product Guide. The limits discussed in this paper are all based on microcode level 5068.

The physical disks within a Symmetrix system can be subdivided into various-sized logical volumes that can be logically joined together again. These logically linked volumes are then presented to a server as an addressable device. Within a Symmetrix system there are two types of logical volumes: hypervolumes and metavolumes. The maximum number of logical volumes is a microcode-dependent value currently set to 8000 volumes.

Hypervolumes

A hypervolume, also referred to as a hyper, is a range of contiguous space on a single physical disk that is defined to be an individually addressable Symmetrix logical volume. Each physical disk can be divided into a maximum of 128 hypervolumes. People familiar with the process of creating hypervolumes will often refer to the process as slicing up the physical disks or creating splits. For clarity, the terms slices and splits will not be used to describe hypervolumes. While hypervolumes can be presented to the server as a directly addressable device, they are also the foundation for creating metavolumes. The major attribute that defines a hypervolume is its size.

Figure 1. A 36 GB Physical Disk Divided into Four Hypervolumes of 9 GB Each

Hypervolume Size

The amount of physical disk space associated with one hypervolume is called the hypervolume size. Hypervolume size is microcode-dependent and currently limited to 15 GB. In Figure 1, a 36 GB physical disk is subdivided into four hypervolumes, each with a hypervolume size of 9 GB.




Metavolumes

Once a physical disk has been divided into hypervolumes, a group of hypervolumes of the same size can then be logically joined across various physical disks to create a metavolume. This newly created logical volume can then be presented to a server as an addressable device. The hypervolumes that make up the newly created metavolume can no longer be presented to the server as separate devices.

Figure 2 provides an example of how four metavolumes can logically reside on four 36 GB physical disks. Note that, for simplicity, only one metavolume is labeled. In the diagram, each of the four physical disks is subdivided into four 9 GB hypervolumes. Each hypervolume is coloured red, yellow, blue, or green. Each metavolume is made up of four like-coloured 9 GB hypervolumes, one on each physical disk. Therefore, there are four metavolumes of 36 GB each in the diagram.

Figure 2. Example Layout of Four Metavolumes across Four Physical Disks

Metavolume Size

A metavolume consists of 2 to 255 hypervolumes. Each time a metavolume is created, the number of hypervolumes it contains must be determined. This value is referred to as the number of hypers per meta for one metavolume. The metavolume size is simply the product of the hypervolume size and the number of hypers per meta.

metavolume size = ( hypers per meta ) * ( hypervolume size )

Formula 1. Metavolume Size

Given the fact that the maximum hypervolume size is 15 GB and the maximum number of hypers per meta is 255, the largest metavolume that can be created is 3825 GB or 3.74 TB (based on the formula given).
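Formula 1 and the limits quoted above can be checked with a few lines of Python. This is only an illustrative sketch: the function name and the constants are hypothetical, and the limits are the microcode 5068 values stated in this paper, which may differ on other microcode levels.

```python
# Limits as stated in the text for microcode level 5068 (assumed, subject to change).
MAX_HYPERVOLUME_GB = 15
MIN_HYPERS_PER_META = 2
MAX_HYPERS_PER_META = 255


def metavolume_size_gb(hypers_per_meta, hypervolume_size_gb):
    """Formula 1: metavolume size = hypers per meta * hypervolume size."""
    if not MIN_HYPERS_PER_META <= hypers_per_meta <= MAX_HYPERS_PER_META:
        raise ValueError("a metavolume consists of 2 to 255 hypervolumes")
    if not 0 < hypervolume_size_gb <= MAX_HYPERVOLUME_GB:
        raise ValueError("hypervolume size is limited to 15 GB")
    return hypers_per_meta * hypervolume_size_gb


# The metavolume of Figure 2: four 9 GB hypervolumes.
print(metavolume_size_gb(4, 9))

# The largest possible metavolume, in GB and in TB (1 TB taken as 1024 GB).
largest_gb = metavolume_size_gb(MAX_HYPERS_PER_META, MAX_HYPERVOLUME_GB)
print(largest_gb, round(largest_gb / 1024, 2))
```

The second call reproduces the 3825 GB (about 3.74 TB) upper bound given above.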




Meta Head and Meta Tail

As the hypervolumes are assigned to a metavolume, they are given a sequence number. The first hypervolume in the sequence is known as the meta head, and the last hypervolume in the sequence is considered the meta tail. All remaining hypervolumes are considered members of the metavolume. Figure 3 shows which hypervolumes are considered the meta head, meta members, and the meta tail.

Figure 3. Relative Positions of a Meta Head, Member, and Tail for an Example Metavolume

Data is always placed on the metavolume across the hypervolumes from meta head to meta tail. Therefore, when data is first written to a metavolume, the first write always takes place on the meta head.

Types of Metavolumes

There are two different types of metavolumes: concatenated metavolumes and striped metavolumes. For both types, the metavolume size is defined in the same way. For example, if you have the same number of hypers per meta (e.g., four) and the same hypervolume size (e.g., 9 GB), then both a concatenated and a striped metavolume will produce a device of the same size (e.g., 36 GB). The difference between concatenated and striped metavolumes is in the method in which the logical data is placed on the underlying hypervolumes. DB2 UDB database data allocation will be discussed in more detail later in the paper.

Concatenated Metavolume

A concatenated metavolume writes data to a hypervolume until the hypervolume size is reached before placing data onto the next hypervolume. Therefore, when first allocating data to the metavolume, the meta head would receive all the data until the hypervolume size is reached. Only after the meta head is full will data be placed onto the next hypervolume.

Figure 4. Logical Data Placement on a Concatenated Metavolume
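The concatenated placement rule can be sketched as a small address calculation. The function below is a hypothetical illustration, not Symmetrix code: it assumes hypervolumes are numbered from 0 (the meta head) to hypers per meta minus 1 (the meta tail) and that offsets are expressed in GB.

```python
def concatenated_location(offset_gb, hypervolume_size_gb, hypers_per_meta):
    """Return (hypervolume index, offset inside that hypervolume).

    Index 0 is the meta head. A hypervolume is filled completely
    before any data is placed on the next one.
    """
    metavolume_size_gb = hypervolume_size_gb * hypers_per_meta
    if not 0 <= offset_gb < metavolume_size_gb:
        raise ValueError("offset lies outside the metavolume")
    index = int(offset_gb // hypervolume_size_gb)
    return index, offset_gb - index * hypervolume_size_gb


# With four 9 GB hypervolumes, the first 9 GB all land on the meta head.
for offset in (0, 8.5, 9, 35):
    print(offset, concatenated_location(offset, 9, 4))
```

In this sketch, every offset below 9 GB maps to the meta head, which is why a lightly filled concatenated metavolume can leave the remaining disks idle.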




Striped Metavolume

In the case of a striped metavolume, data will be placed on the underlying hypervolumes in multiples of Symmetrix cylinders. When an application writes data to the metavolume, the first write will take place on the meta head, and the subsequent write will reside on the next member of the metavolume. Allocation will continue in this fashion until the meta tail is reached. The process is then repeated starting at the meta head once again.

Figure 5. Logical Data Placement on a Striped Metavolume

Stripe Size

The amount of data written to a single hypervolume before allocation moves on to the next hypervolume is known as the stripe size. The size is based on units of disk cylinders, with the default and minimum value being two cylinders. Since a cylinder is 480 KB of data, the minimum stripe size is 960 KB.

Figure 6. Close Up of a Stripe on a Single Hypervolume

Stripe Width

The stripe width is the stripe size times the number of hypers per meta. So, if we have the default stripe size of 960 KB and four hypers per meta, the stripe width would be 3840 KB.

stripe width = ( stripe size ) * ( hypers per meta )

Formula 2. Stripe Width (diagram: stripes of two cylinders each, laid out from meta head to meta tail)
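The stripe width formula and the round-robin placement described for striped metavolumes can be sketched as follows. The function names are hypothetical; the cylinder size and the two-cylinder default are the values stated in this paper, and hypervolumes are assumed to be numbered from 0 (meta head) upward.

```python
CYLINDER_KB = 480                      # cylinder size stated in the text
DEFAULT_STRIPE_KB = 2 * CYLINDER_KB    # default and minimum: two cylinders


def stripe_width_kb(stripe_size_kb, hypers_per_meta):
    """Formula 2: stripe width = stripe size * hypers per meta."""
    return stripe_size_kb * hypers_per_meta


def striped_location(offset_kb, stripe_size_kb, hypers_per_meta):
    """Return (hypervolume index, stripe row) for a logical offset.

    Stripes are dealt out from the meta head (index 0) to the meta
    tail, then allocation wraps around to the meta head again.
    """
    stripe_number = offset_kb // stripe_size_kb
    return stripe_number % hypers_per_meta, stripe_number // hypers_per_meta


print(stripe_width_kb(DEFAULT_STRIPE_KB, 4))
for offset in (0, 960, 1920, 2880, 3840):
    print(offset, striped_location(offset, DEFAULT_STRIPE_KB, 4))
```

Under these assumptions, consecutive 960 KB stripes land on hypervolumes 0, 1, 2, and 3, and the fifth stripe returns to the meta head.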



Mirrored Metavolumes

Any metavolume, whether striped or concatenated, can also be mirrored. This means another complete copy of the metavolume is created and stored on different physical disks using like hypervolumes (the same hypervolume size and the same number of hypers per meta, but different physical disks). Even though mirrored metavolumes require twice as much physical disk space, the metavolume size does not change. The device presented to the server will still be the same size as a nonmirrored metavolume. When a read request takes place against a mirrored metavolume, either hypervolume where the data resides may service the request. When a write request takes place, however, the write must occur on both hypervolumes. To safeguard redundancy, a Symmetrix system ensures that mirrored copies are not created on the same physical disk.

Figure 7. Logical Data Placement on a Mirrored Striped Metavolume

Channel Directors

Host adapters on the Symmetrix system are known as channel directors. This is where the server physically attaches to the storage server via cables. Each card contains a number of fiber, SCSI, or serial ESCON ports.




DB2 UDB Concepts and Definitions

The following is a brief summary of some DB2 UDB V7.2 concepts and terminology required to understand the contents of this white paper. Although some of the terminology has changed for DB2 UDB V8.1, the overall concepts remain the same. For a more in-depth discussion of these and other DB2 UDB terms, refer to the DB2 UDB manuals available online.

For DB2 UDB V7.2:

http://www-3.ibm.com/cgi-bin/db2www/data/db2/udb/winos2unix/support/v7pubs.d2w/en_main

For DB2 UDB V8.1:

http://www-3.ibm.com/cgi-bin/db2www/data/db2/udb/winos2unix/support/v8infocenter.d2w/report?target=mainFrame&fn=c0008880.htm

Instances

An instance, in DB2 UDB, is a logical database manager environment where you can create and/or catalog databases and set various instance-wide configuration parameters. A database manager instance can also be defined as being similar to an image of the actual database manager environment. Furthermore, you can have several instances of the database manager product on the same database server. You can use these instances to separate the development environment from the production environment, tune the database manager to a particular environment, and protect sensitive information from a particular group of people. For a partitioned database environment, all database partitions will reside within a single instance and will share at the instance level a common set of configuration parameters.

Databases

A database is created within an instance. It presents logical data as a collection of database objects (e.g., tables and indexes). Each database includes a set of system catalog tables that describe the logical and physical structure of the data, configuration files containing the parameter values allocated for the database, and one or more recovery logs.

DB2 UDB allows multiple databases to be defined within a single database instance. Configuration parameters can also be set at the database level to tune various characteristics, such as memory usage and logging.

Database Partitions

DB2 UDB allows the user to divide a single database into multiple logical database partitions. Each of these database partitions can look and behave as an independent database. Therefore, multiple database partitions can reside on the same server, and/or database partitions can reside on many servers. They are all part of the same database that is joined through the catalog database partition where the database is actually created. This database partition stores the overall database configuration information. Each database partition also has access to its own set of database-level configuration parameters.

Another term for a database partition is a node. A unique node number identifies each node.

Nodegroups

A nodegroup is a set of one or more database partitions. For nonpartitioned database implementations, there is only one nonconfigurable nodegroup, which is always made up of a single database partition.

Figure 8 shows how five database partitions can be divided into three different nodegroups. As you can see, a database partition can reside within multiple nodegroups. In this example, nodegroup 1 is made up of database partitions 1, 2, 3, and 4. Nodegroup 2 contains only a single database partition, database partition 1, and finally nodegroup 3 comprises database partitions 4 and 5.

Figure 8. One DB2 UDB Database Comprising Five Database Partitions Grouped into Three Nodegroups

Buffer Pools

A buffer pool is the main memory allocated in the host processor to cache table and index data pages as they are being read from disk, or being modified. The purpose of the buffer pool is to improve system performance. Data can be accessed much faster from memory than from disk; therefore, the fewer times the database manager needs to read from or write to disk (I/O) the better the performance. Buffer pools are created by database partitions and each partition can have multiple buffer pools.

Tables

The primary database object is the table. A table is defined as a named data object consisting of a specific number of columns and a varying number of rows. Tables are uniquely identified units of storage maintained within a DB2 table space. They consist of a series of logically linked blocks of storage that have been given the same name. They also have a unique structure for storing information that permits that information to be related to information in other tables.

When creating a table, you can choose to have certain objects, such as indexes, stored separately from the rest of the table data. In order to do this, the table must be defined in a DMS (database-managed space) table space.

Table Spaces

A database is logically organized into table spaces. A table space is a place to store tables. The table space is where the database is defined to use the disk storage subsystem. One method to spread a table space over one or more physical storage devices is to simply specify multiple containers.




There are three main types of user table spaces: regular, temporary, and long. In addition to these user-defined table spaces, DB2 also defines separate system and catalog table spaces. For partitioned database environments, the catalog table space resides on the catalog database partition.

System-Managed versus Database-Managed Table Spaces

For partitioned databases, the table spaces can reside in nodegroups. In the CREATE TABLESPACE statement, the containers themselves are assigned to a specific database partition in the nodegroup, thus maintaining the ‘shared nothing’ character of DB2 UDB. Table spaces can be either system-managed space (SMS) or database-managed space (DMS). For an SMS table space, each container is a directory in the file system, and the operating system’s file manager controls the storage space. For a DMS table space, each container is either a fixed-size pre-allocated file or a physical volume, and the database manager controls the storage space itself.

Containers

A container is an allocation of physical storage. It is a way to define the device that will be made available for storing database objects. Containers may be assigned to file systems by specifying a directory. Such containers are identified as PATH containers and are used with SMS table spaces. Containers may also reference files that reside within a directory. These are identified as FILE containers, and a specific size must be identified. FILE containers are only used with DMS file table spaces. Containers may also reference raw character devices. These containers are used by DMS raw table spaces and are identified as DEVICE containers. Note that the device must already exist on the system before the container can be used. In all cases, containers must be unique and can belong to only one table space.

Pages

Data is transferred to and from devices in discrete blocks, called pages, that are buffered in memory. DB2 UDB supports various page sizes including 4 KB, 8 KB, 16 KB and 32 KB. When an application accesses data randomly, the page size determines the amount of data transferred. In other words, it will correspond to the data transfer request size to the disk array. Page size determines the maximum length of a row, and is associated with the maximum size of a table space. These limits are shown in Table 1. In all cases DB2 UDB limits the number of data rows on a single page to 255 rows.

Table 1. Page Size Limits

Page Size   Max Table Space Size   Max Row Length
4 KB        64 GB                  4005 B
8 KB        128 GB                 8101 B
16 KB       256 GB                 16293 B
32 KB       512 GB                 32677 B
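The limits in Table 1 lend themselves to a simple lookup when choosing a page size. The helper below is a hypothetical planning aid, not a DB2 API: it returns the smallest page size whose Table 1 limits accommodate a given row length and table space size.

```python
# Table 1 limits: page size (KB) -> (max table space size in GB, max row length in bytes)
PAGE_LIMITS = {
    4: (64, 4005),
    8: (128, 8101),
    16: (256, 16293),
    32: (512, 32677),
}


def smallest_page_size_kb(row_length_bytes, table_space_gb):
    """Return the smallest page size (KB) that satisfies both limits, or None."""
    for page_kb in sorted(PAGE_LIMITS):
        max_table_space_gb, max_row_bytes = PAGE_LIMITS[page_kb]
        if row_length_bytes <= max_row_bytes and table_space_gb <= max_table_space_gb:
            return page_kb
    return None  # no supported page size fits these requirements


print(smallest_page_size_kb(4005, 64))    # fits the 4 KB limits exactly
print(smallest_page_size_kb(4006, 10))    # row is one byte too long for 4 KB pages
print(smallest_page_size_kb(100, 200))    # table space size forces a larger page
```

The smallest page size that fits is only a lower bound; other considerations discussed in this paper, such as the I/O request size, may justify a larger page.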

Extents

An extent is the unit at which space is allocated within a container of a table space for a single table space object. This allocation consists of multiple pages. The size of the extent is specified when the table space is created. Note that when data is written to a table space with multiple containers, the data is striped across all containers in extent-sized blocks.
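The extent-sized striping across containers can be illustrated with a short sketch. It assumes a simple round-robin order over equally sized containers, numbered from 0; the function name is hypothetical and the exact placement used by the database manager may differ in detail.

```python
def container_for_page(page_number, extent_size_pages, num_containers):
    """Return the container index that holds a given logical page.

    Assumes extents are assigned to containers in round-robin order:
    extent 0 to container 0, extent 1 to container 1, and so on.
    """
    extent_number = page_number // extent_size_pages
    return extent_number % num_containers


# Extent size of 16 pages over four containers.
for page in (0, 15, 16, 63, 64):
    print(page, container_for_page(page, 16, 4))
```

Under this assumption, a run of 64 consecutive pages touches all four containers once, which is the basis for sizing prefetch requests as a multiple of the extent size.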

Prefetch Size

The number of pages that the database manager will prefetch can be defined for each table space using the PREFETCHSIZE clause with either the CREATE TABLESPACE or ALTER TABLESPACE statements. The value specified is maintained in the PREFETCHSIZE column of the SYSCAT.TABLESPACES system catalog table.



Prefetching

Prefetching is a technique for anticipating data needs and “reading ahead” from storage in large blocks. By transferring data in larger blocks, fewer system resources are expended and less total time is required.

Sequential prefetches read consecutive pages into the buffer pool before they are needed by DB2. List prefetches are more complex. In this case, the DB2 optimizer optimizes the retrieval of randomly located data.

The amount of data being prefetched is part of what determines the amount of parallel I/O activity. Ordinarily the database administrator should define a prefetch value large enough to allow parallel use of all of the available containers, and therefore all of the array’s physical disks.

Consider the following example:

• A table space is defined with a page size of 16 KB using raw DMS.

• The table space is defined across four containers, and each container resides on a separate logical disk, and each logical disk resides on a separate RAID array.

• The extent size is defined as 16 pages (or 256 KB).

• The prefetch value is specified as 64 pages (number of containers x extent size).

Suppose a user issued a query that results in a table space scan, which then results in DB2 performing a prefetch operation. The following would happen:

• DB2 UDB would recognize that this prefetch request for 64 pages (a megabyte) evenly spans four containers, and would issue four parallel I/O requests, one against each of those containers. The request size to each container would be 16 pages, or 256 KB.

• The AIX® Logical Volume Manager would divide the 256 KB request to each AIX logical volume into smaller units (128 KB is the largest), and pass them on to the array as back-to-back requests against each logical disk.

• An array receives a request for 128 KB; if the data is not in cache, four arrays would operate in parallel to retrieve the data.

• After receiving several of these requests, the array would recognize that these DB2 UDB prefetch requests are arriving as sequential accesses, causing the array sequential prefetch to take effect.
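The arithmetic in the example above can be reproduced in a few lines. The function and its parameter names are hypothetical; the 128 KB figure is the largest AIX Logical Volume Manager request size quoted in the example and is treated here as an assumed default.

```python
import math


def prefetch_breakdown(page_kb, containers, extent_pages, lvm_max_request_kb=128):
    """Break one prefetch request down as in the worked example."""
    prefetch_pages = containers * extent_pages        # number of containers x extent size
    prefetch_kb = prefetch_pages * page_kb            # total data requested per prefetch
    per_container_kb = extent_pages * page_kb         # one extent per container
    lvm_requests = math.ceil(per_container_kb / lvm_max_request_kb)
    return {
        "prefetch_pages": prefetch_pages,
        "prefetch_kb": prefetch_kb,
        "parallel_requests": containers,
        "per_container_kb": per_container_kb,
        "lvm_requests_per_container": lvm_requests,
    }


# 16 KB pages, four containers, extent size of 16 pages.
print(prefetch_breakdown(page_kb=16, containers=4, extent_pages=16))
```

For the example configuration this yields a 64-page (1024 KB) prefetch, four parallel 256 KB container requests, and two 128 KB requests per logical volume.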

Page Cleaners

Page cleaners write dirty pages from the buffer pool to disk, reducing the chance that agents looking for victim buffer pool slots in memory will have to incur the cost of writing dirty pages to disk. For example, if you have updated a large amount of data in a table, many data pages in the buffer pool may be updated but not written into disk storage (these pages are called dirty pages). Since agents cannot place fetched data pages into the dirty pages in the buffer pool, these dirty pages must be flushed to disk storage before their buffer pool memory can be used for other data pages.



Configuring a Symmetrix System

Several factors affect database performance and should be considered when configuring a Symmetrix system. Some examples are:

• The size of device required by the database

• The number of physical disk spindles that will service DB2 UDB to provide a device of the size required by the database

• The effect (if any) on the positioning schema of the maximum number of hypervolumes allowed within a Symmetrix system

Creating Metavolumes

When creating metavolumes for a database server, several factors must be considered in order to ensure reasonable performance.

Metavolume Size

A metavolume’s size depends on two factors: hypervolume size and the number of hypers per meta. The value for either of these factors can affect the overall metavolume performance.

Hypervolume Size

If you use a hyper size that is too large, you may not reach the six to ten desired spindles per CPU typically recommended by DB2 UDB for your server. For example, if the hypervolume size is 15 GB and the sought-after metavolume size is 30 GB, then only two physical disks (one per hypervolume) are required. Even if DB2 UDB uses multiple containers created out of these devices, only two physical disks will be servicing the requests. However, if a hypervolume size of 5 GB is used, and all six hypervolumes are placed on different physical disks, then there will be six physical disks servicing the requests.
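The relationship between hypervolume size and the number of spindles behind a metavolume can be sketched as below. The function name is hypothetical, and the calculation assumes every hypervolume of the metavolume is placed on a different physical disk, as in the example above.

```python
import math


def disks_behind_metavolume(metavolume_gb, hypervolume_gb):
    """Number of physical disks servicing a metavolume of the given size,
    assuming one hypervolume per physical disk."""
    return math.ceil(metavolume_gb / hypervolume_gb)


# A 30 GB metavolume built from 15 GB hypers versus 5 GB hypers.
print(disks_behind_metavolume(30, 15))
print(disks_behind_metavolume(30, 5))
```

With 15 GB hypervolumes only two disks service the device; with 5 GB hypervolumes six disks do, which reaches the lower end of the six to ten spindles per CPU mentioned above.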

However, if your hypervolume size is too small, it is possible to reach the maximum number of logical volumes allowed within a Symmetrix system. The equation in Formula 3 describes how to calculate the maximum number of physical disks that can be partitioned before reaching this limit. Formula 3 assumes each metavolume will be created using the same hypervolume size and number of hypers per meta.

maximum # of disks = ( maximum # of volumes - ( maximum # of volumes / hypers per meta ) ) / floor( physical disk size / hypervolume size )

Formula 3. Maximum Number of Disks

The rule of thumb for hypervolume size is 9 GB. This value also divides evenly into common Symmetrix disk sizes.

Hypers per Meta

Although this does not explicitly affect performance, too many hypers per meta may increase the chances of wasting disk space. This can only occur when physical disks are dedicated to the database server. In this case, it is possible to meet the database disk space requirements without fully allocating the underlying physical disks.

If you use too few hypers per meta, you may not exploit the full performance potential of your Symmetrix system, since the underlying disks may not be able to fully parallelize your transactions. The suggested starting point is to create a metavolume using four hypers per meta.

Finally, four hypers per meta combined with a hyper size of 9 GB will produce a 36 GB metavolume. A 36 GB device is typically large enough without becoming unmanageable.
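As a rough check of Formula 3 with the suggested values, the sketch below evaluates it in Python. It assumes Formula 3 reads as: maximum number of disks = (maximum volumes - maximum volumes / hypers per meta) / floor(physical disk size / hypervolume size). Both that reading and the final rounding down to a whole disk are assumptions of this sketch, and the function name is hypothetical.

```python
import math


def max_disks(max_volumes, hypers_per_meta, disk_gb, hypervolume_gb):
    """Evaluate Formula 3 under the reading stated above (an assumption)."""
    # Volumes left for hypervolumes once the metavolumes themselves are counted.
    volumes_for_hypers = max_volumes - max_volumes / hypers_per_meta
    # Whole hypervolumes that fit on one physical disk.
    hypers_per_disk = math.floor(disk_gb / hypervolume_gb)
    # Rounding down to whole disks is a choice made for this sketch.
    return math.floor(volumes_for_hypers / hypers_per_disk)


# 8000 logical volumes (the limit quoted in this paper), four hypers per meta,
# 36 GB disks divided into 9 GB hypervolumes.
print(max_disks(8000, 4, 36, 9))

# The same disks divided into much smaller 1 GB hypervolumes.
print(max_disks(8000, 4, 36, 1))
```

Under these assumptions the 9 GB rule of thumb allows about 1500 physical disks, while 1 GB hypervolumes would exhaust the volume limit after roughly 166 disks.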



Hypervolumes versus Physical Disks

The number of hypervolumes servicing a database system is not necessarily equivalent to the number of physical disks servicing that same system. Although each hypervolume within a metavolume is typically created upon a separate physical disk, those same physical disks can contain other hypervolumes. These hypervolumes can belong to other metavolumes servicing the same database system as well.

Striped versus Concatenated Metavolumes

A metavolume can be created in one of two ways: concatenated or striped. To achieve the best performance, it is recommended that striped metavolumes be used. In the concatenated case, some disk spindles can be left idle, either from lack of data or from data always being found on the same disk. Therefore, your system will not benefit from using all drive heads. Using striped metavolumes will increase the average number of drive heads servicing a request, since it is more likely that the data being retrieved will be found on different underlying hypervolumes.

Figure 9. Logical Data Placement on a Striped Metavolume

Stripe Size

Another consideration when defining metavolumes is stripe size. The minimum, and currently the default, stripe size is 960 KB. This minimum value is based on the size of two disk cylinders. It is possible to set this value higher, but it is recommended that the stripe size be left at the default for most systems. This allows for the highest likelihood of requested data being spread across more than one underlying hypervolume, thus minimizing the chance of a bottleneck on a single resource.
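The effect of stripe size on how widely a request is spread can be estimated with a small calculation. The function below is a hypothetical sketch: it counts the distinct hypervolumes touched by one contiguous request on a striped metavolume, assuming round-robin stripe placement from meta head to meta tail.

```python
def hypers_touched(offset_kb, length_kb, stripe_size_kb=960, hypers_per_meta=4):
    """Count the hypervolumes touched by a contiguous request."""
    first_stripe = offset_kb // stripe_size_kb
    last_stripe = (offset_kb + length_kb - 1) // stripe_size_kb
    # A request can never touch more hypervolumes than the metavolume contains.
    return min(last_stripe - first_stripe + 1, hypers_per_meta)


# A 1 MB prefetch request starting at the beginning of the metavolume.
print(hypers_touched(0, 1024))                        # default 960 KB stripe
print(hypers_touched(0, 1024, stripe_size_kb=1920))   # doubled stripe size
```

In this sketch the 1 MB request spans two hypervolumes at the default stripe size but only one when the stripe size is doubled, which illustrates why the default is recommended for most systems.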

Channel Directors

Another physical performance consideration arises when connecting a Symmetrix system to the database server. The I/O cables should be spread across as many channel directors as possible. Each channel director has a finite throughput limit, so spreading the cables across all available channel directors decreases the likelihood of the channel directors becoming a bottleneck. Figures 10a and 10b contrast the recommended and non-recommended methods for attaching the cables.

Multiple Fibre Channel (FC) connections per physical server provide both performance and redundancy to the overall configuration. With current-generation 2 Gb FC ports, DB2 UDB can be configured with two to four FC ports per physical server per attached Symmetrix system. (It is possible to configure more, but the additional ports would not normally provide additional value.)




Figure 10a. Two I/O Cables Connect to Two Channel Directors (Recommended)

Figure 10b. Two I/O Cables Connect to One Channel Director (Not Recommended)

[Figures 10a and 10b each show a server connected by I/O cables to channel directors on a Symmetrix system.]



Configuring the Operating System

Multipathing with PowerPath

EMC produces a software product, PowerPath®, which enables multipathing for Symmetrix arrays and other storage systems. Multipathing can increase the overall throughput between storage and server by increasing the number of I/O channels available to the server to address a specific device. For more detailed instructions on multipathing with PowerPath, refer to the PowerPath product guide.

There are several load balancing policy settings for PowerPath. Changing the policy can have a large impact on DB2 UDB performance. However, the default policy, Symmetrix Optimization, is generally best. DB2 UDB users on AIX must use PowerPath version 2.1.0 or higher to avoid a known performance defect.

Operating System Logical Volume Striping

For decision support systems, logical volumes at the operating system level should not be striped. The striping at the DB2 UDB container level and on the Symmetrix system is enough to exploit parallelism without compromising overall sequential detection. Additional layers of striping may cause data to be placed in a random order on the underlying physical disks, which could affect when sequential detection occurs. This is less of an issue on systems where the workload is generally random I/O.



Configuring DB2 UDB

Table Space Container Configurations

There are many different ways the devices presented to a server by a Symmetrix system can be allocated for use by DB2 UDB. However, most schemas can be categorized into two major philosophies: shared nothing and shared everything. This discussion assumes each metavolume corresponds to a single file system on the database server.

Shared Nothing

The basic concept behind shared nothing is that resources are isolated for use by specific applications. For DB2 UDB, this usually means isolating physical disks for use by a particular database partition or table space. Therefore, all metavolumes residing on a set of physical disks are used by a single database partition or table space. This must be done carefully, as more than one metavolume can reside on a physical disk. When successful, each database partition or table space will have its own dedicated set of physical disks.

Figure 11 shows an example of isolating physical disks at the database partition level. In this example, there are 16 physical disks. Each set of four physical disks has been divided into four metavolumes, as in Figure 2. Therefore, a total of 16 separate metavolumes can be addressed by a server.

For this example, we want to create two SMS table spaces for an imaginary database that has two database partitions. As Figure 11 shows, the file systems that are mounted on the top eight metavolumes will be assigned to database partition 1, while the file systems on the bottom eight metavolumes will be assigned to database partition 2. Thus, the underlying disks are isolated to be used exclusively by a specific database partition. This particular layout corresponds to the CREATE TABLESPACE statement presented in Figure 12.

Although not highlighted in the example, shared nothing is typically easier to configure and manage, since the creation of numerous additional devices is not usually required. However, in some cases, performance under this configuration may not be optimal. When a system has a limited number of physical disks, sharing all the physical disks among all the DB2 UDB database partitions can yield a performance gain.



Figure 11. 16 Physical Disks Arranged in a Shared Nothing Configuration for DB2 UDB

[Figure 11 diagram labels: Metavolumes 1 through 8 on physical disks, mounted on /node1/meta_volume1 through /node1/meta_volume8; Metavolumes 9 through 12 shown with mount points /node2/meta_volume9 and /node2/meta_volume10.]

    CREATE TABLESPACE My_Tablespace
      PAGESIZE 16K
      MANAGED BY SYSTEM
      USING ('/node1/meta_volume1/My_Tablespace',
             '/node1/meta_volume2/My_Tablespace',
             '/node1/meta_volume3/My_Tablespace',
             '/node1/meta_volume4/My_Tablespace',
             '/node1/meta_volume5/My_Tablespace',
             '/node1/meta_volume6/My_Tablespace',
             '/node1/meta_volume7/My_Tablespace',
             '/node1/meta_volume8/My_Tablespace') ON NODE (1)
      EXTENTSIZE 16
      PREFETCHSIZE 128;

Figure 12. Example CREATE TABLESPACE Statement that Corresponds with Figure 11
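For a layout with many metavolumes, the container lists can be generated rather than typed by hand. The following is a minimal sketch, assuming the /node<N>/meta_volume<M> mount-point naming used in the example and assuming metavolumes 9 through 16 are mounted for database partition 2, as the text describes. The helper names are hypothetical, and the partition 2 paths beyond those visible in the figure labels are assumptions.

```python
def container_paths(node, metavolumes, tablespace="My_Tablespace"):
    """Build one SMS container path per metavolume mount point."""
    return [f"/node{node}/meta_volume{m}/{tablespace}" for m in metavolumes]


def using_clause(node, metavolumes):
    """Format a USING (...) ON NODE (n) clause for one database partition."""
    quoted = [f"'{p}'" for p in container_paths(node, metavolumes)]
    return "  USING (" + ",\n         ".join(quoted) + f") ON NODE ({node})"


# Shared nothing: each partition owns a disjoint block of metavolumes.
# The 9-16 assignment for partition 2 is an assumption based on the text.
layout = {1: range(1, 9), 2: range(9, 17)}

statement = "\n".join(
    ["CREATE TABLESPACE My_Tablespace", "  PAGESIZE 16K", "  MANAGED BY SYSTEM"]
    + [using_clause(node, metas) for node, metas in layout.items()]
    + ["  EXTENTSIZE 16", "  PREFETCHSIZE 128;"]
)
print(statement)
```

Because the two ranges do not overlap, no metavolume (and therefore no set of underlying physical disks) is shared between the partitions.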

Shared Everything

With shared everything, resources are not isolated for use; all applications have access to all the resources. For DB2 UDB, this typically means all database partitions and table spaces will reside on all physical disks. Therefore, each physical disk will need to be addressable by each database partition. This can only be accomplished by creating at least one hypervolume per database partition on every physical disk. In addition, if you are planning on using DMS raw table space containers, a separate metavolume must be created for each table space in each database partition on every physical disk. Note how quickly this design can increase the number of devices that your system administrator must manage. Therefore, the chance of a possible performance gain should be weighed against the extra administrative costs.

Figures 13 and 14 provide an example of creating a shared everything table space on a database with two database partitions. Note that the Symmetrix disk configuration for this example has exactly the same layout as in the previous example for shared nothing (Figure 11). As before, 16 physical disks have been divided into 16 separate metavolumes. The difference is in how the metavolumes are addressed by the database server(s). Look closely at the ordering of the file system names used as containers for the two database partitions in Figure 14. Notice how each database partition in the CREATE TABLESPACE statement has access to each physical disk.



Figure 13. 16 Physical Disks Arranged in a Shared Everything Configuration for DB2 UDB

[Figure 13 diagram labels: physical disks with mount points /node1/meta_volume2, /node2/meta_volume3, /node2/meta_volume4, /node1/meta_volume6, /node2/meta_volume7, /node2/meta_volume8, and /node1/meta_volume9; Metavolumes 9 through 12.]

    CREATE TABLESPACE My_Tablespace PAGESIZE 16K MANAGED BY SYSTEM

    EXTENTSIZE 16 PREFETCHSIZE 128;

Figure 14. Example CREATE TABLESPACE Statement that Corresponds with Figure 13
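The shared everything assignment can be sketched as follows. This is an illustration under stated assumptions, not the exact layout of Figure 13: it assumes each group of four consecutive metavolumes sits on one set of four physical disks (as in the shared nothing example), and that within each group the first two metavolumes go to database partition 1 and the last two to partition 2, a pattern suggested by the mount-point labels that survive from the figure. All function names are hypothetical.

```python
METAS_PER_DISK_SET = 4   # four metavolumes carved from each set of four disks
TOTAL_METAS = 16


def shared_everything_layout(partitions=(1, 2)):
    """Assign metavolumes so every partition gets a share of every disk set."""
    layout = {p: [] for p in partitions}
    share = METAS_PER_DISK_SET // len(partitions)  # metavolumes per partition per set
    for meta in range(1, TOTAL_METAS + 1):
        position = (meta - 1) % METAS_PER_DISK_SET  # position within its disk set
        layout[partitions[position // share]].append(meta)
    return layout


def disk_sets_used(metas):
    """Indexes of the disk sets that a list of metavolumes resides on."""
    return {(m - 1) // METAS_PER_DISK_SET for m in metas}


layout = shared_everything_layout()
for partition, metas in layout.items():
    print(partition, metas, sorted(disk_sets_used(metas)))
```

Under these assumptions, each partition holds eight metavolumes and touches all four disk sets, which is the defining property of the shared everything scheme.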

JBOD

The final way to lay out Symmetrix physical disks is called JBOD (just a bunch of disks). In essence, it is another example of shared nothing, where physical disks are isolated for use. A JBOD schema can only be created without the use of metavolumes. In this case, each hypervolume is presented to the server as a separate addressable device. These devices are then used as containers by various DB2 UDB table spaces. Basically, this is the same as the previous shared nothing schema, without the metavolume layer. When databases are less than 1 TB in size, removing the metavolume layer does provide an additional performance gain. However, if your database is larger, or has a chance of growing past 1 TB, do not use a JBOD schema.

Table Space Configuration

Extent Size

Extent size is usually configured to be the same as the stripe width of the devices on which the table space resides. However, a typical stripe width for the Symmetrix system is 3840 KB (960 KB stripe size * 4 hypers per meta), which is significantly larger than on other comparable systems. Setting the extent size to the stripe width can actually impede performance; instead, the extent size should be configured at around 256 KB.
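The arithmetic behind these numbers can be checked with a short sketch. It assumes EXTENTSIZE is expressed in pages, as in the example CREATE TABLESPACE statements in this paper (PAGESIZE 16K, EXTENTSIZE 16); the function name is hypothetical.

```python
STRIPE_SIZE_KB = 960   # default Symmetrix stripe size
HYPERS_PER_META = 4    # suggested hypers per meta

# Stripe width of a striped metavolume: one stripe on each hyper.
stripe_width_kb = STRIPE_SIZE_KB * HYPERS_PER_META


def extent_size_pages(target_extent_kb, page_size_kb):
    """Convert a target extent size in KB into a page count."""
    return target_extent_kb // page_size_kb


print(stripe_width_kb)             # 3840
print(extent_size_pages(256, 16))  # 16 pages of 16 KB give a 256 KB extent
print(extent_size_pages(256, 4))   # 64 pages of 4 KB give a 256 KB extent
```

With a 16 KB page size, the recommended 256 KB extent corresponds to the EXTENTSIZE 16 used in the example statements.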

Prefetch Size

Prefetch size specifies how much data should be read into the buffer pool on a prefetch data request. Prefetching data can help queries avoid unnecessary page faults. The most efficient prefetch size for a table space is therefore closely linked to its workload, and must be tuned on a per-system basis. However, a good starting point for a Symmetrix-based system is to multiply the number of containers in the table space by its extent size in KB, and then double the result:

prefetchsize (KB) = extentsize (KB) * # of containers * 2

Formula 4. Prefetch Size

This is twice the usual rule of thumb for prefetch size and is linked to the ability of the Symmetrix mirrored metavolumes to fulfill a read request from two separate physical disks.
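As a worked instance of Formula 4, the sketch below computes a starting prefetch size for a table space with eight containers and a 256 KB extent. The function name is hypothetical, and the result is only a starting point for tuning.

```python
def prefetch_size_kb(extent_size_kb, num_containers):
    """Formula 4: extent size (KB) * number of containers * 2."""
    return extent_size_kb * num_containers * 2


extent_kb = 256   # recommended extent size
containers = 8    # e.g., eight metavolume-backed containers
page_kb = 16      # page size of the table space

size_kb = prefetch_size_kb(extent_kb, containers)
print(size_kb)             # 4096 KB
print(size_kb // page_kb)  # 256 pages of 16 KB
```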

Note that prefetchsize is tunable after table space creation. This is not true for extent size and page size. These values are set at table space creation time and cannot be altered without re-defining the table space and re-loading its data.

Overhead and Transfer Rate

Two other parameters that relate to I/O performance can be configured for a table space: overhead and transfer rate. These parameters are used when making optimization decisions, and help determine the relative cost of random versus sequential accesses.

Overhead provides an estimate (in milliseconds) of the time required by the container before any data is read into memory. This overhead activity includes the container's I/O-controller overhead, as well as the disk latency time, which includes the disk seek time.

Transfer rate provides an estimate (in milliseconds) of the time required to read one page of data into memory.



Table 2. Suggested Overhead and Transfer Rate Values

                                    Transfer Rate (ms) by page size
    Disk Capacity     Overhead (ms)  4 KB    8 KB    16 KB   32 KB
    36 GB 10K RPM     8.7            0.1     0.1     0.3     0.6
    50 GB 7200 RPM    11.6           0.1     0.1     0.3     0.6
    73 GB 10K RPM     8.6            0.1     0.1     0.2     0.5
    181 GB 7200 RPM   11.7           0.1     0.1     0.2     0.5
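To see why these two values influence the relative cost of random versus sequential access, consider a deliberately simplified model. This is not the DB2 optimizer's actual cost formula; it is an assumption made for illustration, in which every random page read pays the overhead once, while a sequential read pays it only once for the whole run. The function names are hypothetical; the figures are the 36 GB 10K RPM values from Table 2 for a 16 KB page.

```python
def random_read_ms(pages, overhead_ms, transfer_ms):
    """Simplified model: each random page read pays overhead plus transfer."""
    return pages * (overhead_ms + transfer_ms)


def sequential_read_ms(pages, overhead_ms, transfer_ms):
    """Simplified model: one overhead, then one transfer per page."""
    return overhead_ms + pages * transfer_ms


OVERHEAD_MS = 8.7   # Table 2, 36 GB 10K RPM
TRANSFER_MS = 0.3   # Table 2, 16 KB page

pages = 100
print(round(random_read_ms(pages, OVERHEAD_MS, TRANSFER_MS), 1))
print(round(sequential_read_ms(pages, OVERHEAD_MS, TRANSFER_MS), 1))
```

In this model, the larger the overhead relative to the transfer rate, the more strongly sequential access is favored over random access.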

Other Tuning Parameters

I/O Servers

The number of I/O servers configured for a database can also have a significant impact on performance. I/O servers are used on behalf of the database agents to perform I/O prefetches and asynchronous I/O for utilities such as backup and restore. This value, like prefetch size, depends on overall system workload. However, a good starting point for configuring I/O servers is to count the number of containers in the table space with the most containers, and multiply that number by two.
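The rule of thumb can be expressed as a short calculation. In this sketch, the function name and the container counts are hypothetical; in practice the counts would come from listing the containers of each table space.

```python
def suggested_io_servers(containers_per_tablespace):
    """Twice the container count of the table space with the most containers."""
    return 2 * max(containers_per_tablespace.values())


# Hypothetical container counts per table space.
counts = {"TABLESPACE1": 5, "TABLESPACE2": 8, "USERSPACE1": 1}
print(suggested_io_servers(counts))  # 16
```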

DB2_PARALLEL_IO

It is recommended that DB2_PARALLEL_IO be set to ON for all table spaces using containers created on RAID devices; Symmetrix striped metavolumes fall into this category. DB2_PARALLEL_IO allows multiple reads and writes to occur on a single container, thus increasing throughput.

Multipage File Allocation

In an SMS table space, a file is extended one page at a time as the object grows. If you need improved insert performance, you can consider enabling multipage file allocation. This allows the system to allocate or extend the file by more than one page at a time. You must run db2empfa to enable multipage file allocation. In a partitioned database environment, run this utility on each database partition. Once multipage file allocation is enabled, it cannot be disabled.



Understanding Existing Systems

If your database system already exists, it is still possible to understand how the database system relates to the underlying disks and vice versa. This information can be vital when monitoring a system for performance, and it can also help in isolating bottlenecks. Figure 15 gives a general view of the relationship between a DB2 UDB database and the Symmetrix system used for its storage, for an imaginary system. The following procedure walks through an example set of commands that can be used to determine the layout of a system. The example output corresponds to Figure 15, and the steps follow the diagram from right to left.

Each step lists the procedure, the command (examples are for AIX), and example output (the output corresponds with Figure 15).

1. Determine the number of database partitions.

Command:

    db2 connect to <database_name>      (e.g., my_db)
    db2 list nodes
    db2 connect reset

Example output:

    NODE NUMBER
    ----------------------
    0
    1

    2 record(s) selected.

2. Determine which table spaces reside within a particular database partition and their corresponding table space ID value.

Command:

    export DB2NODE=<database_partition_number>      (e.g., 1)
    db2 connect to <database_name>                  (e.g., my_db)
    db2 list tablespaces

Note: Only the table spaces with IDs 3 and 4 are shown in the diagram.

Example output:

    Tablespaces for Current Database

    Tablespace ID                        = 0
    Name                                 = SYSCATSPACE
    Type                                 = System managed space
    Contents                             = Any data
    State                                = 0x0000
      Detailed explanation:
        Normal

    Tablespace ID                        = 1
    Name                                 = TEMPSPACE1
    Type                                 = System managed space
    Contents                             = System Temporary data
    State                                = 0x0000
      Detailed explanation:
        Normal

    Tablespace ID                        = 2
    Name                                 = USERSPACE1
    Type                                 = System managed space
    Contents                             = Any data
    State                                = 0x0000
      Detailed explanation:
        Normal

    Tablespace ID                        = 3
    Name                                 = TABLESPACE1
    Type                                 = System managed space
    Contents                             = Any data
    State                                = 0x0000
      Detailed explanation:
        Normal

    Tablespace ID                        = 4
    Name                                 = TABLESPACE2
    Type                                 = System managed space
    Contents                             = Any data
    State                                = 0x0000
      Detailed explanation:
        Normal

3. Determine which containers belong to a table space specified using its table space ID.

Command:

    db2 list tablespace containers for <tablespace_id>      (e.g., 3)
    db2 connect reset
    export DB2NODE=

Example output:

    Tablespace Containers for Tablespace 3

    Container ID                         = 0
    Name                                 = /my_fs0/tbspace
    Type                                 = Path

    Container ID                         = 1
    Name                                 = /my_fs1/tbspace
    Type                                 = Path

    Container ID                         = 2
    Name                                 = /my_fs2/tbspace
    Type                                 = Path

    Container ID                         = 3
    Name                                 = /my_fs3/tbspace
    Type                                 = Path

    Container ID                         = 4
    Name                                 = /my_fs4/tbspace
    Type                                 = Path



4. Determine the logical volume in which the container name resides.

Command:

    df <container_name>      (e.g., /my_fs2/tbspace)

Example output:

    Filesystem     512-blocks      Free %Used    Iused %Iused Mounted on
    /dev/lv_myfs2    75497472  45462376   40%    17134     1% /my_fs2

5. Determine the PowerPath physical volumes that make up the logical volume.

Command:

    lslv -l <logical_volume_name>      (e.g., lv_myfs2)

Example output:

    lv_myfs2:/my_fs2
    PV                COPIES        IN BAND       DISTRIBUTION
    hdiskpower0       288:000:000   20%           058:058:058:058:056
    hdiskpower1       288:000:000   20%           058:058:058:058:056
    hdiskpower2       288:000:000   20%           058:058:058:058:056
    hdiskpower3       288:000:000   20%           058:058:058:058:056

6. Determine the Symmetrix volume ID for the meta head.

Command:

    powermt display dev=<hdiskpower>      (e.g., hdiskpower0)

Example output:

    Pseudo name=hdiskpower0
    Symmetrix frame ID=000276901285; volume ID=0174
    state=alive; policy=SymmOpt; priority=0; queued-IOs=0
    ======================================================================
    --------- Host Devices -------- - Symm -  --- Path ----  -- Stats ---
    ### HW-path   device            director  mode    state  q-IOs errors
    ======================================================================
      2 fscsi1    hdisk11           FA  3aA   active  open       0      0
      3 fscsi2    hdisk18           FA  4aA   active  open       0      0
      4 fscsi3    hdisk25           FA 13aA   active  open       0      0
      0 fscsi4    hdisk32           FA 14aA   active  open       0      0
      1 fscsi0    hdisk44           FA 13bA   active  open       0      0

7. Determine the Symmetrix physical disks that make up the metavolume on which the hdiskpower resides.

Command:

    symdev -DA ALL list | head -9
    symdev -DA ALL list | grep "<hdiskpower>"      (e.g., hdiskpower0)

Example output:

    Symmetrix ID: 000276901285

           Device Name            Directors               Device
    ------------------------- ------------------ -------------------------
                                                                       Cap
    Sym  Physical             SA :P DA :IT       Hyper Config         (MB)
    ------------------------- ------------------ -------------------------
    0177 /dev/rhdiskpower0    ???:? 01A:D5       1     2-Way Mir (m)     -
    0176 /dev/rhdiskpower0    ???:? 02A:D5       1     2-Way Mir (m)     -
    0175 /dev/rhdiskpower0    ???:? 15A:D5       1     2-Way Mir (m)     -
    0174 /dev/rhdiskpower0    13B:0 16A:D5       1     2-Way Mir (M) 37125
    0174 /dev/rhdiskpower0    13B:0 01B:C3       1     2-Way Mir (M) 37125
    0175 /dev/rhdiskpower0    ???:? 02B:C3       1     2-Way Mir (m)     -
    0176 /dev/rhdiskpower0    ???:? 15B:C3       1     2-Way Mir (m)     -
    0177 /dev/rhdiskpower0    ???:? 16B:C3       1     2-Way Mir (m)     -

8. Determine which other physical volumes reside on a particular Symmetrix physical disk.

Command:

    symdev -DA ALL list | head -9
    symdev -DA ALL list | grep "<DA :IT> " | \
        sort -k 5                                  (e.g., 01A:D5 )

Example output:

    Symmetrix ID: 000276901285

           Device Name            Directors               Device
    ------------------------- ------------------ -------------------------
                                                                       Cap
    Sym  Physical             SA :P DA :IT       Hyper Config         (MB)
    ------------------------- ------------------ -------------------------
    0177 /dev/rhdiskpower0    ???:? 01A:D5       1     2-Way Mir (m)     -
    0168 /dev/rhdiskpower22   13B:0 01A:D5       2     2-Way Mir (M) 24750
    01A7 /dev/rhdiskpower43   13B:0 01A:D5       3     2-Way Mir (m)  6188
    0198 /dev/rhdiskpower34   13B:0 01A:D5       4     2-Way Mir (M) 20625
    0117 /dev/rhdiskpower1    ???:? 01A:D5       5     2-Way Mir (m)     -
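When many devices are involved, the listing from step 7 can be summarized programmatically. The sketch below is one possible approach, assuming the whitespace-separated column order shown in the example output (Sym, Physical, SA :P, DA :IT, ...). The function name is hypothetical, and the sample rows are copied from the example output above rather than read from a live system.

```python
from collections import defaultdict

# Data rows copied from the example output of step 7.
SAMPLE = """\
0177 /dev/rhdiskpower0 ???:? 01A:D5 1 2-Way Mir (m) -
0176 /dev/rhdiskpower0 ???:? 02A:D5 1 2-Way Mir (m) -
0175 /dev/rhdiskpower0 ???:? 15A:D5 1 2-Way Mir (m) -
0174 /dev/rhdiskpower0 13B:0 16A:D5 1 2-Way Mir (M) 37125
0174 /dev/rhdiskpower0 13B:0 01B:C3 1 2-Way Mir (M) 37125
0175 /dev/rhdiskpower0 ???:? 02B:C3 1 2-Way Mir (m) -
0176 /dev/rhdiskpower0 ???:? 15B:C3 1 2-Way Mir (m) -
0177 /dev/rhdiskpower0 ???:? 16B:C3 1 2-Way Mir (m) -
"""


def disks_by_symdev(listing):
    """Map each Symmetrix device ID to the DA:IT disk addresses it resides on."""
    result = defaultdict(list)
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or header-like lines
        sym, _physical, _sa_port, da_it = fields[:4]
        result[sym].append(da_it)
    return dict(result)


mapping = disks_by_symdev(SAMPLE)
for sym in sorted(mapping):
    print(sym, mapping[sym])
```

For the example output, this shows four Symmetrix devices (the meta head 0174 and three meta members), each mirrored across two disk addresses, for a total of eight physical disks behind hdiskpower0.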



Figure 15. Overall View of an Example Relationship between DB2 UDB and Symmetrix
