DB2 for z/OS and zIIP & zAAP: The REDIRECT to Success. Timm Zimmermann & Michael Dewert, DB2 for z/OS Development. Session Code: A14. 16 October 2013 – 5:00 to 6:00pm | Platform: DB2 for z/OS


Post on 31-Jan-2022


Page 1: DB2 for z/OS and zIIP zAAP The REDIRECT to Success

DB2 for z/OS and zIIP & zAAP ‐ The REDIRECT to Success

Timm Zimmermann & Michael Dewert
DB2 for z/OS Development

Session Code: A14
16 October 2013 – 5:00 to 6:00pm | Platform: DB2 for z/OS

Page 2: DB2 for z/OS and zIIP zAAP The REDIRECT to Success

History of Specialty Engines

• 1997: Internal Coupling Facility (ICF)
• 2000: Integrated Facility for Linux® (IFL)
• 2004: System z Application Assist Processor (zAAP)
• 2006: IBM System z Integrated Information Processor (zIIP)

Eligible for zAAP:
• Java execution environment
• z/OS XML System Services

Eligible for zIIP:
• DB2 remote access and BI/DW, Utilities, DFSORT, XML Parsing
• ISVs
• IPSec encryption
• z/OS XML System Services
• z/OS Global Mirror (XRC)
• HiperSockets for large messages
• IBM GBS Scalable Architecture for Financial Reporting
• z/OS CIM Server
• zAAP on zIIP

Page 3: DB2 for z/OS and zIIP zAAP The REDIRECT to Success

The big picture

[Figure: z/OS LPARs and a Linux on System z LPAR. Applications (WAS, batch, IMS™, CICS®, QMF™ (TSO)) reach DB2 via DRDA over TCP/IP or HiperSockets. Each box is labeled with the engine it uses: CP, IFL, zAAP, or zIIP (DRDA, parallel query).]

- SAP NetWeaver, WAS on zLinux, HiperSockets
- Using a type 4 driver locally will invoke the TCP/IP stack, so there is overhead there and in going through the DDF address space inbound and outbound
- The specialty engines can be used to improve the cost of ownership, providing a low price for the hardware and not incurring software charges, since they are not general purpose processors. Customers can use all of the engines together with DB2.
- The ICF provides the Coupling Facility for DB2 data sharing with Parallel Sysplex for availability and scalability.
- The IFL can run Linux applications using DB2 Connect over a communication link or HiperSockets to DB2 for z/OS.
- The zAAP can run Java applications, while the zIIP runs part of the DB2 work.

Page 4: DB2 for z/OS and zIIP zAAP The REDIRECT to Success

How can they help me?

• HARDWARE costs
  • Move work from GP to zIIP
  • Higher costs of lower cost processors
  • Possibly postpone an upgrade
• SOFTWARE costs
  • Reduce MSUs, which generally increase with the number of general processors

- 3 reasons you go to zIIP: 1) 2) latent demand help for a box that is pegged 3) postpone an upgrade by utilizing the capacity of the zIIP. The "grand poobah" is a lower monthly SW cost, for which you have to hit the 4 hour rolling average.
- z/VM 5.3 added the ability to virtualize zIIP and zAAP to allow for testing of SW
- Latent demand could mean that discretionary work finally gets some processor cycles
- I know it says zAAP

Page 5: DB2 for z/OS and zIIP zAAP The REDIRECT to Success

Dispatchable Unit (DU)

• To identify and keep track of its work, z/OS represents each unit of work on the system with a control block
• Standard dispatching units (TCBs and SRBs)
  • TCB: runs at the dispatching priority of the address space and is preemptible
  • SRB: runs at supervisor priority and is non-preemptible
• Advanced dispatching units
  • Enclave
    • Serves as an anchor for an address space independent transaction
    • Can consist of multiple tasks (TCBs or SRBs) executed across multiple address spaces
  • Client SRB
    • Similar to an ordinary SRB but runs with the client dispatching priority and is preemptible
  • Enclave SRB
    • Similar to an ordinary SRB but runs with the enclave dispatching priority and is preemptible

- Enclaves vs. address spaces (an enclave must run in an address space): introduced in V4 and MVS 5.2
- Until V10, prefetch and asynchronous I/O were TCB time
- Non-preemptible SRB: RTS, deadlock, suspension, logging, checkpoints
- TCB: archiving, pset open/close, stats
- Preemptible backout for rollback or abort is good because in an n-way system with n rollbacks the system could look hung; now it can be not only interrupted but preempted by higher importance work

Page 6: DB2 for z/OS and zIIP zAAP The REDIRECT to Success

So what is an Enclave?

• An enclave represents a “business unit of work”
• Enclaves are managed separately from the z/OS address space
• Enclaves can include multiple TCBs/SRBs
  • Can span multiple address spaces
  • Can have many enclaves in a single address space
• Assigned by WLM to a service class for prioritization by the system
• DB2 was one of the early exploiters of the concept of enclaves
  • Enclaves provided a mechanism to manage and prioritize DB2 distributed (DDF) workload
  • DB2 sysplex query parallelism
  • DB2 sequential prefetch and deferred write processing (DB2 10)

Page 7: DB2 for z/OS and zIIP zAAP The REDIRECT to Success

zIIP Workflow Characteristics

• Enclave SRBs are used to
  • Schedule preemptible work
  • Associate multiple threads with the same work unit and WLM policy
• DB2 runs work under
  • Tasks
  • Client SRBs
  • Enclave SRBs
• Some DB2 enclave SRB work is zIIP eligible
  • Complex parallel query processing
  • DRDA requests using TCP/IP
  • Index maintenance for some utilities
  • Prefetch engines as of DB2 10
• When DB2 schedules an enclave SRB, it identifies to the z/OS dispatcher which enclave SRBs are zIIP capable.

[Figure: address spaces AS1, AS2 and AS3 with threads T72 and T30 running as TCB, SRB and enclave SRB. WLM policies shown: DDFDEF (RT=85%, IMP LVL=2) and DDFPROD (AVG RT=5s, IMP LVL=2).]

Page 8: DB2 for z/OS and zIIP zAAP The REDIRECT to Success

zIIP Capacity Planning

• zCP3000 study
  • Provided by IBM Techline
• Send in SMF 30s and 70s
• Breakdown of zIIP eligible work
• Overlay 4 hour peak
• See collisions of workload
• Remember to normalize

- It will show things like the 90th percentile, which workloads collide and so on, to determine the minimum number of CPs and zIIPs (this is USAA)
- Remember: if the GPs are kneecapped, you need to normalize utilization on the zIIP
- Note that this is a holistic view and, due to the layering, it is a bit awkward to get specifics. I will show you later how to see what is getting the offload by workload service class in the activity report, which will help the DBAs determine which apps or utility runs are using the zIIP
- Can help predict latent demand and future utilization
- This shows a stacked diagram, so obviously DB2DDF1 is the largest consumer of the zIIP, but we also see that the DDFWHL warehouse workload is adding to it as well. Online and warehouse queries together?
- For probability: 10 CPs 90% busy and 1 zIIP 50% busy; 50% of the time the zIIP is 100% busy, while there is always 1 CP open. This graph shows 50% of the zIIP pool being used, 50% of both

Page 9: DB2 for z/OS and zIIP zAAP The REDIRECT to Success

zIIP Usage

• How many zIIPs do you need?
  • (this scenario: 12:1 ratio of CP to zIIP)
• The red graph (APPL% IIPCP) indicates zIIP eligible work that went to a CP, either because the zIIP is overloaded or because of a local suspense lock
• Must have enough capacity to absorb spikes, not just typical offload
• “Need Help” algorithm ensures work does not pile up waiting on zIIP

- Application jumped to 45% of CP, while the zIIP took a small bump because the GPs were kneecapped and the zIIP was running at full speed
- RMF Monitor 2; z/OS 1.10 put LOCK in the RMF Mon 3 parmlib; z/OS 1.13 added it to Mon 1 (like for GETMAIN)
- This CEC had 12 CPs and 1 zIIP (State Street), so the propensity for zIIP eligible workload to end up on a CP is higher due to the likelihood that the zIIP is busy and a CP is not
- Why didn’t the zIIP jump to 90% utilized? The zIIP help algorithm gets assistance from another GP. Queuing is what we see an example of here: 1 zIIP 100% busy 40% of the time; with 12 GPs, 1 CP only busy 0.2%
- In the picture, if the GPs were kneecapped you would have to know the normalization factor to determine how much CP they are saving
- 4x the amount of transactions came through DDF
- 27.94 CP + 35.49 = 63.43; 55.95% redirect
- For queuing theory: if you have 10 CPs driven at 90% and 1 zIIP at 50%, it is almost guaranteed that a CP will be available, but the zIIP is only available 50% of the time (0.9 to the 10th power)
- Add another zIIP and it will only be busy 1/8 of the time, so a 12.5% chance of hitting one that is busy

Page 10: DB2 for z/OS and zIIP zAAP The REDIRECT to Success

Law of Probability

• Markov’s equation based on 1 server: TW = TS × U / (1 − U)
  • TS = service time, TW = wait time, U = utilization (%)
• As U approaches 100%, TW approaches ∞
• If 12 CPs are utilized at 65%
  • All 12 CPs are busy at the same instant only about 0.5% of the time (0.65^12 ≈ 0.5%)
• So if 1 zIIP is 35% busy, it is instantaneously busy 35% of the time
  • The “needs help” algorithm is likely to overflow zIIP eligible work back to a GCP
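The two calculations on this slide can be tried out with a short sketch. It assumes Python, reads TS as service time and TW as wait time, and treats the engines as independent when estimating how often all of them are busy at once. The function names are illustrative only and are not part of any IBM tool.

```python
def wait_time(service_time: float, utilization: float) -> float:
    """Single-server wait time: TW = TS * U / (1 - U)."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return service_time * utilization / (1.0 - utilization)


def all_busy_probability(utilization: float, engines: int) -> float:
    """Chance that every engine is busy at the same instant.

    Assumption: each engine is busy independently with the same probability.
    """
    return utilization ** engines


if __name__ == "__main__":
    # 12 CPs at 65% versus a single zIIP at 35%
    print(f"all 12 CPs busy: {all_busy_probability(0.65, 12):.2%}")
    print(f"single zIIP busy: {all_busy_probability(0.35, 1):.2%}")
    # Wait time grows without bound as utilization approaches 100%
    for u in (0.35, 0.65, 0.90, 0.99):
        print(f"U={u:.2f}  TW/TS={wait_time(1.0, u):.2f}")
```

With one zIIP there is no second engine to fall back on, which is why the wait-time curve matters much more for the zIIP pool than for the pool of 12 CPs.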

Page 11: DB2 for z/OS and zIIP zAAP The REDIRECT to Success

“Needs Help” Algorithm or zIIP Overflow

• zIIP processors can ask for help from GCPs using the “Needs Help” algorithm
  • Activated by IIPHONORPRIORITY and HIPERDISPATCH in the IEAOPTxx parmlib member
  • The default setting is YES, which is also best practice
• A zIIP asks for help after a period of time (ZIIPAWMT) when more work is coming in (ZIIPMAXQL) and all the zIIPs are busy
  • Default values are
    • ZIIPAWMT = 3200 (3.2 msec)
    • ZIIPMAXQL = 7 (# of DUs waiting for zIIP)
  • Open a PMR and ask Level 2 before adjusting!
• Focus on these parameters especially when zIIP capacity is under-configured
  • zIIPs are assist processors and not intended to be run as hard as GCPs
  • zIIP usage should be in the 30-50% CPU busy range on average (peaks higher)

With the above default settings, and if the zIIP capacity is under-configured:
• DB2 prefetch engines can end up queuing for a zIIP for up to 3.2 msec before they are dispatched on a GCP
• The minimum value when HIPERDISPATCH=YES is ZIIPAWMT=1600, which is still very high
• Of course, this could be much worse if the zIIP processors were not allowed to ask GCPs for help (IIPHONORPRIORITY=NO)
• The correct technical solution is to add more zIIP capacity

Page 12: DB2 for z/OS and zIIP zAAP The REDIRECT to Success


zAAP on zIIP Capability

• A new capability that can enable System z Application Assist Processor (zAAP) eligible workloads to run on System z Integrated Information Processors (zIIPs).

• For customers with no zAAPs and zIIPs

• The combined eligible workloads may make the acquisition of a single zIIP cost effective.

• For customers with only zIIP processors

• Makes Java and z/OS XML System Services-based workloads eligible to run on existing zIIPs, which maximizes the zIIP investment.

• Available on z/OS V1.9, V1.10 and V1.11

• This new capability is not available for z/OS LPARs if zAAPs are installed on the server.

Page 13: DB2 for z/OS and zIIP zAAP The REDIRECT to Success

How to enable the zAAP on zIIP Capability

• The capability ships default enabled with z/OS V1.11.
  • Parameter in SYS1.PARMLIB(IEASYSxx): ZAAPZIIP=YES (default in z/OS V1.11)
  • If you wish to disable the function for any reason, you must IPL with ZAAPZIIP=NO in the IEASYSxx Parmlib member.
  • IFAHONORPRIORITY now based on IIPHONORPRIORITY
• Also available with z/OS V1.9 and V1.10
  • With PTF for APAR OA27495, and
  • Enabled with ZAAPZIIP=YES in the IEASYSxx Parmlib (the default is NO)
• APAR OA38829 for z/OS 1.12 and 1.13
  • Allow zAAP on zIIP even if there is a zAAP installed
• This new capability does not remove the requirement to purchase and maintain the correct zIIP to CP ratio of
  • no more than 2:1 (for zEC12 and zBC12 servers)
  • no more than 1:1 (zEnterprise 196 and zEnterprise 114 and older)
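The ratio rule in the last bullet is simple arithmetic, and a small sketch makes it easy to check a planned configuration. This assumes Python; the dictionary keys are illustrative labels for the server families named on the slide, and the helper names are hypothetical.

```python
# zIIP-to-CP ratio limits as stated on the slide.
# Assumption: the keys are just labels chosen for this sketch.
MAX_ZIIP_PER_CP = {
    "zEC12": 2,
    "zBC12": 2,
    "z196": 1,
    "z114": 1,
}


def max_ziips(general_cps: int, server: str) -> int:
    """Largest zIIP count the ratio rule allows for this many CPs."""
    return general_cps * MAX_ZIIP_PER_CP[server]


def ratio_ok(ziips: int, general_cps: int, server: str) -> bool:
    """True if the planned zIIP count stays within the ratio limit."""
    return ziips <= max_ziips(general_cps, server)


if __name__ == "__main__":
    print(max_ziips(4, "zEC12"))
    print(ratio_ok(5, 4, "z196"))
```

Server families not listed raise a `KeyError` on purpose, so that the sketch never guesses a ratio the slide does not state.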

Page 14: DB2 for z/OS and zIIP zAAP The REDIRECT to Success

DB2 10 and some Re-Directed Work to zIIP

• Distributed DRDA requests (60%)
  • DB2 V10 & z/OS 1.10: before DRDA over TCP/IP -> PM12256
• Query parallelism (~80%)
  • Up to 80% of the child tasks, portion of main task for remote calls
• Prefetch and deferred write processing (100%)
  • Shows up in DBM1 SRB time (roughly 70% of overall DBM1 SRB time)

Refer to APPENDIX for a complete list!

- http://publib.boulder.ibm.com/infocenter/zos/v1r9/topic/com.ibm.zos.r9.icet100/ice1ct0073.htm
- Memory object sorting
- //SORTDIAG DD DUMMY added to the job will show more detail on memory object sorts and Hiperspace sorts
- Memory object sorting is a new DFSORT capability that uses a memory object on 64-bit real architecture to improve the performance of sort applications. A memory object is a data area in virtual storage that is allocated above the bar and backed by central storage. With memory object sorting, a memory object can be used exclusively, or along with disk space, for temporary storage of records. Memory object sorting can reduce I/O processing, elapsed time, EXCPs, and channel usage. When a memory object is used, Hiperspace and data space are not needed.
- Global mirror: system data mover functions
- DB2 V8 maintenance: PK46171 (11/07) fixed accounting and usage calculation
- Web deliverable from zIIP prereqs for 1.6 and z/OS 1.7
- CDSSRDEF: zPARM to set the default of the CURRENT DEGREE special register

Page 15: DB2 for z/OS and zIIP zAAP The REDIRECT to Success

DB2 10 and its Re-Directed Work to zIIP …

• Distributed calls to native SP (X%)
  • Same percentage offloaded as remote DRDA requests, without FENCED or EXTERNAL keywords
• Utilities (~60%)
  • Depending on # of parts and IDXs, the BUILD and REBUILD phase is offloaded. Since V9 also the UNLOAD phase of REORG
• XML parsing
  • Up to 36% zAAP offload for XML LOAD utility (~36%, zAAP)
  • Up to 63% offload for XML INSERTs via DRDA (~63%)

Refer to APPENDIX for a complete list!

Page 16: DB2 for z/OS and zIIP zAAP The REDIRECT to Success

What is eligible – even more …

• Measured LOAD, REBUILD INDEX and REORG Utilities.
  • zIIP redirect % depends on % CPU consumed by the Build Index phase of the Utility.
  • Observed Class 1 CPU reduction for configuration with 4 CPs and 2 zIIPs with fixed length Index key:
    • 5 to 20% for Rebuild Index
    • 10 to 20% for Load or Reorg of a Partition with one Index only, or Load of entire Table, or Reorg of entire Tablespace
    • 40% for Rebuild Index of logical Partition of Non Partitioning Index
    • 40 to 50% for Reorg Index
    • 30 to 60% for Load or Reorg of a Partition with more than one Index
• CPU overhead incurred during execution unit switch from TCB to enclave SRB during Index Rebuild phase
  • Typically less than 10%
  • Eligible for zIIP redirect

Page 17: DB2 for z/OS and zIIP zAAP The REDIRECT to Success

zIIP & OMPE Accounting Report

• Tivoli Omegamon DB2PE accounting report changes with PM51045 & PM50575
  • IIP changed to SE to indicate that the value may include zIIP and zAAP engines
• SECP (projection / overflow) does not include zAAP overflow or zAAP projection
  • Applicable only to zIIP in DB2 V8 and DB2 9
• SECP (projection / overflow) is not reported in DB2 10.
  • Need to use RMF Workload Activity Report Service / Reporting Class information

AVERAGE       APPL(CL.1)  DB2 (CL.2)
------------  ----------  ----------
ELAPSED TIME    0.331525    0.005241
 NONNESTED      0.331525    0.005241
 STORED PROC    0.000000    0.000000
 UDF            0.000000    0.000000
 TRIGGER        0.000000    0.000000

CP CPU TIME     0.001567    0.001477
 AGENT          0.001567    0.001477
  NONNESTED     0.001567    0.001477
  STORED PRC    0.000000    0.000000
  UDF           0.000000    0.000000
  TRIGGER       0.000000    0.000000
 PAR.TASKS      0.000000    0.000000

SECP CPU        0.000002         N/A

SE CPU TIME     0.002225    0.002152
 NONNESTED      0.002225    0.002152
 STORED PROC    0.000000    0.000000
 UDF            0.000000    0.000000
 TRIGGER        0.000000    0.000000
 PAR.TASKS      0.000000    0.000000

Page 18: DB2 for z/OS and zIIP zAAP The REDIRECT to Success

zIIP & OMPE Accounting Report …

• Chargeable CPU time: CP CPU TIME (including SECP CPU)
• zIIP eligible but ran on CP (PM57206): SECP CPU
• CPU time on zIIP: SE CPU TIME

AVERAGE       APPL(CL.1)  DB2 (CL.2)
------------  ----------  ----------
ELAPSED TIME    0.331525    0.005241
 NONNESTED      0.331525    0.005241
 STORED PROC    0.000000    0.000000
 UDF            0.000000    0.000000
 TRIGGER        0.000000    0.000000

CP CPU TIME    19.373768   19.365788
 AGENT          6.779348    6.771411
  NONNESTED     6.779348    6.771411
  STORED PRC    0.000000    0.000000
  UDF           0.000000    0.000000
  TRIGGER       0.000000    0.000000
 PAR.TASKS     12.594420   12.594377

SECP CPU        2.813831         N/A

SE CPU TIME    35.886951   35.886951

Total zIIP eligible work % = 70% ((SE + SECP) / (CP + SE))
zIIP Redirect % = 65% (SE / (CP + SE))
zIIP eligible but ran on CP = 5% (SECP / (CP + SE))
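The three percentages follow directly from the Class 1 figures in the report. A minimal sketch, assuming Python and a hypothetical helper name, applies the slide's formulas to the CP CPU TIME, SE CPU TIME and SECP CPU values shown above.

```python
def ziip_percentages(cp_cpu: float, se_cpu: float, secp_cpu: float) -> dict[str, float]:
    """Apply the slide's formulas to Class 1 accounting times (seconds).

    cp_cpu   : CP CPU TIME (already includes SECP CPU)
    se_cpu   : SE CPU TIME (time that ran on the specialty engine)
    secp_cpu : SECP CPU (zIIP eligible time that ran on a CP)
    """
    total = cp_cpu + se_cpu
    return {
        "eligible": 100.0 * (se_cpu + secp_cpu) / total,
        "redirected": 100.0 * se_cpu / total,
        "eligible_on_cp": 100.0 * secp_cpu / total,
    }


# Class 1 values taken from the accounting report on this slide
pct = ziip_percentages(cp_cpu=19.373768, se_cpu=35.886951, secp_cpu=2.813831)

if __name__ == "__main__":
    for name, value in pct.items():
        print(f"{name}: {value:.1f}%")
```

Rounded to whole percentages, the results agree with the 70%, 65% and 5% quoted on the slide.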

Page 19: DB2 for z/OS and zIIP zAAP The REDIRECT to Success

DB2 10 Exploits Enclaves For Prefetch

• Buffer pool prefetch activity (dynamic prefetch, list prefetch, sequential prefetch) is 100% zIIP eligible in DB2 10
• DB2 10 zIIP eligible buffer pool prefetch is asynchronously initiated by the DBM1 address space
  • Executed with a dependent enclave owned by the MSTR address space
  • Deferred write also eligible for zIIP
• Asynchronous buffer pool prefetch activities are not accounted to the DB2 client
  • After PM30468 reported in DBM1 SRB time
  • Increase due to index I/O parallelism / index list prefetch for disorganized indexes / access path changes / more dynamic prefetch in V9, V10

CPU TIMES                        TCB TIME     PREEMPT SRB     NONPREEMPT SRB  TOTAL TIME      PREEMPT IIP SRB  /COMMIT
-------------------------------  -----------  --------------  --------------  --------------  ---------------  --------
SYSTEM SERVICES ADDRESS SPACE    1:21.412389  2:41.314908     17.080711       4:19.808009     N/A              0.000082
DATABASE SERVICES ADDRESS SPACE  47.922003    13:22.407698    1:18.442771     15:28.772471    7:37.180101      0.000295
IRLM                             0.333537     0.000011        16:27.775261    16:28.108809    N/A              0.000314
DDF ADDRESS SPACE                2:46.994605  2:16:23.840972  9:34.392889     2:28:45.228465  2:30:33.316117   0.002834

TOTAL                            4:56.662534  2:32:27.563589  27:37.691632    3:05:01.917754  2:38:10.496218   0.003525

PREEMPT IIP SRB time shows the CPU time redirected to zIIP.
/COMMIT shows the chargeable (non-zIIP) CP CPU time.

- 70% of DBM1 SRB time went to zIIP
- Lots of prefetch in V8, so it went from non-preemptible SRB to preemptible SRB
- Backout became a preemptible task in V10
- Row level sequential detection
- Matching on multiple IN list predicates: huge in an SAP shop
- Learn-in prefetch

Page 20: DB2 for z/OS and zIIP zAAP The REDIRECT to Success

DB2 Stored Procedures

[Figure: address spaces DB2DBM1, DB2DIST and DB2WLM plus a CICS agent. Queries and results flow through Enclave A, Enclave B and Enclave C. Legend: TCB, Enclave SRB.]

TASK
• Listens for requests coming from outside of the system
• Creates independent enclave
• Schedules enclave SRB

Stored Procedure runs under a TCB which joins the enclave of the caller.

Page 21: DB2 for z/OS and zIIP zAAP The REDIRECT to Success

DB2 Stored Procedures with zIIPs

- 29.6% external to native SQL
- 56% if remote

Page 22: DB2 for z/OS and zIIP zAAP The REDIRECT to Success

Thread footwork T2 vs. T4 Connections same LPAR

• Going through DIST adds network translations, another address space, and a context switch to an SRB, but you get zIIP offload
  • DB2 objective is to improve T2 driver performance to beat T4
  • Even a local T4 connection hits DIST and the TCP stack, and needs a DDF WLM service class
  • Moving T2 to T4 is not a 1 for 1 MIP exchange due to overhead
• But the worse-behaving the application, the more there is to offload, so in the end it must be tested and compared

- Customers have discovered they can create a T4 connection to DB2 even if WAS is on the same LPAR as DB2, hence it can take advantage of the zIIP offload allotted to DRDA/TCPIP threads
- The ‘not 1 for 1’ MIP exchange refers to a customer environment where moving WebSphere Process Server from a local T2 RRS attach to a T4 connection on the same LPAR reduced the workload’s CP consumption by 30%, but the zIIP MIPs were more than doubled. This could be explained by the overhead induced by lengthening the code path
- Based on the SQL being executed there is a tipping point where the T2 connection will use fewer general purpose MIPs than a T4 connection. This should be tested at the customer before trying to exploit the zIIP based on perceived offload savings

Page 23: DB2 for z/OS and zIIP zAAP The REDIRECT to Success

zIIP vs. GP while changing to T4 driver

• In 1 day the customer switched all T2 to T4 connections, with WPS and DB2 z/OS both on the same LPAR… Not A Benchmark
  • GP MIPS from 350 to 230
  • zIIP MIPS from about 100 to 405 … Not a 1 for 1 exchange

- Generally speaking, if less than 100 rows and simple SQL (sub-second), T2 will win
- 34% decrease, 60% offload in DB2, rest is lost in Java
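The "not a 1 for 1 exchange" point becomes concrete when the before and after MIPS figures are put side by side. A minimal sketch, assuming Python and a hypothetical helper name, using the customer numbers from this slide (the zIIP starting point is quoted as "about 100"):

```python
def mips_exchange(gp_before: float, gp_after: float,
                  ziip_before: float, ziip_after: float) -> dict[str, float]:
    """Compare GP MIPS saved with zIIP MIPS added after a driver change."""
    gp_saved = gp_before - gp_after
    ziip_added = ziip_after - ziip_before
    return {
        "gp_saved": gp_saved,
        "gp_saved_pct": 100.0 * gp_saved / gp_before,
        "ziip_added": ziip_added,
        "ziip_added_per_gp_saved": ziip_added / gp_saved,
        "total_change": (gp_after + ziip_after) - (gp_before + ziip_before),
    }


# Figures from the slide: GP 350 -> 230, zIIP about 100 -> 405
result = mips_exchange(350, 230, 100, 405)

if __name__ == "__main__":
    for name, value in result.items():
        print(f"{name}: {value:.2f}")
```

The GP saving works out to the 34% mentioned in the notes, while total MIPS consumed rises, which is the overhead of the longer T4 code path.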

Page 24: DB2 for z/OS and zIIP zAAP The REDIRECT to Success

Utility zIIP redirect with DFSORT

• Introduced in Aug 2009 for zIIP redirect for DFSORT processing for some DB2 Utilities
• Applicable to in-memory fixed length record sort processing in DFSORT
• Utilities that benefit:
  • LOAD, REORG, REBUILD INDEX and CHECK INDEX for Index key Sort processing
  • CHECK DATA for Foreign key Sort processing
  • RUNSTATS for COLGROUP processing
• Measured zIIP redirect benefit
  • 30% to 60% of DFSORT CPU
  • 10% to 40% of total Utility CPU
  • Varies with number of Indices
    • More benefit with more Indices
    • Measurement with up to 6 Indices
• DFSORT with MSG ICE256I DFSORT CODE IS ELIGIBLE TO USE ZIIP FOR THIS DB2 UTILITY RUN (PK85899)

Page 25: DB2 for z/OS and zIIP zAAP The REDIRECT to Success

Query Parallelism Enhancements since V9

• Multiple sequential plans will be considered for parallelism
  • Lowest cost, after parallelism, is the winner
• Parallelism degree can cut on non-leading table
  • Beneficial for 1-row table, workfile, or DPSI on fact table
• The default degree of parallelism is
  • number of CPs × 4 (V9)
  • number of CPs × 2 (V10)
• Parallelism degree used in Utilities as of V9
  • number of CPs × 3 for LOAD / REORG / REBUILD / CHECK
  • number of CPs × 1 for UNLOAD
  • ∞ (no limit) for parallel index operations

- V8 = 1x CP, then V9 10x, reduced to 4x; V10 will be 2x
- Limited parallelism: CHECK INDEX saw a 20-40% reduction in CPU and elapsed time
- BP space, host variable values, hardware sort assist facility, # CPs, ambiguous cursors, join technique
- V9 uses parallelism in REORG (unload, load, and index build) and REBUILD INDEX
- LOAD, REORG, REBUILD limited to 3x CPUs
- UNLOAD 1x CPUs
- So allowing for parallelism and then constricting it at runtime is more costly in V9
- Parallel index load, reorg, rebuild UNLIMITED
- Log apply during REORG SHRLEVEL CHANGE is in parallel as well
- PK26989: utility subtasks no longer count against IDFORE or IDBACK

Page 26: DB2 for z/OS and zIIP zAAP The REDIRECT to Success

Lifted Query Parallelism Limitations in DB2 10

• Support parallelism for multi-row fetch
  • This restriction is only removed if the CURSOR is DECLARED as READ ONLY; ambiguous cursors will not have the restriction removed
• Allow parallelism if a parallel group contains a work file
  • DB2 generates a temporary work file when a view or table expression is materialized
  • This type of work file cannot be shared among child tasks in previous releases of DB2, hence parallelism is disabled
  • DB2 10 makes the work file shareable; this only applies to CP mode parallelism and to cases without a full outer join

Page 27: DB2 for z/OS and zIIP zAAP The REDIRECT to Success

Parallelism Enhancements - Effectiveness

• DB2 10 will use dynamic record range partitioning
  • DB2 will materialize the intermediate result in a sequence of join processes
  • Results will be divided into ranges with an equal number of records
  • Division doesn't have to be on the key boundary
    • Unless required for GROUP BY or DISTINCT function
• Record range partitioning is dynamic
  • No longer based on the key ranges decided at bind time
  • Now based on number of composite side records and number of workload elements
  • So data skew, out of date statistics etc. will not have any effect on performance (as they are not used)
• DB2 will try to use an in-memory work file for the materialization output if it is possible

Page 28: DB2 for z/OS and zIIP zAAP The REDIRECT to Success

Parallelism Enhancements – STRAW Model

[Figure: table Medium_T with 10,000 rows, columns C1 and C2, index on C1, degree = 3. Left panel: divided in key ranges before DB2 10, one key range per task (Task 1, Task 2, Task 3). Right panel: divided in key ranges with the Straw Model, degree = 3 and #ranges = 10, so the same three tasks work through ten smaller ranges.]
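The idea in the figure (more ranges than tasks, each holding the same number of records) can be illustrated with a small simulation. This is a sketch in Python under stated assumptions, not DB2's implementation: it assumes each task simply takes the next unprocessed range as soon as it is free, and the function names and cost figures are made up for illustration.

```python
import heapq


def equal_ranges(n_records: int, n_ranges: int) -> list[tuple[int, int]]:
    """Split record positions [0, n_records) into half-open ranges
    whose sizes differ by at most one record."""
    base, extra = divmod(n_records, n_ranges)
    ranges, start = [], 0
    for i in range(n_ranges):
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def run_tasks(range_costs: list[float], degree: int) -> list[float]:
    """Simulate `degree` tasks, each taking the next range when it is free.

    Returns the total busy time of each task.
    """
    free_at = [(0.0, task) for task in range(degree)]
    heapq.heapify(free_at)
    for cost in range_costs:
        finished, task = heapq.heappop(free_at)  # earliest free task
        heapq.heappush(free_at, (finished + cost, task))
    busy = [0.0] * degree
    for finished, task in free_at:
        busy[task] = finished
    return busy


if __name__ == "__main__":
    # 10,000 rows, degree 3, 10 ranges, as in the figure
    ranges = equal_ranges(10_000, 10)
    print(ranges[0], ranges[-1])
    # Hypothetical skew: the first range is five times as expensive
    costs = [5.0] + [1.0] * 9
    print(run_tasks(costs, 3))
```

With three fixed ranges, one expensive range would keep a single task busy while the others sit idle; with ten ranges the cheaper ones are spread over the remaining tasks.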

Page 29: DB2 for z/OS and zIIP zAAP The REDIRECT to Success

What to look for with Parallelism

• DSNB440I: shows degraded parallel tasks from buffer pools
• DSNU397I: Utility message on constrained tasks (SORTNUM)
• DISPLAY THREAD(*): PT appears next to parallel tasks

16.32.57 DB1G DISPLAY THREAD(*)
16.32.57 STC00090 DSNV401I DB1G DISPLAY THREAD REPORT FOLLOWS -
16.32.57 STC00090 DSNV402I DB1G ACTIVE THREADS -
NAME     ST A   REQ ID           AUTHID   PLAN     ASID TOKEN
BATCH    PT *     1 PUPPYDML     USER001  MYPLAN   0025    30
         PT *   641 PUPPYDML     USER001  MYPLAN   002A    40
         PT *    72 PUPPYDML     USER001  MYPLAN   002A    39
         PT *   549 PUPPYDML     USER001  MYPLAN   002A    38
...

DSNU397I DSNUBBID - NUMBER OF TASKS CONSTRAINED BY VIRTUAL STORAGE
DSNU427I DSNUBBID - OBJECTS WILL BE PROCESSED IN PARALLEL
         NUMBER OF OBJECTS = 6

DSNB440I - PARALLEL ACTIVITY -
           PARALLEL REQUEST = 2   DEGRADED PARALLEL = 0

Page 30: DB2 for z/OS and zIIP zAAP The REDIRECT to Success

What to look for with Parallelism …

• STATS long report: calculate BP size based on number of denied parallel tasks
• ACCNT trace: Query parallelism section
  • Ran as Planned / Ran reduced
• IFCID 0222: OMEGAMON activity trace
  • Shows actual number of tasks and degradation
• IFCID 0221: tells you which buffer pool restricted parallelism

QUERY PARALLEL.         TOTAL
-------------------  --------
MAXIMUM MEMBERS             1
MAXIMUM DEGREE             10
GROUPS EXECUTED             1
RAN AS PLANNED              1
RAN REDUCED                 0
ONE COOR=N                  0
ONE ISOLAT                  0
ONE DCL TTABLE              0
SEQ - CURSOR                0
SEQ - NO ESA                0
SEQ - NO BUF                0
SEQ - ENCL.SER.             0
MEMB SKIPPED(%)             0
DISABLED BY RLF            NO
REFORM PARAL-CONFIG         0
REFORM PARAL-NO BUF         0

Page 31: DB2 for z/OS and zIIP zAAP The REDIRECT to Success

What you can control with Parallelism

• Hidden zParm SPRMPTH (DSN6SPRC)
  • Threshold below which parallelism is disabled
  • Current default for SPRMPTH is 120 ms
• PARAMDEG: MAX_DEGREE limits parallel groups
  • Static and dynamic SQL (default ‘0’, unlimited)
• DEGREE(ANY) and CURRENTDATA(NO) bind options
  • Or DB2 needs to know if the cursor is read-only
• CDSSRDEF: SET CURRENT DEGREE special register for dynamic queries
  • Default = 1, ‘ANY’ lets DB2 decide
• VPPSEQT: % of sequential steal for parallel operations
  • Each utility task needs 128 pages in BP
• Star join enabled, number of tables involved
• PARA_EFF: % of optimism regarding parallel access path improvement (PM16020)

Page 32: DB2 for z/OS and zIIP zAAP The REDIRECT to Success

zIIP relevant PARMLIB parameters

• IIPHONORPRIORITY (YES/NO) in IEAOPTxx parmlib member
  • This means that if we reach the queue limit and ZIIPAWMT is triggered, the dispatcher will route work over to a GP
• Maximum number of dispatchable units that will queue waiting for a zIIP processor
  • This number changes with the HW level and ties in with dispatcher algorithms
• ZIIPAWMT, ZAAPAWMT: the alternate wait management threshold is how long a zIIP will run before checking to see if it needs help from a GP
  • Default 12 milliseconds; 3200 (3.2 ms) with HiperDispatch
  • In V10/V11 that means system engines may wait 3.2 ms
• ZAAPZIIP = YES|NO (IEASYSxx option)
  • Allows zAAP eligible workload to run on a zIIP
• zAAP has other settings not applicable to zIIP
  • IFACrossover: disallow zAAP work on general CP

Page 33: DB2 for z/OS and zIIP zAAP The REDIRECT to Success

Offload APARS

• PM12256: zIIP offload improvement up to 55-60%, and less overhead
  • http://www-01.ibm.com/support/docview.wss?uid=swg1PM12256&myns=swgimgmt&mynp=OCSSEPEK&mync=R
• PM28626: corrected inconsistent response time from long running queries after the application of PM12256
  • https://www-304.ibm.com/support/docview.wss?crawler=1&uid=swg1PM28626
• OA35146: z/OS for PM28626, allows a preemptible SRB to join/leave an enclave
  • https://www-304.ibm.com/support/entdocview.wss?uid=isg1OA35146

- "Please take the turbo off my car, when I floor it it accelerates too quickly"
- z/OS APAR to ensure that only IBM authorized code can be offloaded
- NEON has requested all its customers uninstall zPrime and it is now unsupported

Page 34: DB2 for z/OS and zIIP zAAP The REDIRECT to Success

Results of zIIP Maintenance

• Pre-PM12256
• After PM12256
• After PM28626 ???
  • Less noticeable elapsed time difference for customers with knee-capped general CPs

- 3 out of 5 threads were zIIP eligible, but customers noticed due to elapsed time differences between warehouse type queries and transaction types
- This was the result of more offload and a better algorithm
- After PM28626, if it runs longer than 0.1 CPU seconds then we will swap it back
- This new behavior will vastly improve L1 and L2 cache hits, for z10 and z196 especially; you miss both if it hits another CP and core

Page 35: DB2 for z/OS and zIIP zAAP The REDIRECT to Success

Reference material

• II12836: Info APAR for parallelism V6 to DB2 9
• II14219: zIIP Exploitation
• OA38829: z/OS V1.12 and z/OS V1.13 remove the restriction that prevents zAAP-eligible workloads from running on zIIP processors when a zAAP is installed on the server
• OA37201: faster switch to SRB mode
• PM62824: DFSORT offload of memory object workfile sorting to zIIP
• Techline Sizing with zCP3000 tool: contact your local IBMer
• RMF Spreadsheet Reporting Tool
  • http://www-03.ibm.com/systems/z/os/zos/features/rmf/tools/rmftools.html
• Getting Started Resources
  • http://www-03.ibm.com/systems/z/hardware/features/ziip/resources.html
• Link to article on ZIIPMAXQL
  • https://www.ibm.com/developerworks/mydeveloperworks/blogs/22586cb0-8817-4d2c-ae74-0ddcc2a409bc/entry/december_17_2012_6_07_am3?lang=en

- DB2 code has been modified to create only one enclave for all parallel tasks under a query. This allows WLM to manage each query as a whole, and properly distinguish between high-consumption queries and low-consumption queries
- PK27578: V8 in 2008
- PK18454: V8 2008
- At this time, IBM has no plan for enabling DFSORT to exploit the System z9 Integrated Information Processor (zIIP). IBM realizes DFSORT remains a prominent component of our customers' batch workloads. However, the added controls that would need to be implemented in order to maintain our high standards for performance, reliability and system integrity are not justified in view of estimations that there is a low offload potential and the value to clients may be marginal. IBM will continue to focus its DFSORT development efforts on the enhanced function, performance, reliability and service items that we believe provide the most value to our clients. The foregoing represents IBM's current intent and is subject to change. We have analyzed SMF data from numerous customers and that analysis has confirmed our assessment that a) DFSORT is not a major contributor to overall system CPU utilization and b) peak CPU utilization is often driven by other workloads at times when DFSORT activity is minimal (during online processing, for example). We are looking closer at the DB2 Utility situation because of their ability to run multiple concurrent sort tasks but have nothing we can share with customers yet.

Page 36: DB2 for z/OS and zIIP zAAP The REDIRECT to Success

Timm Zimmermann & Michael Dewert
IBM
[tizimm|mdewert]@de.ibm.com

Session A14
DB2 for z/OS and zIIP & zAAP ‐ The REDIRECT to Success


Page 37: DB2 for z/OS and zIIP zAAP The REDIRECT to Success


zIIP Eligible (APAR II14219)

Base prerequisites: z/OS 1.8 – base feature; z/OS 1.9 – WLM weights on zIIPs

DB2 V8

1) Utilities
   Amount redirected: Up to 60% in lab measurements, depending on the number of partitions and indexes (BUILD and REBUILD phases of index maintenance go to zIIP)
   Prerequisites: UK15814

2) Distributed DRDA requests
   Amount redirected: Up to 55-60% in lab measurements
   Prerequisites: DRDA over TCP/IP – PM12256

3) Parallelism (star schema and parallel queries)
   Amount redirected: Portion of main task for remote calls, 80% of child tasks
   Prerequisites: zPARM CDSSRDEF=1, PARAMDEG=0 (>0 to limit the degree of parallelism); DEGREE ANY bind parameter and SET CURRENT DEGREE ANY at statement level

4) Result set of remote stored procedures
   Amount redirected: Call, commit, result-set processing
   Prerequisites: N/A

DB2 9

1) All the offload in V8 plus the following
   Amount redirected: Slightly less for Utilities due to CPU reduction for index processing in DB2 9, but added UNLOAD phase during REORG
   Prerequisites: PM37622

2) Distributed calls to native stored procedures
   Amount redirected: Remote calls offload the same percentage as remote DRDA requests
   Prerequisites: No FENCED or EXTERNAL keywords, native SQL code

3) XML parsing offloaded to zAAP and zIIP
   Amount redirected: Up to 36% zAAP redirect in lab measurements for XML LOAD utility. Up to 63% zIIP redirect in lab measurements for XML INSERT via DRDA.
   Prerequisites: z/OS 1.8

Other Processes

1) IPSec
   Amount redirected: Encryption processing, header processing and crypto validation (93% for bulk data movement)
   Prerequisites: z/OS 1.8 + UA34582 AND z/OS Communication Server PTF UK27062-63

2) Global Mirror for z/OS (formerly Extended Remote Copy)
   Amount redirected: Most System Data Mover processing
   Prerequisites: z/OS 1.10, or 1.9 + UA39510, or 1.8 + UA39509 (zGM parmlib zIIPEnable)

3) HiperSockets for large messages
   Amount redirected: Handles large outbound messages (multiple channel paths given to SRBs)
   Prerequisites: z10 and z/OS 1.10 (GLOBALCONFIG ZIIP IQDIOMULTIWRITE)

4) DFSORT
   Amount redirected: Sorting of fixed length rows
   Prerequisites: PK85899 and PK85856 (z/OS 1.10)

5) zAAP on zIIP
   Amount redirected: zAAP eligible work can move to zIIP if no zAAP installed
   Prerequisites: z/OS 1.11 base or 1.9 or 1.10 w/ APAR OA27495

Taken from Adrian Burke’s zIIP Experiences Webcast - https://ibm.biz/BdDY3r
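The DB2 V8 parallelism prerequisites above mention opting in at statement level with SET CURRENT DEGREE. A minimal sketch of what that could look like for dynamic SQL follows; the table and column names (SALES_FACT, REGION_ID, AMOUNT) are hypothetical and do not come from the slides.

```sql
-- Hypothetical example: SALES_FACT and its columns are illustrative only.
-- Ask DB2 to consider parallelism for the dynamic SQL that follows.
SET CURRENT DEGREE = 'ANY';

SELECT REGION_ID, SUM(AMOUNT) AS TOTAL_AMOUNT
  FROM SALES_FACT
 GROUP BY REGION_ID;

-- Switch back to no parallelism once the query is done.
SET CURRENT DEGREE = '1';
```

For static SQL, the corresponding opt-in is the DEGREE(ANY) bind option. Whether DB2 actually picks a parallel access path, and how much child-task CPU then becomes zIIP eligible, depends on the optimizer and the workload.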

-Memory object sorting: http://publib.boulder.ibm.com/infocenter/zos/v1r9/topic/com.ibm.zos.r9.icet100/ice1ct0073.htm
-//SORTDIAG DD DUMMY added to a job will show more detail on memory object sorts and Hiperspace sorts
-Memory object sorting is a new DFSORT capability that uses a memory object on 64-bit real architecture to improve the performance of sort applications. A memory object is a data area in virtual storage that is allocated above the bar and backed by central storage. With memory object sorting, a memory object can be used exclusively, or along with disk space, for temporary storage of records. Memory object sorting can reduce I/O processing, elapsed time, EXCPs, and channel usage. When a memory object is used, Hiperspace and data space are not needed.
-Global mirror – system data mover functions
-DB2 V8 maintenance… PK46171 (11/07) - fixed accounting and usage calculation
-Web deliverable from zIIP prereqs for 1.6 and z/OS 1.7
-CDSSRDEF – zPARM to set default of current degree special register for

Page 38: DB2 for z/OS and zIIP zAAP The REDIRECT to Success


zIIP Eligible (APAR II14219)

Base prerequisites: z/OS 1.8 – base feature; z/OS 1.9 – WLM weights on zIIPs

DB2 10

1) All of DB2 V8 and 9 offload++
   Amount redirected: BUILD phase, native SQL procs, parallelism, 60% DRDA requests
   Prerequisites: DB2 10 / z/OS 1.10

2) RUNSTATS
   Amount redirected: Basic RUNSTATS for table, NO Histogram, DSTATS, COLGROUP… BUT index stats almost all offloaded (not DPSIs)
   Prerequisites: Run RUNSTATS, no inline STATS

3) Prefetch and deferred write processing
   Amount redirected: 100% (roughly 70% of DBM1 SRB time)
   Prerequisites: Shows up in DBM1 SRB time

4) Parallelism enhancements
   Amount redirected: Parallelism more likely (80% of child tasks)
   Prerequisites: V10 NFM with rebind

DB2 11?

1) More RUNSTATS
   Amount redirected: COLCARD, FREQVAL, HISTOGRAM statistics, including inline stats (80%, possibly more)
   Prerequisites: DB2 11, z/OS 1.10, z10

2) LOAD REPLACE with dummy input
   Amount redirected: 100% of delete processing eligible
   Prerequisites: Same as 1)

3) The rest of the system engines (GBP write, castout, p-lock notify/exit)
   Amount redirected: 100% eligible
   Prerequisites: Same as 1)

4) Index pseudo delete cleanup
   Amount redirected: 100% eligible
   Prerequisites: If you overload the zIIPs with INDEXCLEANUP_THREADS there could be overflow to GPs

Other Processes

1) IPSec
   Amount redirected: Encryption processing, header processing and crypto validation (93% for bulk data movement)
   Prerequisites: z/OS 1.8 + UA34582 AND z/OS Communication Server PTF UK27062-63

2) Global Mirror for z/OS (formerly Extended Remote Copy)
   Amount redirected: Most System Data Mover processing
   Prerequisites: z/OS 1.10, or 1.9 + UA39510, or 1.8 + UA39509 (zGM parmlib zIIPEnable)

3) HiperSockets for large messages
   Amount redirected: Handles large outbound messages (multiple channel paths given to SRBs)
   Prerequisites: z10 and z/OS 1.10 (GLOBALCONFIG ZIIP IQDIOMULTIWRITE)

4) DFSORT
   Amount redirected: Sorting of fixed length rows (10-40% Utility), memory object work file sorts
   Prerequisites: PK85899 and PK85856 (z/OS 1.10), PM62824 and z/OS 1.12

5) zAAP on zIIP
   Amount redirected: zAAP eligible work can move to zIIP if no zAAP installed
   Prerequisites: z/OS 1.11 base or 1.9 or 1.10 w/ APAR OA27495 / OA38829 if both installed

Taken from Adrian Burke’s zIIP Experiences Webcast - https://ibm.biz/BdDY3r

-PM30468 – puts deferred write and prefetch into DBM1 accounting
-v10: Inline stats are not supported, COLGROUP is not supported, DSTATS is not supported, Histogram stats are not supported. But: during index scan, DSTATS, FREQVAL and HISTOGRAM stats are supported, except for DPSIs. So nearly all RUNSTATS INDEX is offloaded, but only basic RUNSTATS TABLE is offloaded.
-Memory object sorting: http://publib.boulder.ibm.com/infocenter/zos/v1r9/topic/com.ibm.zos.r9.icet100/ice1ct0073.htm
-//SORTDIAG DD DUMMY added to a job will show more detail on memory object sorts and Hiperspace sorts
-Memory object sorting is a new DFSORT capability that uses a memory object on 64-bit real architecture to improve the performance of sort applications.
-HiperSockets using multi-write facility is for large payload FTP, XML, etc.
-Global mirror – system data mover functions
-DB2 V8 maintenance… PK46171 (11/07) - fixed accounting and usage calculation
-Web deliverable from zIIP prereqs for 1.6 and z/OS 1.7
-CDSSRDEF – zPARM to set default of current degree special register for dynamic queries
-PARAMDEG = MAX_DEGREE – for static queries