Disaster Recovery Strategies for IMS

IBM Software Group © 2008 IBM Corporation IMS Application Dependent & Mirroring Overview Disaster Recovery Solutions Glenn Galler IBM SW IT Specialist, ATS Ann Arbor, Michigan [email protected]


Post on 14-Dec-2014


Page 1: Disaster recovery strategies for ims

IBM Software Group

© 2008 IBM Corporation

IMS Application Dependent & Mirroring Overview Disaster Recovery Solutions

Glenn Galler
IBM SW IT Specialist, ATS
Ann Arbor, Michigan
[email protected]

Page 2: Disaster recovery strategies for ims


The following are trademarks of the International Business Machines Corporation in the United States and/or other countries.

The following are trademarks or registered trademarks of other companies.

* Registered trademarks of IBM Corporation

* All other products may be trademarks or registered trademarks of their respective companies.

Notes:

Performance is in Internal Throughput Rate (ITR) ratio based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput that any user will experience will vary depending upon considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve throughput improvements equivalent to the performance ratios stated here.

IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply.

All customer examples cited or described in this presentation are presented as illustrations of the manner in which some customers have used IBM products and the results they may have achieved. Actual environmental costs and performance characteristics will vary depending on individual customer configurations and conditions.

This publication was produced in the United States. IBM may not offer the products, services or features discussed in this document in other countries, and the information may be subject to change without notice. Consult your local IBM business contact for information on the product or services available in your area.

All statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only.

Information about non-IBM products is obtained from the manufacturers of those products or their published announcements. IBM has not tested those products and cannot confirm the performance, compatibility, or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.

Prices subject to change without notice. Contact your IBM representative or Business Partner for the most current pricing in your geography.

This presentation and the claims outlined in it were reviewed for compliance with US law. Adaptations of these claims for use in other geographies must be reviewed by the local country counsel for compliance with local laws.

Intel is a trademark of the Intel Corporation in the United States and other countries.
Java and all Java-related trademarks and logos are trademarks or registered trademarks of Sun Microsystems, Inc., in the United States and other countries.
Microsoft, Windows and Windows NT are registered trademarks of Microsoft Corporation.
SET and Secure Electronic Transaction are trademarks owned by SET Secure Electronic Transaction LLC.
UNIX is a registered trademark of The Open Group in the United States and other countries.

AIX*, CICS*, DB2*, e-business logo*, Enterprise Storage Server*, ESCON*, FICON, FlashCopy*, GDPS*, HyperSwap, IBM*, IBM eServer*, IBM logo*, NetView*, OS/390*, Parallel Sysplex*, S/390*, Sysplex Timer*, Tivoli*, TotalStorage*, z/OS*, z/VM*, zSeries*

Trademarks

Page 3: Disaster recovery strategies for ims

Acknowledgments

• Peter Armstrong (BMC)
  • Book: “DBRC in Practice”
  • http://www.dbazine.com/ofinterest/oi-articles/armstrong6
• Technical Support
  • Rich Lewis (ATS)
  • Helene Lyon (IBM France)
• DM Tools
  • Mitch Dooley and Lynne Bisceglia (IMS Recovery Expert)
  • Bob Magid (DM Tools Architect)

Page 4: Disaster recovery strategies for ims

Agenda

• Two Disaster Recovery Strategies for IMS
  • IMS Application Dependent DR
  • Storage Management Mirroring
• IMS Application Dependent DR
  • Managing image copies, Recons, and Logs at a Remote Site
  • IBM DM Tools can assist with this DR strategy
• Storage Management Mirroring
  • Production data is mirrored to Remote Site
  • Creating Consistency is the Key
  • Mirroring can be Asynchronous or Synchronous
  • GDPS can be used to automate DR strategy

Page 5: Disaster recovery strategies for ims

Concepts

• Disaster Recovery
  • Process of recovering a Production environment
  • Restore to point where business can be conducted
• Recovery Time Objective (RTO)
  • Time allowed to recover the applications
  • All critical operations are up and running again
• Recovery Point Objective (RPO)
  • Amount of data lost in the disaster
  • Last point-in-time when all data was current
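As a worked illustration of these two measures, the sketch below computes the data-loss window and the recovery time from three timestamps. The function names and the sample timeline are hypothetical and are not taken from the presentation.

```python
from datetime import datetime, timedelta


def achieved_rpo(last_recoverable_point: datetime, disaster_time: datetime) -> timedelta:
    """Window of updates lost: time between the last point that can be
    recovered at the remote site and the moment of the disaster."""
    return disaster_time - last_recoverable_point


def achieved_rto(disaster_time: datetime, service_restored: datetime) -> timedelta:
    """Elapsed time until critical operations are running again."""
    return service_restored - disaster_time


# Hypothetical timeline, for illustration only.
last_image_copy = datetime(2008, 1, 10, 2, 0)   # last image copy shipped off site
disaster = datetime(2008, 1, 10, 14, 30)
restored = datetime(2008, 1, 10, 18, 30)

rpo = achieved_rpo(last_image_copy, disaster)
rto = achieved_rto(disaster, restored)
print(f"Data-loss window: {rpo}, time to recover: {rto}")
```

Comparing these achieved values against the business targets shows whether a given DR strategy is adequate.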

Page 6: Disaster recovery strategies for ims


IMS Application Dependent Disaster Recovery

Page 7: Disaster recovery strategies for ims

[Diagram: Production Site and Remote Site. Labels: IMS Control Region, DBRC, DLI/SAS, Logger, RECON, WADS, OLDS, RDS, Image Copies, RLDS, Change Accumulation, SLDS, Backup RECON]

Page 8: Disaster recovery strategies for ims

[Diagram: Remote Site and Production Site. Labels: IMS Control Region, DBRC, DLI/SAS, Logger, RECON, RLDS, Image Copies, Change Accumulation, SLDS, Backup RECON, RDS, WADS, OLDS]

Page 9: Disaster recovery strategies for ims

3 Application Dependent DR Strategies

• #1: Recover the Databases to an Earlier Image Copy
  • Image Copies and Recon are shipped to remote site
  • Production activity is quiesced for image copy
    • Procedure may include IMS, DB2 and CICS
  • Data since last image copy is lost
  • RTO is low since recovery time is short
  • RPO is high since the log updates are lost

[Diagram: Image Copies]

Page 10: Disaster recovery strategies for ims

3 Application Dependent DR Strategies

• #2: Recover the Databases to the Last Good Log Data Set
  • Image Copies, Recon, Logs are shipped to remote site
  • Forward and Backward recovery is performed with the logs
  • RTO is higher as recoveries are required
  • RPO is lower as log updates are applied to image copy

[Diagram: Image Copies + SLDS, RLDS, Change Accumulation]

Page 11: Disaster recovery strategies for ims

3 Application Dependent DR Strategies

• #3: Recover the Databases to the IMS Recovery Expert RP
  • Image Copies, Recon, Logs are shipped to remote site
  • Forward recovery is performed to the RP with the logs
  • RTO is higher as recoveries are required
  • RPO is lower as log updates are applied to image copy
  • IBM Tools provide additional capability

[Diagram: Image Copies + SLDS, RLDS, Change Accumulation + RP]

Page 12: Disaster recovery strategies for ims


Recover DBs to Earlier Image Copy

Page 13: Disaster recovery strategies for ims

Recover DBs to Earlier Image Copy

• Primary: Step 1:
  • Quiesce the IMS environment
  • Take Batch (Clean) Image Copies periodically
  • Ship Image Copies + Recon to Remote Site

Page 14: Disaster recovery strategies for ims

Recover DBs to Earlier Image Copy

• Remote: Step 2: Clean Backup Recon
  • Active Subsystems
    • If Recon shows active SUBSYSTEMs
    • Use LIST.SUBSYS to show active subsystems
    • Issue DBRC Commands:
      • CHANGE.SUBSYS SSID(ssidname) ABNORMAL
      • CHANGE.SUBSYS SSID(ssidname) STARTRCV
      • CHANGE.SUBSYS SSID(ssidname) ENDRECOV
      • DELETE.SUBSYS SSID(ssidname)
  • Secondary Image Copies
    • If Image Copy at Remote site is a Secondary Image Copy
      • Flag the Primary Image Copy in the Recon as Invalid
  • Fast Path DEDBs
    • Flag DEDB Areas as Recovery Needed to enable recoveries
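The subsystem clean-up sequence has to be repeated for every active subsystem found in the Recon, so it lends itself to generation. The following is a minimal sketch that only builds the command text in the order listed on the slide; the function name and the subsystem name `IMSA` are hypothetical, and how the commands are submitted to DBRC is outside its scope.

```python
def subsys_cleanup_commands(ssids):
    """Return the DBRC commands, in slide order, for each active subsystem."""
    steps = ("ABNORMAL", "STARTRCV", "ENDRECOV")
    commands = []
    for ssid in ssids:
        # Mark the subsystem as abnormally ended, then signal recovery
        # start and end, as listed on the slide.
        for step in steps:
            commands.append(f"CHANGE.SUBSYS SSID({ssid}) {step}")
        # Finally remove the subsystem record itself.
        commands.append(f"DELETE.SUBSYS SSID({ssid})")
    return commands


# "IMSA" is a made-up name; real names come from the LIST.SUBSYS output.
for line in subsys_cleanup_commands(["IMSA"]):
    print(line)
```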

Page 15: Disaster recovery strategies for ims

Recover DBs to Earlier Image Copy

• Remote: Step 3: Recover DBs to Clean IC
  • Use the cleaned-up Recon to generate recovery JCL with GENJCL.RECOV
  • Recover database data sets:
    • Standard IMS Recovery Utility (DFSURDB0)
    • Or, IMS Database Recovery Facility (DRF)
  • HPIC Incremental Image Copies
    • Must be recovered with IMS Database Recovery Facility (DRF)
  • HPIC COPY Image Copies
    • Exact copy of database, no recovery is needed
    • Ensure Database Data Set name matches

[Diagram: Image Copies, Database Recovery Utility or Tools]

Page 16: Disaster recovery strategies for ims

Recover DBs to Earlier Image Copy

• Remote: Step 4: Cold Start IMS Systems
  • Databases are consistent with earlier Image Copy

Page 17: Disaster recovery strategies for ims


Recover DBs to Last Good Log

Page 18: Disaster recovery strategies for ims

Recover DBs to Last Good Log

• Primary: Step 1:
  • Take Batch (Clean) or CIC (Fuzzy) Image Copies
  • Optionally take Change Accumulations periodically
  • Ship ICs + Recon + SLDS/RLDS + CAs to Remote Site

Page 19: Disaster recovery strategies for ims

Recover DBs to Last Good Log

• Remote: Step 2A: Clean Backup Recon
  • To use DBRC Recon for Recovery
    • No Databases can be Allocated on an OPEN log
  • Close and Archive OLDS in Recon data
    • When OLDS are not at the Remote Site
    • All PRILOG, PRISLD, PRIOLD need non-zero Stop Times
  • Two Methods:
    1. Add dummy SLDS log entry and Start Archive
    2. Close & flag INUSE OLDS as Archive Needed & Archive

Page 20: Disaster recovery strategies for ims

Recover DBs to Last Good Log

• Remote: Step 2B: Clean the Backup Recon
  • Active Subsystems
    • If Recon shows active SUBSYSTEMs
    • Use LIST.SUBSYS to show active subsystems
    • Issue DBRC Commands:
      • CHANGE.SUBSYS ABNORMAL
      • CHANGE.SUBSYS STARTRCV
      • CHANGE.SUBSYS ENDRECOV
      • DELETE.SUBSYS
  • Secondary Image Copies
    • If Image Copy at Remote site is a Secondary Image Copy
      • Flag the Primary Image Copy in the Recon as Invalid
  • Fast Path DEDBs
    • Flag DEDB Areas as Recovery Needed to enable recoveries

Page 21: Disaster recovery strategies for ims

Recover DBs to Last Good Log

• Remote: Step 2C: Clean the Backup Recon
  • If copies of CA data sets are at remote site
    • Recon will show CA data sets from Production site
    • Use CHANGE.CA to point to correct copy of CA
  • If CA data sets are unavailable and exist in Recon
    • Flag the Recon to show CA is INVALID
    • DBRC will use the logs instead of the CAs

Page 22: Disaster recovery strategies for ims

Recover DBs to Last Good Log

• Remote: Step 3: Recover DBs from ICs, Logs, CA
  • Use the cleaned-up Recon to generate recovery JCL with GENJCL.RECOV
  • Recover DBs from ICs + CAs + SLDS/RLDS
    • Standard IMS Recovery Utility (DFSURDB0)
    • Or, IMS Database Recovery Facility (DRF)
  • HPIC Incremental ICs
    • Use IMS Database Recovery Facility (DRF)

[Diagram: RECON + Image Copies + Change Accumulation, RLDS, SLDS]

Page 23: Disaster recovery strategies for ims

Recover Database to Last Good Log

• Remote: Step 4: Batch Backout
  • Following Full Database Recovery
    • Backout Inflight UOWs
    • Cold Start or ERE COLDSYS or ERE COLDBASE
  • Following Timestamp (RP) or Point-In-Time Recovery
    • No Inflight UOWs to backout
    • Cold Start or ERE

Page 24: Disaster recovery strategies for ims

Recover Databases to Last Good Log

• Remote: Step 5: Cold Start or /ERE from SLDS
  • If /ERE From SLDS
    • Ensure SUBSYS records have been cleaned up
      • Online IMS SUBSYS record should not be deleted
    • IMS dynamically allocates SLDS
      • OLDS were archived for GENJCL.RECOV

Page 25: Disaster recovery strategies for ims


Recover DBs to IMS Recovery Expert RP

Page 26: Disaster recovery strategies for ims

Recover DBs to IMS Recovery Expert RP

• Primary: Step 1: Create Backup Datasets
  • Take Batch (Clean) or CIC (Fuzzy) Image Copies
  • Optionally take Change Accumulations periodically
  • Create IMS Recovery Expert RPs periodically
  • Create and Clean Backup Recon
  • Ship ICs + Clean Backup Recon + SLDS/RLDS + CAs

Page 27: Disaster recovery strategies for ims

Recover DBs to IMS Recovery Expert RP

• Primary: Step 2: Clean Backup RECON
  • IMS Recovery Expert Recon Clean Up (RCU)
    • Cleanup Timestamp
      • Last DEALLOC or Stop Time of SLDS/RLDS
    • Closes open PRILOG, PRIOLD and SECSLD records
    • Deletes PRIOLD, SECOLD and SUBSYS records
    • Updates or deletes ALLOC and LOGALL records
    • Deletes IC and CA records past the Clean Up Time
  • That is, RCU automates the Recon clean up process
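The rule "delete IC and CA records past the Clean Up Time" amounts to a timestamp filter. The sketch below illustrates only that idea on a simplified stand-in for Recon entries; the record tuples, the integer timestamps, and the function name are hypothetical and do not reflect the real RECON layout or the RCU implementation.

```python
def drop_records_past_cleanup(records, cleanup_time):
    """Keep only records whose timestamp is not later than cleanup_time.

    records -- list of (record_type, timestamp) pairs standing in for
               IC and CA entries (hypothetical, simplified layout).
    """
    return [(kind, ts) for kind, ts in records if ts <= cleanup_time]


# Hypothetical entries with integer timestamps for readability.
entries = [("IC", 100), ("CA", 150), ("IC", 200)]
print(drop_records_past_cleanup(entries, cleanup_time=150))
```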

Page 28: Disaster recovery strategies for ims

Recover DBs to IMS Recovery Expert RP

• Remote: Step 3: Recover DBs
  • Use the cleaned-up Recon to generate recovery JCL with GENJCL.RECOV
  • Recover DBs from ICs + CAs + SLDS/RLDS

[Diagram: RECON + Image Copies + Change Accumulation, RLDS, SLDS + RP]

Page 29: Disaster recovery strategies for ims

Recover DBs to IMS Recovery Expert RP

• Step 4: Cold Start IMS
  • IMS RCU deletes all IMS Subsystem Records

Page 30: Disaster recovery strategies for ims


Storage Management Mirroring Disaster Recovery

Page 31: Disaster recovery strategies for ims

Storage Management Mirroring DR

• Mirroring DR Solutions
  • All Production volumes are mirrored to Remote Site
  • Mirroring can be Synchronous or Asynchronous
    • Or, combination of two strategies
  • IMS can be Cold Started or Emergency Restarted
    • Backouts occur during Emergency Restart
  • Consistency of Data is the key to mirroring
  • IBM Geographically Dispersed Parallel Sysplex (GDPS)
    • Optional, helps automate the DR solution

Page 32: Disaster recovery strategies for ims

Storage Management Mirroring DR

• Geographically Dispersed Parallel Sysplex (GDPS)
  • Manages:
    • IBM Metro Mirror (PPRC)
    • IBM Global Mirror
    • IBM z/OS Global Mirror (XRC)
  • Controls remote copy configuration and storage subsystem
  • Provides automation of sysplex operational tasks
  • Independent of applications like IMS and DB2
  • Includes IBM Services for configuration and manageability
  • Optional for Mirroring Solutions

Page 33: Disaster recovery strategies for ims

Consistency of Data: Dependent Writes

• Committed Database Update
  • (1) Log “Before Image”
  • (2) Update Database
  • (3) Log “After Image”
• Good Sequence of Writes
  • (1)
  • (1) and (2)
  • (1), (2) and (3)
• Bad Sequence of Writes
  • (1) and (3) only
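The good and bad sequences share one rule: the writes present on the mirror must form a prefix of the original write order, so a later write never appears without every earlier one. A minimal sketch of that check follows; the function and constant names are hypothetical.

```python
# The three dependent writes of one committed update, in issue order:
# 1 = log "before image", 2 = update database, 3 = log "after image".
WRITE_ORDER = (1, 2, 3)


def is_consistent(writes_on_mirror):
    """True if the mirrored writes form a prefix of WRITE_ORDER,
    i.e. no later write is present without every earlier one."""
    present = sorted(set(writes_on_mirror))
    return tuple(present) == WRITE_ORDER[:len(present)]


print(is_consistent([1]))        # True
print(is_consistent([1, 2]))     # True
print(is_consistent([1, 2, 3]))  # True
print(is_consistent([1, 3]))     # False: database update is missing
```

A mirror that holds (1) and (3) without (2) has a log claiming an update the database never received, which is why mirroring solutions must preserve dependent-write order.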

Page 34: Disaster recovery strategies for ims


Mirroring Environments

Page 35: Disaster recovery strategies for ims

GDPS/PPRC… IBM Metro Mirror

– Synchronous DR Solution
– RPO is zero
– Emergency restart of IMS
– Automation and Freeze policy
– Duplexing of CF Structures okay
– Distances up to 300 km (with RPQ)

[Diagram: Site 1 and Site 2 connected by Metro Mirror. Labels: CF1, CF2, GDPS K-sys (K1, K2), P1, P2, CBU, Open, disk sets A and B]

Page 36: Disaster recovery strategies for ims

GDPS/PPRC: Consistency

• (1) Consistency Group (CG)
  • Set of volumes that hold IMS data sets and logs
  • When a failure occurs for a volume in the CG at the Remote Site
    • All writes in the CG are held for a period of time
• (2) GDPS Freeze Automation
  • FREEZE and GO:
    • Writes continue at Primary even if failing at Secondary
  • FREEZE and STOP:
    • Writes are frozen at Primary and Secondary
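The difference between the two freeze policies can be modelled as what each site holds after a mirroring failure. This is a simplified sketch under stated assumptions, not GDPS logic: the function name, the write identifiers, and the idea of counting writes are all hypothetical.

```python
def simulate_freeze(writes, failure_index, policy):
    """Return (primary_writes, secondary_writes) after a mirroring failure.

    writes        -- ordered list of write identifiers issued by applications
    failure_index -- number of writes mirrored before the failure
    policy        -- "GO" or "STOP"
    """
    if policy not in ("GO", "STOP"):
        raise ValueError(f"unknown freeze policy: {policy}")
    # In both policies the secondary stays at the frozen, consistent point.
    secondary = list(writes[:failure_index])
    if policy == "GO":
        primary = list(writes)                 # production keeps writing
    else:
        primary = list(writes[:failure_index])  # production writes are frozen
    return primary, secondary


issued = ["w1", "w2", "w3", "w4"]
for policy in ("GO", "STOP"):
    primary, secondary = simulate_freeze(issued, 2, policy)
    print(policy, "unmirrored writes:", len(primary) - len(secondary))
```

In this model FREEZE and GO keeps production available but leaves writes that exist only at the Primary, while FREEZE and STOP keeps both sites identical at the cost of stopping production writes.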

Page 37: Disaster recovery strategies for ims

GDPS/Global Mirror

– Asynchronous DR Solution
– RPO can be 3 – 5 seconds (dependent on bandwidth)
– RTO - Emergency restart of IMS
– Impact on local site response times is negligible
– Two site long distance DR and backup remote copy solution

[Diagram: Site 1 and Site 2 connected by Global Mirror. Labels: CF1, GDPS K-sys (K1), P1, P2, disk set A; CF2, GDPS R-sys (R), P1, P2, Backups, disk set B, FlashCopy (Required), Open]

Page 38: Disaster recovery strategies for ims

GDPS/Global Mirror: Consistency

− (1) Primary creates Out-of-Sync Bit Maps:
  − Shows tracks with new data
− (2) Consistency Groups
  − FlashCopy is required for IBM Global Mirror
    − FlashCopy is taken before changes are applied
  − Change Recording bitmap from Out-of-Sync bitmap
    − Tracks in Change Recording bitmap are Updated

Page 39: Disaster recovery strategies for ims

GDPS/XRC… z/OS Global Mirror

– Asynchronous DR Solution
– RPO can be 3 – 5 seconds (dependent on bandwidth)
– RTO - Emergency restart of IMS
– Scales to any amount of DASD/distance
– z/OS Solution

[Diagram: Site 1 and Site 2 connected by z/OS Global Mirror. Labels: CF1, GDPS K-sys (K1), P1, P2, disk set A; CF2, P1, P2, SDM, Kx, disk set B, FlashCopy (Optional)]

Page 40: Disaster recovery strategies for ims

GDPS/XRC: Consistency

− (1) Primary writes are timestamped
  − Timestamps are stored in Side File cache
− (2) System Data Mover (SDM)
  − Global Mirror uses SDM to “pull” data to Secondary
  − SDM uses timestamps in Side File to order writes
  − Consistency Group is journaled
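The role of timestamps in ordering writes can be illustrated with a simplified model: writes from several side files are merged by timestamp, and only writes up to the earliest "latest timestamp" seen across all side files go into the group, so nothing is applied while an earlier write from another storage subsystem could still be outstanding. This is a sketch under those assumptions, not the SDM implementation; the function name, tuple layout, and volume names are hypothetical.

```python
import heapq


def build_consistency_group(side_files):
    """Merge timestamped writes from several side files into one ordered group.

    side_files -- list of lists of (timestamp, volume, data) tuples, each
                  already in timestamp order for its own storage subsystem.
    """
    # Simplification: with no data from some subsystem, no safe cutoff exists.
    if not side_files or any(not sf for sf in side_files):
        return []
    # Earliest of the latest timestamps: every subsystem has reported
    # all of its writes up to this point.
    cutoff = min(sf[-1][0] for sf in side_files)
    merged = heapq.merge(*side_files, key=lambda write: write[0])
    return [write for write in merged if write[0] <= cutoff]


# Hypothetical side files for a log volume and a database volume.
log_side_file = [(1, "LOG01", "before image"), (3, "LOG01", "after image")]
db_side_file = [(2, "DB001", "database update")]
for write in build_consistency_group([log_side_file, db_side_file]):
    print(write)
```

In this example the "after image" write at timestamp 3 is held back until the database subsystem has reported past that time, which preserves the dependent-write order at the Secondary.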

Page 41: Disaster recovery strategies for ims

GDPS/PPRC -- Global Mirror

– Three site mirroring solution:
  – Metro Mirror for Site 1 & 2
  – Global Mirror for Site 2 & 3
– Site 2 RPO near 0
– Site 3 RPO 3-5 secs if Site 1 & 2 lost

[Diagram: Site 1 and Site 2 connected by Metro Mirror, Site 2 and Site 3 connected by Global Mirror. Labels: CF1, CF2, CF3, GDPS K-sys (K1, K2), GDPS R-sys (R), P1, P2, CBU, Backups, Open, disk sets A, B and C]

Page 42: Disaster recovery strategies for ims

GDPS/PPRC – XRC

– Three site mirroring solution:
  – Metro Mirror for Site 1 & 2
  – z/OS Global Mirror for Site 2 & 3
– Site 2 RPO near 0
– Site 3 RPO 3-5 secs if Site 1 & 2 lost

[Diagram: Site 1 and Site 2 connected by Metro Mirror, with z/OS Global Mirror to Site 3. Labels: CF1, CF2, CF3, GDPS K-sys (K1, K2), P1, P2, CBU, SDM, Kx, Open, disk sets A and B]

Page 43: Disaster recovery strategies for ims

Coupling Facility Structures

• GDPS/PPRC
  • Primary and Secondary distance allows for Duplexing
• GDPS/Global Mirror, GDPS/XRC
  • Distance between sites usually prevents Duplexing
  • All CF structures are allocated during Emergency Restart

Page 44: Disaster recovery strategies for ims

GDPS Metro Mirror: IMS CF Structures

• No Duplexing Needed:
  • OSAM and VSAM Buffer Pools
    • Stored on DASD when data is committed
    • Secondary Site:
      • All buffers are invalid and structure is rebuilt
  • IRLM Locks
    • Secondary Site:
      • Restart backs out inflight transactions to release locks
      • IRLM rebuilds lock structure as empty

Page 45: Disaster recovery strategies for ims

GDPS Metro Mirror: IMS CF Structures

• Good Candidate for Duplexing:
  • Shared Queues (MSGQ and EMHQ)
  • VTAM Generic Resources
    • Structure changes infrequently
    • Rebuilding can take a long time and users must wait
  • Shared VSO
    • Store-In Structure:
      • Committed updates on CF and not on DASD
      • Recover FP Areas after restart (if not duplexed)
    • Two Duplexing options:
      • IMS Managed: IMS creates 2 structures/Area
      • System Managed (IMS V9+): Multiple Areas per structure

Page 46: Disaster recovery strategies for ims

Summary

• Two Disaster Recovery Strategies for IMS
  • IMS Application Dependent DR
  • Storage Management Mirroring
• IMS Application Dependent DR
  • Managing image copies, Recons, Logs at a Remote Site
  • IBM DM Tools can assist with this DR strategy
• Storage Management Mirroring
  • Production data is mirrored to Remote Site
  • Creating Consistency is the Key
  • Mirroring can be Asynchronous or Synchronous
  • GDPS can be used to automate DR strategy