MIMIX HA Runbook Template

Availability Runbook

Vision: iSeries Availability Runbook Template

Overview

For the most current version of this template, look on the Vision web site at http://www.mimix.com/partners/services/templates/index.asp.

This template is purposely overwritten to minimize the amount of time you have to spend doing original writing. As we gain experience using this template, we will incorporate feedback from Vision Installation Professionals like you as to how it can be improved. If you have questions about how to use this template, contact the Vision Solutions Project Manager associated with the account.

Instructions

1. Variables — This document uses variable fields to minimize the time needed to create the document. The variables are located on page 3, which is the cover page of the customer's document. The variable box is hidden. To locate the box, move your mouse pointer below the red header line and click; this selects the box. To see the variables, enlarge the box: move your mouse to the small square at the bottom center of the outline, then click and hold the mouse button as you drag the line down. Once you have finished changing the variables, reduce the box back to its original size so it will not show during printing.

2. Write the report — Author the report based on your discussions with the customer and the data you collect using this template. Delete all material that doesn't apply, respond appropriately to Highlighted Directions, edit the XXX and **Phrase** items, and add new material unique to the customer's environment.

3. Keep it simple — Resist the temptation to use additional colors or typefaces. The template has been worked out using common Windows fonts to present an orderly, professional appearance.

4. Be Consistent — Whatever spelling, grammar, style, and formatting rules you use, use them consistently.

5. Spell out hyperlinks — Since the user is likely to have to rely on a printed version of the final documents, spell out hyperlinks and email addresses. An embedded hyperlink will not be of any use to someone with just a paper copy of the report.

6. Remove Highlighted Directions, and check to be sure you have either appropriately edited XXX and **Phrase** items, or returned the text formatting to "Auto" by selecting the text, then choosing Format > Font > Color > Auto (or by using the Text Color button on the toolbar).

7. Review the report — run Spell Check. Check for grammar and completeness.

8. Clean up the report — Remove all Highlighted Directions, and non-essential blank lines, extra rows and columns in tables, and this instruction page.

9. Update the Table of Contents — Place the cursor in the table of contents field and update the table (press F9, select Update Field, and check the Update entire table choice, then click OK).

10. If time permits, have someone unfamiliar with the project give it a "cold read." Spell check can miss some big mistakes, and a quick read by a fresh pair of eyes (even someone who is not technically proficient with the iSeries or MIMIX) is always good.

11. Update records — Both Vision consultants and Business Partner consultants should send a copy of the report to the Vision Solutions Project Manager responsible for the customer’s territory.

12. Deliver the Runbook to the customer.

Runbook Template Revision History

Document Template Revision Date

(On cover page)

Summary of Revisions

June 16, 2008

· Updated hyperlinks

Aug 1, 2007

· Revised for V5

Note: This page and the page(s) preceding it are not to be included in the final Runbook.

Customer X

iSeries Managed Availability Runbook

Prepared by:

Solutions Consultant

Vision Solutions

Owner:

Customer X

Created:

Last revision:

Table of Contents

Summary of Revisions (page 2)
Purpose and Audience (page 5)
Ownership (page 5)
Maintaining this procedure (page 5)
Revision Changes (page 6)
Server Switching (page 7)
Concepts and Strategy (page 7)
Switch Cycle (page 7)
Graphical full switch cycle overview (page 8)
Switch Overview (page 9)
Planned Switch Overview (page 9)
Unplanned Switch Overview (page 10)
Switch readiness validation (page 11)
Goal (page 11)
Switch readiness validation tasks (page 11)
MSFname-SWITCH from Sys1 to Sys2 (page 13)
Procedure SWITCHOVER-SWITCH – Switch to Backup (page 13)
SWITCHOVER-SWITCH Pre-Switch Tasks (page 14)
SWITCHOVER-SWITCH Planned Switch Tasks (page 15)
SWITCHOVER-SWITCH Post-Switch Tasks (page 16)
Procedure SYNCHRONIZE-SWITCH – Resynchronize (page 17)
SYNCHRONIZE-SWITCH Pre-Synchronization Tasks (page 17)
SYNCHRONIZE-SWITCH Synchronization Tasks (page 18)
SYNCHRONIZE-SWITCH Post-Synchronization Tasks (page 19)
Procedure FAILOVER-SWITCH – Fail over to Backup (page 19)
FAILOVER-SWITCH Pre-Switch Tasks (page 20)
FAILOVER-SWITCH Unplanned Switch Tasks (page 20)
FAILOVER-SWITCH Post-Switch Tasks (page 22)
MSFname-RETURN from Sys2 to Sys1 (page 23)
Procedure SWITCHOVER-RETURN – Switch to Backup (page 23)
SWITCHOVER-RETURN Pre-Switch Tasks (page 24)
SWITCHOVER-RETURN Planned Switch Tasks (page 25)
SWITCHOVER-RETURN Post-Switch Tasks (page 26)
Procedure SYNCHRONIZE-RETURN – Resynchronize (page 27)
SYNCHRONIZE-RETURN Pre-Synchronization Tasks (page 27)
SYNCHRONIZE-RETURN Synchronization Tasks (page 28)
SYNCHRONIZE-RETURN Post-Synchronization Tasks (page 29)
Procedure FAILOVER-RETURN – Fail over to Backup (page 29)
FAILOVER Pre-Switch Tasks (page 30)
FAILOVER-RETURN Unplanned Switch Tasks (page 30)
FAILOVER-RETURN Post-Switch Tasks (page 32)
Appendix A: Runbook Hyperlinks (page 33)
Appendix B: The Runbook Data Capture Tool (page 34)
Using and Updating the Links Used in this Document (page 34)
Updating The Links (page 35)
Method 1: Automatically (page 35)
Method 2: Manually (page 36)

About

Purpose and Audience

This Runbook describes the operational actions needed to switch the host production role from the Sys1 system to the Sys2 system, and the actions needed to return the host production role from the Sys2 system to the Sys1 system. The intention of the document is to guide the MIMIX administrator through the switch process.

Ownership

The owner of this document named on the cover page is responsible for maintaining the procedures and schedules presented to comply with your availability goals and objectives. This document must be revised when changes, ranging from a simple fix update to major software or hardware changes, occur in your managed availability environment.

Maintaining this procedure

Whenever the system setup changes, both the Runbook and this switch procedure may need to change, because many changes in your managed availability environment can impact the effectiveness of your solution. Some of the more common changes are:

· New Availability Solution Administrator.

· New operating system technology (e.g. remote journaling, new protocols) will impact performance and the configuration of MIMIX and automation code.

· Network changes or additions, such as new hardware or communication components, can impact the switching of users to a remote system.

· Introduction of a new application on the systems that needs to be included in the managed availability environment.

· Introduction of new database features such as triggers, null fields or referential integrity constraints.

· Additions or changes to application change management can result in files on hold and failed requests.

· Other: __________________________________________

When changes are needed to this Runbook, contact the document owner (listed on the cover) and notify them of discrepancies and enhancements.

Revision Changes

Indicate the date and type of changes made to this document.

Date:

Document creation

Server Switching

Concepts and Strategy

[Define terms (production, backup, planned, unplanned) and describe the strategy (IP impersonation, use of A/B switches, etc.). The purpose here is to describe how switching is done, while the individual step details are left to the procedure descriptions below.

If MIMIX Monitor is used, explain how (what is monitored?).

What IP addresses or SNA LUs are switched?

Is the switch automatic or semi-automatic?

Is the MIMIX Switch Framework or clustering used to effect the switch?

Will the MIMIX Availability Manager Interface be available when users have been switched to the backup server?

The following text may be used as an intro or modified if applicable:

Server switching consists of moving users from one server to another in a controlled way. At Customer X, server switching means moving users from the Sys1 server to the Sys2 server, and, when appropriate, moving the users back again to the Sys1 server.

The criteria for performing a switch will be different for a planned switch than for an unplanned switch.

A planned switch is done at a time when it is generally convenient and when the readiness to switch can be carefully assessed.

An unplanned switch is done when a failure of the current production system has been detected. In this case, the readiness to switch is difficult to assess. However, that readiness can be assumed with some confidence if a regimen of auditing, monitoring, and testing has been followed.

NOTE: A switch is unplanned if the original production system is no longer accessible from the backup system. If the original production system is reachable, it is a planned switch, even if it was not scheduled or intended.

Switch Cycle

Switching systems, when done properly, is a complete cycle: it covers not only switching users to a second system, but also returning the users safely to the original system when appropriate.

The full cycle involves a -SWITCH and a -RETURN, each of which has the same two phases. The first phase is called Switchover/failover: if planned, it is referred to as a switchover; if unplanned, it is considered a failover. The second phase is called Resynchronize. After the -SWITCH, system Sys1 plays the role of backup to system Sys2. This allows the two phases, Switchover/failover and Resynchronize, to be repeated to return production to Sys1.

Moving the current production from Sys1 to Sys2 is the -SWITCH. Returning production from Sys2 to Sys1, when you are ready, is the -RETURN.
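The cycle described above can be pictured as a small state model. The following Python sketch is purely illustrative and is not part of MIMIX; the names Roles, run_phase and full_cycle are hypothetical, and the sketch assumes that replication is inactive between a Switchover/failover and the following Resynchronize.

```python
from dataclasses import dataclass

# Hypothetical model of the switch cycle; these names are illustrative only
# and do not correspond to MIMIX objects or commands.
PHASES = [
    ("SWITCH", "Switchover/failover"),  # production moves from Sys1 to Sys2
    ("SWITCH", "Resynchronize"),        # replication starts from Sys2 to Sys1
    ("RETURN", "Switchover/failover"),  # production moves from Sys2 to Sys1
    ("RETURN", "Resynchronize"),        # replication starts from Sys1 to Sys2
]


@dataclass
class Roles:
    production: str    # system currently holding the production role
    backup: str        # system currently holding the backup role
    replicating: bool  # True when replication runs from production to backup


def run_phase(roles, phase_name):
    """Return the roles after one phase of the cycle."""
    if phase_name == "Switchover/failover":
        # The systems exchange roles; replication is assumed to be stopped
        # until the Resynchronize phase has been performed.
        return Roles(roles.backup, roles.production, False)
    if phase_name == "Resynchronize":
        # Roles stay the same; replication resumes toward the new backup.
        return Roles(roles.production, roles.backup, True)
    raise ValueError(f"unknown phase: {phase_name}")


def full_cycle(start):
    """Apply all four phases and return the list of states, start included."""
    history = [start]
    roles = start
    for _procedure, phase_name in PHASES:
        roles = run_phase(roles, phase_name)
        history.append(roles)
    return history
```

Starting from Roles("Sys1", "Sys2", True), the state after the fourth phase equals the starting state, which is what "full cycle" means here.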

Graphical full switch cycle overview

[Figure: MSFname - Switch Cycle. MSFname-SWITCH consists of Phase #1 Switchover/Failover (Type = *PLANNED or Type = *UNPLANNED) and Phase #2 Resynchronize. MSFname-RETURN consists of Phase #3 Switchover/Failover (Type = *PLANNED or Type = *UNPLANNED) and Phase #4 Resynchronize. At each stage, System 1 and System 2 are labeled PRODUCTION, BACKUP, or UNAVAILABLE, with MIMIX between them.]

Switch Overview

Procedure MSFname-SWITCH switches the production role from the Sys1 system to the Sys2 system. For a planned switch to Sys2, use procedure SWITCHOVER-SWITCH on page 13; for an unplanned switch, use procedure FAILOVER-SWITCH on page 19.

Procedure MSFname-RETURN switches the production role from the Sys2 system to the Sys1 system. For a planned switch to Sys1, use procedure SWITCHOVER-RETURN on page 23; for an unplanned switch, use procedure FAILOVER-RETURN on page 29.
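The choice of procedure depends on two things: the direction (-SWITCH or -RETURN) and whether the switch is planned or unplanned. A minimal Python sketch of that choice follows, using the rule from this Runbook that a switch is planned whenever the current production system is still reachable from the backup system. The function name and parameter names are hypothetical and exist only for illustration.

```python
def select_procedure(direction, production_reachable):
    """Return the name of the Runbook procedure to follow.

    direction: "SWITCH" (Sys1 to Sys2) or "RETURN" (Sys2 to Sys1).
    production_reachable: True if the current production system can still
    be reached from the backup system (planned switch), False otherwise
    (unplanned switch).
    """
    if direction not in ("SWITCH", "RETURN"):
        raise ValueError("direction must be 'SWITCH' or 'RETURN'")
    kind = "SWITCHOVER" if production_reachable else "FAILOVER"
    return f"{kind}-{direction}"
```

For example, select_procedure("SWITCH", False) returns "FAILOVER-SWITCH", the procedure for an unplanned switch from Sys1 to Sys2.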

[Figure: MSFname-SWITCH and MSFname-RETURN, showing users and connections, System 1, System 2, and the MIMIX replication between the two systems.]

Planned Switch Overview

The Planned switch scenario includes two major steps, both of which are begun by interactively issuing a command on the system serving as the backup.

For moving production from Sys1 to Sys2:

· Switching production from the production system Sys1 to the backup system Sys2. This step, called “switch to backup”, carefully disengages the production system from the network, connects the backup system to the network, and makes it the new production system. See Procedure SWITCHOVER-SWITCH below.

· Starting MIMIX replication from the new production system back to the old production system. Effectively, this means the old production system now becomes a backup system. This step does not affect any users or connections. This step is called “catch-up” or “resync” because it allows the new backup system to catch up on all the changes that have been taking place on the new production system since the first step was performed. See Procedure SYNCHRONIZE-SWITCH below.

For moving production from Sys2 to Sys1:

· Switching production from the current production system Sys2 back to the original production system Sys1. This step, called “switch to backup”, carefully disengages Sys2 from the network, connects Sys1 to the network, and makes Sys1 the production system again. See Procedure SWITCHOVER-RETURN below.

· Starting MIMIX replication from the new production system Sys1 to Sys2. Effectively, this means you have switched full circle and are back to the initial roles of Production and Backup for the systems. This step does not affect any users or connections. This step is called “catch-up” or “resync” because it allows the new backup system to catch up on all the changes that have been taking place on the new production system since the first step was performed. See Procedure SYNCHRONIZE-RETURN below.

Unplanned Switch Overview

The Unplanned scenario includes three major steps, all of which are begun by interactively issuing a command on the Sys2 system.

For failing over production from Sys1 to Sys2:

· Failing over production to the backup system Sys2. This step, called “fail-over to backup”, quickly establishes this system as the new production system. The original production system Sys1 cannot be reached, so its Production Role cannot be removed at this point. See Procedure FAILOVER-SWITCH below.

· Repairing and preparing the old production system Sys1 to no longer hold Production Role. This means taking down connections to the network.

· Starting MIMIX replication from the new production system back to the old production system. Effectively, this means the old production system now becomes a backup system. This step does not affect any user or ATM connections. This step is called “catch-up” or “resync” because it allows the backup system to catch up on all the changes that have been taking place on the new production system since the first step was performed. See Procedure SYNCHRONIZE-SWITCH below.

For failing over production from Sys2 to Sys1:

· Failing over production to the backup system Sys1. This step, called “fail-over to backup”, quickly establishes this system as the new production system. The original production system Sys2 cannot be reached, so its Production Role cannot be removed at this point. See Procedure FAILOVER-RETURN below.

· Repairing and preparing the old production system Sys2 to no longer hold Production Role. This means taking down connections to the network.

· Starting MIMIX replication from the new production system back to the old production system. Effectively, this means the old production system now becomes a backup system. This step does not affect any user or ATM connections. This step is called “catch-up” or “resync” because it allows the backup system to catch up on all the changes that have been taking place on the new production system since the first step was performed. See Procedure SYNCHRONIZE-RETURN below.

Switch readiness validation

Note

These steps are not part of any switch. If you are executing an unplanned switch, please proceed as follows:

For failing over production from Sys1 to Sys2:

To Procedure FAILOVER-SWITCH – Fail over to Backup on page 19, or

For failing over production from Sys2 to Sys1:

To Procedure FAILOVER-RETURN – Fail over to Backup on page 29.

Goal

It is very important to maintain a switch ready environment. Constantly monitoring the MIMIX replication and regularly running the audit procedure can achieve this.

Below are the steps that will confirm the backup system is healthy and ready to be switched to, either for a planned or unplanned scenario.

Switch readiness validation tasks

We advise performing the tasks below at least once a week. They must also be performed a few days to a week prior to any planned switch:

Step

Action

time done

1

Check active replication and resolve problems discovered. The MIMIX active replication can be checked from the following command:

WRKDG

2

Perform and review the audits and resolve problems discovered. The MIMIX audit screen can be reached with the following command:

WRKAUD

3

Review Object Activity entries for status of *ACTIVE and *FAILED. Resolve any entries that are lingering for more than a few minutes. On both systems:

WRKDGACTE STATUS(*ACTIVE *FAILED)

4

Review File & Tracking Entries to determine that no files are in any non-active status. Resolve any entries that are lingering for more than a few minutes. On both systems:

WRKDGFE STSVAL(*INACTIVE)

WRKDGOBJTE STSVAL(*INACTIVE)

WRKDGIFSTE STSVAL(*INACTIVE)

5

On the Sys2 partition, check the status of the last switch:

DSPDTAARA DTAARA(MIMIX/MSFname)

The status is in position 32 – 35.

6

On the Sys2 partition, the above resulting status has to be one of ‘SCMP’ or ‘PCMP’.

If not, then a previous switch did not complete normally, and you have to correct the status to ‘PCMP’. An FAQ on the www.mimix.com website will assist you to reset this status.
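Steps 5 and 6 amount to reading four characters from the data area and comparing them with the accepted values. As a hedged illustration, the Python sketch below performs that comparison on a string holding the data area contents, for example text copied from the DSPDTAARA display. The function names are hypothetical, and the sketch makes no assumption about the rest of the data area layout; only positions 32 to 35 come from this Runbook.

```python
# Status values accepted before a switch, as listed in step 6.
READY_STATUSES = frozenset({"SCMP", "PCMP"})


def switch_status(data_area_value):
    """Return the status held in positions 32 to 35 of the data area.

    Positions are 1-based in the Runbook, so they map to the
    0-based Python slice [31:35].
    """
    return data_area_value[31:35]


def is_switch_ready(data_area_value, allowed=READY_STATUSES):
    """Return True if the last switch ended in one of the allowed statuses."""
    return switch_status(data_area_value) in allowed
```

The allowed set is a parameter because other procedures in this Runbook expect different values at this point; the resynchronization pre-check, for example, expects ‘BCMP’ or ‘FCMP’.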

MSFname-SWITCH from Sys1 to Sys2

Procedure MSFname-SWITCH switches the production role from the Sys1 system to the Sys2 system. For a planned switch to backup, use procedure SWITCHOVER-SWITCH on page 13; for an unplanned switch, use procedure FAILOVER-SWITCH on page 19.

A reminder: A switch is unplanned if the original production system is no longer accessible from the backup system. If the original production system is reachable, it is a planned switch, even if it was not scheduled or intended.

[Figure: MSFname-SWITCH and MSFname-RETURN, showing users and connections, System 1, System 2, and the MIMIX replication between the two systems.]

To switch back from Sys2 to Sys1, please use the MSFname-RETURN procedures.

Procedure SWITCHOVER-SWITCH – Switch to Backup

Goal

Switch users from the Sys1 system to the Sys2 system. Almost all the actions are initiated on the Sys2 system, which is the controlling system for this procedure.

The next pages hold the following groups of steps:

· SWITCHOVER-SWITCH Pre-Switch Tasks: To be executed immediately before the planned switch.

· SWITCHOVER-SWITCH Planned Switch Tasks: The actual switch.

· SWITCHOVER-SWITCH Post-Switch Tasks: Checking and cleaning up the switch.

SWITCHOVER-SWITCH Pre-Switch Tasks

These tasks should be performed immediately prior to the switch:

Step

Action

Time done

1

In the next step, connections to the system will be closed. To ensure you keep your connection, use a session that is not using the switched IP address 10.230.24.20, but one of the administrative addresses.

Create a 5250 telnet session to both systems. Then, on both systems, transfer your job to QCTL instead of QINTER:

TFRJOB JOBQ(QCTL)

2

On the Sys2, add MIMIX to the library list.

ADDLIBLE MIMIX

3

Check active replication and resolve problems discovered. The MIMIX active replication can be checked from the following command:

WRKDG

4

Review Object Activity entries for status of *ACTIVE and *FAILED. Resolve any entries that are lingering for more than a few minutes. On both systems:

WRKDGACTE STATUS(*ACTIVE *FAILED)

5

Review File & Tracking Entries to determine that no files are in any non-active status. Resolve any entries that are lingering for more than a few minutes. On both systems:

WRKDGFE STSVAL(*INACTIVE)

WRKDGOBJTE STSVAL(*INACTIVE)

WRKDGIFSTE STSVAL(*INACTIVE)

When you are ready for the actual switch, continue with step 6 below.

SWITCHOVER-SWITCH Planned Switch Tasks

These tasks are the actual switch from Sys1 to Sys2, and are to be considered “downtime”.

Step

Action

Time done

6

Shut down the production environment.

Give the users adequate time to end their work on the production system.

Check that no interactive jobs are active, no batch jobs are active, and all scheduled batch jobs in job queues are held. All production subsystems should also be ended.

7

Wait until all of the subsystems and environments ended above have really ended. It may take a few minutes for the application subsystems to end. Do *not* end the MIMIXSBS subsystem.

8

On the Sys2 system perform the following command:

MIMIX/RUNSWTFWK SWTFWK(MSFname) PRC(*BCKUP)

This last command will check if the system is in the correct status. Next, it will confirm that the user wishes to switch by issuing messages to QSYSOPR message queue.

9

On the Sys2 system, answer the message in the QSYSOPR message queue to confirm the switchover. Use SysRq 6 to go to the QSYSOPR message queue.

The RUNSWTFWK process will end the production infrastructure, end MIMIX replication in a controlled manner, switch the MIMIX data group direction, switch the network connections, and start the production infrastructure on the new production system Sys2.

This is done by internally calling the following programs. (This is shown here as documentation; these are not steps you need to perform.)

· SWTFWKCFM (running on system Sys2): This program asks the system operator for confirmation. It also internally calls program DGSELECT to determine which data groups to switch. This program performs no MSFname-specific actions.

· ENDBCKUP (running on system Sys2): This program ends the backup environment on Sys2. This program performs no MSFname-specific actions.

· ENDPROD (running on system Sys1): This program ends the production environment on Sys1. This program performs the following MSFname-specific actions:

· Sets the local system role indicator data area MXSYSROLE to “S” (in switch)

· Sets MIMIX auto-start indicator to “*SWITCH” to prevent auto-start of replication

· Ends the TCP/IP interfaces 10.230.24.20

· Kills all production subsystems. These subsystems should already have been ended nicely, but to prevent accidental activity after the switch, they are killed from this program.

· Removes the TCP/IP host name to host

· Ends the MVXCMEX subsystem and controller

· Ends the replication to the Sys2 system in a controlled manner and switches the direction of replication of the data groups. Reverse replication is not started yet.

· STRPROD (running on system Sys2) This program starts the production environment on Sys2 system. This program performs the following MSFname specific actions:

· Sets the TCP/IP host name to host

· Sets the local system role indicator data area MXSYSROLE to “P” (in Production)

· Starts the TCP/IP interfaces 10.230.24.20

· Starts the MVXCMEX subsystem and controller

Note: The source for the above programs is located in file QSWTSRC in library MIMIXMSF on both systems.
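For reference, the order of the internal program calls listed above can be written down as data. The Python sketch below is only a documentation aid under the assumption that the programs run in the order in which this Runbook lists them; it does not call MIMIX, and the names PLANNED_SWITCH_PROGRAMS and programs_on are hypothetical.

```python
# (program, system it runs on, purpose), in the order listed in this Runbook.
PLANNED_SWITCH_PROGRAMS = [
    ("SWTFWKCFM", "Sys2", "ask the system operator to confirm the switch"),
    ("ENDBCKUP", "Sys2", "end the backup environment"),
    ("ENDPROD", "Sys1", "end the production environment"),
    ("STRPROD", "Sys2", "start the production environment"),
]


def programs_on(system):
    """Return the names of the programs that run on the given system, in order."""
    return [name for name, runs_on, _purpose in PLANNED_SWITCH_PROGRAMS
            if runs_on == system]
```

A list like this makes it easy to see that only ENDPROD runs on Sys1 during the planned switch, which is why Sys1 must still be reachable for this procedure.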

SWITCHOVER-SWITCH Post-Switch Tasks

These tasks should be performed immediately after the switch:

Step

Action

Time done

10

On the Sys2 system, check that:

· The 10.230.24.20 TCP/IP interface is active (use command NETSTAT for this).

· The TCP host name is set to host (use command CFGTCP option 12 for this).

· The NETSERVER job is started (check for a job named QZLSSERVER).

11

Record all current journal receiver names and first sequence numbers of all journals on the new production system Sys2, by using the following command:

WRKDG OUTPUT(*OUTFILE) OUTFILE(MIMIXMSF/WRKDG) OUTMBR(SWITCH)

12

Start the other subsystems on the new production system Sys2 that you want to have active during this switch.

13

Signal to the test users group that they can start working. Ensure they understand they should connect to the same production host host. It should automatically connect to the new production system Sys2.

The system is now switched, and the production environment and host host are now active on the Sys2 partition. However, there is no replication yet to the original production system, so when appropriate, start reverse synchronization by following instruction 14 on the next page.

Procedure SYNCHRONIZE-SWITCH – Resynchronize

Goal

After the users were switched to the Sys2 system by the MSFname-SWITCH procedure, this procedure will start MIMIX replication back to the Sys1 system, which then effectively becomes the backup system.

The next pages hold the following groups of steps:

· SYNCHRONIZE-SWITCH Pre-Synchronization Tasks: To be executed before the synchronization.

· SYNCHRONIZE-SWITCH Synchronization Tasks: The actual synchronization.

· SYNCHRONIZE-SWITCH Post-Synchronization Tasks: Checking and cleaning up the synchronization.

SYNCHRONIZE-SWITCH Pre-Synchronization Tasks

These tasks should be performed some time prior to the synchronization:

Step

Action

Time done

14

Only perform this sync procedure if you are certain that the current production system is the Sys2 system.

15

Check that the Sys1 system is now available for use again (after being repaired or maintained).

16

Add the MIMIX and MIMIXWORK libraries to the library list on the Sys2 system.

17

On both systems, ensure the MIMIX journal and system managers are running and that the MIMIX data groups are not running. The MIMIX active replication can be checked from the following command:

WRKDG

18

On the Sys2 partition, check the status of the last switch:

DSPDTAARA DTAARA(MIMIX/MSFname)

The status is in position 32 – 35.

19

On the Sys2 partition, the above resulting status has to be either ‘BCMP’ or ‘FCMP’.

If not, then a previous switch did not complete normally, and you have to correct the status to ‘BCMP’.

SYNCHRONIZE-SWITCH Synchronization Tasks

These tasks are the actual synchronization from Sys2 to Sys1. The Sys2 is and remains the production system after these steps:

Step

Action

Time done

20

On the Sys2, check if you still have MIMIX and MIMIXWORK in the library list.

21

On the Sys2 system perform the following command:

MIMIX/RUNSWTFWK SWTFWK(MSFname) PRC(*SYNC)

This command will start MIMIX replication from the Sys2 system to the Sys1 system after user confirmation by issuing messages to QSYSOPR message queue.

22

On the Sys2 system answer the message in QSYSOPR message queue to confirm the switchover. Use SysRq 6 to go to the QSYSOPR message queue.

The RUNSWTFWK process will check the system and MIMIX status, and start replication from the Sys2 system to the Sys1 system.

This is done by internally calling the following programs. (This is shown here as documentation; these are not steps you need to perform.)

1. SWTFWKCFM (on system Sys2): This program asks the system operator for confirmation. It also internally calls program DGSELECT to determine which data groups to start. This program performs no MSFname-specific actions.

2. STRBCKUP (on system Sys1): This program starts the backup environment on the former production system Sys1. This program performs no MSFname-specific actions.

3. Starts the Data Groups to replicate from the Sys2 production system to the Sys1 system, which is now a backup system.

Note: The source for the above programs is located in file QSWTSRC in library MIMIXMSF on both systems.

SYNCHRONIZE-SWITCH Post-Synchronization Tasks

These tasks should be performed immediately after the synchronization:

Step

Action

Time done

23

Verify that the MIMIX data groups’ source jobs are active, and they are replicating from the Sys2 to the Sys1 partition. The target jobs should not yet be active. Use the following command:

WRKDG

24

Check that the Sys1 system still has no production users. Check that there are no production subsystems, and that the TCP/IP 10.230.24.20 interface is not active

25

When you feel the switch and sync are in order, you can now actually start writing the updates into the Sys1 system by starting the apply sessions.

(If you are unsure of the status of the switch or replicated data, postpone this step until advice from database and system administrators and testers has been obtained.)

STRDG DGDFN(*ALL) PRC(*ALLTGT)

The switchover of the production environment and host host from Sys1 to Sys2 is now fully complete, and the Sys1 is now the backup system. If you need to switch back, please use document and procedure MSFname-RETURN.

No more actions from this document are needed.

Procedure FAILOVER-SWITCH – Fail over to Backup

Goal

Following the unplanned deactivation (or system crash) of the Sys1 production system, the Sys2 is activated to take on production role and allow user connections.

Only use this procedure if the Sys1 production system is not reachable from the Sys2 backup system. Use the SWITCHOVER-SWITCH procedure on page 13 if the Sys1 system can still be reached by you as an administrator.

The next pages hold the following groups of steps:

· FAILOVER-SWITCH Pre-Switch Tasks: To be executed immediately before the unplanned switch.

· FAILOVER-SWITCH Unplanned Switch Tasks: The actual switch.

· FAILOVER-SWITCH Post-Switch Tasks: Checking and cleaning up the switch.

FAILOVER-SWITCH Pre-Switch Tasks

This procedure applies only to an unplanned switch. These tasks should be performed immediately prior to the switch:

Step

Action

Time done

1

Check if the production Sys1 system is really no longer reachable.

2

Check active replication and resolve problems discovered, as far as still possible. The operations procedures can be found in the operations section of the Runbook.

3

Add MIMIX libraries to the library list.

4

Review the latest audit results, which probably ran within the last 7 days. Check whether any replication or audit issues make it impossible for you to switch over to the Sys2 partition.

If there are any major issues in the audit results, you should have dealt with them when they were first reported. In any case, you may need to make a judgment call whether to switch or not.

The MIMIX audit screen can be reached with the following command:

WRKAUD

5

On the Sys2 partition, check the status of the last switch:

DSPDTAARA DTAARA(MIMIX/MSFname)

The status is in position 32 – 35.

6

On the Sys2 partition, the above resulting status has to be one of ‘SCMP’ or ‘PCMP’.

If not, then a previous switch did not complete normally, and you have to correct the status to ‘PCMP’.
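The check in steps 5 and 6 can be sketched in a few lines of Python. This is only an illustration, assuming the data area contents shown by DSPDTAARA have been copied into a string; the function names and the sample value are hypothetical and are not part of MIMIX.

```python
def switch_status(data_area_value: str) -> str:
    """Return the 4-character status held in positions 32-35 (counted from 1)."""
    # Positions 32-35 counted from 1 correspond to the Python slice [31:35].
    return data_area_value[31:35]


def status_allows_switch(data_area_value: str, accepted=("SCMP", "PCMP")) -> bool:
    """True when the last switch ended in one of the accepted statuses."""
    return switch_status(data_area_value) in accepted


# Hypothetical data area contents: 31 filler characters, then the status.
sample = " " * 31 + "PCMP"
print(switch_status(sample), status_allows_switch(sample))
```

The accepted values are passed as a parameter because other procedures in this runbook expect different statuses (for example ‘BCMP’ or ‘FCMP’ before a synchronization).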

FAILOVER-SWITCH Unplanned Switch Tasks

These tasks are the actual switch from Sys1 to Sys2:

Step

Action

Time done

7

On the Sys2, check if you still have MIMIX and MIMIXWORK in the library list.

8

On the Sys2 system perform the following command:

MIMIX/RUNSWTFWK SWTFWK(MSFname) PRC(*BCKUP) TYPE(*UNPLANNED)

This command checks whether the system is in the correct status. It then asks the user to confirm the switch by sending a message to the QSYSOPR message queue.

9

On the Sys2 system answer the message in QSYSOPR message queue to confirm the switchover. Use SysRq 6 to go to the QSYSOPR message queue.

The RUNSWTFWK process will end the production infrastructure, end MIMIX replication in a controlled manner, switch the MIMIX data group direction, switch the network connections, and start the production infrastructure on the new production system Sys2.

This is done by internally calling the following programs. (They are shown here as documentation; these are not steps you need to perform.)

· SWTFWKCFM (running on system Sys2): This program asks the system operator for confirmation. It also internally calls program DGSELECT to determine which data groups to switch. This program performs no MSFname-specific actions.

· ENDBCKUP (running on system Sys2): This program ends the backup environment on Sys2. This program performs no MSFname-specific actions.

· Program ENDPROD is not called, as this is an unplanned switch, and the production system is assumed crashed and unreachable.

· The replication to the Sys2 system is ended in a controlled manner and the data groups are switched in their direction of replication. Reverse replication is not started yet.

· STRPROD (running on system Sys2): This program starts the production environment on the Sys2 system. This program performs the following MSFname-specific actions:

· Sets the TCP/IP host name to host

· Sets the local system role indicator data area MXSYSROLE to “P” (in Production)

· Starts the TCP/IP interfaces 10.230.24.20

· Starts the MVXCMEX subsystem and controller

Note: The source for the above programs is located in file QSWTSRC in library MIMIXMSF on both systems.

FAILOVER-SWITCH Post-Switch Tasks

These tasks should be performed immediately after the switch:

Step

Action

Time done

10

On the Sys2 system, check that:

· The 10.230.24.20 TCP/IP interface is active. Use command NETSTAT for this.

· The TCP host name is set to host. Use command CFGTCP option 12 for this.

· The NETSERVER job is started. Check for a job named QZLSSERVER.

11

Record all current journal receiver names and first sequence numbers of all journals on the new production system Sys2, by using the following command:

WRKDG OUTPUT(*OUTFILE) OUTFILE(MIMIXMSF/WRKDG) OUTMBR(SWITCH)

12

Start the other subsystems on the new production system Sys2 that you want to have active during this switch.

13

Signal to the test users group that they can start working. Ensure they understand that they should connect to the same production host host. It should automatically connect to the new production system Sys2.

14

Also notify the users that an unplanned switch has occurred and ask them to check for the most recent transactions they entered just before the system crashed. (Most likely they will have noticed the unplanned switch already)

15

On the Sys1 system, physically disconnect the Ethernet cable, or at least ensure that, when the system starts, the 10.230.24.20 production IP address is not started during IPL. Also ensure that no prod environment or other productive subsystem is started during IPL.

16

The system is now switched, and the production environment and host host are now active on the Sys2 partition. However, there is no replication yet to the original production system, so when appropriate, start reverse synchronization by following instruction 14 of the synchronize procedure on page 17.

MSFname-RETURN from Sys2 to Sys1

Procedure MSFname-RETURN switches the production role from the Sys2 system to the Sys1 system. For a planned switch, use procedure SWITCHOVER-RETURN on page 23; for an unplanned switch, use procedure FAILOVER-RETURN on page 29.

A reminder: A switch is unplanned if the original production system is no longer accessible from the backup system. If the original production system is reachable, it is a planned switch, even if it was not scheduled or intended.
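The rule in the reminder above can be written down as a small decision helper. This is a minimal sketch; the function name and the direction labels are illustrative assumptions, and only the SWITCHOVER and FAILOVER procedure names come from this runbook.

```python
def choose_procedure(production_reachable: bool, direction: str = "RETURN") -> str:
    """Pick the procedure name for a switch in the given direction.

    direction is "SWITCH" (Sys1 to Sys2) or "RETURN" (Sys2 to Sys1).
    """
    if direction not in ("SWITCH", "RETURN"):
        raise ValueError("direction must be 'SWITCH' or 'RETURN'")
    # A reachable production system always means a planned switch,
    # even when the switch was not scheduled or intended.
    kind = "SWITCHOVER" if production_reachable else "FAILOVER"
    return f"{kind}-{direction}"


print(choose_procedure(True))   # production still reachable
print(choose_procedure(False))  # production not reachable
```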

[Figure: users and connections under MSFname-SWITCH and MSFname-RETURN, with MIMIX replication between System 1 and System 2.]

To switch from Sys1 to Sys2, use the separate document describing the MSFname-SWITCH procedure.

Procedure SWITCHOVER-RETURN – Switch to Backup

Goal

Switch users from the Sys2 system to the Sys1 system. Almost all the actions are initiated on the Sys1 system, which is the controlling system for this procedure.

The next pages hold the following groups of steps:

· SWITCHOVER-RETURN Pre-Switch Tasks: To be executed immediately before the planned switch.

· SWITCHOVER-RETURN Planned Switch Tasks: The actual switch.

· SWITCHOVER-RETURN Post-Switch Tasks: Checking and cleaning up the switch.

SWITCHOVER-RETURN Pre-Switch Tasks

These tasks should be performed immediately prior to the switch:

Step

Action

Time done

1

In the next step, connections to the system will be closed. To ensure you keep your connection, use a session that is not using the switched IP address 10.230.24.20, but one of the administrative addresses.

Create a 5250 telnet session to both systems. Then, on both systems, transfer your job to QCTL instead of QINTER:

TFRJOB JOBQ(QCTL)

2

On the Sys1, add MIMIX to the library list.

ADDLIBLE MIMIX

3

Check active replication and resolve problems discovered. The MIMIX active replication can be checked from the following command:

WRKDG

4

Review Object Activity entries for status of *ACTIVE and *FAILED. Resolve any entries that are lingering for more than a few minutes. On both systems:

WRKDGACTE STATUS(*ACTIVE *FAILED)

5

Review File & Tracking Entries to determine that no files are in any non-active status. Resolve any entries that are lingering for more than a few minutes. On both systems:

WRKDGFE STSVAL(*INACTIVE)

WRKDGOBJTE STSVAL(*INACTIVE)

WRKDGIFSTE STSVAL(*INACTIVE)

When you are ready for the actual switch, continue with step 6 below.
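Steps 4 and 5 ask you to resolve entries that linger "for more than a few minutes". The sketch below shows one way to apply such a threshold to a list of entries noted down from the displays. It is only an illustration: the tuple layout, the function name, and the 5-minute default are assumptions, not MIMIX output formats or recommended values.

```python
from datetime import datetime, timedelta


def lingering_entries(entries, now, max_age=timedelta(minutes=5)):
    """Return entries that have been *ACTIVE or *FAILED for longer than max_age.

    entries: iterable of (name, status, since) tuples, where since is a datetime
    recording when the entry was first seen in that status.
    """
    watched = {"*ACTIVE", "*FAILED"}
    return [
        (name, status, since)
        for name, status, since in entries
        if status in watched and now - since > max_age
    ]


# Hypothetical entries noted while reviewing the activity display.
now = datetime(2024, 1, 1, 12, 0)
noted = [
    ("OBJ1", "*ACTIVE", now - timedelta(minutes=10)),
    ("OBJ2", "*FAILED", now - timedelta(minutes=1)),
]
print(lingering_entries(noted, now))
```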

SWITCHOVER-RETURN Planned Switch Tasks

These tasks are the actual switch from Sys2 to Sys1, and are to be considered “downtime”.

Step

Action

Time done

6

Shut down the production environment.

Give the users adequate time to end their work on the production system.

Check that no interactive jobs are active, that no batch jobs are active, and that all scheduled batch jobs in job queues are held. All productive subsystems should also be ended.

7

Wait until all of the subsystems and environments ended above have really ended. It may take a few minutes for the MOVEX subsystems to end. Do *not* end the MIMIXSBS subsystem.

8

On the Sys1 system perform the following command:

MIMIX/RUNSWTFWK SWTFWK(MSFname) PRC(*BCKUP)

This command checks whether the system is in the correct status. It then asks the user to confirm the switch by sending a message to the QSYSOPR message queue.

9

On the Sys1 system, answer the message in the QSYSOPR message queue to confirm the switchover. Use SysRq 6 to go to the QSYSOPR message queue.

The RUNSWTFWK process will end the production infrastructure, end MIMIX replication in a controlled manner, switch the MIMIX data group direction, switch the network connections, and start the production infrastructure on the new production system Sys1.

This is done by internally calling the following programs. (They are shown here as documentation; these are not steps you need to perform.)

· SWTFWKCFM (running on system Sys1): This program asks the system operator for confirmation. It also internally calls program DGSELECT to determine which data groups to switch. This program performs no MSFname-specific actions.

· ENDBCKUP (running on system Sys1): This program ends the backup environment on Sys1. This program performs no MSFname-specific actions.

· ENDPROD (running on system Sys2): This program ends the production environment on Sys2. This program performs the following MSFname-specific actions:

· Sets the local system role indicator data area MXSYSROLE to “S” (in switch)

· Sets MIMIX auto-start indicator to “*SWITCH” to prevent auto-start of replication

· Ends the TCP/IP interfaces 10.230.24.20

· Kills all production subsystems. These subsystems should already have been ended nicely, but to prevent accidental activity after the switch, they are killed from this program.

· Removes the TCP/IP host name host

· Ends the MVXCMEX subsystem and controller

· The replication to the Sys1 system is ended in a controlled manner and the data groups are switched in their direction of replication. Reverse replication is not started yet.

· STRPROD (running on system Sys1): This program starts the production environment on the Sys1 system. This program performs the following MSFname-specific actions:

· Sets the TCP/IP host name to host

· Sets the local system role indicator data area MXSYSROLE to “P” (in Production)

· Starts the TCP/IP interfaces 10.230.24.20

· Starts the MVXCMEX subsystem and controller

Note: The source for the above programs is located in file QSWTSRC in library MIMIXMSF on both systems.

SWITCHOVER-RETURN Post-Switch Tasks

These tasks should be performed immediately after the switch:

Step

Action

Time done

10

On the Sys1 system, check that:

· The 10.230.24.20 TCP/IP interface is active. Use command NETSTAT for this.

· The TCP host name is set to host. Use command CFGTCP option 12 for this.

· The NETSERVER job is started. Check for a job named QZLSSERVER.

11

Record all current journal receiver names and first sequence numbers of all journals on the new production system Sys1, by using the following command:

WRKDG OUTPUT(*OUTFILE) OUTFILE(MIMIXMSF/WRKDG) OUTMBR(SWITCH)

12

Start the other subsystems on the new production system Sys1 that you want to have active during this switch.

13

Signal to the test users group that they can start working. Ensure they understand that they should connect to the same production host host. It should automatically connect to the new production system Sys1.

The system is now switched, and the production environment and host host are now active on the Sys1 partition. However, there is no replication yet to the original production system, so when appropriate, start reverse synchronization by following instruction 14 on the next page.

Procedure SYNCHRONIZE-RETURN – Resynchronize

Goal

After the users were switched to the Sys1 system by the MSFname-RETURN procedure, this procedure will start MIMIX replication back to the Sys2 system, which then effectively becomes the backup system.

The next pages hold the following groups of steps:

· SYNCHRONIZE Pre-Synchronization Tasks: To be executed before the synchronization.

· SYNCHRONIZE Synchronization Tasks: The actual synchronization.

· SYNCHRONIZE Post-Synchronization Tasks: Checking and cleaning up the synchronization.

SYNCHRONIZE-RETURN Pre-Synchronization Tasks

These tasks should be performed some time prior to the synchronization:

Step

Action

Time done

14

Only perform this sync procedure if you are certain that the current production system is the Sys1 system.

15

Check that the Sys2 system is available for use again (after being repaired or maintained).

16

Add the MIMIX library to the library list on the Sys1.

17

On both systems, ensure the MIMIX journal and system managers are running and that the MIMIX data groups are not running. The MIMIX active replication can be checked from the following command:

WRKDG

18

On the Sys1 partition, check the status of the last switch:

DSPDTAARA DTAARA(MIMIX/MSFname)

The status is in position 32 – 35.

19

On the Sys1 partition, the above resulting status has to be either ‘BCMP’ or ‘FCMP’.

If not, then a previous switch did not complete normally, and you have to correct the status to ‘BCMP’.

SYNCHRONIZE-RETURN Synchronization Tasks

These tasks are the actual synchronization from Sys1 to Sys2. The Sys1 is and remains the production system after these steps:

Step

Action

Time done

20

On the Sys1, check if you still have MIMIX and MIMIXWORK in the library list.

21

On the Sys1 system perform the following command:

MIMIX/RUNSWTFWK SWTFWK(MSFname) PRC(*SYNC)

This command will start MIMIX replication from the Sys1 system to the Sys2 system, after user confirmation obtained by sending a message to the QSYSOPR message queue.

22

On the Sys1 system answer the message in QSYSOPR message queue to confirm the switchover. Use SysRq 6 to go to the QSYSOPR message queue.

The RUNSWTFWK process will check the system and MIMIX status, and start replication from the Sys1 system to the Sys2 system.

This is done by internally calling the following programs: (This is shown here as documentation, these are not steps you need to perform)

· SWTFWKCFM (on system Sys1): This program asks the system operator for confirmation. It also internally calls program DGSELECT to determine which data groups to start. This program performs no MSFname-specific actions.

· STRBCKUP (on system Sys2): This program starts the backup environment on the former production system Sys2. This program performs no MSFname-specific actions.

· Starts the Data Groups to replicate from the Sys1 production system to the Sys2 system, which is now a backup system.

Note: The source for the above programs is located in file QSWTSRC in library MIMIXMSF on both systems.

SYNCHRONIZE-RETURN Post-Synchronization Tasks

These tasks should be performed immediately after the synchronization:

Step

Action

Time done

23

Verify that the MIMIX data groups’ source jobs are active, and they are replicating from the Sys1 to the Sys2 partition. The target jobs should not yet be active. Use the following command:

WRKDG

24

Check that the Sys2 system still has no production users, that no production subsystems are active, and that the TCP/IP interface 10.230.24.20 is not active.

25

When you are satisfied that the switch and sync are in order, start writing the updates into the Sys2 system by starting the apply sessions.

(If you are unsure of the status of the switch or replicated data, postpone this step until advice from database and system administrators and testers has been obtained.)

STRDG DGDFN(*ALL) PRC(*ALLTGT)

The switchover of the production environment and host host from Sys2 to Sys1 is now fully complete, and Sys2 is now the backup system. If you need to switch to Sys2 again, use the document and procedure MSFname-SWITCH.

No more actions from this document are needed.

Procedure FAILOVER-RETURN – Fail over to Backup

Goal

Following the unplanned deactivation (or system crash) of the Sys2 production system, the Sys1 system is activated to take on the production role and allow user connections.

Only use this procedure if the Sys2 production system is not reachable from the Sys1 backup system. Use the MSFname SWITCHOVER-RETURN procedure on page 23 if the Sys2 system can still be reached by you as an administrator.

The next pages hold the following groups of steps:

· FAILOVER Pre-Switch Tasks: To be executed immediately before the unplanned switch.

· FAILOVER Unplanned Switch Tasks: The actual switch.

· FAILOVER Post-Switch Tasks: Checking and cleaning up the switch.

FAILOVER Pre-Switch Tasks

This procedure applies only to an unplanned switch. These tasks should be performed immediately prior to the switch:

Step

Action

Time done

1

Check if the production Sys2 system is really no longer reachable.

2

Check active replication and resolve any problems discovered, as far as that is still possible. The operations procedures can be found in the operations section of the run book.

3

Add MIMIX library to the library list.

4

Review the latest audit results, which probably ran within the last 7 days. Check whether any replication or audit issues make it impossible for you to switch over to the Sys1 partition.

If there are any major issues in the audit results, you should have dealt with them when they were first reported. In any case, you may need to make a judgment call whether to switch or not.

The MIMIX audit screen can be reached with the following command:

WRKAUD

5

On the Sys1 partition, check the status of the last switch:

DSPDTAARA DTAARA(MIMIX/MSFname)

The status is in position 32 – 35.

6

On the Sys1 partition, the above resulting status has to be one of ‘SCMP’ or ‘PCMP’.

If not, then a previous switch did not complete normally, and you have to correct the status to ‘PCMP’.
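Step 1 of this procedure asks whether the production system is really unreachable. One simple probe is a TCP connection attempt, sketched below. The function name, the default port 23 (telnet), and the timeout are assumptions chosen for illustration. A refused or timed-out connection on a single port does not prove that the system is down, so treat the result as one input to the judgment, not as the decision itself.

```python
import socket


def is_reachable(host: str, port: int = 23, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and connects, honoring the timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

When probing, use one of the administrative addresses of the production system and not the switched production IP address, because the switched address moves to the other system during a switch.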

FAILOVER-RETURN Unplanned Switch Tasks

These tasks are the actual switch from Sys2 to Sys1:

Step

Action

Time done

7

On the Sys1, check if you still have MIMIX in the library list.

8

On the Sys1 system perform the following command:

MIMIX/RUNSWTFWK SWTFWK(MSFname) PRC(*BCKUP) TYPE(*UNPLANNED)

This command checks whether the system is in the correct status. It then asks the user to confirm the switch by sending a message to the QSYSOPR message queue.

9

On the Sys1 system answer the message in QSYSOPR message queue to confirm the switchover. Use SysRq 6 to go to the QSYSOPR message queue.

The RUNSWTFWK process will end the production infrastructure, end MIMIX replication in a controlled manner, switch the MIMIX data group direction, switch the network connections, and start the production infrastructure on the new production system Sys1.

This is done by internally calling the following programs. (They are shown here as documentation; these are not steps you need to perform.)

· SWTFWKCFM (running on system Sys1): This program asks the system operator for confirmation. It also internally calls program DGSELECT to determine which data groups to switch. This program performs no MSFname-specific actions.

· ENDBCKUP (running on system Sys1): This program ends the backup environment on Sys1. This program performs no MSFname-specific actions.

· Program ENDPROD is not called, as this is an unplanned switch, and the production system is assumed crashed and unreachable.

· The replication to the Sys1 system is ended in a controlled manner and the data groups are switched in their direction of replication. Reverse replication is not started yet.

· STRPROD (running on system Sys1): This program starts the production environment on the Sys1 system. This program performs the following MSFname-specific actions:

· Sets the TCP/IP host name to host

· Sets the local system role indicator data area MXSYSROLE to “P” (in Production)

· Starts the TCP/IP interfaces 10.230.24.20

· Starts the MVXCMEX subsystem and controller

Note: The source for the above programs is located in file QSWTSRC in library MIMIXMSF on both systems.

FAILOVER-RETURN Post-Switch Tasks

These tasks should be performed immediately after the switch:

Step

Action

Time done

10

On the Sys1 system, check that:

· The 10.230.24.20 TCP/IP interface is active. Use command NETSTAT for this.

· The TCP host name is set to host. Use command CFGTCP option 12 for this.

· The NETSERVER job is started. Check for a job named QZLSSERVER.

11

Record all current journal receiver names and first sequence numbers of all journals on the new production system Sys1, by using the following command:

WRKDG OUTPUT(*OUTFILE) OUTFILE(MIMIXMSF/WRKDG) OUTMBR(SWITCH)

12

Start the other subsystems on the new production system Sys1 that you want to have active during this switch.

13

Signal to the test users group that they can start working. Ensure they understand that they should connect to the same production host host. It should automatically connect to the new production system Sys1.

14

Also notify the users that an unplanned switch has occurred and ask them to check for the most recent transactions they entered just before the system crashed. (Most likely they will have noticed the unplanned switch already)

15

On the Sys2 system, physically disconnect the Ethernet cable, or at least ensure that, when the system starts, the 10.230.24.20 production IP address is not started during IPL. Also ensure that no prod environment or other productive subsystem is started during IPL.

16

The system is now switched, and the production environment and host host are now active on the Sys1 partition. However, there is no replication yet to the original production system, so when appropriate, start reverse synchronization by following instruction 14 of the synchronize procedure on page 27.

Appendix A: Runbook Hyperlinks

This appendix contains embedded links (Hyperlinks) that allow the reader to view related configuration information contained in separate files. The information in the files referenced by the links must be retrieved from the iSeries servers and stored on the workstation used to view this document.

To use the links properly, the files referenced by the links must be located precisely where the link specifies and must have the name identified by the link.

The following table contains the links to be used to view the detailed information considered part of this Runbook:

Link

Detailed Information Shown

MIMIX HA Configuration Information

system_definitions.csv

MIMIX System Definitions

transfer_definitions.csv

MIMIX Transfer Definitions

journal_definitions.csv

MIMIX Journal Definitions

remote_journal_links.csv

MIMIX Remote Journal Link Information

data_group_definitions.csv

MIMIX Data Groups

data_group_file_entries.csv

MIMIX Data Group File Entries

data_group_object_entries.csv

MIMIX Data Group Object Entries

data_group_object_tracking_entries.csv

MIMIX Data Group Object Tracking Entries

data_group_ifs_entries.csv

MIMIX Data Group IFS Entries

data_group_ifs_tracking_entries.csv

MIMIX Data Group IFS Tracking Entries

data_group_dlo_entries.csv

MIMIX Data Group DLO Entries

data_group_data_area_entries.csv

MIMIX Data Group Data Area Entries

collision_resolution_class_definitions.csv

MIMIX Collision Resolution Classes

Cluster_standard_application_groups.csv

MIMIX Standard Application Groups

Cluster_parent_application_groups.csv

MIMIX Parent Application Groups

Cluster_child_application_groups.csv

MIMIX Child Application Groups

Audit_schedule_definitions

MIMIX Audit Definitions

Mimix_ext_policy.csv

MIMIX Extended Policies

mimix_policy.csv

MIMIX Policies

mimix_rule.csv

MIMIX Audit Rules

mimix_rulg.csv

MIMIX Audit Rule Groups

mimix_grul.csv

MIMIX Audit Rule Group Rules

S1111111 Server Information

S1111111_ASP_information

Auxiliary Storage Pool (ASP) Information

S1111111_network_attributes.txt

Network Attributes

S1111111_TCPIP_interface_table_entries.csv

TCP/IP Interface Table Entries

S1111111_TCPIP_host_table_entries.csv

TCP/IP Host Table Entries

S1111111_TCPIP_domain_information.csv

TCP/IP Domain Information

S1111111_TCPIP_route_table_entries.csv

TCP/IP Routing Table Entries

S1111111_TCPIP_service_table_entries.csv

TCP/IP Service Table Entries

S2222222 Server Information

S2222222_ASP_information

Auxiliary Storage Pool (ASP) Information

S2222222_network_attributes.txt

Network Attributes

S2222222_TCPIP_interface_table_entries.csv

TCP/IP Interface Table Entries

S2222222_TCPIP_host_table_entries.csv

TCP/IP Host Table Entries

S2222222_TCPIP_domain_information.csv

TCP/IP Domain Information

S2222222_TCPIP_route_table_entries.csv

TCP/IP Routing Table Entries

S2222222_TCPIP_service_table_entries.csv

TCP/IP Service Table Entries

Appendix B: The Runbook Data Capture Tool

Vision Solutions provides a software tool to automatically create or refresh the files referenced by the Hyperlinks in the Runbook. The software tool runs on a Windows workstation and uses FTP to interact with the iSeries servers associated with the MIMIX solution. The tool creates or refreshes the files and, if selected, downloads them from the iSeries server to a folder on the workstation.

The Runbook Data Capture tool includes functions that run on the user’s workstation and functions that run on each of the iSeries servers from which data is to be retrieved. The intent is that the Runbook will contain the following information retrieved from the iSeries servers:

Server Type

Downloaded Files

Management System

MIMIX configuration information, Policy, Rules, and Rule Groups

Network attributes

TCP/IP configuration information

ASP information (for independent ASP environments)

Each Network System

Network attributes

TCP/IP configuration information

ASP information (for independent ASP environments)

To use the Runbook Data Capture tool, do the following:

Note: These instructions apply to V5 MIMIX or later. If you are working with an earlier level of MIMIX, contact your Vision Solution Manager for further information.

1. Go to the Support section of the www.visionsolutions.com site and log in through Support Login. After logging in, navigate to Support, Downloads, and then your product. At the bottom of the download page is a link to the collection tool. Click Runbook Automated Collection Tool to download the runbooksetup.exe file. This file can be placed in any folder you choose on your workstation.

2. Once the file is downloaded, run the runbooksetup.exe file. Doing this will create (if you accept the defaults) the following folder structure on your workstation:

C:\Program Files\Vision Solutions\Runbook\runbookdata

C:\Program Files\Vision Solutions\Runbook\runbooktool

The runbookdata folder will be empty initially. It is the default location for data to be downloaded later from the iSeries by the Runbook Data Capture tool.

The runbooktool folder will contain several objects, including the Read Me file and, most importantly, the runRunbook batch file used to run the tool itself.

3. When you are ready to capture data for the Runbook, follow the instructions in the README.TXT file.

Using and Updating the Links Used in this Document

When the configuration data files referenced by the links are downloaded from the iSeries by the Runbook Data Capture tool, the user of the tool is able to specify the folder on the workstation where the files will be placed.

· The default folder is:

C:\Program Files\Vision Solutions\Runbook\runbookdata

· You may specify any folder convenient to you

When the links in this document are used, they must refer to the folder where the downloaded files are stored. By default, this document expects to find the files in the same folder where the Runbook document itself is stored. For the links to work properly, one of the two following steps must be done:

· The files to be downloaded must either be downloaded into the folder where the Runbook Word document resides, or moved from where they were downloaded to that folder.

· The links in the Runbook Word document must be modified to reference the location where the downloaded files reside.
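Before relying on the links, it can help to confirm that the expected files are actually present in the folder the links point to. The sketch below is one way to do that. The function name is hypothetical, and the folder and file names in the usage line are examples taken from Appendix A; substitute the folder and the full list you actually use.

```python
from pathlib import Path


def missing_link_files(folder, expected_names):
    """Return the expected file names that are not present in folder."""
    base = Path(folder)
    # Keep the original order so the result is easy to compare with Appendix A.
    return [name for name in expected_names if not (base / name).is_file()]


# Example usage with a few of the file names listed in Appendix A.
expected = ["system_definitions.csv", "transfer_definitions.csv", "mimix_policy.csv"]
print(missing_link_files(".", expected))
```

An empty list means every expected file was found. Any names returned are links that would fail when selected.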

Updating The Links

There are two methods to update the links:

Method 1: Automatically

This method will alter all the links in the template to reference the folder in which the template is saved.

After ensuring that you have the .csv and .txt files generated by using the Runbook Data Capture Tool in the same folder as the copy of this Runbook document into which you want those .csv and .txt files embedded, perform the following steps:

· Move a copy of the Runbook document into the folder where you have (or will have) located your .csv and .txt files

· Open the Runbook document

· Do a "CTRL-A". That is, press and hold the CONTROL key, then while you have the CONTROL key depressed, press the A key also. Then release both keys. You should see that you have selected the entire document.

· Press the F9 key. This should open up a window that looks something like the following:

· Select the UPDATE ENTIRE TABLE button, and press OK. This should update all links in your entire document, as well as updating the table of contents. The links should now point to the folder in which you have the Runbook document and .csv files located.

Method 2: Manually

If it is preferred to manually modify the links, for example if for some reason you want to keep the Runbook document and .csv and .txt files in different folders, there are two attributes of each link that typically need to be modified:

1. The text displayed

This is a text string that appears in the document and is viewable when the document is printed. While this text string is not used in accessing the file associated with the link, by convention, it often consists of some of the characters in the file name. See the example below.

2. The file name

This is the full file name (including path information) of the file. It is used to access the file when the link is selected. This file name does not appear when the document is printed. See the example below.

To view these two link attributes, as well as to change them, do the following while editing the Runbook Word document:

· Highlight the link (being careful not to click on it).

· Click on the Microsoft Word Insert pull-down.

· Click on the Hyperlink option. The following window will pop up [the data group definition link was chosen for this example]:

· Click on the Browse for: File button and navigate to the location of the files

· Select the file

· Make the desired changes and press OK when completed.

Note: The Microsoft Word “Find/Replace” function may be used to globally change the hyperlink “text to display” specifications in this template. To make that easier to do, the following character strings are used only for hyperlinks in this entire template:

system_

transfer_

journal_

data_group_

collision_

S1111111_

S2222222_

The “Find/Replace” function has no effect on the “File Name” associated with a Hyperlink.

Last page of the Customer X Availability Runbook document

Variables used in this document:

This should be hidden, and not in view, make the window small enough to drop text out of view.

1 – in this window, Use Ctrl-A and Shift-F9 to show fields.

2 – Modify the fields’ value.

3 – Use Ctrl-A and F9 to record updated fields

4 – In main doc, Use Ctrl-A and F9 to update document. Do this twice!

5 – Review document, insert customer specific actions where needed.

To insert field in text, use menu Insert->Field->Ref-Field

Switch Framework and procedure names:

Name of the MSF: { SET MSF "MSFname" }

Suffix of this MSF: { SET MSFA "-SWITCH" }

Suffix of reverse MSF: { SET MSFB "-RETURN" }

Proc for planned switch (*BCKUP): { SET MSF_B "SWITCHOVER" }

Proc for unplanned switch: { SET MSF_U "FAILOVER" }

Proc for sync (*SYNC): { SET MSF_S "SYNCHRONIZE" }

Proc for return switch (Like *PROD): { SET MSF_P "SWITCHBACK" }

Proc for system recovery: { SET MSF_R "RECOVER" }

Customer name:

Customer name: { SET CUSTNAME "Customer X" }

System nodes:

System name of the production node: { SET PRODSYS "Sys1" }

System name of the backup node: { SET BCKUPSYS "Sys2" }

Node suffix (system or partition): { SET SYSSFX "system" }

Productive names:

Switched Production IP: { SET PRODIP "10.230.24.20" }

Switched alternate Production IP: { SET PRODA "10.230.24.20" }

Switched Host/System name: { SET PRODHOST "host" }

Description of the prod environment: { SET PRODDESC "prod environment" }

MIMIX Automation:

MIMIX program library: { SET MMXLIB "MIMIX" }

MIMIX MSF program library: { SET MMXCSTLIB "MIMIXMSF" }

MIMIX switch source file: { SET MMXMSFSRC "QSWTSRC" }

MIMIX IPL source file: { SET MMXIPLSRC "QIPLSRC" }

MIMIX other source file: { SET MMXCLSRC "QCLSRC" }

MIMIX HA Runbook template_08102007 Availability Runbook page 1 of 38

[Figure: MSFname switch cycle for MSFname-SWITCH and MSFname-RETURN. Phase #1 SWITCHOVER/FAILOVER, Phase #2 RESYNCHRONIZE, Phase #3 SWITCHOVER/FAILOVER, Phase #4 RESYNCHRONIZE; each switch is Type = *PLANNED or Type = *UNPLANNED.]

[Figure: users and connections under MSFname-SWITCH and MSFname-RETURN, with MIMIX replication between System 1 and System 2.]