
MIMIX® Availability™

Version 7.1
MIMIX Operations–5250

Notices

MIMIX Operations - 5250 User Guide
April 2014
Version: 7.1.21.00

© Copyright 1999, 2014 Vision Solutions®, Inc. All rights reserved.

The information in this document is subject to change without notice and is furnished under a license agreement. This document is proprietary to Vision Solutions, Inc., and may be used only as authorized in our license agreement. No portion of this manual may be copied or otherwise reproduced without the express written consent of Vision Solutions, Inc.

Vision Solutions provides no expressed or implied warranty with this manual.

The following are trademarks or registered trademarks of their respective organizations or companies:

• MIMIX and Vision Solutions are registered trademarks and AutoGuard, Data Manager, Director, Dynamic Apply, ECS/400, GeoCluster, IntelliStart, Integrator, iOptimize, iTERA, iTERA Availability, MIMIX AutoNotify, MIMIX Availability, MIMIX Availability Manager, MIMIX DB2 Replicator, MIMIX Director, MIMIX dr1, MIMIX Enterprise, MIMIX Global, MIMIX Monitor, MIMIX Object Replicator, MIMIX Professional, MIMIX Promoter, OMS/ODS, RecoverNow, Replicate1, RJ Link, SAM/400, Switch Assistant, Vision AutoValidate, and Vision Suite are trademarks of Vision Solutions, Inc.

• Double-Take Share, Double-Take Availability, and Double-Take RecoverNow—DoubleTake Inc.

• AIX, AIX 5L, AS/400, DB2, eServer, IBM, Informix, i5/OS, iSeries, OS/400, Power, System i, System i5, System p, System x, System z, and WebSphere—International Business Machines Corporation.

• Adobe and Acrobat Reader—Adobe Systems, Inc.

• HP-UX—Hewlett-Packard Company.

• Teradata—Teradata Corporation.

• Intel—Intel Corporation.

• Java, all Java-based trademarks, and Solaris—Sun Microsystems, Inc.

• Linux—Linus Torvalds.

• Internet Explorer, Microsoft, Windows, and Windows Server—Microsoft Corporation.

• Mozilla and Firefox—Mozilla Foundation.

• Netscape—Netscape Communications Corporation.

• Oracle—Oracle Corporation.

• Red Hat—Red Hat, Inc.

• Sybase—Sybase, Inc.

• Symantec and NetBackup—Symantec Corporation.

• UNIX and UNIXWare—the Open Group.

All other brands and product names are trademarks or registered trademarks of their respective owners.

If you need assistance, contact Vision Solutions’ CustomerCare team at:

CustomerCare
Vision Solutions, Inc.
Telephone: 1.800.337.8214 or 1.949.724.5465
Email: [email protected]
Web Site: www.visionsolutions.com/Support/Contact-CustomerCare.aspx

Contents

Who this book is for .......... 11
What is in this book .......... 11
The MIMIX documentation set .......... 11
Sources for additional information .......... 13
How to contact us .......... 14

Chapter 1  MIMIX overview  15
  MIMIX concepts .......... 17
  Product concepts .......... 17
  System role concepts .......... 18
  Journaling concepts .......... 19
  Configuration concepts .......... 20
  Process concepts .......... 21
  Additional switching concepts .......... 22
  Best practices for maintaining your MIMIX environment .......... 23
  Authority to products and commands .......... 23
  Accessing the MIMIX Main Menu .......... 24

Chapter 2  MIMIX policies  26
  Environment considerations for policies .......... 27
  Policies in environments with more than two nodes or bi-directional replication .......... 27
  When to disable automatic recovery for replication and auditing .......... 28
  Disabling audits and recovery when using the MIMIX CDP feature .......... 29
  Setting policies - general .......... 29
  Changing policies for an installation .......... 29
  Changing policies for a data group .......... 30
  Resetting a data group-level policy to use the installation level value .......... 30
  Policies which affect an installation .......... 31
  Changing retention criteria for procedure history .......... 31
  Policies which affect replication .......... 32
  Errors handled by automatic database recovery .......... 33
  Errors handled by automatic object recovery .......... 34
  Policies which affect auditing .......... 36
  Policies for auditing runtime behavior .......... 36
  Policies for submitting audits automatically .......... 37
  When automatically submitted audits run .......... 38
  Changing auditing policies .......... 41
  Changing when automatic audits are allowed to run .......... 41
  Changing scheduling criteria for automatic audits .......... 41
  Changing the selection frequency of priority auditing categories .......... 42
  Changing the audit level policy when switching .......... 43
  Changing the system where audits are performed .......... 43
  Changing retention criteria for audit history .......... 43
  Restricting auditing based on the state of the data group .......... 44
  Preventing audits from running .......... 45
  Disabling all auditing for an installation .......... 46
  Disabling all auditing for a data group .......... 46
  Disabling automatically submitted audits .......... 46
  Policies for switching with model switch framework .......... 48
  Specifying a default switch framework in policies .......... 48
  Setting policies for MIMIX Switch Assistant .......... 49
  Setting policies when MIMIX Model Switch Framework is not used .......... 49
  Policy descriptions .......... 50

Chapter 3  Checking status in environments with application groups  60
  Checking application group status .......... 60
  Resolving problems reported in the Monitors field .......... 61
  Resolving problems reported in the Notifications field .......... 63
  Resolving problems reported in Status columns .......... 64
  Resolving a procedure status problem .......... 64
  Resolving an *ATTN status for an application group .......... 65
  Resolving other common status values for an application group .......... 66
  Status for Work with Node Entries .......... 66
  Status for Work with Data Resource Group Entries .......... 68
  Verifying the sequence of the recovery domain .......... 70
  Changing the sequence of backup nodes .......... 71
  Examples of changing the backup sequence .......... 73

Chapter 4  Working with status of procedures and steps  77
  Displaying status of procedures .......... 78
  Displaying status of the last run of all procedures .......... 78
  Displaying available status history of procedure runs .......... 79
  Resolving problems with procedure status .......... 80
  Responding to a procedure in *MSGW status .......... 81
  Resolving a *FAILED or *CANCELED procedure status .......... 82
  Displaying status of steps within a procedure run .......... 83
  Resolving problems with step status .......... 85
  Responding to a step with a *MSGW status .......... 87
  Resolving *CANCEL or *FAILED step statuses .......... 88
  Acknowledging a procedure .......... 89
  Running a procedure .......... 90
  Resuming a procedure .......... 91
  Overriding the attributes of a step .......... 91
  Canceling a procedure .......... 92

Chapter 5  Monitoring status with MIMIX Availability Status  93
  Checking replication status from the MIMIX Availability Status display .......... 95
  Checking audit and notification status from the MIMIX Availability Status display .......... 96
  Checking status of supporting services from the MIMIX Availability Status display .......... 96

Chapter 6  Working with data group status  98
  The Work with Data Groups display .......... 99
  Problems reflected in the Audits/Recov./Notif. field .......... 101
  Problems reflected in the Data Group column .......... 101
  Resolving problems highlighted in the Data Group column .......... 102
  Manager problems reflected in the Source and Target columns .......... 103
  Replication problems reflected in the Source and Target columns .......... 103
  Setting the automatic refresh interval .......... 104
  Working with the detailed status of data groups .......... 105
  Displaying data group detailed status .......... 105
  Merged view .......... 106
  Object detailed status views .......... 110
  Database detailed status views .......... 112
  Identifying replication processes with backlogs .......... 115
  Data group status in environments with journal cache or journal state .......... 117
  Resolving a problem with journal cache or journal state .......... 119

Chapter 7  Working with audits  121
  Auditing overview .......... 122
  Components of an audit .......... 122
  Phases of audit processing .......... 123
  Object selection methods for automatic audits .......... 123
  How priority auditing determines what objects to select .......... 124
  How audits are submitted automatically .......... 124
  Audit status and results .......... 125
  Audit compliance .......... 125
  Guidelines and considerations for auditing .......... 126
  Auditing best practices .......... 126
  Considerations for specific audits .......... 127
  Recommendations when checking audit results .......... 127
  Displaying audit runtime status .......... 129
  Running an audit immediately .......... 131
  Resolving audit problems .......... 133
  Checking the job log of an audit .......... 135
  Ending audits .......... 136
  Displaying audit history .......... 137
  Audits with no selected objects .......... 139
  Working with audited objects .......... 139
  Displaying audited objects from a specific audit run .......... 141
  Displaying a customized list of audited objects .......... 141
  Working with audited object history .......... 142
  Displaying the audit history for a specific object .......... 143
  Displaying audit compliance .......... 144
  Determining whether auditing is within compliance .......... 145
  Displaying scheduling information for automatic audits .......... 147

Chapter 8  Working with system-level processes  149
  Displaying status of system-level processes .......... 149
  Resolving *ACTREQ status for a system manager .......... 151
  Checking for a system manager backlog .......... 151
  Starting a system manager or a journal manager .......... 152
  Ending a system manager or a journal manager .......... 152
  Starting collector services .......... 152
  Ending collector services .......... 153
  Starting target journal inspection processes .......... 153
  Ending target journal inspection processes .......... 154
  Displaying status of target journal inspection .......... 155
  Displaying results of target journal inspection .......... 156
  Displaying details associated with target journal inspection notifications .......... 157
  Displaying messages for TGTJRNINSP notifications .......... 157
  Identifying the last entry inspected on the target system .......... 158

Chapter 9  Working with notifications and recoveries  159
  What are notifications and recoveries .......... 159
  Displaying notifications .......... 160
  What information is available for notifications .......... 160
  Detailed information .......... 161
  Options for working with notifications .......... 162
  Notifications for newly created objects .......... 163
  Displaying recoveries .......... 164
  What information is available for recoveries .......... 165
  Detailed information .......... 166
  Options for working with recoveries .......... 166
  Orphaned recoveries .......... 167
  Determining whether a recovery is orphaned .......... 167
  Removing an orphaned recovery .......... 168

Chapter 10  Starting and ending replication  169
  Before starting replication .......... 171
  Commands for starting replication .......... 171
  What is started with the STRMMX command .......... 171
  STRMMX and ENDMMX messages .......... 172
  What is started by the default START procedure for an application group .......... 172
  Choices when starting or ending an application group .......... 172
  What occurs when a data group is started .......... 174
  Journal starting point identified on the STRDG request .......... 175
  Journal starting point when the object send process is shared .......... 175
  Clear pending and clear error processing .......... 175
  Starting MIMIX .......... 179
  Starting an application group .......... 180
  Starting selected data group processes .......... 181
  Starting replication when open commit cycles exist .......... 183
  Checking for open commit cycles .......... 183
  Resolving open commit cycles .......... 183
  Before ending replication .......... 184
  Commands for ending replication .......... 184
  Command choice by reason for ending replication .......... 184
  Additional considerations when ending replication .......... 186
  Ending immediately or controlled .......... 186
  Controlling how long to wait for a controlled end to complete .......... 187
  Ending all or selected processes .......... 187
  When to end the RJ link .......... 188
  What is ended by the ENDMMX command .......... 188
  What is ended by the default END procedure for an application group .......... 189
  What occurs when a data group is ended .......... 190
  Ending MIMIX .......... 192
  Ending with default values .......... 192
  Ending by prompting the ENDMMX command .......... 192
  After you end MIMIX products .......... 193
  Ending an application group .......... 194
  Ending a data group in a controlled manner .......... 195
  Preparing for a controlled end of a data group .......... 195
  Performing the controlled end .......... 195
  Confirming the end request completed without problems .......... 196
  Ending selected data group processes .......... 198
  What replication processes are started by the STRDG command .......... 199
  What replication processes are ended by the ENDDG command .......... 203

Chapter 11  Resolving common replication problems  207
  Working with message queues .......... 208
  Working with the message log .......... 209
  Working with user journal replication errors .......... 210
  Working with files needing attention (replication and access path errors) .......... 210
  Working with journal transactions for files in error .......... 213
  Placing a file on hold .......... 214
  Ignoring a held file .......... 214
  Releasing a held file at a synchronization point .......... 215
  Releasing a held file .......... 215
  Releasing a held file and clearing entries .......... 216
  Correcting file-level errors .......... 216
  Correcting record-level errors .......... 217
  Record written in error .......... 217
  Working with tracking entries .......... 219
  Accessing the appropriate tracking entry display .......... 219
  Holding journal entries associated with a tracking entry .......... 221
  Ignoring journal entries associated with a tracking entry .......... 222
  Waiting to synchronize and release held journal entries for a tracking entry .......... 222
  Releasing held journal entries for a tracking entry .......... 223
  Releasing and clearing held journal entries for a tracking entry .......... 223
  Removing a tracking entry .......... 223
  Working with objects in error .......... 224
  Using the Work with DG Activity Entries display .......... 225
  Retrying data group activity entries .......... 227
  Retrying a failed data group activity entry .......... 227
  Determining whether an activity entry is in a delay/retry cycle .......... 228
  Removing data group activity history entries .......... 229

Chapter 12  Starting, ending, and verifying journaling  230
  What objects need to be journaled .......... 231
  Authority requirements for starting journaling .......... 232
  MIMIX commands for starting journaling .......... 233
  Journaling for physical files .......... 235
  Displaying journaling status for physical files .......... 235
  Starting journaling for physical files .......... 235
  Ending journaling for physical files .......... 236
  Verifying journaling for physical files .......... 237
  Journaling for IFS objects .......... 238
  Displaying journaling status for IFS objects .......... 238
  Starting journaling for IFS objects .......... 238
  Ending journaling for IFS objects .......... 239
  Verifying journaling for IFS objects .......... 240
  Journaling for data areas and data queues .......... 241
  Displaying journaling status for data areas and data queues .......... 241
  Starting journaling for data areas and data queues .......... 241
  Ending journaling for data areas and data queues .......... 242
  Verifying journaling for data areas and data queues .......... 243

Chapter 13  Switching  244
  About switching .......... 244
  Planned switch .......... 245
  Unplanned switch .......... 246
  Switching application group environments with procedures .......... 247
  Switching data group environments with MIMIX Model Switch Framework .......... 248
  Switching an application group .......... 250
  Switching a data group-only environment .......... 251
  Switching to the backup system .......... 251
  Synchronizing data and starting MIMIX on the original production system .......... 252
  Switching to the production system .......... 252
  Determining when the last switch was performed .......... 253
  Checking the last switch date .......... 253
  Problems checking switch compliance .......... 254
  Performing a data group switch .......... 255
  Switch Data Group (SWTDG) command .......... 257

Chapter 14  Less common operations  259
  Starting the TCP/IP server .......... 260
  Ending the TCP/IP server .......... 261
  Working with objects .......... 262
  Displaying long object names .......... 262
  Considerations for working with long IFS path names .......... 262
  Displaying data group spooled file information .......... 262
  Viewing status for active file operations .......... 263
  Displaying a remote journal link .......... 264
  Displaying status of a remote journal link .......... 265
  Identifying data groups that use an RJ link .......... 267
  Identifying journal definitions used with RJ .......... 268
  Disabling and enabling data groups .......... 269
  Procedures for disabling and enabling data groups .......... 270
  Determining if non-file objects are configured for user journal replication .......... 271
  Determining how IFS objects are configured .......... 271
  Determining how data areas or data queues are configured .......... 272
  Using file identifiers (FIDs) for IFS objects .......... 273
  Operating a remote journal link independently .......... 274
  Starting a remote journal link independently .......... 274
  Ending a remote journal link independently .......... 274

Chapter 15  Troubleshooting - where to start  276
  Gathering information before reporting a problem .......... 278
  Obtaining MIMIX and IBM i information from your system .......... 278
  Reducing contention between MIMIX and user applications .......... 279
  Data groups cannot be ended .......... 280
  Verifying a communications link for system definitions .......... 281
  Verifying the communications link for a data group .......... 282
  Verifying all communications links .......... 282
  Checking file entry configuration manually .......... 283
  Data groups cannot be started .......... 285
  Cannot start or end an RJ link .......... 286
  Removing unconfirmed entries to free an RJ link .......... 286
  RJ link active but data not transferring .......... 287
  Errors using target journal defined by RJ link .......... 288
  Verifying data group file entries .......... 289
  Verifying data group data area entries .......... 289
  Verifying key attributes .......... 289
  Working with data group timestamps .......... 291
  Automatically creating timestamps .......... 291
  Creating additional timestamps .......... 291
  Creating timestamps for remote journaling processing .......... 292
  Deleting timestamps .......... 293
  Displaying or printing timestamps .......... 293
  Removing journaled changes .......... 294
  Performing journal analysis .......... 295
  Removing journal analysis entries for a selected file .......... 297

Appendix A  Interpreting audit results - supporting information  299
  Interpreting results for configuration data - #DGFE audit .......... 300
  When the difference is “not found” .......... 302
  Interpreting results of audits for record counts and file data .......... 303
  What differences were detected by #FILDTA .......... 303
  What differences were detected by #MBRRCDCNT .......... 304
  Interpreting results of audits that compare attributes .......... 306
  What attribute differences were detected .......... 306
  Where was the difference detected .......... 308
  What attributes were compared .......... 309

Appendix B  IBM Power™ Systems operations that affect MIMIX  310
  MIMIX procedures when performing an initial program load (IPL) .......... 310
  MIMIX procedures when performing an operating system upgrade .......... 312
  Prerequisites for performing an OS upgrade on either system .......... 313
  MIMIX-specific steps for an OS upgrade on a backup system .......... 313
  MIMIX-specific steps for an OS upgrade on a production system with switching .......... 315
  MIMIX-specific steps for an OS upgrade on the production system without switching .......... 317
  MIMIX procedures when upgrading hardware without a disk image change .......... 319
  Considerations for performing a hardware system upgrade without a disk image change .......... 319
  MIMIX-specific steps for a hardware upgrade without a disk image change .......... 320
  Hardware upgrade without a disk image change - preliminary steps .......... 320
  Hardware upgrade without a disk image change - subsequent steps .......... 321
  MIMIX procedures when performing a hardware upgrade with a disk image change .......... 321
  Considerations for performing a hardware system upgrade with a disk image change .......... 322
  MIMIX-specific steps for a hardware upgrade with a disk image change .......... 322
  Hardware upgrade with a disk image change - preliminary steps .......... 323
  Hardware upgrade with a disk image change - subsequent steps .......... 324
  Handling MIMIX during a system restore .......... 326
  Prerequisites for performing a restore of MIMIX .......... 327

Index  328

Who this book is for

The MIMIX Operations - 5250 book describes how to perform routine operational tasks and basic troubleshooting for MIMIX® Enterprise™ and MIMIX® Professional™ from a 5250 emulator.

What is in this book

The MIMIX Operations - 5250 book provides these distinct types of information:

• A summary of concepts within MIMIX

• Application group and data group status and troubleshooting

• Audit status, troubleshooting, scheduling, and history

• Procedures for starting, ending, and switching replication

• Procedures for starting, ending, and verifying journaling

• Procedures for handling MIMIX when performing operations such as IPLs or hardware and operating system upgrades.

The MIMIX documentation set

The following documents about MIMIX® Availability™ products are available:

Using License Manager

License Manager currently supports MIMIX® Availability™, iTERA Availability™, and iOptimize™. This book describes software requirements, system security, and other planning considerations for installing software and software fixes for Vision Solutions products that are supported through License Manager. The preferred way to obtain license keys and install software is by using Vision AutoValidate™ and the product’s Installation Wizard. However, if you cannot use the wizard or AutoValidate, this book provides instructions for obtaining licenses and installing software from a 5250 emulator. This book also describes how to use the additional security functions from Vision Solutions which are available for License Manager and MIMIX and implemented through License Manager.

MIMIX Administrator Reference

This book provides detailed conceptual, configuration, and programming information for MIMIX® Enterprise™ and MIMIX® Professional™. It includes checklists for setting up several common configurations, information for planning what to replicate, and detailed advanced configuration topics for custom needs. It also identifies what information can be returned in outfiles if used in automation.

MIMIX Operations with IBM i Clustering

This book is for administrators and operators in an IBM i clustering environment who either use the basic support for IBM i clustering provided within MIMIX or who use MIMIX® Global™ to integrate cluster management with MIMIX logical replication or supported hardware-based replication techniques. This book focuses on addressing problems reported in MIMIX status and basic operational procedures such as starting, ending, and switching.

MIMIX Operations - 5250

This book provides high level concepts and operational procedures for managing your high availability environment using MIMIX® Enterprise™ or MIMIX® Professional™ from a 5250 emulator. This book focuses on tasks typically performed by an operator, such as checking status, starting or stopping replication, performing audits, and basic problem resolution.

Using MIMIX Monitor

This book describes how to use the MIMIX Monitor user and programming interfaces available with MIMIX® Enterprise™ or MIMIX® Professional™. This book also includes programming information about MIMIX Model Switch Framework and support for hardware switching.

Using MIMIX Promoter

This book describes how to use MIMIX commands for copying and reorganizing active files. MIMIX Promoter is available with MIMIX® Enterprise™ and as a no-charge feature for MIMIX® Professional™.

MIMIX for IBM WebSphere MQ

This book identifies requirements for the MIMIX for MQ feature which supports replication in IBM WebSphere MQ environments. This book describes how to configure MIMIX for this environment and how to perform the initial synchronization and initial startup. Once configured and started, all other operations are performed as described in the MIMIX Operations - 5250 book.

Sources for additional information

This book refers to other published information. The following information, plus additional technical information, can be located in the IBM System i and i5/OS Information Center.

From the Information Center you can access these IBM Power™ Systems topics, books, and redbooks:

• Backup and Recovery

• Journal management

• DB2 Universal Database for IBM Power™ Systems Database Programming

• Integrated File System Introduction

• Independent disk pools

• OptiConnect for OS/400

• TCP/IP Setup

• IBM redbook Striving for Optimal Journal Performance on DB2 Universal Database for iSeries, SG24-6286

• IBM redbook AS/400 Remote Journal Function for High Availability and Data Replication, SG24-5189

• IBM redbook Power™ Systems iASPs: A Guide to Moving Applications to Independent ASPs, SG24-6802

The following information may also be helpful if you replicate journaled data areas, data queues, or IFS objects:

• DB2 UDB for iSeries SQL Programming Concepts

• DB2 Universal Database for iSeries SQL Reference

• IBM redbook AS/400 Remote Journal Function for High Availability and Data Replication, SG24-5189

How to contact us

For contact information, visit our Contact CustomerCare web page.

If you are current on maintenance, support for MIMIX products is also available when you log in to Support Central.

It is important to include product and version information whenever you report problems.

CHAPTER 1 MIMIX overview

This book provides operational information and procedures for using MIMIX® Enterprise™ and MIMIX® Professional™ through the 5250 emulator user interface. For simplicity, this book uses the term MIMIX to refer to the functionality provided by either product unless a more specific name is necessary.

MIMIX® Availability™ version 7.1 provides high availability for your critical data in a production environment on IBM Power™ Systems through real-time replication of changes and the ability to quickly switch your production environment to a ready backup system. These capabilities allow your business operations to continue when you have planned or unplanned outages in your System i environment. MIMIX also provides advanced capabilities that can help ensure the integrity of your MIMIX environment.

Replication: MIMIX continuously captures changes to critical database files and objects on a production system, sends the changes to a backup system, and applies the changes to the appropriate database file or object on the backup system. The backup system stores exact duplicates of the critical database files and objects from the production system.

MIMIX uses two replication paths to address different pieces of your replication needs. These paths operate with configurable levels of cooperation or can operate independently.

• The user journal replication path captures changes to critical files and objects configured for replication through a user journal. When configuring this path, shipped defaults use the remote journaling function of the operating system to simplify sending data to the remote system. In previous versions, MIMIX DB2 Replicator provided this function.

• The system journal replication path handles replication of critical system objects (such as user profiles, program objects, or spooled files), integrated file system (IFS) objects, and document library objects (DLOs) using the system journal. In previous versions, MIMIX Object Replicator provided this function.

Configuration choices determine the degree of cooperative processing used between the system journal and user journal replication paths when replicating database files, IFS objects, data areas, and data queues.

Switching: One common use of MIMIX is to support a hot backup system to which operations can be switched in the event of a planned or unplanned outage. If a production system becomes unavailable, its backup is already prepared for users. In the event of an outage, you can quickly switch users to the backup system where they can continue using their applications. MIMIX captures changes on the backup system for later synchronization with the original production system. When the original production system is brought back online, MIMIX assists you with analysis and synchronization of the database files and other objects.


Automatic verification and correction: MIMIX enables earlier and easier detection of problems known to adversely affect maintaining availability and switch-readiness of your replication environment. MIMIX automatically detects and corrects potential problems during replication and auditing. MIMIX also helps to ensure the integrity of your MIMIX configuration by automatically verifying that the files and objects being replicated are what is defined to your configuration.

MIMIX is shipped with these capabilities enabled. The incorporated best practices for maintaining availability and switch-readiness are key to ensuring that your MIMIX environment is in tip-top shape for protecting your data. User interfaces allow you to fine-tune these capabilities to the needs of your environment.

Analysis: MIMIX also provides advanced analysis capabilities through the MIMIX portal application for Vision Solutions Portal (VSP). When using the VSP user interface, you can see what objects are configured for replication as well as what replicated objects on the target system have been changed by people or programs other than MIMIX. (Objects changed on the target system affect your data integrity.) You can also check historical arrival and backlog rates for replication to help you identify trends in your operations that may affect MIMIX performance.

Uses: MIMIX is typically used among systems in a network to support a hot backup system. Simple environments have one production system and one backup system. More complex environments have multiple production systems or backup systems. MIMIX can also be used on a single system.

You can view the replicated data on the backup system at any time without affecting productivity. This allows you to generate reports, submit (read-only) batch jobs, or perform backups to tape from the backup system. In addition to real-time backup capability, replicated databases and objects can be used for distributed processing, allowing you to off-load applications to a backup system.

The topics in this chapter include:

• “MIMIX concepts” on page 17 summarizes key concepts that you need to know about MIMIX.

• “Best practices for maintaining your MIMIX environment” on page 23 summarizes recommendations from Vision Solutions.

• “Authority to products and commands” on page 23 identifies authority levels to MIMIX functions when additional security features provided by Vision Solutions are used.

• “Accessing the MIMIX Main Menu” on page 24 describes the MIMIX Basic Main Menu and the MIMIX Intermediate Main Menu. The MIMIX Basic Main Menu is used to access the MIMIX Availability Status (WRKMMXSTS) display; a brief example of reaching that display follows this list.
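As a quick illustration of the last item above, the following 5250 command-line sketch shows one way an operator might reach the MIMIX Availability Status display. The installation library name MIMIX is an assumption (installations are often given other library names), so treat this as a hedged example rather than the definitive procedure; see “Accessing the MIMIX Main Menu” on page 24 for the supported steps.

   /* Assumption: the product is installed in a library named MIMIX.           */
   ADDLIBLE LIB(MIMIX)      /* add the installation library to the library list */
   WRKMMXSTS                /* open the MIMIX Availability Status display       */

   /* Alternatively, qualify the command with the installation library:        */
   MIMIX/WRKMMXSTS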


MIMIX concepts

The following subtopics organize the basic concepts associated with MIMIX® into related groups. More detailed information is available in the MIMIX Administrator Reference book.

Product concepts

MIMIX installation - The network of IBM Power™ Systems servers that transfer data and objects among each other using functions of a common MIMIX product. A MIMIX installation is defined by the way in which you configure the MIMIX product for each of the participating systems. A system can participate in multiple independent MIMIX installations.

Replication - The activity that MIMIX performs to continuously capture changes to critical database files and objects on a production system as they occur, send the changes to a backup system, and apply the changes to the appropriate database file or object on the backup system.

Switch - The process by which a production environment is moved from one system to another system and the production environment is made available there. A switch may be performed as part of a planned event such as for system maintenance, or an unplanned event such as a power or equipment failure. MIMIX provides customizable functions for switching.

Audits - Audits are predetermined programs that are used to check for differences in replicated objects and other conditions between systems. Audits run and can correct detected problems automatically. Policies control when audits run and many other aspects of how audits are performed. Additional auditing concepts and recommendations are described in the auditing chapter of this book.

Automatic recovery - MIMIX provides a set of functions that can automatically correct problems detected in a MIMIX installation during database replication, object replication, and auditing. During these activities, when MIMIX detects any of a set of scenarios known to interfere with maintaining your MIMIX environment, it automatically starts recovery actions to correct them. Through policies, you can disable automatic recovery in any of these areas at the installation or data group level.

Application group - A MIMIX construct used to group and control resources from a single point in a way that maintains relationships between them. The use of application groups is a best practice for MIMIX® Professional™ and MIMIX® Enterprise™ and is required for MIMIX® Global™.

Data group - A MIMIX construct that is used to control replication activities. A data group is a logical grouping of database files, data areas, objects, IFS objects, DLOs, or a combination thereof that defines a unit of work by which MIMIX replication activity is controlled. A data group may represent an application, a set of one or more libraries, or all of the critical data on a given system. Application environments may define a data group as a specific set of files and objects.
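Because a data group is the unit by which replication is controlled, many day-to-day operations name one on the STRDG and ENDDG commands covered later in this book. The minimal sketch below is illustrative only: the DGDFN parameter name, the three-part data group name (name, first system, second system), and the values INVENTORY, PROD, and BACKUP are assumptions for this example, so prompt the commands with F4 in your installation to confirm the actual parameters and values.

   /* Hypothetical data group INVENTORY defined between systems PROD and BACKUP. */
   STRDG DGDFN(INVENTORY PROD BACKUP)   /* start replication processes            */
   ENDDG DGDFN(INVENTORY PROD BACKUP)   /* end replication processes              */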


Prioritized status - MIMIX assigns a priority to status values to ensure that problems with the highest priorities, those for detected problems or situations that require immediate attention or intervention, are reflected on the highest level of the user interface. Additional detail and lower priority items can be viewed by drilling down to the next level within the interfaces. Those interfaces are the Work with Systems display and depending on your configuration, either the Work with Application Groups display or the Work with Data Groups display.

Policies - A policy is a mechanism used to enable, disable, or provide input to a function such as replication, auditing, or MIMIX Model Switch Framework. For most policies, the initially shipped values apply to an installation. However, policies can be changed and most can also be overridden for individual data groups. Policies that control when audits are automatically performed can be set only for each specific combination of audit rule and data group.

Notifications - A notification is the resulting automatic report associated with an event that has already occurred. The severity of a notification is reflected in the overall status of the installation. Notifications can be generated by a process, program, command, or monitor. Because the originator of notifications varies, it is important to note that notifications can represent both real-time events as well as events that occurred in the past but, due to scheduling, are being reported in the present.

Recoveries - The term recovery is used in two ways. The most common use refers to the recovery action taken by a replication process or an audit to correct a detected difference when automatic recovery policies are enabled. The second use refers to a temporary report, created when a recovery action starts and removed when it completes, that provides details about the recovery action in progress.

System role concepts

MIMIX uses several pairs of terms to refer to the role of a system within a particular context. These terms are not interchangeable.

Production system and backup system - These terms describe the role of a system relative to the way applications are used on that system.

A production system is the system currently running the production workload for the applications. In normal operations, the production system is the system on which the principal copy of the data and objects associated with the application exist.

A backup system is the system that is not currently running the production workload for the applications. In normal operations, the backup system is the system on which you maintain a copy of the data and objects associated with the application. These roles are not always associated with a specific system. For example, if you switch application processing to the backup system, the backup system temporarily becomes the production system.

Typically, for normal operations in a basic two-system environment, replicated data flows from the system running the production workload to the backup system.

Source system and target system - These terms identify the direction in which an activity occurs between two participating systems.


A source system is the system from which MIMIX replication activity between two systems originates. In replication, the source system contains the journal entries. Information from the journal entries is either replicated to the target system or used to identify objects to be replicated to the target system.

A target system is the system on which MIMIX replication activity between two systems completes.

Management system and network system - These terms define the role of a system relative to how the products interact within a MIMIX installation. These roles remain associated with the system within the MIMIX installation to which they are defined. One system in the MIMIX installation is designated as the management system and the remaining one or more systems are designated as network systems.

A management system is the system in a MIMIX installation that is designated as the control point for all installations of the product within the MIMIX installation. The management system is the location from which work to be performed by the product is defined and maintained. Often the system defined as the management system also serves as the backup system during normal operations.

A network system is any system in a MIMIX installation that is not designated as the management system (control point) of that MIMIX installation. Work definitions are automatically distributed from the management system to a network system. Often a system defined as a network system also serves as the production system during normal operations.

Journaling concepts

MIMIX uses journaling to perform replication and to support newer analysis functionality.

Journaling and object auditing - Journaling and object auditing are techniques that allow object activity to be logged to a journal. Journaling logs activity for selected objects of specific object types to a user journal. Object auditing logs activity for all objects to the security audit journal (QAUDJRN, the system journal), including those defined to a user journal. MIMIX relies on these techniques and the entries placed in the journal receivers for replicating logged activity.

Journal - An IBM i system object that identifies the objects being journaled and the journal receivers associated with the journal. The system journal is a specialized journal on the system which MIMIX uses.

Journal receiver - An IBM i system object that is associated with a journal and contains the log of all activity for objects defined to the journal.

Journal entry - A record added to a journal receiver that identifies an event that occurred on a journaled object. MIMIX uses file and record level journal entries to recreate the object on a designated system.
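To make these journaling terms concrete, the sketch below shows the underlying IBM i commands that create a journal receiver and a journal, start journaling a physical file to that user journal, and set object auditing so that activity for another object is logged to the system journal (QAUDJRN). The library and object names (APPLIB, APPRCV001, APPJRN, CUSTOMER, CUSTDTAARA) are hypothetical, and in practice you would normally let the MIMIX journaling commands described in the journaling chapter of this book perform this work rather than issuing these commands directly.

   /* Hypothetical names: library APPLIB, receiver APPRCV001, journal APPJRN.        */
   CRTJRNRCV JRNRCV(APPLIB/APPRCV001)                       /* create a journal receiver  */
   CRTJRN JRN(APPLIB/APPJRN) JRNRCV(APPLIB/APPRCV001)       /* create the user journal    */
   STRJRNPF FILE(APPLIB/CUSTOMER) JRN(APPLIB/APPJRN) IMAGES(*BOTH)   /* journal a file    */
   CHGOBJAUD OBJ(APPLIB/CUSTDTAARA) OBJTYPE(*DTAARA) OBJAUD(*CHANGE) /* log to QAUDJRN    */
   DSPJRN JRN(APPLIB/APPJRN)                                /* view the journal entries   */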

Remote journaling - A function of IBM i that allows you to establish journals and journal receivers on one system and associate them with specific journals and journal receivers on another system. Once the association is established, the operating system can use the pair of journals to replicate journal entries in one direction, from the local journal to the remote journal on the other system. In some configurations, MIMIX uses remote journaling for transferring data to be replicated from the source system to the target system.

Configuration concepts

MIMIX configuration provides considerable flexibility to support a wide variety of customer environments. Configuration is implemented through sets of related commands. The following terms describe configuration concepts.

Definitions - MIMIX uses several types of named definitions to identify related configuration choices.

• System definitions identify systems that participate in a MIMIX installation. Each system definition identifies one system.

• Transfer definitions identify the communications path and protocol to be used between systems.

• Journal definitions identify journaling environments that MIMIX uses for replication. Each journal definition identifies a system and characteristics of the journaling environment on that system.

• Data group definitions identify the characteristics of how replication occurs between two systems. Each data group definition determines the direction in which replication occurs between the systems, whether that direction can be switched, and the default processing characteristics for replication processes.

• Application group definitions identify whether the replication environment does or does not use IBM i clustering. When clustering is used, the application group also defines information about an application or proprietary programs necessary for controlling operations in the clustering environment.

Data group entries - A data group entry is a configuration construct that identifies a source of information to be replicated by or excluded from replication by a data group. Each entry identifies at least one object and its location on the source system. Classes of data group entries are based on object type. MIMIX uses data group entries to determine whether a journal entry should be replicated. Data groups that replicate from both the system journal and a user journal can have any combination of data group entries.

Remote journal link (RJ link) - An RJ link is a MIMIX configuration element that identifies an IBM i remote journaling environment used by user journal replication processes. An RJ link identifies the journal definitions that define the source and target journals, primary and secondary transfer definitions for the communications path used by MIMIX, and whether the IBM i remote journal function sends journal entries asynchronously or synchronously.

Cooperative processing - Cooperative processing refers to MIMIX techniques that efficiently replicate certain object types by using a coordinated effort between the system journal and user journal replication paths. Configuration choices in data group definitions and data group entries determine the degree of cooperative processing used between the system journal and user journal replication paths when replicating database files, IFS objects, data areas, and data queues.

Tracking entries - Tracking entries identify objects that can be replicated using advanced journaling techniques and assist with tracking the status of their replication. A unique tracking entry is associated with each IFS object, data area, and data queue that is eligible for replication using advanced journaling. IFS tracking entries identify eligible, existing IFS objects while object tracking entries identify eligible, existing data areas and data queues.

Process concepts

The following terms identify MIMIX processes. Some, like the system manager, are required to allow MIMIX to function. Others, like procedures, are used only when invoked by users.

Replication path - A replication path is a series of processes used for replication that represent the critical path on which data to be replicated moves from its origin to its destination. MIMIX uses two replication paths to accommodate differences in how replication occurs for user journal and system journal entries. These paths operate with configurable levels of cooperation or can operate independently.

• The user journal replication path captures changes to critical files and objects configured for replication through a user journal. When configuring this path, shipped defaults use the remote journaling function of the operating system to simplify sending data to the remote system. The changes are applied to the target system.

• The system journal replication path handles replication of critical system objects (such as user profiles, program objects, or spooled files), integrated file system (IFS) objects, and document library objects (DLOs) using the system journal. Information about the changes is sent to the target system, where it is applied.

System manager - The system manager is a pair of communications jobs between the management system and a network system which must be active to enable replication. The system manager monitors for configuration changes and automatically moves any configuration changes to the network system. Dynamic status changes are also collected and returned to the management system. The system manager also gathers messages and timestamp information from the network system and places them in a message log and timestamp file on the management system. In addition, the system manager performs periodic maintenance tasks, including cleanup of the system and data group history files.

Journal manager - The journal manager is a job on each system that MIMIX uses to maintain the journaling environment on that system. By default, MIMIX performs both change management and delete management for journal receivers used by the replication process.

Collector services - A group of jobs that are necessary for MIMIX to track historical data and to support using the MIMIX portal application within the Vision Solutions Portal. One or more collector service jobs collect and combine MIMIX status from all systems.

Cluster services - When MIMIX Global is configured for IBM i clustering, MIMIX uses the cluster services function provided by IBM i to integrate the system management functions needed for clustering. Cluster services must be active in order for a cluster node to be recognized by the other nodes in the cluster. MIMIX integrates starting and stopping cluster services into status and commands for controlling processes that run at the system level.

Target journal inspection - A MIMIX process that reads a journal on a system being used as the target system for replication. The process identifies people or processes other than MIMIX that accessed replicated objects on the target system. Users can access the resulting information from the Replicated Objects portlet within the MIMIX portal application in Vision Solutions Portal.

Procedures and steps - Procedures and steps are a highly customizable means of performing operations for application groups. A set of default procedures for each application group provide the ability to start, end, perform pre-check activity for switching, and switch the application group. Each operation is performed by a procedure that consists of a sequence of steps and multiple jobs. Each step calls a predetermined step program to perform a specific sub-task of the larger operation. Steps also identify runtime attributes for handling before and after the program call within the context of the procedure.

Log space - A MIMIX object that provides an efficient storage and manipulation mechanism for replicated data that is temporarily stored on the target system during the receive and apply processes.

Additional switching concepts

The following concepts are specific to switching.

Environments configured with application groups perform switching through procedures.

Planned switch - An intentional change to the direction of replication for any of a variety of reasons. You may need to take the system offline to perform maintenance on its hardware or software, or you may be testing your disaster recovery plan. In a planned switch, the production system (the source of replication) is available. When you perform a planned switch, replication is ended on both the source and target systems. The next time you start replication, it will be set to replicate in the opposite direction.

Unplanned switch - A change to the direction of replication in response to a problem. Most likely the production system is no longer available. When you perform an unplanned switch, you must initiate the switch from the target system. Replication is ended on the target system. The next time you start replication, it will be set to replicate in the opposite direction.

MIMIX Model Switch Framework - A set of programs and commands that provide a consistent framework to be used when performing planned or unplanned switches in environments that do not use application groups. Typically, a model switch framework is customized to your environment through its exit programs.

MIMIX Switch Assistant - A guided user interface that guides you through switching using your default MIMIX Model Switch Framework. MIMIX Switch Assistant is accessed from the MIMIX Basic Main Menu and does not support application groups.


Best practices for maintaining your MIMIX environment

MIMIX is shipped with default settings that incorporate many best practices for maintaining your environment. Others may require changing policies and adopting new practices within your organization. Best practices include:

• Allow MIMIX to automatically correct differences detected during database and object replication processes that would otherwise result in errors. If MIMIX is unable to perform the recovery, the problem is reported as a replication error (a file is placed in held error or an object is in error).

• Allow MIMIX to automatically perform audits and to automatically recover any differences detected by audits. Best practice is to allow regularly scheduled audits of all objects configured for replication and daily audits of prioritized categories of replicated objects. User interfaces summarize audit results and indicate whether MIMIX is unable to recover an object.

• Perform all audits with the audit level set at level 30 immediately prior to a planned switch to the backup system and before switching back to the production system.

• Perform switches on a regular basis. Best practice is to switch every three to six months. You need to set aside time for performing planned switches. Environments that continue to use MIMIX Switch Assistant can use policies so that compliance with regular switching is automatically reported in the user interface.

Authority to products and commands

If your MIMIX environment takes advantage of the additional security available in the product and command authority functions which Vision Solutions provides through License Manager, you may need a higher authority level in order to perform MIMIX daily operations.

A MIMIX administrator can change your authorization level to commands and displays. Authorization levels typically fall into these categories:

• Viewing information requires display (*DSP) authority.

• Controlling operations requires operator (*OPR) authority.

• Creating or changing configuration requires management (*MGT) authority.

For example, consider audits. You can view an audit if you have display authority, perform audits if you have operator authority, and change policies that affect how auditing is performed if you have management authority.

For more information about these provided security functions, see the Using License Manager book.


Accessing the MIMIX Main Menu

The MIMIX command accesses the main menu for a MIMIX installation. The MIMIX Main Menu has two assistance levels, basic and intermediate. The command defaults to the basic assistance level, shown in Figure 1, with its options designed to simplify day-to-day interaction with MIMIX. Figure 2 shows the intermediate assistance level.

The options on the menu vary with the assistance level. In either assistance level, the available options also depend on the MIMIX products installed in the installation library and their licensing. The products installed and the licensing also affect subsequent menus and displays.

Accessing the menu - If you know the name of the MIMIX installation you want, you can use the name to library-qualify the command, as follows:

Type the command library-name/MIMIX and press Enter. The default name of the installation library is MIMIX.
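For example, with the default installation library name, type MIMIX/MIMIX and press Enter. For an installation in a library named MIMIXPRD (a hypothetical name used here only for illustration), you would type MIMIXPRD/MIMIX.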

If you do not know the name of the library, do the following:

1. Type the command LAKEVIEW/WRKPRD and press Enter.

2. Type a 9 (Display product menu) next to the product in the library you want on the Vision Solutions Installed Products display and press Enter.

Changing the assistance level - The F21 key (Assistance level) on the main menu toggles between basic and intermediate levels of the menu. You can also specify the Assistance level (ASTLVL) parameter on the MIMIX command.
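For example, to open the menu directly at the intermediate assistance level, a command similar to the following could be used. The value *INTERMED shown here is an assumption; prompt the command with F4 to confirm the values the ASTLVL parameter actually accepts.

library-name/MIMIX ASTLVL(*INTERMED)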

Figure 1. MIMIX Basic Main Menu

                           MIMIX Basic Main Menu
                                                           System:   SYSTEM1
 MIMIX
 Select one of the following:
      1. Work with application groups             WRKAG
      2. Start MIMIX
      3. End MIMIX
      4. Switch all application groups
      5. Start or complete switch using Switch Asst.
      6. Work with data groups                    WRKDG
     10. Availability status                      WRKMMXSTS
     11. Configuration menu
     12. Work with monitors                       WRKMON
     13. Work with messages                       WRKMSGLOG
     14. Cluster menu
                                                                      More...
 Selection or command
 ===>
 F3=Exit   F4=Prompt   F9=Retrieve   F21=Assistance level   F12=Cancel
 (C) Copyright Vision Solutions, Inc., 1990, 2014.

Note: On the MIMIX Basic Main Menu, options 5 (Start or complete switch using Switch Asst.) and 10 (Availability Status) are not recommended for installations that use application groups.

Figure 2. MIMIX Intermediate Main Menu

                        MIMIX Intermediate Main Menu
                                                           System:   SYSTEM1
 MIMIX
 Select one of the following:
      1. Work with data groups                    WRKDG
      2. Work with systems                        WRKSYS
      3. Work with messages                       WRKMSGLOG
      4. Work with monitors                       WRKMON
      5. Work with application groups             WRKAG
      6. Work with audits                         WRKAUD
      7. Work with procedures                     WRKPROC
     11. Configuration menu
     12. Compare, verify, and synchronize menu
     13. Utilities menu
     14. Cluster menu
                                                                      More...
 Selection or command
 ===>
 F3=Exit   F4=Prompt   F9=Retrieve   F21=Assistance level   F12=Cancel
 (C) Copyright Vision Solutions, Inc., 1990, 2014.

CHAPTER 2 MIMIX policies

Each MIMIX policy is a mechanism used to enable, disable, or provide input to a function such as replication, auditing, or MIMIX Model Switch Framework. A policy may also determine how you are notified about certain problems that may occur.

For most policies, the initially shipped values apply to an installation. However, policies can be changed and most can also be overridden for individual data groups. When a policy is set for a data group, it takes precedence over the installation policy. Some policies, such as ones that control when audits are automatically submitted, apply to individual audit rules for specific data groups.

Policies must be changed from the management system. Changing policies requires that you have management-level authority to the Set MIMIX Policy (SETMMXPCY) command.

You can set policies from a command line or from the Work with Audits, the MIMIX Availability Status, and the Work with DG Definitions displays.

The topics in this chapter include:

• “Environment considerations for policies” on page 27 describes additional considerations for setting policies for environments with more than two nodes or bi-directional replication. Also, applications and features can conflict with policy-controlled automatic recovery functions.

• “Setting policies - general” on page 29 provides basic procedures for changing policies. Other topics in this chapter include more in-depth procedures for specific policy-controlled functionality.

• “Policies which affect an installation” on page 31 identifies the policies that are set for an installation and which cannot be overridden by a data group-level setting. Also, this includes procedures for changing retention criteria for procedure history.

• “Policies which affect replication” on page 32 identifies the policies associated with automatic error detection and correction during replication and identifies the common object and file error situations that can be automatically recovered.

• “Policies which affect auditing” on page 36 identifies policies that influence audit runtime behavior and control scheduling for automatically submitted audits. Shipped audits and their descriptions and default scheduling details are included.

• “Changing auditing policies” on page 41 provides additional information and procedures for changing policies associated with auditing. This includes changing the auditing level before switching, changing automatic audit scheduling, changing audit history retention, restricting auditing based on the state of data groups, and disabling auditing.

• “Policies for switching with model switch framework” on page 48 identifies the policies associated with model switch framework and includes instructions for changing these policies.

• “Policy descriptions” on page 50 describes policies used by MIMIX.

Environment considerations for policies

Default settings for policies are chosen to address the needs of a broad set of customer environments. However, in more complex environments, you need to consider the effect of policies. Also, applications and other MIMIX features in some environments can conflict with automatic recovery actions during replication and with auditing.

Policies in environments with more than two nodes or bi-directional replication

Policy values may affect data throughout your entire environment, not just a single installation or data group. This is of particular concern in environments that have more than two systems (nodes) or which have replication occurring simultaneously in more than one direction (bi-directional). Specifically, be aware of the following:

• In these environments, the value *DISABLED for the Objects only on target policy is recommended. When the policy is disabled, audits will detect that objects exist only on the target system but will not attempt to correct them. The commands used by an audit are aware of all objects on the target system, not just those which originate from the source system of the data group associated with the audit. In these environments, the values *DELETE and *SYNC must be used with care. When the policy value is Delete, audits will delete objects which may have originated from systems not associated with the data group being audited. When the policy value is Synchronize, audits will synchronize the objects to the source system of the data group being audited, which may not be the source system from which they originated.

• Synchronization of user profiles and authorization lists associated with an object will occur unless the user profiles and authorization lists are explicitly excluded from the data group configuration. In the environments mentioned, this may result in user profiles and authorization lists being synchronized to other systems in your configuration. This behavior occurs whenever any of the automatic recovery policies are enabled (database, object, audit). To prevent this from occurring, you must explicitly exclude the user profiles and authorization lists from replication for any data group for which you do not want them synchronized.

• In a simultaneously bi-directional environment, determine which system ‘wins’ in the event of a data conflict, that is, which system will be considered as having the correct data. Choose one direction of replication that will be audited and allow auditing for those data groups. Disable audits for data groups that replicate in the opposite direction. For example, data groups AB and BA are configured for bi-directional replication between system A and system B. Data group AB replicates from system A to system B and data group BA replicates in the opposite direction. System B is also the management system for this installation. You chose system A as the winning system and want to permit auditing in the direction from A to B. The Audit level policy for data group AB must be set to a level that permits audits to run (level 10 or higher). The Audit level policy for data group BA must be set to disable audits. The results of audits of data group AB will be available on system B, because system B is the management system and default policy values cause rules to be run from the management system.

• In environments with three or more systems in the same installation, you need to evaluate each pair of systems. For each pair of systems, evaluate the directions in which replication is permitted. If any pair of systems supports simultaneous bi-directional replication, determine the winning system in each pair and determine the direction to be audited. Set the audit level policy to permit auditing for the data group that replicates in the chosen direction. Disable auditing for the data group which replicates in the other direction. You may also want to consider changing the values of the Run rule on system policy for the installation or the audited data groups to balance processing loads associated with auditing.

• In environments that permit multiple management systems in the same installation, in addition to evaluating the direction of replication permitted within each pair of systems, you must also consider whether the systems defined by each data group are both management systems. If any pair of systems supports simultaneous bi-directional replication, choose the winning system and change the Audit level policies for each data group so that only one direction is audited. You may need to change the Run rule on system policy to prevent certain data groups from being audited from specific management systems.

When to disable automatic recovery for replication and auditing

At times, you may need to disable automatic recoveries during replication and auditing for certain data groups because a feature in use or an application being replicated may interact with auditing in an undesirable way.

Features - Do not use automatic recoveries during auditing and replication in any data group that is using functions provided by the MIMIX CDP™ feature. This feature, which requires an additional license key, permits you to perform operations associated with maintaining continuous data protection. By configuring a recovery window for a data group, you introduce an automatic delay into when the apply processes complete replication. By setting a recovery point for a data group, you identify a point that, when reached, will cause the apply processes to be suspended.

In both cases, source system changes have been transferred to the target system but have not been applied. In such an environment, comparisons will report differences and automatic recoveries will attempt recovery for items that have not completed replication. To prevent this from occurring, disable comparisons and automatic recoveries for any data group which uses the MIMIX CDP feature. For details, see “Disabling audits and recovery when using the MIMIX CDP feature” on page 29.

Applications - At times, data groups for some applications will encounter problems if the application cannot acquire locks on objects that are defined to MIMIX. These data groups may need to be excluded from auditing. MIMIX occasionally acquires locks to save and restore objects within the replication environment. Some applications may fail when they cannot acquire a lock on an object. Refer to our Support Central for FAQs that list specific applications whose data groups should be excluded from auditing. For those excluded data groups, you can still run compares to determine if objects are not synchronized between source and target systems. Care must be taken to recover from these unsynchronized conditions. The applications may need to be ended prior to manually synchronizing the objects.

To exclude a data group from audits, use the instructions in “Preventing audits from running” on page 45.

Disabling audits and recovery when using the MIMIX CDP feature

The functions provided by the MIMIX CDP™ feature (which requires an additional license key) create an environment in which source system changes have been transferred to the target system but have not been applied. Any data group which uses this feature must disable automatic comparisons and automatic recovery actions for the data group.

Do the following from the management system:

1. From the command line type SETMMXPCY and press F4 (Prompt).

2. For the Data group definition, specify the full three-part name of the data group that uses the MIMIX CDP feature.

3. Press Enter to see all the policies and their current values.

4. For Automatic object recovery, specify *DISABLED.

5. For Automatic database recovery, specify *DISABLED.

6. For Automatic audit recovery, specify *DISABLED.

7. For Audit level, select *DISABLED.

8. To accept the changes, press Enter.

Setting policies - general

Policies must be changed from the management system. Changing policies requires that you have management-level authority to the Set MIMIX Policy (SETMMXPCY) command.

The following are the basic procedures for setting policies.

Changing policies for an installation

This procedure changes a policy value at the installation level. The installation-level value will be overridden if a data group-level policy has been specified with a value other than *INST.

Do the following from the management system:

1. From the command line type SETMMXPCY and press F4 (Prompt).

2. Verify that the value specified for Data group definition is *INST.

3. Press Enter to see all the policies and their current values.

4. Specify a value for the policy you want. Use F1 (Help) to view descriptions of possible values.


5. To accept the changes, press Enter.

Changing policies for a data group

Do the following from the management system:

1. From the command line type SETMMXPCY and press F4 (Prompt).

2. For the Data group definition, specify the full three-part name.

3. Press Enter to see all the policies and their current values.

4. Specify a value for the policy you want defined for the data group. Use F1 (Help) to view descriptions of possible values.

5. To accept the changes, press Enter.

Resetting a data group-level policy to use the installation level value

Do the following from the management system:

1. From the command line type SETMMXPCY and press F4 (Prompt).

2. For the Data group definition, specify the full three-part name.

3. Press Enter to see all the policies and their current values.

4. For the policy you want to reset, specify *INST.

5. To accept the changes, press Enter.

Policies which affect an installation

While many policies can be set for an installation, the policies in Table 1 cannot be overridden for an individual data group. At the data group level, these policies always have a value of *INST.

Changing retention criteria for procedure history

The procedure history retention policy determines how long to retain historical information about procedure runs that completed, completed with errors, or that failed or were canceled and have been acknowledged.

Environments configured with application groups use procedures to control operations such as starting, ending, or switching. History information for a procedure includes timestamps indicating when the procedure was run and detailed information about each step within the procedure. The policy specifies how many days to keep history information and the minimum number of runs to keep. You can specify a different number of runs to keep for switch procedure runs than what is kept for other types of procedures.

Each procedure run is evaluated individually against the policy and its history information is retained until the specified minimum days and minimum runs are both met. When a procedure run exceeds these criteria, system manager cleanup jobs will remove the historical information for that procedure run from all systems. The values specified at the time the cleanup jobs run are used for evaluation.
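For example, with the shipped values of 7 days and a minimum of 1 run per procedure, the most recent run of a procedure is always retained, and the history for an older run is removed only after that run is more than 7 days old and a more recent run exists.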

To change the procedure history retention policy for the installation, do the following:

1. From the command line type SETMMXPCY and press F4 (Prompt).

2. Verify that the value *INST is specified for the Data group definition prompt.

3. Press Enter to see all the policies and their current values.

4. Locate the Procedure history retention policy. The current values are displayed. Specify values for the elements you want to change.

5. To accept the changes, press Enter.

Table 1. Policies that can be set only at the installation level and shipped default values.

Policy                                    Shipped Value (Installation)
Independent ASP library ratio             5
Procedure history retention
  • Minimum days                          7
  • Minimum runs per procedure            1
  • Min. runs per switch procedure        1

Policies which affect replication

Table 2 identifies the policies which can affect replication and their shipped default values.

MIMIX can automatically attempt to correct problems it encounters during replication when the policies for Automatic system journal recovery and Automatic user journal recovery are enabled. The following topics identify what errors can be recovered in this way:

• “Errors handled by automatic database recovery” on page 33

• “Errors handled by automatic object recovery” on page 34

Table 2. Policies associated with replication and shipped default values.

                                            Shipped Values               Replication Processes
Policy                                      Installation   Data Groups   System Journal   User Journal
Data group definition                       *INST          Name (1)      Yes              Yes
Automatic system journal recovery           *ENABLED       *INST         Yes (2)          –
Automatic user journal recovery             *ENABLED       *INST         –                Yes (2)
System journal recovery notify on success   *YES           *INST         Yes              –
User journal recovery notify on success     *YES           *INST         –                Yes
DB apply cache                              *DISABLED      *INST         –                Yes
Access path maintenance (3)                                              –                Yes
  • Optimize for DB apply                   *DISABLED      *INST
  • Maximum number of jobs                  99             *INST
Synchronize threshold size                  9,999,999      *INST         Yes              Yes
Number of third delay retry attempts        100            *INST         Yes              –
Third delay retry interval                  15             *INST         Yes              –

1. A data group definition value of *INST indicates the policy is installation-wide. A name indicates the policies are in effect only for the specified data group.
2. When this policy is enabled, the other policies in the same column are in effect unless otherwise noted.
3. This policy is available only on systems running service pack 7.1.15.00 or higher. When running on earlier levels, the Parallel AP maintenance provides similar functionality. For more information about both access path maintenance functions, see the MIMIX Administrator Reference book.

Errors handled by automatic database recovery

MIMIX can detect and correct the most common file error situations that occur during database replication. When the Automatic database recovery policy is enabled, database replication processes detect the types of errors listed in Table 3. When an error is detected, MIMIX automatically attempts to correct the error by starting a job to perform an appropriate recovery action.

The recovery action also sends a report of a recovery in progress to the user interface. The reports are on the Work with Recoveries display (WRKRCY command). When the recovery action completes, the report is removed.
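For example, from a command line you can open this display with the library-qualified command: installation-library/WRKRCY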

The DB rcy. notify on success policy determines whether a successful recovery generates an informational notification.

Only when all recovery options are exhausted without success is a file placed in hold error (*HLDERR) status. Recovery actions that end in an error do not generate a separate error notification because the error is already reflected in MIMIX status.

Table 3. Errors detected and corrected during database replication when automatic database recovery is enabled.

File level errors and unique-key record level errors - Typically invoked when there is a missing library, file, or member. Also invoked when an attempt to write a record to a file results in a unique key violation. Without database autonomics, these conditions result in the file being placed in *HLDERR status.

Record level errors - Invoked when the database apply process detects a data-level issue while processing record-level transactions. Without database autonomics, any configured collision resolution methods may attempt to correct the error. Otherwise, these conditions result in the file being placed in *HLDERR status.

Errors on IFS objects configured for user journal replication - Invoked during the priming of IFS tracking entries when replicated IFS objects are determined to be missing from the target system. Priming of tracking entries occurs when a data group is started after a configuration change or when Deploy Data Grp. Configuration (DPYDGCFG) is invoked.

Errors on data area and data queue objects configured for user journal replication - Invoked during the priming of object tracking entries when replicated data area and data queue objects are determined to be missing from the target system. Priming of tracking entries occurs when a data group is started after a configuration change or when the Deploy Data Grp. Configuration (DPYDGCFG) command is invoked.

Errors when DBAPY cannot open the file or apply transactions to the file - Invoked when a temporary lock condition or an operating system condition exists that prevents the database apply process (DBAPY) from opening the file or applying transactions to the file. Without database autonomics, users typically have to release the file so the database apply process (DBAPY) can continue without error.

Errors handled by automatic object recovery

MIMIX can detect and correct the most common object error situations that occur during replication. When the Automatic object recovery policy is enabled, object replication processes detect the types of errors listed in Table 4. When an error is detected, MIMIX automatically attempts to correct the error by starting a job to perform an appropriate recovery action.

Unless the object is explicitly excluded from replication for a data group, the autonomic recovery action will synchronize the object to ensure that it is on the target system.

Note: Object automatic recovery does not detect or correct the following problems:

• Missing spooled files on the target system.

• Files and objects that are cooperatively processed. Although the files and objects are not addressed, problems with authorities for cooperatively processed files and objects are addressed.

• Activity entries that are “stuck” in a perpetual pending status (PR, PS, PA, or PB).

The recovery action also sends a report of a recovery in progress to the user interface. In a 5250 emulator, the reports are on the Work with Recoveries display (WRKRCY command). When the recovery action completes, the report is removed.

The Obj. rcy. notify on success policy determines whether a successful recovery generates an informational notification.

Only when all recovery options are exhausted without success is an activity entry placed in error status. Recovery actions that end in an error do not generate a separate error notification because the error is already reflected in MIMIX status.

Table 4. Errors detected and recoveries attempted by object autonomics during object replication.

Missing objects on target system (1) - An object (library-based, IFS, or DLO) exists on the source system and is within the name space for replication, but MIMIX detects that the object does not exist on the target system. Without object automatic recovery, this results in a failed activity entry.
Notes:
• Missing spooled files are not addressed.
• Missing objects that are configured for cooperative processing are not synchronized. However, any problems with authorities (*AUTL or *USRPRF) for the missing objects are addressed.

Missing parent objects on target system (1) - Any operation against an object whose parent object is missing on the target system. Without object autonomics, this condition results in a failed activity entry due to the missing parent object.

Missing *USRPRF objects on target system (1) - Any operation that requires a user profile object (*USRPRF) that does not exist on the target system. Without object autonomics, this results in authority or object owner issues that cause replication errors.

Missing *AUTL objects on target system (1) - Any operation that requires an authority list (*AUTL) that does not exist on the target system. Without object autonomics, this results in authority issues that cause replication errors.

In-use condition - Applications which hold persistent locks on objects can result in object replication errors if the configured values for delay/retry intervals are exceeded. Default values in the data group definition provide approximately 15 minutes during which MIMIX attempts to access the object for replication. If the object cannot be accessed during this time, the result is activity entries with errors of Failed Retrieve (for locked objects on the source system) and Failed Apply (for locked objects on the target system) and a reason code of *INUSE.
Notes:
• The Number of third delay/retries policy and the Third retry interval policy determine whether automatic recovery is attempted for this error.
• Automatic recovery for this error is not attempted when the objects are configured for cooperative processing.

1. The synchronize command used to automatically recover this problem during replication will correct this error any time the command is used.

Policies which affect auditing

Policies for auditing are divided into these subsets:

• Policies that affect the behavior of all audits in an installation. These policies can be overridden at the data group level. When set for a specific data group, these policies affect all audits for the data group.

• Policies that affect when audits automatically run and how those audits select objects. These policies are set for each unique combination of audit and data group.

Policies for auditing runtime behavior

The policies identified in Table 5 affect all audit runs regardless of whether the audit was automatically submitted or manually invoked. These policies can be set for the installation as well as overridden for an individual data group. The shipped default values for both levels are indicated.

When the Set MIMIX Policies (SETMMXPCY) command specifies a data group definition value of *INST, the policies being changed are effective for all data groups in the installation, unless a data group-level override exists. When the data group definition specifies a name, policies which specify the value *INST inherit their value from the installation-level policy value and policies which specify other values are in effect for only the specified data group.
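For example, if the installation-level value of the Audit level policy is *LEVEL30 and data group ABC (a hypothetical name) specifies *INST for that policy, audits for ABC run at level 30. If ABC instead specifies *DISABLED, auditing is disabled only for ABC and other data groups continue to use the installation-level value.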

Table 5. Shipped default values of policies associated with auditing runtime behavior.

                                            Shipped Values
Policy                                      Installation   Data Groups
Data group definition                       *INST          Name
Automatic audit recovery                    *ENABLED       *INST
Audit notify on success                     *RULE          *INST
Notification severity                       *RULE          *INST
Object only on target action                *DISABLED      *INST
Journal attribute differences action
  • MIMIX configured higher                 *CHGOBJ        *INST
  • MIMIX configured lower                  *NOCHG         *INST
User journal apply threshold action         *END           *INST
Maximum rule runtime                        1440           *INST
Audit warning threshold (1)                 7              *INST
Audit action threshold (1)                  14             *INST
Audit level                                 *LEVEL30       *INST
Run rule on system                          *MGT           *INST
Action for running audits
  • Inactive data group                     *NOTRUN (2)    *INST
  • Repl. process in threshold              *NOTRUN        *INST
Audit history retention
  • Minimum days                            7              *INST
  • Minimum runs per audit                  1              *INST
  • Object details                          *YES           *INST
  • DLO and IFS details                     *YES           *INST
Synchronize threshold size                  9,999,999      *INST
CMPRCDCNT commit threshold                  *NOMAX         *INST

1. These policies are not limited to recovery actions.
2. This is the default shipped value on systems running MIMIX service pack 7.1.12.00 or higher. For earlier software levels, the shipped default value is *NONE.

Policies for submitting audits automatically

The Audit rule, Audit schedule, and Priority audit policies control when audits are automatically submitted. These policies do not have a shipped value for the installation level. The shipped values for the data group level are listed in Table 6.

If the Audit level policy is disabled, all auditing is disabled, regardless of the values specified for Audit schedule and Priority audit policies. This includes manually submitted audits.

Each shipped audit rule has default values for submitting priority audits as well as scheduled audits. The shipped values for a rule are used for all new data groups. When you specify names for Data group definition and Audit rule on the SETMMXPCY command, you can adjust the values for a specific audit of a single data group.

Table 6. Shipped default values of policies for automatically submitting audits.

                                            Shipped Values
Policy                                      Installation   Data Groups
Data group definition                       *INST          Name
Audit rule                                  –              Varies by rule
Audit schedule
  • State                                   –              *ENABLED (1)
  • Frequency                               –              *WEEKLY (1)
  • Scheduled date                          –
  • Scheduled day                           –              *SUN (2)
  • Scheduled time                          –              Varies by rule, see Table 7.
  • Relative day of month                   –
Priority audit
  • State                                   –              *ENABLED (3)
  • Start after                             –              030000 (3)
  • Start until                             –              080000
  • New objects selected                    –              *DAILY
  • Changed objects selected                –              *DAILY
  • Unchanged objects selected              –              *WEEKLY
  • Audited with no differences             –              *MONTHLY

1. The State element in the Audit schedule policy is available in MIMIX version 7.1.12.00 and higher. For data groups that existed before upgrading to version 7.1.12.00, if the Frequency specified was a value other than *NONE, that value is preserved by the upgrade process and the State is set to *ENABLED. If the Frequency value was *NONE, it is changed to *WEEKLY and the State set to *DISABLED.
2. The shipped default for Scheduled day changed in MIMIX version 7.1. For data groups created after installing version 7.1, the shipped default is *SUN (previously, it was *ALL). For data groups that existed before upgrading to version 7.1, the previous value for Scheduled day remains unchanged.
3. The Priority audit policy is new in MIMIX version 7.1. The State element for the Priority audit policy is available in MIMIX version 7.1.12.00 and higher. For data groups that existed before upgrading from any version 7.0 level to version 7.1.12.00 or higher, State is set to *DISABLED and Start after is set to 030000. For data groups that existed before upgrading from versions 7.1.01.00 through 7.1.11.00 to version 7.1.12.00 or higher, if the Start after value specified was a value other than *NONE, that value is preserved by the upgrade process and the State is set to *ENABLED. However, if the Start after value was *NONE, it is changed to 030000 and State is set to *DISABLED.

When automatically submitted audits run

For each audit rule, its shipped values enable both prioritized audits and scheduled audits to run automatically. A prioritized audit starts one or more times an hour every day during the time range specified in the Priority audit policy. A scheduled audit runs once at its specified time on the days or dates for its frequency as specified in the Audit schedule policy. For scheduled audits, the shipped value for start time of each audit rule is staggered, beginning at 2 a.m. Table 7 shows the default times for priority audits versus scheduled audits.

Table 7. MIMIX rules and their shipped default times for Audit schedule (SCHEDULE) policy.

The shipped priority start range is n/a (1) for the #DGFE audit and 3 a.m. to 8 a.m. for all other audits. The shipped scheduled times for each rule are:

2:00 a.m.  #DGFE        Checks configuration for files using cooperative processing. Uses the Check Data Group File Entries (CHKDGFE) command. Job name: sdn_DGFE
2:05 a.m.  #OBJATR      Compares all attributes for all object types supported for replication. Uses the Compare Object Attributes (CMPOBJA) command. Job name: sdn_OBJATR
2:10 a.m.  #FILATR      Compares all file attributes. Uses the Compare File Attributes (CMPFILA) command. Job name: sdn_FILATR
2:15 a.m.  #IFSATR      Compares IFS attributes. Uses the Compare IFS Attributes (CMPIFSA) command. Job name: sdn_IFSATR
2:20 a.m.  #FILATRMBR   Compares basic file attributes at the member level. Uses the Compare File Attributes (CMPFILA) command. Job name: sdn_MBRATR
2:25 a.m.  #DLOATR      Compares all DLO attributes. Uses the Compare DLO Attributes (CMPDLOA) command. Job name: sdn_DLOATR
2:30 a.m.  #MBRRCDCNT   Compares the number of current records (*CURRDS) and the number of deleted records (*NBRDLTRCDS) for physical files that are defined to an active data group. Uses the Compare Record Counts (CMPRCDCNT) command. Note: Equal record counts suggest but do not guarantee that files are synchronized. This audit does not have a recovery phase. Differences detected by this audit appear as not recovered in the Audit Summary. Job name: sdn_RCDCNT
2:35 a.m.  #FILDTA (2)  Compares file contents. Uses the Compare File Data (CMPFILDTA) command. Job name: sdn_FILDTA

1. The #DGFE audit is not eligible for prioritized auditing because it checks configuration data, not objects.
2. The #FILDTA audit and the Compare File Data (CMPFILDTA) command require TCP/IP communications as their communications protocol.

Changing auditing policies

This topic describes how to change specific policies that affect auditing behavior and when automatic audits will run. MIMIX service providers are specifically trained to provide a robust audit solution that meets your needs.

Changing when automatic audits are allowed to run

Policies control aspects of when both prioritized auditing and scheduled auditing are automatically submitted. To effectively audit your replication environment you may need to fine-tune when one or both types of audits are submitted.

For both types of auditing, consider:

• How much time or system resource can you dedicate to audit processing each day, week, or month?

• How often should all data within the database be audited? Business requirements as well as time and system resources need to be considered.

• Does automatic scheduling conflict with regularly scheduled backups?

• Are there jobs running at the same time as audits that could lock files needing to be accessed during recovery?

For scheduled auditing (which select all objects), also consider:

• Are there a large number of objects to be compared?

• Are there a large number of objects for which a rule is expected to attempt recovery?

• Specific audits may have additional needs. See “Considerations for specific audits” on page 127.

• While you may decide to vary the scheduled times, it is recommended that you maintain the same relative order indicated in “When automatically submitted audits run” on page 38.

Changing scheduling criteria for automatic audits

Both scheduled audits and priority audits have scheduling information. A change to an audit’s scheduling information is effective immediately. If an audit is in progress at the time its scheduling information is changed, the change is effective on the next automatic run of the audit.

Do the following from the management system:

1. Do one of the following to access the Schedule view of the Work with Audits display:

• From the MIMIX Intermediate Main Menu, select option 6 (Work with audits) and press Enter. Then use F10 as needed to access the Schedule view.

• Enter the command: installation-library/WRKAUD VIEW(*SCHEDULE)

2. Type 37 (Change audit schedule) next to the audit you want to change and press Enter.

3. The Set MIMIX Policies (SETMMXPCY) command appears, showing the selected audit rule and data group. The current values for the Audit schedule and Priority audit policies are displayed. Do one of the following:

• To change when MIMIX is scheduled to run the audit to check all configured objects, specify the values you want for elements of the Audit schedule policy.

• To change when MIMIX is allowed to submit priority-based runs of the audit every day, specify values for the Start after and Start until elements of the Priority audit policy.

4. To make the changes effective, press Enter.

Changing the selection frequency of priority auditing categories

When priority auditing is used, you can control how often objects within priorities are eligible for selection. Objects which had differences in their previous audit are always selected. For other priority classes, you can change how often objects within the class are eligible for selection by a prioritized audit. For descriptions of the priority classes with changeable frequencies, see the Priority audit policy description.

If an audit is in progress at the time its category frequency information is changed, the change is effective on the next automatic run of the audit.

Do the following from the management system:

1. Do one of the following to access the Work with Audits display:

• From the MIMIX Intermediate Main Menu, select option 6 (Work with audits) and press Enter. Then use F10 as needed to access the Schedule view.

• Enter the command: installation-library/WRKAUD

2. Type 37 (Change audit schedule) next to the audit you want to change and press Enter.

3. The Set MIMIX Policies (SETMMXPCY) command appears, showing the selected audit rule and data group. Page Down to see the current values of the Priority audit policy.

4. Specify values in the following prompts that indicate how often objects in each category are eligible for selection by a priority audit.

• New objects selected

• Changed objects selected

• Unchanged objects selected

• Audited with no diff.

5. To make the changes effective, press Enter.

Changing the audit level policy when switching

Regardless of the level you use for daily operations, Vision Solutions strongly recommends that you perform audits at audit level 30 before the following events to ensure that 100 percent of the data is valid on the target system:

• Before performing a planned switch to the backup system.

• Before switching back to the production system.

For more information about the risks associated with lower audit levels, see “Considerations for user-defined rules” on page 660.

From a 5250 emulator, do the following from the management system:

1. From the command line type SETMMXPCY and press F4 (Prompt).

2. Verify that the value specified for Data group definition is *INST.

3. Press Enter to see all the policies and their current values.

4. For Audit level, specify *LEVEL30. Then press Enter.

Changing the system where audits are performed

The Run rule on system policy determines the system on which audits run. The shipped default is to run all audits for the installation from the management system.

When changing the value of this policy, also consider your switching needs. See the Run rule on system policy description for additional information.

Note: This procedure changes a policy value at the installation level. The installation-level value can be overridden by a data group-level policy value. Therefore, if a data group has a value other than *INST for this policy, that value remains in effect.

To change the policy for the installation, do the following:

1. On the management system, type the following command and press F4 (Prompt):

installation-library/SETMMXPCY

2. Verify that the value *INST appears for the Data group definition.

3. Locate the Run rule on system policy. Specify the value you want.

4. Press Enter.

Changing retention criteria for audit history

The Audit history retention policy determines whether to retain information about the results of completed audits and the objects that were audited. The policy specifies how many days to keep history information and how many audit runs to keep, as well as whether details about audited library-based objects and audited DLO and IFS objects are to be kept with the history information. Each audit is evaluated individually against the policy values.

The policy is checked when an audit runs to determine whether to keep details about the objects audited by that run. The policy is also checked when system manager cleanup jobs run to determine if any audit has history information which exceeds both specified retention criteria. The policy value in effect at the time each check occurs determines the result.

To change the audit history retention policy, do the following:

1. From the MIMIX Intermediate Main Menu, select option 6 (Work with Audits) and press Enter.

2. Determine whether to change the policy for the installation or at the data group level. From the Work with Audits display, do one of the following:

• To change the policy for all audits in the installation, press F16 (Inst. policies). Then, press Enter when the Set MIMIX Policies (SETMMXPCY) command appears.

• To change the policy for all audits for a specific data group, type 36 (Change DG policies) next to any audit for the data group you want and press Enter.

3. Locate the Audit history retention policy. The current values for the level you chose in Step 2 are displayed. Specify values for the elements you want to change.

Note: When large quantities of objects are eligible for replication, specifying *YES to retain either Object details or DLO and IFS details may use a significant amount of disk storage. Consider the combined effect of the quantity of replicated objects for each data group, the number of days to retain history, the number of audits to retain, and the frequency in which audits are performed.

4. To accept the changes, press Enter.

Restricting auditing based on the state of the data group

You may want to control when audits are allowed to run based on the state of the data group at the time of the audit request. For example, if you end MIMIX so that a batch process can run, you may want to prevent audits from running while data groups are inactive. If a data group process has a backlog during peak activity, you may want to prevent audits from running while the backlog exists. Or, you may want to prevent only automatic recovery from occurring during a backlog or when the data group is inactive. The Action for running audits policy provides the ability to define what audit activity will be permitted based on the state of the data group at the time of audit request. This policy can be set for an installation or for a specific data group.

Note: For installations running service pack 7.1.12.00 and higher, most audits check for threshold conditions in all database and object replication processes, including the RJ link. #FILDTA audits only check for threshold warning conditions in the RJ link and database replication processes. #DLOATR audits only check for threshold warning conditions in object replication processes.

For installations running earlier service packs, only database and object apply processes are checked for thresholds.

Restricting audit activity in an installation based on data group state: Do the following from the management system:

Note: This procedure changes a policy value at the installation level. The installation-level value can be overridden by a data group-level policy value. Therefore, if a data group has a value other than *INST for this policy, that value remains in effect.

1. From the command line type SETMMXPCY and press F4 (Prompt).

2. Verify that the value specified for Data group definition is *INST.

3. Press Enter to see all the policies and their current values.

4. For Action for running audits, do the following:

a. For Inactive data group, specify the value that indicates the audit actions to permit when the data group is inactive.

b. For Repl. process in threshold, specify the value that indicates the audit actions to permit when any replication process checked by an audit has reached its configured threshold.

5. To accept the changes, press Enter.

Restricting audit activity for a specific data group based on its state: Do the following from the management system:

1. From the command line type SETMMXPCY and press F4 (Prompt).

2. For the Data group definition, specify the full three-part name.

3. Press Enter to see all the policies and their current values.

4. For Action for running audits, do the following:

a. For Inactive data group, specify the value that indicates the audit actions to permit when the data group is inactive.

b. For Repl. process in threshold, specify the value that indicates the audit actions to permit when any replication process checked by an audit has reached its configured threshold.

5. To accept the changes, press Enter.
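As a command-line sketch of the same change, you can name the data group on the Set MIMIX Policies command and then set both elements of the Action for running audits policy on the prompt. The DGDFN keyword and the names INVDG, SYSA, and SYSB are assumptions used only for illustration.

   SETMMXPCY DGDFN(INVDG SYSA SYSB)

Press F4, locate Action for running audits, and specify the values you want for the Inactive data group and Repl. process in threshold elements.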

Preventing audits from running

There may be scenarios when you need to disable auditing completely for either an installation or a specific data group. Auditing may not be desirable on a test data group or during system or network maintenance.

The Audit level policy can be used to disable all auditing, including manually invoked audits. The Audit level policy can be set for an installation or for specific data groups. Note that an explicitly set value for a data group will override the installation value and may still allow an audit to run.


You can also prevent audits for a data group from being submitted automatically but still allow them to be invoked manually. Automatic submission can be prevented for a specific audit of a data group by values specified for its priority audit and audit schedule policies.

In addition to auditing, automatic recovery during replication may need to be prevented from running due to issues with applications or MIMIX features. For more information, see “When to disable automatic recovery for replication and auditing” on page 28.

Disabling all auditing for an installation

Note: This procedure changes a policy value at the installation level. The installation-level value can be overridden by a data group-level policy value. Therefore, if a data group has a value other than *INST for this policy, that value remains in effect.

Do the following from the management system:

1. From the command line type SETMMXPCY and press F4 (Prompt).

2. Verify that the value specified for Data group definition is *INST.

3. Press Enter to see all the policies and their current values.

4. Specify *DISABLED for the Audit level policy.

5. To accept the changes, press Enter.
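If your service pack accepts keyword entry for this policy, the change might be made in a single request. Both keywords shown here, DGDFN and AUDLVL, are assumptions; if they do not match your installation, prompt SETMMXPCY with F4 and set the Audit level policy to *DISABLED on the display.

   SETMMXPCY DGDFN(*INST) AUDLVL(*DISABLED)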

Disabling all auditing for a data group

Do the following from the management system:

1. From the command line type SETMMXPCY and press F4 (Prompt).

2. For the Data group definition, specify the full three-part name.

3. Press Enter to see all the policies and their current values.

4. Specify *DISABLED for the Audit level policy.

5. To accept the changes, press Enter.
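A hypothetical keyword form of the same request for one data group follows; DGDFN and AUDLVL are assumed keywords and TESTDG, SYSA, and SYSB are placeholder names. Because a data group-level value overrides the installation value, this disables auditing only for the named data group.

   SETMMXPCY DGDFN(TESTDG SYSA SYSB) AUDLVL(*DISABLED)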

Disabling automatically submitted audits

You can control whether each audit for a data group can be submitted automatically by priority or by schedule. The Priority audit and Audit schedule policies act independently so that you can have both, one, or neither type of automatic auditing.

Disabling a scheduled audit: Do the following from the management system:

1. From the command line type SETMMXPCY and press F4 (Prompt).

2. For the Data group definition, specify the full three-part name.

3. For Audit rule, specify the name of the MIMIX rule.

4. Press Enter to see the current values for the Audit schedule policy.

5. Do one of the following:


a. For installations running version 7.1.12.00 or higher, specify *DISABLED for the State prompt.

b. For installations running earlier software levels, specify *NONE for the Frequency prompt.

6. To accept the changes, press Enter.
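A command-line sketch of this procedure follows. The DGDFN and RULE keywords are assumptions, MYDG, SYSA, and SYSB are placeholder names, and #FILDTA is used only as an example of an audit rule.

   SETMMXPCY DGDFN(MYDG SYSA SYSB) RULE(#FILDTA)

Press F4 and locate the Audit schedule policy; then specify *DISABLED for State (7.1.12.00 or higher) or *NONE for Frequency (earlier levels).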

Disabling a prioritized audit: Do the following from the management system:

1. From the command line type SETMMXPCY and press F4 (Prompt).

2. For the Data group definition, specify the full three-part name.

3. For Audit rule, specify the name of the MIMIX rule.

4. Press Enter to see the current values for the Priority audit policy.

5. Do one of the following:

a. For installations running version 7.1.12.00 or higher, specify *DISABLED for the State prompt.

b. For installations running earlier software levels, specify *NONE for the Start after prompt.

6. To accept the changes, press Enter.


Policies for switching with model switch framework

In environments that do not use application groups, MIMIX Switch Assistant (which implements MIMIX Model Switch Framework) is usually used for switching. MIMIX Model Switch Framework cannot be used to switch application groups.

Table 8 identifies the policies associated with switching using MIMIX Model Switch Framework and the shipped default values of those policies.

For these policies, MIMIX Switch Assistant uses only the policy values specified for the installation. If MIMIX cannot determine whether a MIMIX Model Switch Framework is defined, the switch framework policy is *DISABLED.

If the SETMMXPCY command specifies a data group name, the switch framework is required to be *INST. The switch thresholds are *DISABLED by default but can be changed.

The policies in Table 8 have no effect on application group switching.

Specifying a default switch framework in policies

MIMIX Switch Assistant requires that you have a configured MIMIX Model Switch Framework and that you specify it in the default model switch framework policy for the installation. You may also want to adjust policies for thresholds associated with MIMIX Switch Assistant.

If you do not have a configured MIMIX Model Switch Framework, contact your Certified MIMIX Consultant.

From a 5250 emulator, do the following from the management system:

1. From the command line type SETMMXPCY and press F4 (Prompt).

2. Verify that the value specified for Data group definition is *INST.

3. Press Enter to see all the policies and their current values.

4. At the Default model switch framework prompt, specify the name of the switch framework to use for switching this installation.

5. To accept the changes, press Enter.
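Because DFTMSF is identified later in this book as the keyword for the Default model switch framework policy, the change can also be entered directly. The DGDFN keyword and the framework name MYMSF are assumptions used for illustration.

   SETMMXPCY DGDFN(*INST) DFTMSF(MYMSF)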

Table 8. Shipped values of policies used by MIMIX Switch Assistant.

Policy                            Installation   Data Groups
Data group definition             *INST          Name1
Switch warning threshold          90             *DISABLED
Switch action threshold           180            *DISABLED
Default model switch framework    MXMSFDFT       *INST

1. A data group definition value of *INST indicates the policy is installation-wide. A name indicates the policies are in effect only for the specified data group.


Setting policies for MIMIX Switch Assistant

If the installation-level Default model switch framework policy is disabled, you must change the policy in order to use MIMIX Switch Assistant.

Do the following from the management system:

1. From the command line type SETMMXPCY and press F4 (Prompt).

2. Verify that the value specified for Data group definition is *INST.

3. Press Enter to see all the policies and their current values.

4. Specify values for the following fields:

a. For Switch warning threshold, the value 90 is recommended.

b. For Switch action threshold, the value 180 is recommended.

c. For Default model switch framework, specify the name of your MIMIX Model Switch Framework.

5. To accept the changes, press Enter.
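A command-line sketch using the recommended values follows. DFTMSF is the keyword documented for the Default model switch framework policy; the DGDFN, SWTWRNTHLD, and SWTACTTHLD keywords and the framework name MYMSF are assumptions, so prompt with F4 if they do not match your installation.

   SETMMXPCY DGDFN(*INST) SWTWRNTHLD(90) SWTACTTHLD(180) DFTMSF(MYMSF)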

Setting policies when MIMIX Model Switch Framework is not used

If you do not use MIMIX Model Switch Framework for switching, you should disable the Default model switch framework policy at the installation level.

Do the following from the management system:

1. From the command line type SETMMXPCY and press F4 (Prompt).

2. Verify that the value specified for Data group definition is *INST.

3. Press Enter to see all the policies and their current values.

4. At the Default model switch framework prompt, specify *DISABLED.

5. To accept the change, press Enter.
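Using the DFTMSF keyword documented for this policy, the change could also be entered as shown below; the DGDFN keyword is an assumption.

   SETMMXPCY DGDFN(*INST) DFTMSF(*DISABLED)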


Policy descriptions

There are minor differences in the names of policies between user interfaces for a 5250 emulator and Vision Solutions Portal. The names shown here are those used in the 5250 emulator. For a complete description of all policy values, see online help for the command.

Data group definition - Select the scope of the policies to be set. When the value *INST is specified, the policies being set by the command apply to all systems and data groups in the installation, with the exception of any policy for which a data group-level override exists. When a three-part qualified name of a data group is specified, the policies being set by the command apply to only that data group and override the installation-level policy values.

Audit rule - Select the MIMIX rule for which an audit schedule will be set for the specified data group definition. The Audit schedule policy determines when this rule will audit the data group. The audit rule must specify the value *NONE when changing any policy except the audit schedule.

Automatic object recovery — Determines whether to enable functions that automatically start recovery actions to correct detected common object errors that occur during replication from the system journal.

Automatic database recovery — Determines whether to enable functions that automatically start recovery actions to correct detected common file errors that occur during replication from the user journal.

Automatic audit recovery — Determines whether to enable audits to start automatic recovery actions to correct differences detected during their compare phase.

Object recovery notify on success — Determines whether automatic object recovery actions send an informational (*INFO) notification upon successful completion. This policy is only valid when the Automatic object recovery policy is enabled.

Database recovery notify on success — Determines whether automatic database recovery actions send an informational (*INFO) notification upon successful completion. This policy is only valid when the Automatic database recovery policy is enabled.

Audit notify on success — Determines whether activity initiated by audits, including recovery actions, should automatically send an informational (*INFO) notification upon successful completion. If an audit is run when the Automatic audit recovery policy is disabled, successful notifications are sent only for the compare phase of the audit.

Notification severity — Determines the severity level of the notifications sent when a rule ends in error. This policy determines the severity of the notification that is sent, not the severity of the error itself. The policy is in effect whether the rule is invoked manually or automatically.

This policy is useful for setting up an order of precedence for notifications at the data group level. For example, if you set this policy for data group CRITICAL to be *ERROR when the value for the installation-level policy is *WARNING, any error notifications sent from data group CRITICAL will have a higher severity than those from other data groups.

Object only on target action — Determines how the recovery action for specific audits should handle objects that are configured for replication but exist only on the target system. The following rules check for the only-on-target error: #OBJATR, #IFSATR, #DLOATR, #FILATR, and #FILATRMBR. When the Automatic audit recovery (AUDRCY) policy is enabled, these rules use the value from this policy to attempt recovery for this error.

See “Policies in environments with more than two nodes or bi-directional replication” on page 27 for additional information.

Journaling attribute difference action — Determines the recovery action to take for scenarios in which audits have detected differences between the actual and configured values of journaling attributes for objects journaled to a user journal. This type of difference can occur for the Journal Images attribute and the Journal Omit Open/Close attribute. Differences found on either the source or target object are affected by this policy.

MIMIX configured higher - Determines the recovery action for correcting a difference in which the MIMIX configuration specifies an attribute value that results in a higher number of journal transactions than the object's journaling attribute.

MIMIX configured lower - Determines the recovery action for correcting a difference in which the MIMIX configuration specifies an attribute value that results in a lower number of journal transactions than the object's journaling attribute.

DB apply threshold action — Determines what action to pass to the Compare File Data (CMPFILDTA) command or the Compare Record Count (CMPRCDCNT) command when it is invoked with *DFT specified for its DB apply threshold (DBAPYTHLD) parameter. The command’s parameter determines what to do if the database apply session backlog exceeds the threshold warning value configured for the database apply process. This policy applies whenever these commands are used and the backlog exceeds the threshold.

The shipped default for this policy causes the requested command to end and may cause the loss of repairs in progress or inaccurate counts for members. You can also set this policy to allow the request to continue despite the exceeded threshold.
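For example, a compare request that defers to this policy specifies *DFT for the DB apply threshold parameter named in this description; the DGDFN keyword and the names MYDG, SYSA, and SYSB are assumptions, and any other required parameters can be supplied by prompting with F4.

   CMPFILDTA DGDFN(MYDG SYSA SYSB) DBAPYTHLD(*DFT)

Whether this request ends or continues when the database apply backlog exceeds its threshold is then determined by the DB apply threshold action policy.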

DB apply cache — Determines whether to use database (DB) apply cache to improve performance for database apply processes.1 When this policy is enabled, MIMIX uses buffering technology within database apply processes in data groups that specify *YES for journal on target (JRNTGT). This policy is not used by data groups which specify JRNTGT(*NO) or by data groups whose target journals use journal caching or journal standby functionality provided by the IBM feature for High Availability Journal Performance (IBM i option 42).

1. This policy is not available in MIMIX Availability Manager.

Note: When DB apply cache is used, before and after journal images are sent to the local journal on the target system. This will increase the amount of storage needed for journal receivers on the target system if before images were not previously being sent to the journal.

Access path maintenance — Determines whether MIMIX can optimize access path maintenance during database apply processing as well as the maximum number of jobs allowed per data group when performing delayed maintenance. Enabling optimized access path maintenance improves performance for the database apply process. To make any change to this policy effective, end and restart the database apply processes for the affected data groups.

This policy and the access path maintenance function it controls are available on systems running 7.1.15.00 or higher and replace the parallel AP maintenance (PRLAPMNT) policy and its related function offered in earlier software levels. For more information about either method of optimizing access path maintenance, see the MIMIX Administrator Reference book.

Optimize for DB apply - Specify whether to enable optimized access path maintenance. When enabled, the database apply processes are allowed to temporarily change the value of the access path maintenance attribute for eligible replicated files on the target system. Eligible files include physical files, logical files, and join logical files with keyed access paths that are not unique and that specify *IMMED for their access path maintenance.

Maximum number of jobs - Specify the maximum number of access path maintenance jobs allowed for a data group when optimized access path maintenance is enabled. The actual number of jobs varies as needed between a minimum of one job and the specified value. The default value is 99.

Maximum rule runtime — Determines the maximum number of minutes an audit can run when the Automatic audit recovery policy is enabled. The compare phase of the audit is always allowed to complete regardless of this policy’s value. The elapsed time of the audit is checked when the recovery phase starts and periodically during the recovery phase. When the time elapsed since the rule started exceeds the value specified, any recovery actions in progress will end. This policy has no effect on the #MBRRCDCNT audit because it has no recovery phase. The shipped default for this policy of 1440 minutes (24 hours) prevents running multiple instances of the same audit within the same day. Valid values are 60 minutes through 10080 minutes (1 week).

Audit warning threshold — Determines how many days can elapse after an audit was last performed before an indicator is set. When the number of days that have elapsed exceeds the threshold, the indicator is set to inform you that auditing needs your attention. The shipped default value of 7 days is at the limit of best practices for auditing.

Note: It is recommended that you set this value to match the frequency with which you perform audits. It is possible for an audit to be prevented from running for several days due to environmental conditions or the Action for running audits policy. You may not notice that the audit did not run when expected until the Audit warning threshold is exceeded, potentially several days later. If you run all audits daily, specify 1 for the Audit warning threshold policy. If you do not run audits daily, set the value to what makes sense in your MIMIX environment. For example, if you run the #FILDTA audit once a week and run all other audits daily, the default value of 7 would cause all audits except #FILDTA to have exposure indicated. The value 1 would be appropriate for the daily audits but the #FILDTA audit would be identified as approaching out of compliance much of the time.

Audit action threshold — Determines how many days can elapse after an audit was last performed before an indicator is set. When the number of days that have elapsed exceeds the threshold, the indicator is set to inform you that action is required because the audit is out of compliance. The shipped default of 14 days is the suggested value for this threshold, which is 7 days beyond the limit of best practices for auditing.

Note: It is recommended that you set this value to match the frequency with which you perform audits. It is possible for an audit to be prevented from running for several days due to environmental conditions or the Action for running audits policy. You may not notice that the audit did not run when expected until the Audit action threshold is exceeded, potentially several days later. If you run all audits daily, specify 1 for the Audit action threshold policy. If you do not run audits daily, set the value to what makes sense in your MIMIX environment. For example, if you run the #FILDTA audit once a week and run all other audits daily, the default value of 14 would cause all audits except #FILDTA to have exposure indicated. The value 2 would be appropriate for the daily audits but the #FILDTA audit would be identified as approaching out of compliance much of the time.

Audit level — Determines the level of comparison that an audit will perform when a MIMIX rule which supports multiple levels is invoked against a data group. The policy is in effect regardless of how the rule is invoked. The amount of checking performed increases with the level number. This policy makes it easy to change the level of audit performed without changing the audit scheduling or rules. No auditing is performed if this policy is set to *DISABLED.

The audit level you choose for audits depends on your environment, and especially on the data compared by the #FILDTA, #DLOATR, and #IFSATR audits. When choosing a value, consider how much data there is to compare, how frequently it changes, how long the audit runs, how often you run the audit, and how often you need to be certain that data is synchronized between source and target systems.

Note: Best practice is to use level 30 to perform the most extensive audit. If you use a lower level, consider its effect on how often you need to guarantee data integrity between source and target systems.

Regardless of the level you use for daily operations, Vision Solutions strongly recommends that you perform audits at audit level 30 before the following events to ensure that 100 percent of the data is valid on the target system:

• Before performing a planned switch to the backup system.

• Before switching back to the production system.


For additional information, see “Guidelines and considerations for auditing” on page 126 and “Changing auditing policies” on page 41.

Run rule on system — Determines the system on which to run audits. This policy is used when audits are invoked with *YES specified for the Use run rule on system policy (USERULESYS) parameter on the Run Rule (RUNRULE) or Run Rule Group (RUNRULEGRP) command. While this policy is intended for audits, any rule that meets the same criteria will use this policy.

The policy’s shipped default value, *MGT, runs audits from the management system. In multi-management environments where both systems defined to a data group are management systems, the value *MGT will run audits only on the target system.

You can also set the policy to run audits from the network system, the source or target system, or from a list of system definitions. When both systems of a data group are in the specified list, the target system is used.

When choosing the value for the Run rule on system policy, also consider your switching needs.
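For example, an audit request that honors this policy specifies *YES for the Use run rule on system (USERULESYS) parameter on the Run Rule command. USERULESYS is the parameter named in this description; the RULE and DGDFN keywords and the names shown are assumptions, so prompt RUNRULE with F4 if they differ on your system.

   RUNRULE RULE(#OBJATR) DGDFN(MYDG SYSA SYSB) USERULESYS(*YES)

The audit then runs on the system selected by the Run rule on system policy value in effect.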

Action for running audits — Determines the type of audit actions permitted when certain conditions exist in the data group. If a condition exists at the time of an audit request, audit activity is restricted to the specified action. If multiple conditions exist and the values specified are different, only the most restrictive of the specified actions is allowed. If none of the conditions are present, the audit requests are performed according to other policy values in effect.

Inactive data group - Specify the type of auditing actions allowed when any replication process required by the data group is inactive. For example, a data group of TYPE(*ALL) is considered inactive if any of its database or object replication processes is in a state other than active. This element has no effect on the #FILDTA and #MBRRCDCNT audits because these audits can run only when the data group is active.

Repl. process in threshold - Specify the type of auditing actions allowed when a threshold warning condition exists for any process used in replicating the class of objects checked by an audit1. If a checked process has reached its configured warning value, auditing is restricted to the specified actions. Most audits check for threshold conditions in all database and object replication processes, including the RJ link. #FILDTA audits only check for threshold warning conditions in the RJ link and database replication processes. #DLOATR audits only check for threshold warning conditions in object replication processes.

1. This behavior applies to instances running service pack 7.1.12.00 or higher. Instances running earlier service packs check for thresholds on only the database apply and object apply processes.

Audit history retention — Determines criteria for retaining historical information about audit results and the objects that were audited. History information for an audit includes timestamps indicating when the audit was performed, the list of objects that were audited, and result statistics. Each audit, a unique combination of audit rule and data group, is evaluated separately and its history information is retained until the specified minimum days and minimum runs are both met. When an audit exceeds these criteria, system manager cleanup jobs will remove the historical information for that audit from all systems and will remove the audited object details from the system on which the audit request originated. The values specified at the time the cleanup jobs run are used for evaluation.

Minimum days - Specify the minimum number of days to retain audit history for each completed audit. Valid values range from 0 through 365 days. The shipped default is 7 days.

Minimum runs per audit - Specify the minimum number of completed audits for which history is to be retained. Valid values range from 1 through 365 runs. The shipped default is 1 completed audit.

Object details - Specify whether to retain the list of audited objects and their audit status for each completed audit of library-based objects. The specified value in effect at the time an audit runs determines whether object details for that run are retained. The specified value has no effect on cleanup of details for previously completed audit runs. Cleanup of retained details occurs at the time of audit history cleanup. The shipped default is *YES.

DLO and IFS details - Specify whether to retain the list of audited objects and their audit status for each completed audit of DLO and IFS objects. The specified value in effect at the time an audit runs determines whether object details for that run are retained. The specified value has no effect on cleanup of details for previously completed audit runs. Cleanup of retained details occurs at the time of audit history cleanup. The shipped default is *YES.

Note: When large quantities of objects are eligible for replication, specifying *YES to retain either Object details or DLO and IFS details may use a significant amount of disk storage. Consider the combined effect of the quantity of replicated objects for all data groups, the number of days to retain history, the number of audits to retain, and the frequency in which audits are performed.

Synchronize threshold size — Determines the threshold, in megabytes (MB), to use for preventing the synchronization of large objects during recovery actions. When any of the Automatic system journal recovery, Automatic user journal recovery, or Automatic audit recovery policies are enabled, all initiated recovery actions use this policy value for the corresponding synchronize command's Maximum sending size (MB) parameter. This policy is useful for preventing performance issues when synchronizing large objects.

Number of third delay retry attempts — Determines the number of times to retry a process during the third delay/retry interval. This policy is used when the Automatic system journal recovery policy is enabled. Object replication processes use this policy value when attempting recovery of an in-use condition that persists after the data group’s configured values for the first and second delay/retry intervals are exhausted. The shipped default is 100 attempts.


This policy and its related policy, Third delay retry interval, can be disabled so that object replication does not attempt the third delay/retry interval while still allowing recoveries for other errors.

Third delay retry interval — Determines the delay time (in minutes) before retrying a process in the third delay/retry interval. This policy is used when the Automatic system journal recovery policy is enabled. Object replication processes use this policy value when attempting recovery of an in-use condition that persists after the data group’s configured values for the first and second delay/retry intervals are exhausted. The shipped default is 15 minutes.

Switch warning threshold — Determines how many days can elapse after the last switch was performed before an indicator is set for the installation. When the number of days that have elapsed exceeds this threshold, the indicator is set to inform you that switching may need your attention. The shipped default is 90 days, which is considered at the limit of best practices for switching.

The indicator is associated with the Last switch field. The Last switch field identifies when the last completed switch was performed using the default model switch framework (DFTMSF) policy.

Switch action threshold — Determines how many days can elapse after the last switch was performed before an indicator is set for the installation. When the number of days that have elapsed exceeds this threshold, the indicator is set to inform you that action is required. The shipped default of 180 days is the suggested value for this threshold, which is beyond the limit of best practices for switching.

The indicator is associated with the Last switch field. The Last switch field identifies when the last completed switch was performed using the default model switch framework (DFTMSF) policy.

Default model switch framework — Determines the default MIMIX Model Switch Framework to use for switching. This value is used by configurations which switch via model switch framework. The shipped default value is MXMSFDFT, which is the default model switch framework name for the installation. If the default name is not being used, this value should be changed to the name of the MIMIX Model Switch Framework used to switch the installation.

Independent ASP library ratio — Determines the number for n in a ratio (n:1) of independent ASP libraries (n) on the production system to SYSBAS libraries on the backup system1. For each switchable independent ASP defined to MIMIX by a device resource group, a monitor with the same name as the resource group checks this ratio. When the number of independent ASP libraries falls to a level that is below the specified ratio, the monitor sends a notification to inform you that action may be required. This signals that your recovery time objective could be in jeopardy because of a prolonged independent ASP switch time.

1. The library ratio monitor and the policy it uses require a license key for MIMIX® Global™.

CMPRCDCNT commit threshold — Determines the threshold at which a request to compare record counts (CMPRCDCNT command or #MBRRCDCNT audit) will not perform the comparison due to commit cycle activity on the source system. The value specified is the maximum number of uncommitted record operations that can exist for files waiting to be applied at the time the compare request is invoked. Each database apply session is evaluated against the threshold independently. As a result, it is possible that record counts will be compared for files in one apply session but will not be compared for files in another apply session. For additional information see the MIMIX Administrator Reference book.

Procedure history retention — Specifies criteria for retaining historical information about procedure runs that completed or completed with errors. History information for a procedure includes timestamps indicating when the procedure was run and detailed information about each step within the procedure. Each procedure run, a unique combination of procedure name and application group, is evaluated separately and its history information is retained until the specified minimum days and minimum runs are both met. When a procedure run exceeds these criteria, system manager cleanup jobs will remove the historical information for that procedure run from all systems. The values specified at the time the cleanup jobs run are used for evaluation.

Minimum days - Specifies the minimum number of days to retain procedure run history. The default value is 7.

Minimum runs per procedure - Specifies the minimum number of completed procedure runs for which history is to be retained. This value applies to procedures of all types except *SWTPLAN and *SWTUNPLAN. The default value is 1.

Min. runs per switch procedure - Specifies the minimum number of completed switch procedure runs for which history is to be retained. This value applies to procedures of type *SWTPLAN and *SWTUNPLAN that are used to switch an application group. The default value is 12.

Audit schedule — Determines the scheduling information that MIMIX uses to automatically submit audit requests for the specified data group and rule that will check all objects selected by data group configuration entries. Only configuration entries associated with the specified type of rule are used.

To allow an audit to be automatically submitted, *ENABLED must be specified for State1. Changes to this policy are effective immediately. If an audit is in progress at the time of the change, the change will be reflected in the next scheduled run of the audit.

Scheduled dates are entered and displayed in job date format. When the job date format is Julian, the equivalent month and day are used to determine when to schedule audit requests.

State1 - Specify whether scheduled auditing is enabled or disabled for this data group and audit rule.

1. The State element is available in installations running MIMIX version 7.1.12.00 or higher. In installations running earlier software levels, scheduled auditing requires specifying a value other than *NONE for Frequency and specifying values for Scheduled time and either Scheduled date or Scheduled day. Frequency is qualified by the values specified in the other elements.


Frequency - Specify how often the audit request is submitted. The values specified for other elements further qualify the specified frequency.

Scheduled date - Select a value or specify a date, in job date format, on which the audit request is submitted.

Scheduled day - Select the day or days of the week on which the audit request is submitted. If today is the day of the week that is specified and the scheduled time has not passed, the audit request is submitted today. Otherwise, the job is submitted on the next occurrence of the specified day. For example, if it is 11:00 a.m. on a Friday when you set the audit schedule to specify Friday for Scheduled day and 12:00:00 for Scheduled time, the audit request is submitted today. If you are setting the policy at 4:00 p.m. on a Friday or at 11:00 a.m. on a Monday, the audit request is submitted the following Friday.

Scheduled time - Select a value or specify a time in 24-hour format at which the audit request is submitted on the scheduled date or day. Although the time can be specified to the second, the activity involved in submitting a job and the load on the system may affect the exact time at which the job is submitted.

Time can be specified with or without a time separator.

Without a time separator, specify a string of 4 or 6 digits (hhmm or hhmmss) where hh = hours, mm = minutes, and ss = seconds. Valid values for hh range from 00 to 23. Valid values for mm and ss range from 00 to 59.

With a time separator, specify a string of 5 or 8 digits where the time separator specified for your job is used to separate the hours, minutes, and seconds. If this command is entered from the command line, the string must be enclosed in apostrophes. If a time separator other than the separator specified for your job is used, this command will fail.

Relative day of month - Select a value or specify one or more numbers with which to qualify what day a monthly audit request is submitted, relative to its occurrence in the month. A relative day is only valid when the schedule Frequency is Monthly and Scheduled day is a value other than None.

For example, if Frequency is Monthly, Scheduled day is Tuesday and Thursday, and Relative day of month is 1, the audit request is submitted on the first Tuesday and first Thursday of every month. If both 1 and 4 are specified for relative day, the audit request is submitted on the first Tuesday, first Thursday, fourth Tuesday, and fourth Thursday of the month.

Priority audit — Determines when priority-based audit requests for the specified data group and rule are allowed to automatically start and how often replicated objects are eligible for auditing based on their priority classification. The #DGFE rule does not support priority auditing.

To allow priority-based auditing to be performed, *ENABLED must be specified for State1. Changes to this policy are effective immediately. If an audit is in progress at the time of the change, the change will be reflected in the next priority-based run of the audit.

State1 - Specify whether priority auditing is enabled or disabled for this data group and audit rule.

Start after - Select a value or specify a time after which priority-based audits are allowed to start. This is the beginning of a range of time during which priority-based audits can start each day. The value *ANY allows priority-based audits to run repeatedly throughout the day.

Note: Times specified for the Start after and Start until elements are in 24-hour format and can be specified with or without a time separator. Without a time separator, specify a string of 4 or 6 digits (hhmm or hhmmss) where hh = hours, mm = minutes, and ss = seconds. Valid values for hh range from 00 to 23. Valid values for mm and ss range from 00 to 59. With a time separator, specify a string of 5 or 8 digits where the time separator specified for your job is used to separate the hours, minutes, and seconds. If this command is entered from the command line, the string must be enclosed in apostrophes. If a time separator other than the separator specified for your job is used, this command will fail.

Start until - Specify the end of the time range during which priority-based audits are allowed to start. Priority-based audits can start until this time. This value is ignored when Start after is *ANY.

New objects selected - Select the frequency at which new objects are considered for auditing. A new object is one that has not been audited since it was created.

Changed objects selected - Select the frequency at which changed objects are considered for auditing. A changed object is one that has been modified since the last time it was audited.

Unchanged objects selected - Select the frequency at which unchanged objects are considered for auditing. An unchanged object is one that has not been modified since the last time it was audited.

Audited with no diff. - Select the frequency at which objects with no differences are considered for auditing. An object with no differences is one that has not been modified since the last time it was audited and has been successfully audited on at least three consecutive audit runs.

1. The State element is available in installations running MIMIX 7.1.12.00 or higher. In installations running earlier software levels, priority auditing requires a value other than *NONE for Start after.


CHAPTER 3 Checking status in environments with application groups

Monitoring status of environments that use application groups begins at the level of the application group and may include investigation into additional displays for more detailed information. The following displays are typically used:

• Work with Procedure Status (WRKPROCSTS command)

• Work with Application Groups (WRKAG command)

• Work with Node Entries (WRKNODE command)

• Work with Data Rsc. Grp. Ent. (WRKDTARGE command)

• Work with Data Groups (WRKDG command)

Note: This chapter does not include status for application groups that are configured for an IBM i clustering environment. If you are using clustering or have MIMIX® Global™ configured, see the MIMIX Operations with IBM i Clustering book for status information within a clustering environment.

Checking application group status

The status view of the Work with Application Groups display provides a summary of all status associated with an environment configured with application groups.

1. Do one of the following to access the Work with Application Groups display:

• Select option 1 (Work with application groups) from the MIMIX Basic Main Menu.

• Select option 5 (Work with application groups) from the MIMIX Intermediate Main Menu.

• Enter the command: WRKAG

2. If necessary, use F10 to access the status view.

Figure 3. Status view of Work with Application Groups display


All status columns except the App Status column are summations of multiple processes. Investigation into lower-level displays may be necessary to determine the cause of a problem.

Ideal status conditions exist when the fields and columns have the following values:

• The Monitors field is *ACTIVE.

• The Notifications field is *NONE.

• The Proc. Status column is *COMP.

• For a non-cluster application group, the App Node Status and Repl. Status fields are *ACTIVE. The App Status, Data Rsc Grp Status, and Data Node Status columns will always be blank.

For any other status values, see the following:

• “Resolving problems reported in the Monitors field” on page 61

• “Resolving problems reported in the Notifications field” on page 63

• “Resolving problems reported in Status columns” on page 64

Resolving problems reported in the Monitors field

The Monitors field located in the upper right corner of the Work with Application Groups display summarizes the status of the MIMIX monitors on the local system. Each node or system in the product configuration has MIMIX monitors which run on that system to check for specific potential problems. A status of *ACTIVE indicates that all enabled monitors on the local system are active.

                        Work with Application Groups            System:   SYSA
 Monitors . . . . . :  *ACTIVE            Notifications . . :  *NONE

 Type options, press Enter.
   1=Create  2=Change  4=Delete  5=Display  6=Print  9=Start  10=End
   12=Node entries  13=Data resource groups  15=Switch

       App         App      App Node   Data Rsc    Data Node  Repl.     Proc.
 Opt   Group       Status   Status     Grp Status  Status     Status    Status
 __    __________
 __    SAMPLEAG             *ACTIVE                           *ACTIVE   *COMP

                                                                         Bottom
 Parameters or command
 ===> _________________________________________________________________________
 F3=Exit   F4=Prompt   F5=Refresh   F6=Create   F9=Retrieve   F10=View config
 F12=Cancel   F13=Repeat   F18=Subset   F23=More options   F24=More keys


Table 9 shows possible status values for the Monitors field that require user action. For a complete list of possible values, press F1 (Help).

Do the following:

1. Press F14 (Monitors) to display the list of monitors on the local system on the Work with Monitors display.

2. Check the Status column for status values of FAILED, FAILED/ACT, and INACTIVE.

3. If the monitor is needed on the local system as indicated in Table 10, use option 9 (Start) to start the monitor.

Table 9. Monitor field status values that may require user action

Monitor Status Description

*ATTN Either one or more monitors on the local system failed or there are both active and inactive monitors on the local system.

*INACTIVE All enabled monitors on the local system are inactive.

Table 10. Possible monitors and the nodes on which they should be active

Monitor When and Where Needed

journal-name - remote journal link monitor

Checks the journal message queue for indications of problems with the remote journal link. A monitor exists for both the local and remote system of the RJ link.

Primary node and the current Backup node of application groups which perform logical replication.

MMIASPMON - independent ASP threshold monitor

Checks the QSYSOPR message queue for indications that the independent ASP threshold has been exceeded. This monitor improves the ability to detect overflow conditions that put your high availability solution at risk due to insufficient storage.

On all nodes which control an independent ASP.

MMNFYNEWE - monitor for new object notification entries

Monitors the source system for the newly created libraries, folders, or directories that are not already included or excluded for replication by a data group configuration.

Primary node when the application group is configured for logical replication.

short-data-group-name_PAPM - Parallel access path maintenance group monitor. When this monitor exists, there are always associated monitors of one of the following types:

• short-data-group-namePAPMnnn - Parallel access path maint monitor nnn

• short-data-group-nameJobname - Parallel access path maint monitor job-name

Target node of data group replication processes when the Parallel access path maintenance policy has been enabled.

Note: These monitors and the policy which enables them are only available on systems running software levels earlier than 7.1.15.00. The replacement for this function on systems running 7.1.15.00 or higher does not use monitors. For more information about optimizing access path maintenance, see the MIMIX Administrator Reference book.


Resolving problems reported in the Notifications field

The Notifications field located in the upper right corner of the Work with Application Groups display summarizes the status of notifications that exist for the MIMIX installation. Notifications are sent by MIMIX processes, such as monitors or audits, to inform you of potential problems. A value of *NONE indicates that no new notifications exist.

Table 11 shows possible status values for the Notifications field that require user action.

Do the following:

1. Press F15 (Notifications) to display the list of notifications for the installation on the Work with Notifications display.

2. Use option 5 (Display) to view any notifications with a status of *NEW.

3. Take any further action indicated to resolve the problem.

4. When the problem is resolved, use either option 46 (Acknowledge) or option 4 (Remove) to address the notification itself. Notifications can only be removed from the system on which they originated.



Table 11. Notification field status values that may require user action

Notification Status     Description

*ERROR Action is required. At least one new notification exists with a severity of *ERROR.

*WARNING At least one new notification exists with a severity of *WARNING, which indicates that the operation may be successful but an error exists. There are no new notifications with a severity of *ERROR.

*INFO At least one new notification exists with a severity of *INFO. There are no new notifications with severity of *ERROR or *WARNING.


Resolving problems reported in Status columns

Except for the App Status column, all other columns on the Work with Application Groups display represent summations of status for multiple nodes or multiple data resource groups associated with the application groups. Investigation into lower-level displays may be necessary to determine the cause of the problem.

Troubleshooting Tip: When investigating problems, begin with the Proc. Status column. A problem with procedure status can affect values in other columns. When any procedure status problems are resolved, refresh the display. Then check the other columns, beginning with the left-most column that is reporting a problem. Resolve the most severe problem in that column first, then refresh the display. Investigate problems in the remaining columns from left to right.

To address the most common problems with status for application groups, do the following:

1. Resolve any problems reported in the Proc. Status column using Table 12.

2. Resolve any *ATTN status problems first, using Table 13.

3. Then address less severe problems, using Table 14.

For a complete list of status values for each column, press F1 (Help).

Resolving a procedure status problem

The Proc. Status column represents a summary of the most recent run of all procedures defined for the application group.

Note: The status *COMP indicates that the most recently started run of each procedure for the application group has completed as directed. This includes procedures that completed with errors and cancelled or failed procedures whose status has been acknowledged by user action. For any individual procedure that completed with errors, user action is recommended to investigate the cause of the error and assess its implications.

Table 12. Procedure Status values that require attention

Column Value Description and Action

*ACTIVE One or more of the last started runs of the procedures to run are still active or queued. Wait for the procedure to complete. Do not attempt to correct other status problems reported on the display until the procedure completes. Use option 21 (Procedure status) to view the status of the last started runs of procedures for the application group.

*ATTN One or more of the last started runs of the procedures for the application group have a status that requires attention. Use option 21 (Procedure status) to view the status of the last started runs of the procedures for the application group. The procedures shown on the Work with Procedure Status display which have status values of *ATTN, *CANCELED, *FAILED, *MSGW, or *PENDCNL require user action. Also, it may be necessary to check the status of the steps within the procedure to resolve a step problem before the procedure can continue.

Do not attempt to correct other status problems reported on the Work with Application Groups display until the procedure problems have been resolved. For detailed information, see “Working with status of procedures and steps” on page 77.



Resolving an *ATTN status for an application group

The value *ATTN can appear in each column of the Work with Application Groups display to indicate that user action is required to correct a problem.

Important! Check the status of the Proc. Status column and address any problem indicated by *ATTN or *ACTIVE status before attempting to resolve any problem reported in other columns. Use “Resolving a procedure status problem” on page 64.

If there are no procedure problems, each of the other columns with an *ATTN status must be addressed individually, starting from the left-most column.

Table 13. Resolving *ATTN status for columns (except Proc. Status) on the Work with Application Groups display

*ATTN Status in Column     Description and Actions for *ATTN Status

App Node Status

The App Node Status column is a summary of the status of the nodes associated with the application group. The status includes the MIMIX system manager, journal manager, target journal inspection, and collector services jobs for the nodes in the application group.

*ATTN indicates that the node status and the MIMIX manager status values do not match. Investigate the status of the associated nodes and MIMIX managers using option 12 (Node entries). For additional information see “Status for Work with Node Entries” on page 66.

Replication Status

The Replication Status column is a summary status of data replication activity for the data resource groups associated with an application group.

*ATTN indicates that data replication for at least one data group of the data resource groups has a status that does not match the status of the appropriate data resource group, has a failed state or an error condition, is active with an incorrect source system, has audit errors, or has pending recoveries. To determine the cause, use option 13 (Data resource groups) to identify the data resource group where the problem exists. For more information, see “Status for Work with Data Resource Group Entries” on page 68.


Resolving other common status values for an application group

Table 14 lists other common problems with application group status and identifies how to begin their resolution.

Status for Work with Node Entries

The Work with Node Entries display lists the nodes associated with an application group or a data resource group. The Resource group and Type fields at the top of the display indicate what the nodes are associated with.

Figure 4. Status view of Work with Node Entries display for an application group which does not participate in a cluster

Table 14. Other problem statuses which may appear in multiple columns on the Work with Application Groups display

Column  Value     Description and Action

*ATTN Each column has a unique recovery. See “Resolving an *ATTN status for an application group” on page 65.

*INACTIVE The current status of the resource group or node is inactive. This status is possible in the Repl. Status column.

• If all columns with a status value are *INACTIVE, the application group may have been ended intentionally. Use option 9 (Start) to start the application group.

• If this value appears only in the App Node Status column, the application resource group nodes are all inactive and all MIMIX manager jobs are also inactive. Use option 12 (Node entries) to investigate further. For more information see “Status for Work with Node Entries” on page 66.

• If this value appears only in the Repl. Status column, logical replication is not active. Use option 13 (Data resource groups) to investigate. For more information see “Status for Work with Data Resource Group Entries” on page 68.

*UNKNOWN The current status is unknown. The local node is a network node in a non-cluster application group and does not participate in the recovery domain. Its status cannot be determined.

When this status appears in all columns, do one of the following:

• Enter the command WRKSYS. On the Work with Systems display, check the status of Cluster Services for the local system definition. If necessary, use option 9 (Start) to start cluster services.

• Sign on to a node that is active and use the WRKAG command to check the application group status. If the status is still *UNKNOWN, use option 12 (Node entries) to check the status of Cluster Services on the node.



For each node listed, check the Manager Status column for status values that require attention. For a complete list of status values for each field and column, press F1 (Help).

Manager Status - This column indicates the status of all of the MIMIX system manager, journal manager, target journal inspection, and collector services jobs for the specified node.

                           Work with Node Entries                System:   SYSA
 Application group . . . . . :   SAMPLEAG

 Type options, press Enter.
   1=Add  2=Change  4=Remove  5=Display  6=Print  9=Start  10=End

                  -------------Current-------------   Manager
 Opt  Node        Role      Sequence  Data Provider   Status
 __   ________
 __   SYSB        *PRIMARY            *PRIMARY        *ACTIVE
 __   SYSA        *BACKUP   1         *PRIMARY        *ACTIVE

                                                                         Bottom
 Parameters or command
 ===> _________________________________________________________________________
 F3=Exit   F4=Prompt   F5=Refresh   F6=Add   F7=Systems   F9=Retrieve
 F10=View config   F11=Sort by node   F12=Cancel   F18=Subset   F24=More keys

Table 15. Manager Status values that require user action.

Status Value     Description and Action

*ATTN At least one of the system manager, journal manager, target journal inspection, or collector services jobs for the node has failed.

When the nodes listed do not all have the same value, use F7 (Systems) to access the Work with Systems display.

• Check the status of the system and journal managers, target journal inspection, and collector services.

• Use option 9 (Start) to start the managers and services that are not active on the node.

*INACTIVE All system manager, journal manager, target journal inspection, and collector services jobs for the specified node are inactive. This may be intentional when MIMIX is ended to perform certain activities.

Use F7 (Systems) to access the Work with Systems display.


Status for Work with Data Resource Group Entries

The Work with Data Resource Group Entries display lists the data resource groups associated with an application group. Each entry identifies a data resource group and the summary of the replication status from its associated data groups.

Figure 5. The Work with Data Resource Group Entries display for an application group that does not participate in a cluster

Resource Group Status - This column identifies the status of the data resource group. In environments that do not include IBM i clustering, this column is always blank.

Node Status - This column identifies the status of the nodes for the data resource group. In environments that do not include IBM i clustering, this column is always blank.

Replication Status - The value in this column is a summary status of data replication activity for the data resource group. The status includes the status of all data group processes, replication direction, replicated object and file entries, audits, and recoveries.

Work with Data Rsc. Grp. Ent. System: SYSA Application group . . . . . : SAMPLEAG Type options, press Enter. 1=Add 2=Change 4=Remove 5=Display 6=Print 8=Data groups 9=Start 10=End 12=Node entries 14=Build environment 15=Switch Resource Resource Group Node Replication Opt Group Type Status Status Status __ __________ __ AGRSGRP *DTA *ACTIVE Bottom Parameters or command ===> _________________________________________________________________________ F3=Exit F4=Prompt F5=Refresh F6=Add F9=Retrieve F12=Cancel F13=Repeat F18=Subset F19=Load F21=Print list


Table 16 identifies status values for replication that require user action.

Table 16. Replication Status values that require user action.

Status Value Description and Action

*ATTN One or more of the following problems exist for data groups within the data resource group.

• The source system of an active data group is not the primary node of its application group.

• A data group has a failed state, an error condition, audit errors, or pending recoveries.

To prevent damage to data in your environment, it is important that you begin by determining which system should be the source for the data groups. Do the following:

1. From this display, use option 8 (Data groups) to check which system is the current source for the data group.

2. Determine which node has the role of current primary for the application group. From the Work with Application Groups display, use option 12 (Node entries), then check the current node role.

If the current primary node is correct and a data group with an incorrect source system is active, end the data group and contact CustomerCare.

If the data groups in question have the correct source system but the primary node for the application group is not correct, you need to change the recovery domain for the application group to make the correct node become primary. Use “Changing the sequence of backup nodes” on page 71.

Once you have ensured that the data groups have the correct source system, resolve any error conditions reported on the Work with Data Groups display.

Note: Not all data groups should necessarily be active. Only the data groups currently being used for data replication should be active. You will need to look at the current node roles and data providers for the node entries to determine which data groups should be active.

*INACTIVE All replication in the data resource group is inactive. This may be normal if replication was ended to perform certain activities. Use option 8 (Data groups) to access the Work with Data Groups display.


Verifying the sequence of the recovery domain

Ensuring that the sequence of the current backup nodes is set properly is critical to a successful and predictable switch process. The current sequence of backup nodes should match your recovery guidelines.

Do the following to confirm the sequence of the current backup nodes before performing a switch and before removing a backup node from, or restoring one to, the cluster.

1. From the MIMIX Intermediate Main Menu, type 5 (Work with application groups) and press Enter.

2. From the Work with Application Groups display, type 12 (Node entries) next to the application group you want and press Enter.

3. The Work with Node Entries display appears, showing current information for the nodes. Confirm that the current backup nodes have the sequence order that you expect.

Note: It is important that you are viewing current information on the status view of the display. Figure 6 shows an example of how the resulting Work with Node Entries display appears with current status information. If you see configured information instead, press F10 (View status).

4. If you need to change the sequence of current backup nodes, use “Changing the sequence of backup nodes” on page 71.

Figure 6. Example of displaying the current sequence information for backup nodes

Work with Node Entries System: NODED Application group . . . . . : APP1 Type options, press Enter. 1=Add 2=Change 4=Remove 5=Display 6=Print 9=Start 10=End -------------Current------------- Manager Opt Node Role Sequence Data Provider Status __ ________ __ NODEA *PRIMARY *NONE *ACTIVE __ NODEB *BACKUP 1 NODEA *ACTIVE __ NODEC *BACKUP 2 NODEA *ACTIVE __ NODED *BACKUP 3 NODEA *ACTIVE Bottom Parameters or command ===> _________________________________________________________________________ F3=Exit F4=Prompt F5=Refresh F6=Add F7=Systems F9=Retrieve F10=View config F11=Sort by node F12=Cancel F18=Subset F24=More keys


Changing the sequence of backup nodes

Use this procedure if you need to change the sequence of the current backup nodes. This procedure may change the configured sequence for multiple nodes so that you can achieve the desired sequence for backup nodes. The changes are not effective until Step 5 is performed.

Do the following from an active application group:

1. From the Work with Application Groups display, type 12 (Node entries) next to the application group you want and press Enter.

2. The Work with Node Entries display appears. Using F10 to toggle between the configuration and status views, confirm that the node with the configured role of *PRIMARY is the same node that is shown with the current role of *PRIMARY.

• If the same node is identified as *PRIMARY for the current role and the configured role, skip to Step 4.

• If the configured *PRIMARY node does not match the current *PRIMARY node, perform Step 3 to correct this situation before making any changes to the configured sequence of backup nodes.

Figure 7 is an example of how configuration information appears on the Work with Node Entries display.

3. Perform this step only if you need to correct the configured primary node to match the current primary node. This step will demote the configured primary node to a backup, then promote the correct node to become the configured primary node. Do the following:

a. From the configuration view of the Work with Node Entries display, type 2 (Change) next to the configured primary node and press Enter.

b. On the Change Node Entry (CHGNODE) display, specify *BACKUP for Role and press Enter. Then specify *FIRST for List position and press Enter.

c. On the Work with Node Entries display, press F5 (Refresh) to view changes. All nodes in the configured view should have *BACKUP roles.

d. If necessary toggle to the status view to confirm which node is the current primary node. Type 2 (Change) next to the current primary node and press Enter.

e. On the Change Node Entry (CHGNODE) display, specify *PRIMARY for Role and press Enter. Then press Enter two more times.

f. On the Work with Node Entries display, press F5 (Refresh) to view changes. You should see the correct node as the configured primary node.

Note: The numbering for the backup sequence may not update; however, the relative order for the configured backup sequence remains unchanged. Gaps in configured sequence numbers are ignored when switching to a backup. As long as the relative order is correct, it is not necessary to change the configured sequence of backup nodes just to remove gaps in numbering.


g. If the configured backup sequence is what you expect, skip to Step 5 to make the change effective.

4. To change the sequence of backup nodes, do the following:

a. From the configured view of the Work with Node Entries display, type 2 (Change) next to the backup node whose sequence you want to change.

b. On the Change Node Entry (CHGNODE) display, specify *BACKUP for Role and press Enter. Then specify either *FIRST or a number for List position and press Enter.

Note: If you specify a number, it cannot already be used in the configured sequence list.

c. On the Work with Node Entries display, press F5 (Refresh) to view changes.

d. Repeat Step 4 until the correct sequence is shown on the configuration view.

Note: Gaps in configured sequence numbers are ignored when switching to a backup. For example, in a configuration with two backup nodes, there is no operational difference between a backup sequence of 1, 2 and a backup sequence of 2, 5 as long as the same nodes are specified in the same relative order.

5. To make the changes to the backup order effective, do the following (a command-line sketch follows this procedure):

a. Press F12 (Cancel) to return to the Work with Application Groups display.

b. Type 9 (Start) next to the application group you want and press F4 (Prompt).

c. On the Start Application Group (STRAG) display, specify *CONFIG for Current node roles and press Enter.

d. The Procedure prompt appears. If needed, specify a different value and then press Enter.

6. Confirm that the node entries have changed. Type 12 (Node entries) next to the application group and press Enter. If necessary, use F10 to access the status view. The current backup nodes should be in the new order.
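The display options used in this procedure correspond to commands that can also be entered from a command line. The following is a minimal sketch only, assuming an application group named SAMPLEAG with NODEA as the configured primary and NODEB as the current primary; the parameter keywords for the application group, node, and list position (AGDFN, NODE, POSITION) are assumptions shown for illustration, so prompt each command with F4 to verify the actual parameters in your installation.

/* Demote the configured primary node to a backup at the top of the  */
/* list, then promote the correct node (Step 3). Keywords other than */
/* ROLE are illustrative assumptions.                                 */
CHGNODE AGDFN(SAMPLEAG) NODE(NODEA) ROLE(*BACKUP) POSITION(*FIRST)
CHGNODE AGDFN(SAMPLEAG) NODE(NODEB) ROLE(*PRIMARY)

/* Make the configured roles and backup sequence effective (Step 5). */
STRAG AGDFN(SAMPLEAG) ROLE(*CONFIG)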


Figure 7. Example of displaying the configured sequence information for backup nodes

Work with Node Entries System: NODED Application group . . . . . : APP1 Type options, press Enter. 1=Add 2=Change 4=Remove 5=Display 6=Print 9=Start 10=End -----------Configured------------ Opt Node Role Sequence Data Provider __ ________ __ NODEA *PRIMARY *PRIMARY __ NODEB *BACKUP 1 *PRIMARY __ NODEC *BACKUP 4 *PRIMARY __ NODED *BACKUP 5 *PRIMARY Bottom Parameters or command ===> _________________________________________________________________________ F3=Exit F4=Prompt F5=Refresh F6=Add F7=Systems F9=Retrieve F10=View status F11=Sort by node F12=Cancel F18=Subset F24=More keys

Examples of changing the backup sequence

The following examples illustrate problems with the current backup sequence and how to correct them.

Example 1 - Changing the backup sequence when primary node is ok

Table 17 shows a four-node environment where the current backup sequence does not reflect the desired behavior in the event of a switch. Also, the relative order of the configured backup sequence does not match the relative order of either the current sequence or the desired sequence.

Each row in Table 18 shows a change to be made to the nodes on the configured view of the Work with Node Entries display. The rows are in the order that the changes need to occur to correct this example configuration to the desired order.

Table 17. Example 1, showing discrepancies in backup sequences

Desired Order Initial Values, Example 1

Work with Node Entries

Status View -----------Current---------------Opt Node Role Sequence Data Provider__ ________ __ NODEA *PRIMARY *PRIMARY __ NODEB *BACKUP 1 *PRIMARY __ NODED *BACKUP 2 *PRIMARY __ NODEC *BACKUP 3 *PRIMARY

Configured View -----------Configured------------Opt Node Role Sequence Data Provider__ ________ __ NODEA *PRIMARY *PRIMARY __ NODEC *BACKUP 1 *PRIMARY __ NODEB *BACKUP 2 *PRIMARY __ NODED *BACKUP 3 *PRIMARY

Table 18. Order in which to change nodes to achieve the desired configuration for example 1

Node to Change    Change To    Effect on Configured Order, Example 1    Notes

NODEB Role = *BACKUP Position = *FIRST

Configured View -----------Configured------------Opt Node Role Sequence Data Provider__ ________ __ NODEA *PRIMARY *PRIMARY __ NODEB *BACKUP 1 *PRIMARY __ NODEC *BACKUP 2 *PRIMARY __ NODED *BACKUP 3 *PRIMARY

Intermediate step


NODED Role = *BACKUP Position = *FIRST

Configured View -----------Configured------------Opt Node Role Sequence Data Provider__ ________ __ NODEA *PRIMARY *PRIMARY __ NODED *BACKUP 1 *PRIMARY __ NODEB *BACKUP 2 *PRIMARY __ NODEC *BACKUP 3 *PRIMARY

Desired configuration but it is not effective until STRAG ROLE (*CONFIG) is performed.

Example 2 - Correcting the configured primary node and changing the backup sequence

Table 19 shows a four-node environment where the current backup sequence does not reflect the desired behavior in the event of a switch. Also, the current and configured primary node do not match. The configured primary node must be corrected first, before attempting to correct any backup node sequence problems.

Table 19. Example 2, showing discrepancies in primary node and backup sequences

Desired Order Initial Values, Example 2

Work with Node Entries

Status View -----------Current---------------Opt Node Role Sequence Data Provider__ ________ __ NODEA *PRIMARY *PRIMARY __ NODEB *BACKUP 1 *PRIMARY __ NODED *BACKUP 2 *PRIMARY __ NODEC *BACKUP 3 *PRIMARY

Configured View -----------Configured------------Opt Node Role Sequence Data Provider__ ________ __ NODEB *PRIMARY *PRIMARY __ NODEC *BACKUP 1 *PRIMARY __ NODEA *BACKUP 2 *PRIMARY __ NODED *BACKUP 3 *PRIMARY


Each row in Table 20 shows a change to be made to the nodes on the configured view of the Work with Node Entries display. The rows are in the order that the changes need to occur to correct this example configuration to the desired order.

Table 20. Order in which to change nodes to achieve the desired configuration for example 2.

Node to Change    Change To    Effect on Configured Order, Example 2    Notes

NODEB Role = *BACKUP Position = *FIRST

Configured View -----------Configured------------Opt Node Role Sequence Data Provider__ ________ __ NODEB *BACKUP 1 *PRIMARY __ NODEC *BACKUP 2 *PRIMARY __ NODEA *BACKUP 3 *PRIMARY __ NODED *BACKUP 4 *PRIMARY

Intermediate step

NODEA Role = *PRIMARY

Configured View -----------Configured------------Opt Node Role Sequence Data Provider__ ________ __ NODEA *PRIMARY *PRIMARY __ NODEB *BACKUP 1 *PRIMARY __ NODEC *BACKUP 2 *PRIMARY __ NODED *BACKUP 3 *PRIMARY

Intermediate step, corrects configured *PRIMARY.

The sequence number for Backup 3 may appear as 4. The relative order is equivalent.

NODED Role = *BACKUP Position = *FIRST

Configured View -----------Configured------------Opt Node Role Sequence Data Provider__ ________ __ NODEA *PRIMARY *PRIMARY __ NODED *BACKUP 1 *PRIMARY __ NODEB *BACKUP 2 *PRIMARY __ NODEC *BACKUP 3 *PRIMARY

Desired configuration but it is not effective until STRAG ROLE (*CONFIG) is performed.


CHAPTER 4 Working with status of procedures and steps

This chapter describes how to work with procedures and steps. Procedures are used to perform operations for application groups. All procedures are associated with an application group. This chapter does not apply to configurations that do not use application groups.

When working with status of procedures and steps, it is important to understand how multiple jobs are used to process the steps in a procedure. A procedure uses multiple asynchronous jobs to run the programs identified within its steps. Starting a procedure starts one job for the application group and an additional job for each of its data resource groups. These jobs operate independently and persist until the procedure ends. Each persistent job evaluates each step in sequence for work to be performed within its domain. When a job for a data resource group encounters a step that acts on data groups, it spawns an additional job for each subordinate data group. Each spawned data group job performs the work for that step and then ends.

This chapter contains the following topics:

• “Displaying status of procedures” on page 78 describes how to display the status of procedure runs, including the most recent run as well as runs kept for their status history.

• “Resolving problems with procedure status” on page 80 describes the conditions which cause each procedure status value and the actions required to resolve problem statuses. This includes how to resolve procedure inquiry messages and failed or canceled procedures.

• “Displaying status of steps within a procedure run” on page 83 describes how to display status of steps within a procedure as well as the differences between the collapsed and expanded views of the Work with Step Status display.

• “Resolving problems with step status” on page 85 describes the conditions which cause each step status value and the actions required to resolve problem statuses. This includes how to resolve step inquiry messages and failed or canceled steps.

• “Acknowledging a procedure” on page 89 describes how to manually change a procedure with a status of *CANCELED, *FAILED, or *COMPERR to an acknowledged status.

• “Running a procedure” on page 90 describes how to start a user procedure and the parameter that controls the step at which the procedure begins.

• “Canceling a procedure” on page 92 describes how to cancel an active procedure.


Displaying status of procedures

You can view the status of runs of procedures from the Work with Procedure Status display. The term “the last run” of a procedure refers to the most recently started run of a procedure, which may be in progress or may have completed. Also, the status of other previously performed runs of procedures may be available, subject to the current settings of the Procedure history retention policy.

The Work with Procedure Status display lists procedures in reverse chronological order so that the most recently started procedures are at the top of the list. Procedures that have never been requested to run do not appear on this display.

Figure 8 shows an example of the Work with Procedure Status display subsetted to show only runs of a specific procedure and application group.

F11 toggles between views that show the Start time column and columns for the Duration of the procedure and the Node on which the procedure was started.

Timestamps are in the local job time. If you have not already ensured that the systems in your installation use coordinated universal time, see “Setting the system time zone and time” on page 313.

Figure 8. A subsetted view of the Work with Procedure Status display.

Work with Procedure Status System: SYSTEMA Type options, press Enter. 5=Display 6=Print 8=Step status 9=Run 11=Display message 12=Cancel 13=Change status 14=Resume Opt Procedure App Group Type Status ---Start Time---- __ SWTPLAN SAMPLEAG *SWTPLAN *COMPLETED 03/01/10 11:25:05 __ SWTPLAN SAMPLEAG *SWTPLAN *COMPLETED 03/01/10 11:04:58 Bottom Parameters or command ===> _________________________________________________________________________ F3=Exit F4=Prompt F5=Refresh F9=Retrieve F11=Duration F12=Cancel F13=Repeat F18=Subset F21=Print list

Displaying status of the last run of all procedures

To display the status of the last run of all procedures for an application group, do the following:

1. From the MIMIX Basic Main Menu, select option 1 (Work with application groups).

2. The Work with Application Groups display appears. Type 21 (Procedure status) next to the application group you want and press Enter.

The last run of each procedure for the application group is listed on the Work with Procedure Status display.

3. Locate the procedure you want and check the value of the Status column.

Displaying available status history of procedure runs

To display status of all available runs of a selected procedure, do the following:

1. From the MIMIX Basic Main Menu, select option 1 (Work with application groups).

2. The Work with Application Groups display appears. Type 20 (Procedures) next to the application group you want and press Enter.

3. The Work with Procedures display appears, listing all procedures for the selected application group. Type 14 (Procedure status) next to the procedure you want and press Enter.

All available runs for the selected procedure are listed on the Work with Procedure Status display. The most recently started procedure runs are at the top of the list, and may still be active.

4. Locate the run of the procedure you want and check the value of the Status column.

Note: To view the status of all runs of all procedures for all application groups, you can press F20 (Procedure status) from the Work with Application Groups display, press F14 (Procedure status) from the Work with Procedures display, or enter the command WRKPROCSTS.


Resolving problems with procedure status

Table 21 identifies the possible status values that can appear on the Work with Procedure Status display and the action to take to resolve reported problems.

Table 21. Procedure status values with action required

Category Status Value Description and Action Required

Active *ACTIVE The procedure is currently running. No steps require attention.

*ATTN The procedure requires attention. Either there is a step with a status of *MSGW, or there is an active step and one or more steps with step status values of *ATTN, *CANCEL, *FAILED, or *IGNERR.

Action Required: Determine the status of each step and the action required to correct that status. See “Resolving problems with step status” on page 85.

*MSGW A step within the procedure is waiting for a response to an inquiry message. The procedure cannot process the step or any subsequent steps without a reply to the message.

Action Required: Display and respond to the inquiry message using “Responding to a procedure in *MSGW status” on page 81.

*PENDCNL A request to cancel the procedure is in progress. When the activity for the steps in progress at the time of the cancel request ends, the procedure status changes to *CANCELED.

*QUEUED A request to run the procedure is currently waiting on the job queue. When the procedure becomes an active job, the procedure status changes to *ACTIVE.

Resumable *CANCELED Either the procedure was canceled and did not complete, or steps within the procedure were canceled as a response to inquiry messages from the steps. The procedure was partially performed.

Action Required: Use “Resolving a *FAILED or *CANCELED procedure status” on page 82 to determine the state of your environment and whether to resume the procedure or to acknowledge its status.

*FAILED The procedure failed. Jobs for one or more steps had errors. Those steps were configured to end if they failed. The procedure was partially performed.

Action Required: Use “Resolving a *FAILED or *CANCELED procedure status” on page 82 to determine the state of your environment and whether to resume the procedure or to acknowledge its status.


Acknowledged *ACKCANCEL The procedure was canceled and a user action acknowledged the cancellation so that the procedure can no longer be resumed.

*ACKFAILED The procedure failed and a user action acknowledged the failure so that the procedure can no longer be resumed.

*ACKERR The procedure completed with errors and a user action acknowledged the procedure. It is assumed that the user reviewed the steps with errors. A status of completed with errors is only possible when the steps with errors had been configured (within the procedure) to ignore errors or a user’s response to a step in message wait status was to ignore the error and continue running the procedure. After the procedure is acknowledged, the procedure status changes to *ACKERR.

Completed *COMPERR The procedure completed with errors. One or more steps had errors and were configured to continue processing after an error.

Action Recommended: Investigate the cause of the error and assess its implications.

*COMPLETED The procedure completed successfully.

Responding to a procedure in *MSGW status

A procedure in *MSGW status is effectively paused at a known point in its processing as a result of a runtime attribute on one of its steps. The procedure sent an inquiry message because a step specified *MSGW for its Action before step (BEFOREACT) attribute. All jobs for the procedure have completed processing all previous steps and are waiting to run the step’s program. An operator response is required.

To respond to a procedure in *MSGW status, do the following from the Work with Procedure Status display:

1. To see which step is waiting, type 8 (Step status) next to the procedure and press Enter.

2. The Work with Step Status display appears. The information on this display can be used to determine which step is waiting to start. You will see steps with values of *COMP, *IGNERR, or *DSBLD followed by no status for all remaining steps. The first step with no status is the step that is waiting to start. Based on that step, determine how to respond to the message and whether you are ready to respond.

3. You cannot display or respond to the procedure message from the Work with Step Status display. Press F12 to return to the Work with Procedure Status display.

4. Type 11 (Display message) next to the procedure in *MSGW status and press Enter.

5. You will see the message “Procedure name for application group name requires response. (G C).” Do one of the following:


• A response of G (Go) is required to start processing the step. Type G and press Enter.

• A response of C (Cancel) will cancel the procedure. Type C and press Enter.

Resolving a *FAILED or *CANCELED procedure status

When a procedure fails or is canceled, subsequent attempts to run the same procedure will fail until user action is taken. You need to determine the best course of action for your environment based on the implications of the partially performed procedure. This topic will assist you in evaluating the cause of the failure or cancellation, as well as the state of other steps within the procedure.

Important! Steps with failed or canceled jobs need to be resolved. Other asynchronous jobs may have successfully processed the same step and continued on to process other subsequent steps before the procedure ended. The actions taken by those steps as well as by completed steps which preceded the problem are not reversed. Some steps may not have been processed at all.

Do the following from the Work with Procedure Status display:

1. Type 8 (Step status) next to the *FAILED or *CANCELED run of the procedure and press Enter.

2. The Work with Step Status display appears. Look for steps with a status of *CANCEL, *FAILED, or *ATTN. Also use F7 (Expand) to see status for the jobs which processed the steps.

A procedure with *FAILED status did not complete due to errors. In the collapsed status view, one or more steps will have a status *ATTN or *FAILED. Other jobs may have processed subsequent steps before the procedure ended. In the expanded view, look for one or more jobs with a status of *FAILED. For detailed information use “Resolving problems with step status” on page 85.

A procedure with *CANCELED status did not complete due to user action. Any of the following may have occurred:

• A user canceled an inquiry message sent by the procedure because a step was configured to wait for a reply before starting. This scenario is identified by the absence of steps with status values of *FAILED, *CANCEL, or *ATTN. Instead, you will see steps with values of *COMP, *IGNERR, or *DSBLD followed by no status for all remaining steps. The first step with no status is the step that waited to start. Continue with Step 3.

• A user canceled an inquiry message sent by a step which had a job that ended in error. At least one step in the collapsed view will have a status of *ATTN or *CANCEL. One or more steps will have a job with a status of *CANCEL in the expanded view. Other jobs may have processed subsequent steps before the procedure ended. For detailed information use “Resolving problems with step status” on page 85.

• A user canceled the procedure by using option 12 (Cancel) from the Work with Procedure Status display or by using the Cancel Procedure (CNLPROC) command. Steps in the collapsed view could have any status except *ACTIVE or *MSGW. Determine if there are any jobs with status values of *FAILED or *CANCEL in the expanded view. Other jobs may have processed subsequent steps before the procedure ended. For detailed information use “Resolving problems with step status” on page 85.

3. After you have completed your evaluation and have taken any needed corrective action to resolve why jobs failed or were canceled, determine how to best complete the procedure. Choices are:

• Resume the procedure. If you resume a failed procedure, processing will begin with the step that failed. If you resume a canceled procedure, processing will begin with steps following the cancelled step. Optionally, if you were unable to resolve a problem for a step in error, you can override the attributes of that step for when the procedure is resumed. See “Resuming a procedure” on page 91.

• Acknowledge the procedure status. Procedures with a status of *CANCELED or *FAILED can be acknowledged (set to *ACKCANCEL or *ACKFAILED, respectively) to indicate you have investigated the problem steps and want to run the procedure again starting at its first step. This option should only be used after you have evaluated the effect of activity performed by the procedure. See “Acknowledging a procedure” on page 89.

Displaying status of steps within a procedure run

The Work with Step Status display provides access to detailed information about status of steps for a specific run of a procedure for an application group.

Timestamps are in the local job time. If you have not already ensured that the systems in your installation use coordinated universal time, see the MIMIX Administrator Reference book for the topic on setting the system time.

To display step status for a procedure run, do the following:

1. Use one of the following to access the run of the procedure you want:

• “Displaying status of the last run of all procedures” on page 78

• “Displaying available status history of procedure runs” on page 79

2. From the Work with Procedure Status display, type 8 (Step status) next to the run of the procedure you want and press Enter.

3. Press F7 (Expand) to view status of the individual jobs used to process each step.

The steps listed on the Work with Step Status display appear in sequence number order as defined by steps in the procedure. If the procedure is in progress, the display shows status for the steps that have run, the start time and status of the step that is in progress, and blank status and start time for steps that have not yet run.


Collapsed view - Figure 9 shows the initial collapsed view of the Work with Step Status display. In this view, each step of the procedure is shown as a single row and step status represents the summary of all jobs used by the step.

Figure 9. Collapsed view of the Work with Step Status display.

Work with Step Status System: SYSTEMA Procedure: SWTPLAN App. group: SAMPLEAG Type: *SWTPLAN Procedure status: *COMPLETED Start time: 03/01/10 11:04:58 Type options, press Enter. 5=Display 6=Print 8=Work with job 11=Display message Step Node Start Jobs Opt Program Type Type Time Duration Status Pend __ MXCHKCOM *AGDFN *LOCAL 11:05:00 00:00:01 *COMP *NO __ MXCHKCFG *DGDFN *NEWPRIM 11:05:00 00:00:01 *COMP *NO __ ENDUSRAPP *AGDFN *PRIMARY 11:05:00 00:00:03 *COMP *NO __ MXENDDG *DGDFN *NEWPRIM 11:05:01 00:00:05 *COMP *NO __ MXENDRJLNK *DGDFN *NEWPRIM 11:05:16 00:00:01 *COMP *NO __ MXAUDACT *DGDFN *NEWPRIM 11:05:18 00:00:01 *COMP *NO __ MXAUDCMPLY *DGDFN *NEWPRIM 11:05:19 00:00:01 *COMP *NO __ MXAUDDIFF *DGDFN *NEWPRIM 11:06:10 00:00:54 *COMP *NO More... Parameters or command ===> _________________________________________________________________________ F3=Exit F4=Prompt F5=Refresh F7=Expand F9=Retrieve F12=Cancel F13=Repeat F15=Cancel proc. F18=Subset F21=Print list

Expanded view - Figure 10 shows an example of an expanded view. In the expanded view, step programs of type *AGDFN will have one row for each node on which the step runs. Steps which run step programs at the level of the data resource group or data group are expanded to have multiple rows so that the status of the step for each data resource group or data group is visible. For step programs of type *DTARSCGRP, there will be a summary row for the application group followed by a row for each data resource group within the application group. For step programs of type *DGDFN, there will be a summary row for the application group, then for each data resource group, there is a summary row for the data resource group followed by a row for each of its data groups. Summary rows are identified by a dash (-) in the columns that are being summarized.


Also, for step programs of type *AGDFN, the Data Rsc. Grp. column and the Data Group column will always be blank. For step programs of type *DTARSCGRP, the Data Group column will always be blank.

Figure 10. Expanded view of the Work with Step Status display.

Work with Step Status System: SYSTEMA Procedure: SWTPLAN App. group: SAMPLEAG Type: *SWTPLAN Procedure status: *COMPLETED Start time: 03/01/10 11:04:58 Type options, press Enter. 5=Display 6=Print 8=Work with job 11=Display message Step Data Data Start Opt Program Rsc. Grp. Group Node Time Duration Status __ MXCHKCOM LTIAS01 11:05:00 00:00:01 *COMP __ MXCHKCFG - - LTIAS02 11:05:00 00:00:01 *COMP __ MXCHKCFG DRG1 - LTIAS02 11:05:00 00:00:01 *COMP __ MXCHKCFG DRG1 DG1A LTIAS02 11:05:00 00:00:01 *COMP __ MXCHKCFG DRG1 DG1B LTIAS02 11:05:00 00:00:01 *COMP __ MXCHKCFG DRG1 DG1C LTIAS02 11:05:00 00:00:01 *COMP __ MXCHKCFG DRG2 - LTIAS02 11:05:00 00:00:01 *COMP __ MXCHKCFG DRG2 DG2A LTIAS02 11:05:00 00:00:01 *COMP More... Parameters or command ===> _________________________________________________________________________ F3=Exit F4=Prompt F5=Refresh F7=Expand F9=Retrieve F12=Cancel F13=Repeat F15=Cancel proc. F18=Subset F21=Print list

Resolving problems with step status

When working with step status, it is important that you understand how multiple jobs are used to process the steps in a procedure. At any given time, job activity may be in progress for multiple steps. Or, one job may have failed processing a step while other jobs may have already processed that step and continued beyond it.

Important! Before you take action to resolve a problem with status for a step, be sure you understand the current state of your environment as a result of completed steps and steps in progress, as well as the effect of any action you take.

Table 22 identifies the possible status values that can appear on the Work with Step Status display and the action to take to resolve reported problems.

Table 22. Step status values with action required

Status Value    Description and Action Required

blank The procedure has started but processing has not yet started for the step.


*ATTN The step requires attention. The value *ATTN can only appear in the collapsed view or on a summary row in the expanded view. If the procedure status is considered active, at least one job submitted by this step has a status of *FAILED, *CANCEL or *MSGW. If the procedure status is *FAILED or *CANCELED, this step has at least one job that has not started or has a status of *CANCEL or *FAILED.

Action Required: Use F7 to see the expanded view. Determine the specific data resource group or data group for which the problem status exists. Then address the status indicated for that job.

*ACTIVE The step is currently running.

*COMP The step has successfully completed.

*DSBLD The step has been disabled and did not run.

*CANCEL or *FAILED

One or more jobs used by the step ended in error. In the expanded view of status, the job is identified as *CANCEL or *FAILED. The status is due to the error action specified for the step.

• For *CANCEL status, user action canceled the step. The step ran, ended in error, and issued an inquiry message. The user’s response to the message was Cancel.

• For *FAILED status, the step ran and one or more jobs ended in error. The Action on error attribute specified to quit the job.

The type of step program used by the step determines what happens to other jobs for the step and whether subsequent steps are prevented from starting, as follows:

• If the step program is of type *DGDFN, jobs that are processing other data groups within the same data resource group continue. When they complete, the data resource group job ends. Subsequent steps that apply to that data resource group or its data groups will not be started. However, subsequent steps will still be processed for other data resource groups and their data groups.

• If the step program is of type *DTARSCGRP, subsequent steps that apply to that data resource group or its data groups will not be started. Jobs for other data resource groups may still be running and will process subsequent steps that apply to their data resource groups and data groups.

• If the step program is of type *AGDFN, subsequent steps that apply to the application group will not be started. Jobs for data resource group or data group steps may still be running and will process subsequent steps that apply to their data resource groups and data groups.

When all asynchronous jobs for the procedure finish, the procedure status is set to *CANCELED or *FAILED, accordingly. If both canceled and failed steps exist when the procedure ends, the procedure status will be *FAILED.

Action Required: Determine the cause of the problem using “Resolving *CANCEL or *FAILED step statuses” on page 88.

*IGNERR The step ran and an error occurred, but processing ignored the error and continued.

Action Recommended: Use option 8 (Work with job) to determine the cause of the failure. Consider whether any changes are needed to your procedure or step or to your operating environment to prevent this error from occurring again.

*MSGW The step ran and issued a message that is waiting to be answered. One or more jobs for the step ended in error. Step attributes require that an operator respond to the message.

Action Required: Determine which job issued the message, investigate the problem, and then respond to the inquiry message using “Responding to a step with a *MSGW status” on page 87.

Responding to a step with a *MSGW status

When a step or a job for a step has a status of *MSGW, it is the result of an error condition. An inquiry message was sent because the step specified *MSGW for its Action on error attribute. An operator response is required before any additional processing for the job can occur.

To respond to a step in *MSGW status, do the following from the Work with Step Status display:

1. To see which job is waiting, use F7 to view the Expanded view.

2. To view information about what caused the job to end in error, type 8 (Work with job) next to the job with *MSGW status and press Enter.

3. On the Work with Job display, type 10 (Display job log, if active, on job queue, or pending) and press Enter.

4. The job log is displayed. Use F1 to view details of any of the messages. Find the error that caused the job to end. You will see the inquiry message in the job log; however, you cannot respond to it from here.

5. Press F12 twice to return to the Work with Step Status display.

6. Type 11 (Display message) next to the step job in *MSGW status and press Enter.

7. You will see the message “Error in step at sequence number number in procedure name. (R C I).” Do one of the following:

• A response of R (Retry) will retry processing the step program within the same job. Type R and press Enter.

• A response of C (Cancel) will set the job status to *CANCEL as indicated in the expanded view of step status. Subsequent steps are handled in the same manner as if the Action on error had specified the value *QUIT. Type C and press Enter.


• A response of I (Ignore) will set the job status to *IGNERR as indicated in the expanded view of step status, and processing continues as if the job had not ended in error. Type I and press Enter.

Resolving *CANCEL or *FAILED step statuses

Evaluate the cause of the failure or cancellation, as well as the state of other steps within the procedure. All steps with failed or canceled jobs need to be resolved.

Important! For any step which ended in error, other asynchronous jobs may have successfully processed the same step and continued on to process other subsequent steps. The actions taken by those steps as well as by completed steps which preceded the problem cannot be reversed.

Do the following from the Work with Step Status display:

1. Use F7 to view the Expanded view.

2. All steps which have a job that has a step status of *CANCEL or *FAILED must be evaluated and the cause of the problem must be resolved. To view information about why a job had an error processing a step, do the following:

a. Type 8 (Work with job) next to the job you want and press Enter.

b. On the Work with Job display, type 4 (Work with spooled file) and press Enter.

c. Display the spooled file for the job and check for the cause of the error.

d. Evaluate whether any immediate action is needed due to the condition which caused the error. Consider the nature and severity of the error.

3. If the procedure is still active and you need to take corrective action or perform additional investigation, cancel the procedure using F15 (Cancel proc.). Any steps that are currently running will complete, then the procedure status is set to *CANCELED.

4. Check which steps have completed, failed, were canceled, or have not yet started. Then evaluate the current state of your environment as a result. If needed, take corrective action that is appropriate for the extent of the errors and the extent to which steps completed.

Note: It is strongly recommended that you cancel the procedure, if it is active, before attempting any corrective action.

5. Determine how to best complete the procedure in the current state of your environment. When the procedure is *FAILED or *CANCELED, your choices are:

• Resume the procedure from the point where the procedure ended. If you resume a failed procedure, processing will begin with the step that failed. If you resume a canceled procedure, processing will begin with steps following the cancelled step. Optionally, if you were unable to resolve a problem for a step in error, you can override the attributes of that step for when the procedure is resumed. See “Resuming a procedure” on page 91.

• Acknowledge the procedure status, which allows a *CANCELED or *FAILED procedure to be run again starting with its first step. This choice indicates you have investigated the problem steps and want to run the procedure again from the beginning. This option should only be used after you have evaluated the effect of activity performed by the procedure. See “Acknowledging a procedure” on page 89.

Acknowledging a procedure

Acknowledging a procedure allows you to manually change the status of procedures that failed, were canceled, or completed with errors in order to control where the next attempt to run the procedure will start. Procedures with a status of *CANCELED, *FAILED, or *COMPERR can be acknowledged (set to *ACKCANCEL, *ACKFAILED, or *ACKERR, respectively) to indicate you have investigated the problem steps.

Acknowledging a *CANCELED or *FAILED procedure allows you to rerun the procedure from its first step. Once acknowledged, a procedure with either of these statuses cannot be resumed from the point where the procedure ended. This is appropriate when you have determined that your environment will not be harmed if the next attempt to run starts at the first step.

A *COMPERR procedure that is acknowledged (*ACKERR) can never be resumed because the procedure completed. By acknowledging a procedure with this status, you are confirming the problems have been reviewed.

The last run of a procedure with a status of *ACKCANCEL or *ACKFAILED and the last run of the set of start/end/switch procedures can be returned to their previous status (*CANCELED or *FAILED, respectively). The next attempt to run the procedure will resume at the failed or canceled step or at the first step that has not been started.

Note: Acknowledging the last run of a failed or canceled procedure will acknowledge all previous failed or canceled runs of the procedure.

Important! Before changing the status of a procedure, it is important that you evaluate and understand the effect of the partially performed procedure on your environment. Changing procedure status does not reverse the actions taken by preceding steps that completed or the actions performed by other asynchronous jobs which did complete the same step and then processed subsequent steps. It may not be appropriate for the next run of the procedure to begin with the first step, for example, if the failure occurred in a step which synchronizes data or changes states of MIMIX processes. Likewise, it may not be appropriate to return to the previous status to resume a run of a procedure that was not performed recently.

To change the status of a procedure, do the following (a command-line sketch follows these steps):

1. From the Work with Procedure Status display type 13 (Change status) next to the failed or canceled procedure you want and press Enter.

2. The Change Procedure Status (CHGPROCSTS) display appears. Specify the value you want for the Status prompt and press Enter.

3. If you specified *ACK in Step 2, the Start time prompt appears, displaying the timestamp of the selected procedure run. Do one of the following:

• To acknowledge only the selected failed or canceled run, press Enter.

• To acknowledge all previously failed or canceled runs of the selected procedure, specify *ALL for Start time and press Enter.
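As a command-line sketch of the same request, assume a procedure named SWTPLAN in application group SAMPLEAG; the parameter keywords shown here (PROC, AGDFN, STATUS, STRTIME) are assumptions for illustration only, so prompt CHGPROCSTS with F4 to see the actual parameters.

/* Acknowledge the selected run and all previous failed or canceled  */
/* runs of the procedure. Keyword names are illustrative assumptions.*/
CHGPROCSTS PROC(SWTPLAN) AGDFN(SAMPLEAG) STATUS(*ACK) STRTIME(*ALL)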


Running a procedure

The procedure type determines what command to use to run the procedure. For an application group, multiple procedures of type *USER can run at the same time if they have unique names. Only one run of a uniquely named procedure of type *USER can occur at a time.

All other procedure types must be invoked by the application group command associated with the procedure type. For example, a procedure of type *START can only be invoked by the Start Application Group (STRAG) command.

Where should the procedure begin? The value specified for the Begin at step (STEP) parameter on the request to run the procedure determines the step at which the procedure will start. The status of the last run of the procedure determines which values are valid.

The default value, *FIRST, will start the specified procedure at its first step. This value can be used when the procedure has never been run, when its previous run completed (*COMPLETED or *COMPERR), or when a user acknowledged the status of its previous run which failed, was canceled, or completed with errors (*ACKFAILED, *ACKCANCEL, or *ACKERR respectively).

Other values are for resolving problems with a failed or canceled procedure. When a procedure fails or is canceled, subsequent attempts to run the same procedure will fail until user action is taken. You will need to determine the best course of action for your environment based on the implications of the canceled or failed steps and any steps which completed.

The value *RESUME will start the last run of the procedure beginning with the step at which it failed, the step that was canceled in response to an error, or the step following where the procedure was canceled. The value *RESUME may be appropriate after you have investigated and resolved the problem which caused the procedure to end. Optionally, if the problem cannot be resolved and you want to resume the procedure anyway, you can override the attributes of a step before resuming the procedure.

The value *OVERRIDE will override the status of all runs of the specified procedure that did not complete. The *FAILED or *CANCELED status of these procedures are changed to acknowledged (*ACKFAILED or *ACKCANCEL) and a new run of the procedure begins at the first step.
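For example, using the Start Application Group (STRAG) command for a procedure of type *START, a hedged sketch of the Begin at step values follows; the application group parameter keyword (AGDFN) is an assumption, so prompt the command with F4 to confirm the actual parameter names.

/* Start at the first step (the default)                              */
STRAG AGDFN(SAMPLEAG) STEP(*FIRST)
/* Resume the last run at the point where it ended                    */
STRAG AGDFN(SAMPLEAG) STEP(*RESUME)
/* Acknowledge all incomplete runs and begin a new run at step one    */
STRAG AGDFN(SAMPLEAG) STEP(*OVERRIDE)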


To run a procedure of type *USER, do the following:

1. From the Work with Procedures or Work with Procedure Status display type 9 (Run) next to the user procedure you want and press F4 (Prompt).

2. Specify the value you want for Begin at step and press Enter.

To run a procedure type other than *USER, do the following:

From a command line, enter the application group command associated with the procedure type. For example, a procedure of type *START can only be invoked by the Start Application Group (STRAG) command.


To resume a procedure with a status of *CANCELED or *FAILED, see “Resuming a procedure” on page 91.

Resuming a procedure

To resume a procedure with a status of *CANCELED or *FAILED, do the following:

1. Investigate and resolve problems for steps with errors. See “Resolving problems with step status” on page 85.

2. Optional: If the problem cannot be resolved, and you want to resume the procedure anyway, use the Override Step (OVRSTEP) command to change the configured value of the step for when the procedure is resumed. See “Overriding the attributes of a step” on page 91.

3. For a procedure of type *USER, from the Work with Step Status display use F14 (Resume proc.). For all other procedure types, from a command line, enter the appropriate application group command and specify *RESUME as the value for Begin at step (STEP).

Overriding the attributes of a step

The attributes of a step can be overridden by using the Override Step (OVRSTEP) command to change the configured values of the step for the current run of the procedure. The attributes determine whether the step is run, and what action is taken if the step ends in error, for the current run of the procedure when it is resumed.

The OVRSTEP command can be used for a procedure that has a status of active (*ACTIVE, *ATTN, *MSGW, *PENDCNL, or *QUEUED), *CANCELED or *FAILED and steps that have a status of *CANCEL or *FAILED. The overridden values apply only for the current run of the procedure when it is resumed.

Note: Regardless of procedure status, attributes cannot be overridden for a required MIMIX step or any step with a step status of *COMP or *IGNERR.

A procedure with a status of *CANCELED or *FAILED requires user action to resolve a problem. If the problem cannot be resolved and you want to resume the procedure anyway, you can use the OVRSTEP command to disable the step in error or specify the error action to occur when the step is retried.
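A hedged command-line sketch follows. The keywords and values used to identify the procedure run and step (PROC, AGDFN, SEQNBR, and the sequence number 600) are assumptions for illustration only, while ERRACT corresponds to the Action on error attribute described in this topic; prompt OVRSTEP with F4 to see the actual parameters.

/* For the current run only, have the step in error send an inquiry  */
/* message if it fails again when the procedure is resumed.          */
OVRSTEP PROC(SWTPLAN) AGDFN(SAMPLEAG) SEQNBR(600) ERRACT(*MSGW)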

Important! Overriding the attributes of a step should only be done after you have considered how rerunning the step impacts your environment. It is important that you understand the implications for steps which preceded the cancellation or failure in the last run of the procedure. Processing for steps that completed is not reversed.

The changes made when using the OVRSTEP command will only apply to the current run of the procedure. The attributes that can be changed will vary depending on the statuses of the specified procedure and step. Consider the following:

• When the specified procedure has a status of *ACKCANCEL, *ACKFAILED, *ACKERR, *COMPLETED, or *COMPERR, no attributes can be overridden on any step in the procedure.


• When the specified procedure has a status that is considered active (*ACTIVE, *ATTN, *MSGW, *PENDCNL, or *QUEUED), only the Action on error (ERRACT) can be overridden.

• When the specified procedure has a status that can be resumed (*CANCELED or *FAILED), the Action before step (BEFOREACT), Action on error (ERRACT), or State (STATE) can be overridden only on steps that have not yet run, that failed, or that were canceled.

Do the following from the Work with Step Status display:

1. Press F7 (Expand) to view status of the individual jobs used to process each step.

2. Type 13 (Override step) next to the step you want and press Enter.

3. On the Override Step (OVRSTEP) display, specify the values you want and press Enter. From the Work with Step Status display, use F14 (Resume proc.) to resume the procedure. See “Resuming a procedure” on page 91.

Canceling a procedure

Use this procedure to cancel a procedure with a status that is considered active. This includes procedure statuses of *ACTIVE, *ATTN, *MSGW, *PENDCNL, and *QUEUED.

Important! Use this command with caution. Processing ends without reversing any actions performed by completed steps, which may leave your environment in an undesirable state. For example, ending a switch procedure could result in partially switched data.

The status of the procedure will be changed to *PENDCNL. If there are any inquiry messages waiting for an operator response, they are processed as if the response was Cancel. When all activity for currently running steps end, the status of the procedure will be automatically changed to *CANCELED.

To cancel an active procedure, do one of the following:

• From the Work with Procedure Status display, type 12 (Cancel) next to the procedure you want and press Enter.

• From the Work with Step Status display, press F15 (Cancel proc.).

A procedure that has been canceled can be resumed later, as long as its status has not been changed to *ACKCANCEL. When a canceled procedure is resumed, processing begins immediately after the point where it was ended.
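The Cancel Procedure (CNLPROC) command can also be entered directly. In this minimal sketch, the PROC and AGDFN parameter keywords are assumptions shown for illustration only; prompt the command with F4 to verify the actual parameters.

/* Request cancellation; status changes to *PENDCNL and then to      */
/* *CANCELED when the steps that are currently running end.          */
CNLPROC PROC(SWTPLAN) AGDFN(SAMPLEAG)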


CHAPTER 5 Monitoring status with MIMIX Availability Status

The MIMIX Availability Status display is useful in environments that do not use application groups.

Note: The MIMIX Availability Status should not be used in environments that use application groups.

The MIMIX Availability Status display, shown in Figure 11, provides one location for quickly assessing the overall state of an entire MIMIX installation. The status values are prioritized and provide a composite view reflecting both source and target systems. In addition to determining status, unique features of this display enable its use as the starting point for performing routine actions and resolving problems.

To access this display, do one of the following:

• Select option 1 on the MIMIX Basic Main Menu

• Enter the command WRKMMXSTS and press Enter.

Figure 11 shows the MIMIX Availability Status display.

Figure 11. MIMIX Availability Status window. This example shows that MIMIX is active but the installation is not complying with best practices for switching (red) and audits (yellow).

Additional fields - In the upper right corner of the display, additional fields report information that is relevant to maintaining the installation.

Recoveries - Identifies the total number of recoveries in progress for the installation. Active recoveries represent problems detected and being corrected by MIMIX AutoGuard. Before certain activity, such as ending MIMIX, it is important that there are no recoveries in progress in the installation. If more than 9999 recoveries exist, the field displays ++++.

Last switch - This field is only displayed when there is a value specified for the Default model switch framework policy. The date indicates when the last completed switch was performed using the switch framework specified in the policy. If you have not yet performed a switch using the switch framework defined in policies, this date is when the MIMIX environment was first started or when the system managers were started and explicitly reset the configuration.

Activity/Status - The main area of the display provides a reporting area for status of activity in key areas: Replication, Audits and notifications, and Services.

For each activity area, status represents a summation of multiple processes. The text shown within each activity area changes to identify the most severe problem within its processes. Text, as well as background color, also identify the summarized status and indicate what action is appropriate.

Blue indicates there are no problems with the activity and that no action is required.

Yellow indicates warnings that may need your attention.

Red indicates errors or inactive processes that require immediate action.
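As noted above, the "most severe problem wins" behavior can be modeled with a short sketch. This is illustrative only, not MIMIX code; the severity ordering (red over yellow over blue) is taken from the color descriptions above.

    # Illustrative sketch: each activity area shows the most severe
    # condition among the processes it summarizes.
    SEVERITY = {"blue": 0, "yellow": 1, "red": 2}

    def area_status(process_colors):
        # process_colors example: ["blue", "yellow", "blue"]
        return max(process_colors, key=lambda color: SEVERITY[color])

    # area_status(["blue", "yellow", "blue"]) -> "yellow"
    # area_status(["red", "yellow"])          -> "red"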

Options - On this display, the activity you select with an option and the status of the activity determine what you see as the result of using the option. This behavior is unlike that of options on other MIMIX displays. The following subtopics describe the results of using the available options.

Option 5 (Display details) from the MIMIX Availability Status display results in a display showing detailed status for the selected activity. Take option 5 next to the item to access detailed information for the activity.

• For Replication, the result is the Work with Data Groups display.

• For Audits and notifications, the result is the Summary view of the Work with Audits display. (To see details for notifications, press F20 (Command line), then enter the command WRKNFY.)

• For Services, the result is the Work with Systems display for status of the MIMIX managers. (To see details for monitors, press F4 (MIMIX Menu), then use option 12 (Work with monitors).)

Option 9 (Troubleshoot) from the MIMIX Availability Status display results in the appropriate display to use as a starting point for troubleshooting the stated problem for the selected activity. The stated problem reflects the highest severity problem present. Other, less severe problems may exist; they may be reflected on the subsequent display but are not reflected on the MIMIX Availability Status display until higher severity problems are resolved. Take option 9 next to the item to access detailed information for the activity. (An illustrative sketch of this routing follows the lists below.)

• For Replication, the result is the Work with Data Groups display.


• For Audits and notifications, the result is dependent on the severity of the stated problem. All auditing conditions are prioritized before any notifications. For audits with status conditions, the result is the Summary view of the Work with Audits display. For audits with compliance conditions, the result is the Compliance view of the Work with Audits display. For notifications with errors, the result is the Work with Notifications display.

• For Services, the result is dependent on the severity of the stated problem. All system manager, journal manager, and target journal inspection errors are prioritized before any monitor errors. For system manager, journal manager, and target journal inspection errors, the result is the Work with Systems display. For monitor errors, the result is the Work with Monitors display.
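As noted above, the following sketch illustrates where option 9 leads. It is illustrative only, not MIMIX code; the display names and the prioritization come from the lists above.

    # Illustrative sketch of where option 9 (Troubleshoot) leads, based on
    # the activity selected and the highest-severity stated problem.
    def troubleshoot_display(activity, problem):
        if activity == "Replication":
            return "Work with Data Groups"
        if activity == "Audits and notifications":
            # Auditing conditions are prioritized before notifications.
            if problem == "audit status":
                return "Work with Audits (Summary view)"
            if problem == "audit compliance":
                return "Work with Audits (Compliance view)"
            return "Work with Notifications"
        if activity == "Services":
            # Manager and target journal inspection errors are prioritized
            # before monitor errors.
            if problem == "monitor":
                return "Work with Monitors"
            return "Work with Systems"
        raise ValueError("unknown activity")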

Checking replication status from the MIMIX Availability Status display

The first activity listed on the MIMIX Availability Status display is Replication, as shown in Figure 11. The replication area summarizes status of replication activity for all data groups in the installation. This includes processes required for replication and also reflects potential problems.

Status values are shown by color while message text within the highlighted area indicates the nature of any problem.

Blue - There are no problems with replication processes and no action is required.

Yellow - Warnings exist that may need your attention. Possible causes include:

• A file is being synchronized by MIMIX AutoGuard. This condition usually resolves itself.

• A process has a backlog which has reached its threshold.

• An object on the target system is not journaled as expected.

• Journal state or cache are not as expected.

Red - Conditions exist that require immediate action or a switch is in progress. Possible scenarios that require immediate action include:

• Error conditions

• Processes required for replication are not active

• Some objects are not journaled and therefore cannot be replicated

• Journal state or cache is not as expected.

Status may change due to warnings or problems with any of the replication processes, with replication errors associated with data group entries (file, object, IFS tracking, and object tracking), or with a change in switch status.

To begin resolving problems, use option 9 (Troubleshoot) to access the Work with Data Groups display, from which you can view detailed information and take action. See “The Work with Data Groups display” on page 99 for more information.


Note: Replication status can indicate action required (red) while a switch is in progress. When you are ready to switch from the backup system to the production system, press F4 (MIMIX Menu). From there, use option 5 to continue switching.

Checking audit and notification status from the MIMIX Availability Status display

The middle activity listed on the MIMIX Availability Status display is Audits and notifications, as shown in Figure 11. This activity area summarizes status of all audit activity, problems with audit results, audit compliance, and new notifications for a MIMIX installation.

Status values are shown by color while message text within the highlighted area indicates the nature of any problem.

Blue - No action is required. No audits are active, have differences, or are out of compliance, and there are no new error or warning notifications.

Yellow - An audit or notification may need your attention. An out-of-compliance audit is running its compare phase, an audit is approaching an out-of-compliance state, or a new warning notification exists.

Red - A condition exists that requires immediate action. An audit has failed, has unresolved differences, is out of compliance, or was prevented from running because of policy values; or a new error notification exists.

Status may change due to the highest severity condition with audits, audit results, audit compliance, or new notifications.

To begin resolving problems, use option 9 (Troubleshoot) to access the appropriate display for the indicated problem.

• For audit status problems, see “Resolving audit problems” on page 133.

• To resolve audit compliance problems, the audits must be run. See “Running an audit immediately” on page 131 and “Displaying audit compliance” on page 144.

• For additional information about notifications see “Displaying notifications” on page 160.

Checking status of supporting services from the MIMIX Availability Status display

The last activity listed on the MIMIX Availability Status display is Services, as shown in Figure 11. This area summarizes status and also reflects potential problems with system managers, journal managers, target journal inspection, collector services, and all enabled monitors for the installation.

Status values are shown by color while message text within the highlighted area indicates the nature of any problem.

Blue - There are no problems for the managers, target journal inspection, collector services, and monitors. No action is required.

Red - A system manager, journal manager, target journal inspection, collector service, or a monitor is in a state that requires immediate action. The status text indicates which problem occurred and where you can see detailed information.

To begin resolving problems, use option 9 (Troubleshoot) to access the appropriate display.

When the text in the Services area indicates a problem with system managers, journal managers, target journal inspection, or collector services, option 9 will access the Work with Systems display, from which you can view detailed information and take action. See “Working with system-level processes” on page 149 for more information.

When the text in the Services area indicates a problem with a monitor, option 9 will access the Work with Monitors display. For more information about working with monitors, see the Using MIMIX Monitor book.


CHAPTER 6 Working with data group status

This chapter describes common MIMIX operations that help keep your MIMIX environment running. In order for MIMIX to provide a hot backup of your critical information, all processes associated with replication must be active at all times. Supporting service jobs must also be active. MIMIX allows you to display and monitor the statuses of these processes.

The topics included in this chapter are:

• “The Work with Data Groups display” on page 99 describes the errors reported on this display and provides procedures for resolving them.

• “Working with the detailed status of data groups” on page 105 describes how to access detailed status for a data group.

• “Identifying replication processes with backlogs” on page 115 describes what fields to check for detailed status of a data group.


The Work with Data Groups display

From the Work with Data Groups display you can start and end replication, track replication status, perform a data group switch, and work with files, objects, and tracking entries in error, as well as access displays for data group entries and tracking entries.

Do one of the following to access the Work with Data Groups display:

• From the MIMIX Intermediate Main menu, select option 1 (Work with data groups) and press Enter.

• From the MIMIX Availability Status display, type 5 (Display details) next to Replication and press Enter.

Figure 12. Sample Work with Data Groups display. The display uses letters and colored highlighting to call your attention to warning and problem conditions. This example shows items in color which would appear with color highlighting on the display. If you are viewing this page in printed form, the color may not be shown.

For each data group listed, you can see the current source system and target system processes, and the number of errors reported.

The following fields and columns are available.

Audit/Recov./Notif. - This field is located in the upper right corner of the Work with Data Groups display. The first number is the total number of audits that require action to correct a problem or that require your attention to prevent a situation from becoming a problem. The second number indicates the number of active recoveries, including those resulting from audits. The third number indicates the number of new notifications that require action or attention. If more than 999 items exist in any field, the field will display +++. When a field is highlighted in red, a problem exists.

Work with Data Groups CHICAGO 11:02:05 Type options, press Enter. Audits/Recov./Notif.: 001 / 002 / 003 5=Display definition 8=Display status 9=Start DG 10=End DG 12=Files needing attention 13=Objects in error 14=Active objects 15=Planned switch 16=Unplanned switch ... ---------Source--------- --------Target-------- -Errors- Opt Data Group System Mgr DB Obj DA System Mgr DB Obj DB Obj __ APP1 LONDON A I CHICAGO A I __ APP2 LONDON A A A CHICAGO A A A __ APP3 LONDON A I CHICAGO A I 2 __ CRITICALAP LONDON A R A A CHICAGO A A A 1 4 __ RJAPP4 LONDON A L CHICAGO A I Bottom F3=Exit F5=Refresh F7=Audits F8=Recoveries F9=Automatic refresh F10=Legend F13=Repeat F16=DG definitions F23=More options F23=More keys


When a field is highlighted in yellow, at least one out-of-compliance audit is currently active or an audit is approaching out of compliance. For details, see “Problems reflected in the Audits/Recov./Notif. field” on page 101.

Data group - When a data group name is highlighted, a problem exists. For details, see “Problems reflected in the Data Group column” on page 101.

Source - The following columns provide summaries of processes that run on the source system. For details about status values, see “Replication problems reflected in the Source and Target columns” on page 103.

Mgr - Represents a summation of the system manager and the journal manager processes on the source system of the data group.

DB - Represents the status of the remote journal link. It is possible to have an active status in this column even though the data group has not been started. When the RJ link is active, database changes will continue to be sent to the target system. MIMIX can read and apply these changes once the data group is started. For data groups configured for source-send replication, this represents the status of the database send process.

Obj - Represents a summation of the object processes that run on the source system. These include the object send, object retrieve and container send processes.

DA - This column represents the status of the data area polling process when the data group replicates data areas through the data area poller. This column does not contain data when data areas are replicated through the user journal with advanced journaling or through the system journal.

Target - The following columns provide summaries of processes that run on the target system. For details about status values, see “Replication problems reflected in the Source and Target columns” on page 103.

Mgr - Represents a summation of the system manager, journal manager, and target journal inspection processes on the target system of the data group. Target journal inspection status includes status of inspection jobs for both target journals (user and system) for the data group.

DB - Represents the summation of status for the database reader process, the database apply process, and access path maintenance jobs1. For data groups configured for source-send replication, this column represents the summation of the status of database apply processes and access path maintenance jobs.

Obj - Represents the object apply processes.

Errors - When any errors are indicated in the following columns (DB and Object), they are highlighted in red.

DB - Represents the sum of the number of database files, IFS objects, and *DTAARA and *DTAQ objects that are on hold due to errors, plus the number of logical (LF) and physical (PF) files that have access path maintenance1 failures for the data group.

1. Access path maintenance status and errors are reported on the Work with Data Groups display only in installations running MIMIX 7.1.15.00 or higher. Access path maintenance jobs run only if the access path maintenance (APMNT) policy is enabled.


To work with a subsetted list of file errors and access path errors, use option 12 (Files needing attention). For a subsetted list of IFS object errors, use option 51 (IFS tracking entries not active). For a subsetted list of *DTAARA and *DTAQ errors, use option 53 (Object tracking entries not active).

Obj - Represents a count of the number of objects for which at least one activity entry is in a failed state. To work with a subsetted list, use option 13 (Objects in error).

For additional information, see “Working with files needing attention (replication and access path errors)” on page 210, “Working with tracking entries” on page 219, and “Working with objects in error” on page 224.

Problems reflected in the Audits/Recov./Notif. field

When the Audits field is highlighted in reverse red, at least one audit has failed, has unresolved differences, is out of compliance, or was not run due to a policy. When it is highlighted in reverse yellow, at least one out-of-compliance audit is currently active or an audit is approaching out of compliance. For more information about audits, see “Displaying audit runtime status” on page 129.

The Recov. (recoveries) field indicates the number of active recoveries, including those resulting from audits. Active recoveries are an indication of problems detected by MIMIX AutoGuard which is attempting to correct them. For more information about recoveries, see “Displaying recoveries” on page 164.

When the Notif. (notifications) field is highlighted in reverse red, at least one new notification with a severity of *ERROR exists. When it is highlighted in reverse yellow, at least one new notification with a severity of *WARNING exists. For more information about notifications, see “Displaying notifications” on page 160.

Problems reflected in the Data Group column

When a data group name is highlighted in color, journaling problems exist that affect replication of one or more types of data.

Table 23. Conditions which highlight the data group name in color.

Red - One of the following conditions exists:

• Files, IFS tracking entries, or object tracking entries defined to the data group are not journaled or not journaled correctly on the source system.

• The source side journal is in standby or inactive state.

Yellow - One of the following conditions exists:

• Files, IFS tracking entries, or object tracking entries defined to the data group are not journaled or not journaled correctly on the target system. This is only enforced if the data group is set up to journal on the target system as defined in the data group definition.

• Data group file entries, IFS tracking entries, or object tracking entries are on hold for reasons other than an error.

• The journal cache value for the source journal does not match the configured value in the journal definition.

• The journal cache value for the target journal does not match the expected cache value and the database apply session is active. If another data group is using the journal definition as a source journal, the actual journal cache value may be different than the configured value.

• The target journal state value for the target journal does not match the expected state value and the database apply session is active. If another data group is using the journal definition as a source journal, the actual state may be different than the configured value.

Note: In a cooperative processing environment, files, IFS tracking entries, or object tracking entries being added dynamically to the configuration for user journal replication may reflect an intermediate state of not journaled until they have been synchronized and become active to MIMIX.


Resolving problems highlighted in the Data Group column

In most environments, the most likely causes indicated in Table 23 are problems with journaling. Problems associated with journal state or journal cache are only reported in data groups which are configured to use those high availability journal performance enhancements.

Journaling problems: If the data group name is highlighted in red or yellow, do the following to check for and resolve journaling problems:

1. Check for not journaled conditions for each of the following:

• To determine which files are not journaled, use option 17 (File entries) for the data group. Then press F10 (journaled view) to see journaling status.

• To determine which IFS tracking entries are not journaled, use option 50 (IFS tracking entries) for the data group. Then press F10 (journaled view) to see journaling status.

• To determine which object tracking entries are not journaled, use option 52 (object tracking entries) for the data group. Then press F10 (journaled view) to see journaling status.

2. To start journaling for a file or a tracking entry, use option 9 (Start journaling).

3. You can use option 11 (Verify journaling) to verify that journaling has started.


Journal cache or journal state problems: If the data group name is highlighted in red or yellow, do the following to check for and resolve problems:

1. From the Work with Data Groups display, use option 8 (Display status).

2. From the Data Group Status display, press F8 (Database).

3. The Jrn State and Cache Src and Tgt fields are located in the upper left corner of the Data Group Database Status display. For each system (Src or Tgt), the status of the journal state is shown first, followed by the status of the journal cache. The example below shows v for value in all four status positions. If any of these fields are highlighted, there is a problem. Use “Resolving a problem with journal cache or journal state” on page 119.

   Jrn State and Cache   Src:  v v   Tgt:  v v

Manager problems reflected in the Source and Target columns

The status of needed system-level processes is reflected in the Mgr column for the source and target system. The managers must be active for replication to occur. For any status other than A (active), use “Working with system-level processes” on page 149.

Replication problems reflected in the Source and Target columns

The status of each process is represented by a status letter and the color of the box surrounding the letter. Table 24 describes the letters and colors used for status of the replication process summaries shown in the Source and Target columns.

Table 24. Possible status values for source and target process summaries

I Inactive (highlighted red) – The process is currently not active.

L Inactive RJ link (highlighted red) – The RJ link is currently not active. This status is only displayed in the database source column when a data group uses MIMIX RJ support.

A Active (highlighted blue) – The process is currently active. For the database source column, this value indicates that the send/receive processes are active.

C RJ Catch-up mode (highlighted blue) – The remote journal is currently in catch-up mode. This status can only be displayed in the database source column for data groups that use remote journaling. Catch-up mode indicates that the operating system is transferring journal entries from the source system journal to the remote journal as quickly as possible. When the database reader process is active, MIMIX processes the journal entries as they reach the target system.

R Active RJ link (highlighted blue) – The RJ link is currently active. This status is only displayed in the database source column when a data group uses MIMIX RJ support.

U Unknown (highlighted white) – The status of the process cannot be determined, possibly because of an error or communications problem.

J RJ Link in Threshold (highlighted turquoise) – The RJ link has fallen behind its configured threshold. View detailed status to determine the extent of the backlog.

T Threshold reached (highlighted turquoise) – A process has fallen behind a configured threshold. View detailed status to determine which process has exceeded its backlog threshold and to determine the extent of the backlog. See “Working with the detailed status of data groups” on page 105.

X Switch mode (highlighted red) – The data group is in the middle of switching the data source system and status may not be retrievable or accurate.

P Partially active (highlighted red) – At least one subprocess is active, but one or more subprocesses are not active. This status is only displayed in process columns that represent multiple processes. The data group name may also be shown in a highlighted field of red. In the Target DB column, partial status is also possible when all other processes, including database apply, are active but access path maintenance1 is enabled and does not have at least one active job.

D Disabled – The process is currently not active and the data group is disabled.
Note: The status value for a disabled data group is the letter D displayed in standard format. No colored blocks are used.

W Waiting at a recovery point (highlighted red) – The process is currently suspended at a recovery point.

1. Access path maintenance is available only on installations running 7.1.15.00 or higher.

Note: Use F10 (Legend) to view a pop-up window that displays the status values and colors. To remove the pop-up window, press Enter or F12 (Cancel).

Setting the automatic refresh interval

You can control how frequently the data shown on the Work with Data Groups display is refreshed by doing the following:

1. Press F9 (Automatic refresh).

2. The Automatic Refresh Value pop-up appears. Specify how long you want the system to wait before refreshing the information and press Enter.

The status displayed will automatically refresh when the specified interval passes. To end the automatic refresh process, press Enter.


Working with the detailed status of data groups

Basic support for detailed data group status is available in the 5250 emulator interface.

The Data Group Status display (DSPDGSTS command) uses multiple views to present status of a single data group. The views identify and provide status for each of the processes used by the data group. Error conditions for the data group as well as process statistics and information about the last entry processed by each replication process are included. Some fields are repeated on more than one view.

The data group configuration determines what fields are visible. If the data group is database only, the object fields are not shown. Similarly, if the data group is object only, the database fields are not shown.

Displaying data group detailed status

Detailed status is available for one data group at a time. There are multiple ways of locating and subsetting to the data group.

Do the following to access detailed status for a data group:

1. Use one of the following to locate the data group you want:

• To select a data group from a list of all data groups in the installation, select option 6 (Work with data groups) on the MIMIX Basic Main Menu and press Enter.

• To select a data group from a subsetted list for an application group, from the Work with Application Groups display use option 13 (Data resource groups) to select a resource group. On the resulting display use option 8 (Data groups).

2. The Work with Data Groups display appears. Type an 8 (Display status) next to the data group you want and press Enter.

3. The Data Group Status display shows a merged view of data group activity on the source and target systems. (See Figure 13.)

Only fields for the type of information replicated by the data group are displayed. For example, if the data group replicates only objects from the system journal, you will only see fields for system journal replication. If the data group replicates from both the system journal and the user journal, you will see fields for both. To see additional status information for object processes or database processes, do the following:

• If the data group contains object information, press F7 (Object) to view additional object status displays. The Data Group Object Status display appears.

• If the data group contains database information, press F8 (Database) to view additional database status displays. The Data Group Database Status display appears. Tracking entry information for advanced journaling is also available.

4. For object information, there are three views. For database information, there are four views available. Use F11 to change between views.


Note: If the data group contains both database and object information, you can toggle between object details and database details by using the F7 and F8 keys.

Merged view

The initial view displayed is the merged view. This view summarizes status for the replication paths configured for the data group. The status of each process is represented by the color of the box surrounding the process and a status letter. Table 25 shows possible status values.

Figure 13 shows a sample of the merged view of the Data Group Status display. The data group in this view is configured for user journal replication using remote journaling and for system journal replication. Also, access path maintenance is enabled.

Figure 13. Merged view of data group status. The inverse highlighted blocks are not shown in this example.

Note: Journal sequence numbers shown in the Source Statistics and Target Statistics areas may be truncated if the journal supports *MAXOPT3 for the receiver size and the journal sequence number value exceeds the available display field. When truncation is necessary, the most significant digits (left-most) are omitted. Truncated journal sequence numbers are prefixed by '>'. This is shown in Figure 13.

Data Group Status 17:39:36 Data group . . . . : CRITICALAP Database errors . . . . : 1 Elapsed time . . . : 00:52:51 Objects in error/active : 4 / 0 Transfer definition: PRIMARY-A State. . . . . . . . . : *ASYNCPEND --------------------------- Source Statistics --------------------------- System: LONDON-A Jrn Mgr-A RJLNK Mon-A Receiver Sequence # Date Time Trans/Hour Database Source Jrn. LONDN0002 >0,000,002,591 4/20/08 11:02:35 Link-A RJ Tgt Jrn. LONDN0002 >0,000,002,591 4/20/08 11:02:35 Last Read . LONDN0002 >0,000,002,591 4/20/08 11:02:35 Entries not read: 0 Est. time to read: Object Current . . AUDRCV0108 22,314,732 4/22/08 17:37:13 748 Send-I Last Read . AUDRCV0103 22,175,464 4/21/08 11:05:56 *SHARED Entries not read : 139,268 Est. time to read: --------------------------- Target Statistics --------------------------- System: CHICAGO-A Jrn Mgr-A DB Rdr- A AP Maint-A RJLNK Mon-A Sys Jrn Insp -A Last Received Unprocessed Entry Count Est Time User Jrn Insp-A Sequence # Entry Count Trans/Hour To Apply DB Apply-A >0,000,002,590 Obj Apply-A 22,023,868 4 F3=Exit F5=Refresh F7=Object view F8=Database F9=Automatic refresh F10=Restart statistics F12=Cancel F14=Start DG F24=More keys


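The truncation rule described in the note above can be illustrated with a short sketch. This is illustrative only, not MIMIX code; the field width and formatting are assumptions.

    # Illustrative sketch: when a *MAXOPT3 journal sequence number is wider
    # than its display field, the left-most digits are dropped and the value
    # is prefixed with '>'.
    def display_sequence(sequence_number, field_width=14):
        text = format(sequence_number, ",")
        if len(text) <= field_width:
            return text
        return ">" + text[-(field_width - 1):]

    # display_sequence(2591)                 -> "2,591"
    # display_sequence(12345678900000002591) -> ">0,000,002,591" with the
    #                                            assumed 14-character field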

Top left corner: The top left corner of the Data Group Status display identifies the data group, the elapsed time, and the status of the transfer definition in use. The elapsed time is the amount of time that has elapsed since you accessed this display or used the F10 (Restart statistics) key.

Top right corner: The top right corner of the display identifies the number of errors identified by MIMIX. If the workstation supports colors, the number of files and objects in error is displayed in red.

• The Database errors field identifies the number of errors in user journal replication processes. This includes all file entries, IFS tracking entries, and object tracking entries in error. When access path maintenance1 is enabled, this also includes the number of logical and physical files that have access path maintenance failures for the data group.

Table 25. Possible values for detailed status. Not all statuses are used by each process.

Color and Status - Description

Red When displayed on the Data group, Database errors, or Objects in error fields, a problem exists that requires action.

Red - I The process is inactive.

Red - W The process is suspended at a recovery point. This status is only available for apply processes.

Yellow When displayed on the Data group field, a problem exists that may require attention.

Yellow - P One or more of the processes is active but others are inactive. On the merged view, this status is only possible for the Object Send field.

Turquoise - T The process has a backlog which exceeds its configured threshold. On fields which summarize status for multiple processes, use F7 and F8 to view the specific threshold. The -T is not shown in statistical fields. If a threshold condition persists over time, refer to the MIMIX Administrator Reference book for information about possible resolutions.

White - U The status of the process is unknown.

Blue - A The process is active.

Blue - C The RJ Link is in catch-up mode. This status is only possible for the Database Link process in the merged view and the RJ link field in some database views.

Green - D The data group is disabled. This also means the data group is currently inactive.

1. Access path maintenance is available only on installations running MIMIX 7.1.15.00 or higher.


• The Objects in error/active fields indicate the number of objects that are failed and the number of objects with pending activity entries. The first number in these fields indicates the number of objects defined to the data group that have a status of *FAILED. The second number indicates the number of objects with active (pending) activity entries.

• The State field identifies the state of the remote journal link. The values for the state field are the same as those which appear on the Work with RJ Links display. This field is not shown if the data group uses source-send processes for user journal replication.

Source statistics: The middle of the display shows status and summarized statistics for the journals being used for replication and the processes that read from them. The following process fields are possible:

System - Identifies the current source system definition. The status value is an indication of the success in communicating with that system.

Jrn Mgr - Displays the status of the journal manager process for the source system.

DA Poll - Displays the status of the data area poller. This field is present only if the data group replicates data areas using this process.

RJLNK Mon - Displays status of the RJLNK monitor on the source system. This field is present only for data groups that use remote journaling.

Database (Link or Send) - Identifies the status of the process which transfers user journal entries from the source system to the target system.

Link - Displayed when the data group is configured for remote journaling. The status is that of the RJ link.

Send - Displayed when the data group is configured for MIMIX source-send processes. The status is that of the database send process.

Object Send - Displays a summation of status from the object send, object retrieve, and container send processes. The highest priority status from each process determines the status displayed. Use F7 (Object view) to see the individual processes. When the data group uses a shared object send job, either the value *SHARED or a three-character job prefix is displayed below the Send process status. The value *SHARED indicates that the data group uses the MIMIX generated shared object send prefix for this source system. A three-character prefix indicates this data group uses a shared object send job on this system that is shared only with other data groups which specify the same prefix.

For the Database and Object processes, additional fields identify current journal information, the last entry that has been read by the process, and statistics related to arrival rate, entries not read, and estimating the time to read.

Current - For the Database Send and Object Send processes, this identifies the last entry in the currently attached journal receiver. This information is used to show the arrival rate of entries to the journals.

Note: If the data group uses remote journaling, current information is displayed in two rows, Source jrn and RJ tgt jrn. The source journal sequence number refers to the last sequence number in the local journal on the source system. The remote journaling target journal sequence number refers to the last sequence number in the associated remote journal on the target system.

Transactions per hour - For current journal information, this is based on the number of entries to arrive on the journal over the elapsed time the statistics have been gathered. For last read information, this is based on the actual number of entries that have been read over the elapsed time the statistics have been gathered.

Last Read - Identifies the journal entry that was last read and processed by the object send, database send, or database reader.

Transactions per hour - For current journal fields, this is based on the number of entries to arrive on the journal over the elapsed time the statistics have been gathered. For last read fields, this is based on the actual number of entries that have been read over the elapsed time the statistics have been gathered and will change due to elapsed time and the rate at which entries arrive in the journal.

Entries not read - This is a calculation of the number of journal entries between the last read sequence number and the sequence number of the last entry in the current receiver for the source journal. An asterisk (*) preceding this field indicates that the journal receiver sequence numbers have been reset between the last entry in the current receiver and the last read entry.

Estimated time to read - This is a calculation using the entries not read and the transactions per hour rate. This calculation is intended to provide an estimate of the length of time it may take the process (database reader, database send, or object send) to complete reading the journal entries.
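The statistics described above (transactions per hour, entries not read, and estimated time to read) can be related by a short sketch. This is an illustrative calculation only, not MIMIX code; the function and parameter names are assumptions.

    # Illustrative sketch of the source statistics described above.
    def source_statistics(last_in_receiver, last_read, entries_read, elapsed_hours):
        # Entries not read: gap between the last entry in the attached
        # receiver and the last entry read by the process.
        entries_not_read = last_in_receiver - last_read
        # Transactions per hour: entries read over the elapsed time the
        # statistics have been gathered.
        trans_per_hour = entries_read / elapsed_hours if elapsed_hours else 0.0
        # Estimated time to read: backlog divided by the read rate.
        est_hours_to_read = (entries_not_read / trans_per_hour
                             if trans_per_hour else None)
        return entries_not_read, trans_per_hour, est_hours_to_read

    # Example of the arithmetic: 139,268 entries not read at a rate of 748
    # entries per hour works out to roughly 186 hours.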

Target statistics: The lower part of the display shows status and summarized statistics for all target system processing. The following process fields are possible:

System - Identifies the current target system definition. The status value is an indication of the success in communicating with that system.

Jrn Mgr - Displays the status of the journal manager process for the target system.

DB Rdr - Displays status of the database reader. This field is present only for data groups that use remote journaling.

AP Maint - Displays status of the access path maintenance1 processes. This field is only present when optimized access path maintenance has been enabled.

RJLNK Mon - Displays status of the RJLNK monitor on the target system. This field is present only for data groups that use remote journaling.

Sys Jrn Insp - Displays the status of target journal inspection for the system journal (QAUDJRN) on the target system of the data group. This field is displayed when the journal definition for the system journal on the current target system permits target journal inspection and the data group is enabled and has been started at least once.

1. Access path maintenance is available only on installations running MIMIX 7.1.15.00 or higher. In earlier levels of MIMIX, if parallel access path maintenance is enabled, its status is displayed in the Prl AP Mnt field that appears in this location.


User Jrn Insp - Displays the status of target journal inspection for the user journal on the target system of the data group. This field is displayed when the journal definition for the user journal on the current target system permits target journal inspection and the data group is enabled, performs user journal replication, permits journaling on target, and has been started at least once.

DB Apply and Obj Apply - Each field displays the combined status for the apply jobs in use by the process. For each process, additional fields show statistics for the last received journal sequence number, number of unprocessed entries, approximate number of transactions per hour being processed, and the approximate amount of time needed to apply the unprocessed transactions for all database or object apply sessions.

Object detailed status views

Figure 14, Figure 15, and Figure 16 show samples of the information available when you use F7 (Object) to view the detailed object information. Use F11 to move between the three views of detailed object status. On each view, you can use the F1 (Help) key to see a description of that view’s contents.

In all object views, journal sequence numbers may be truncated if the journal supports *MAXOPT3 for the receiver size and the journal sequence number value exceeds the available display field. When truncation is necessary, the most significant digits (left-most) are omitted. Truncated journal sequence numbers are prefixed by '>'.

The possible status values are indicated in Table 25, with the following additional status values that are unique to several system journal replication processes.

The Min, Act, and Max fields for the Retrieve, Send, and Apply processes indicate the minimum, active, and maximum number of jobs for each process. The number of active jobs varies based on the workload. The active count is highlighted with color for the following conditions (an illustrative sketch follows this list):

Red - The number of active jobs is zero (0).

Yellow - The number of active jobs is greater than zero (0) but less than the minimum number of processes.

Turquoise - The process has a backlog that exceeds its configured threshold. When this occurs, the backlog field for the process is also highlighted in the color turquoise.

Blue - The number of active jobs is equal to or greater than the minimum number of processes.
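As noted above, the following sketch illustrates the highlighting of the active job count. It is illustrative only, not MIMIX code; the precedence between the yellow and turquoise conditions is an assumption.

    # Illustrative sketch of the Act (active job count) highlighting.
    def act_highlight(active, minimum, backlog_exceeds_threshold):
        if active == 0:
            return "red"
        if backlog_exceeds_threshold:
            return "turquoise"      # the backlog field is highlighted as well
        if active < minimum:
            return "yellow"
        return "blue"               # active count is at or above the minimum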


Figure 14 and Figure 17 show the active count highlighted.

Figure 14. Data group detail status, object view 1.

Figure 15. Data group detail status, object view 2.

Data Group Object Status System: CHICAGO 17:50:00 Data group . . . . : CRITICALAP Objects in error . . 4 Elapsed time . . . : 00:52:51 Send Process -I *SHARED Jrn Manager -A Receiver Sequence # Date Time Trans/Hour Current . . AUDRCV0108 10,022,314,732 4/22/08 17:37:13 748 Last Read . AUDRCV0103 10,022,175,464 4/21/08 11:05:56 Entries not read: 139,268 Est. time to read: --------------------- Object Retrieve/Container Send ---------------------- Retrievers Retrieve Senders Send Containers Containers Min Act Max Backlog Min Act Max Backlog Sent Per Hour 1 0 5 1 0 5 1,145 ------------------------------- Object Apply ------------------------------- Applies Apply Active Entries Entries Min Act Max Backlog Objects Sequence # Applied Per Hour 1 1 5 4 >0,022,023,871 1,133 F3=Exit F5=Refresh F7=Merged view F8=Database view F9=Automatic refresh F11=View 2 F12=Cancel F24=More keys

Data Group Object Status System: CHICAGO 17:57:31 Data group . . . . : CRITICALAP Objects in error . . 4 Elapsed time . . . : 00:52:51 Send Process -I *SHARED Jrn Manager -A Receiver Sequence # Date Time Trans/Hour Current . . AUDRCV0108 10,022,314,732 4/22/08 17:37:13 748 Last Read . AUDRCV0103 10,022,175,464 4/21/08 11:05:56 Entries not read: 139,268 Est. time to read: --------------------- Object Retrieve/Container Send ---------------------- Retrievers Retrieve Senders Send Containers Containers Min Act Max Backlog Min Act Max Backlog Sent Per Hour 1 0 5 1 0 5 1,145 ------------------------------- Object Apply ------------------------------- Applies Apply ------------- Last Applied ------------- Min Act Max Backlog Sequence # Type Object 1 1 5 0 >0,022,023,871 *DOC BVT#I/PBBDOCXX.002 F3=Exit F5=Refresh F7=Merged view F8=Database view F9=Automatic refresh F11=View 3 F12=Cancel F24=More keys


Figure 16. Data group detail status, object view 3.

Database detailed status views

Figure 17, Figure 18, Figure 19, and Figure 20 show samples of the information available when you use F8 (Database) to view the detailed database information. On each view, you can use the F1 (Help) key to see a description of that view’s contents.

In database views that include sequence numbers, the journal sequence numbers may be truncated if the journal supports *MAXOPT3 for the receiver size and the journal sequence number value exceeds the available display field. When truncation is necessary, the most significant digits (left-most) are omitted. Truncated journal sequence numbers are prefixed by '>'.

Most fields that display status of a process have some or all of the possible values indicated in Table 25. Possible values for the Jrn State and Cache (Src and Tgt) fields are indicated in Table 27 (journal state) and Table 28 (journal cache).

The data group configuration determines whether the Send process field is replaced by the RJ Link field. When remote journaling is configured, the RJ Link and DB Rdr fields are shown.

The AP Maint field is displayed on views 1 and 2 (Figure 17 and Figure 18) only when the access path maintenance1 policy is enabled. When present, this field displays the status of the access path maintenance job that persists while the database apply process is active.

DG Object Journal Entry Detail System: CHICAGO 18:01:20 Data group . . . . : CRITICALAP Source system: LONDON-A Entry Sequence # Receiver Date Time Current entry TSF 10,022,314,732 AUDRCV0108 4/22/08 17:37:13 Last read entry - 10,022,175,464 AUDRCV0103 4/21/08 11:05:56 Last received - Target system: CHICAGO-A ------------------------------- Object Send ------------------------------- Entry Sequence # Date Time Type Object Active TCO >0,022,023,868 4/20/08 13:59:23 *DOC BVT#I/PBBDOCXX.002 Processed TCO >0,022,023,868 4/20/08 13:59:23 *DOC BVT#I/PBBDOCXX.002 ------------------------------- Object Apply ------------------------------- Entry Sequence # Date Time Type Object Processed TCA >0,022,023,871 4/20/08 13:59:23 *DOC BVT#I/PBBDOCXX.002 F3=Exit F5=Refresh F7=Merged view F8=Database view F9=Automatic refresh F11=View 1 F12=Cancel F24=More keys

1. Access path maintenance is available only on installations running 7.1.15.00 or higher.


In the top right corner of database views 1 and 2 (Figure 17 and Figure 18), these fields display combined counts of replicated entries and errors for file entries, IFS tracking entries, and object tracking entries:

• File and Tracking entries

• Not journaled Src Tgt - If the number of not journaled errors on either system exceeds 99,999, that system’s field displays +++++.

• Held due to error

• Access path maint. errors

• Held for other reasons

Database view 4 (Figure 20) separates this information into columns for file entries, IFS tracking entries, and object tracking entries.

If a data group has multiple database apply sessions you will see an entry for each session in the Apply Status column on database views 1, 2, and 3 (Figure 17, Figure 18, and Figure 19). Each session has its own status value. In these sample figures there is only one apply session (A) which is active (-A).

Figure 17. Data group detail status—database view 1. In this example, the Link status of -A and the presence of the Reader status indicate that the data group uses remote journaling and access path maintenance. The display also shows that journal standby state is active and journal caching is not active. The unprocessed entry count indicates that the final journal entry has not been applied. The > character preceding sequence numbers for the apply session indicates truncated sequence numbers that are associated with *MAXOPT3 support.

Data Group Database Status System: CHICAGO 18:07:02 Data group . . . . : CRITICALAP File and Tracking entries : 12 Elapsed time . . . : 00:52:51 Not journaled Src: 1 Tgt: 1 Jrn State and Cache Src: A N Tgt: A N Held due to error . . . . : 1 RJ Link-A AP Maint-A Access path maint. errors : 1 Jrn Mgr-A DB Rdr- A Held for other reasons . : 0 Receiver Sequence # Date Time Trans/Hour Source Jrn. LONDN0002 12,345,678,900,000,002,591 4/20/08 11:02:35 Rj Tgt Jrn. LONDN0002 12,345,678,900,000,002,591 4/20/08 11:02:35 Last Read . LONDN0002 12,345,678,900,000,002,591 4/20/08 11:02:35 Entries not read: 0 Est. time to read: ------------------------------- Database Apply --------------------------- Apply Received Processed Unprocessed Entry Count Est Time Open Status Sequence # Sequence # Entry Count Trans/Hour To Apply Commit A-A >0,000,002,593 >0,000,002,592 1 *NO F3=Exit F5=Refresh F7=Object view F8=Merged view F9=Automatic refresh F11=View 2 F12=Cancel F24=More keys


Figure 18. Data group database status—view 2. In this example, the Link status of A and the presence of the Reader status indicate that the data group uses remote journaling. The display also shows that access path maintenance is used and active, and that journal standby state is active and journal caching is not active.

Figure 19. Data group database status, view 3.

Data Group Database Status System: CHICAGO 16:07:03 Data group . . . . : CRITICALAP File and Tracking entries. : 12 Elapsed time . . . : 00:52:51 Not journaled Src: 1 Tgt: 1 Jrn State and Cache Src: A N Tgt: A N Held due to error . . . . : 1 RJ Link-A AP Maint-A Access path maint. errors : 1 Jrn Mgr-A DB Rdr- A Held for other reasons . : 0 Receiver Sequence # Date Time Trans/Hour Source Jrn. LONDN0002 12,345,678,900,000,002,591 4/20/08 11:02:35 Rj Tgt Jrn. LONDN0002 12,345,678,900,000,002,591 4/20/08 11:02:35 Last Read . LONDN0002 12,345,678,900,000,002,591 4/20/08 11:02:35 Entries not read: 0 Est. time to read: ------------------------------- Database Apply --------------------------- Apply Received Apply point Clock Time Hold MIMIX Log Open Status Sequence # Sequence # Difference Sequence # Commit Id A-A >0,000,002,590 >0,000,002,590 F3=Exit F5=Refresh F7=Object view F8=Merged view F9=Automatic refresh F11=View 3 F12=Cancel F24=More keys

DG Database Jrn Entry Detail System: CHICAGO 18:16:04 Data group . . . . : CRITICALAP Source system: LONDON-A Entry Sequence # Receiver Date Time Current entry UMX 12,345,678,900,000,002,591 LONDN0002 4/20/08 11:02:35 RJ target entry UMX 12,345,678,900,000,002,591 LONDN0002 4/20/08 11:02:35 Last read entry UMX 12,345,678,900,000,002,591 LONDN0002 4/20/08 11:02:35 Last received - 12,345,678,900,000,002,590 - 4/20/08 11:01:04 Target system: CHICAGO-A ------------------------------- Database Apply ----------------------------- Apply Entry Sequence # Date Time Object Library Member A-A UMX >0,000,002,590 4/20/08 11:01:04 F3=Exit F5=Refresh F7=Object view F8=Merged view F9=Automatic refresh F11=View 1 F12=Cancel F24=More keys


Figure 20. Data group detail status—database view 4. In this example, the combined number of file and tracking entries shown in Figure 17 and Figure 18 is separated into separate columns for file entries, IFS tracking entries, and object tracking entries.

Identifying replication processes with backlogs

If replication processes are active and have no reported error conditions, a replication process that has exceeded its backlog threshold will have a status that reflects this condition. However, if a replication process is inactive or has an error condition with a higher priority status, the threshold condition will not be visible in the process status until the process is started or the problem is resolved. Also, a backlog may exist but not be large enough to exceed the threshold setting, or the threshold warning setting may have been disabled (set to *NONE).
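The visibility rules in the preceding paragraph can be summarized in a short sketch. This is illustrative only, not MIMIX code; the status letters follow Table 24 and the parameter names are assumptions.

    # Illustrative sketch: a threshold (T) condition is only visible when no
    # higher-priority condition (inactive, error) applies, the threshold is
    # not disabled (*NONE), and the backlog actually exceeds it.
    def process_status(active, in_error, backlog, threshold):
        if not active:
            return "I"          # inactive hides the threshold condition
        if in_error:
            return "error"      # higher-priority condition shown instead
        if threshold is not None and backlog > threshold:
            return "T"          # threshold reached
        return "A"              # active; any backlog is below the threshold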

Do the following to check for a backlog condition:

1. To access the details for a data group, use the procedure in “Displaying data group detailed status” on page 105.

2. Use F7 or F8 on the Data Group Status display to locate the appropriate view for the process you want to check. Table 26 identifies this information and the appropriate fields for each process.

File and Tracking Entry Status System: CHICAGO 16:07:03 Data group . . . . : CRITICALAP File IFS Trk Obj Trk Entries Entries Entries Number of entries . . . . : 7 3 2 Not journaled on source . : 1 0 0 Not journaled on target . : 0 1 0 Held due to error . . . . : 0 1 1 Access path maint. errors : 1 - - Held for other reasons . .: 0 0 0 F3=Exit F5=Refresh F7=Object view F8=Merged view F9=Automatic refresh F11=View 1 F12=Cancel F24=More keys



Table 26. Location of fields which identify backlogs and threshold conditions for replication processes

RJ Link
Description: The backlog is the quantity of source journal entries that have not been transferred from the local journal on the source system to the remote journal on the target system. The time difference between the last entry in each journal can also be an indication of a backlog.
View: Merged view, Database views 1 and 2
Fields to check for backlog: Differences between journal entries identified by Source Jrn and RJ Tgt Jrn for the database link.
Fields highlighted when threshold exceeded: RJ tgt jrn Sequence # (note 1), RJ tgt jrn Date and Time (note 2)

DB Reader or DB Send
Description: The backlog is the quantity of journal entries that are waiting to be read by the process. The time difference between the last entry that was read by the process and the last entry in the journal on the source system can also be an indication of a backlog. This may be a temporary condition due to maximized log space capacity. If the log space capacity was reached, the database reader job will be idle until the database apply job is able to catch up. If the condition is unable to resolve itself, action may be required.
View: Merged view, Database views 1 and 2
Fields to check for backlog: For remote journaling configurations, differences between journal entries identified by Source Jrn and Last Read. For MIMIX source-send configurations, differences between journal entries identified by Current and Last Read.
Fields highlighted when threshold exceeded: Entries not read Sequence # (note 1), Last Read Date and Time (note 2)

DB Apply
Description: The backlog is the number of entries waiting to be applied to the target system. Each apply session is listed as a separate entry with its own backlog.
View: Database views 1, 2, and 3
Fields to check for backlog: Unprocessed Entry Count
Fields highlighted when threshold exceeded: Apply Status, Unprocessed Entry Count

Object Send
Description: The backlog is the quantity of journal entries that have not been read from the system journal. The time difference between the last entry that was read by the process and the last entry in the system journal can also be an indication of a backlog. Multiple data groups sharing the object send job is one possible cause of a persistent backlog.
View: Merged view, Object views 1, 2, and 3
Fields to check for backlog: Differences between transactions identified for Object Current and Last Read
Fields highlighted when threshold exceeded: Entries not read Sequence # (note 1), Last Read Date and Time (note 2)

Object Retrieve
Description: The backlog is the number of entries for which MIMIX is waiting to retrieve objects.
View: Object views 1 and 2
Fields to check for backlog: Retrieve Backlog
Fields highlighted when threshold exceeded: Retrievers, Act column; Retrieve Backlog

Container Send
Description: The backlog is the number of packaged objects for entries that are waiting to be sent to the target system.
View: Object views 1 and 2
Fields to check for backlog: Container Send Backlog
Fields highlighted when threshold exceeded: Senders, Act column; Container Send Backlog

Object Apply
Description: The backlog is the number of entries waiting to be applied to the target system.
View: Object views 1 and 2
Fields to check for backlog: Apply Backlog
Fields highlighted when threshold exceeded: Applies, Act column; Apply Backlog

Notes:
1. When highlighted, the threshold journal entry quantity criterion is exceeded.
2. When highlighted, the threshold time criterion is exceeded.


Data group status in environments with journal cache or journal state

Additional information is reported within data group status for data groups configured to use MIMIX support for IBM's High Availability Journal Performance (IBM i option 42), the Journal Standby feature, and journal caching. When these high availability journal performance enhancements are in use, conditions that require action or attention are reflected in these locations:

• The data group name is highlighted on the Work with Data Groups display. The possible problems associated with journal cache or journal state are identified in Table 23 in topic “Problems reflected in the Data Group column” on page 101.

• Jrn State and Cache (Src and Tgt) fields within the data group detailed status are highlighted. These fields are on database views 1 and 2 of the Data Group Database Status display (Figure 17 and Figure 18, respectively, shown in “Database detailed status views” on page 112). The possible values for the Jrn State and Cache (Src and Tgt) fields are indicated in Table 27 (journal state) and Table 28 (journal cache).

The Jrn State and Cache (Src and Tgt) fields reflect journal standby state and journal caching actual values for the journals when the IBM high availability performance enhancements are installed on the systems defined to the data group. These fields appear on database views 1 and 2 (Figure 17 and Figure 18). The target journal state and cache values are set on the journal when the database apply session is started.

Journal State - The status values indicate the actual state value for the source and target journals. Table 27 shows the possible values for each field.

Journal Cache - The status values indicate the actual cache value for the source and target journals. Table 28 shows the possible values for each field.

For each system (Src or Tgt) status of the journal state is shown first, followed by the status of the journal cache. If a problem exists with journal state or journal cache, the data group name is also highlighted with the same color. For information about resolving journal cache or journal state problems, see “Resolving a problem with journal cache or journal state” on page 119.

Table 27. Possible status values for Journal State fields

Either system:
  White, U - Unknown. MIMIX was not able to retrieve values, possibly because the journal environment has not yet been built.
  No color, A - Journal state is active.
  No color, X - The required IBM feature, IBM i option 42 - High Availability Journal Performance, is not installed on this system.
  No color, S - Journal is in standby state as expected.

Source:
  Red, S - Source journal is in standby state but that state is not expected.
  Red, I - Source journal is in inactive state but that state is not expected.

Target:
  Yellow, S - Target journal state or cache is not as expected and the database apply session is active.
  Yellow, I - Target journal state is inactive but that state is not expected.

Blank (no status shown) - The IBM feature is installed but the data group is configured to not journal on the target system.

Table 28. Possible status values for Journal Cache fields

Either system:
  White, U - Unknown. MIMIX was not able to retrieve values, possibly because the journal environment has not yet been built.
  No color, X - The required IBM feature, IBM i option 42 - High Availability Journal Performance, is not installed on this system.
  No color, Y - Caching is active.
  No color, N - Caching is not active.

Source:
  Yellow, Y - Source journal cache value is not as expected.
  Yellow, N - Source journal cache value is not as expected.

Target:
  Yellow, Y - Target journal cache value is not as expected and the database apply session is active.
  Yellow, N - Target journal cache value is not as expected and the database apply session is active.

Blank (no status shown) - The IBM feature is installed but the data group is configured to not journal on the target system.

Resolving a problem with journal cache or journal state

Problems with journal state or journal cache can cause the name of a data group to be highlighted on the Work with Data Groups display. If the data group name is highlighted in red or yellow, do the following to check for and resolve problems:

1. From the Work with Data Groups display, use option 8 (Display status).

2. From the Data Group Status display, press F8 (Database).

3. The Jrn State and Cache Src and Tgt fields are located in the upper left corner of the Data Group Database Status display. For each system (Src or Tgt), the status of the journal state is shown first, followed by the status of the journal cache. The example below shows v for value in all four status positions. Based on the status displayed in these fields, you can take the actions described in the following steps to correct the problem:

   Jrn State and Cache  Src: v v   Tgt: v v

4. Source system journal state (first Src: value) - If the source system state is red and the value for the journal state is standby (S) or inactive (I), the journal state must be changed and all data replicated through the user journal must be synchronized. Do the following:

a. Press F12 (Cancel) to return to the Work with Data Groups display. Note which system is specified as the source system for the data group.

b. Use option 45 (Journal Definitions) to view the journal definitions used for the data group in error.

c. On the Work with Journal Definitions display, determine the journal name and library specified for the system that is the source system for the data group.

d. Specify the name and library of the source system journal in the following command: CHGJRN JRN(library/name) JRNSTATE(*ACTIVE)

e. All data replicated through the user journal must be synchronized. For detailed information about synchronizing a data group, refer to your Runbook or to the MIMIX Administrator Reference book.

5. Source system journal cache (second Src: value) - If the source system cache is yellow, the actual status does not match the configured value in the journal definition used on the source system. Do the following:

a. Press F12 (Cancel) to return to the Work with Data Groups display. Note which system is specified as the source system for the data group.

b. Use option 45 (Journal Definitions) to view the journal definitions used for the data group in error.

c. On the Work with Journal Definitions display, use option 5 (Display) next to the journal definition listed for the source system.

d. Check the value of the Journal caching (JRNCACHE) parameter.

e. Determine which value is appropriate for journal cache, the configured value or the actual status value. Once you have determined this, either change the journal definition value or change the journal cache (CHGJRN command) so that the values match.
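A minimal sketch of the second choice, assuming the configured value in the journal definition is *YES and you decide that value is correct; the journal and library names are placeholders for the journal identified in the source system journal definition:

   CHGJRN JRN(library/name) JRNCACHE(*YES)

If instead the actual status is the value you want to keep, change the Journal caching (JRNCACHE) value in the journal definition so that the configured and actual values match.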

6. Target system state (first Tgt: value) or Target system cache (second Tgt: value) - If the target system state or cache is yellow, the actual value for state or cache does not match the configured value. Do the following:

a. Press F12 (Cancel) to return to the Work with Data Groups display. Note which system is specified as the target system for the data group.

b. Use option 45 (Journal Definitions) to view the journal definitions used for the data group in error.

c. On the Work with Journal Definitions display, use option 5 (Display) next to the journal definition listed for the target system.

d. Check the value of the following parameters, as needed:

• Target journal state (TGTSTATE)

• Journal caching (JRNCACHE)

e. Determine why the actual status of the journal state or journal cache does not match the configured value of the journal definition used on the target system.

f. Determine which values are appropriate for journal state and journal cache, the configured value or the actual status value. Once you have determined this, either change the journal definition value or change the journal state or cache (CHGJRN command) so that the values match.
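A minimal sketch of changing the journal to match the configuration, assuming the journal definition specifies *STANDBY for Target journal state and *YES for Journal caching and you decide those configured values are correct; the journal and library names are placeholders for the journal identified in the target system journal definition:

   CHGJRN JRN(library/name) JRNSTATE(*STANDBY)
   CHGJRN JRN(library/name) JRNCACHE(*YES)

If instead the actual status values are correct, change the Target journal state (TGTSTATE) or Journal caching (JRNCACHE) values in the journal definition.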


CHAPTER 7 Working with audits

Audits are defined by and invoked through rules and influenced by policies. Aspects of audits include schedules, status, reported results, and their compliance status.

MIMIX is shipped so that auditing can occur automatically. For day-to-day operations, auditing requires minimal interaction to monitor audit status and results. MIMIX user interfaces separate audit runtime status, compliance status, and scheduling information onto different views to simplify working with audits. Compliance errors and runtime errors require different actions to correct problems.

This chapter provides information and procedures to support day-to-day operations as well as to change aspects of the auditing environment. The following topics are included.

• “Auditing overview” on page 122 describes concepts associated with auditing and describes the differences between automatic priority audits and automatic scheduled audits.

• “Guidelines and considerations for auditing” on page 126 identifies considerations for specific audits, auditing best practices, and recommendations for checking the audit results.

• “Displaying audit runtime status” on page 129 identifies the Audit Summary interfaces and provides procedures for common activities with audits, such as running audits immediately and resolving reported problems.

• “Displaying audit history” on page 137 describes how to display history for specific audits of a data group.

• “Working with audited objects” on page 139 describes how to display a list of objects compared by one or more audits.

• “Working with audited object history” on page 142 describes how to access the audit history for a specific object.

• “Displaying audit compliance” on page 144 identifies the Audit Compliance interfaces and describes how to determine if an audit has a compliance problem.

• “Displaying scheduling information for automatic audits” on page 147 describes how to access the Audit Schedule interfaces, how to display when prioritized audits will run, and how to display when scheduled audits will run.

Auditing overview

All businesses run under rules and guidelines that may vary in the degree and in the methods by which they are enforced. In a MIMIX environment, auditing provides rules and enforcement of practices that help maintain availability and switch-readiness at all times.

Not auditing, or limiting the use of audits, does little to confirm the integrity of your data. These approaches can mean lost time and issues with data integrity when you can least afford them.

In reality, successful auditing means finding the right balance somewhere between these approaches:

• Audit your entire replication environment every day. The benefit of this approach is knowing that your data integrity exposure is limited to data that changed since the last audit. The trade-off with this approach can be time and resources to perform audits.

• Audit only replicated data that “needs” auditing. This approach can be faster and use fewer resources because each audit typically has fewer objects to check. The trade-offs are determining what needs auditing and knowing when objects were last audited.

MIMIX makes auditing easy by automatically auditing all objects periodically and auditing a subset of objects every day. MIMIX also provides the ability to fine-tune aspects of auditing behavior and their automatic submission and the ability to manually invoke an audit at any time.

Components of an audit

Together, three components identify a unique audit. Each component must exist to allow an audit to run.

Rule - A program by which an audit is defined and invoked. Each rule shipped with MIMIX pre-defines a compare command to be invoked and the possible actions that can be initiated, if needed, to correct detected problems. When invoked, each rule can check only the class of objects associated with its compare command. Names of rules shipped with MIMIX begin with the pound sign (#) character.

Data group - A data group provides the context of what to check and how results are reported. Multiple audits (rules) exist for each data group.

Note: Audits are not allowed to run against disabled data groups.

Schedule - Each unique combination of audit rule and data group has its own schedule, by which it is automatically submitted to run. MIMIX ships default scheduling information associated with each shipped rule. Scheduling can be adjusted for individual audits through policies. A manually invoked audit can be thought of as an immediate override of scheduling information.

Although people use the terms “audit” and “rule” interchangeably, a rule is a component of an audit. The process of auditing runs a rule program.


Phases of audit processing

The process of auditing consists of a compare phase and a recovery phase.

In the compare phase of an audit, the identified audit rule initiates a specific compare command against the data group. The Audit level policy determines if an audit is allowed to run and how aggressively an audit checks your environment during its compare phase. If a shipped audit rule provides more than one audit level, each level provides increasingly more checking capability.

If there are detected differences when the compare phase completes, the audit enters its recovery phase to start automatic recovery actions as needed. MIMIX attempts to correct the differences and sends generated reports, called recoveries, to the user interface. MIMIX removes these generated reports when the recovery action completes successfully. If the recovery job fails to correct the problem, MIMIX removes the recovery and sends an error notification to the user interface.

Most audit rules support a recovery phase. MIMIX is shipped with defaults that enable audits to enter the recovery phase automatically when needed. The recovery phase can be optionally disabled in the Automatic audit recovery policy.

Object selection methods for automatic audits

MIMIX provides two approaches to performing audits automatically. The biggest difference between these approaches is how objects are selected to be audited. The other significant difference is when each type of audit is allowed to run.

• In scheduled object auditing, an audit run selects all objects that are configured for the data group and within the class of objects checked by the audit. MIMIX automatically runs an audit according to its specified scheduling criteria. Each time a scheduled audit runs, all eligible configured objects are selected.

• In prioritized object auditing, an audit run selects replicated objects according to their internally assigned priority category and an auditing frequency assigned to the category. The result is often a subset of the objects replicated by the data group. Each time a prioritized audit runs, its subset of objects selected to check may be unique. MIMIX automatically runs a prioritized audit periodically within its specified time range every day. It may run approximately once per hour or more often during its time range.

An audit that is manually invoked from the Work with Audits display in a 5250 emulator is an immediate run of a scheduled audit. Priority audits cannot be manually invoked from this display. From Vision Solutions Portal, you have the ability to perform an immediate run of either method of auditing.

Prioritized auditing can reduce the impact of auditing on resources and performance. This benefits customers who cannot complete IFS audits, cannot audit every day, or do not audit at all because of time or resource issues.

When both types of auditing are used, you can achieve a balance between verifying data integrity and resources. Either or both types of automatic auditing can be disabled, although that is not recommended.


How priority auditing determines what objects to select

MIMIX determines the auditing priority of each replicated object based on its most recent change, most recent audit, and the frequency specified for auditing priority categories. At any time, every replicated object falls within one of several predetermined categories. Objects in each category are eligible for selection according to the frequency assigned to their category.

Each prioritized audit runs approximately once per hour, or more often, every day during its time range specified in the Priority audit policy. Each time the audit starts, it selects only the objects eligible in each category.

Initially, the objects selected by a prioritized audit may be nearly the same as those selected by a scheduled audit. However, over time the number of objects selected by a prioritized audit stabilizes to a subset of those selected by a scheduled audit.

When both scheduled and priority audits are allowed for the same rule and data group, MIMIX may not start a prioritized audit if the scheduled audit will start in the near future.

Table 29. Priority auditing categories

Objects not equal - Objects that had any value other than equal (*EQ) in their most recent audit. This includes objects for which a detected difference was automatically resolved. Objects in this category have the highest priority and are always selected.

New objects - A new object is one that has not been audited since it was created.

Changed objects - A changed object is one that has been modified since the last time it was audited.

Unchanged objects - An unchanged object is one that has not been modified since the last time it was audited.

Audited with no differences - An object with no differences is one that has not been modified since the last time it was audited and has been successfully audited with no changes on at least three consecutive audit runs. Objects remain in this category until a change occurs.

Objects in the new, changed, unchanged, and audited with no differences categories are eligible for selection according to the category frequency specified in the Priority audit policy.

Note: The #FILDTA audit always selects all members of a file for which auditing is less than 100 percent complete. This occurs in all of the above object selection categories.

How audits are submitted automatically

When MIMIX is started (STRMMX command), all system-level processes necessary for replication and auditing are started, including the master monitor. On each system, the master monitor starts job scheduling activities for auditing. This ensures that audits are submitted automatically according to the policies in effect for when to run priority audits and scheduled audits.

The time specified in policies is local to each system. At the appropriate time for each audit, a job is initiated on each system in the data group. MIMIX uses the Run rule on system policy to determine where the audit should run and immediately ends the audit job if it is not on the appropriate system.

For a scheduled audit, the Audit schedule policy determines the time and frequency of when the audit runs. A scheduled audit can be set to run on specific dates or days of the week, or on relative days of the month.

For a prioritized audit, the Priority audit policy determines the range of time during which the audit can start each day. A prioritized audit can run multiple times during the specified range, approximately once per hour or more often.

If you start replication through procedures or processes that invoke the Start Data Group (STRDG) command, you also need to ensure that the master monitor is started on all systems in your installation (STRMSTMON command) so that automatic auditing can occur.
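For example, a startup procedure that does not use STRMMX might issue commands of this general form; the data group name is a placeholder and the exact parameters accepted by STRDG and STRMSTMON in your installation may differ:

   STRMSTMON
   STRDG DGDFN(name)

Starting the master monitor ensures that the job scheduling activities needed for automatic auditing are also started.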

Audit status and results

When audits complete or end in error, their status is reported in the audit summary user interfaces. In a 5250 emulator, this is on the Work with Audits display (WRKAUD command). In Vision Solutions Portal, this is the Audits portlet. A summary of all audit status also “bubbles up” to the level of data group interfaces.

The information available about each audit identifies the status of actions performed by its rule, how the audit selected objects for comparison, the audit’s compliance status, policy values which affect the actions of each phase, and scheduling information. When a phase completes, its timestamps and statistics are also available.

When audit recoveries are enabled, you can use the Notification severity policy to control the severity level of the notifications that are returned when the rule ends in error.

You can also view job logs associated with notifications and recoveries. Job logs are accessible from the system on which the audit comparison or recovery job ran.

Audit compliance

Compliance is an indication of whether an audit ran within the time frame of the compliance thresholds set in auditing policies.

For audits configured for scheduled object auditing or both scheduled and prioritized object auditing, compliance status is based on the last run of a scheduled audit or a user-invoked audit. For audits configured for only prioritized object auditing, compliance status is based on the last run, which may have been a prioritized audit or a user-invoked audit. A user-invoked audit or a scheduled audit checked all objects that are configured for the data group and within the class of objects checked by the audit whereas a prioritized audit may have checked only a subset of those objects.


Guidelines and considerations for auditing

Auditing is most effective when it is performed regularly and you take action to investigate and resolve any reported differences that cannot be automatically corrected.

Auditing best practices

Regular auditing helps you detect problems in a timely manner and can help you to address detected problems during normal operations instead of during a crisis. Policy values for auditing are shipped with defaults set to values that Vision Solutions recommends as best practice. New data groups and new installations will automatically use these policy values. If you determine that default policy values do not meet your auditing needs, you can customize the policy settings. Auditing best practices include:

Automatically auditing: MIMIX is shipped so that auditing occurs automatically.

• Allow both priority audits and scheduled audits to run automatically. This provides a balance between checking all objects periodically and checking a subset of objects every day. You can adjust the Priority audit and Audit schedule policies that control when each type of audit is automatically submitted to meet the needs of your environment.

• Allow audits to perform the most extensive comparison possible. The shipped value (level 30) for the Audit level policy enables this. If you choose to run audits at a lower audit level, be aware of the risks, especially when switching.

• Allow audits to perform automatic recovery actions. This provides automatic correction of detected problems. Recovery is possible when the Automatic audit recovery policy is enabled.

• Allow MIMIX to run all audits even if you do not replicate certain object types (such as DLOs). This ensures that if you add new objects in the future, you will be automatically auditing them. Audits that do not have any objects to check complete quickly with little use of system resources.

Manually auditing: In addition, manually invoke audits in these conditions:

• Before switching, run all audits at audit level 30.

• If you make configuration changes, run the #DGFE audit to check actual configuration data against what is defined to your configuration.

Where to run audits: Run audits from a management system. For most environments, the management system is also the target system. If you cannot run rules from the management system due to physical constraints or because of complex configurations, you can change the Run rule on system policy to meet your needs.


Considerations for specific audits

#DGFE audit - This audit is not eligible for prioritized auditing because it checks configuration data, not objects. As a result, configuration problems for a data group can only be detected when a scheduled audit or a manually invoked audit runs.

Run the #DGFE audit during periods of minimal MIMIX activity to ensure that replication is caught up and that added or deleted objects are reflected correctly in the journal. If the audit is run during peak activity, its results may contain errors or indicate that files are in transition.

In addition to regularly scheduled audits, check your configuration using the #DGFE audits for your data groups whenever you make configuration changes, such as adding an application or creating a library. Running the audit prior to audits that compare attributes ensures that those audits will compare the objects and attributes you expect to be present in your environment.

#DLOATR audit - This audit supports multiple levels of comparisons. The level used is controlled by the value of the Audit level policy in effect when the audit runs. The #DLOATR audit compares attributes as well as data for objects defined to a data group when audit level 20 or 30 is used. Audit level 10 compares only attributes. When data is compared, the audit may take longer to run and may affect performance.

#FILDTA audit - This audit supports multiple levels of comparisons. The level used is controlled by the value of the Audit level policy in effect when the audit runs. The #FILDTA audit compares all data for file members defined to a data group only when audit level 30 is used. Level 10 and level 20 compare 5 percent and 20 percent of data, respectively. Lower audit levels may take days or weeks to completely audit file data. New files created during that time may not be audited. Regardless of the audit level you use for regular auditing, Vision Solutions strongly recommends running a level 30 audit before switching.

#IFSATR audit - This audit supports multiple levels of comparisons. The level used is controlled by the value of the Audit level policy in effect when the audit runs. The #IFSATR audit compares data when audit level 20 or 30 is used. At level 10, only attributes are compared. Regardless of the audit level you use for regular auditing, Vision Solutions strongly recommends running a level 30 audit before switching.

#MBRRCDCNT audit - This audit compares the number of current records (*CURRDS) and the number of deleted records (*NBRDLTRCDS) for physical files that are defined to an active data group. Equal record counts suggest but do not guarantee that files are synchronized.

The #MBRRCDCNT audit does not have a recovery phase. Differences detected by this audit appear as not recovered in the Audit Summary.

In some environments using commitment control, the #MBRRCDCNT audit may be long-running. Refer to the MIMIX Administrator Reference book for information about improving the performance of this audit.

Recommendations when checking audit results

Consider these recommendations when you check results of audits:

• Always review the results of the audits. Audit results reflect only what was actually compared. Some objects may not have been compared due to object activity or due to the audit level policy value in effect, even when no differences (*NODIFF) are reported. You may need to take actions other than running an audit to correct detected issues. For example, you may need to change a procedure so that target system objects are only updated by replication processes.

• Be aware of priority auditing behavior. Priority audits differ from other audits in how they select objects to audit and in the number of objects selected. Be aware of the implications of those differences when checking audit results. Priority audits select replicated objects based on their auditing eligibility. As a result, priority audits cannot check newly created source objects until after their create transactions have been replicated. Priority audits can return results indicating that zero (0) objects were selected. This occurs when no objects were eligible for selection by an audit.

• Deleted objects reported as not found. Audits can report not found conditions for objects that have been deleted. A not found condition is reported when a delete transaction is in progress for an object eligible for selection when the audit runs. This is more likely to occur when there are replication errors or backlogs at the time the audit runs.

• Fixing one error may expose another. It may take multiple iterations of running audits with recoveries before the results are clean. Recovering from one error may result in a different error surfacing the next time the audit is performed. For example, a recovery that adds data group file entries may result in detecting a database relationship difference (*DBRIND) error the next time the audit is performed, where the root problem is that a library of logical files is not identified for replication.

• Watch for trends in the audit results. Trends may indicate situations that need further investigation. For example, objects that are being recovered for the same reason every time you run an audit can be an indication that something in your environment is affecting the objects between audits. In this case, investigating the environment for the cause may determine that a change is needed in the environment, in the MIMIX configuration, or in both. Trends may also indicate a MIMIX problem, such as reporting an object as being recovered when it was not. Report these scenarios to MIMIX CustomerCare. You can do this by creating a new case using the Case Management page in Support Central.


Displaying audit runtime status

The audit summary view of the Work with Audits display shows audit runtime status in the Audit Status column. F11 toggles between variations of audit summary views.

Do the following:

1. Do one of the following to access the Summary view of the Work with Audits display:

• From the MIMIX Intermediate Main Menu, select option 6 (Work with audits) and press Enter. Then use F10 as needed to access the Audit summary view.

• Enter the command: installation-library/WRKAUD VIEW(*AUDSTS)

2. The Work with Audits display appears. If audit compliance problems exist, you may see a different view of the Work with Audits display. Use F10 to access the Summary view.

3. Check the value shown in the Audit Status column. Press F1 (Help) for a description of status values.

4. To view additional information about an audit, use option 5 (Display).

On the summary view of the Work with Audits display, audits are sorted and displayed so that the highest severity item is at the top of the list.

In addition to audit runtime status, the initial summary view (Figure 21) also includes the full name of the data group and the following information:

The Object Diff column identifies the number of audited objects with differences remaining after the audit completed.

The Objects Selected column indicates how objects were selected for auditing in the most recent run of the audit.

Figure 21. Audit Summary view - data group definition columns

Note: Audit runtime status and compliance status values are prioritized and are also “bubbled up” to the next higher level in the user interface, which is the installation. In a 5250 emulator, audit status is included in the summarized replication status displayed on the Work with Application Groups display. The Work with Data Groups display provides an indication of the number of audits that require action or attention.

                              Work with Audits                     System: AS01
 Type options, press Enter.
   5=Display   6=Print   7=History   8=Recoveries   9=Run rule   10=End
   14=Audited objects   46=Mark recovered   ...

      Audit                 Audit    ---------Definition---------  Object  Objects
 Opt  Status    Rule        DG Name  System 1  System 2            Diff    Selected
 __   *NOTRUN   #OBJATR     EMP      AS01      AS02                0       *PTY
 __   *CMPACT   #DLOATR     EMP      AS01      AS02                0       *PTY
 __   *CMPACT   #FILATR     EMP      AS01      AS02                0       *PTY
 __   *CMPACT   #FILATRMBR  EMP      AS01      AS02                0       *PTY
 __   *RCYACT   #FILDTA     EMP      AS01      AS02                0       *PTY
 __   *QUEUED   #IFSATR     EMP      AS01      AS02                0       *PTY
 __   *QUEUED   #MBRRCDCNT  EMP      AS01      AS02                0       *PTY
 __   *NODIFF   #DGFE       EMP      AS01      AS02                0       *ALL
                                                                          Bottom
 Parameters or command
 ===> _________________________________________________________________________
 F3=Exit   F4=Prompt   F5=Refresh   F10=Compliance summary   F11=Last run
 F14=Audited objects   F16=Inst. policies   F23=More options   F24=More keys


The additional view of audit summary information (Figure 22) displays the policies in effect when the audit was last run.

Figure 22. Audit Summary view - last run columns.

                              Work with Audits                     System: AS01
 Type options, press Enter.
   5=Display   6=Print   7=History   8=Recoveries   9=Run rule   10=End
   14=Audited objects   46=Mark recovered   ...

      Audit                 Audit    ------Last Run-------  Object
 Opt  Status    Rule        DG Name  Recovery   Level       Diff
 __   *NOTRUN   #OBJATR     EMP      *ENABLED   *LEVEL30    0
 __   *CMPACT   #DLOATR     EMP      *ENABLED   *LEVEL30    0
 __   *CMPACT   #FILATR     EMP      *ENABLED   *LEVEL30    0
 __   *CMPACT   #FILATRMBR  EMP      *ENABLED   *LEVEL30    0
 __   *RCYACT   #FILDTA     EMP      *ENABLED   *LEVEL30    0
 __   *QUEUED   #IFSATR     EMP      *ENABLED   *LEVEL30    0
 __   *QUEUED   #MBRRCDCNT  EMP      *ENABLED   *LEVEL30    0
 __   *NODIFF   #DGFE       EMP      *ENABLED   *LEVEL30    0
                                                                          Bottom
 Parameters or command
 ===> _________________________________________________________________________
 F3=Exit   F4=Prompt   F5=Refresh   F10=Compliance summary   F11=Last run
 F14=Audited objects   F16=Inst. policies   F23=More options   F24=More keys

The Last Run columns show the values of policies in effect at the time the audit was last run through its compare phase.

Recovery identifies the value of the automatic audit recovery policy. When this policy is enabled, after the comparison completes, MIMIX automatically starts recovery actions to correct differences detected by the audit. Recovery may also indicate a value of *DISABLED if a condition checked by the Action for running audits (RUNAUDIT) policy existed and the policy value for that condition specified *CMP, preventing audit recoveries from running.

Level identifies the value of the audit level policy. The audit level determines the level of checking performed during the compare phase of the audit. If an audit was never run, the value *NONE is displayed in both columns.

Running an audit immediately

You always have the option of running an audit immediately. You can do this by running the MIMIX rule associated with the audit. From a 5250 emulator, audits invoked in this manner always select all replication-eligible objects associated with the class of object for the audit. When running an audit immediately from Vision Solutions Portal, you have the ability to select whether the audit will select all replication-eligible objects or only prioritized objects.

In most cases, you want to run the audit from the management system. Policies determine whether a request to run an audit can be performed on the requesting system.

Most users should perform this procedure from the management system.



To run a rule immediately, do the following:

1. From the MIMIX Intermediate Main Menu, select option 6 (Work with audits) and press Enter.

2. Type option 9 (Run rule) next to the audit you want and press Enter.

Note: Audits are not allowed to run against disabled data groups.
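If you prefer a command interface over option 9, MIMIX rules can also be run from a command line. The following is a sketch only: the command name RUNRULE and the RULE and DGDFN parameter keywords are assumptions used for illustration, and the rule and data group names are placeholders, so prompt the command (F4) in your installation to confirm the actual syntax:

   installation-library/RUNRULE RULE(#FILATR) DGDFN(name)

As with option 9, the request is subject to the Run rule on system policy and cannot be run against a disabled data group.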

For more information, see “Resolving audit problems” on page 133.


Resolving audit problems

When viewing results of audits, the starting point is the Summary view of the Work with Audits display. You may also need to view the output file or the job log, which are only available from the system where the audits ran. In most cases, this is the management system.

Do the following from the management system:

1. Do one of the following to access the Work with Audits display.

• From the MIMIX Intermediate Main Menu, select option 6 (Work with audits) and press Enter. Then use F10 as needed to access the Audit summary view.

• From a command line, enter WRKAUD VIEW(*AUDSTS)

2. Check the Audit Status column for values shown in Table 30. Audits with potential problems are at the top of the list. Take the action indicated in Table 30.

Table 30. Addressing audit problems

Status Action

*FAILED If the failed audit selected objects by priority and its timeframe for starting has not passed, the audit will automatically attempt to run again.

The audit failed for these possible reasons.

Reason 1: The rule called by the audit failed or ended abnormally.

• To run the rule for the audit again, select option 9 (Run rule). This will check all objects regardless of how the failed audit selected objects to audit.

• To check the job log, see “Checking the job log of an audit” on page 135.

Reason 2: The #FILDTA audit or the #MBRRCDCNT audit required replication processes that were not active.

1. From the command line, type WRKDG and press Enter.

• If all processes for the data group are active, skip to Step 2.

• If processes for the data group show a red I, L, or P in the Source and Target columns, use option 9 (Start DG).

2. When the data group is active, return to the Work with Audits display and use option 9 (Run rule) to run the audit. This will check all objects regardless of how the failed audit selected objects to audit.

3. If the audit fails again, check the job log using “Checking the job log of an audit” on page 135.


For more information about the values displayed in the audit results, see “Interpreting audit results - supporting information” on page 299.

*DIFFNORCY The comparison performed by the audit detected differences. No recovery actions were attempted because of a policy in effect when the audit ran. Either the Automatic audit recovery policy is disabled or the Action for running audits policy prevented recovery actions while the data group was inactive or had a replication process which exceeded its threshold.

If policy values were not changed since the audit ran, checking the current settings will indicate which policy was the cause. Use option 36 to check data group level policies and F16 to check installation level policies.

• If the Automatic audit recovery policy was disabled, the differences must be manually resolved.

• If the Action for running audits policy was the cause, either manually resolve the differences or correct any problems with the data group status. You may need to start the data group and wait for threshold conditions to clear. Then run the audit again.

To manually resolve differences do the following:

1. Type 7 (History) next to the audit with *DIFFNORCY status and press Enter.

2. The Work with Audit History display appears with the most recent run of the audit at the top of the list. Type 8 (Display difference details) next to an audit to see its results in the output file.

3. Check the Difference Indicator column. All differences shown for an audit with *DIFFNORCY status need to be manually resolved. For more information about the possible values, see “Interpreting audit results - supporting information” on page 299.

To have MIMIX always attempt to recover differences on subsequent audits, change the value of the automatic audit recovery policy.

*NOTRCVD The comparison performed by the audit detected differences. Some of the differences were not automatically recovered. The remaining detected differences must be manually resolved.

Note: For audits using the #MBRRCDCNT rule, automatic recovery is not possible. Other audits, such as #FILDTA, may correct the detected differences.

Do the following:

1. Type 7 (History) next to the audit with *NOTRCVD status and press Enter.

2. The Work with Audit History display appears with the most recent run of the audit at the top of the list. Type 8 (Display difference details) next to an audit to see its results in the output file.

3. Check the Difference Indicator column. Any differences with values other than *RECOVERED must be manually resolved. For more information about the possible values, see “Interpreting audit results - supporting information” on page 299.

*NOTRUN The audit was prevented from running by the Action for running audits policy. Either the data group was inactive or a replication process exceeded its threshold. This may be expected during periods of peak activity or when data group processes have been ended intentionally. However, if the audit is frequently not run due to this policy, action may be needed to resolve the cause of the problem.



Checking the job log of an audit

An audit’s job log can provide more information about why an audit failed. If it still exists, the job log is available on the system where the audit ran. Typically, this is the management system.

You must display the notifications from an audit in order to view the job log. Do the following:

1. From the Work with Audits display, type 7 (History) next to the audit and press Enter.

2. The Work with Audit History display appears with the most recent run of the audit at the top of the list.

3. Use option 12 (Display job) next to the audit you want and press Enter.

4. The Display Job menu opens. Select option 4 (Display spooled files). Then use option 5 (Display) from the Display Job Spooled Files display.

5. Look for messages from the job log for the audit in question. Usually the most recent messages are at the bottom of the display.

Message LVE3197 is issued when errors remain after an audit completed.

Message LVE3358 is issued when an audit failed. Check for the following messages in the job log that indicate a communications problem (LVE3D5E, LVE3D5F, or LVE3D60) or a problem with data group status (LVI3D5E, LVI3D5F, or LVI3D60).
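If the audit job has already ended, its job log may also be available through the base operating system. A minimal sketch, assuming the job is identified by a fully qualified job name; the number, user, and job name shown are placeholders:

   DSPJOBLOG JOB(123456/username/jobname)
   WRKSPLF SELECT(username)

The first command displays the job log for a specific job; the second lists spooled files (including spooled job logs) for a user profile.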


Ending audits

Only active or queued audits can be ended. This includes audits with the following statuses: Currently comparing (*CMPACT), Currently recovering (*RCYACT), or Currently waiting to run (*QUEUED).

You must end active or queued audits from the system that originated the audit. You can end active or queued audits from any view of the Work with Audits display. This procedure uses the Status view.

To end an active or queued audit, do the following:

1. From the MIMIX Intermediate Main Menu, select option 6 (Work with audits) and press Enter. Then use F10 as needed to access the Audit summary view.

2. Check the value shown in the Audit Status column. Press F1 (Help) for a description of status values.

3. Type option 10 (End) next to the active or queued audit you want to end and press Enter.

4. Audits in *CMPACT or *QUEUED status are set back to their previous status values. Audits in *RCYACT status are set according to the completed comparison result as well as the results of any completed recovery actions.

Displaying audit history

The Work with Audit History display lists the available history for completed runs of a specific combination of audit rule and data group. Each item listed is a history of a completed audit run, shown in reverse chronological order so that the completed audit with the most recent start time is at the top of the list. Audits that are new or that have an active status are not included in this list.

Do the following to access retained history for a specific audit and data group combination:

1. From the MIMIX Intermediate Main Menu, type 6 (Work with audits) and press Enter.

2. From the Work with Audits display, type 7 (History) next to the audit and data group you want and press Enter.

The amount of history information available is determined by how frequently an audit runs and the settings of the Audit history retention policy. Having retained audit history enables you to look for trends across multiple runs of an audit that may be an indication of a configuration problem or some other issue with an object. For example, the Work with Audit History display makes it easy to notice that a particular audit of one data group always has a similar number of recovered objects or always has differences that cannot be recovered automatically.

The initial view (Figure 23) shows the final audit status and recovery phase statistics.

Figure 23. Work with Audit History - view of recovery results.

F11 toggles between this view and additional views.

                            Work with Audit History                SYSTEM: AS01
 Audit rule . . . . . . :   #FILATR
 Data group definition . :  EMP        AS01       AS02
 Type options, press Enter.
   5=Display   6=Print   8=View difference details   12=Display job
   14=Audited objects   46=Mark recovered

                                       ------------------Objects-----------------
                          Audit        Total     Not       Not
 Opt  Compare Start       Status       Selected  Compared  Recovered   Recovered
 __   01/02/10 15:25:31   *NODIFF      91        0         0           0
 __   12/31/09 09:06:04   *NODIFF      0         0         0           0
 __   12/30/09 08:50:29   *AUTORCVD    4         0         0           3
                                                                          BOTTOM
 Parameters or command
 ===> _________________________________________________________________________
 F3=Exit   F4=Prompt   F5=Refresh   F9=Retrieve   F11=Summary results
 F12=Cancel   F13=Repeat   F14=Audited objects   F21=Print list


The summary results view (Figure 24) shows the total number of objects selected by the audit and whether the objects selected were the result of a priority audit or a scheduled audit.

Figure 24. Work with Audit History - view of summary results.

                            Work with Audit History                SYSTEM: AS01
 Audit rule . . . . . . :   #FILATR
 Data group definition . :  EMP        AS01       AS02
 Type options, press Enter.
   5=Display   6=Print   8=View difference details   12=Display job
   14=Audited objects   46=Mark recovered

                          Audit        Total     Objects
 Opt  Compare Start       Status       Selected  Selected
 __   01/02/10 15:25:31   *NODIFF      91        *ALL
 __   12/31/09 09:06:04   *NODIFF      0         *PTY
 __   12/30/09 08:50:29   *AUTORCVD    4         *PTY
                                                                          BOTTOM
 Parameters or command
 ===> _________________________________________________________________________
 F3=Exit   F4=Prompt   F5=Refresh   F9=Retrieve   F11=Compare results
 F12=Cancel   F13=Repeat   F14=Audited objects   F21=Print list

The compare results view (Figure 25) shows the duration of the audit as well as statistics for the compare phase of the audit.

Figure 25. Work with Audit History - view of compare results.


                            Work with Audit History                SYSTEM: AS01
 Audit rule . . . . . . :   #FILATR
 Data group definition . :  EMP        AS01       AS02
 Type options, press Enter.
   5=Display   6=Print   8=View difference details   12=Display job
   14=Audited objects   46=Mark recovered

                                                  ------------Objects-------------
                          Audit        Audit                Not        Detected
 Opt  Compare Start       Status       Duration   Compared  Compared   Not Equal
 __   01/02/10 15:25:31   *NODIFF      00:00:04   91        0          0
 __   12/31/09 09:06:04   *NODIFF      00:00:01   0         0          0
 __   12/30/09 08:50:29   *AUTORCVD    00:00:01   4         0          3
                                                                          BOTTOM
 Parameters or command
 ===> _________________________________________________________________________
 F3=Exit   F4=Prompt   F5=Refresh   F9=Retrieve   F11=Recovery results
 F12=Cancel   F13=Repeat   F14=Audited objects   F21=Print list


When viewing the Work with Audit History display from the system on which the audit request originated, you can use options to view the object difference details detected by the audit (option 8), the job log for the audit (option 9), and a list of objects that were audited (option 14).

Audits with no selected objects

On the Work with Audit History display, it is possible to see repeated audit runs that have zero (0) objects selected during the time frame that prioritized audits are allowed to run each day. Zero objects selected means that no objects matched the selection criteria and category frequencies in effect at the time the prioritized audit ran.

Consider this example of how prioritized audits operate. Audit #FILATR is set to run priority audits using its shipped default values for priority auditing. This means the audit will run approximately once per hour between 3 and 8 a.m. every day. Each audit run will select the following:

• Any replicated objects that were not equal in their last audit.

• Any new replicated objects that had never been audited.

• Any replicated objects that changed in the past 24 hours.

• Any replicated objects that did not change since they were audited a week ago.

• Any replicated objects that did not change since their last audit a month (30 days) ago and have a history of repeated consecutive successful audits.

For the first run (between 3 and 4 a.m.) of a normal work day, it is likely the audit selected objects in the new and changed in the past day categories, and may have selected some objects in other categories as well. The second run is likely to have selected fewer objects, and may have selected only objects that had differences from the earlier run. If those differences were resolved, then the subsequent runs that day are likely to have selected no objects because none were eligible. While such a daily pattern may repeat, it is also subject to replication and other auditing activity within your environment.

Working with audited objects

The Work with Audited Objects display shows a list of objects compared by one or more audits. This information is available only on the originating system for audits performed when the Audit history retention (AUDHST) policy in effect specified to keep details relevant to the type of audit and those audits have not exceeded the current policy's retention criteria.

The list of objects is sorted by severity of their final audit status (the status after comparisons and any recovery actions complete), with the most severe status first. Because this display lists audited object history, the #DGFE rule, which compares configuration data, is not included.

When the objects listed are for only one audit, the display appears as shown in Figure 26. This layout is used when the display is invoked by option 14 (Audited objects) on the Work with Audits display or the Work with Audit History display. Note that the Audit start field is located at the top of the display in this case. If the selected audit is the audit run with the latest start date, (*LAST) will also appear in the Audit start field.

Figure 26. Work with Audited Objects display for a single audit.

                           Work with Audited Objects               SYSTEM: AS01
 Data group:   EMP        AS01       AS02
 Audit rule:   #FILATR
 Audit start:  06/17/09 15:01:34  (*LAST)
 Type options, press Enter.
   5=Display   6=Print   9=Object history

      Audited   Object
 Opt  Status    Type     Name
 _    *NE       *FILE    L00SAMPLEA/RJFILE1
 _    *EQ       *FILE    L00SAMPLEA/RJFILE2
 _    *EQ       *FILE    L00SAMPLEA/RJFILE3
 _    *RCVD     *FILE    L00SAMPLEA/RJFILE4
                                                                          BOTTOM
 Parameters or command
 ===> _________________________________________________________________________
 F3=Exit   F4=Prompt   F5=Refresh   F9=Retrieve   F12=Cancel   F13=Repeat
 F18=Subset   F21=Print list   F22=Display entire field

When the list includes objects from multiple audits, the display appears as shown in Figure 27, with the specific audit rule and start time displayed in columns. A > symbol next to an object name indicates that a long object path name exists, which can be viewed with F22.

File member information is not automatically displayed. However, you can use F18 to change subsetting criteria to include members. When member information is displayed, the name is in the format: library/file(member). Also, the information displayed for file members may not be from the most recently performed audit. Because members can be compared by several audits, the most recent run of each of those audits is evaluated. The evaluated audit run with the most severe status is displayed, even if it is not the most recently performed audit of the evaluated audit runs. For all other objects, the information displayed is from the most recent audit run that compared the object.

Figure 27. Work with Audited Objects display with all audits displayed.

                           Work with Audited Objects               System: AS01
 Data group:   EMP        AS01       AS02
 Audit rule:   *ALL
 Type options, press Enter.
   5=Display   6=Print   9=Object history

      Audited   Object                            -----------Audit------------
 Opt  Status    Type      Name                    Rule      Date      Time
 _    *NE       *DTAARA   L00SAMPLEA/AJDTAARA1    #OBJATR   12/11/09  09:40:27
 _    *NE       *DTAARA   L00SAMPLEA/AJDTAARA2    #OBJATR   12/11/09  09:40:27
 _    *NE       *STMF     /L00DIR/ALPHA.STM       #IFSATR   12/11/09  09:47:57
 _    *EQ       *DTAARA   L00SAMPLEA/DTAARA1      #OBJATR   12/11/09  09:40:27
 _    *EQ       *DTAARA   L00SAMPLEA/DTAARA2      #OBJATR   12/11/09  09:40:27
 _    *EQ       *FILE     L00SAMPLEA/RJFILE1      #FILATR   12/11/09  09:43:13
 _    *EQ       *FILE     L00SAMPLEA/RJFILE2      #FILATR   12/11/09  09:43:13
 _    *EQ       *FILE     L00SAMPLEA/RJFILE3      #FILATR   12/11/09  09:43:13
                                                                         More...
 Parameters or command
 ===> _________________________________________________________________________
 F3=Exit   F4=Prompt   F5=Refresh   F9=Retrieve   F12=Cancel   F13=Repeat
 F18=Subset   F21=Print list   F22=Display entire field

You can select option 5 to view the details of the audit in which the object was compared, such as the audit compare and recovery timestamps, and option 9 to view auditing history for a specific object.

Displaying audited objects from a specific audit run

Use this procedure to display the list of objects compared by a specific audit run.

For prioritized audits, not every object is audited in every audit run.

From the Work with Audits display or the Work with Audit History display, do the following:

1. Ensure that you are on the system where the audit originated. The originating system is included in the audit details, which you can view using option 5 (Display).

2. Type 14 (Audited objects) next to audit run that you want and press Enter.

3. If necessary, press F18 (Subset) to specify criteria for filtering the list by object type, name, or audited status.

Displaying a customized list of audited objects

Use this procedure to list all objects compared by a data group or to specify filtering criteria such as object type, name, or audited status.

From the Work with Audits display or the Work with Audit History display, do the following:

1. Ensure that you are on the system where the audit originated. The originating system is included in the audit details, which you can view using option 5 (Display).

2. Press F14 (Audited objects). The Work with Audited Objects (WRKAUDOBJ) command appears.

3. Specify the Data group definition for which you want to see audited objects.

4. Specify the value you want for Object type and press Enter.

5. Additional fields appear based on the value specified in Step 4. Specify values to define the criteria for selecting the objects to be displayed.

Note: The value specified for Member (MBR) determines whether member-level objects are selected for their object history. The members selected are not automatically displayed in the list. To include any selected members, press F10 (Additional parameters), then specify *YES for Include member (INCMBR).

6. Press Enter to display the list of objects from the retained history details.
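As a command-line alternative to prompting with F14, a request of this general shape lists audited file objects and includes member-level entries. This is a sketch only: the data group name is a placeholder, and parameter keywords other than MBR and INCMBR (which are described above) are assumptions, so prompt the WRKAUDOBJ command (F4) to confirm the keywords in your installation:

   installation-library/WRKAUDOBJ DGDFN(name) OBJTYPE(*FILE) MBR(*ALL) INCMBR(*YES)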

Working with audited object history

The Work with Audited Obj. History display lists the available audit history for a single object compared by the indicated audit rules within the indicated data group. This capability provides the ability to check for trends for a specific object such as repeated automatic recovery of a difference.

The audit history for an object is available only on the originating system for audits performed when the Audit history retention (AUDHST) policy in effect specified to keep details relevant to the type of audit and those audits have not exceeded the current policy's retention criteria.

The list is sorted in reverse chronological order so that the audit history having the most recent start date is at the top of the list.

When the displayed object history is for a file member, the member is represented as object type *FILE with its name formatted as library/file(member). The Audit Rule column appears in the list to identify which audit rule compared the member in the audit run, as shown in Figure 28. When the audit history for any other object type is displayed, there is only one possible audit rule, so the Audit rule field is located at the upper right of the display.

Figure 28. Work with Audited Obj. History display showing audit history for a file member

From this display you can use option 5 to view details of the audit in which the object was compared, such as its audit compare and recovery timestamps, and option 8 to view object difference details that were detected by the audit.

Displaying the audit history for a specific object

Use this procedure to display the retained audit histories for a specific object.

From the Work with Audits display or the Work with Audit History display, do the following:

1. Display a list of objects audited for a data group using either of the following procedures:

• “Displaying audited objects from a specific audit run” on page 141

• “Displaying a customized list of audited objects” on page 141

2. From the Work with Audited Objects display, type 9 (Object history) next to the object you want and press Enter.

                         Work with Audited Obj. History            System: AS01
 Data group:  EMP        AS01       AS02
 Type:        *FILE
 Name:        ABCLIB/PF1(MBR1)
 Type options, press Enter.
   5=Display   6=Print   8=View difference details

      Audit        ----Compare Information---   -------Recovery Information------
 Opt  Rule         Date      Time      Status   Date      Time      Status
 _    #FILDTA      06/17/09  15:22:48  *EQ
 _    #MBRRCDCNT   06/17/09  15:01:34  *EQ
 _    #FILATRMBR   06/16/09  15:01:27  *NE      06/16/09  15:04:25  *RECOVERED
                                                                          Bottom
 Parameters or command
 ===> _________________________________________________________________________
 F3=Exit   F4=Prompt   F5=Refresh   F9=Retrieve   F12=Cancel   F13=Repeat
 F21=Print list   F22=Display entire field


Displaying audit compliance

The audit compliance view of the Work with Audits display (Figure 29) shows audit compliance status in the Compliance column. F11 toggles between variations of audit compliance views.

Note: If other audit problems exist, you may see a different view of the Work with Audits display. Use F10 to access the Compliance view.

On the compliance view of the Work with Audits display, the list is initially sorted by compliance status. To sort the list by scheduled time, use F17.

In addition to audit compliance status, the initial compliance view (Figure 29) shows the timestamp of when the compare phase ended in the Compare End column.

Compliance is checked based on the date of the last completed compare. Compliance indicates whether the date of the last compare completed by an audit is within the range set by policies. The Audit warning threshold policy and the Audit action threshold policy define when to indicate that an audit is approaching or exceeding that range.

For audits configured for scheduled object auditing, or for both scheduled and prioritized object auditing, compliance status is based on the last run of a scheduled audit or a user-invoked audit. For audits configured for only prioritized object auditing, compliance status is based on the last run, which may have been a prioritized audit or a user-invoked audit. A user-invoked or scheduled audit checks all objects that are configured for the data group and that fall within the class of objects checked by the audit, whereas a prioritized audit may check only a subset of those objects.

Figure 29. Audit Compliance, view - data group definition columns.

                               Work with Audits                  System:   AS01
 Type options, press Enter.
   5=Display   6=Print   7=History   8=Recoveries   9=Run rule   10=End
   14=Audited objects   36=Change DG policies   37=Change audit schedule

                 Audit       ---------Definition---------   ---Compare End---
 Opt Compliance  Rule        DG Name  System 1  System 2    Date      Time
 __  *OK         #DGFE       EMP      AS01      AS02        09/25/08  12:15:34
 __  *OK         #DLOATR     EMP      AS01      AS02        09/25/08  12:15:34
 __  *OK         #FILATR     EMP      AS01      AS02        09/25/08  12:15:34
 __  *OK         #FILATRMBR  EMP      AS01      AS02        09/25/08  12:15:35
 __  *OK         #FILDTA     EMP      AS01      AS02        09/25/08  12:15:38
 __  *OK         #IFSATR     EMP      AS01      AS02        09/25/08  12:15:36
 __  *OK         #MBRRCDCNT  EMP      AS01      AS02        09/25/08  12:15:38
 __  *OK         #OBJATR     EMP      AS01      AS02        09/25/08  12:15:37
                                                                         Bottom
 Parameters or command
 ===> _________________________________________________________________________
 F3=Exit   F4=Prompt   F5=Refresh   F10=Schedule summary   F11=Next scheduled
 F14=Audited objects   F17=Sort sched. time   F24=More keys


The additional view of audit compliance information (Figure 30) shows when the next scheduled audit run will occur. The scheduled date and time in this view do not apply to prioritized audit runs.

Figure 30. Audit Compliance, view 2 - next scheduled time columns.

                               Work with Audits                  System:   AS01
 Type options, press Enter.
   5=Display   6=Print   7=History   8=Recoveries   9=Run rule   10=End
   14=Audited objects   36=Change DG policies   37=Change audit schedule

                 Audit                 -Scheduled Time--   ---Compare End---
 Opt Compliance  Rule        DG Name   Date      Time      Date      Time
 __  *OK         #DGFE       EMP       09/26/08  02:00:00  09/25/08  12:15:34
 __  *OK         #DLOATR     EMP       09/26/08  02:25:00  09/25/08  12:15:34
 __  *OK         #FILATR     EMP       09/26/08  02:10:00  09/25/08  12:15:34
 __  *OK         #FILATRMBR  EMP       09/26/08  02:20:00  09/25/08  12:15:35
 __  *OK         #FILDTA     EMP       09/26/08  02:35:00  09/25/08  12:15:38
 __  *OK         #IFSATR     EMP       09/26/08  02:15:00  09/25/08  12:15:36
 __  *OK         #MBRRCDCNT  EMP       09/26/08  02:30:00  09/25/08  12:15:38
 __  *OK         #OBJATR     EMP       09/26/08  02:05:00  09/25/08  12:15:37
                                                                         Bottom
 Parameters or command
 ===> _________________________________________________________________________
 F3=Exit   F4=Prompt   F5=Refresh   F10=Schedule summary   F11=DG definition
 F14=Audited objects   F17=Sort sched. time   F24=More keys

Note: Audit runtime status and compliance status values are prioritized and are “bubbled up” to the next higher level in the user interface, which is the installation. In a 5250 emulator, audit status is included in the summarized replication status displayed on the Work with Application Groups display. The Work with Data Groups display provides an indication of the number of audits that require action or attention.

Determining whether auditing is within compliance

Regular auditing detects and often repairs problems in the replication environment. Compliance with the best practice of regular auditing is determined for each individual audit based on the date when the audit last completed its compare phase.

Audit compliance problems are identified by the following status values:

*ATTN - The audit is approaching an out-of-compliance state as determined by the Audit warning threshold policy. Attention is required to prevent the audit from becoming out of compliance.

*ACTREQ - The audit is out of compliance with the Audit action threshold policy. Action is required. Perform an audit of the data group.

An audit with a compliance problem must be run to resolve the problem.

Do the following to check for compliance problems:

1. Do one of the following to access the Compliance view of the Work with Audits display:



• From the MIMIX Intermediate Main Menu, select option 6 (Work with audits) and press Enter. Then use F10 as needed to access the Compliance view.

• Enter the command: installation-library/WRKAUD VIEW(*COMPLY) (an example follows this procedure)

2. Check the Compliance column for values of *ATTN and *ACTREQ.

3. To resolve a problem with audit compliance, the audit in question must be run and complete its compare phase.

• To see when the scheduled run of the audit will occur, press F11. To see when both scheduled and prioritized audits will run, press F10 to access the Audit summary view, then use F11 to toggle between views.

• To run the audit now, select option 9 (Run rule) and press Enter. This action will select all replicated objects associated with the class of the audit. For more information, see “Running an audit immediately” on page 131.
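For example, assuming the MIMIX installation library in your environment is named MIMIXPRD (a placeholder; substitute your own installation library name), the command in step 1 could be entered as:

MIMIXPRD/WRKAUD VIEW(*COMPLY)

You could then type 9 (Run rule) next to any audit whose Compliance column shows *ATTN or *ACTREQ and press Enter to bring that audit back into compliance.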


Displaying scheduling information for automatic audits

An audit can be configured to run by schedule, by priority, by schedule and priority, or not at all. The schedule summary views of the Work with Audits display allow you to see scheduling information for each audit.

Do the following to view when an audit can occur for a specific audit and data group combination:

1. From the MIMIX Intermediate Main Menu, type 6 (Work with audits) and press Enter.

2. The Work with Audits display appears, showing either the audit summary or compliance summary view. Press F10 as needed to access the Schedule summary view.

3. The initial view of the Schedule summary is displayed. Use F11 to toggle between additional variations of audit schedule views.

• The initial view (Figure 31) shows the date and time of the next scheduled audit run. You cannot view the exact time when the next prioritized audit will run.

• To view current scheduled auditing settings, press F11 (Figure 32).

• To view current priority auditing settings, press F11 twice (Figure 33). Prioritized audit runs are allowed to start every day only during the specified time range. Multiple runs of an audit may occur during that time.

The list is initially sorted by rule and data group name. To sort the list by scheduled time, use F17.

Figure 31. Audit Schedule Summary, view - next scheduled time.

                               Work with Audits                  System:   AS01
 Type options, press Enter.
   5=Display   6=Print   7=History   8=Recoveries   9=Run rule   10=End
   14=Audited objects   36=Change DG policies   37=Change audit schedule

      Audit       ---------Definition---------               -Scheduled Time--
 Opt  Rule        DG Name  System 1  System 2   Frequency    Date      Time
 __   #DGFE       EMP      AS01      AS02       *WEEKLY      09/25/08  02:00:00
 __   #DLOATR     EMP      AS01      AS02       *WEEKLY      09/25/08  02:25:00
 __   #FILATR     EMP      AS01      AS02       *WEEKLY      09/25/08  02:10:00
 __   #FILATRMBR  EMP      AS01      AS02       *WEEKLY      09/25/08  02:20:00
 __   #FILDTA     EMP      AS01      AS02       *WEEKLY      09/25/08  02:35:00
 __   #IFSATR     EMP      AS01      AS02       *WEEKLY      09/25/08  02:15:00
 __   #MBRRCDCNT  EMP      AS01      AS02       *WEEKLY      09/25/08  02:30:00
 __   #OBJATR     EMP      AS01      AS02       *WEEKLY      09/25/08  02:05:00
                                                                         Bottom
 Parameters or command
 ===> _________________________________________________________________________
 F3=Exit   F4=Prompt   F5=Refresh   F10=Audit summary   F11=Schedule settings
 F14=Audited objects   F17=Sort sched. time   F24=More keys


Figure 32. Audit Schedule Summary, view - schedule settings.

                               Work with Audits                  System:   AS01
 Type options, press Enter.
   5=Display   6=Print   7=History   8=Recoveries   9=Run rule   10=End
   14=Audited objects   36=Change DG policies   37=Change audit schedule

      Audit                              Weekday   Rel.Day
 Opt  Rule        DG Name  Frequency     Date      SMTWTFS  12345L  Time
 __   #DGFE       EMP      *WEEKLY       *NONE     SMTWTFS          02:00:00
 __   #DLOATR     EMP      *WEEKLY       *NONE     SMTWTFS          02:25:00
 __   #FILATR     EMP      *WEEKLY       *NONE     SMTWTFS          02:10:00
 __   #FILATRMBR  EMP      *WEEKLY       *NONE     SMTWTFS          02:20:00
 __   #FILDTA     EMP      *WEEKLY       *NONE     SMTWTFS          02:35:00
 __   #IFSATR     EMP      *WEEKLY       *NONE     SMTWTFS          02:15:00
 __   #MBRRCDCNT  EMP      *WEEKLY       *NONE     SMTWTFS          02:30:00
 __   #OBJATR     EMP      *WEEKLY       *NONE     SMTWTFS          02:05:00
                                                                         Bottom
 Parameters or command
 ===> _________________________________________________________________________
 F3=Exit   F4=Prompt   F5=Refresh   F10=Audit summary   F11=Priority settings
 F14=Audited objects   F17=Sort sched. time   F24=More keys

Figure 33. Audit Schedule Summary, view - priority settings.

                               Work with Audits                  System:   AS01
 Type options, press Enter.
   5=Display   6=Print   7=History   8=Recoveries   9=Run rule   10=End
   14=Audited objects   36=Change DG policies   37=Change audit schedule

      Audit                -Start Range-   ----Priority Objects Selected----
 Opt  Rule        DG Name  After   Until   New      Chg      Unchg     No Diff
 __   #DGFE       EMP      *NONE
 __   #DLOATR     EMP      03:00   08:00   *DAILY   *DAILY   *WEEKLY   *MONTHLY
 __   #FILATR     EMP      03:00   08:00   *DAILY   *DAILY   *WEEKLY   *MONTHLY
 __   #FILATRMBR  EMP      03:00   08:00   *DAILY   *DAILY   *WEEKLY   *MONTHLY
 __   #FILDTA     EMP      03:00   08:00   *DAILY   *DAILY   *WEEKLY   *MONTHLY
 __   #IFSATR     EMP      03:00   08:00   *DAILY   *DAILY   *WEEKLY   *MONTHLY
 __   #MBRRCDCNT  EMP      03:00   08:00   *DAILY   *DAILY   *WEEKLY   *MONTHLY
 __   #OBJATR     EMP      03:00   08:00   *DAILY   *DAILY   *WEEKLY   *MONTHLY
                                                                         Bottom
 Parameters or command
 ===> _________________________________________________________________________
 F3=Exit   F4=Prompt   F5=Refresh   F10=Audit summary   F11=Priority settings
 F14=Audited objects   F17=Sort sched. time   F24=More keys


CHAPTER 8 Working with system-level processes

MIMIX uses several processes that run at the system level to support the replication environment and provide additional functionality. System-level processes include the system manager, journal manager, target journal inspection, collector services, and if needed, cluster services. These processes can be accessed from the Work with Systems display (WRKSYS command). Typically, these processes are automatically started and ended when MIMIX is started or ended. However, you may need to start or end individual processes when resolving problems.
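For example, to go directly to this display from a command line, you can enter the WRKSYS command, qualifying it with your installation library if that library is not in your library list (the library name MIMIXPRD below is a placeholder):

MIMIXPRD/WRKSYS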

The following topics are included in this chapter to help you resolve problems with system level processes:

• “Displaying status of system-level processes” on page 149 describes how to check for expected status values and resolve problems with system-level processes. This includes procedures for starting and ending managers, target journal inspection, and collector services.

• “Resolving *ACTREQ status for a system manager” on page 151 describes how to resolve a status of action required.

• “Checking for a system manager backlog” on page 151 describes how to check if there a backlog of unprocessed entries that require action.

• “Displaying status of target journal inspection” on page 155 describes how to display the status of a single inspection job on a system and how to resolve problems with its status.

• “Displaying results of target journal inspection” on page 156 describes where to find information about the objects identified by target journal inspection.

• “Identifying the last entry inspected on the target system” on page 158 describes how to determine the last entry in the target journal and the last entry processed by target journal inspection.

Displaying status of system-level processes

Status of processes that run at the system level can be viewed from the Work with Systems display.

1. Do one of the following:

• From MIMIX Intermediate Main Menu, select option 2 (Work with Systems) and press Enter.

• From the Work with Application Groups display, use option 12 (Node entries). On the resulting Work with Node Entries display, press F7 (Systems).

2. The Work with Systems display appears. The first system definition in the list is the local system. Figure 34 shows expected status values for most two-system environments. For any other status values, continue with the next step.


Expected Status Values:

• System managers and journal managers have an expected status of *ACTIVE on all systems.

• For target journal inspection, the expected status is that systems that are currently the target for replication have a status of *ACTIVE and other systems have a status of *NOTTGT.

• For cluster services, most installations are not licensed for MIMIX® Global™ and have an expected status of *NONE. If a system participates in an IBM i cluster, the expected value is *ACTIVE. For more information about operation for MIMIX® Global™, see the MIMIX Operations with IBM i Clustering book.

Figure 34. Expected status on the Work with Systems display for a two-system environment.

                               Work with Systems                          OSCAR
 Local system definition . . :   OSCAR
 Cluster  . . . . . . . . . . :  *NONE
 Type option, press Enter
   4=Remove cluster node   5=Display   6=Print   7=System manager status
   8=Work with data groups   9=Start   10=End   11=Jrn inspection status

      System              ----- Managers -----   Journal   ----- Services ------
 Opt  Definition  Type    System     Journal     Inspect.  Collector  Cluster
 ___  OSCAR       *MGT    *ACTIVE    *ACTIVE     *ACTIVE   *ACTIVE    *NONE
 ___  HENRY       *NET    *ACTIVE    *ACTIVE     *NOTTGT   *ACTIVE    *NONE
                                                                         Bottom
 F3=Exit   F5=Refresh   F9=Automatic refresh   F10=Legend   F12=Cancel
 F13=Repeat   F16=System definitions

3. If one or more processes are *INACTIVE, do the following:

• Type a 9 (Start) next to the system you want and press Enter. The Start MIMIX Managers display appears. Any processes except cluster services that are not active on the system are preselected. (To start cluster services, MIMIX® Global™ users must specify *YES for the Start cluster services prompt.) Press Enter.

4. For any other status values on a system, do the following:

• If one or more processes are *UNKNOWN, use the procedure in “Verifying all communications links” on page 282.

• For a system manager status of *ACTREQ, use “Resolving *ACTREQ status for a system manager” on page 151.

• To check for a system manager backlog, use “Checking for a system manager backlog” on page 151.



• For target journal inspection status values other than *ACTIVE or *NOTTGT, see “Displaying status of target journal inspection” on page 155.

Resolving *ACTREQ status for a system manager

A system manager status of *ACTREQ indicates that at least one of the system manager pairs in which the system is a participant has failed. The system manager must be started. To start the system manager, type a 9 (Start) next to the system and press Enter.

Checking for a system manager backlog

The Work with System Pair Status panel includes the count of unprocessed entries for the source system job of the system manager process along with the timestamp of the oldest unprocessed entry. A count of unprocessed entries means that a backlog exists and action may be required.

A status of *INACTIVE indicates the system manager needs to be started. Type a 9 (Start) next to the system and press Enter.

A status of *ACTIVE with unprocessed entries indicates further action may be required. Since this data is a snapshot of work currently being done, it is important to refresh this panel (F5) to ensure data is up to date. Evaluate data for unprocessed entries with a status of *ACTIVE as follows:

• If the status is *ACTIVE and there is a high number of unprocessed entries for your environment or the timestamp is not changing when data is refreshed (F5), contact CustomerCare.

• If the status is *ACTIVE and there is a low number of unprocessed entries for your environment, refresh data (F5) and check whether the timestamp is changing. If the timestamp changes, the entries are being processed.


Starting a system manager or a journal manager

To selectively start a system manager or journal manager for a system, do the following:

1. Do one of the following:

• From MIMIX Intermediate Main Menu, select option 2 (Work with Systems) and press Enter.

• From the Work with Application Groups display, use option 12 (Node entries). On the resulting Work with Node Entries display, press F7 (Systems).

2. The Work with Systems display appears. Type a 9 (Start) next to the system definition you want and press Enter.

3. The Start MIMIX Managers display appears. By default, any manager that is not running will be selected to start. Specify the value for the type of manager you want to start at the Manager prompt and press Enter.

Ending a system manager or a journal manager

To end a system manager or journal manager, do the following:

1. Do one of the following:

• From MIMIX Intermediate Main Menu, select option 2 (Work with Systems) and press Enter.

• From the Work with Application Groups display, use option 12 (Node entries). On the resulting Work with Node Entries display, press F7 (Systems).

2. The Work with Systems display appears with a list of the system definitions defined for the MIMIX installation. Type a 10 (End) next to the system definition you want and press Enter.

3. The End MIMIX Managers display appears. Specify the value for the type of manager you want to end at the Manager prompt and press Enter. The selected managers are ended.

Starting collector services

To start collector services for a system, do the following:

1. Do one of the following:

• From MIMIX Intermediate Main Menu, select option 2 (Work with Systems) and press Enter.

• From the Work with Application Groups display, use option 12 (Node entries). On the resulting Work with Node Entries display, press F7 (Systems).

2. The Work with Systems display appears. Type a 9 (Start) next to the system definition you want and press Enter.

3. The Start MIMIX Managers display appears. At the Collector services prompt, verify the value is *YES and press Enter.


Ending collector services

To end collector services for a system, do the following:

1. Do one of the following:

• From MIMIX Intermediate Main Menu, select option 2 (Work with Systems) and press Enter.

• From the Work with Application Groups display, use option 12 (Node entries). On the resulting Work with Node Entries display, press F7 (Systems).

2. The Work with Systems display appears. Type a 10 (End) next to the system definition you want and press Enter.

3. The End MIMIX Managers display appears. At the Collector services prompt, type *YES and press Enter.

Starting target journal inspection processes

These instructions will start target journal inspection processes on a selected system. If the system is the target system for replication by one or more data groups whose journal definitions are configured for target journal inspection, an inspection job is started for the system journal and for each user journal on the system that is identified within those data groups.

Target journal inspection processes start at the last sequence number in the currently attached journal receiver in the following cases:

• When it is the first time a target journal inspection process is started

• When starting after being ended and the last processed receiver is no longer available

• When starting after enabling target journal inspection in a journal definition where it was previously disabled

Otherwise, when starting target journal inspection after it was previously ended, processing begins with the next sequence number after the last processed sequence number.

To start target journal inspection processes for a system, do the following:

1. Do one of the following:

• From MIMIX Intermediate Main Menu, select option 2 (Work with Systems) and press Enter.

• From the Work with Application Groups display, use option 12 (Node entries). On the resulting Work with Node Entries display, press F7 (Systems).

2. The Work with Systems display appears. Type a 9 (Start) next to the system definition you want and press Enter.

3. The Start MIMIX Managers display appears. At the Target journal inspection prompt, verify the value is *YES and press Enter.


Ending target journal inspection processes

These instructions will end target journal inspection processes on a selected system. If the system is the target system for replication, the inspection process for the system journal is ended and all inspection processes are ended for the user journals identified as the target journal in data groups replicating to the system.

To end target journal inspection processes for a system, do the following:

1. Do one of the following:

• From MIMIX Intermediate Main Menu, select option 2 (Work with Systems) and press Enter.

• From the Work with Application Groups display, use option 12 (Node entries). On the resulting Work with Node Entries display, press F7 (Systems).

2. The Work with Systems display appears. Type a 10 (End) next to the system definition you want and press Enter.

3. The End MIMIX Managers display appears. At the Target journal inspection prompt, verify the value is *YES and press Enter.


Displaying status of target journal inspection

Target journal inspection consists of a set of jobs that read journals on the target system to check for people or processes other than MIMIX that have modified replicated objects on the target system. Best practice is to allow target journal inspection for all systems in your replication environment.

Each target journal inspection process runs on a system only when that system is the target system for replication. The number of inspection processes depends on how many journals are used by data groups replicating to that system. On a target system, there is one inspection job for the system journal and one job for each target user journal identified in data groups replicating to that system.

Because target journal inspection processes run at the system level, the best location to begin checking status is from the Work with Systems display.

1. Do one of the following:

• From MIMIX Intermediate Main Menu, select option 2 (Work with Systems) and press Enter.

• From the Work with Application Groups display, use option 12 (Node entries). On the resulting Work with Node Entries display, press F7 (Systems).

2. The Work with Systems display appears. The Journal Inspect. column shows the summarized status of all journal inspection processes on a system.

• Expected values are either *ACTIVE or *NOTTGT.

• For all other status values, type 11 (Jrn inspection status) next to the system you want and press Enter.

3. The Work with Journal Inspection Status display appears, listing the subset of journal definitions for the selected system. The status displayed is for target journal inspection for the journal associated with a journal definition.

Note: Journal definitions whose journals are not eligible for target journal inspection are not displayed. This includes journal definitions that identify the remote journal used in RJ configurations (whose names typically end with @R) as well as journal definitions JRNMMX and MXCFGJRN which are for internal use.

Table 31 identifies the status for the inspection job associated with a journal and how to resolve problems.

Table 31. Status values for a single target journal inspection process.

Journal Inspection Status - Description and Action

*INACTIVE (inverse red) - Journal inspection is not active. Use option 9 (Start) to start all eligible target journal inspection processes on the system identified in the selected journal definition.

*UNKNOWN (inverse white) - The status of the process on the system cannot be determined, possibly because of an error or communications problem. Use the procedure in “Verifying all communications links” on page 282.

*ACTIVE (inverse blue) - Target journal inspection is active for the journal identified in the journal definition.

*NEWDG - Target journal inspection has not run because all enabled data groups that use the journal definition as a target journal have never been started. The inspection process will start when one or more of the data groups are started.

*NOTCFG - Either the journal definition does not allow target journal inspection or all enabled data groups that use the journal definition (user journal) prevent journaling on the target system. Target journal inspection is not performed for the journal. For instructions for configuring target journal inspection, see topics “Determining which data groups use a journal definition” and “Enabling target journal inspection” in the MIMIX Administrator Reference book.

*NOTTGT - The journal definition is not used as a target journal definition by any enabled data group. Target journal inspection is not performed for the journal. This is the expected status when the journal definition is properly configured for target journal inspection but the system is currently a source system for all data groups using this journal definition.

Displaying results of target journal inspection

Target journal inspection sends a warning notification for each user other than MIMIX who changed objects on the target system since the inspection job started. Because inspection jobs restart daily with other system level processes, a notification would typically be sent once per day per user. The notification identifies only the first object changed by the user.

Note: The MIMIX portal application for Vision Solutions Portal provides enhanced capabilities for displaying target journal inspection results. Notifications from target journal inspection processes are identified as originating from TGTJRNINSP in the Notifications portlet on the Summary page. Actions available for these notifications include displaying notification details as well as displaying a list of the objects changed on the target node by the user identified in the notification. Also, you can access a list of all objects changed on the target node by all users from the Replicated Objects portlet on the Analysis page.

Displaying details associated with target journal inspection notifications

This procedure displays notifications sent by target journal inspection and describes how to display related information from a 5250 emulator.

To check for notifications for target journal inspection, do the following:

1. Do one of the following:

• On the Work with Application Groups display, the Notifications field indicates whether any warning notifications exist. Press F15 (Notifications).

• On the Work with Data Groups display, the third number in the Audits/Recov./Notif. field displays the number of new notifications. Press F8 (Recoveries), then press F10 (Work with Notifications).

• From a command line, enter: WRKNFY.

2. The Work with Notifications display appears. Notifications from target journal inspection are identified by the name TGTJRNINSP in the Source column.

3. Type a 5 (Display) to view the notification details.

4. On the Display Notification Details display, check these fields:

• The Originating system field on the Display Notification Details display identifies the system on which target journal inspection ran and sent the notification.

• The Notification details field identifies the user or program that made the change, the first object changed, the location it was found in the inspected journal, and a command string to run to see journal entries generated by the user.

Note: If the text of the Notification details field is truncated, you can view the full text of the message associated with the notification from the MIMIX message log. Use “Displaying messages for TGTJRNINSP notifications” on page 157.

5. Investigate why the identified user changed objects on the target system. Objects may need to be repaired.

Displaying messages for TGTJRNINSP notifications

The text of notifications sent by target journal inspection varies slightly with the object type of the reported object. When a notification is sent, an associated message is sent to the MIMIX message log.

You can use the following commands to view the full text of notification messages. Use the name of the originating system (Step 4 in the previous procedure) for the ORGSYS parameter in these commands; an example follows this list:

• For library-based objects, Enter:

WRKMSGLOG MSGID(LVE3902) PRC(TGTJRNINSP) ORGSYS(name)

• For IFS objects, Enter:

WRKMSGLOG MSGID(LVE3903) PRC(TGTJRNINSP) ORGSYS(name)


• For DLO objects, Enter:

WRKMSGLOG MSGID(LVE3904) PRC(TGTJRNINSP) ORGSYS(name)
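For example, if a notification for a library-based object originated on system AS01 (the system name here follows the examples shown in this book; use the actual originating system name from step 4), you would enter:

WRKMSGLOG MSGID(LVE3902) PRC(TGTJRNINSP) ORGSYS(AS01)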

Identifying the last entry inspected on the target system

For each target journal inspection process, you can view details that identify the last journal entry inspected and the last entry in the current journal receiver.

Do the following:

1. From MIMIX Intermediate Main Menu, select option 2 (Work with Systems) and press Enter.

2. The Work with Systems display appears. Type 11 (Jrn inspection status) next to the system you want and press Enter.

3. The Work with Journal Inspection Status display appears. Type 5 (Display) next to the journal definition on the system you want and press Enter.

4. The Display Journal Inspection Status Details display appears.

• The following fields identify the currently attached journal receiver and the last entry in the current receiver: Journal, Journal receiver, Last journal entry sequence, and Last journal entry time.

• The Target journal inspection fields identify the last entry processed by target journal inspection.


CHAPTER 9 Working with notifications and recoveries

This topic describes what notifications and recoveries are and how to work with them.

This chapter includes the following topics:

• “What are notifications and recoveries” on page 159 defines terms used for discussing notifications and recoveries and identifies the sources that create them.

• “Displaying notifications” on page 160 identifies where notifications are viewed in the user interfaces and how to work with them.

• “Notifications for newly created objects” on page 163 describes the MIMIX® AutoNotify™ feature which can be used to monitor for newly created libraries, folders, or directories.

• “Displaying recoveries” on page 164 identifies where recoveries in progress are viewed in the user interfaces and how to work with them.

What are notifications and recoveries

A notification is an automatically generated report associated with an event that has already occurred. The severity of a notification is reflected in the overall status of the installation.

Notifications can be generated in a variety of ways:

• Target journal inspection processes generate notifications when users or programs other than MIMIX have changed objects on the target node.

• Rules that are not associated with the audits provided by MIMIX also generate notifications to indicate that rule processing either ended in error or, if requested, completed successfully.

• Shipped monitors, such as the MMNFYNEWE monitor for the MIMIX® AutoNotify™ feature, generate notifications.

• Custom automation may initiate user-generated notifications when user-defined events are detected. User-generated notifications can be set to indicate a failure, a warning, or a successful operation.

• Audits generate notifications as a secondary mechanism for reporting when the activities performed by an audit complete or end in error. These notifications are automatically marked as acknowledged. (The primary mechanism is to report errors through replication processes and the audit summary.) Policies provide considerable control over notifications generated by audits.

Because the manner in which notifications are generated can vary, it is important to note that notifications can represent both real-time events as well as events that occurred in the past but, due to scheduling, are being reported in the present. For example, the ownership of a file is changed on the target system at 8:00 PM. If your audit (CMPFILA) is scheduled to run at 1:00 AM, MIMIX will detect the change and push a notification to the user interface when the audit completes. Previously, detection of the change was contingent upon you viewing a report after the audit completed and noticing the difference.

Recoveries - The term recovery is used in two ways. The most common use refers to the recovery action taken by audits or replication processes to correct a detected difference when automatic recovery policies are enabled. The second use refers to a temporary report that provides details about a recovery action in progress. The report is automatically created when the recovery action starts and is removed when it completes. While it exists, the report identifies what originated the action and what is being acted upon, and may include access to an associated output file (outfile) and the job log for the associated job. The action which generated a report may also generate a notification when the recovery action ends.

Displaying notifications

Do one of the following to check for notifications:

Note: Notifications from audits are automatically set to a status of acknowledged. Audit status and results should be checked from the Work with Audits (WRKAUD) display.

• If there are no audit problems in the installation, the MIMIX Availability Status display will indicate whether there are any notifications requiring attention or immediate action that are from sources other than audits. From the MIMIX Availability Status display, type a 5 (Display details) next to Audits and notifications and press Enter.

• Notifications from all sources are listed on the Work with Notifications display. To access the Work with Notifications display, enter the command WRKNFY. The list is sorted so that new notifications appear at the top. To see details for a notification, type a 5 (Display) next to the notification you want and press Enter.

• The Work with Data Groups display includes the number of new notifications that require action or attention. From the MIMIX Basic Main Menu type 6 (Work with data groups) and press Enter. The Work with Data Groups display appears. The Audits/Recov./Notif. fields are located in the upper right corner.

What information is available for notifications

The following information is available for notifications listed on the Work with Notifications display. The F11 key toggles between views of status, timestamp, and text of the notification.

Additional details are available for each notification through the Display Notification Details display.

Status - The Work with Notifications display lists notifications grouped by their status.

*NEW - New notifications have not been acknowledged or removed and their status is reflected in higher level status.


*ACK - Acknowledged notifications are archived as viewed and their status is no longer reflected in higher level status.

Severity - Identifies the severity level of the notification.

*ERROR - An error occurred that requires immediate action.

*WARNING - Investigation may be necessary. An operation completed but an error may exist. For example, the MIMIX AutoNotify feature issues notifications with this severity that identify newly created objects that are not identified for replication.

*INFO - No user intervention is required.

Notification - Displays the notification text sent by audits, automatic recoveries, target journal inspection, monitors, user-defined or MIMIX rules, or a user-generated notification. To view the full text, use option 5 to display the notification details.

Data group - Identifies the data group associated with the notification. User-defined notifications and notifications from monitors or user-defined rules may indicate that there is no associated data group.

Note: On the Status view of the Work with Notifications display, the F7 key toggles between the Source column and the Data Group column. The full three-part name is available in the Timestamp view (F11).

Date - Indicates the date the notification was sent.

Time - Indicates the time the notification was sent.

Source - Identifies the process, program, or command that generated the notification. Names that begin with the character # are generated by automatic recovery actions for audits or database replication or by a MIMIX rule. Names that begin with the characters ## are generated by automatic recovery actions for object replication.

From System - Identifies the name of the system on which the notification was generated. The name From System is used on the Timestamp view (F11) of the Work with Notifications display. When you display the notification details from the 5250 emulator, this is called the Originating system.

Detailed information

When you display a notification, you see its description, status, severity, data group, source, and sender as described above. You also have access to the following information:

Details - When the source of the notification is a rule, this identifies the command that was initiated by the rule. When the source of the notification is user-generated, this indicates the notification detail text specified when the notification entry was added. When the source of the notification is a monitor, this describes the events that resulted in the notification.

Output File - If available, this identifies an associated output file. Output file information associated with a notification is only available from the sender system. For user-generated notifications, output file information is available only if it was specified when the notification was added.


Job - If available, this identifies the job that generated the notification. Job information associated with a notification is only available from the sender system. For user-generated notifications, this information is available only if it was specified when the notification was added.

Options for working with notifications

Table 32 identifies the possible actions you can take for a notification. From the Notifications window, the Actions list for each notification contains only the actions possible for the selected notification.

Table 32. Options available for notifications

Option Description

4=Remove Deletes the notification. You are prompted to confirm this choice. For a notification generated by an audit or a MIMIX rule, the associated job and output files are also deleted.

This must be performed from the system on which the notification originated.

5=Display Displays available additional information associated with the notification.

For notifications generated by rules, this includes the details of the rule that generated the notification, including the substitution variables for the command the rule initiated.

6=Print Prints the information associated with the notification.

8=View results When the information is available, this provides the Name and Library of the output file (outfile) associated with the notification. This option is only available from the system on which the notification originated.1

1. MIMIX manages an output file associated with a notification from an automatic recovery action or a MIMIX rule when the output file exists in a specific library. The format of the library name for such an output file is MIMIX-installation-library_0.

12=Display job Displays the job log for the job which generated the notification, if it is available. This option is only available from the system on which the notification originated.

46=Acknowledge Sets the selected notification status to *ACK (Acknowledged).

47=Mark as new Sets the selected notification status to *NEW (New).


Notifications for newly created objects

The MIMIX® AutoNotify™ feature can be used to monitor for newly created libraries, folders, or directories. The AutoNotify feature uses a shipped journal monitor called MMNFYNEWE to monitor for new objects in an installation that are not already included or excluded for replication by a data group. The AutoNotify feature monitors the security audit journal (QAUDJRN), and when new objects are detected, issues a warning notification.

The MMNFYNEWE monitor is shipped in a disabled state. In order to use this feature, the MMNFYNEWE monitor must be enabled on the source system within your MIMIX environment. Once enabled, this monitor will automatically start with the master monitor.

Notifications will be sent when newly created objects meet the following conditions:

• The installation must have a data group configured whose source system is the system the monitor is running on.

• The journal entry must be a create object (T-CO) or object management change (T-OM).

• If the journal entry is a create object (T-CO), then the type must be new (N).

• The journal entry must be for a library, folder, or directory.

• If the journal entry is for a library, it cannot be a MIMIX generated library since MIMIX generated libraries are not replicated by MIMIX.

• If the journal entry is for a directory, it cannot be the /LAKEVIEWTECH directory, or any directory under /LAKEVIEWTECH.

• If the journal entry is for a directory, it must be a directory that is supported for replication by MIMIX.

• The object is not already known (included or excluded) in the installation.

Notifications can be viewed from the Work with Notifications (WRKNFY) display. The notification message will indicate required actions.


Displaying recoveries

Active recoveries are an indication of problems detected and being corrected by MIMIX AutoGuard. Before certain activity, such as ending MIMIX, it is important that no recoveries are in progress in the installation. You can check for recoveries from either user interface.

You can see how many recoveries are in progress from the MIMIX Availability Status display or the Work with Data Groups display. The Work with Recoveries display lists recoveries and provides options for working with held recoveries associated with an audit or a MIMIX rule.

To see a count of recoveries in progress, do one of the following:

• To access the MIMIX Availability Status display, enter the command WRKMMXSTS. The Recoveries field in the upper right corner of the display shows the number of active recoveries in progress for the installation.

• To access the Work with Data Groups display, use option 5 (Display details) next to the Replication area.

Figure 35 shows the Audits/Recov./Notif. fields in the upper right corner of the Work with Data Groups display. The first number is the total number of audits that require action to correct a problem or require your attention to prevent a situation from becoming a problem. The second number indicates the number of active recoveries, including those resulting from audits. The third number indicates the number of new notifications that require action or attention. If more than 999 items exist in any field, the field will display +++. A consistently high number of recoveries suggests that there may be configuration issues with one or more data groups.

To select a recovery to view or work with a held recovery, do the following:

1. To access the Work with Recoveries display, do one of the following:

• From the Work with Audits display, use option 8 (Recoveries) to see a list of recoveries associated with an audit.

• From the Work with Data Groups display, use F8 to see all recoveries.

• On a command line, enter the command WRKRCY.

2. To see details for a recovery, type a 5 (Display) next to the recovery you want and press Enter.

Figure 35. Work with Data Groups display showing recoveries in progress

                             Work with Data Groups        CHICAGO      10:49:06
 Type options, press Enter.             Audits/Recov./Notif.:  001 / 002 / 003
   5=Display definition   8=Display status   9=Start DG   10=End DG
   12=Files not active   13=Objects in error   14=Active objects
   15=Planned switch   16=Unplanned switch   ...

                    ---------Source---------    --------Target--------  -Errors-
 Opt  Data Group    System  Mgr  DB  Obj  DA    System   Mgr  DB  Obj   DB  Obj
 __   TESTDG34      LONDON  A    A   A          CHICAGO  A    A   A
 __   TESTDG43      LONDON  A    A   A          CHICAGO  A    A   A
                                                                         Bottom
 F3=Exit   F5=Refresh   F7=Audits   F8=Recoveries   F9=Automatic refresh
 F10=Legend   F16=DG definitions   F23=More options   F24=More keys

What information is available for recoveries

The following information is available for recoveries listed on the Work with Recoveries display. The F11 key toggles between views of status, timestamp, and text of the recoveries. Additional details are available for each recovery through the Display Recovery Details display.

Each recovery provides a brief description of the recovery process taking place as well as its current status.

Status - Shows the status of the recovery action.

*ACTIVE - The job associated with the recovery is active.

*ENDING - The job associated with the recovery is ending.

*HELD - The job associated with the recovery is held. A recovery whose source is a replication process cannot be held.

Data group - Identifies the data group associated with the recovery.

Note: On the Status view of the Work with Recoveries display, the F7 key toggles between the Source column and the Data Group column. The full three-part name is available in the Timestamp view (F11).

Date - Indicates the date the recovery process started.

Time - Indicates the time the recovery process started.

Source - Identifies the process, program, or command that generated the recovery. Names that begin with the character # are generated by automatic recovery actions for audits or database replication or by a MIMIX rule. Names that begin with the characters ## are generated by automatic recovery actions for object replication.

Sender or From System - Identifies the system from which the recovery originated.

Detailed information

When you display a recovery, you see its description, status, data group, source, and sender as described above. You also have access to the following information.

Details - When the source of the recovery is a rule, this identifies the command run by the rule in an attempt to recover from the detected error.

Output File - If available, this identifies an associated output file that lists the detected errors the recovery is attempting to correct. Output file information associated with a recovery is only available from the sender system.

Job - If available, this identifies the job that is performing the recovery action. Job information associated with a recovery is only available from the sender system.

Options for working with recoveries

Table 33 identifies the possible actions you can take for a recovery. From the Recoveries window, the Actions list for each recovery contains only the actions possible for the selected recovery.

Table 33. Options available for recoveries

WRKRCY Option    Description

4=Remove Removes the specified recovery, if it is not held or active. A confirmation panel is displayed after pressing Enter. Use this option to remove orphaned recoveries whose associated recovery job ended. This option is only available from the system on which the recovery job ran.

5=Display Displays available additional information associated with the recovery.

6=Print Prints the information associated with the recovery.

8=View progress Displays a filtered view of the output file associated with the recovery. MIMIX updates the output file while the recovery is in progress, identifying the detected errors it is attempting to correct and marking corrected errors as being recovered. This option is only available from the system on which the recovery job is running.

10=End job Ends an active recovery job. This action is valid for recoveries with names that begin with # and is only available from the system on which the recovery job is running.

12=Display job Displays the job log for the associated recovery job in progress. This option is only available from the system on which the recovery job is running.

13=Hold job Places an active recovery job on hold. This action is valid for recoveries with names that begin with # and is only available from the system on which the recovery job is running.

14=Release job Releases a held recovery job. This action is valid for recoveries with names that begin with # and is only available from the system on which the recovery job is held.


Orphaned recoveries

There are times when recoveries exist but are no longer associated with a job. The following conditions could cause recoveries to become orphaned:

• An unplanned switch has occurred

• The MIMIX subsystem was ended unexpectedly

• A recovery job was ended unexpectedly

When automatic audit recovery is enabled, orphaned recoveries are converted to error notifications during system cleanup. If the orphaned recovery is older than the cleanup time specified in the system definition, it is deleted.

When automatic database recovery or automatic object recovery is enabled, orphaned recoveries are deleted, when possible.

Because recoveries are displayed on both systems, but jobs associated with them are only accessible from the originating system, you need to verify that the recovery is orphaned before removing it.

Determining whether a recovery is orphaned

Do the following to determine whether a recovery is orphaned:

1. From a command line, type WRKRCY and press Enter.

2. Press F11 to display the Timestamp view. This view allows you to see the From System column which lists the system from which the recovery originated.

3. Ensure you are operating from the originating system. Then type a 12 next to the recovery.

4. Do one of the following:

• If an error message is displayed indicating that the job associated with the recovery is not found, follow the steps in “Removing an orphaned recovery” on page 168.

• When the Display Job display appears, type a 10 in the Selection field and press Enter. The status of the job is displayed. If the job associated with the recovery is no longer valid, follow the steps in “Removing an orphaned recovery” on page 168.



Removing an orphaned recovery

These procedures assume that you have already confirmed that the recovery is orphaned using the procedures in “Determining whether a recovery is orphaned” on page 167.

Do the following to remove an orphaned recovery:

1. From the originating system, type WRKRCY on the command line and press Enter.

2. After you have ensured that the recovery is orphaned, type a 4 next to the orphaned recovery you wish to remove and press Enter.

3. Press Enter to confirm your request to remove the recovery.


CHAPTER 10 Starting and ending replication

MIMIX uses a number of processes to perform replication. These processes, along with a number of supporting processes, must be active to enable MIMIX to function.

The following pairs of commands start and end replication:

• The Start MIMIX (STRMMX) and End MIMIX (ENDMMX) commands will start or stop replication processes as well as all supporting processes for the products in a MIMIX installation library in a single operation. These commands are the preferred method for starting and ending MIMIX.

• The Start Application Group (STRAG) and End Application Group (ENDAG) commands will start or stop replication processes in environments configured with application groups. Each command calls a default procedure with steps to perform its operations and can be customized.

• The Start Data Group (STRDG) and End Data Group (ENDDG) commands will start or stop data group replication processes. These commands are the basis for controlling replication processes and are invoked programmatically by the previously identified commands.

This chapter provides information about and procedures for using each set of commands. The following topics are included:

• “Before starting replication” on page 171 applies to all methods of starting replication.

• “Commands for starting replication” on page 171 describes the STRMMX, STRAG, and STRDG commands and considerations for their use.

• “What occurs when a data group is started” on page 174 describes what the STRDG command does in addition to starting replication, choices for specifying a journal starting point, and options for clearing pending and error entries.

• “Starting MIMIX” on page 179 provides a procedure for using the STRMMX command.

• “Starting an application group” on page 180 provides a procedure for using the STRAG command.

• “Starting selected data group processes” on page 181 provides a procedure for using the STRDG command and identifies when the start request should include clearing pending entries.

• “Starting replication when open commit cycles exist” on page 183 describes when MIMIX cannot start replication due to open commit cycles and how to resolve them and start replication.

• “Before ending replication” on page 184 applies to all methods of ending replication.

• “Commands for ending replication” on page 184 describes the ENDMMX, ENDAG, and ENDDG commands and considerations for their use, such as when to perform a controlled end or when to end the RJ link.


• “What occurs when a data group is ended” on page 190 describes the behavior of the ENDDG command.

• “Ending MIMIX” on page 179 provides procedures for using the ENDMMX command and describes when you may also need to end the MIMIX subsystem.

• “Ending an application group” on page 194 provides a procedure for using the ENDAG command.

• “Ending a data group in a controlled manner” on page 195 provides procedures for preparing to end, ending, and confirming that the end completed without problems.

• “Ending selected data group processes” on page 198 provides a procedure for using the ENDDG command.

• “What replication processes are started by the STRDG command” on page 199 describes which replication processes are started with each possible value of the Start processes (PRC) parameter. Both data groups configured for remote journaling and data groups configured for MIMIX source-send processing are addressed.

• “What replication processes are ended by the ENDDG command” on page 203 describes what replication processes are ended with each possible value for the End Options (PRC) parameter. Both data groups configured for remote journaling and data groups configured for MIMIX source-send processing are addressed.


Before starting replication

Consider the following:

• Before starting replication, the database files and objects to be replicated by a data group must be synchronized between the systems defined to the data group. For more information about performing the initial synchronization, see the MIMIX Administrator Reference book.

• If you are using the MIMIX for MQ function, you must use the procedures in the MIMIX for IBM WebSphere MQ book for initial synchronization and initial start of data groups that replicate data for IBM WebSphere MQ.

• Data groups that are in a disabled state are not started. Only data groups that have been enabled can be started.

Commands for starting replication

These commands start replication processes. The significant differences between these commands are:

Start MIMIX (STRMMX) – The STRMMX command will start all MIMIX processes in a MIMIX installation, including those used for replication, in a single operation regardless of how replication is configured. This is the preferred method of starting MIMIX. Optionally, this command can be used to start all MIMIX processes on the local system only.

Start Application Group (STRAG) – The STRAG command will start replication processes for data groups that are part of an application group. This is the preferred method of starting replication in application groups. The command invokes a procedure which performs the operations to start replication for the participating data groups.

Start Data Group (STRDG) – The STRDG command will start replication processes for a data group and the remote journal link, if necessary. This command is the basis for all other methods of starting replication. Optionally, this command can specify a starting point in the journals, clear any pending or error entries, set object auditing levels, and start a subset of the replication processes.
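As a simple illustration of how these commands might be entered from a command line, consider the following sketch. The installation library name MIMIXPRD is a placeholder, the three-part data group name follows the examples shown in the figures in this book, and the DGDFN parameter name is an assumption rather than something documented in this chapter:

MIMIXPRD/STRMMX
MIMIXPRD/STRDG DGDFN(EMP AS01 AS02)

The first request starts all MIMIX processes in the installation; the second starts the replication processes for a single data group.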

What is started with the STRMMX command

The STRMMX command is shipped with default values that will start all MIMIX processes on all systems in the installation. Optionally, the command can be used to start MIMIX processes on only the local system. Processes are started in the following order:

MIMIX managers and services - All jobs for the system managers, journal managers, target journal inspection, and collector services are started on the specified systems. If you are using MIMIX with IBM i clustering, Cluster Services are started for all specified systems that are configured for clustering.

Data groups - For enabled data groups, starts the replication processes, remote journal links, and automatic recovery processes on the specified systems. Each data group starts from the journal receiver in use when the data group ended and with the sequence number following the last sequence number processed.

Master monitor - Starts the master monitor on each of the specified systems.

Monitors - On each of the specified systems, the master monitor starts monitors that are not disabled and which are configured to start with the master monitor.

Application groups - If all systems are specified, all application groups and any associated data resource groups are started. If IBM i clustering is used, default processing will start the IBM application CRG.

Note: The STRMMX command does not start promoter group activity. Start promoter group activity using procedures in the Using MIMIX Promoter book.

STRMMX and ENDMMX messages

Once you have run the STRMMX or ENDMMX command, one of the following messages is displayed:

• Completion LVI0902 – This message indicates that all MIMIX products were started or ended successfully.

• Escape LVE0902 – This message indicates one or more MIMIX products failed to start or end.

What is started by the default START procedure for an application group

When an application group is created, a default procedure named START is created for it from a shipped default procedure. The Start Application Group (STRAG) command automatically uses the application group’s default START procedure unless you specify a different procedure.

Steps in the shipped default START procedure are described in the MIMIX Administrator Reference book.

Choices when starting or ending an application group

For the purpose of describing their use, the Start Application Group (STRAG) and End Application Group (ENDAG) commands are quite similar. This topic describes their behavior for application groups that do not participate in a cluster controlled by the IBM i operating system (*NONCLU application groups).

What is the scope of the request? The following parameters identify the scope of the requested operation:

Application group definition (AGDFN) - Specifies the requested application group. You can either specify a name or the value *ALL.

Resource groups (TYPE) - Specifies the types of resource groups to be processed for the requested application group.


Data resource group entry (DTARSCGRP) - Specifies the data resource groups to include in the request. The default is *ALL or you can specify a name. This parameter is ignored when TYPE is *ALL or *APP.

What is the requested behavior? The following parameters, when available, define the expected behavior:

Current node roles (ROLE) - Only available on the STRAG command, this parameter is ignored for non-cluster application groups.

What procedure will be used? The following parameters identify the procedure to use and its starting point:

Begin at step (STEP) - Specifies where the request will start within the specified procedure. This parameter is described in detail below.

Procedure (PROC) - Specifies the name of the procedure to run to perform the requested operation when starting from its first step. The value *DFT will use the procedure designated as the default for the application group. The value *LASTRUN uses the same procedure used for the previous run of the command. You can also specify the name of a procedure that is valid for the specified application group and type of request.

Where should the procedure begin? The value specified for the Begin at step (STEP) parameter on the request to run the procedure determines the step at which the procedure will start. The status of the last run of the procedure determines which values are valid.

The default value, *FIRST, will start the specified procedure at its first step. This value can be used when the procedure has never been run, when its previous run completed (*COMPLETED or *COMPERR), or when a user acknowledged the status of its previous run which failed, was canceled, or completed with errors (*ACKFAILED, *ACKCANCEL, or *ACKERR respectively).

Other values are for resolving problems with a failed or canceled procedure. When a procedure fails or is canceled, subsequent attempts to run the same procedure will fail until user action is taken. You will need to determine the best course of action for your environment based on the implications of the canceled or failed steps and any steps which completed.

The value *RESUME will start the last run of the procedure beginning with the step at which it failed, the step that was canceled in response to an error, or the step following where the procedure was canceled. The value *RESUME may be appropriate after you have investigated and resolved the problem which caused the procedure to end. Optionally, if the problem cannot be resolved and you want to resume the procedure anyway, you can override the attributes of a step before resuming the procedure.

The value *OVERRIDE will override the status of all runs of the specified procedure that did not complete. The *FAILED or *CANCELED status of each of these runs is changed to acknowledged (*ACKFAILED or *ACKCANCEL, respectively) and a new run of the procedure begins at the first step.
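For example, after investigating a failed run of an application group's START procedure, requests similar to the following hypothetical sketches could be used. The application group name PAYROLL is a placeholder, and the exact parameter combinations accepted in your environment should be verified by prompting the command:

• STRAG AGDFN(PAYROLL) PROC(*LASTRUN) STEP(*RESUME) – resumes the last run of the procedure at the step where it failed or was canceled.

• STRAG AGDFN(PAYROLL) PROC(*DFT) STEP(*OVERRIDE) – changes incomplete runs to acknowledged and starts a new run of the default procedure at its first step.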


For more information about starting a procedure with the step at which it failed, see “Resuming a procedure” on page 91.


What occurs when a data group is started

The Start Data Group (STRDG) command will start the replication processes for the specified data group.

The STRDG command can be used interactively or programmatically. Default values for the command are used when it is invoked by the STRMMX command or by the STRAG command running the default START procedure.

When a STRDG request is processed, MIMIX may take a few minutes while it does the following for each specified data group:

• Determines whether the RJ link is active and whether all required system managers and journal managers on each system are started. If necessary, the managers and the remote journal function defined by the RJ link are started.

• Determines the starting point for replication (database, object, or both, as configured).

• Locates the starting point in the appropriate journal receiver. This will be the starting point for send processes.

• If necessary, changes the object audit level of existing objects identified for replication. This occurs when starting following a switch or a configuration change to any data group object, IFS, or DLO entry. This ensures that all replicated objects identified by all entries of each entry type are set with an object audit level suitable for replication. The processing order for data group entries can affect the auditing value of IFS objects. For examples and for information about manually specifying the audit level of objects, see the MIMIX Administrator Reference book.

• Submits the appropriate start requests for the processes specified on the start request.

• Makes configuration changes for the data group effective. If a configuration change affects the set of objects to be replicated, the start request also automatically deploys the configuration changes to an internal list used by other functions. This may cause the start request to take longer.

• Attempts to recover any existing access path maintenance1 errors for the data group, if the Access path maintenance (APMNT) policy is enabled.

• If specified on the start request, clears all pending entries for apply processes and clears all error entries identified in replication processing for the data group. There are times when it is necessary to clear pending entries, error entries, or both, to establish a new synchronization point for the data group.

Starting a data group may take longer if the remote journal function is operating in catchup mode.

1. The access path maintenance function is available on installations running MIMIX 7.1.15.00 or higher. Access path maintenance replaces the parallel access path maintenance function available on installations running earlier software levels. On earlier software levels, a start data group request creates and activates the monitors used by the parallel access path maintenance function if the parallel access path maintenance (PRLAPMNT) policy is enabled.


Journal starting point identified on the STRDG request

On the STRDG command, you can optionally specify the point at which to start replication in the journal receivers. The parameters for database and object journal receivers and sequence numbers provide this capability. You may need to use these parameters when starting data groups for the first time.

• For user journal replication, the IBM i remote journal function controls where processing starts in the source journal receiver. The values specified for the Database journal receiver (DBJRNRCV) and Database large sequence number (DBSEQNBR2) identify the starting location for the database reader process and the database apply process.

• For system journal replication, the value specified for Object journal receiver (OBJJRNRCV) and Object large sequence number (OBJSEQNBR2) identify the starting location for the object send process and the object apply process.

Note: The parameters Database sequence number (DBSEQNBR) and Object sequence number (OBJSEQNBR) continue to be valid for journal definitions which specify *MAXOPT2 for the Receiver size option (RCVSIZEOPT) and for values that do not exceed 10 digits. To ensure continued compatibility, the use of parameters DBSEQNBR2 and OBJSEQNBR2 is recommended.
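As a hedged sketch only, a first-time start request that names explicit journal starting points might look like the following. The receiver names and sequence numbers are placeholders, and the DGDFN keyword and single-part data group name are assumptions; prompt the command to see the parameters as defined in your environment:

STRDG DGDFN(PAYDG) PRC(*ALL) DBJRNRCV(DBRCV0001) DBSEQNBR2(1) OBJJRNRCV(AUDRCV001) OBJSEQNBR2(1)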

Journal starting point when the object send process is shared

When starting data groups that share an object send process, the first data group to start will start the shared job at that data group’s starting point in the system journal (QAUDJRN). As additional data groups start, each recognizes that the shared object send job is active. The object send job determines whether the starting point for that data group is earlier or later than the sequence number being read. If the data group’s starting point is later, replication will begin when the shared job reaches the data group's starting point. If the data group’s starting point is earlier, the shared job completes its current block of entries, then returns to the earliest point for any of the data groups being started. The shared job reads the earlier entries and routes the transactions to the data group being started. When the shared job reaches the last entry it read at the time of the STRDG request, it resumes routing transactions to all active data groups using the shared job.

If the starting data group has a significant object send backlog, the other data groups sharing the job will not receive transactions to replicate while the backlog for the starting data group is being addressed. Therefore, when a significant backlog exists, it is recommended that you change the data group configuration to use a dedicated job (*DGDFN for object send prefix), start the data group, and allow it to catch up to the current location of the shared job. Then end the data group, change its configuration to use the desired shared job, and restart the data group.

Clear pending and clear error processing

The Clear pending and Clear error prompts on the STRDG command provide flexibility when starting a data group by allowing you to optionally reset error status conditions on data group file entries and discard pending journal entries that are stored in the journal log space. Clear pending resets the starting point for all data group file entries and object entries. Clear error clears the hold log spaces.

When clearing pending entries, you can optionally specify which system to use for determining database file network relationships when distributing files among database apply sessions. The System for DB file relations (DBRSYS) prompt identifies which system is used to assign data group file entries to apply sessions when the start request specifies to clear pending entries in all apply sessions.

Table 34 shows the processing that occurs based on the selection made for the Clear pending (CLRPND) and Clear error (CLRERR) prompts. The Clear pending and Clear error prompts work independently. For example, when CLRPND(*NO) is selected, no clear pending processing occurs.

Table 34. CLRPND and CLRERR processing

CLRPND(*NO) CLRERR(*NO) – Data groups start with regular processing:

• Data group file entry status remains unchanged.

• Hold logs remain unchanged.

CLRPND(*NO) CLRERR(*CLRPND) – The value selected for the CLRPND parameter is used for CLRERR. Same processing as CLRPND(*NO) CLRERR(*NO).

CLRPND(*NO) CLRERR(*YES) – See File entry states and Log spaces.

• Data group file entries in *HLDERR, *HLDRGZ, *HLDRNM, *HLDPRM, and *HLDRLTD status are cleared.

• Tracking entries in *HLDERR status are cleared.

• Hold log space is deleted.

CLRPND(*YES) CLRERR(*NO) – See File entry apply session assignment, Single apply session processing, and Log spaces.

Note: CLRPND(*YES) will not start a data group when there are open commit cycles on files defined to the data group.

• Data group file entries in *HLDRGZ, *HLDRNM, and *HLDPRM status are cleared and reset to active.

• Data group tracking entries in *HLDRNM are cleared and reset to active.

• Data group file entries and tracking entries in *HLDERR status remain unchanged.

• If there is a requested status at the time of starting, it is cleared.

• Journal, hold, tracking entry hold, and apply history log spaces are deleted.

• The apply session to which data group file entries are assigned may change.

CLRPND(*YES) CLRERR(*YES) – See File entry states, File entry apply session assignment, Single apply session processing, and Log spaces.

Note: CLRPND(*YES) will not start a data group when there are open commit cycles on files defined to the data group.

• Data group file entries in *HLDERR, *HLDRGZ, *HLDRNM, *HLDPRM, and *HLDRLTD status are cleared and reset to active.

• Data group file entries in *HLDRTY status remain unchanged.

• Data group object activity entries in any failed or active status are changed to CC (Completed by clear request).

• Tracking entries in *HLDERR and *HLDRNM status are cleared and reset to active.

• Tracking entries are primed if any configuration changes occurred for data group object entries or data group IFS entries.

• If there is a requested status at the time of starting, it is cleared.

• Journal, hold, and apply history log spaces are deleted.

• The apply session to which data group file entries are assigned may change.

CLRPND(*YES) CLRERR(*CLRPND) – The value selected for the CLRPND parameter is used for CLRERR. Same processing as CLRPND(*YES) CLRERR(*YES). See File entry states, File entry apply session assignment, Single apply session processing, and Log spaces.

Note: CLRPND(*YES) will not start a data group when there are open commit cycles on files defined to the data group.

File entry states: Files in specific states will not reset to active when you specify *YES on the Clear Error prompt. If you have set data group file entries to any of these states, the following process exception applies:

Note: The only states that can be set using the Set Data Group File Entry (SETDGFE) command are *HLD, *RLSWAIT, *ACTIVE, and *HLDIGN. All other states are the result of internal processing.

• *HLD - Journal entries cached before *YES is specified are discarded. If *ALL or *ALLSRC is specified on the Start processes prompt, all subsequent entries from the specified starting point will be cached again.

• *RLSWAIT - Journal entries are discarded as they wait for the synchronization point to arrive in the journal stream. This occurs regardless of the value specified for Clear Error or Clear Pending.

• *HLDIGN - Journal entries are discarded until the file status is changed to something else.

• *HLDSYNC - Journal entries are ignored since an external process is actively synchronizing the file. When that event completes normally, the file is set to *RLSWAIT.


File entry apply session assignment: Clear pending processing attempts to load balance the data group file entries among the defined apply sessions. If the requested apply session in the data group file entry definition is *ANY, or if it is *DGDFT and the requested apply session for the data group definition is *ANY, then the apply session to which the data group file entry is assigned may be changed when processing occurs. For data groups configured to replicate through the user journal, the requested apply session may be ignored to ensure that related files are handled by the same apply session.

The value specified for System for DB file relations (DBRSYS) determines the system used to determine database file relationships while assigning files to apply sessions. This parameter is evaluated only when the start request specifies to clear pending entries for all database apply sessions. The default value, *TGT, uses the target system to determine the file relationships.

Single apply session processing: In most situations, you will perform clear pending processing on all apply sessions belonging to a data group by specifying *ALL or *DBALL on the Start processes (PRC) prompt. MIMIX also supports the ability to perform clear pending processing on a single apply session, which is useful for recovery purposes in certain error situations. The System for DB file relations (DBRSYS) parameter is ignored when the start request specifies a specific apply session. To perform clear pending processing on a single apply session, specify PRC(*DBAPY) and the specific apply session (APYSSN).
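For example, a sketch of clear pending processing limited to a single apply session might look like the following; the data group name, the DGDFN keyword, and the apply session letter B are illustrative assumptions:

STRDG DGDFN(PAYDG) PRC(*DBAPY) APYSSN(B) CLRPND(*YES)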

Log spaces: Because they have not been applied, journal entries that exist in the journal log space are considered pending. Journal entries that exist in the hold log space, however, are considered in error. The Clear pending and Clear error prompts affect which log spaces are deleted (and recreated) when a data group is started.


Starting MIMIX


To start all MIMIX products within an installation library, do the following:

1. If you are starting MIMIX for the first time or starting MIMIX after a system IPL, do the following:

a. Use the command WRKSBSJOB SBS(MIMIXSBS) to verify that the MIMIX subsystem is running. If MIMIXSBS is not already active, start the subsystem using the STRSBS SBSD(MIMIXQGPL/MIMIXSBS) command.

b. If MIMIX uses TCP/IP for system communication, the TCP/IP servers must be running. If TCP/IP is not already active, start TCP/IP using the port number defined in the transfer definitions and the procedures described in “Starting the TCP/IP server” on page 260.

2. Do one of the following:

• From the MIMIX Basic Main Menu, select option 2 (Start MIMIX) and press Enter.

• From a command line type STRMMX and press Enter.

3. The Start MIMIX (STRMMX) display appears. Accept the default value for the System definition prompt and press Enter.

4. If you see a confirmation display, press Enter to start MIMIX.

Starting an application group


For an application group, a procedure for only one operation (start, end, or switch) can run at a time. For information about parameters and shipped procedures, see “What is started by the default START procedure for an application group” on page 172 and “Choices when starting or ending an application group” on page 172.

To start an application group, do the following:

1. From the Work with Application Groups display, type 9 (Start) next to the application group you want and press F4 (Prompt).

2. Verify that the values you want are specified for Resource groups and Data resource group entry.

3. If you are starting after addressing problems with the previous start request, specify the value you want for Begin at step. Be certain that you understand the effect the value you specify will have on your environment.

4. Press Enter.

5. The Procedure prompt appears. Do one of the following:

• To use the default start procedure, press Enter.

• To use a different start procedure for the application group, specify its name. Then press Enter.

Starting selected data group processes

This procedure can be used to do any of the following:

• Start all or selected processes for a data group, or start a specific database apply process

• Specify a starting point for journal receivers when starting a data group

• Clear pending and error entries when starting a data group

Data groups that are in an application group: The preferred method of starting data groups that are part of an application group is to use the Start Application Group (STRAG) command. Beginning with service pack 7.1.06.00, the default behavior of the STRDG command helps to enforce this best practice when necessary by not allowing the command to run when the data group is participating in a resource group with three or more nodes. (A data resource group provides the association between one or more data groups and an application group.) The STRDG request will run when the data group is participating in a resource group with two nodes. In earlier software levels, default behavior does not allow a start request when the data group is part of an application group.

In application group environments with three or more nodes, it is particularly important to treat all members of an application group as one entity. For example, a configuration change that is made effective by starting and ending a single data group would not be propagated to the other data groups in the same resource group. However, the same change would be propagated to the other data groups if it is made effective by ending and starting the parent application group.

When to clear pending entries and entries in error: Table 35 identifies when it is necessary to clear pending entries for apply processes and clear logs of entries indicating files in error to establish a new synchronization point when starting a data group. The reason for starting the data group determines whether you need to clear only pending entries for transactions waiting to be applied, clear only errors, or both.

Before clearing pending entries, determine if there are any file entries on hold. These are the transactions that will be lost by clearing pending entries.

When clearing pending entries, most environments can accept the default value for the System for DB file relations prompt. If necessary, you can specify a value when directed to by your MIMIX administrator.


For additional information about the STRDG command, refer to the following topics:

• “What occurs when a data group is started” on page 174

• “What replication processes are started by the STRDG command” on page 199

To start a data group, do the following:

1. From the Work with Data Groups display, type a 9 (Start DG) next to the data group that you want to start and press Enter.

The Start Data Group (STRDG) display appears.

2. At the Start processes prompt, specify the value for the processes you want to start. If you are starting the data group for the first time, specify *ALL. To see a list of values, press F4 (Prompt).

3. Press Enter.

4. Additional prompts appear. For most situations, you should accept the default values. If necessary, specify the following:

• At the Database journal receiver and Database large sequence number prompts, identify where the database reader and apply processes begin.

• At the Object journal receiver and Object large sequence number prompts, identify where the object send and apply processes begin.

• If you are starting the data group for any of the reasons listed in Table 35, specify the indicated values for that reason in the Clear pending and Clear error prompts.

• If you are submitting this command for batch processing, you should specify *NO for the Show confirmation screen prompt.

5. To start the data group, press Enter.

Table 35. When to clear pending entries and entries in error when starting a data group

Clear pending entries (specify *YES for the Clear pending prompt) when starting the data group in either of these conditions:

• After enabling a previously disabled data group

• After changing the Number of DB apply sessions (NBRDBAPY) parameter on the data group definition

Clear pending entries and entries in error (specify *YES for the Clear pending prompt, and *CLRPND or *YES for the Clear error prompt) when starting the data group in either of these conditions:

• After synchronizing database files and objects between two systems
Note: This assumes that you have synchronized the objects and database files and have changed the journal receivers using TYPE(*ALL) on the CHGDGRCV command.

• After switching the direction of the data group, when starting replication on the system that now becomes the source system
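For example, after synchronizing and changing the journal receivers as described in Table 35, a start request that clears both pending entries and entries in error might look like the following sketch; the data group name and the DGDFN keyword are assumptions:

STRDG DGDFN(PAYDG) PRC(*ALL) CLRPND(*YES) CLRERR(*CLRPND)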


Starting replication when open commit cycles exist

Open commit cycles may be present when a data group ends, or if a system event or failure occurred.

In most conditions, an open commit cycle present at the time that a data group ended will not prevent a request to start replication from running. However, MIMIX will prevent the data group from starting when either of these conditions exists:

• When the start request specifies to clear pending entries. Certain procedures may require a clear pending start. Message LVE387F is issued with reason code AP.

• When the commit mode specified for the database apply process changed. Changing the commit mode is not a common occurrence. Message LVEC0B3 is issued.

When these conditions exist, the open commit cycles must be resolved.

Checking for open commit cycles

Do the following to check for open commit cycles:

1. From the MIMIX Basic Main Menu, type a 6 (Work with data groups) and press Enter.

2. The Work with Data Group display appears. Type an 8 (Display status) next to the data group you ended and press Enter.

3. Press F8 (Database) to view the Data Group Detail Status display.

4. For each apply session listed, check the value shown in the Open Commit column at the right side of the display. If the value is *YES, open commit cycles exist for the data group.

Resolving open commit cycles

This procedure assumes that the data group is ended and that you have confirmed the presence of open commit cycles.

1. Start the data group, specifying *NO for the Clear pending prompt.

2. You must take action to resolve the open commit cycles, such as ending or quiescing the application or closing the commit cycle. MIMIX will process the open commit cycles when they are resolved.

3. Perform a controlled end of the data group.

4. When the data group is ended, check for open commit cycles again.

You may need to repeat this procedure until all open commit cycles have been resolved.
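At the command level, steps 1 and 3 of this procedure might look like the following hypothetical sketch; the data group name and the DGDFN keyword are assumptions:

• STRDG DGDFN(PAYDG) PRC(*ALL) CLRPND(*NO) – start the data group without clearing pending entries (step 1).

• ENDDG DGDFN(PAYDG) PRC(*ALL) ENDOPT(*CNTRLD) – perform the controlled end after the open commit cycles are resolved (step 3).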


Before ending replication

Consider the following:

• If the next time you start the data groups requires that you clear pending entries, or if you will be performing a switch, you should verify that no activity is still in progress before you perform these activities. Use the command WRKDGACTE STATUS(*ACTIVE) to ensure that all activity entries have completed.

• Data groups that are in a disabled state are not ended. Only data groups that have been enabled and have been started can be ended.

Commands for ending replication

These commands end replication processes. The significant differences between these commands are:

• End MIMIX (ENDMMX) - The ENDMMX command will end all MIMIX processes in a MIMIX installation, including those used for replication, in a single operation. Optionally, this command can be used to end all MIMIX processes on the local system only.

• End Application Group (ENDAG) - The ENDAG command will end replication processes for data groups that are part of an application group. This is the preferred method of ending replication in application groups. The command invokes a procedure which performs the operations to end replication for the participating data groups.

• End Data Group (ENDDG) - The ENDDG command will end the specified replication processes for the data group either immediately or in a controlled manner. This command is the basis for all other methods of ending replication, and is also called by commands that perform switch operations. Optionally, this command can end a subset of replication processes or a selected database apply process, specify a wait time and end option for controlled ends, and end the remote journal link.
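The following hypothetical command strings illustrate the three commands just described. The application group name PAYROLL and data group name PAYDG are placeholders, and the DGDFN keyword shown for ENDDG is an assumption; other parameters are left at their shipped defaults.

• ENDMMX – ends all MIMIX processes in the installation, ending data groups in a controlled manner by default.

• ENDAG AGDFN(PAYROLL) – ends replication for the data groups participating in application group PAYROLL using its default END procedure.

• ENDDG DGDFN(PAYDG) PRC(*ALL) ENDOPT(*CNTRLD) – performs a controlled end of all replication processes for a single data group.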

Command choice by reason for ending replication

Table 36 lists common reasons for ending MIMIX activity and the appropriate command to use. Depending on why you are ending replication, you may need to choose values other than the defaults.

Table 36. Choosing the appropriate command to end replication

• Ending communications for any reason – Use ENDMMX.

• Performing a full save and restore of data that is defined to MIMIX – Use ENDMMX.

• Performing a save from the source system – Use ENDAG or ENDDG. When application groups are used, use the ENDAG command with its default END procedure. See “What is ended by the default END procedure for an application group” on page 189. For ENDDG, specify *ALL for the Process (PRC) parameter. See “What replication processes are ended by the ENDDG command” on page 203. The save request may not be able to save all the files or objects if they are opened or locked by MIMIX.

• Performing a save from the target system – If using step programs and procedures, run ENDTGT; or use ENDDG PRC(*ALLTGT). See “Ending all or selected processes” on page 187. You may be able to end only selected processes on the target system. See “Ending selected data group processes” on page 198. The save request may not be able to save all the files or objects if they are opened or locked by MIMIX.

• Preparing to update MIMIX software – Use ENDMMX. See controlled end information in “Ending immediately or controlled” on page 186.

• Performing an IPL of either system – Use ENDMMX. Also end the RJ link.

• Upgrading the operating system release on either system – Use ENDMMX. Also end the RJ link.

• Performing a switch in preparation for performing maintenance on either system – Let your switching mechanism end replication (the switch procedure for the application group, MIMIX Switch Assistant, or MIMIX Model Switch Framework).

• Ending only a selected replication process – Use ENDDG. See “Ending selected data group processes” on page 198.

• Changing configuration, such as adding or changing data group entries – Use ENDAG or ENDDG. When application groups are used, use the ENDAG command. The changes are not available to active replication processes until the data group processes are ended and restarted.


Additional considerations when ending replication

The following questions will help you determine additional options you may need when ending replication. All methods of ending replication can accomplish these activities, but in some, the action is not default or may require additional programming.

• Do processes need to end in a controlled manner, or can they be ended immediately? Both commands support these options. For more information, see “Ending immediately or controlled” on page 186.

• Do you need to end only a subset of the replication processes? Only ENDDG supports ending selected processes. For more information, see “Ending all or selected processes” on page 187.

• Does the RJ link also need to end? For data groups that use remote journaling you may also choose whether to end the RJ link. In most cases, the RJ link can remain active. For more information, see “When to end the RJ link” on page 188.

Ending immediately or controlled

Both ENDMMX and ENDDG commands provide the ability to choose whether replication processes end immediately or in a controlled manner through the End process (ENDOPT) parameter.

For the ENDAG command, the specified end procedure determines whether replication processes end immediately or in a controlled manner. If the procedure specifies a controlled end, the procedure also determines wait time and time out options.

When you perform an immediate end, the processes end independently of each other. For example, it is possible for the apply process to end before the send or receive process. Each replication process verifies that its processing is at a point that will permit ending, then ends. The amount of time it takes for an immediate end varies depending on the delay values set for each manager and what each process is doing at the time. An immediate end does not ensure that all journal entries generated are sent to or applied on the target system.

If an incomplete IFS or object tracking entry for a data group is being processed during an immediate end, the entire entry may not be applied. When the data group is restarted, the entire incomplete entry is rewritten to ensure the integrity of the object.

When you perform a controlled end, MIMIX creates either a journal entry or log space entry. This entry proceeds through the replication path. The date and time of the entry are compared to the date and time of when the process being considered was started. If the entry is earlier than the process start time, the end request is ignored. If the entry is later than when the process being considered was started, the process is ended.

A controlled end ensures that processes end in order and that each process completes any queued or in-progress transactions before the next process is permitted to end. This ensures that you have a known point in each journal at which you can restart replication.


If any processes have a backlog of entries, it may take some time for the entry created by the request to be processed through the replication path. Any entries that precede the entry requesting to end are processed first.

A data group that is ended in a controlled manner is prepared for a more effective and safer start when the start request specifies to clear pending entries. The existence of commit cycles implies that there is application activity on the source system that should not be interrupted; replication should be allowed to continue through the end of the commit cycle. It is preferable to ensure that commit cycles are resolved or removed before ending a data group. There are conditions in which a data group will not start if open commit cycles exist. For more information, see “Starting replication when open commit cycles exist” on page 183.

If the request to perform a controlled end also includes ending the RJ link, the RJ link is ended after all requested processes end.

Either type of end request may be ignored if the request is submitted just before the time that MIMIX jobs are restarted daily. For more information about restarting jobs, see ‘Configuring restart times for MIMIX jobs’ in the MIMIX Administrator Reference book.

Controlling how long to wait for a controlled end to complete

On the ENDMMX or ENDDG command, when you request a controlled end you can determine how long to wait for all specified data group processes to end. The Wait time (seconds) (WAIT) parameter specifies how long to wait for all of the specified data group processes to end. MIMIX will attempt to resolve all pending activity entries before ending the data groups. If a numeric value was specified, and the selected processes do not end within the specified time, the action specified for the Timeout option (TIMOUTOPT) will occur.

The WAIT parameter also supports special values of *SBMRQS and *NOMAX. When these values are used, the TIMOUTOPT parameter is ignored.

Note: If *ALL is specified for any part of the data group definition, the Wait time value must be *SBMRQS (submit request).
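For example, a hedged sketch of a controlled end that waits up to five minutes and then ends the remaining processes immediately might look like the following; the data group name and the DGDFN keyword are assumptions:

ENDDG DGDFN(PAYDG) PRC(*ALL) ENDOPT(*CNTRLD) WAIT(300) TIMOUTOPT(*ENDIMMED)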

Ending all or selected processes

MIMIX determines which data group replication processes to end based on the command specified and options on the command.

The ENDMMX command ends all replication processes for all data groups on the systems specified on the end request.

The default END procedure for the ENDAG command uses the default settings of the ENDDG command. MIMIX also ships an ENDTGT procedure that, when specified on the ENDAG command, will end only processes on the target system.

Only the ENDDG command supports the ability to end selected replication processes through its Process (PRC) parameter. The default value is to end all replication processes for the specified data groups. The configuration of each data group determines which processes end with each possible value for the PRC parameter. If you choose to use this parameter, be sure that you understand what processes will end. See “What replication processes are ended by the ENDDG command” on page 203.

When to end the RJ link

The RJ link remains active unless you change the value of the End remote journaling (ENDRJLNK) parameter on the ENDMMX command or the ENDDG command.

The RJ link can normally remain active unless you have a need to prevent data from being sent to the target system. Some situations where you need to end the RJ link include:

• Following a switch, to prevent data from returning to the system on which it originated (round-tripping), and to reduce communications and DASD usage

• Before performing an IPL on either the source system or target system

• Before upgrading the IBM i release on either the source system or the target system

• Before performing a hardware upgrade

The default END procedure for the ENDAG command uses the default values of the ENDDG command. MIMIX also ships a step program, MXENDRJLNK, that can be added into the END procedure if necessary.
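For example, before an IPL you might end a data group and its RJ link in one request, as in the following sketch; the data group name and the DGDFN keyword are assumptions:

ENDDG DGDFN(PAYDG) PRC(*ALL) ENDOPT(*CNTRLD) ENDRJLNK(*YES)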

What is ended by the ENDMMX command

The ENDMMX command will end all MIMIX processes needed for replication on the specified systems in the installation. If you are using application groups, the application group is not specifically ended, and the associated end procedure will not be run. Any processes for user applications or IBM cluster resource groups must be ended separately. When you use this command, the following occurs:

Data groups - The end process specified is used to end all enabled data groups and their supporting processes, including automatic recovery, on the specified systems. This includes data groups associated with data resource groups. Default values end data groups in a controlled manner.

Remote journal links - If you selected to end remote journaling, all remote journal links associated with the specified systems are ended.

MIMIX managers and services - Ends the system managers, journal managers, target journal inspection, and collector services on the specified systems.

Monitors - Ends all individual monitors currently active in the installation library on the specified systems.

Master monitor - Ends the master monitor on each of the specified systems.

MIMIX Promoter - Ends promoter group activity on the specified systems.

Audits and Recoveries - All queued audits, all audits in progress, and all recoveries in progress that are associated with the specified systems are ended. This includes jobs with locks on the installation library. Queued audits are set to *NOTRUN and audits in comparison phase are set to *FAILED. Audits in recovery phase reflect their state of processing at the time of the end request, which may be *NOTRCVD.

Note: Cluster services is not ended when MIMIX managers end because cluster services may be necessary for other applications.

What is ended by the default END procedure for an application group

When an application group is created, a default procedure named END is created for it from a shipped default procedure. The End Application Group (ENDAG) command automatically uses the application group’s default END procedure unless you specify a different procedure.

Steps in the shipped default END procedure, as well as steps in additional shipped procedures that end application groups, are described in the MIMIX Administrator Reference book.


What occurs when a data group is ended

The End Data Group (ENDDG) command will end replication processes for the specified data group.

The ENDDG command can be used interactively or programmatically. This command is invoked by the ENDMMX command and by the ENDAG command running the default END procedure, using values other than the defaults for some parameters.

When an ENDDG request is processed, MIMIX may take a few minutes while it does the following for each specified data group:

• Determines which data group replication processes to end based on the value you specify for the Process (PRC) parameter. The default value ends all MIMIX replication processes.

• When ending data groups that use a shared object send job, the job is ended by the last data group to end.

• When ending data groups that perform access path maintenance1, the database apply process signals the access path maintenance job and then ends. The access path maintenance job uses additional jobs, if needed, to change the access path maintenance attribute to immediate on all files that MIMIX had previously changed to delayed. Any files that could not be changed are identified as having an access path maintenance error before the maintenance jobs end.

• Ends the specified replication processes in the manner specified for the End process (ENDOPT) parameter. The command defaults to processing the end request immediately (*IMMED). When invoked by the ENDMMX command, the default value specified on ENDMMX is *CNTRLD, which takes precedence. When invoked by a procedure specified on the ENDAG command, the procedure determines whether ENDDG is passed parameter values or uses the command defaults.

• Uses the specified Wait time and Timeout options if a controlled end is requested.

• If requested, ends the RJ link. The RJ link is not automatically ended. In most cases, the default value *NO for the End remote journaling (ENDRJLNK) parameter is appropriate. Keeping the RJ link active allows database changes to continue to be sent to the target system even though the data group is not active.

• If you have used the MIMIX CDP feature to set a recovery point in a data group and then end the data group, the recovery point will be cleared. When the data group is started again, the apply processes will process any available transactions, including those which may have had corruptions. (Recovery points are set with the Set DG Recovery Point (SETDGRCYP) command.) If a recovery window is configured for the data group, its configured duration is not affected by requests to end or start the data group.

• On installations running software earlier than 7.1.15.00, if the parallel access path maintenance function has been enabled, the End parallel AP maintenance (PRLAPMNT) parameter determines whether MIMIX will end the monitors used by this function when the data group ends. (The PRLAPMNT parameter is not available on installations running MIMIX 7.1.15.00 or higher.) The default value, *DFT, will end the monitors when the value specified for Processes (PRC) includes database processes that run on the target system (*ALL, *ALLTGT, *DBALL, *DBTGT, or *DBAPY) and the value *ALL is specified for the Apply session (APYSSN) parameter.

The ENDDG command does not end the system manager, journal manager, or other processes that run at the node level. To end those processes, either use the ENDMMX command or use the End MIMIX Managers (ENDMMXMGR) command after replication processes have ended.

1. The access path maintenance function is available on installations running MIMIX 7.1.15.00 or higher and is the replacement for the parallel access path maintenance function in earlier software levels.


Ending MIMIX

For most configurations, it is recommended that you end MIMIX products from the management system, which is usually the backup system. If your installation is configured so that the backup system is a network system, you should end MIMIX from the network system.

Notes:

• If you are ending MIMIX for a software upgrade or to install a service pack, use the procedures in the software’s ReadMe document.

• The ENDMMX command cannot run when application groups are configured and there are any active, failed, or canceled procedures.

To end MIMIX, use the following procedures:

1. Use one of the following procedures:

• “Ending with default values” on page 192

• “Ending by prompting the ENDMMX command” on page 192

2. Complete any needed follow-up actions using the information and procedures in “After you end MIMIX products” on page 193.

Ending with default values

Use this procedure to end all MIMIX products in an installation library.

1. From the MIMIX Basic Main Menu, select option 3 (End MIMIX) and press Enter. You will see a confirmation display.

2. From the confirmation display, you can press F1 (Help) to see a description of the default values that will be used. To end MIMIX, press Enter.

Ending by prompting the ENDMMX command

To end all MIMIX processes for the specified systems within an installation library, do the following:

1. From a command line, type ENDMMX and press F4 (Prompt).

2. The End MIMIX display appears. At the End process prompt, specify *CNTRLD for a controlled end or *IMMED for an immediate end. This parameter applies to the application group (ENDAG) and data group (ENDDG) processes only.

Note: When ENDMMX ends data groups, it waits for each data group to end before attempting to end the next MIMIX product.

3. At the End remote journaling prompt, specify whether you want to end remote journaling.

Note: If you specify *YES, all data groups using the remote journal link in the installation library will be affected. If other data groups are using the same remote journal link, you should specify *NO.

4. If you specified *CNTRLD in Step 2, ensure that the values for the Wait time (seconds) and Timeout option prompts are what you want for the controlled end.

5. At the System definition prompt, indicate the scope of the request by specifying either *ALL or *LOCAL. This determines the systems on which to end MIMIX processes.

6. To end MIMIX processes, press Enter.

After you end MIMIX products

Some pending transactions may not be handled before the end process completes. You may need to ensure that all activity entries are complete before you issue additional commands. Examples of scenarios where it is important to check whether all pending transactions are completed include:

• Switching a data group (SWTDG command)

• Starting a data group with clear pending entries (STRDG CLRPND(*YES)).

To check for active entries, use the command WRKDGACTE STATUS(*ACTIVE).

When to also end the MIMIX subsystem - You will also need to end the MIMIX subsystem when you need to IPL the system, when upgrading MIMIX software, and when installing a MIMIX software service pack. The MIMIX subsystem must be ended from the 5250 emulator. To end the subsystem, do the following:

1. If you use MIMIX Availability Manager to monitor earlier releases of MIMIX, do the following:

a. Ensure that all users have logged out of MIMIX Availability Manager.

b. From the 5250 emulator, enter LAKEVIEW/ENDMMXAM.

2. Enter the command WRKSBS. The Work with Subsystems display appears.

3. Type an 8 (Work with subsystem jobs) next to subsystem MIMIXSBS and press Enter.

4. End any remaining jobs in a controlled manner. Type a 4 (End) next to the job and press F4 (Prompt). The How to end (OPTION) parameter should have a value of *YES. Press Enter. If you see a confirmation display, press Enter to continue.

5. Press F12 (Cancel) to return to the Work with Subsystems display.

6. Type a 4 (End subsystem) next to subsystem MIMIXSBS and press Enter.


Ending an application group


For an application group, a procedure for only one operation (start, end, or switch) can run at a time. For information about parameters and shipped procedures, see “What is ended by the default END procedure for an application group” on page 189 and “Choices when starting or ending an application group” on page 172.

To end an application group, do the following:

1. From the Work with Application Groups display, type 10 (End) next to the application group you want and press F4 (Prompt).

2. Verify that the values you want are specified for Resource groups and Data resource group entry.

3. If you are starting the procedure after addressing problems with the previous end request, specify the value you want for Begin at step. Be certain that you understand the effect the value you specify will have on your environment.

4. Press Enter.

5. The Procedure prompt appears. Do one of the following:

• To use the default end procedure, press Enter.

• To use a different end procedure for the application group, specify its name. Then press Enter.

Ending a data group in a controlled manner

The following procedures describe how to check for errors before requesting a controlled end of a data group, how to perform the controlled end request, and how to confirm that the end completed. Held files must be released and the apply process must complete operations for journal entries stored in log spaces before you end data group activity.

Data groups that are in an application group: The preferred method of ending data groups that are part of an application group is to use the End Application Group (ENDAG) command.

Preparing for a controlled end of a data group

It is good practice to ensure that errors are resolved before requesting a controlled end of a data group.

Do the following:

1. From the Work with Data Groups display, type an 8 (Display status) next to the data group you want to end and press Enter.

2. The Data Group Status display appears. In the upper right of the display, you should see either one or both of the following fields. A non-zero value in these fields will not prevent the end request from completing.

• Database errors identifies the number of items replicated through the user (database) journal that have a status of *HLDERR. This number should be 0 before you end the data group.

• Object in error/active identifies two key statistics associated with objects replicated through the system journal. The first number identifies the number of objects that have a status of *FAILED and the second number identifies the number of objects with active (pending) activity entries. Both numbers should be 0 before you end the data group.

Note: Only information for the type of information replicated by the data group appears on the status displays. For example, if the data group does not contain database files, you will only see fields for object information.

3. For data groups which replicate from the user journal, you also need to check for any files that are held for other reasons. Press F8 (Database). The Held for other reasons field in the upper right of the Data Group Database Status display should also be 0 before you end the data group.

A non-zero value may or may not prevent the end request from completing. For more information, see “Working with files needing attention (replication and access path errors)” on page 210.

Performing the controlled end

1. From the Work with Data Groups display, type a 10 (End DG) next to the data group you want to end and press Enter.

2. The End Data Group (ENDDG) display appears. Specify *CNTRLD for the End process prompt.

3. If the data group uses remote journaling, verify that the value of the End remote journaling prompt is what you want.

4. Because you specified *CNTRLD in Step 2, you can also use the Wait Time (WAIT) parameter to specify how long MIMIX should try to end the selected processes in a controlled manner. Use F1 (Help) to see additional information about the possible options.

• Specify *SBMRQS to submit a request to end the data groups. The appropriate actions are issued to end the specified processes and control is returned to the caller immediately. When you specify this value, the TIMOUTOPT parameter (Step 5) is ignored.

• Specify *NOMAX. When you specify this value, MIMIX will wait until all specified MIMIX processes are ended.

• Specify a numeric value (number-of-seconds). MIMIX waits the specified time for a controlled end to complete before using the option specified in the TIMOUTOPT parameter.

5. If you specified a numeric value for the WAIT parameter in Step 4, you can also use the Timeout Option (TIMOUTOPT) parameter. You can specify what action you want the ENDDG command to perform if the time specified in the WAIT parameter is reached:

• The current process should quit and return control to the caller (*QUIT).

• A new request should be issued to end all processes immediately (*ENDIMMED). When this value is specified, pending activity entries may still exist after the data group processes are ended.

• An inquiry message should be sent to the operator notifying of a possible error condition (*NOTIFY). If you specify this value, the command must be run from the target system.

6. Press Enter to process the command.

Confirming the end request completed without problems

After you request a controlled end of a data group, the Work with Data Group display appears. Do the following:

1. From the Work with Data Group display, type an 8 (Display status) next to the data group you ended and press Enter.

2. The Data Group Status display appears. In the Target Statistics section near the middle of the display, the Unprocessed Entry Count column should be blank for any database apply processes and any object apply processes. If unprocessed entries exist when you end the data group and perform a switch, you may lose these entries when the data group is started following the switch.

Note: To ensure that you are aware of any possible pending or delayed activity entries, enter the WRKDGACTE STATUS(*ACTIVE) command. Any activities that are still in progress will be listed. Ensure that all activities are completed.


3. Ensure that there are no open commit cycles. The next attempt to start the data group will fail if open commit cycles exist and either the start request specified to clear pending entries (CLRPND(*YES)) or the commit mode specified in the data group definition changed. (Certain processes, such as performing a hardware upgrade with a disk image change, converting to MIMIX Dynamic Apply, or enabling a disabled data group, require a clear pending start.) To verify commit cycles, do the following:

a. Press F8 (Database) to view the Data Group Detail Status display.

b. For each apply session listed, verify that the value shown in the Open Commit column at the right side of the display is *NO.

c. If open commit cycles exist, restart the data group. You must take action to resolve the open commit cycles, such as ending or quiescing the application or closing the commit cycle. Then repeat the controlled end again.


Ending selected data group processes


This procedure can be used to end all or selected processes for a data group, or end a specific database apply process.

Data groups that are in an application group: The preferred method of ending data groups that are part of an application group is to use the End Application Group (ENDAG) command. Beginning with service pack 7.1.06.00, the default behavior of the ENDDG command helps to enforce this best practice when necessary by not allowing the command to run when the data group is participating in a resource group with three or more nodes. (A data resource group provides the association between one or more data groups and an application group.) The ENDDG request will run when the data group is participating in a resource group with two nodes. In earlier software levels, default behavior does not allow an end request when the data group is part of an application group.

In application group environments with three or more nodes, it is particularly important to treat all members of an application group as one entity. For example, a configuration change that is made effective by starting and ending a single data group would not be propagated to the other data groups in the same resource group. However, the same change would be propagated to the other data groups if it is made effective by ending and starting the parent application group.

For additional information about the ENDDG command, refer to the following topics:

• “What occurs when a data group is ended” on page 190

• “What replication processes are ended by the ENDDG command” on page 203

To selectively end processes for a data group, do the following:

1. From the Work with Data Groups display, type a 10 (End DG) next to the data group that you want to end and press Enter.

2. The End Data Group (ENDDG) display appears. At the Process prompt, specify the value for the processes you want to end. To see a list of values, press F4 (Prompt).

3. At the End process prompt, specify the value you want.

4. If the data group uses remote journaling, verify that the value of the End remote journaling prompt is what you want.

5. If you want to end only a selected apply session, press F10 (Additional parameters). Then specify the value for the session you want to end at the Apply session prompt.

6. To end the selected processes, press Enter.
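The same request can be entered directly on a command line. A minimal sketch, assuming the data group is identified on a DGDFN parameter (the name shown is hypothetical) and that *NO is the value used to leave remote journaling active:

   ENDDG DGDFN(MYAPP SYSTEMA SYSTEMB) PRC(*DBAPY) ENDRJLNK(*NO) APYSSN(A)

This would end only database apply session A for the data group and leave the remote journal link active.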

What replication processes are started by the STRDG command

MIMIX determines how each data group is configured and starts the appropriate replication processes based on the value you specify for the Start processes (PRC) parameter. Default configuration values create data groups that use MIMIX Remote Journal support (MIMIX RJ support) for database replication and source-send technology for object replication.

Table 37 identifies the processes that are started when MIMIX RJ support is used for database replication for each of the possible values on the PRC parameter. An RJ link identifies the IBM i remote journal function, which transfers data to the target system. On the target system, the data is processed by the MIMIX database reader (DBRDR) before the database apply process (DBAPY) completes replication.

For data groups that use MIMIX RJ support, it is standard practice to leave the RJ link active when the data groups are ended. If the RJ link is not already active when starting data groups, MIMIX starts the RJ link when the value specified for the PRC parameter includes database source system processes or all processes. The RJ Link column in Table 37 shows the result of each process when the RJ link is not active, while the Notes column identifies behavior that may not be anticipated when the RJ link is already active.

Table 37. Processes started by data groups configured for MIMIX Remote Journal support. This assumes that all replication processes are inactive when the STRDG request is made. Source processes are the RJ link and the object replication processes OBJSND, OBJRTV, CNRSND, and STSRCV; target processes are the database replication processes DBRDR and DBAPY and the object replication processes OBJRCV, CNRRCV, STSSND, and OBJAPY.

Value for PRC  Notes    RJ Link(1)   OBJSND       OBJRTV       CNRSND       STSRCV       DBRDR        DBAPY(2)     OBJRCV       CNRRCV       STSSND       OBJAPY
*ALL           E        Starts(1)    Starts       Starts       Starts       Starts       Starts       Starts       Starts       Starts       Starts       Starts
*ALLSRC        A, E     Starts(1)    Starts       Starts       Starts       Starts       Inactive     Inactive     Starts       Starts       Starts       Inactive
*ALLTGT        A, B     Inactive(1)  Inactive     Inactive     Inactive     Inactive     Starts       Starts       Inactive     Inactive     Inactive     Starts
*DBALL         A, E     Starts(1)    Inactive(3)  Inactive(3)  Inactive(3)  Inactive(3)  Starts       Starts       Inactive(3)  Inactive(3)  Inactive(3)  Inactive(3)
*OBJALL        A, C     Inactive(1)  Starts       Starts       Starts       Starts       Inactive(4)  Inactive(4)  Starts       Starts       Starts       Starts
*DBSRC         A, C, E  Starts(1)    Inactive(3)  Inactive(3)  Inactive(3)  Inactive(3)  Inactive     Inactive     Inactive(3)  Inactive(3)  Inactive(3)  Inactive(3)
*DBTGT         A, B     Inactive(1)  Inactive(3)  Inactive(3)  Inactive(3)  Inactive(3)  Starts       Starts       Inactive(3)  Inactive(3)  Inactive(3)  Inactive(3)
*OBJSRC        A, C     Inactive(1)  Starts       Starts       Starts       Starts       Inactive(4)  Inactive(4)  Starts       Starts       Starts       Inactive
*OBJTGT        A, C     Inactive(1)  Inactive     Inactive     Inactive     Inactive     Inactive(4)  Inactive(4)  Inactive     Inactive     Inactive     Starts
*DBRDR         A, D     Inactive(1)  Inactive(3)  Inactive(3)  Inactive(3)  Inactive(3)  Starts       Inactive     Inactive(3)  Inactive(3)  Inactive(3)  Inactive(3)
*DBAPY         A, C     Inactive(1)  Inactive(3)  Inactive(3)  Inactive(3)  Inactive(3)  Inactive(4)  Starts(4)    Inactive(3)  Inactive(3)  Inactive(3)  Inactive(3)

Notes:

A. Data groups which use cooperative processing should have both database and object processes started to prevent objects and data on the target system from becoming not fully synchronized.

B. When the RJ link is already active, database replication becomes operational.

C. When the RJ link is already active, database journal entries continue to transfer to the target system over the RJ link.

D. When the RJ link is already active, database journal entries continue to transfer to the target system over the RJ link, where they will be processed by the DBRDR.

E. If data group data area entries are configured, the data area polling process also starts when values which start database source processes are selected.

1. This column shows the effect of the specified value on the RJ link when the RJ link is not active. See the Notes for the effect of values when the RJ link is already active, which is default behavior.

2. If the access path maintenance (APMNT) policy has been enabled at the installation or data group level, an access path maintenance job is also started. Access path maintenance is available on installations running 7.1.15.00 or higher.

3. These object replication processes are not available in data groups configured for database-only replication.

4. These database replication processes are not available in data groups configured for object-only replication.

Optionally, data groups can use source-send technology instead of remote journaling for database replication. Data groups created on earlier levels of MIMIX may still be configured this way. Table 38 identifies the processes that are started by each value for Start processes when source-send technology is used for database replication. The MIMIX database send (DBSND) process and database receive (DBRCV) process replace the IBM i remote journal function and the DBRDR process, respectively.

Table 38. Processes started by data groups configured for Source Send replication. This assumes that all replication processes are inactive when the STRDG request is made. Source processes are DBSND and the object replication processes OBJSND, OBJRTV, CNRSND, and STSRCV; target processes are the database replication processes DBRCV and DBAPY and the object replication processes OBJRCV, CNRRCV, STSSND, and OBJAPY.

Value for PRC  Notes  DBSND(1)     OBJSND       OBJRTV       CNRSND       STSRCV       DBRCV        DBAPY(2)     OBJRCV       CNRRCV       STSSND       OBJAPY
*ALL           -      Starts(1)    Starts       Starts       Starts       Starts       Starts       Starts       Starts       Starts       Starts       Starts
*ALLSRC        A      Starts(1)    Starts       Starts       Starts       Starts       Starts       Inactive     Starts       Starts       Starts       Inactive
*ALLTGT        A      Inactive     Inactive     Inactive     Inactive     Inactive     Inactive     Starts       Inactive     Inactive     Inactive     Starts
*DBALL         A      Starts(1)    Inactive(3)  Inactive(3)  Inactive(3)  Inactive(3)  Starts       Starts       Inactive(3)  Inactive(3)  Inactive(3)  Inactive(3)
*OBJALL        A      Inactive(4)  Starts       Starts       Starts       Starts       Inactive(4)  Inactive(4)  Starts       Starts       Starts       Starts
*DBSRC         A      Starts(1)    Inactive(3)  Inactive(3)  Inactive(3)  Inactive(3)  Starts       Inactive     Inactive(3)  Inactive(3)  Inactive(3)  Inactive(3)
*DBTGT         A      Inactive     Inactive(3)  Inactive(3)  Inactive(3)  Inactive(3)  Inactive     Starts       Inactive(3)  Inactive(3)  Inactive(3)  Inactive(3)
*OBJSRC        A      Inactive(4)  Starts       Starts       Starts       Starts       Inactive(4)  Inactive(4)  Starts       Starts       Starts       Inactive
*OBJTGT        A      Inactive(4)  Inactive     Inactive     Inactive     Inactive     Inactive(4)  Inactive(4)  Inactive     Inactive     Inactive     Starts
*DBRDR(5)      -      -            Inactive(3)  Inactive(3)  Inactive(3)  Inactive(3)  -            -            Inactive(3)  Inactive(3)  Inactive(3)  Inactive(3)
*DBAPY         A      Inactive(4)  Inactive(3)  Inactive(3)  Inactive(3)  Inactive(3)  Inactive(4)  Starts(4)    Inactive(3)  Inactive(3)  Inactive(3)  Inactive(3)

Notes:

A. Data groups which use cooperative processing should have both database and object processes started to prevent objects and data on the target system from becoming not fully synchronized.

1. When the database send (DBSND) process starts, the data area polling process also starts.

2. If the access path maintenance (APMNT) policy has been enabled at the installation or data group level, an access path maintenance job is also started. Access path maintenance is available on installations running 7.1.15.00 or higher.

3. These object replication processes are not available in data groups configured for database-only replication.

4. These database replication processes are not available in data groups configured for object-only replication.

5. The database reader (*DBRDR) process is not used by data groups configured for source-send replication.
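Selected processes can also be started from a command line with the STRDG command. A minimal sketch, using a hypothetical data group name on an assumed DGDFN parameter and one of the PRC values from the tables above:

   STRDG DGDFN(MYAPP SYSTEMA SYSTEMB) PRC(*DBALL)

For a data group that uses MIMIX RJ support, this starts the RJ link (if it is not already active) and the DBRDR and DBAPY processes while leaving the object replication processes inactive.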

What replication processes are ended by the ENDDG command

MIMIX determines how each data group is configured and ends the appropriate replication processes based on the value you specify for the Process (PRC) parameter. Default configuration values create data groups that use MIMIX Remote Journal support (MIMIX RJ support) for database replication and source-send technology for object replication.

Table 39 identifies the processes that are ended by each value for PRC when MIMIX RJ support is used for database replication. An RJ link identifies the IBM i remote journal function, which transfers data to the target system. On the target system, the data is processed by the MIMIX database reader (DBRDR) before the database apply process (DBAPY) completes replication.

The communications defined by the RJ link remain active and are not affected by any value for PRC. In most cases, leaving the RJ link active is preferable. If necessary, you can end the RJ link by changing the value for End remote journaling (ENDRJLNK parameter). "When to end the RJ link" on page 188 describes when you need to end the RJ link.

Table 39. Processes ended by data groups configured for MIMIX Remote Journal support. This assumes that all replication processes are active when the ENDDG request is made and that the request does not specify to end the RJ link. Source processes are the RJ link and the object replication processes OBJSND, OBJRTV, CNRSND, and STSRCV; target processes are the database replication processes DBRDR and DBAPY and the object replication processes OBJRCV, CNRRCV, STSSND, and OBJAPY.

Value for PRC  Notes    RJ Link(1)  OBJSND     OBJRTV     CNRSND     STSRCV     DBRDR      DBAPY(2)   OBJRCV     CNRRCV     STSSND     OBJAPY
*ALL           E        Active(1)   Ends       Ends       Ends       Ends       Ends       Ends       Ends       Ends       Ends       Ends
*ALLSRC        A, E     Active(1)   Ends       Ends       Ends       Ends       Active     Active     Ends       Ends       Ends       Active
*ALLTGT        -        Active(1)   Active     Active     Active     Active     Ends       Ends       Active     Active     Active     Ends
*DBALL         B, E     Active(1)   Active(3)  Active(3)  Active(3)  Active(3)  Ends       Ends       Active(3)  Active(3)  Active(3)  Active(3)
*OBJALL        A, B     Active(1)   Ends       Ends       Ends       Ends       Active(4)  Active(4)  Ends       Ends       Ends       Ends
*DBSRC         A, B, E  Active(1)   Active(3)  Active(3)  Active(3)  Active(3)  Active     Active     Active(3)  Active(3)  Active(3)  Active(3)
*DBTGT         B        Active(1)   Active(3)  Active(3)  Active(3)  Active(3)  Ends       Ends       Active(3)  Active(3)  Active(3)  Active(3)
*OBJSRC        A, B     Active(1)   Ends       Ends       Ends       Ends       Active(4)  Active(4)  Ends       Ends       Ends       Active
*OBJTGT        A, B     Active(1)   Active     Active     Active     Active     Active(4)  Active(4)  Active     Active     Active     Ends
*DBRDR         B, C     Active(1)   Active(3)  Active(3)  Active(3)  Active(3)  Ends       Active     Active(3)  Active(3)  Active(3)  Active(3)
*DBAPY         B, D     Active(1)   Active(3)  Active(3)  Active(3)  Active(3)  Active     Ends       Active(3)  Active(3)  Active(3)  Active(3)

Notes:

A. Has no effect on database-only replication. New database journal entries continue to transfer to the target system over the RJ link, where they will be processed.

B. Data groups that use cooperative processing may be affected by the result of this value. Ending database processes while object processes remain active may result in object activity entries being placed on hold. Similarly, ending object processes while database processes remain active may result in files being placed on hold due to error.

C. New database journal entries continue to transfer to the target system over the RJ link. Existing entries stored in the log space on the target system before the end request was processed will be applied.

D. New database journal entries continue to transfer to the target system over the RJ link, where they will be processed by the DBRDR.

E. The data area polling process ends when values which end database source processes are specified.

1. The RJ link is not ended by the End options (PRC) parameter. New database journal entries continue to transfer to the target system over the RJ link. See the Notes column for additional details.

2. On installations running 7.1.15.00 or higher, if access path maintenance is enabled, the database apply process signals the access path maintenance job and then ends. The access path maintenance job uses additional jobs, if needed, to change the access path maintenance attribute to immediate on all files that MIMIX had previously changed to delayed. Any files that could not be changed are identified as having an access path maintenance error before the maintenance jobs end. On installations running software earlier than 7.1.15.00, if the parallel access path maintenance function is enabled, the associated monitors are also ended when the ENDDG command specifies *DFT for End parallel AP maintenance (PRLAPMNT) and *ALL for the Apply session (APYSSN). When *YES is specified for PRLAPMNT, the function is always ended regardless of the values specified for PRC or APYSSN.

3. These object replication processes are not available in data groups configured for database-only replication.

4. These database replication processes are not available in data groups configured for object-only replication.

Optionally, data groups can use source-send technology instead of remote journaling for database replication. Data groups created on earlier levels of MIMIX may still be configured this way. Table 40 identifies the processes that are ended by each value for End options when source-send technology is used for database replication. The MIMIX database send (DBSND) process and database receive (DBRCV) process replace the IBM i remote journal function and the DBRDR process, respectively.

Table 40. Processes ended by data groups configured for Source Send replication. This assumes that all replication processes are active when the ENDDG request is made. Source processes are DBSND and the object replication processes OBJSND, OBJRTV, CNRSND, and STSRCV; target processes are the database replication processes DBRCV and DBAPY and the object replication processes OBJRCV, CNRRCV, STSSND, and OBJAPY.

Value for PRC  Notes  DBSND(1)   OBJSND     OBJRTV     CNRSND     STSRCV     DBRCV      DBAPY(2)   OBJRCV     CNRRCV     STSSND     OBJAPY
*ALL           -      Ends(1)    Ends       Ends       Ends       Ends       Ends       Ends       Ends       Ends       Ends       Ends
*ALLSRC        -      Ends(1)    Ends       Ends       Ends       Ends       Ends       Active     Ends       Ends       Ends       Active
*ALLTGT        -      Active     Active     Active     Active     Active     Active     Ends       Active     Active     Active     Ends
*DBALL         A      Ends(1)    Active(3)  Active(3)  Active(3)  Active(3)  Ends       Ends       Active(3)  Active(3)  Active(3)  Active(3)
*OBJALL        A      Active(4)  Ends       Ends       Ends       Ends       Active(4)  Active(4)  Ends       Ends       Ends       Ends
*DBSRC         A      Ends(1)    Active(3)  Active(3)  Active(3)  Active(3)  Ends       Active     Active(3)  Active(3)  Active(3)  Active(3)
*DBTGT         A      Active     Active(3)  Active(3)  Active(3)  Active(3)  Active     Ends       Active(3)  Active(3)  Active(3)  Active(3)
*OBJSRC        A      Active(4)  Ends       Ends       Ends       Ends       Active(4)  Active(4)  Ends       Ends       Ends       Active
*OBJTGT        A      Active(4)  Active     Active     Active     Active     Active(4)  Active(4)  Active     Active     Active     Ends
*DBRDR(5)      -      -          Active(3)  Active(3)  Active(3)  Active(3)  -          -          Active(3)  Active(3)  Active(3)  Active(3)
*DBAPY         A      Active(4)  Active(3)  Active(3)  Active(3)  Active(3)  Active(4)  Ends(4)    Active(3)  Active(3)  Active(3)  Active(3)

Notes:

A. Data groups that use cooperative processing may be affected by the result of this value. Ending database processes while object processes remain active may result in object activity entries being placed on hold. Similarly, ending object processes while database processes remain active may result in files being placed on hold due to error.

1. When the database send (DBSND) process ends, the data area polling process also ends.

2. On installations running 7.1.15.00 or higher, if access path maintenance is enabled, the database apply process signals the access path maintenance job and then ends. The access path maintenance job uses additional jobs, if needed, to change the access path maintenance attribute to immediate on all files that MIMIX had previously changed to delayed. Any files that could not be changed are identified as having an access path maintenance error before the maintenance jobs end. On installations running software earlier than 7.1.15.00, if the parallel access path maintenance function is enabled, the associated monitors are also ended when the ENDDG command specifies *DFT for End parallel AP maintenance (PRLAPMNT) and *ALL for the Apply session (APYSSN). When *YES is specified for PRLAPMNT, the function is always ended regardless of the values specified for PRC or APYSSN.

3. These object replication processes are not available in data groups configured for database-only replication.

4. These database replication processes are not available in data groups configured for object-only replication.

5. The database reader (*DBRDR) process is not used by data groups configured for source-send replication.
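When the RJ link itself must also be ended, the end request can change the End remote journaling value. A minimal sketch, with a hypothetical data group name on an assumed DGDFN parameter and an assumed *YES value for ENDRJLNK:

   ENDDG DGDFN(MYAPP SYSTEMA SYSTEMB) PRC(*ALL) ENDRJLNK(*YES)

Review "When to end the RJ link" on page 188 before using such a request, since leaving the RJ link active is usually preferable.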


CHAPTER 11 Resolving common replication problems

Occasionally, a journaled transaction for a file or object may fail to replicate. User intervention is required to correct the problem. This chapter provides procedures to help you resolve problems that can occur during replication processing.

The following topics are included in this chapter:

• “Working with message queues” on page 208 describes how to use the MIMIX primary and secondary message queues from a 5250 emulator.

• “Working with the message log” on page 209 describes how to access the MIMIX message log from either user interface.

• “Working with user journal replication errors” on page 210 includes topics for how to resolve a file that is held due to an error. It also includes topics about options for placing a file on hold and releasing held files.

• “Working with tracking entries” on page 219 describes how to use tracking entries to resolve replication errors for IFS objects, data areas, or data queues that are replicated cooperatively with the user journal. It also includes topics about options for placing a tracking entry on hold and releasing held tracking entries.

• “Working with objects in error” on page 224 describes how to resolve objects in error by working with the data group activities used for system journal replication. This topic includes information about how to retry failed activity entries and how to determine whether MIMIX is automatically attempting to retry an activity.

• “Removing data group activity history entries” on page 229 describes how to manually remove completed entries for system journal replication activity. This may be necessary if you need to conserve disk space.


Working with message queues

You can access the MIMIX primary and secondary message queues to display messages or manage the list of messages.

Do the following to access a MIMIX message queue:

1. Type the command DSPMMXMSGQ and press F4 (Prompt).

2. Specify either *PRI or *SEC to access the message queue you want and press Enter.

3. The Display MIMIX Message Queue display appears listing all of the current messages. To view all of the information for a message, place the cursor on the message you want and press Enter.

You can also use the function keys on this display to perform several message-related tasks. Refer to the help text (F1 key) for information about these function keys.

Note: The MIMIX primary and secondary message queues are defined for each system definition. You can control the severity and type of messages to be sent to each message queue through parameters on the system definition.
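If you already know which queue you want, the value can be supplied without prompting. This is a minimal sketch; the parameter keyword shown here (MSGQ) is an assumption and may differ on your installation, so prompting with F4 is the safer route.

   DSPMMXMSGQ MSGQ(*PRI)

This would display the MIMIX primary message queue; substitute *SEC for the secondary queue.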


Working with the message log

The MIMIX message log provides a common location for you to see all messages related to MIMIX products. A consolidated list of messages for all systems in the installation library is available on the management system.

Note: The target system only shows messages that occurred on the target system.

LVI messages are informational messages and LVE messages are error or diagnostic messages. CPF messages are generated by an underlying operating system function and may be passed up to the MIMIX product.

Do the following to access the MIMIX message log:

1. Do one of the following to access the message log display:

• From the MIMIX Basic Main Menu, select option 13 (Work with messages) and press Enter.

• From the MIMIX Intermediate Main Menu, select option 3 (Work with messages) and press Enter.

2. The Work with Message Log display appears with a list of the current messages. The initial view shows the message ID and text.

3. Press F11 to see additional views showing the message type, severity, the product and process from which it originated, whether it is associated with a group (for MIMIX, a data group), and the system on which it originated.

4. You can subset the messages shown on the display. A variety of subsetting options are available that allow you to manage the message log more efficiently.

5. To work with a message, type the number of the option you want and press Enter. The following options are available:

• 4=Remove - Use this option if you want to delete a message. When you select this option, a confirmation display appears. Verify that you want to delete the messages shown and press Enter. The message is deleted only from the local system.

• 5=Display message - Use this option to view the full text of the first level message and gain access to the second level text.

• 6=Print - Use this option to print the information for the message.

• 8=Display details - Use this option to display details for a message log entry including its from and to program information, job information, group information, product, process, originating system, and call stack information.

• 9=Related messages - Use this option to display a list of messages that relate to the selected message. Related messages include a summary and any detail messages immediately preceding it. This can be helpful when you have a large message log list and you want to show the messages for a certain job.

• 12=Display job - If job information exists on the system, you can use this option to access job information for a message log entry. The Work with Jobs display appears from which you can select options for displaying specific information about the job.


Working with user journal replication errors

MIMIX reports user journal replication errors for files as status on the associated data group file entry. This status is also reported at the data group level in a consolidated form.

File replication problems are categorized as follows:

Held due to error - If a journal transaction is not replicated successfully, the file entry is placed in *HLDERR status. This indicates a problem that must be resolved.

Held for other reasons - File entries can also be placed in a variety of other held statuses by user action or by MIMIX. Generally, these statuses are also considered problems; some are transitional conditions that resolve automatically while others require user action. To determine if there are files on hold for other reasons, use the procedure in “Working with the detailed status of data groups” on page 105.

For information about resolving problems with IFS objects and library-based objects that are replicated by user journal, see “Working with tracking entries” on page 219.

Working with files needing attention (replication and access path errors)

The DB Errors column on the Work with Data Groups display identifies the number of errors for user journal replication. Specifically, this column identifies the sum of the number of database files, IFS, *DTAARA, and *DTAQ objects on hold due to errors (*HLDERR) plus the number of LF and PF files that have access path maintenance1 failures for a data group. Data group file entries and tracking entries should not be left in *HLDERR state for any extended time. Access path maintenance errors occur when MIMIX could not change a file’s access path maintenance attribute back to immediate.

To access a list of files in error for a data group, do the following:

1. From the MIMIX Basic Main Menu select option 6 (Work with data groups) and press Enter.

2. The Work with Data Groups display appears. Type 12 (Files needing attention) next to the data group that has errors identified in the DB Errors column and press Enter.

3. The Work with DG File Entries display appears with a list of file entries for the data group that have replication errors, access path maintenance2 errors, or both. Do the following:

a. The initial view shows the current replication status of file entries. Any entry with a status of *HLD, *HLDERR, *HLDIGN or *HLDRLTD indicates that action is required. Use Table 41 to identify choices based on the file entry status.

1. Errors for the access path maintenance function are included on installations running MIMIX 7.1.15.00 or higher.

2. Access path maintenance errors can only be reported on data group file entries in installations running MIMIX 7.1.15.00 or higher.


Note: MIMIX retains log spaces for file entries with these statuses so that the journal entries that are being held can be released and applied to the target system. File entries should not be left in these states for an extended period.

b. Use Table 41 to identify choices based on the file entry status and Table 42 to identify available options from this display.

c. If necessary, take action to prevent the error from happening again. Refer to the following topics:

• “Correcting file-level errors” on page 216

• “Correcting record-level errors” on page 217

4. Press F10 as needed on the Work with DG File entries display until you see the access path maintenance view. The AP Maint. Status column identifies any AP maintenance errors for a file with the value *FAILED and failures for logical files associated with a file as *FAILEDLF.

Immediate action may not be necessary because MIMIX will attempt to retry access path maintenance when the data group ends and when it is restarted. To attempt an immediate retry, use option 40 (Retry AP maintenance).

Table 41. Possible actions based on replication status of a file entry

Status Preferred Action1

*ACTIVE Unless an error has occurred, no action is necessary. Entries in the user journal for the file are replicated and applied. If necessary, any of the options to hold journal entries can be used.

*HLD User action is required to release the file entry (option 26) so that held journal entries from the user journal can be applied to the target system.

*HLDERR User action is required. Attempt to resolve the error by synchronizing the file (option 16).

Note: Transactions and hold logs are discarded for file entries with a status of *HLDERR and an error code of IG. Such a file must be synchronized.

*HLDIGN User action is required to either synchronize the file (option 16) or to change the configuration if you no longer want to replicate the file. Journal entries for the file are discarded. Replication is not occurring and the file may not be synchronized.

Depending on the circumstances, Release may also be an option.

*HLDRGZ, *HLDRNM, *HLDPRM, *HLDSYNC These are transitional states that should resolve to *ACTIVE. If these statuses persist, check the journaling status for the entry. MIMIX retains log spaces for the held journal entries for the duration of these temporary hold requests.


*HLDRTY The file entry is held because an entry could not be applied due to a condition which required waiting on some other condition (such as in-use). After a short delay, the database apply job will automatically attempt to process this entry again. The preferred action is to allow MIMIX to periodically retry the file entry. By default, the database apply job will automatically attempt to process the entry every 5 minutes for up to 1 hour.

Manually releasing the file entry will cause MIMIX to attempt to process the entry immediately.

*HLDRLTD User action is required for a file in the same network. View the related files (option 35). A file that is related due to a dependency, such as a constraint or a materialized query table, is held. Resolving the problem for the related held file will resolve this status.

*RLSWAIT The file is waiting to be released by the DB apply process and will be changed to *ACTIVE. If the status does not change to *ACTIVE, check the journaling status. If this status persists, you may need to synchronize (option 16).

*CMPACT, *CMPRLS, *CMPRPR These are transitional states that should resolve automatically. The file entry represents a member that is being processed cooperatively between the CMPFILDTA command and the database apply process.

1. Evaluate the cause of the problem before taking any action.

Table 42. Options for working with file entries from the Work with DG File Entries display

Option Additional Information

9=Start journaling See “Starting journaling for physical files” on page 235.

10=End journaling See “Ending journaling for physical files” on page 236.

11=Verify journaling See “Verifying journaling for physical files” on page 237.

16=Sync DG file entry

See topic ‘Synchronizing database files’ in the MIMIX Administrator Reference book.

20=Work with file error entries

See topic “Working with journal transactions for files in error” on page 213.

23=Hold file See topic “Placing a file on hold” on page 214.

24=Ignore file See topic “Ignoring a held file” on page 214.

25=Release wait See topic “Releasing a held file at a synchronization point” on page 215.

26=Release See topic “Releasing a held file” on page 215.

27=Release clear See topic “Releasing a held file and clearing entries” on page 216.

31=Repair member data Available for entries with a status of *HLDERR that identify a member. See topic ‘Comparing and repairing file data - members on hold (*HLDERR)’ in the MIMIX Administrator Reference book.

35=Work with related files Displays file entries that are related to the selected file by constraints or by other dependencies such as materialized query tables.

40=Retry AP maintenance Retries access path maintenance operations on the target system for the selected file. This option is only valid on data group file entries that have an access path maintenance status of *FAILED or *FAILEDLF.

Working with journal transactions for files in error

When resolving problems for a file that is in *HLDERR state, a MIMIX administrator may find it useful to examine the journal entries that are being held by MIMIX.

Although you can determine why a file is in error from either the source or target system, to view the actual journal entries, you must be on the target system. If you attempt to view the journal entries from the source system, MIMIX will indicate that you are on the incorrect system to view the information.

Do the following:

1. From the subsetted list of files in error for a data group on the Work with DG File Entries display, type 20 (Work with file error entries) next to the file entry you want and press Enter.

2. The Work with DG FE on Hold display appears. A variety of information about the transaction appears on the display.

Note: The values shown in the Sequence number column may be truncated if the journal supports *MAXOPT3 for the receiver size and the journal sequence number value exceeds the available display field. When truncation is necessary, the most significant digits (left-most) are omitted. Truncated journal sequence numbers are prefixed by '>'. The First journal sequence number field displays the full sequence number of the first item displayed in the list.

a. Locate the transaction that caused the file to be placed on hold. Use the Position to field to position the list to a specific sequence number.

b. Select the option (Table 43) you want to use on the journal transaction:

Table 43. Options available from the Work with DG FE on Hold display.

2=Change You can change the contents or characteristics of the journal entry. Use this option with caution. Any changes can affect the validity of data in the journal entry.

4=Delete You can delete the journal entry.

5=Display You can display details for the specified journal entry associated with the data group file entry in question.

9=Immediate apply You can immediately apply a transaction that has caused a file to go on hold. The entry you selected is immediately applied to the file outside of the apply process. If the apply is successful, the error/hold entry that was applied is removed from the error/hold log. However, if the apply fails, a message is issued and the entry remains in the error/hold log. This process does not release the file; it only applies the selected entry.

Placing a file on hold

Use this procedure to hold any journal entries for a file identified by a data group file entry. Avoid leaving a file entry on hold for any extended period.

File entries with a status of *ACTIVE, *HLDRGZ, *HLDRNM, *HLDPRM, *HLDSYNC, *HLDRLTD, and *RLSWAIT can be placed on hold.

The request changes the file entry status to *HLD. Any journal entries for the associated file are replicated but not applied. If the file is being processed by an active apply session, suspending the update process can take a short time to complete. You will receive a message when the file is held. MIMIX retains log spaces containing any replicated journal entries in anticipation that the file entry will be released. When the file is released, the accumulated journal entries will be applied. The *HLD status remains until additional action is taken.

Do the following:

1. From the MIMIX Basic Main Menu select option 6 (Work with Data Groups) and press Enter.

2. The Work with Data Groups display appears. Type 17 (File entries) next to the data group you want and press Enter.

3. The Work with DG File Entries display appears. Type 23 (Hold file) next to the entry you want and press Enter.

Ignoring a held file

Use this procedure to ignore any journal entries for a file identified by a data group file entry. The request changes the file entry status to *HLDIGN. Any journal entries for the associated file, including any hold logs, are discarded. The *HLDIGN status remains until additional action is taken.

Note: Be certain that you want to use the ignore feature. Any ignored transactions cannot be retrieved. You must replace the object on the target system with a current version from the source system.

If a file has been on hold for a long time or you expect that it will be, the amount of storage used by the error/hold log space can be quite large. If you anticipate that you will need to save and restore the file or replace it for any other reason, it may be best to just ignore all current transactions.

Do the following:

1. From the MIMIX Basic Main Menu select option 6 (Work with Data Groups) and press Enter.

2. The Work with Data Groups display appears. Type 17 (File entries) next to the data group you want and press Enter.

3. The Work with DG File Entries display appears. Type 24 (Ignore file) next to the entry you want and press Enter.

The status of the file is changed to *HLDIGN. The file entry is ignored. Journal entries for the file entry, including any hold logs, are discarded.

Releasing a held file at a synchronization point

Use this procedure to wait for a synchronization point to release any held journal entries for a file identified by a data group file entry, then resume replication.

The request changes the file entry status to *RLSWAIT. Any journal entries for the associated file are discarded until a File member saved (F-MS) journal entry or a Start of save of a physical file member using save-while-active function (F-SS) is encountered. This is the synchronization point. The file entry status is then changed to *ACTIVE and all journal entries that were held after the synchronization point are applied.

If the F-MS or F-SS journal entry is not in the log space, the file entry remains in *RLSWAIT status. If you are unsure as to how many save requests might accumulate for an object, you can synchronize the file associated with the file entry. The entry status will become *ACTIVE.

To wait for a synchronization point before releasing a held file, do the following:

1. From the MIMIX Basic Main Menu select option 6 (Work with Data Groups) and press Enter.

2. The Work with Data Groups display appears. Type 17 (File entries) next to the data group you want and press Enter.

3. The Work with DG File Entries display appears. Type 25 (Release wait) next to the entry you want and press Enter.

Releasing a held file

Use this procedure to immediately release any held journal entries for a file identified by a data group file entry with a status of *HLD and resume replication.

The request changes the file entry status to *ACTIVE. Any held journal entries for the associated file are applied. Normal replication of the file resumes.

While a file is being released, the appropriate apply session suspends its operations on other files. This allows the released file to catch up to the current level of processing. If a file or member has been on hold for a long time, this can be lengthy.


Do the following to immediately release a held file or file member:

1. From the MIMIX Basic Main Menu select option 6 (Work with Data Groups) and press Enter.

2. The Work with Data Groups display appears. Type 17 (File entries) next to the data group you want and press Enter.

3. The Work with DG File Entries display appears. Type 26 (Release) next to the entry you want and press Enter

Releasing a held file and clearing entries

Use this procedure to clear any held journal entries for a file identified by a data group file entry, then resume replication.

The request changes the file entry status to *ACTIVE. Any held journal entries for the associated file are discarded. Journal entries received after the file entry status became *ACTIVE are applied, resuming normal replication.

If a file entry is on hold and its associated file has been synchronized in such a way that the held entries already exist in the restored file, this procedure will ensure that those entries are not re-applied. This procedure will not work if the file is being actively updated on the source system.

Do the following to release a held file and clear any journal entries that were replicated but not applied:

1. From the MIMIX Basic Main Menu select option 6 (Work with Data Groups) and press Enter.

2. The Work with Data Groups display appears. Type 17 (File entries) next to the data group you want and press Enter.

3. The Work with DG File Entries display appears. Type 27 (Release clear) next to the entry you want and press Enter

Correcting file-level errors

Typically, file-level errors can be categorized as one of the following:

• A problem with the configuration of files defined for replication.

• A discrepancy in the file descriptions between the management and network systems

• An operational error.

This topic identifies the most common file-level errors and measures that you can take to prevent the problem from recurring. See also “Correcting record-level errors” on page 217.

Once you diagnose and correct a file-level error, the problem rarely manifests itself again. Some of the most common file-level errors are:

• Authority: The MIMIXOWN user profile defined in the MIMIX job description does not have authority to perform a function on the target system. You can prevent this problem by ensuring that the MIMIXOWN user profile has all object authority (*ALLOBJ). This guarantees that the user profile has all the necessary authority to run IBM i commands and has the ability to access the library and files on the management system. Refer to the Using License Manager book for more information about the MIMIXOWN user profile and authority.

• Object existence or corruption: MIMIX cannot run a function against a file on the target system because the file or a supporting object (such as a logical file) does not exist or has become damaged. System security is the only way to prevent an object from being accidentally deleted from the target system. Make sure that only the correct personnel have the ability to remove objects from the target system where replicated data is applied. Also, ensure that application programs do not delete files on the target system when there are no apply sessions running.

• MIMIX subsystem ended: If the MIMIX subsystem is ended in an immediate mode while MIMIX processes are still active, files may be placed in a “Held” status. This is a result of MIMIX being unable to complete a transaction normally. After MIMIX is restarted, you only need to release the affected files.

Correcting record-level errors

Record-level errors occur when MIMIX updates or attempts to update a file and the feedback from the update process indicates a discrepancy between the files on the management and network system. Record-level errors can usually be traced back to problems with one of the following:

• The system

• Unique application environments, such as System 36 code running in native IBM i.

• Operational errors.

This section describes the most common record-level errors.

Record written in error

MIMIX DB Replicator was able to write the record on the target system; however, it wrote to the wrong relative record number. In most situations, the IBM i database function writes a new record to the end of a file. MIMIX did so, but it did not match the relative record number of the sending system. Usually this error occurs when transactions (journal entries) are skipped on the send system. Common reasons why records are written in error include the following:

• Journaling was ended: When journaling is ended, transaction images are not being collected. If users update the files while journaling is not running, no journal entries are created and MIMIX DB Replicator has no way of replicating the missing transactions. The best way to prevent this error is to restrict the use of the Start Journaling Physical File (STRJRNPF) and End Journaling Physical File (ENDJRNPF) commands (see the example following this list).

• User journal replication was restarted at the wrong point: When you change the starting point of replication for a data group, it is imperative that transactions are not skipped.


• Apply session restarted after a system failure: This is caused when the target system experiences a hard failure. MIMIX always updates its user spaces with the last updated and sent information. When a system fails, some information may not be forced to disk storage. The data group definition parameter for database apply processing determines how frequently to force data to disk storage. When the apply sessions are restarted, MIMIX may attempt to rewrite records to the target system database.

• Unable to write/update a record: This error is caused when MIMIX cannot access a record in a file. This is usually caused when there are problems with the logical files associated with the file or when the record does not exist. The best way to prevent this error is to make sure that replication is started in the correct position. This error can also be due to one of the problems listed in topic “Correcting file-level errors” on page 216.

• Unable to delete a record: This is caused when MIMIX is trying to delete a record that does not exist or has a corrupted logical file associated with the physical file. This error can also be due to one of the problems listed in topic “Correcting file-level errors” on page 216.
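One way to restrict the STRJRNPF and ENDJRNPF commands mentioned above is with standard IBM i object authority. This is only a sketch of one possible approach, not a MIMIX requirement; adjust the user profiles and authority values to your site's policies.

   GRTOBJAUT OBJ(QSYS/STRJRNPF) OBJTYPE(*CMD) USER(*PUBLIC) AUT(*EXCLUDE)
   GRTOBJAUT OBJ(QSYS/ENDJRNPF) OBJTYPE(*CMD) USER(*PUBLIC) AUT(*EXCLUDE)

Excluding *PUBLIC from these commands helps prevent journaling from being ended or restarted outside of controlled procedures; operators who legitimately need them can still be granted authority individually.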


Working with tracking entries

Tracking entries identify library-based objects (data areas and data queues) and IFS objects configured for cooperative processing (advanced journaling).

You can access the following displays to work with tracking entries in any status:

• Work with DG IFS Trk. Entries display (WRKDGIFSTE command)

• Work with DG Obj. Trk. Entries display (WRKDGOBJTE command)

These displays provide access for viewing status and working with common problems that can occur while replicating objects identified by IFS and object tracking entries.
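For example, the IFS tracking entry display could be reached directly from a command line. This is a minimal sketch that assumes the data group is identified on a DGDFN parameter; the three-part name shown is hypothetical.

   WRKDGIFSTE DGDFN(MYAPP SYSTEMA SYSTEMB)

The equivalent request for data areas and data queues would use the WRKDGOBJTE command with the same data group identification.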

Held tracking entries: Status for the replicated objects is reported on the associated tracking entries. If a journal transaction is not replicated successfully, the tracking entry is placed in *HLDERR status. This indicates a problem that must be resolved.

Tracking entries can also be placed in *HLD or *HLDIGN status by user action. These statuses are reported as ‘held for other reasons’ and also require user action.

When a tracking entry has a status of *HLD or *HLDERR, MIMIX retains log spaces so that journal entries that are being held can be released and applied to the target system. Tracking entries should not be left in these states for an extended period.

Additional information: To determine if a data group has any IFS objects, data areas, or data queues configured for advanced journaling, see “Determining if non-file objects are configured for user journal replication” on page 271.

When working with tracking entries, especially for IFS objects, you should be aware of the information provided in “Displaying long object names” on page 262.

Accessing the appropriate tracking entry display

To access IFS tracking entry or object tracking entry displays for a data group, do the following:

1. From the MIMIX Basic Main Menu select option 6 (Work with Data Groups) and press Enter.

2. The Work with Data Groups display appears.

3. Next to the data group you want, type the number for the option you want and press Enter. Table 44 shows the options for tracking entries.

Table 44. Tracking entry options on the Work with Data Groups display

Select Option Result

50=IFS trk entries Lists all IFS tracking entries for the selected data group on the Work with DG IFS Trk. Entries display.

51=IFS trk entries not active Lists IFS tracking entries for the selected data group with inactive status values (*HLD, *HLDERR, *HLDIGN, *HLDRNM, and *RLSWAIT) on the Work with DG IFS Trk. Entries display.

52=Obj trk entries Lists all object tracking entries for the selected data group on the Work with DG Obj. Trk. Entries display.

53=Obj trk entries not active Lists object tracking entries for the selected data group with inactive status values (*HLD, *HLDERR, *HLDIGN, and *RLSWAIT) on the Work with DG Obj. Trk. Entries display.

4. The tracking entry display you selected appears. Significant capability is available for addressing common replication problems and journaling problems. Do the following:

a. Use F10 to toggle between views showing status, journaling status, and the database apply session in use.

b. Any entry with a status of *HLD, *HLDERR or *HLDIGN indicates that action is required. The identified object remains in this state until action is taken. Statuses of *HLD and *HLDERR result in journal entries being held but not applied. Use Table 45 to identify choices based on the tracking entry status.

c. Use options identified in Table 46 to address journaling problems or replication problems.

Table 45. Possible actions based on replication status of a tracking entry

Status Preferred Action1

1. Evaluate the cause of the problem before taking any action.

*ACTIVE Unless an error has occurred, no action is necessary. Entries in the user journal for the IFS object are replicated and applied. If necessary, any of the options to hold journal entries can be used.

*HLD User action is required to release the entry (option 26) so that held journal entries from user journal can be applied to the target system.

*HLDERR User action is required. Attempt to resolve the error by synchronizing the object (option 16).

*HLDIGN User action is required to either synchronize the object (option 16) or to change the configuration if you no longer want to replicate the object. Journal entries for the object are discarded. Replication is not occurring and the object may not be synchronized.

Depending on the circumstances, Release may also be an option.

*HLDRNM This is a transitional state for IFS tracking entries that should resolve to *ACTIVE. If this status persists, check the journaling status for the entry. Object tracking entries cannot have this status.

*RLSWAIT If the status does not change to *ACTIVE, you may need to synchronize (option 16)

Table 46. Options for working with tracking entries

Option Additional Information

4=Remove See “Removing a tracking entry” on page 223.

5=Display Identifies an object, its replication status, journaling status, and the database apply session used.

6=Print Creates a spooled file which can be printed.

9=Start journaling See “Starting journaling for IFS objects” on page 238 and “Starting journaling for data areas and data queues” on page 241.

10=End journaling See “Ending journaling for IFS objects” on page 239 and “Ending journaling for data areas and data queues” on page 242.

11=Verify journaling See “Verifying journaling for IFS objects” on page 240 and “Verifying journaling for data areas and data queues” on page 243.

16=Synchronize Synchronizes the contents, attributes, and authorities of the object represented by the tracking entry between the source and target systems. For more information, see topic ‘Synchronizing tracking entries’ in the MIMIX Administrator Reference book.

23=Hold See “Holding journal entries associated with a tracking entry” on page 221.

24=Ignore See “Ignoring journal entries associated with a tracking entry” on page 222.

25=Release wait See “Waiting to synchronize and release held journal entries for a tracking entry” on page 222.

26=Release See “Releasing held journal entries for a tracking entry” on page 223.

27=Release clear See “Releasing and clearing held journal entries for a tracking entry” on page 223.

Holding journal entries associated with a tracking entry

Use this procedure to hold any journal entries for an object identified by a tracking entry. Avoid leaving a tracking entry on hold for any extended period.

The request changes the tracking entry status to *HLD. Any journal entries for the associated IFS object, data area, or data queue are replicated but not applied. MIMIX retains log spaces containing any replicated journal entries in anticipation that the tracking entry will be released. When the tracking entry is released, the accumulated journal entries will be applied. The *HLD status remains until additional action is taken.

Do the following:

1. Access the IFS or object tracking entry display as described in “Accessing the appropriate tracking entry display” on page 219.

2. Type 23 (Hold) next to the tracking entry for the object you want and press Enter.

Ignoring journal entries associated with a tracking entry

Use this procedure to ignore any journal entries for an object identified by a tracking entry. The request changes the tracking entry status to *HLDIGN. Any journal entries for the associated IFS object, data area, or data queue, including any hold logs, are discarded. The *HLDIGN status remains until additional action is taken.

Note: Be certain that you want to use the ignore feature. Any ignored transactions cannot be retrieved. You must replace the object on the target system with a current version from the source system.

If a tracking entry has been on hold for a long time or you expect that it will be, the amount of storage used by the error/hold log space can be quite large. If you anticipate that you will need to save and restore the object or replace it for any other reason, it may be best to just ignore all current transactions.

Do the following:

1. Access the IFS or object tracking entry display as described in “Accessing the appropriate tracking entry display” on page 219.

2. Type 24 (Ignore) next to the tracking entry for the object you want and press Enter.

Waiting to synchronize and release held journal entries for a tracking entry

Use this procedure to wait for a synchronization point to release any held journal entries for an object identified by a tracking entry, then resume replication.

The request changes the tracking entry status to *RLSWAIT. Any journal entries for the associated IFS object, data area, or data queue are discarded until an object saved journal entry is encountered. This is the synchronization point. The tracking entry status is then changed to *ACTIVE and all journal entries that were held after the synchronization point are applied.

If the object saved journal entry is not in the log space, the tracking entry remains in *RLSWAIT status. If you are unsure as to how many save requests might accumulate for an object, you can synchronize the object associated with the tracking entry. The tracking entry status will become *ACTIVE.

Do the following:

1. Access the IFS or object tracking entry display as described in “Accessing the appropriate tracking entry display” on page 219.

2. Type 25 (Release wait) next to the tracking entry for the object you want and press Enter.


Releasing held journal entries for a tracking entry

Use this procedure to immediately release any held journal entries for an object identified by a tracking entry with a status of *HLD or *HLDERR and resume replication.

The request changes the tracking entry status to *ACTIVE. Any held journal entries for the associated IFS object, data area, or data queue are applied. Normal replication of the object resumes.

Do the following:

1. Access the IFS or object tracking entry display as described in “Accessing the appropriate tracking entry display” on page 219.

2. Type 26 (Release) next to the tracking entry for the object you want and press Enter.

Releasing and clearing held journal entries for a tracking entry

Use this procedure to clear any held journal entries for an object identified by a tracking entry, then resume replication.

The request changes the tracking entry status to *ACTIVE. Any held journal entries for the associated IFS object, data area, or data queue are discarded. Journal entries received after the tracking entry status became *ACTIVE are applied, resuming normal replication.

If a tracking entry is on hold and its associated object has been synchronized in such a way that the held entries already exist in the restored object, this procedure will ensure that those entries are not re-applied.

Do the following:

1. Access the IFS or object tracking entry display as described in “Accessing the appropriate tracking entry display” on page 219.

2. Type 27 (Release clear) next to the tracking entry for the object you want and press Enter.

Removing a tracking entry

Use this procedure to remove a duplicate tracking entry for an IFS object, data area, or data queue. A tracking entry with a status of *HLDERR cannot be removed.

Note: Do not use this procedure to prevent user journal replication of an object represented by a tracking entry. If you need to exclude the object from replication or have it replicated through the system journal instead of the user journal, change or create the appropriate data group IFS entry or object entry.

Do the following:

1. Access the IFS or object tracking entry display as described in “Accessing the appropriate tracking entry display” on page 219.

2. Type 4 (Remove) next to the tracking entry you want to remove and press Enter.

3. You will see a confirmation display. To remove the tracking entry, press Enter.


Working with objects in error

Use this topic to work with replication errors for objects replicated through the system journal.

To access a list of objects in error for a data group, do the following:

1. From the MIMIX Basic Main Menu select option 6 (Work with Data Groups) and press Enter.

2. The Work with Data Groups display appears. Type 13 (Objects in error) next to the data group you want that has a value shown in the Obj Errors column and press Enter.

3. The Work with Data Group Activity display appears with a list of the objects in error for the data group you selected. You can do any of the following:

• Use F10 (Error view) to see the reason why the object is in error.

• Use F11 to change between views for objects, DLOs, IFS objects, and spooled files.

• Use the options identified in Table 47 to resolve the errors. Type the number of the option you want next to the object and press Enter.

Table 47. Options on the Work with Data Group Activity display for working with objects in error.

4=Remove Use this option to remove an entry with a *COMPLETED or *FAILED status from the list. For entries with *FAILED status, this option removes only the failed entry. Prompting is available for extended capability. You may need to take action to synchronize the object associated with the entry.

Note: If an entry with a status of *FAILED has related entries in *DELAYED status, you can remove both the failed and the delayed entries in one operation by using option 14 (Remove related).

For more information, see “Removing data group activity history entries” on page 229.

7=Display message Use this option to display any error message that is associated with the entry.

8=Retry Use this option to retry the data group activity. MIMIX changes the entry status to pending and attempts the failed operation again.

Note: It is possible to schedule the request for a time when the retry is more likely to be successful. For more information about retrying failed entries, see “Retrying data group activity entries” on page 227.

12=Work with entries Use this option to access the Work with DG Activity Entries display. From the display you can display additional information about replicated journal transactions for the object, including the journal entry type and access type (if available), as well as see whether the object is undergoing delay retry processing. You can also take options to display related entries, view error messages for a failure, and synchronize the object. For more information, see “Using the Work with DG Activity Entries display” on page 225.

14=Remove related Use this option to remove an entry with a status of *FAILED and any related entries that have a status of *DELAYED. You may need to take action to synchronize the object associated with the entry.

Using the Work with DG Activity Entries display

From the Work with DG Activity Entries display, you can display information about and take actions on activity entries for a replicated object. To access the display, select option 12 (Work with entries) from the Work with Data Group Activity display.

Table 48 lists the available options.

Table 48. Options available on the Work with DG Activity Entries display.

4=Remove Use this option to remove an individual entry with a *COMPLETED or *FAILED status from the list. For entries with *FAILED status, this option removes only the failed entry. You may need to take action to synchronize the object associated with the entry.

Note: No prompting is available when using this option from this display. To prompt for additional capability, use the option to remove from the Work with Data Group Activity display. For more information, see “Removing data group activity history entries” on page 229.

5=Display Use this option to display details about the individual entry. The information available about the object includes whether the object is undergoing delay retry processing, and journal entry information, including access type information for T-SF, T-YC, and T-ZC journal entry types. For more information, see “Determining whether an activity entry is in a delay/retry cycle” on page 228.

6=Print Use this option to print the entry.

7=Display message Use this option to display the error message associated with the processing failure for the entry.

8=Retry Use this option to retry the data group activity entry. MIMIX changes the entry status to pending and attempts the failed operation again as soon as possible.

225

Working with objects in error

9=Display related Displays entries related to the specified object. For example, use this option to see entries associated with a move or rename operation for the object.

12=Display job Displays the job that was processing the object when the error occurred, if the job information still exists and is on this system.

16=Synchronize Use this option to synchronize objects defined to MIMIX for system journal replication (objects that are not configured for cooperative processing). Activity entries with *ACTIVE or *COMPLETED status can be synchronized, as well as entries with a *FAILED status and with the following journal entry types: T-CO, T-CP, T-OR, T-SE, T-ZC (see notes), T-YC, and T-SF (see notes).

A confirmation display allows you to confirm your choices before the request is processed. Entries are placed in a ‘pending synchronization’ status. When the data group is active, the contents of the object, its attributes, and its authorities are synchronized between the source and target systems. The status of the activity entry is set to ‘completed by synchronization.’

Notes:

• To synchronize files defined for cooperative processing, use the Synchronize DG File Entry (SYNCDGFE) command.

• Spooled files (T-SF journal entries) with the following access types can be synchronized: C = spooled file created; U = spooled file changed.

• Changed objects (T-ZC journal entries) with the following access types can be synchronized: 1 (Add); 7 (Change); 25 (Initialize); 29 (Merge); 30 (Open); 34 (Receive); 36 (Reorganize); 50 (Set); and 51 (Send).


Retrying data group activity entries

Data group activity entries that did not successfully complete replication have a status of *FAILED. These failed data group activity entries are also called error entries. You can request to retry processing for these activity entries.

Activity entries with a status of *ACTIVE can also be retried in some circumstances. For example, you may want to retry an entry that is delayed but which has no preceding pending activity entry. Or, you may want to retry a pending entry that is undergoing processing in a delay retry cycle.

The retry request places the activity entry in the queue for processing by the system journal replication process where the failure or delay occurred. Activity entries with a status of *FAILED or *DELAYED are set to *PENDING until they are processed.

Retrying a failed data group activity entry

You can manually request that MIMIX retry processing for a data group activity entry that has a status of *FAILED. The retry can be requested from either the Work with Data Group Activity display or from the Work with DG Activity Entries display.

Note: Only the Work with Data Group Activity display supports the ability to schedule the retry request for a time in the future when the request is more likely to be successful.

To retry failed (error) activity entries, do the following:

1. From the Work with Data Groups display, type a 13 (Objects in error) next to the data group you want that has values shown in the Obj Errors column and press Enter.

2. The Work with Data Group Activity display appears with a list of the objects in error for the data group selected. Type an 8 (Retry) next to the entry you want and do one of the following:

• To submit the retry request for immediate processing, press Enter. Then skip to Step 4.

• To schedule the retry request for a time at which it is more likely to be successful, press F4 (Prompt).

3. On the Retry DG Activity Entries (RTYDGACTE) display, specify a value for the Time of day to retry prompt. Then press Enter.

You can specify a time up to 24 hours in the future. The scheduled time is based on the time on the system from which the request is submitted regardless of the system on which the activity to retry occurs. When you submit a retry request for a scheduled time, MIMIX will make the entry active and will wait until the specified time before retrying the request. The scheduled time is the earliest the request will be processed. Be sure to consider any time zone differences between systems as you determine a scheduled time. For additional information and examples, press F1 (Help).

4. The Confirm Retry of DG Activity display appears. Press Enter.

If failed activity entries occur frequently, consider using the third delay retry cycle. When the Automatic object recovery policy is enabled, a third retry cycle is performed using the settings in effect from the Number of third delay/retries and Third retry interval (min.) policies. These policies can be set for the installation or for a specific data group.

Determining whether an activity entry is in a delay/retry cycle

This procedure allows you to check the status of an activity entry to determine whether MIMIX is attempting automatic delay retry cycles for the object.

1. From the Work with Data Groups display, type a 14 (Active objects) next to the data group you want and press Enter.

2. The Work with Data Group Activity display appears with a list of the objects that are actively being replicated.

3. Type a 12 (Work with Entries) next to the list entry for the object you want and press Enter. The Work with DG Activity Entries display appears with the list of activity entries for the object you selected.

4. To view additional details for an entry, type a 5 (Display) next to the activity entry you want and press Enter. The Display DG Activity Details display appears.

5. Check the value listed in the Waiting for retry field.

The value *YES is displayed when the activity entry is undergoing automatic delay/retry processing. Delayed or failed activity entries and pending activity entries that are not in a delay retry cycle will always have a value of *NO.

6. When the value of the Waiting for retry field is *YES, the Delay/Retry Processing Information fields are also available and provide the following information:

• The Retries attempted field identifies the number of times that MIMIX has attempted to process the activity entry.

• The Retries remaining field identifies the remaining number of times that MIMIX can automatically attempt to retry the activity entry. MIMIX uses only as many of the remaining retry attempts as necessary to achieve a successful attempt.

• The Delay interval (seconds) field identifies the number of seconds between the previous attempt and the next retry attempt.

• The Timestamp of next attempt field identifies the approximate date and time that MIMIX will make the next attempt to process the activity entry. If object replication processes are busy processing other entries, there may be a delay between this time and when processing of this entry is actually attempted. The value *PENDING indicates that the time of the next attempt has passed and processing for the entry is waiting while other entries are being processed. This field is displayed only on the system of the process that is in delay/retry.


Removing data group activity history entries


MIMIX maintains a history of successfully completed distribution requests to provide a record of all object, DLO, and IFS replication activity completed by system journal replication processes. While MIMIX efficiently uses disk space and removes completed requests according to the value specified in the Keep data group history parameter of the system definition, you may occasionally need to manually remove completed activity entries. One reason to manually remove completed entries may be to conserve disk space, while another may be to clean up entries for an object that has been removed from replication as a result of a configuration change.

Note: Your business policies and procedures may require that you archive completed activity entries to tape before you delete them.

To remove completed activity entries, do the following:

1. From the Work with Data Groups display, type 28 (Completed objects) next to the data group you want and press Enter. The Work with Data Group Activity display appears with a list of objects with completed entries.

2. Type a 4 (Remove) next to the entry you want and do one of the following:

• To remove all available completed entries for the selected object, press Enter. Then continue with Step 4.

• To change the selection criteria to include entries for additional objects or to limit the entries based on a time range, press F4 (Prompt). The Remove DG Activity Entries (RMVDGACTE) display appears.

3. To change the selection criteria, do the following as needed:

• To remove a subset of completed entries for the selected object based on the timestamp of the replicated journal entries, specify values for Starting date and time and Ending date and time prompts.

• To expand the set of objects for which completed entries will be removed, change the values of the following prompts as needed:

For an expanded set of object types, use the Object type prompt.

For a library based object, use the Object and Library prompts.

For a DLO, use the Document and Folder prompts.

For an IFS object use the IFS object prompt.

For a spooled file, use the Spooled file name, Output queue, and Library prompts.

4. A confirmation display appears. Press Enter.


CHAPTER 12 Starting, ending, and verifying journaling

This chapter describes procedures for starting and ending journaling. Journaling must be active on all files, IFS objects, data areas and data queues that you want to replicate through a user journal. Normally, journaling is started during configuration. However, there are times when you may need to start or end journaling on items identified to a data group.

The topics in this chapter include:

• “What objects need to be journaled” on page 231 describes, for supported configuration scenarios, what types of objects must have journaling started before replication can occur. It also describes when journaling is started implicitly, as well as the authority requirements necessary for user profiles that create the objects to be journaled when they are created.

• “MIMIX commands for starting journaling” on page 233 identifies the MIMIX commands available for starting journaling and describes the checking performed by the commands.

• “Journaling for physical files” on page 235 includes procedures for displaying journaling status, starting journaling, ending journaling, and verifying journaling for physical files identified by data group file entries.

• “Journaling for IFS objects” on page 238 includes procedures for displaying journaling status, starting journaling, ending journaling, and verifying journaling for IFS objects replicated cooperatively (advanced journaling). IFS tracking entries are used in these procedures.

• “Journaling for data areas and data queues” on page 241 includes procedures for displaying journaling status, starting journaling, ending journaling, and verifying journaling for data area and data queue objects replicated cooperatively (advanced journaling). Object tracking entries are used in these procedures.

What objects need to be journaled

A data group can be configured in a variety of ways that involve a user journal in the replication of files, data areas, data queues and IFS objects. Journaling must be started for any object to be replicated through a user journal or to be replicated by cooperative processing between a user journal and the system journal.

Requirements for system journal replication - System journal replication processes use a special journal, the security audit (QAUDJRN) journal. Events are logged in this journal to create a security audit trail. When data group object entries, IFS entries, and DLO entries are configured, each entry specifies an object auditing value that determines the type of activity on the objects to be logged in the journal. Object auditing is automatically set for all objects defined to a data group when the data group is first started, or any time a change is made to the object entries, IFS entries, or DLO entries for the data group. Because security auditing logs the object changes in the system journal, no special action is needed.

Requirements for user journal replication - User journal replication processes require that the journaling be started for the objects identified by data group file entries. Both MIMIX Dynamic Apply and legacy cooperative processing use data group file entries and therefore require journaling to be started. Configurations that include advanced journaling for replication of data areas, data queues, or IFS objects also require that journaling be started on the associated object tracking entries and IFS tracking entries, respectively. Starting journaling ensures that changes to the objects are recorded in the user journal, and are therefore available for MIMIX to replicate.

During initial configuration, the configuration checklists direct you when to start journaling for objects identified by data group file entries, IFS tracking entries, and object tracking entries. The MIMIX commands STRJRNFE, STRJRNIFSE, and STRJRNOBJE simplify the process of starting journaling. For more information about these commands, see “MIMIX commands for starting journaling” on page 233.

Although MIMIX commands for starting journaling are preferred, you can also use IBM commands (STRJRNPF, STRJRN, STRJRNOBJ) to start journaling if you have the appropriate authority for starting journaling.
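For example, assuming you have that authority, journaling could be started outside of MIMIX with commands similar to the following sketch. The library, file, directory, and journal names are placeholders only; use the journal identified in the journal definition for the data group.

STRJRNPF FILE(MYLIB/MYFILE) JRN(MYJRNLIB/MYJRN)
STRJRN OBJ(('/mydir')) JRN('/QSYS.LIB/MYJRNLIB.LIB/MYJRN.JRN') INHERIT(*YES)
STRJRNOBJ OBJ(MYLIB/MYDTAARA) OBJTYPE(*DTAARA) JRN(MYJRNLIB/MYJRN)

The INHERIT(*YES) value on STRJRN permits journaling inheritance for IFS objects created later in that directory, as described below under implicit starting of journaling.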

Requirements for implicit starting of journaling - Journaling can be automatically started for newly created database files, data areas, data queues, or IFS objects when certain requirements are met.

The user ID creating the new objects must have the required authority to start journaling and the following requirements must be met:

• IFS objects - A new IFS object is automatically journaled if the directory in which it is created is journaled as a result of a request that permitted journaling inheritance for new objects. Typically, if MIMIX started journaling on the parent directory, inheritance is permitted. If you manually start journaling on the parent directory using the IBM command STRJRN, specify INHERIT(*YES). This will allow IFS objects created within the journaled directory to inherit the journal options and journal state of the parent directory.

• Database files created by SQL statements - A new file created by a CREATE TABLE statement is automatically journaled if the library in which it is created contains a journal named QSQJRN (see the sketch following this list).

• New *FILE, *DTAARA, *DTAQ objects - The default value (*DFT) for the Journal at creation (JRNATCRT) parameter in the data group definition enables MIMIX to support both release-specific techniques that the operating system uses to automatically start journaling for physical files, data areas, and data queues when they are created.

– On systems running IBM i 6.1 or higher releases, MIMIX uses the support provided by the IBM i command Start Journal Library (STRJRNLIB). Customers are advised not to re-create the QDFTJRN data area on systems running IBM i 6.1 or higher.

– On systems running IBM i 5.4, MIMIX uses the QDFTJRN data area for journal at creation. The operating system will automatically journal a new object if it is created in a library that contains a QDFTJRN data area and the data area has enabled automatic journaling for the object type.
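As a sketch of the SQL case noted in this list, a library can be prepared for automatic journaling of new SQL tables by creating a journal named QSQJRN in it; the library and receiver names are placeholders:

CRTJRNRCV JRNRCV(MYLIB/QSQJRN0001)
CRTJRN JRN(MYLIB/QSQJRN) JRNRCV(MYLIB/QSQJRN0001)

After this, a new file created in MYLIB by a CREATE TABLE statement is journaled to QSQJRN automatically.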

When configuration requirements are met, MIMIX will either start library journaling or create the QDFTJRN data area for the appropriate libraries as well as enable automatic journaling for the configured cooperatively processed object types. When journal at creation configuration requirements are met, all new objects of that type are journaled, not just those which are eligible for replication.

When the data group is started, MIMIX evaluates all data group object entries for each object type. (Entries for *FILE objects are only evaluated when the data group specifies COOPJRN(*USRJRN).) Entries properly configured to allow cooperative processing of the object type determine whether MIMIX will enforce library journaling or create the QDFTJRN data area. MIMIX uses the data group entry with the most specific match to the object type and library that also specifies *ALL for its System 1 object (OBJ1) and Attribute (OBJATR).

Note: MIMIX prevents library journaling from starting or the QDFTJRN data area from being created in the following libraries: QSYS*, QRECOVERY, QRCY*, QUSR*, QSPL*, QRPL*, QRCL*, QGPL, QTEMP and SYSIB*.

For example, if MIMIX finds only the following data group object entries for library MYLIB, it would use the first entry when determining whether to enforce library journaling or create the QDFTJRN data area because it is the most specific entry that also meets the OBJ1(*ALL) and OBJATR(*ALL) requirements. The second entry is not considered in the determination because its OBJ1 and OBJATR values do not meet these requirements.

LIB1(MYLIB) OBJ1(*ALL) OBJTYPE(*FILE) OBJATR(*ALL) COOPDB(*YES) PRCTYPE(*INCLD)
LIB1(MYLIB) OBJ1(MYAPP) OBJTYPE(*FILE) OBJATR(DSPF) COOPDB(*YES) PRCTYPE(*INCLD)
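For reference, the operating system support used on IBM i 6.1 and higher is comparable to starting library journaling with the IBM Start Journal Library command, for example (placeholder names):

STRJRNLIB LIB(MYLIB) JRN(MYJRNLIB/MYJRN)

You do not normally need to run this command yourself; MIMIX starts library journaling for the appropriate libraries when the configuration requirements described above are met.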

Authority requirements for starting journaling

Normal MIMIX processes run under the MIMIXOWN user profile, which ships with *ALLOBJ special authority. Therefore, it is not necessary for other users to account for journaling authority requirements when using MIMIX commands (STRJRNFE, STRJRNIFSE, STRJRNOBJE) to start journaling.

When the MIMIX journal managers are started, or when the Build Journaling Environment (BLDJRNENV) command is used, MIMIX checks the public authority (*PUBLIC) for the journal. If necessary, MIMIX changes public authority so the user ID in use has the appropriate authority to start journaling.

Authority requirements must be met to enable automatic journaling of newly created objects and when you use IBM commands instead of MIMIX commands to start journaling.

• If you create database files, data areas, or data queues for which you expect automatic journaling at creation, the user ID creating these objects must have the required authority to start journaling.

• If you use the IBM commands (STRJRNPF, STRJRN, STRJRNOBJ) to start journaling, the user ID that performs the start journaling request must have the appropriate authority requirements.

For journaling to be successfully started on an object, one of the following authority requirements must be satisfied:

• The user profile of the user attempting to start journaling for an object must have *ALLOBJ special authority.

• The user profile of the user attempting to start journaling for an object must have explicit *ALL object authority for the journal to which the object is to be journaled.

• Public authority (*PUBLIC) must have *OBJALTER, *OBJMGT, and *OBJOPR object authorities for the journal to which the object is to be journaled.
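For example, the public authority requirement in the last item could be satisfied by granting the three object authorities on the journal; the journal name is a placeholder:

GRTOBJAUT OBJ(MYJRNLIB/MYJRN) OBJTYPE(*JRN) USER(*PUBLIC) AUT(*OBJALTER *OBJMGT *OBJOPR)

This is normally unnecessary because MIMIX adjusts public authority for the journal when the journal managers are started or the BLDJRNENV command is used.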

MIMIX commands for starting journaling

Before you use any of the MIMIX commands for starting journaling, the data group file entries, IFS tracking entries, or object tracking entries associated with the command’s object class must be loaded.

The MIMIX commands for starting journaling are:

• Start Journal Entry (STRJRNFE) - This command starts journaling for files identified by data group file entries.

• Start Journaling IFS Entries (STRJRNIFSE) - This command starts journaling of IFS objects configured for advanced journaling. Data group IFS entries must be configured and IFS tracking entries must be loaded (LODDGIFSTE command) before running the STRJRNIFSE command to start journaling.

• Start Journaling Obj Entries (STRJRNOBJE) - This command starts journaling of data area and data queue objects configured for advanced journaling. Data group object entries must be configured and object tracking entries must be loaded (LODDGOBJTE command) before running the STRJRNOBJE command to start journaling.


If you attempt to start journaling for a data group file entry, IFS tracking entry, or object tracking entry and the files or objects associated with the entry are already journaled, MIMIX checks that the physical file, IFS object, data area, or data queue is journaled to the journal associated with the data group. If the file or object is journaled to the correct journal, the journaling status of the data group file entry, IFS tracking entry, or object tracking entry is changed to *YES. If the file or object is not journaled to the correct journal or the attempt to start journaling fails, an error occurs and the journaling status is changed to *NO.
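If you need to confirm outside of MIMIX which journal a physical file is currently journaled to, one option is the IBM Display File Description command; the journal name appears in the file attribute information (placeholder names shown):

DSPFD FILE(MYLIB/MYFILE) TYPE(*ATR)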


Journaling for physical files

Data group file entries identify physical files to be replicated. When data group file entries are added to a configuration, they may have an initial status of *ACTIVE. However, the physical files which they identify may not be journaled. In order for replication to occur, journaling must be started for the files on the source system.

This topic includes procedures to display journaling status, and to start, end, or verify journaling for physical files.

Displaying journaling status for physical files

Use this procedure to display journaling status for physical files identified by data group file entries. Do the following:

1. From the MIMIX Intermediate Main Menu, type 1 and press Enter to access the Work with Data Groups display.

2. On the Work with Data Groups display, type 17 (File entries) next to the data group you want and press Enter.

3. The Work with DG File Entries display appears. The initial view shows the current and requested status of the data group file entry. Press F10 (Journaled view).

At the right side of the display, the Journaled System 1 and System 2 columns indicate whether the physical file associated with the file entry is journaled on each system.

Note: Logical files will have a status of *NA. Data group file entries exist for logical files only in data groups configured for MIMIX Dynamic Apply.

Starting journaling for physical files

Use this procedure to start journaling for physical files identified by data group file entries. In order for replication to occur, journaling must be started for the file on the source system.

This procedure invokes the Start Journal Entry (STRJRNFE) command. The command can also be entered from a command line.

Do the following:

1. Access the journaled view of the Work with DG File Entries display as described in “Displaying journaling status for physical files” on page 235.

2. From the Work with DG File Entries display, type a 9 (Start journaling) next to the file entries you want. Then do one of the following:

• To start journaling using the command defaults, press Enter.

• To modify command defaults, press F4 (Prompt) then continue with the next step.

3. The Start Journal Entry (STRJRNFE) display appears. The Data group definition prompts and the System 1 file prompts identify your selection. Accept these values or specify the values you want.


4. Specify the value you want for the Start journaling on system prompt. Press F4 to see a list of valid values.

When *DGDFN, *SRC, or *TGT is specified, MIMIX considers whether the data group is configured for journaling on the target system (JRNTGT) and starts or prevents journaling from starting as required.

5. If you want to use batch processing, specify *YES for the Submit to batch prompt.

6. To start journaling for the physical files associated with the selected data group file entries, press Enter.

The system returns a message to confirm the operation was successful.
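When entered from a command line instead of through option 9, the request might look similar to the following sketch. The keywords shown are assumptions based on the prompt text (Data group definition, Start journaling on system) and are not confirmed syntax; prompt the command with F4 to see the actual keywords and valid values.

STRJRNFE DGDFN(MYDGDFN SYSTEM1 SYSTEM2)  /* keyword and names are assumptions; prompt with F4 */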

Ending journaling for physical files

Use this procedure to end journaling for a physical file associated with a data group file entry. Once journaling for a file is ended, any changes to that file are not captured and are not replicated. You may need to end journaling if a file no longer needs to be replicated, to prepare for upgrading MIMIX software, or to correct an error.

This procedure invokes the End Journaling File Entry (ENDJRNFE) command. The command can also be entered from a command line.

To end journaling, do the following:

1. Access the journaled view of the Work with DG File Entries display as described in “Displaying journaling status for physical files” on page 235.

2. From the Work with DG File Entries display, type a 10 (End journaling) next to the file entry you want and do one of the following:

Note: MIMIX cannot end journaling on a file that is journaled to the wrong journal, for example, a file journaled to a journal that does not match the journal definition for that data group. If you want to end journaling outside of MIMIX, use the ENDJRNPF command (an example follows this procedure).

• To end journaling using command defaults, press Enter. Journaling is ended.

• To modify additional prompts for the command, press F4 (Prompt) and continue with the next step.

3. The End Journal File Entry (ENDJRNFE) display appears. If you want to end journaling for all files in the library, specify *ALL at the System 1 file prompt.

4. Specify the value you want for the End journaling on system prompt. Press F4 to see a list of valid values.

When *DGDFN, *SRC, or *TGT is specified, MIMIX considers whether the data group is configured for journaling on the target system (JRNTGT) and ends or prevents journaling from ending as required.

5. If you want to use batch processing, specify *YES for the Submit to batch prompt.

6. To end journaling, press Enter.
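If you need to end journaling outside of MIMIX, as described in the note in step 2, the IBM command takes the file name directly; the names below are placeholders:

ENDJRNPF FILE(MYLIB/MYFILE)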


Verifying journaling for physical files

Use this procedure to verify if a physical file defined by a data group file entry is journaled correctly. This procedure invokes the Verify Journaling File Entry (VFYJRNFE) command to determine whether the file is journaled and whether it is journaled to the journal defined in the journal definition. When these conditions are met, the journal status on the Work with DG File Entries display is set to *YES. The command can also be entered from a command line.

To verify journaling for a physical file, do the following:

1. Access the journaled view of the Work with DG File Entries display as described in “Displaying journaling status for physical files” on page 235.

2. From the Work with DG File Entries display, type an 11 (Verify journaling) next to the file entry you want and do one of the following:

• To verify journaling using command defaults, press Enter.

• To modify additional prompts for the command, press F4 (Prompt) and continue with the next step.

3. The Verify Journaling File Entry (VFYJRNFE) display appears. The Data group definition prompts and the System 1 file prompts identify your selection. Accept these values or specify the values you want.

4. Specify the value you want for the Verify journaling on system prompt. When *DGDFN is specified, MIMIX considers whether the data group is configured for journaling on the target system (JRNTGT) when determining where to verify journaling.

5. If you want to use batch processing, specify *YES for the Submit to batch prompt.

6. Press Enter.


Journaling for IFS objects

IFS tracking entries are loaded for a data group after the data group IFS entries have been configured for replication through the user journal (advanced journaling). However, loading IFS tracking entries does not automatically start journaling on the IFS objects they identify. In order for replication to occur, journaling must be started on the source system for the IFS objects identified by IFS tracking entries.

This topic includes procedures to display journaling status, and to start, end, or verify journaling for IFS objects identified for replication through the user journal.


You should be aware of the information in “Considerations for working with long IFS path names” on page 262.

Displaying journaling status for IFS objects

Use this procedure to display journaling status for IFS objects identified by IFS tracking entries. Do the following:

1. From the MIMIX Intermediate Main Menu, type 1 and press Enter to access the Work with Data Groups display.

2. On the Work with Data Groups display, type 50 (IFS trk entries) next to the data group you want and press Enter.

3. The Work with DG IFS Trk. Entries display appears. The initial view shows the object type and status at the right of the display. Press F10 (Journaled view).

At the right side of the display, the Journaled System 1 and System 2 columns indicate whether the IFS object identified by the tracking entry is journaled on each system.

Starting journaling for IFS objects

Use this procedure to start journaling for IFS objects identified by IFS tracking entries.

This procedure invokes the Start Journaling IFS Entries (STRJRNIFSE) command. The command can also be entered from a command line.

To start journaling for IFS objects, do the following:

1. If you have not already done so, load the IFS tracking entries for the data group. For more information see the MIMIX Administrator Reference book.

2. Access the journaled view of the Work with DG IFS Trk. Entries display as described in “Displaying journaling status for IFS objects” on page 238.

3. From the Work with DG IFS Trk. Entries display, type a 9 (Start journaling) next to the IFS tracking entries you want. Then do one of the following:

• To start journaling using the command defaults, press Enter.

• To modify the command defaults, press F4 (Prompt) and continue with the next step.


4. The Start Journaling IFS Entries (STRJRNIFSE) display appears. The Data group definition and IFS objects prompts identify the IFS object associated with the tracking entry you selected. You cannot change the values shown for the IFS objects prompts1.

5. Specify the value you want for the Start journaling on system prompt. Press F4 to see a list of valid values.

When *DGDFN, *SRC, or *TGT is specified, MIMIX considers whether the data group is configured for journaling on the target system (JRNTGT) and starts or prevents journaling from starting as required.

6. To use batch processing, specify *YES for the Submit to batch prompt and press Enter. Additional prompts for Job description and Job name appear. Either accept the default values or specify other values.

7. The System 1 file identifier and System 2 file identifier prompts identify the file identifier (FID) of the IFS object on each system. You cannot change the values2.

8. To start journaling on the IFS objects specified, press Enter.

Ending journaling for IFS objects

Use this procedure to end journaling for IFS objects identified by IFS tracking entries.

This procedure invokes the End Journaling IFS Entries (ENDJRNIFSE) command. The command can also be entered from a command line.

To end journaling for IFS objects, do the following:

1. Access the journaled view of the Work with DG IFS Trk. Entries display as described in “Displaying journaling status for IFS objects” on page 238.

2. From the Work with DG IFS Trk. Entries display, type a 10 (End journaling) next to the IFS tracking entries you want. Then do one of the following:

• To end journaling using the command defaults, press Enter.

• To modify the command defaults, press F4 (Prompt) and continue with the next step.

3. The End Journaling IFS Entries (ENDJRNIFSE) display appears. The Data group definition and IFS objects prompts identify the IFS object associated with the tracking entry you selected. You cannot change the values shown for the IFS objects prompts1.

4. Specify the value you want for the End journaling on system prompt. Press F4 to see a list of valid values.

When *DGDFN, *SRC, or *TGT is specified, MIMIX considers whether the data group is configured for journaling on the target system (JRNTGT) and ends or prevents journaling from ending as required.

1. When the command is invoked from a command line, you can change values specified for the IFS objects prompts. Also, you can specify as many as 300 object selectors by using the + for more values prompt.

2. When the command is invoked from a command line, use F10 to see the FID prompts. Then you can optionally specify the unique FID for the IFS object on either system. The FID values can be used alone or in combination with the IFS object path name.

5. To use batch processing, specify *YES for the Submit to batch prompt and press Enter. Additional prompts for Job description and Job name appear. Either accept the default values or specify other values.

6. The System 1 file identifier and System 2 file identifier identify the file identifier (FID) of the IFS object on each system. You cannot change the values shown2.

7. To end journaling on the IFS objects specified, press Enter.

Verifying journaling for IFS objects

Use this procedure to verify if an IFS object identified by an IFS tracking entry is journaled correctly. This procedure invokes the Verify Journaling IFS Entries (VFYJRNIFSE) command to determine whether the IFS object is journaled, whether it is journaled to the journal defined in the data group definition, and whether it is journaled with the attributes defined in the data group definition. The command can also be entered from a command line.

To verify journaling for IFS objects, do the following:

1. Access the journaled view of the Work with DG IFS Trk. Entries display as described in “Displaying journaling status for IFS objects” on page 238.

2. From the Work with DG IFS Trk. Entries display, type an 11 (Verify journaling) next to the IFS tracking entries you want. Then do one of the following:

• To verify journaling using the command defaults, press Enter.

• To modify the command defaults, press F4 (Prompt) and continue with the next step.

3. The Verify Journaling IFS Entries (VFYJRNIFSE) display appears. The Data group definition and IFS objects prompts identify the IFS object associated with the tracking entry you selected. You cannot change the values shown for the IFS objects prompts1.

4. Specify the value you want for the Verify journaling on system prompt. Press F4 to see a list of valid values.

When *DGDFN is specified, MIMIX considers whether the data group is configured for journaling on the target system (JRNTGT) and verifies journaling on the appropriate systems as required.

5. To use batch processing, specify *YES for the Submit to batch prompt and press Enter. Additional prompts for Job description and Job name appear. Either accept the default values or specify other values.

6. The System 1 file identifier and System 2 file identifier identify the file identifier (FID) of the IFS object on each system. You cannot change the values shown2.

7. To verify journaling on the IFS objects specified, press Enter.

For more information, see “Using file identifiers (FIDs) for IFS objects” on page 273.


Journaling for data areas and data queues

Object tracking entries are loaded for a data group after the data group object entries have been configured for replication through the user journal (advanced journaling). However, loading object tracking entries does not automatically start journaling on the objects they identify. In order for replication to occur, journaling must be started on the source system for the objects identified by object tracking entries.

This topic includes procedures to display journaling status, and to start, end, or verify journaling for data areas and data queues identified for replication through the user journal.

Displaying journaling status for data areas and data queues

Use this procedure to display journaling status for data areas and data queues identified by object tracking entries. Do the following:

1. From the MIMIX Intermediate Main Menu, type 1 and press Enter to access the Work with Data Groups display.

2. On the Work with Data Groups display, type 52 (Obj trk entries) next to the data group you want and press Enter.

3. The Work with DG Obj. Trk. Entries display appears. The initial view shows the object type and status at the right of the display. Press F10 (Journaled view).

At the right side of the display, the Journaled System 1 and System 2 columns indicate whether the object identified by the tracking entry is journaled on each system.

Starting journaling for data areas and data queues

Use this procedure to start journaling for data areas and data queues identified by object tracking entries.

This procedure invokes the Start Journaling Obj Entries (STRJRNOBJE) command. The command can also be entered from a command line.

To start journaling for data areas and data queues, do the following:

1. If you have not already done so, load the object tracking entries for the data group. For more information see the MIMIX Administrator Reference book.

2. Access the journaled view of the Work with DG Obj. Trk. Entries display as described in “Displaying journaling status for data areas and data queues” on page 241.

3. From the Work with DG Obj. Trk. Entries display, type a 9 (Start journaling) next to the object tracking entries you want. Then do one of the following:

• To start journaling using the command defaults, press Enter.

• To modify the command defaults, press F4 (Prompt) and continue with the next step.

4. The Start Journaling Obj Entries (STRJRNOBJE) display appears. The Data group definition and Objects prompts identify the object associated with the tracking entry you selected. Although you can change the values shown for these prompts, it is not recommended unless the command was invoked from a command line.

5. Specify the value you want for the Start journaling on system prompt. Press F4 to see a list of valid values.

When *DGDFN, *SRC, or *TGT is specified, MIMIX considers whether the data group is configured for journaling on the target system (JRNTGT) and starts or prevents journaling from starting as required.

6. To use batch processing, specify *YES for the Submit to batch prompt and press Enter. Additional prompts for Job description and Job name appear. Either accept the default values or specify other values.

7. To start journaling on the objects specified, press Enter.

Ending journaling for data areas and data queues

Use this procedure to end journaling for data areas and data queues identified by object tracking entries.

This procedure invokes the End Journaling Obj Entries (ENDJRNOBJE) command. The command can also be entered from a command line.

To end journaling for data areas and data queues, do the following:

1. Access the journaled view of the Work with DG Obj. Trk. Entries display as described in “Displaying journaling status for data areas and data queues” on page 241.

2. From the Work with DG Obj. Trk. Entries display, type a 10 (End journaling) next to the object tracking entries you want. Then do one of the following:

• To end journaling using the command defaults, press Enter.

• To modify the command defaults, press F4 (Prompt) and continue with the next step.

3. The End Journaling Obj Entries (ENDJRNOBJE) display appears. The Data group definition and Objects prompts identify the object associated with the tracking entry you selected. Although you can change the values shown for these prompts, it is not recommended unless the command was invoked from a command line.

4. Specify the value you want for the End journaling on system prompt. Press F4 to see a list of valid values.

When *DGDFN, *SRC, or *TGT is specified, MIMIX considers whether the data group is configured for journaling on the target system (JRNTGT) and ends or prevents journaling from ending as required.

5. To use batch processing, specify *YES for the Submit to batch prompt and press Enter. Additional prompts for Job description and Job name appear. Either accept the default values or specify other values.

6. To end journaling on the objects specified, press Enter.


Verifying journaling for data areas and data queues

Use this procedure to verify if an object identified by an object tracking entry is journaled correctly. This procedure invokes the Verify Journaling Obj Entries (VFYJRNOBJE) command to determine whether the object is journaled, whether it is journaled to the journal defined in the data group definition, and whether it is journaled with the attributes defined in the data group definition. The command can also be entered from a command line.

To verify journaling for objects, do the following:

1. Access the journaled view of the Work with DG Obj. Trk. Entries display as described in “Displaying journaling status for data areas and data queues” on page 241.

2. From the Work with DG Obj. Trk. Entries display, type an 11 (Verify journaling) next to the object tracking entries you want. Then do one of the following:

• To verify journaling using the command defaults, press Enter.

• To modify the command defaults, press F4 (Prompt) and continue with the next step.

3. The Verify Journaling Obj Entries (VFYJRNOBJE) display appears. The Data group definition and Objects prompts identify the object associated with the tracking entry you selected. Although you can change the values shown for these prompts, it is not recommended unless the command was invoked from a command line.

4. Specify the value you want for the Verify journaling on system prompt. Press F4 to see a list of valid values.

When *DGDFN is specified, MIMIX considers whether the data group is configured for journaling on the target system (JRNTGT) and verifies journaling on the appropriate systems as required.

5. To use batch processing, specify *YES for the Submit to batch prompt and press Enter. Additional prompts for Job description and Job name appear. Either accept the default values or specify other values.

6. To verify journaling on the objects specified, press Enter.


CHAPTER 13 Switching

Switching temporarily reverses the roles of the systems. The original source system (production) becomes the temporary target system and the original target system (backup) becomes the temporary source system. When the scenario that required you to switch directions is resolved, you typically switch again to return the systems to their original roles.

This chapter provides information and procedures to support switching. The following topics are included:

• “About switching” on page 244 provides information about switching with MIMIX including best practice and reasons why a switch should be performed. Subtopics describe:

– What is a planned switch and requirements for a planned switch

– What is an unplanned switch and actions to be completed after the failed source system is recovered

– The role of procedures for switching environments that use application groups

– The role of MIMIX Model Switch Framework for switching environments that do not use application groups

• “Switching an application group” on page 250 describes how to run a procedure to switch an application group.

• “Switching a data group-only environment” on page 251 describes how to switch from a 5250 emulator.

• “Determining when the last switch was performed” on page 253 describes how to check the Last switch field which indicates the switch compliance status and provides the date when the last switch was performed.

• “Problems checking switch compliance” on page 254 describes problems that can occur with data for the Last switch field.

• “Performing a data group switch” on page 255 describes how to switch a single data group using the SWTDG command.

• “Switch Data Group (SWTDG) command” on page 257 provides background information about the SWTDG command, which is used in all switch interfaces.

About switching

Replication environments rarely remain static. Therefore, best practice is to perform regular switches to ensure that you are prepared should you need to perform one during an emergency.

MIMIX supports two methods for switching the direction in which replication occurs for a data group. These methods are known as a planned switch and an unplanned switch.


You may need to perform a switch for any of the following reasons:

• The production system becomes unavailable due to an unplanned outage. A switch in this scenario is unplanned.

• You need to perform hardware or software maintenance on the production system. Typically, you can schedule this in advance so the switch is planned.

• You need to test your recovery plan. This activity is also a planned switch.

Historically, the concept of switching consists of three phases: switch to the backup system, synchronize the systems when the production system is ready to use, and switch back to the production system. This round-trip view of switching assumes your goal is to return to your original production system as quickly as possible. However, this view overlooks the fact that some customers may have an extended time pass between phase one and the other phases, or may even view a switch as a one-way trip. MIMIX supports both conceptual views of switching.

Switching data groups is only a part of performing a switch. MIMIX provides robust support for customizing switching activity to include all the needs of your environment.

Best practice for switching includes performing regular switches. Best practice also includes performing all audits with the audit level set at level 30 immediately prior to a planned switch to the backup system and before switching back to the production system. Best practice for performing a switch in an environment that uses application groups is to use option 4 (Switch all application groups) from the MIMIX Basic Main Menu. Best practice for performing a switch in an environment using only data groups is to use option 5 (Start or complete switch using Switch Asst.) from the MIMIX Basic Main Menu.

Planned switch

You can start a planned switch from either system. In a planned switch, MIMIX initiates a controlled shutdown of the data group. Both systems and the communications between them must be active.

Before you start a planned switch of a data group, you should ensure that the following actions have been completed. Your enterprise may have additional requirements.

• Perform a full set of audits with the audit level policy set to level 30. Running the #FILDTA audit at this audit level checks 100 percent of file member data for the data group for synchronization between source and target systems and is strongly recommended.

• Shut down any applications that use database files or objects defined to the data group. If any users or other non-MIMIX processes remain active while the switch is being performed, the data can become unsynchronized between systems and orphaned data may result.

• Ensure that there are no jobs other than MIMIX currently active on the source system. This may require ending all interactive and batch subsystems other than MIMIX and ending communications.

• Users should be prevented from accessing either system until after the switch is complete and the data group is restarted.


• If you use user journal replication processes, you should address any files, IFS tracking entries, or object tracking entries in error for your critical database files. If you use system journal replication processes, you should address any object errors.

You are not required to run journal analysis after a planned switch. MIMIX retains information about where activity ended so that when you restart the data group, it is started at the correct point.

When the data group is started, the temporary target system (the production system) is now being updated with user changes that are being replicated from the temporary source system (the backup system). Do not allow users onto the production system until after the production system is caught up with these transactions and you run the switch process again to revert to the normal roles.

Unplanned switch

In an unplanned switch, the source system is assumed to be unavailable. An unplanned switch is generally required when the source system fails and, in order to continue normal operations, you must switch users to a backup system. (Typically MIMIX is configured so that the target for replication is your backup system.)

You must run an unplanned switch from the target system. MIMIX performs a controlled shutdown of replication processes on the target system. The controlled shutdown allows all apply processing to catch up before the apply processes are ended.

There are default (*DFT) values for several parameters on the SWTDG command that allow the switch operation to continue without intervention from the user. See “Planned switch” on page 245 for additional details about these default values.

In an unplanned switch of a data group that uses remote journaling, the default behavior is to end the RJ link.

Once the failed source system is recovered, the following actions should be completed:

• You should perform journal analysis on that system before restarting the data group or user applications. Journal analysis helps identify any possible loss of data that may have occurred when the source system failed. Journal analysis relies on status information on the source system about the last entry that was applied. This information will be cleared when the data group is restarted.

• Communication between the systems must be active before you restart the data group. The switch process is complete when you restart the data group. When the data group is restarted, MIMIX notifies the source system that it is now the temporary target system.

• New transactions are created on the temporary source system (the backup system) while the production system (the temporary target system) is unavailable for replication. After you have completed journal analysis, you can send these new transactions to the production system to synchronize the databases. Once the databases are synchronized, you must run the switch process again to revert to the normal roles before allowing users onto the production system.


When the data group is started after a switch, any pending transactions are cleared. The journal receiver is already changed by the switch process and the new journal receiver and first sequence number are used.

Switching application group environments with procedures

Application groups can only be switched using procedures. Procedures and steps are a highly customizable means of performing operations for application groups. Each application group has a set of default procedures that include procedures for performing pre-check activity for switching and switching. Each operation is performed by a procedure that consists of a sequence of steps and multiple jobs. Each step calls a predetermined step program to perform a specific sub-task of the larger operation.

The following paragraphs describe the behavior of the switch (SWTAG) command for application groups that do not participate in a cluster controlled by the IBM i operating system (*NONCLU application groups).

What is the scope of the request? The following parameters identify the scope of the requested operation:

Application group definition (AGDFN) - Specifies the requested application group. You can either specify a name or the value *ALL.

Resource groups (TYPE) - Specifies the types of resource groups to be processed for the requested application group.

Data resource group entry (DTARSCGRP) - Specifies the data resource groups to include in the request. The default is *ALL or you can specify a name. This parameter is ignored when TYPE is *ALL or *APP.

What is the requested switch behavior? The following parameters on the SWTAG command define the expected behavior:

Switch type (SWTTYP) - This specifies the reason the application group is being switched. The procedure called to perform the switch and the actions performed during the switch differ based on whether the current primary node (data source) is available at the start of the switch procedure. The default value, *PLANNED, indicates that the primary node is still available and the switch is being performed for normal business processes (such as to perform maintenance on the current source system or as part of a standard switch procedure). The value *UNPLANNED indicates that the switch is an unplanned activity and the data source system may not be available.

Node roles (ROLE) - This specifies which set of node roles will determine the node that becomes the new primary node as a result of the switch. The default value *CURRENT uses the current order of node roles. If the application group participates in a cluster, the current roles defined within the CRGs will be used. If *CONFIG is specified, the configured primary node will become the new primary node and the new role of other nodes in the recovery domain will be determined from their current roles. If you specify a name of a node within the recovery domain for the application group, the node will be made the new primary node and the new role of other nodes in the recovery domain will be determined from their current roles.


What procedure will be used? The following parameters identify the procedure to use and its starting point:

Begin at step (STEP) - Specifies where the request will start within the specified procedure. This parameter is described in detail below.

Procedure (PROC) - Specifies the name of the procedure to run to perform the requested operation when starting from its first step. The value *DFT will use the procedure designated as the default for the application group. The value *LASTRUN uses the same procedure used for the previous run of the command. You can also specify the name of a procedure that is valid for the specified application group and type of request.

Where should the procedure begin? The value specified for the Begin at step (STEP) parameter on the request to run the procedure determines the step at which the procedure will start. The status of the last run of the procedure determines which values are valid.

The default value, *FIRST, will start the specified procedure at its first step. This value can be used when the procedure has never been run, when its previous run completed (*COMPLETED or *COMPERR), or when a user acknowledged the status of its previous run which failed, was canceled, or completed with errors (*ACKFAILED, *ACKCANCEL, or *ACKERR respectively).

Other values are for resolving problems with a failed or canceled procedure. When a procedure fails or is canceled, subsequent attempts to run the same procedure will fail until user action is taken. You will need to determine the best course of action for your environment based on the implications of the canceled or failed steps and any steps which completed.

The value *RESUME will start the last run of the procedure beginning with the step at which it failed, the step that was canceled in response to an error, or the step following where the procedure was canceled. The value *RESUME may be appropriate after you have investigated and resolved the problem which caused the procedure to end. Optionally, if the problem cannot be resolved and you want to resume the procedure anyway, you can override the attributes of a step before resuming the procedure.

The value *OVERRIDE will override the status of all runs of the specified procedure that did not complete. The *FAILED or *CANCELED status of these procedures are changed to acknowledged (*ACKFAILED or *ACKCANCEL) and a new run of the procedure begins at the first step.
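As an illustration of how these parameters fit together, the following command-line sketch shows one way a planned switch of a single application group might be requested. The application group name (MYAPPGRP) is a placeholder; the keywords and values are those described above, so verify them against the prompted SWTAG display before use.

SWTAG AGDFN(MYAPPGRP) TYPE(*ALL) DTARSCGRP(*ALL) SWTTYP(*PLANNED) ROLE(*CURRENT) PROC(*DFT) STEP(*FIRST)

To resume a run of the same procedure that failed, after the problem has been resolved, a similar request would specify STEP(*RESUME) instead of STEP(*FIRST).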

For more information about starting a procedure with the step at which it failed, see “Resuming a procedure” on page 91.

For more information about customizing procedures, see the MIMIX Administrator Reference book.

Switching data group environments with MIMIX Model Switch Framework

Note: MIMIX Model Switch Framework does not support switching application groups. Only data groups that are not associated with application groups should be switched with MIMIX Model Switch Framework.


MIMIX provides a customized implementation of MIMIX Model Switch Framework to perform a switch. MIMIX Model Switch Framework is ideally suited for customizing a switching solution that detects the need for an unplanned switch, switches the direction of data group replication, and switches users to the backup system. Typically, if you have a Runbook, it will direct you when to use your MIMIX Model Switch Framework implementation for both planned and unplanned switches.

The MIMIX Model Switch Framework calls the Switch Data Group (SWTDG) command. The SWTDG command only switches the direction in which replication occurs for a single data group; it does not switch users or any other facets of your normal operating environment to the backup system. However, MIMIX Model Switch Framework can be configured to address these additional facets of your environment for multiple data groups. If you choose to use the SWTDG command either by invoking it from a command line or by using the options for switching on the Work with Data Groups display, you must take action to switch users to the backup system and address other requirements for operating there.

The switching options on the MIMIX Basic Main Menu are implementations of MIMIX Model Switch Framework. The implementation that is used is identified within policies.

Instructions for switching using MIMIX Model Switch Framework are described in “Switching a data group-only environment” on page 251.

For additional information see the chapter “Using the MIMIX Model Switch Framework” in the Using MIMIX Monitor book.


Switching an application group

For an application group, a procedure for only one operation (start, end, or switch) can run at a time. For details about parameters and behavior of the SWTAG command, see “Switching application group environments with procedures” on page 247.

To switch an application group, do the following:

1. From the Work with Application Groups display, type 15 (Switch) next to the application group you want and press Enter.

The Switch Application Group (SWTAG) display appears.

2. Verify that the values you want are specified for Resource groups and Data resource group entry.

3. Specify the type of switch to perform at the Switch type prompt.

4. Verify that the default value *CURRENT for the Node roles prompt is valid for the switch you need to perform. If necessary, specify a different value.

5. If you are starting the procedure after addressing problems with the previous switch request, specify the value you want for Begin at step. Be certain that you understand the effect the value you specify will have on your environment.

6. Press Enter.

7. The Procedure prompt appears. Do one of the following:

• To use the default switch procedure for the specified switch type, press Enter.

• To use a different switch procedure for the application group, specify its name. Then press Enter.

8. A switch confirmation panel appears. To perform the switch, press F16.

Switching a data group-only environment

In environments that do not use application groups, option 5 (Start or complete switch using Switch Asst.) on the MIMIX Basic Main Menu is designed to simplify switching by using a default MIMIX Model Switch Framework implementation. When you use this option, MIMIX keeps track of which phase of the switch process you are in. You will see a confirmation display that is appropriate for each phase. Each phase will prompt the Run Switch Framework command (RUNSWTFWK) with your default switch framework and appropriate values for the phase.

To change the default switch framework to a different implementation, see “Policies for switching with model switch framework” on page 48.

Switching to the backup system

This procedure switches operations to the backup system.

Before using this procedure, consult your runbook for any additional procedures that must be performed when switching to the backup system.

1. If this is a planned switch, Vision Solutions strongly recommends that you perform a full set of audits with the audit level policy set to level 30. Running the #FILDTA audit at this audit level checks 100 percent of file member data for the data group for synchronization between source and target systems.

2. Shut down all active applications that are reading or updating replicated objects from the production and backup systems.

Do the following from the backup system:

3. Ensure that all transactions have been applied to the backup system by doing the following:

a. Select option 6 (Work with data groups) from the MIMIX Basic Main Menu and press Enter.

b. For each data group, select option 8 (Display status) and ensure that the Unprocessed entry counts for both database and object apply have no values.

4. From the MIMIX Basic Main Menu, select option 5 (Start or complete switch using Switch Asst.).

5. You will see the Confirm Switch to Backup confirmation display. Press F16 to confirm your choice to switch MIMIX and specify switching options.

6. The Run Switch Framework (RUNSWTFWK) command appears. The default Switch framework and the value *BCKUP for the Switch framework process are preselected and cannot be changed. Do the following:

a. You must specify the type of switch to perform, *PLANNED or *UNPLANNED, at the Switch type prompt.

b. You can change values for other parameters as needed.

c. To start the switch, press Enter.

7. Consult your runbook to determine if any additional steps are needed.
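For reference only, the command prompted in step 6 resembles the following sketch for a planned switch. The parameter keywords shown (SWTFWK, PROCESS, and SWTTYP) and the framework name MYSWTFWK are assumptions used for illustration; in practice, use the prompted display, which preselects your default switch framework and the *BCKUP process for you.

RUNSWTFWK SWTFWK(MYSWTFWK) PROCESS(*BCKUP) SWTTYP(*PLANNED)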


After you complete this phase of the switch you must wait until the original production system is available again. Then perform the steps in “Synchronizing data and starting MIMIX on the original production system” on page 252.

Synchronizing data and starting MIMIX on the original production system

This procedure synchronizes data and starts replication from the backup system to the original production system. Synchronizing the data ensures that the data on both systems is equivalent before replication is started.

Before using this procedure, consult your runbook for any additional procedures that must be performed when synchronizing and starting replication from the backup system to the original production system.

Do the following from the backup system:

1. Ensure the original production system is available again.

2. From the MIMIX Basic Main Menu, select option 5 (Start or complete switch using Switch Asst.).

3. You will see the Confirm Synchronize and Start confirmation display. Press F16 to confirm your choice and specify switching options.

4. The Run Switch Framework (RUNSWTFWK) command appears. The default Switch framework and the value *SYNC for the Switch framework process are preselected and cannot be changed. Do the following:

a. Optionally, you can change the value of the Set object auditing level prompt.

b. To synchronize and start, press Enter.

5. Once replication has caught up, Vision Solutions strongly recommends that you perform a full set of audits with the audit level policy set to level 30. Running the #FILDTA audit at this audit level checks 100 percent of file member data for the data group for synchronization between source and target systems.

6. Consult your runbook to determine if any additional steps are needed.

When you are ready to switch back to the original production system, use “Switching to the production system” on page 252.

Switching to the production system

This procedure returns operations to the original production system.

Before using this procedure, consult your runbook for any additional procedures that must be performed when switching to the production system.

1. Shut down all active applications that are reading or updating replicated objects from the production and backup systems.

Do the following from the original production system:

2. Ensure that all transactions have been applied by doing the following:

a. Select option 6 (Work with data groups) from the MIMIX Basic Main Menu and press Enter.


b. For each data group, select option 8 (Display status) and ensure that the Unprocessed entry counts for both database and object apply have no values.

3. From the MIMIX Basic Main Menu, select option 5 (Start or complete switch using Switch Asst.).

4. You will see the Confirm Switch to Production confirmation display. Press F16 to confirm your choice to switch MIMIX and specify switching options.

5. The Run Switch Framework (RUNSWTFWK) command appears. The default Switch framework and the value *PROD for the Switch framework process are preselected and cannot be changed. Do the following:

a. You can change values for other parameters as needed.

b. To start the switch, press Enter.

6. Consult your runbook to determine if any additional steps are needed.

Determining when the last switch was performed

Replication environments rarely remain static. Therefore, best practice is to perform regular switches to ensure that you are prepared should you need to perform one during an emergency.

The Last switch field indicates compliance with best practices. The status of the field is highlighted to indicate the following:

Yellow - The number of days since the last switch is at the limit of what is considered to be best practice. This threshold is determined by the Switch warning threshold policy.

Red - The number of days since the last switch is beyond what is considered to be best practice. This threshold is determined by the Switch action threshold policy.

Checking the last switch date

A 5250 emulator session provides information on the last switch date for an installation from the Last switch field on the MIMIX Availability Status display. This field is only displayed when a value is specified for the Default model switch framework policy. The date indicates when the last completed switch was performed using the switch framework specified in the policy.

To check the last switch date from a 5250 emulator, do the following:

1. Access the MIMIX Basic Main Menu. See “Accessing the MIMIX Main Menu” on page 24.

2. From the MIMIX Basic Main Menu, select option 10 (Availability status) and press Enter. The MIMIX Availability Status display appears. The last switch date is located in the upper right corner of the display.


Problems checking switch compliance

The Last switch field indicates the switch compliance status and provides the date when the last switch was performed. This field is displayed correctly when certain requirements have been met. The following problems can occur:

• Approaching or out of compliance - The status of the field is highlighted to indicate the number of days since the last switch is at the limit of what is considered to be best practice. Schedule and perform a switch to resolve this problem.

• No Last switch field - This field is only displayed when there is a value specified for the Default model switch framework policy. The date indicates when the last completed switch was performed using the switch framework specified in the policy. Specify the name of the model switch framework you use for switching in policies. See “Policies for switching with model switch framework” on page 48.


Performing a data group switch

Performing a data group switch changes the direction of replication for a data group through the Switch Data Group (SWTDG) command. Only replication for the selected data group is switched. You may want to perform a data group switch if you are having problems with an application that only affects a specific data group or if you need to manually load balance because of heavily used applications.

Note: You cannot switch a disabled data group. For more information, see “Disabling and enabling data groups” on page 269.

To perform a data group switch, do the following:

1. If you will be performing a planned switch, do the following:

a. Shut down any applications that have database files or objects defined to the data group.

b. Ensure that you have addressed any critical database files that are held due to error or held for other reasons.

c. Ensure there are no pending object activity entries by entering: WRKDGACTE STATUS(*ACTIVE)

2. From the Work with Data Groups display, type the option for the type of switch you want next to the data group you want to switch and press Enter.

• Use option 15 for a planned switch

• Use option 16 for an unplanned switch

3. Some of the parameter values that you may want to consider when the Switch Data Group display appears are:

• If you specified a Switch type of *PLANNED and a number for the Wait time (seconds) parameter, you can use the Timeout option parameter to control the action the SWTDG command takes if the specified wait time is exceeded. For a planned switch, you may want to specify the number of seconds to wait for all active data group processes to end. If you specify *NOMAX, the switch process waits until all data group processes have ended, which could delay the switch.

• You can use the Conditions that end switch parameter to specify the types of errors that you want to end the switch operation. To ensure that the most comprehensive checking options are used, choose *ALL. For a planned switch, the default value, *DFT, is the same as *ALL. For an unplanned switch, *DFT will prevent the switch only when database apply backlogs exist.

• Verify that the value for the Start journaling on new source prompt is what you want. If necessary, change the value.

4. After the confirmation screen, press F16 to continue.

5. Press Enter. Messages appear indicating the status of the switch request. When you see a message indicating that the switch is complete, users can begin processing as usual on the temporary source system.


6. If you performed an unplanned switch, perform journal analysis on the original source system as soon as it is available, to determine if any transactions were missed. Use topic “Performing journal analysis” on page 295.

7. Start the data group, clearing pending entries, using the procedure in “Starting selected data group processes” on page 181. This starts replication in the new temporary direction.


Switch Data Group (SWTDG) command

The Switch Data Group (SWTDG) command provides the following parameters to control how you want your switch operation handled:

• The Wait time (seconds) parameter (WAIT) is used to specify the number of seconds to wait for all of the active data group processes to end. The function of the default value *DFT is different for planned switches than it is for unplanned switches. For a planned switch, the value *DFT is equivalent to the value *NOMAX. For an unplanned switch, the value *DFT is set to wait 300 seconds (5 minutes) for all of the active data group processes to end.

• If you specify a value for the WAIT parameter you can use the Timeout option parameter (TIMOUTOPT) to specify what action to take when the wait time you specified is reached. The function of the default value *DFT is different for planned switches than it is for unplanned switches. For a planned switch, the value *DFT is equivalent to the value *QUIT. When the value specified for the WAIT parameter is reached, the current process quits and returns control to the caller. For an unplanned switch, the value *DFT is equivalent to the value *NOTIFY. When the value specified for the WAIT parameter is reached, an inquiry message is sent to notify the operator of a possible error condition.

• The Conditions that end switch (ENDSWT) parameter is used to specify which conditions should end the switch process. The function of the default value *DFT is different for planned switches than it is for unplanned switches.

– For a planned switch, the value *DFT is equivalent to the value *ALL. The value *ALL provides the most comprehensive checking for conditions that are not compatible with best practices for switching. Additionally, the value *ALL ensures that your programs will automatically include any future ENDSWT parameter values that may be added to maintain a conservative approach to the switching operation.

– For an unplanned switch, the value *DFT ends the process if there are any backlogs for the database apply process. However, backlogs on other user journal processes are not checked and switch processing is not ended even though conditions may exist which are not compatible with best practices for switching and may result in the loss of data.

• The Start journaling on new source (STRJRNSRC) parameter is used to specify whether you want to start journaling for the data group on the new source system.

• The End journaling on new target (ENDJRNTGT) parameter is used to specify whether you want to end journaling of the data group on the new target system.

• The End remote journaling (ENDRJLNK) parameter is used in a planned switch of a data group that uses remote journaling. This parameter specifies whether you want to end remote journaling for the data group. The default behavior is to leave the RJ link running. You need to consider whether to keep the RJ link active after a planned switch of a data group. For more information, see “When to end the RJ link” on page 188.

• The Change user journal receiver (CHGUSRRCV) parameter is used to specify whether or not you want MIMIX to create and attach a new user (database) journal receiver during the switch operation. If you have applications that are dependent on the receiver name for recovery purposes, it is recommended that you choose CHGUSRRCV(*NO) to prevent a new journal receiver from being created during a data group switch.

• The Change system journal receiver (CHGSYSRCV) parameter is used to specify whether or not you want MIMIX to create and attach a new journal receiver to the system (audit) journal (QAUDJRN) during the switch operation. If you have applications that are dependent on the receiver name for recovery purposes, it is recommended that you choose CHGSYSRCV(*NO) to prevent a new journal receiver from being created during a data group switch.

• The End if database errors (ENDDBERR) parameter has been obsoleted by the Conditions that end switch (ENDSWT) parameter. Previously, the ENDDBERR parameter was used to specify whether to switch the data group when data replication errors exist. Use the ENDSWT parameter and specify *DBERR to produce the equivalent of ENDDBERR(*YES), or *NONE to produce the equivalent of ENDDBERR(*NO).

• The Confirm (CONFIRM) parameter is used to specify if a confirmation panel is displayed. The default is *NO (the confirmation panel is not displayed). Note that options for switching on the Work with Data Groups display call the SWTDG command with *YES specified so that the confirmation panel is automatically displayed and the user must press F16 to continue.
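The following command-line sketch shows how several of these parameters might be combined for a planned switch. The data group name (MYDG) is a placeholder, and the DGDFN and TYPE keywords are assumptions; the remaining keywords and values are those documented above, so verify them against the prompted SWTDG display before use.

SWTDG DGDFN(MYDG) TYPE(*PLANNED) WAIT(300) TIMOUTOPT(*QUIT) ENDSWT(*ALL) CHGUSRRCV(*NO) CONFIRM(*YES)

In this sketch, the switch ends if any condition checked by *ALL is detected, waits up to 300 seconds for the data group processes to end before quitting, leaves the user journal receiver unchanged, and displays the confirmation panel before proceeding.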


CHAPTER 14 Less common operations

This chapter describes how to perform infrequently used operations that help keep your MIMIX environment running. The following topics are included:

• “Starting the TCP/IP server” on page 260 contains the procedure for starting the TCP/IP server.

• “Ending the TCP/IP server” on page 261 contains the procedure for ending the TCP/IP server.

• “Working with objects” on page 262 contains tips for working with long object and IFS path names.

• “Viewing status for active file operations” on page 263 describes how to check status when replicating database files that you are reorganizing or copying with MIMIX Promoter.

• “Displaying a remote journal link” on page 264 describes how to display information about the link between a source journal definition and a target journal definition.

• “Displaying status of a remote journal link” on page 265 includes procedures for determining whether a data group uses remote journaling and for checking the status of a remote journal link.

• “Identifying data groups that use an RJ link” on page 267 includes the procedure to determine which data groups use a remote journal link.

• “Identifying journal definitions used with RJ” on page 268 describes how to determine whether a journal definition is defined to one or more remote journal links.

• “Disabling and enabling data groups” on page 269 describes when it can be beneficial to disable and enable data groups. Procedures for these processes are included in this topic.

• “Determining if non-file objects are configured for user journal replication” on page 271 provides procedures for determining whether IFS objects, data areas, and data queues are configured to be cooperatively processed through the user journal.

• “Using file identifiers (FIDs) for IFS objects” on page 273 describes file identifiers (FIDs) which are used by commands to uniquely identify the correct IFS tracking entries to process.

• “Operating a remote journal link independently” on page 274 describes how to configure, start, and end a remote journal link without defining data to be replicated by MIMIX processes.


Starting the TCP/IP server

Use this procedure if you need to manually start the TCP/IP server.

Once the TCP communication connections have been defined in a transfer definition, the TCP server must be started on each of the systems identified by the transfer definition.

You can also start the TCP/IP server automatically through an autostart job entry. Either you can change the transfer definition to allow MIMIX to create and manage the autostart job entry for the TCP/IP server, or you can add your own autostart job entry. MIMIX only manages entries for the server when they are created by transfer definitions.

When configuring a new installation, transfer definitions and MIMIX-added autostart job entries do not exist on other systems until after the first time the MIMIX managers are started. Therefore, during initial configuration you may need to manually start the TCP server on the other systems using the STRSVR command.

Note: Use the host name and port number (or port alias) defined in the transfer definition for the system on which you are running this command.

Do the following on the system on which you want to start the TCP server:

1. From the MIMIX Intermediate Main Menu, select option 13 (Utilities menu) and press Enter.

2. The Utilities Menu appears. Select option 51 (Start TCP server) and press Enter.

3. The Start Lakeview TCP Server display appears. At the Host name or address prompt, specify the host name or address for the local system as defined in the transfer definition.

4. At the Port number or alias prompt, specify the port number or alias as defined in the transfer definition for the local system.

Note: If you specify an alias, you must have an entry in the service table on this system that equates the alias to the port number.

5. Press Enter.

6. Verify that the server job is running under the MIMIX subsystem on that system. You can use the Work with Active Jobs (WRKACTJOB) command to look for a job under the MIMIXSBS subsystem with a function of PGM-LVSERVER.
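As a sketch of this step for initial configuration, the STRSVR command might be entered on the remote system as shown below. The HOST and PORT keywords, host name, and port number are assumptions for illustration; substitute the values defined in the transfer definition for the system on which you run the command.

STRSVR HOST('SYSTEMB') PORT(50410)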


Ending the TCP/IP server

End the TCP server on both systems defined by the transfer definition. One example of why you might end the TCP server is when you are preparing to upgrade the MIMIX products in a product library.

Note: Use the host name and port number (or port alias) defined in the transfer definition for the system on which you are running this command.

To end the TCP server on a system, do the following:

1. From the MIMIX Intermediate Main Menu, select option 13 (Utilities menu) and press Enter.

2. The Utilities Menu appears. Select option 52 (End TCP server) and press Enter.

3. The End Lakeview TCP Server display appears. At the Host name or address prompt, specify the host name for the local system as specified in the transfer definition.

4. At the Port number or alias prompt, verify that the value shown is what you want. If necessary change the value.

Note: If the configuration uses port aliases, specify the alias for the local system. Otherwise, specify the port number for the local system.

5. Press Enter.


Working with objects

When working with objects, these tips may be helpful.

Displaying long object names

The names of some IFS entries cannot be fully displayed in the limited space on a "Work with" display. These entries are shown with a ‘>’ character in the right-most column of the Object field.

You can display long object names from the following displays:

• Work with Data Group IFS Entries display

• Work with Data Group Activity

• Work with Data Group Activity Entries

To display the entire object name from any of these displays, position the cursor on an entry which indicates a long name and press F22 (Display entire field).

Considerations for working with long IFS path names

MIMIX currently replicates IFS path names of up to 512 characters. However, any MIMIX command that takes an IFS path name as input may be subject to a 506-character limit. This limit is reduced even further if the IFS path name contains embedded apostrophes ('). In that case, the supported IFS path name length is reduced by four characters for every apostrophe the path name contains.
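For example, a command whose IFS path name input contains two embedded apostrophes would support a path name of at most 498 characters (506 minus 4 characters for each of the two apostrophes).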

For information about IFS path name naming conventions, refer to the IBM book, Integrated File System Introduction V5R4.

Displaying data group spooled file information

If spooled files are created as a result of MIMIX replication, you can access the spooled file and the associated data group entry from the Work with Data Group Activity display.

To access the spooled file information, do the following:

1. From the MIMIX Basic Main Menu, select option 6 (Work with data groups) and press Enter.

2. The Work with Data Groups display appears. Select option 14 (Active objects) for the data group you want to view and press Enter. The Work with Data Group Activity display appears.

3. From this display, press F16 (Spooled Files) to access the Display Data Group Spooled Files display. This display lists all of the current spooled files and shows the mapping of their names between the source and target systems.


Viewing status for active file operations

If you are replicating database files that you are reorganizing or copying with MIMIX Promoter, you can check on the status of these operations. Do the following:

1. From the MIMIX Basic Main Menu, use F21 (Assistance level) to access the intermediate menu.

2. From the MIMIX Intermediate Main Menu, select option 13 (Utilities menu) and press Enter.

3. From the MIMIX Utilities Menu, select option 63 (Work with copy status) and press Enter.

4. The Work with Copy Status display appears. From this display you can track the status of active copy or reorganize operations, including the replication of physical file data as specified by METHOD(*DATA) on the Synchronize Data Group File Entry (SYNCDGFE) command.

Note: You can only see status for the system on which you are working.


Displaying a remote journal link

To display information about the link between a source journal definition and a target journal definition, do the following:

1. From the Work with RJ Links display, type a 5 (Display) next to the entry you want and press Enter.

2. The Display Remote Journal Link (DSPRJLNK) display appears, showing the current values defined for the link.

Displaying status of a remote journal link

To check the status of a remote journal link, do the following:

1. Type the command WRKRJLNK and press Enter.

2. The Work with RJ Links display appears with a list of defined links.

The Dlvry column indicates the configured value for how the IBM i remote journal function sends the journal entries from the source journal to the target journal. The possible values for delivery are asynchronous (*ASYNC) and synchronous (*SYNC).

*ASYNC - Journal entries are replicated asynchronously, independent of the applications that create the journal entries. The applications continue processing while an independent system task delivers the journal entries. If a failure occurs on the source system, journal entries on the source system may become trapped because they have not been delivered to the target system.

*SYNC - Journal entries are replicated synchronously. The applications do not continue processing until after the journal entries are sent to the target journal. If a failure occurs on the source system, the target system contains the journal entries that have been generated by the applications.

The State column represents the composite view of the state of the remote journal link. Because the RJ link has both a source and a target component, the state shown is that of the component which has the most severe state. Table 49 shows the possible states of an RJ link, listed in order from most severe to least severe.

Table 49. Possible states for RJ links, shown in order starting with most severe.

The following states are considered to be inactive:

*UNKNOWN - Neither journal defined to the remote journal link resides on the local system, so the state of the link cannot be checked.

*NOTAVAIL - The ASP where the journal is located is varied off.

*NOTBUILT - The remote journal link is defined to MIMIX but one of the associated journal environments has not been built.

*SRCNOTBLT - The remote journal link is defined to MIMIX but the associated source journal environment has not been built.

*TGTNOTBLT - The remote journal link is defined to MIMIX but the associated target journal environment has not been built.

*FAILED - The remote journal cannot receive journal entries from the source journal due to an error condition.

*CTLINACT - The remote journal link is processing a request for a controlled end.

*INACTIVE - The remote journal link is not active.

The following states are considered to be active:

*INACTPEND - An active remote journal link is in the process of becoming inactive. For asynchronous delivery, this is a transient state that will resolve automatically. For synchronous delivery, one system is inactive while the other system is inactive with pending unconfirmed entries.

*SYNCPEND - An active remote journal link is connected using synchronous delivery and is running in catch-up mode. The state will become *SYNC when catch-up mode ends.

*ASYNCPEND - An active remote journal link is connected using asynchronous delivery and is running in catch-up mode. The state will become *ASYNC when catch-up mode ends.

*SYNC - An active remote journal link is connected using synchronous delivery mode.

*ASYNC - An active remote journal link is connected using asynchronous delivery mode.


Identifying data groups that use an RJ link

Use this procedure to determine which data groups use a remote journal link before you end a remote journal link or remove a remote journaling environment.

1. Enter the command WRKRJLNK and press Enter.

2. Make a note of the name indicated in the Source Jrn Def column for the RJ Link you want.

3. From the command line, type WRKDGDFN and press Enter.

4. For all data groups listed on the Work with DG Definitions display, check the Journal Definition column for the name of the source journal definition you recorded in Step 2.

• If you do not find the name from Step 2, the RJ link is not used by any data group. The RJ link can be safely ended or can have its remote journaling environment removed without affecting existing data groups.

• If you find the name from Step 2 associated with any data groups, those data groups may be adversely affected if you end the RJ link. A request to remove the remote journaling environment removes configuration elements and system objects that need to be created again before the data group can be used. Continue with the next step.

5. Press F10 (View RJ links). Consider the following and contact your MIMIX administrator before taking action that will end the RJ link or remove the remote journaling environment.

• When *NO appears in the Use RJ Link column, the data group will not be affected by a request to end the RJ link or to end the remote journaling environment.

Note: If you allow applications other than MIMIX to use the RJ link, they will be affected if you end the RJ link or remove the remote journaling environment.

• When *YES appears in the Use RJ Link column, the data group may be affected by a request to end the RJ link. If you use the procedure for ending a remote journal link independently in topic “Ending a remote journal link independently” on page 274, ensure that any data groups that use the RJ link are inactive before ending the RJ link.


Identifying journal definitions used with RJ

To see whether a journal definition is defined to one or more remote journal links, do the following:

1. From the MIMIX Basic Main Menu, select option 11 (Configuration menu) and press Enter.

2. The MIMIX Configuration menu appears. Select option 3 (Work with journal definitions) and press Enter.

3. The Work with Journal Definitions display appears. The RJ Link column indicates whether or not the journal definition is used by a remote journal link. A blank value indicates the journal definition is not associated with a remote journal link.

Values that indicate the definition is used by a remote journal link are as follows:

*SOURCE - The journal definition is a source journal definition in a remote journal link.

*TARGET - The journal definition is the target journal definition in a remote journal environment.

*BOTH - The journal definition is the source journal definition for one remote journal link and is also a target journal definition for another remote journal link in a cascading environment.

*NONE - The journal definition is not used with the MIMIX RJ support.

4. To see the remote journal links associated with a journal definition, type 12 (Work with RJ Links) and press Enter.

Disabling and enabling data groups

MIMIX supports the concept of disabled data groups in a replication environment. The ability to disable a data group, and enable it later as desired, can be beneficial in a variety of configuration scenarios.

The ability to disable a data group is particularly helpful in advanced cluster scenarios, where inactive data groups may be a necessary component of the replication environment. Because these data groups are inactive as part of the design, the user does not need to be notified when the data groups are in error.

Disabling a data group is also useful in non-cluster situations. If you create a data group for testing purposes, for example, you no longer have to delete the data group in order to clean up your environment when testing is complete. Instead, you can simply disable the data group until it is needed again. This provides the benefit of retaining your object, file, IFS, and DLO entries while the data group is not needed. Additionally, the journal manager does not retain journal receivers that have not been processed by a disabled data group, which allows you to save storage space on your system.

With support for disabled data groups, you also avoid having to start each data group individually when an installation has data groups configured to replicate in different directions. Let us assume you have two sets of data groups: one set configured to replicate from System A to System B, and another set configured to replicate from System B to System A. To start only those data groups replicating from System A to System B, it was previously necessary to start them individually in order to prevent those replicating from System B to System A from starting as well. Now you can disable the data groups you do not want to start and simply start the remaining data groups using the Start MIMIX (STRMMX) command.

Customers with many systems and data groups across varying time zones may find support for disabled data groups useful when performing upgrades. Disabling data groups allows you to stagger upgrades, causing minimal impact to your replication environment. In this situation, you install a new installation and copy the configuration data from the old installation using the Copy Configuration Data (CPYCFGDTA) command. Over a convenient period of time, you can end and disable each data group on the old (original) installation, then enable and start each data group on the new installation. Once all data groups in the old installation are disabled and all data groups in the new installation are enabled, the old installation can be deleted.

A data group is disabled by a user action and has a state of *DISABLE. An enabled data group can be active or inactive. The Change Data Group (CHGDG) command can be used to change the state of a data group.

Only data groups that are inactive and that do not have processes suspended at a recovery point can be disabled. To make a data group inactive, you must end the data group. The request to end the data group will clear any recovery point.

Disabled data groups are indicated by a status of -D (in green) on the Work with Data Groups (WRKDG) display. You can optionally exclude disabled data groups from the display by specifying a different value for the STATE parameter on the WRKDG command. Once a data group that is not part of an application group is disabled, it cannot be started, ended, or switched.


Note: If the data group is part of an application group, the Switch Application Group (SWTAG) procedure may change its state so that it gets enabled and switched. In this case, if you do not want the data group to be switched, change the Allow to be switched (ALWSWT) parameter to *NO in the Data Group Definition (DGDFN).

When a disabled data group is enabled, any pending entries must be cleared when the data group is started. Specify CLRPND(*YES) on the Start Data Group command.

Procedures for disabling and enabling data groups

The Change Data Group (CHGDG) command allows you to disable or enable a data group by changing its state. This command requires that the system manager is active and communication with the remote system is active.

To disable or enable an individual data group, do the following:

1. On a command line, type CHGDG and press Enter. The Change Data Group display appears.

2. At the Data group definition prompts, fill in the values you want or press F4 for a valid list.

3. At the State prompt, do one of the following:

• To keep the state of the data group the same, specify the default, *SAME.

• To change the state of an active data group, you must first end the data group by running the End Data Group (ENDDG) command. See “Ending selected data group processes” on page 198. To disable an enabled data group, specify *DISABLE. When the state of the data group is changed to disabled, the status of the data group changes from *INACTIVE to *DISABLED.

• To enable a disabled data group, specify *ENABLE. When the state of the data group is changed to enabled, the status of the data group changes from *DISABLED to *INACTIVE.

4. Press Enter to confirm your changes.

Note: To start an enabled data group, you must specify *YES for the Clear pending entries prompt on the Start Data Group (STRDG) command.
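As a command-line sketch of this procedure, the following sequence disables a data group and later enables and starts it again. The data group name (MYDG) is a placeholder, and the DGDFN and STATE keywords are assumptions based on the prompts described above; CLRPND(*YES) is required when starting a data group that was previously disabled, as noted above.

ENDDG DGDFN(MYDG)                    /* Make the data group inactive                 */
CHGDG DGDFN(MYDG) STATE(*DISABLE)    /* Status changes from *INACTIVE to *DISABLED   */
CHGDG DGDFN(MYDG) STATE(*ENABLE)     /* Status changes from *DISABLED to *INACTIVE   */
STRDG DGDFN(MYDG) CLRPND(*YES)       /* Start replication, clearing pending entries  */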


Determining if non-file objects are configured for user journal replication

MIMIX can take advantage of IBM i journaling functions that provide change-level details in journal entries in a user journal for object types other than files (*FILE). When properly configured, MIMIX can cooperatively process IFS stream files, data areas, and data queues between system journal and user journal replication processes. This enables changes to data or attributes to be replicated through the user journal instead of replicating the entire object through the system journal every time a change occurs.

Determining how IFS objects are configured

In order for IFS objects to be replicated from the user journal, one or more data group IFS entries must be configured to process cooperatively with the user journal. Also, IFS tracking entries must exist for the object identified by the data group IFS entries.

To determine if a data group has any IFS objects that are configured for user journal replication and has any corresponding IFS tracking entries, do the following:

1. From the MIMIX Basic Main Menu select option 6 (Work with data groups) and press Enter.

2. The Work with Data Groups display appears. Type 22 (IFS entries) next to the data group you want and press Enter.

The Work with DG IFS Entries display appears, showing the IFS entries configured for the data group.

3. Press F10 twice to access the CPD view.

4. The values shown in the Coop with DB column indicate how objects identified by the data group IFS entries will be replicated.

• Entries with the value *YES are configured for user journal replication. Continue with the next step to ensure that IFS tracking entries exist for the IFS objects. Replication cannot occur without tracking entries.

• Entries with the value *NO are configured for system journal replication.

To view additional information for a data group IFS entry, type 5 (Display) next to the entry and press Enter.

5. Press F12 (Cancel) to return to the Work with Data Groups display. Then type 50 (IFS trk entries) next to the data group you want and press Enter.

6. The Work with DG IFS Trk. Entries display appears with a list of tracking entries for the IFS objects identified for replication by the data group. If there are no tracking entries listed but Step 4 indicates that properly configured data group IFS entries exist, the tracking entries must be loaded. For more information about loading tracking entries, see the MIMIX Administrator Reference book.


Determining how data areas or data queues are configured

In order for data area and data queue objects to be replicated from the user journal, one or more data group object entries must be configured to process cooperatively with the user journal. Also, object tracking entries must exist for the object identified by the data group object entries.

To determine if a data group has any data area or data queue objects that are configured for user journal replication and has any corresponding object tracking entries, do the following:

1. From the MIMIX Basic Main Menu select option 6 (Work with data groups) and press Enter.

2. The Work with Data Groups display appears. Type 20 (Object entries) next to the data group you want and press Enter.

The Work with DG Object Entries display appears, showing the object entries configured for the data group.

3. For each entry in the list, do the following:

a. Type a 5 (Display) next to the entry and press Enter.

b. The object entry must have the following values specified in the fields indicated:

• The Object type field must be *ALL, *DTAARA, or *DTAQ

• The Cooperate with database field must be *YES

• The Cooperating object types field must specify *DTAARA to replicate data areas and *DTAQ to replicate data queues.

4. Press F12 (Cancel) to return to the Work with Data Groups display. Then type 52 (Obj trk entries) next to the data group you want and press Enter.

5. The Work with DG Obj. Trk. Entries display appears with a list of tracking entries for the data area and data queue objects identified for replication by the data group. If there are no tracking entries listed but Step 3 indicates that properly configured data group object entries exist, the tracking entries must be loaded. For more information about loading tracking entries, see the MIMIX Administrator Reference book.


Using file identifiers (FIDs) for IFS objects

Commands used for user journal replication of IFS objects use file identifiers (FIDs) to uniquely identify the correct IFS tracking entries to process. The System 1 file identifier and System 2 file identifier prompts ensure that IFS tracking entries are accurately identified during processing. These prompts can be used alone or in combination with the System 1 object prompt.

These prompts enable the following combinations:

• Processing by object path: A value is specified for the System 1 object prompt and no value is specified for the System 1 file identifier or System 2 file identifier prompts.

When processing by object path, a tracking entry is required for all commands with the exception of the SYNCIFS command. If no tracking entry exists, the command cannot continue processing. If a tracking entry exists, a query is performed using the specified object path name.

• Processing by object path and FIDs: A value is specified for the System 1 object prompt and a value is specified for either or both of the System 1 file identifier or System 2 file identifier prompts.

When processing by object path and FIDs, a tracking entry is required for all commands. If no tracking entry exists, the command cannot continue processing. If a tracking entry exists, a query is performed using the specified FID values. If the specified object path name does not match the object path name in the tracking entry, the command cannot continue processing.

• Processing by FIDs: A value is specified for either or both of the System 1 file identifier or System 2 file identifier prompts and, with the exception of the SYNCIFS command, no value is specified for the System 1 object prompt. In the case of SYNCIFS, the default value *ALL is specified for the System 1 object prompt.

When processing by FIDs, a tracking entry is required for all commands. If no tracking entry exists, the command cannot continue processing. If a tracking entry exists, a query is performed using the specified FID values.

Operating a remote journal link independently

You can configure, start, and end a remote journal link without defining data to be replicated by MIMIX processes. For example, you might have a need to use remote journals without performing data replication. The Start Remote Journal Link (STRRJLNK) and End Remote Journal Link (ENDRJLNK) commands provide this capability.

Note: These commands should only be used by personnel with experience using the IBM i remote journal function.

For most needs, the RJ link support that is integrated into the commands which start and end replication processes (STRMMX, STRDG, ENDMMX, and ENDDG) is sufficient.

Starting a remote journal link independently

To start a remote journal link separately from other MIMIX processes, do the following:

1. To access the Work with RJ Links display, type the command WRKRJLNK and press Enter.

2. From the Work with RJ Links display, type a 9 (Start) next to the link in the list that you want to start and press Enter.

3. The Start Remote Journal Link (STRRJLNK) display appears. Specify the value you want for the Starting journal receiver prompt.

4. To start remote journaling for the specified link, press Enter.

Ending a remote journal link independently

Default values for this command will perform an immediate end for the specified link. Be aware that the actions taken by the ENDOPT parameter on this command are different from the actions taken when you perform an immediate or controlled end of a MIMIX data group. For more information about the differences between this command and the End Data Group (ENDDG) command, see the MIMIX Reference book.

For the following situations, an immediate end is always performed (the value specified for the ENDOPT parameter is ignored):

• The remote journal function is running in synchronous mode (DELIVERY(*SYNC)).

• The remote journal function is performing catch-up processing.

To end a remote journal link separately from other MIMIX processes, do the following:

1. To access the Work with RJ Links display, type the command WRKRJLNK and press Enter.

2. From the Work with RJ Links display, type a 10 (End) next to the link in the list that you want to end.

3. Do one of the following:

• To perform an immediate end from the source system, press Enter. This completes the procedure for an immediate end.


• To perform a controlled end or to end from the target system, press F4 (Prompt), then continue with the next step.

4. The End Remote Journal Link (ENDRJLNK) display appears. Press F10 (Additional parameters).

5. To perform a controlled end, specify *CNTRLD at the End remote journal link prompt. If you need to end from the target system, specify *TGT at the End RJ link on system prompt. To process the request, press Enter.
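As a sketch only, a controlled end of a remote journal link requested from the target system might look like the following. The ENDOPT keyword and the *CNTRLD and *TGT values are documented above, but the keyword used to identify the link (shown here as RJLNK) and the keyword for the system on which to end (shown here as ENDSYS) are assumptions for illustration; prompt the command with F4 to see the actual parameters.

ENDRJLNK RJLNK(MYRJLNK) ENDOPT(*CNTRLD) ENDSYS(*TGT)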

CHAPTER 15 Troubleshooting - where to start

Occasionally, a situation may occur that requires user intervention. This section provides information to help you troubleshoot problems that can occur in a MIMIX environment.

You can also consult our website at www.mimix.com for the latest information and updates for MIMIX products.

The following topics are included in this chapter:

• “Gathering information before reporting a problem” on page 278 describes the information you should gather before you report a problem. A procedure is included to help you gather this information.

• “Reducing contention between MIMIX and user applications” on page 279 describes a processing timing issue that may be resolved by specifying an Object retrieval delay value on the commands for creating or changing data group entries.

• “Data groups cannot be ended” on page 280 describes possible causes for a data group that is taking too long to end.

• “Verifying a communications link for system definitions” on page 281 describes the process to verify that the communications link defined for each system definition is operational.

• “Verifying the communications link for a data group” on page 282 includes a process to use before synchronizing data to ensure that the communications link for the data group is active.

• “Checking file entry configuration manually” on page 283 includes the process for checking that correct data group file entries exist with respect to the data group object entries. This process uses the Check DG File Entries (CHKDGFE) command.

• “Data groups cannot be started” on page 285 describes some common reasons why a data group may not be starting.

• “Cannot start or end an RJ link” on page 286 describes possible reasons that can prevent you from starting or ending an RJ link. This topic includes a procedure for removing unconfirmed entries to free an RJ link.

• “RJ link active but data not transferring” on page 287 describes why an RJ link may not be transferring data and how to resolve this problem.

• “Errors using target journal defined by RJ link” on page 288 describes why errors when using a target journal defined by an RJ link can occur and how to resolve them.

• “Verifying data group file entries” on page 289 includes a procedure for verifying data group file entries using the Verify Data Group File Entries (VFYDGFE) command.

• “Verifying data group data area entries” on page 289 includes a procedure for verifying data group data area entries using the Verify Data Group Data Area Entries (VFYDGDAE) command. Data area entries are only used when data areas are replicated by the data area poller process, which is not preferred.

• “Verifying key attributes” on page 289 includes a procedure for verifying key attributes using the VFYKEYATR (Verify Key Attributes) command.

• “Working with data group timestamps” on page 291 describes timestamps and includes information for creating, deleting, displaying, and printing them.

• “Removing journaled changes” on page 294 describes the configuration conditions that must be met in order to use the Remove Journaled Changes (RMVJRNCHG) command.

• “Performing journal analysis” on page 295 describes and includes the procedure for performing journal analysis of the source system.


Gathering information before reporting a problem

Before you report a problem, you should gather the following information:

• The MIMIX product, library, installed version, and IBM i operating system level on the system you are using. To determine this information, follow the procedure “Obtaining MIMIX and IBM i information from your system” on page 278.

• The Message ID number for any error messages associated with the problem. If you receive error messages, record the message number, any replacement text (such as “Process X failed for file Y”), and the to and from program information, if available. Since many messages have similar text, this information is much more helpful to us and enables us to handle your call more efficiently.

• The specific operation you were attempting to perform when the error condition occurred. It is important that we understand what you were trying to do when you encountered the problem. Try to write down the specific sequence of events that you were doing when the error condition occurred, such as the commands entered, the display you were working from, or the program that was running.

Obtaining MIMIX and IBM i information from your system

To obtain the necessary MIMIX and IBM i information before reporting a problem, do the following:

1. Do one of the following to access the Lakeview Technology Installed Products display:

• If you are configured for a MIMIX replication environment, select option 31 (Product management menu). Then select option 2 (Work with products).

• From a command line, enter LAKEVIEW/WRKPRD

2. Next to the product you want, type a 6 (About version) and press Enter. The About pop-up appears, showing the Product, Library, Installed version, and the OS/400 level on this system.

3. Press F9 (Fixes) to see the Work with Installed Fixes display. From this display you can determine the latest level of the MIMIX cumulative fix package that is installed.

Note: You should know the version and release level (VnRnMn) of the IBM i operating system that is on each system with which you are working. Use the process above on each system.

Reducing contention between MIMIX and user applications

If your applications are failing in an unexpected manner, it may be caused by MIMIX locking your objects for object retrieval processing while your applications are trying to access the object. This is a processing timing issue and can be significantly reduced, or eliminated, by specifying an appropriate delay value for the Object retrieval delay element under the Object processing (OBJPRC) parameter on the change or create data group definition commands.

Although you can specify this value at the data group level, you can override the data group value at the object level by specifying an Object retrieval delay value on the commands for creating or changing data group entries.

For more information see “Selecting an object retrieval delay” in the MIMIX Administrator Reference book.

You should use care when choosing the object retrieval delay. A long delay may impact the ability of MIMIX system journal replication processes to move data from a system in a timely manner. Too short a delay may allow MIMIX to retrieve an object before an application is finished with it. You should make the value large enough to reduce or eliminate contention between MIMIX and applications, but small enough to allow MIMIX to maintain a suitable high availability environment.
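
As a purely hypothetical illustration, an override at the object level might look like the following. The command name CHGDGOBJE and the keywords DGDFN, LIB1, OBJ1, and OBJRTVDLY are assumptions based on common MIMIX naming conventions and are not confirmed here; prompt the create or change data group object entry command in your installation to find the actual parameter for the Object retrieval delay element.

    CHGDGOBJE DGDFN(MYDG SYSA SYSB) LIB1(APPLIB) OBJ1(*ALL) +
              OBJRTVDLY(5)         /* assumed keyword; delay applied before object retrieval */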

Data groups cannot be ended

A controlled end for a data group may take some time if there is a backlog of files to process or if there are a number of errors that MIMIX is attempting to resolve before ending.

If you think that a data group is taking too long to end, check the following for possible causes:

• Check to see how many transactions are backlogged for the apply process. Use option 8 (Display status) on the Work with Data Groups display to access the detailed status. A number in the Unprocessed Entry Count column indicates a backlog. Use F7 and F8 to see additional information.

• Determine which replication process is not ending. Use the command WRKSBSJOB SBS(MIMIXSBS) to see the jobs in the MIMIXSBS subsystem. Look for jobs for replication processes that have not changed to a status of END. For example, abc_OBJRTV, where abc is a 3-character prefix.

• Check the QSYSOPR message log to see if there is a message that requires a reply.

• You can use the WRKDGACTE STATUS(*ACTIVE) command to ensure all data group activity entries are completed. If a controlled end was issued, all activity entries must be processed before the object processes are ended.

Verifying a communications link for system definitions

Do the following to verify that the communications link defined for each system definition is operational:

1. From the MIMIX Basic Main Menu, type an 11 (Configuration menu) and press Enter.

2. From the MIMIX Configuration Menu, type a 1 (Work with system definitions) and press Enter.

3. From the Work with System Definitions display, type an 11 (Verify communications link) next to the system definition you want and press Enter. You should see a message indicating the link has been verified.

Note: If the system manager is not active, this process only verifies that communications to the remote system are successful. You will also see a message in the job log indicating that “communications link failed after 1 request.” This indicates that the remote system could not return communications to the local system.

4. Repeat this procedure for all system definitions. If the communications link defined for a system definition uses SNA protocol, do not check the link from the local system.

Note: If your transfer definition uses the *TCP communications protocol, then MIMIX uses the Verify Communications Link command to validate the information that has been specified for the Relational database (RDB) parameter. MIMIX also uses VFYCMNLNK to verify that the System 1 and System 2 relational database names exist and are available on each system.

Verifying the communications link for a data group

Before you synchronize data between systems, ensure that the communications link for the data group is active. This procedure verifies the primary transfer definition used by the data group. If your configuration requires multiple data groups, be sure to check communications for each data group definition.

Do the following:

1. From the MIMIX Basic Main Menu, type an 11 (Configuration menu) and press Enter.

2. From the MIMIX Configuration Menu, type a 4 (Work with data group definitions) and press Enter.

3. From the Work with Data Group Definitions display, type an 11 (Verify communications link) next to the data group you want and press F4.

4. The Verify Communications Link display appears. Ensure that the values shown for the prompts are what you want.

5. To start the check, press Enter.

6. You should see a message "VFYCMNLNK command completed successfully."

If your data group definition specifies a secondary transfer definition, use the following procedure to check all communications links.

Verifying all communications links

The Verify Communications Link (VFYCMNLNK) command requires specific system names to verify communications between systems. When the command is called from option 11 on the Work with System Definitions display or option 11 on the Work with Data Groups display, MIMIX identifies the specific system names.

For transfer definitions using TCP protocol: MIMIX uses the Verify Communications Link (VFYCMNLNK) command to validate the values specified for the Relational database (RDB) parameter. MIMIX also uses VFYCMNLNK to verify that the System 1 and System 2 relational database names exist and are available on each system.

When the command is called from option 11 on the Work with Transfer Definitions display or when entered from a command line, you will receive an error message if the transfer definition specifies the value *ANY for either system 1 or system 2.

1. From the Work with Transfer Definitions display, type an 11 (Verify communications link) next to all transfer definitions and press Enter.

2. The Verify Communications Link display appears. If you are checking a Transfer definition with the value of *ALL, you need to specify a value for the System 1 or System 2 prompt. Ensure that the values shown for the prompts are what you want and then press Enter.

You will see the Verify Communications Link display for each transfer definition you selected.

3. You should see a message "VFYCMNLNK command completed successfully."
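
When VFYCMNLNK is entered directly on a command line, both systems must be named explicitly. The following sketch is illustrative only; the TFRDFN, SYS1, and SYS2 keywords are assumptions, so prompt the command with F4 to see the parameters your level of MIMIX actually uses.

    VFYCMNLNK TFRDFN(PRIMARY) SYS1(PRODSYS) SYS2(BACKUP)   /* assumed keywords and names */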

Checking file entry configuration manually

The Check DG File Entries (CHKDGFE) command provides a means to detect whether the correct data group file entries exist with respect to the data group object entries configured for a specified data group in your MIMIX configuration. When file entries and object entries are not properly matched, your replication results can be affected.

Note: The preferred method of checking is to use MIMIX AutoGuard to automatically schedule the #DGFE audit, which calls the CHKDGFE command and can automatically correct detected problems. For additional information, see “Interpreting results for configuration data - #DGFE audit” on page 300.

To check your file entry configuration manually, do the following:

1. On a command line, type CHKDGFE and press Enter. The Check Data Group File Entries (CHKDGFE) command appears.

2. At the Data group definition prompts, select *ALL to check all data groups or specify the three-part name of the data group.

3. At the Options prompt, you can specify that the command be run with special options. The default, *NONE, uses no special options. If you do not want an error to be reported if a file specified in a data group file entry does not exist, specify *NOFILECHK.

4. At the Output prompt, specify where the output from the command should be sent—to print, to an outfile, or to both. See Step 6.

5. At the User data prompt, you can assign your own 10-character name to the spooled file or choose not to assign a name to the spooled file. The default, *CMD, uses the CHKDGFE command name to identify the spooled file.

6. At the File to receive output prompts, you can direct the output of the command to the name and library of a specific database file. If the database file does not exist, it will be created in the specified library with the name MXCDGFE.

7. At the Output member options prompts, you can direct the output of the command to the name of a specific database file member. You can also specify how to handle new records if the member already exists. Do the following:

a. At the Member to receive output prompt, accept the default *FIRST to direct the output to the first member in the file. If it does not exist, a new member is created with the name of the file specified in Step 6. Otherwise, specify a member name.

b. At the Replace or add records prompt, accept the default *REPLACE if you want to clear the existing records in the file member before adding new records. To add new records to the end of existing records in the file member, specify *ADD.

8. At the Submit to batch prompt, do one of the following:

• If you do not want to submit the job for batch processing, specify *NO and press Enter to check data group file entries.


• To submit the job for batch processing, accept *YES. Press Enter and continue with the next step.

9. At the Job description prompts, specify the name and library of the job description used to submit the batch request. Accept MXAUDIT to submit the request using the default job description, MXAUDIT.

10. At the Job name prompt, accept *CMD to use the command name to identify the job or specify a simple name.

11. To start the data group file entry check, press Enter.
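
For example, a batch check of all data groups that prints its report and ignores missing files might be entered as shown below. The DGDFN, OPTION, OUTPUT, BATCH, and JOBD keywords correspond to the prompts described above but are assumptions here; prompt CHKDGFE with F4 to confirm them on your system.

    CHKDGFE DGDFN(*ALL) OPTION(*NOFILECHK) OUTPUT(*PRINT) +
            BATCH(*YES) JOBD(MXAUDIT)    /* do not report an error when a file does not exist */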


Data groups cannot be started

Common reasons why a data group cannot be started include the following:

• The communications link between systems defined to the data group is not active. Use the procedure “Verifying a communications link for system definitions” on page 281.

• The journaling environment for the data group has not been built. Verify that the journaling environment defined in the journal definition exists. If necessary, use the appropriate procedure in the MIMIX Administrator Reference book.

• The journal receiver has been deleted from the system. You can use WRKJRNA to determine if the journal receiver exists on the source system.

Cannot start or end an RJ link

In normal operations, unconfirmed entries are automatically handled by the RJ link monitors. In the event of a switch, the unconfirmed entries are processed, ensuring that you have the latest updates to your data.

However, there is a scenario where you may end up with a backlog of unconfirmed entries that can prevent you from starting or ending an RJ link. This problem can occur when all of the following are true:

• The data group is not switchable or you do not want to switch it

• A link failure on an RJ link that is configured for synchronous delivery leaves unconfirmed entries

• The RJ link monitors are not active, either because you are not using them or they failed as a result of a bad link

To recover from this situation, you should run the Verify Communications Link (VFYCMNLNK) command to assist you in determining what may be wrong and why the RJ link will not start.

If you are using an independent ASP, check the transfer definition to ensure the correct database name has been specified.

You also need to end the remote journal link from the target system. Ending the link from the target system is a restriction of the IBM remote journal function.

Removing unconfirmed entries to free an RJ link

Note: You should never remove unconfirmed entries from a switchable data group unless directed to do so by your MIMIX administrator or a CustomerCare representative.

If you need to remove a backlog of unconfirmed entries, do the following:

1. Use the WRKRJLNK command to display the status of the RJ link. The status shown on the Work with RJ Links display is the status of the link on the system where you entered the command. (This system is identified at the upper right corner of the display.) An RJ link with unconfirmed entries will have a state of *INACTPEND.

Note: You may need to access this display from the other system defined by the RJ link.

2. Ending the remote journal link on the system with unconfirmed entries will cause them to be deleted. Do the following:

a. Type 10 (End) next to the link and press F4 (Prompt).

b. The End Remote Journal Link (ENDRJLNK) display appears. Default values on this command end the link from the source system. If there are unconfirmed entries on the target system, press F10 (Additional parameters). Then specify *TGT at the End RJ link on system prompt.

c. To process the request, press Enter.

RJ link active but data not transferring

Following an initial program load (IPL), the RJ link may appear to be active when data cannot actually flow from the source system to the target system journal receiver. This is an operating system restriction. MIMIX does not receive notification of a failure.

To recover, end the RJ link and restart it following an IPL. This can be included in automation programs.

Errors using target journal defined by RJ link

If you receive errors when using a target journal defined by an RJ link, you may need to change the journal definition and journaling environment. This situation is caused when the target journal definition is created as a result of adding an RJ link based on a source journal definition which specified QSYSOPR as the threshold message queue.

If you receive errors when using the target journal, do the following:

1. On the Work with Journal Definitions display, locate the target journal definition that is identified by the errors.

2. Type a 5 (Display) next to the target journal definition and press Enter.

3. Page down to see the value of the Threshold message queue.

• If the value is QSYSOPR, press F12 and continue with the next step.

• For any other value, the cause of the problem needs further isolation beyond this procedure.

4. Type a 2 (Change) next to the target journal definition and press Enter.

5. Press F9 (All parameters), then page down to locate the Threshold message queue and Library prompts.

6. Change the Threshold message queue prompt to *JRNDFN and the Library prompt to *JRNLIB, or to other acceptable values.

7. To accept the change, press Enter.

Verifying data group file entries

The Verify Data Group File Entries (VFYDGFE) command allows you to verify files in a specific library by checking the current state of each file on the system identified in the data group as the source of data.

This procedure generates a report in a spooled file named MXVFYDGFE. The information in the report includes whether each member for the specified search criteria is defined to MIMIX, the journal and library to which it is journaled, whether it uses after-image journaling or before- and after-image journaling, and the apply session used. This information can help you verify that you have all the files you need from a library properly defined to MIMIX DB Replicator.

To verify data group file entries, do the following:

1. On a command line, type VFYDGFE (Verify Data Group File Entries). The Verify DG File Entries display appears.

2. Specify the name of the data group at the Data group definition prompt.

3. At the System 1 file and Library prompts, specify the value you want and the library in which the files are located.

4. If you want to create a spooled file that can be printed, specify *PRINT at the Output prompt. Then press Enter.
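
For example, to verify every file in one library for a data group and print the report, a command similar to the following could be used. The DGDFN, FILE1, LIB1, and OUTPUT keywords are illustrative assumptions; prompt VFYDGFE with F4 to confirm the actual parameter names.

    VFYDGFE DGDFN(MYDG SYSA SYSB) FILE1(*ALL) LIB1(APPLIB) OUTPUT(*PRINT)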

Verifying data group data area entries

The Verify Data Group Data Area Entries (VFYDGDAE) command allows you to verify the data areas in a specific library defined to a data group definition. The audit report determines the data source for the data group and retrieves the appropriate information.

This procedure generates a report in a spooled file named MXVFYDAE. The information in the report includes whether each data area for the specified search criteria is defined to MIMIX and the length of each data area. This information can help you verify that you have all the data areas you need from a library defined to MIMIX DB Replicator.

To verify data group data area entries, do the following:

1. On a command line, type VFYDGDAE (Verify Data Group Data Area Entries). The Verify DG Data Area Entries (VFYDGDAE) display appears.

2. Specify the name of the data group at the Data group definition prompt.

3. At the System 1 data area and Library prompts, specify the value you want and the library in which the data areas are located and press Enter.

Verifying key attributes

Before you configure for keyed replication, verify that the file or files for which you want to use keyed replication are actually eligible.


Do the following to verify that the attributes of a file are appropriate for keyed replication:

1. On a command line, type VFYKEYATR (Verify Key Attributes). The Verify Key Attributes display appears.

2. Do one of the following:

• To verify a file in a library, specify a file name and a library.

• To verify all files in a library, specify *ALL and a library.

• To verify files associated with the file entries for a data group, specify *MIMIXDFN for the File prompt and press Enter. Prompts for the Data group definition appear. Specify the name of the data group that you want to check.

3. Press Enter.

4. A spooled file is created that indicates whether you can use keyed replication for the files in the library or data group you specified. Display the spooled file (WRKSPLF command) or use your standard process for printing. You can use keyed replication for the file if *BOTH appears in the Replication Type Allowed column. If a value appears in the Replication Type Defined column, the file is already defined to the data group with the replication type shown.
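
For example, either of the following invocations could be used to produce the spooled file; the FILE, LIB, and DGDFN keywords are assumptions, so verify them by prompting VFYKEYATR.

    VFYKEYATR FILE(*ALL) LIB(APPLIB)                 /* check every file in a library          */
    VFYKEYATR FILE(*MIMIXDFN) DGDFN(MYDG SYSA SYSB)  /* check files defined to the data group  */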

Working with data group timestamps

Timestamps allow you to view the performance of the database send, receive, and apply processes for a data group to identify potential problem areas, such as a slow send process, inadequate communications capacity, or excessive overhead on the target system. Although they can assist you in identifying problem areas, timestamps are not intended as an accurate means of calculating the performance of MIMIX.

A timestamp is a single record that is passed between all replication processes. The timestamp originates on the source system as a journal entry, is sent to the target system, and is then processed by the associated apply session. The timestamp record is updated with the date and time at each of the following points during the replication process:

• Created - Date and time the journal entry is created

• Sent - Date and time when the journal entry is sent to the target system

• Received - Date and time when the journal entry is received

• Applied - Date and time when the journal entry is applied

Note: For data groups that use remote journaling, the created and sent timestamps will be set to the same value. The received timestamp will be set to the time when the record was read on the target system by the database reader process.

After all four timestamps have been added, the journal entry is converted and placed into a file for viewing or printing. You can view timestamps only from the management system. The system manager must be active to return the timestamps to the management system.

Automatically creating timestamps

The data group definition includes a parameter for automatically creating timestamps. MIMIX automatically creates a timestamp each time the number of journal entries specified for the Timestamp interval (TSPITV) parameter has been processed. The timestamp entry created is placed at the end of all current entries in the journal receiver. You specify this value when you create or change a data group definition. You can change this value at any time.

Note: Data groups configured for remote journaling will not automatically generate timestamps. To generate timestamps in this case, refer to “Creating timestamps for remote journaling processing” on page 292.
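
For example, assuming a data group named MYDG defined between systems SYSA and SYSB, the interval could be set with the change data group definition command. CHGDGDFN is the conventional command name but is shown here as an assumption; verify it and the TSPITV keyword by prompting in your installation.

    CHGDGDFN DGDFN(MYDG SYSA SYSB) TSPITV(20000)   /* create a timestamp every 20,000 journal entries */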

Creating additional timestamps

Note: By using the Create Data Group Timestamps (CRTDGTSP) command in a batch job, you can use timestamps to monitor performance at critical times in your daily processing.

To create one or more timestamps, do the following:

1. From the Work with Data Groups display, type 41 (Timestamps) next to the data group you want and press Enter.


2. The Work with DG Timestamps display appears. Type a 1 (Create) next to the blank line at the top of the display and press Enter.

3. The Create Data Group Timestamps display appears. Specify the name of the data group and the number of timestamps you want to create and press Enter.

Note: You should generate multiple timestamps to receive a more accurate view of replication process performance.

Creating timestamps for remote journaling processing

If you need to generate timestamps to monitor replication performance, you can set up automation to create them for remote journaling (RJ) data groups that you wish to monitor.

In this procedure, you will create an interval monitor using the Create Monitor Object (CRTMONOBJ) command. This is accomplished by specifying *CMD for the interface exit program on the monitor object, and then specifying Create Data Group Timestamps (CRTDGTSP) as the command (*CMD) to run. You can also run CRTDGTSP manually or schedule a job to run the command in batch. For more information, see “Creating an interval monitor” in the MIMIX Monitor book.

Do the following to create an interval monitor:

1. From the Work with Monitors display, type a 1 (Create) in the Opt column next to the blank line at the top of the list and press Enter.

2. The Create Monitor Object (CRTMONOBJ) display appears. Do the following:

a. At the Monitor prompt, provide a unique name for the monitor.

b. At the Event class prompt, specify *INTERVAL.

c. At the Interface exit program prompt, specify *CMD.

d. At the Time interval (sec.) prompt, specify how often the interval monitor should run and press Enter. By default, this monitor runs every 15 seconds. Use your data group time stamp interval (default is every 20,000 entries) to estimate how many entries you process a day. From there, determine how often you need to run the monitor in order to provide an adequate sample.

3. The Add Monitor Information (ADDMONINF) display appears. Do the following:

a. At the Command prompt, type CRTDGTSP.

b. At the Library prompt, type the name of your installation library and press F4 (Prompt).

4. The Create Data Group Timestamps (CRTDGTSP) display appears. Do the following:

a. At the Data group definition prompts, specify the name of the RJ data group.

b. At the Number of stamps to create prompt, specify the number of timestamps you want to create and press Enter.

5. From the Work with Monitors display, type a 9 (Start) next to the interval monitor you created. This allows you to start generating timestamps. For information about viewing timestamps, see “Displaying or printing timestamps” on page 293.


Repeat this procedure for each RJ data group for which you want to generate timestamps.
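
If you prefer to run CRTDGTSP outside of a monitor, a scheduled or ad hoc batch job can submit it directly. In the sketch below, MIMIX is used as the installation library and NBRTSP as the keyword for the number of timestamps; both are assumptions, so substitute your installation library and confirm the parameter name by prompting CRTDGTSP.

    SBMJOB CMD(MIMIX/CRTDGTSP DGDFN(MYDG SYSA SYSB) NBRTSP(5)) +
           JOB(DGTSP) JOBD(MIMIXQGPL/MIMIXDFT)    /* submit the timestamp request to batch */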

Deleting timestamps

You can delete all timestamps or you can select a group of one or more timestamps to delete.

To delete timestamps for a data group, do the following:

1. From the Work with Data Groups display, type 41 (Timestamps) next to the data group you want and press Enter.

2. The Work with DG Timestamps display appears. Type a 4 (Delete) next to the timestamps you want to delete and press Enter.

3. A confirmation screen appears. Press Enter.

To selectively delete a range of timestamps, do the following:

1. Type the command DLTDGTSP and press F4 (Prompt).

2. The Delete Data Group Timestamps display appears. Specify values you want for the Data group definition prompt.

3. Specify the values you want for the Starting date and time prompt and for the Ending date and time prompt, then press Enter.

Displaying or printing timestamps

To display or print data group timestamps, do the following:

1. From the Work with Data Groups display, type 41 (Timestamps) next to the data group you want and press Enter.

2. The Work with DG Timestamps display appears. Do one of the following:

• To display the timestamp information, type a 5 (Display) next to the timestamp you want.

• To print the timestamp information, type a 6 (Print) next to the timestamp you want.

3. Press Enter.

4. If you selected to display, the Display Data Group Timestamps display appears. If you selected to print, a spooled file is created that you can print using your standard printing procedures.

Removing journaled changes

If the necessary environment is available, MIMIX can support the Remove Journaled Changes (RMVJRNCHG) command by simulating the remove journaled changes process on the backup system.

Note: This is a long running procedure and will affect your existing journal changes. Ensure that performing this procedure is appropriate for your environment.

In order to use the Remove Journaled Changes command, you must meet the following criteria:

• You must be configured for both before and after image journaling. This can be defined as a default file entry option at the data group level or it can be defined for individual data group file entries.

• You must be configured with *SEND as the value of the Before images element of the DB journal entry processing (DBJRNPRC) parameter of the data group definition. This permits the database apply process to roll back certain types of journal entries.

• If you have large objects (LOBs), *YES must be the value for the Use remote journal link (RJLNK) parameter of the data group definition.

• The target system (where replicated changes are applied) must have the log spaces that contain the original transactions. To ensure that the appropriate log spaces are retained, you can do one of the following:

– Calculate how many log spaces need to be retained using the log space size and the size and number of the receivers containing the appropriate journal transactions. Then, set elements of the database apply processing (DBAPYPRC) parameter in the data group definition.

– Use the Hold Data Group Log (HLDDGLOG) command to place a hold on the delete operation of all log spaces for all apply sessions defined to the specified data group. The log spaces are held until a request to release them with Release Data Group Log (RLSDGLOG) command is received.

If you are changing an existing data group to have these values, you must end and restart the data group before you are able to use the RMVJRNCHG command.
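
For example, the hold could be placed while you prepare to use RMVJRNCHG and released afterward, as sketched below. The DGDFN keyword is an assumption; prompt HLDDGLOG and RLSDGLOG to confirm their parameters.

    HLDDGLOG DGDFN(MYDG SYSA SYSB)   /* hold deletion of log spaces for all apply sessions  */
    RLSDGLOG DGDFN(MYDG SYSA SYSB)   /* release the hold when the log spaces are not needed */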

Performing journal analysis

When a source system fails before MIMIX has sent all journal entries to the target system, unprocessed transactions occur. Unprocessed transactions can also occur if journal entries are in the communications buffer being sent to the target system when the sending system fails.

Following an unplanned switch, unprocessed transactions on the original source system must be addressed in order to prevent data loss before synchronizing data and starting data groups.

The journal analysis process finds any missing transactions that were not sent to the target system when the source system went down and an unplanned switch to the backup was performed. Once unprocessed transactions are located, users must analyze the journal entries and take appropriate actions to resolve them.

Perform journal analysis when the original source system has been brought back up and before performing the synchronization phase of the switch (which synchronizes data and starts data groups). Analyze all data groups that were not disabled at the time of the unplanned switch.

Note: The journal analysis tool is limited to database files replicated from a user journal. The tool does not identify unprocessed transactions for data areas, data queues, or IFS objects replicated through a user journal, or database files configured for replication from the system journal.

From the original source system, do the following:

1. Ensure the following are started:

a. The port communications jobs (PORTxxxxx)

b. The MIMIX system managers using STRMMXMGR SYSDFN(*ALL) MGR(*SYS) TGTJRNINSP(*NO)

IMPORTANT! Only the system managers should be started at this time. Do not start journal managers. Also, do not start data groups at this time! Doing so will delete the data required to perform the journal analysis.

2. From the Work with Data Groups display on the original source system, enter 43 (Journal analysis) next to the data group to be analyzed.

The Journal Analysis of Files display appears.

3. Check for the following:

• If a pop-up window with the message “Journal analysis information not collected” is displayed in the list area, press Enter to collect journal analysis information, then go to Step 6.

• If there is no pop-up window and message LVI379A is displayed at the bottom of the display, journal analysis determined that all journal entries have been applied. There are no unprocessed entries to display. The sequence number of the last applied journal entry is displayed in the Last applied field. No further action is needed.

• If there is no pop-up window and no message at the bottom of the display, information about files from a previous run of journal analysis exists. Go to Step 4.

4. If you did not see a pop-up window in Step 3, and information about a previous run exists, clear data from the previous run of journal analysis and collect new information by doing the following:

a. If you want to keep information from a previous run, make a copy of file DM6500P located in the installation library.

b. Press F9 (Update display) to clear the screen and collect the new information.

A pop-up confirmation window with the following message is displayed: “WARNING! The journal analysis journal entries file will be cleared!”

c. Press Enter to submit the update request.

5. Press Enter to submit the update request.

6. The request to collect journal analysis information is submitted by job RTVFILANZ using the job description MIMIXQGPL/MIMIXDFT. When the job completes, “LVI3855 Retrieval of affected files for journal analysis completed normally” appears in the message log. Press F5 (Refresh) to see the collected information. It may take a short time to collect the information.

7. Retrieve journal entries. The journal entries for the files identified on the display must be retrieved before you can use options to display or print statistics (5 and 6) or display journal entries (11). Do one of the following:

• Press F14 (Retrieve all entries). A pop-up window stating “Confirm retrieval of ALL analysis journal entries” appears. Press Enter. (The retrieved information is placed in an internal file. This does not produce a spooled file.)

• If there are a large number of files listed on the display, you may want to retrieve entries for only a selected file at a time. Type option 9 (Retrieve journal entries) next to the file for which you want to retrieve journal entries and press Enter. The retrieved journal entries are placed in a spooled file named MXJEANZL.

The message “LVI3856 Retrieval of journal entries for journal analysis completed normally” appears in the message log.

8. Review the collected information using the following:

• Use option 11 (Display journal entries) to view the entries for each file.

• Use F21 (Print list) to print all entries for a file.

• You can use options 5 (Display statistics) and 6 (Print statistics) to see the statistical breakdown of journal entries for a selected file member identified by journal analysis. The statistics include the number of adds, deletes, and updates, along with the related file transactions and dates of the first and last journal entries.

Figure 36 shows an example of the information displayed by option 11 for one journal entry.


Figure 36. Sample of one journal entry

Data group definition:  <DGDFN> <SYS1> <SYS2>     Journal definition:  <JRNDFN> <SYSDFN>
File identification
  File . . . . . . . . : <FILE>
  Library  . . . . . . : <LIB>
  Member . . . . . . . : <MBR>
Journal header information
  Journal code . . . . : R
Record-level information
  Journal type . . . . : DL  Delete record
  Generated date . . . : 9/08/09
  Generated time . . . : 10:36:31
  Job name . . . . . . : <JOB NAME>
  User name  . . . . . : <USER>
  Job number . . . . . : <JOB NBR>
  Program name . . . . : <PROGRAM>
Journal header information (continued)
  Record length  . . . : 607
  Record number  . . . : 838
  Operation indicator  : 0
  Commit cycle ID  . . : 0
Journal identification
  Journal name . . . . : <JOURNAL>
  Library  . . . . . . : <JRNLIB>
Receiver identification
  Receiver name  . . . : <RCVR>
  Library  . . . . . . : <RCVRLIB>
  Sequence number  . . : <JOURNAL SEQUENCE>

9. Determine what action you need to take for each unprocessed entry. For example:

• You may need to run the original job again on the current source system to reproduce the entries.

• If a file has already been updated on the current source system (manually or otherwise), you may need to merge data from both files. If this is the case, do not synchronize the files.

• If there are write changes (R-PT entries), these changes should be made on the current source system before running the synchronization phase of the switch or starting data groups in order to maintain Relative Record Number consistency within the file. If this is done after the data group has been started, the relative record numbers could become unsynchronized between the two systems.

Note: It is the customer’s responsibility to fix the files.

Removing journal analysis entries for a selected file

You can use option 4 (Remove journal entries) to remove all journal analysis journal entries for a selected file member. A confirmation display appears to confirm your choices. When you continue with the confirmation, the journal entries for the selected file member are immediately removed from the journal analysis information that is displayed. It does not delete any other information contained in other MIMIX files.


APPENDIX A Interpreting audit results - supporting information

Audits use commands that compare and synchronize data. The results of the audits are placed in output files associated with the commands. The following topics provide supporting information for interpreting data returned in the output files.

• “When the difference is “not found”” on page 302 provides additional considerations for interpreting a result of not found in priority audits.

• “Interpreting results for configuration data - #DGFE audit” on page 300 describes the #DGFE audit which verifies the configuration data defined to your configuration using the Check Data Group File Entries (CHKDGFE) command.

• “Interpreting results of audits for record counts and file data” on page 303 describes the audits and commands that compare file data or record counts.

• “Interpreting results of audits that compare attributes” on page 306 describes the Compare Attributes commands and their results.

Interpreting results for configuration data - #DGFE audit

The #DGFE audit verifies the configuration data that is defined for replication in your configuration. This audit invokes the Check Data Group File Entries (CHKDGFE) command for the audit’s comparison phase. The CHKDGFE command collects data on the source system and generates a report in a spooled file or an outfile.

The report is available on the system where the command ran. The values in the Result column of the report indicate detected problems and the result of any attempted automatic recovery actions. Table 50 shows the possible Result values and describes the action to take to resolve any reported problems.

The Option column of the report provides supplemental information about the comparison. Possible values are:

*NONE - No options were specified on the comparison request.

*NOFILECHK - The comparison request included an option that prevented an error from being reported when a file specified in a data group file entry does not exist.

*DGFESYNC - The data group file entry was not synchronized between the source and target systems. This may have been resolved by automatic recovery actions for the audit.

Table 50. CHKDGFE - possible results and actions for resolving errors

Result Recovery Actions

*NODGFE No file entry exists.

Create the DGFE or change the DGOBJE to COOPDB(*NO)

Note: Changing the object entry affects all objects using the object entry. If you do not want all objects changed to this value, copy the existing DGOBJE to a new, specific DGOBJE with the appropriate COOPDB value.

*EXTRADGFE An extra file entry exists.

Delete the DGFE or change the DGOBJE to COOPDB(*YES)

Note: Changing the object entry affects all objects using the object entry. If you do not want all objects changed to this value, copy the existing DGOBJE to a new, specific DGOBJE with the appropriate COOPDB value.

*NOFILE No file exists for the existing file entry.

Delete the DGFE, re-create the missing file, or restore the missing file.

*NOMBR No file member exists for the existing file entry.

Delete the DGFE for the member or add the member to the file.

*RCYFAILED Automatic audit recovery actions were attempted but failed to correct the detected error.

Run the audit again.

*RECOVERED Recovered by automatic recovery actions.

No action is needed.

*UA File entries are in transition and cannot be compared.

Run the audit again.


One possible reason why actual configuration data in your environment may not match what is defined to your configuration is that a file was deleted but the associated data group file entries were left intact. Another reason is that a data group file entry was specified with a member name, but a member is no longer defined to that file. If you use the automatic scheduling and automatic audit recovery functions of MIMIX AutoGuard, these configuration problems can be automatically detected and recovered for you. Table 51 provides examples of when various configuration errors might occur.

Table 51. CHKDGFE - possible error conditions

Result        File exists   Member exists   DGFE exists   DGOBJE exists
*NODGFE       Yes           Yes             No            COOPDB(*YES)
*EXTRADGFE    Yes           Yes             Yes           COOPDB(*NO)
*NOFILE       No            No              Yes           Exclude
*NOMBR        Yes           No              Yes           No entry

When the difference is “not found”

For audits that compare replicated data, a difference indicating the object was not found requires additional explanation. This difference can be returned for these audits:

• For the #FILDTA and #MBRRCDCNT audits, a value of *NF1 or *NF2 for the difference indicator (DIFIND) indicates the object was not found on one of the systems in the data group. The 1 and 2 in these values refer to the system as identified in the three-part name of the data group.

• For the #FILATR, #FILATRMBR, #IFSATR, #OBJATR, and #DLOATR audits, a not found condition is indicated by a value of *NOTFOUND in either the system 1 indicator (SYS1IND) or system 2 indicator (SYS2IND) fields. Typically, the DIFIND field result is *NE.

Audits can report not found conditions for objects that have been deleted from the source system. A not found condition is reported when a delete transaction is in progress for an object eligible for selection when the audit runs. This is more likely to occur when there are replication errors or backlogs, and when policy settings do not prevent audits from comparing when a data group is inactive or in a threshold condition.

A scheduled audit will not identify a not found condition for an object that does not exist on either system because it selects existing objects based on whether they are configured for replication by the data group. This is true regardless of whether the audit is automatically submitted or run immediately.

Because a priority audit selects already replicated objects, it will not audit objects for which a create transaction is in progress.

Prioritized audits will not identify a not found condition when the object is not found on the target system because prioritized auditing selects objects based on the replicated objects database. Only objects that have been replicated to the target system are identified in the database.

Priority audits can be more likely to report not found conditions when replication errors or backlogs exist.


Interpreting results of audits for record counts and file data

The audits and commands that compare file data or record counts are as follows:

• #FILDTA audit or Compare File Data (CMPFILDTA) command

• #MBRRCDCNT audit or Compare Record Count (CMPRCDCNT) command

Each record in the output files for these audits or commands identifies a file member that has been compared and indicates whether a difference was detected for that member.

What differences were detected by #FILDTA

The Difference Indicator (DIFIND) field identifies the result of the comparison. Table 52 identifies values for the Compare File Data command that can appear in this field.

Table 52. Possible values for Compare File Data (CMPFILDTA) output file field Difference Indicator (DIFIND)

Values Description

*APY The database apply (DBAPY) job encountered a problem processing a U-MX journal entry for this member.

*CMT Commit cycle activity on the source system prevents active processing from comparing records or record counts in the selected member.

*CO Unable to process selected member. Cannot open file.

*CO (LOB) Unable to process selected member containing a large object (LOB). The file or the MIMIX-created SQL view cannot be opened.

*DT Unable to process selected member. The file uses an unsupported data type.

*EQ Data matches. No differences were detected within the data compared. Global difference indicator.

*EQ (DATE) Member excluded from comparison because it was not changed or restored after the timestamp specified for the CHGDATE parameter.

*EQ (OMIT) No difference was detected. However, fields with unsupported types were omitted.

*FF The file feature is not supported for comparison. Examples of file features include materialized query tables.

*FMC Matching entry not found in database apply table.

*FMT Unable to process selected member. File formats differ between source and target files. Either the record length or the null capability is different.

*HLD Indicates that a member is held or an inactive state was detected.

*IOERR Unable to complete processing on selected member. Messages preceding LVE0101 may be helpful.

*NE Indicates a difference was detected.

*NF1 Member not found on system 1.

*NF2 Member not found on system 2.

*REP The file member is being processed for repair by another job running the Compare File Data (CMPFILDTA) command.

*SJ The source file is not journaled, or is journaled to the wrong journal.

*SP Unable to process selected member. See messages preceding message LVE3D42 in job log.

*SYNC The file or member is being processed by the Synchronize DG File Entry (SYNCDGFE) command.

*UE Unable to process selected member. Reason unknown. Messages preceding message LVE3D42 in job log may be helpful.

*UN Indicates that the member’s synchronization status is unknown.

See “When the difference is “not found”” on page 302 for additional information.

What differences were detected by #MBRRCDCNT

Table 53 identifies values for the Compare Record Count command that can appear in the Difference Indicator (DIFIND) field.

Table 53. Possible values for Compare Record Count (CMPRCDCNT) output file field Difference Indicator (DIFIND)

Values Description

*APY The database apply (DBAPY) job encountered a problem processing a U-MX journal entry for this member.

*CMT Commit cycle activity on the source system prevents active processing from comparing records or record counts in the selected member.

*EC The attribute compared is equal based on the configuration.

*EQ Record counts match. No difference was detected within the record counts compared. Global difference indicator.

*FF The file feature is not supported for comparison. Examples of file features include materialized query tables.

*FMC Matching entry not found in database apply table.

*HLD Indicates that a member is held or an inactive state was detected.

*LCK Lock prevented access to member.

*NE Indicates a difference was detected.

*NF1 Member not found on system 1.

*NF2 Member not found on system 2.

*SJ The source file is not journaled, or is journaled to the wrong journal.

*UE Unable to process selected member. Reason unknown. Messages preceding LVE3D42 in job log may be helpful.

*UN Indicates that the member’s synchronization status is unknown.

See “When the difference is “not found”” on page 302 for additional information.

Interpreting results of audits that compare attributes

Each audit that compares attributes does so by calling a Compare Attributes command and places the results in an output file. Each row in an output file for a Compare Attributes command can contain either a summary record format or a detailed record format. Each summary row identifies a compared object and includes a prioritized object-level summary of whether differences were detected. Each detail row identifies a specific attribute compared for an object and the comparison results.

For example, an authorization list can contain a variable number of entries. When comparing authorization lists, the CMPOBJA command will first determine if both lists have the same number of entries. If the same number of entries exist, it will then determine whether both lists contain the same entries. If differences in the number of entries are found or if the entries within the authorization list are not equal, the report will indicate that differences are detected. The report will not provide the list of entries—it will only indicate that they are not equal in terms of count or content.

You can see the full set of fields in the output file by viewing it from a 5250 emulator.

What attribute differences were detected

The Difference Indicator (DIFIND) field identifies the result of the comparison. Table 54 identifies values that can appear in this field. Not all values may be valid for every Compare command.

When the output file is viewed from a 5250 emulator, the summary row is the first record for each compared object and is indicated by an asterisk (*) in the Compared Attribute (CMPATR) field. The summary row’s Difference Indicator value is the prioritized summary of the status of all attributes checked for the object. When included, detail rows appear below the summary row for the object compared and show the actual result for the attributes compared.

The Priority column in Table 54 indicates the order of precedence MIMIX uses when determining the prioritized summary value for the compared object.

Note: The Compare Attribute commands are: Compare File Attributes (CMPFILA), Compare Object Attributes (CMPOBJA), Compare IFS Attributes (CMPIFSA), and Compare DLO Attributes (CMPDLOA).

Table 54. Possible values for output file field Difference Indicator (DIFIND)

*EC The values are based on the MIMIX configuration settings. The actual values may or may not be equal. (Summary record priority: 5)

*EQ Record counts match. No differences were detected. Global difference indicator. (Summary record priority: 5)

*NA The values are not compared. The actual values may or may not be equal. (Summary record priority: 5)

*NC The values are not equal based on the MIMIX configuration settings. The actual values may or may not be equal. (Summary record priority: 3)

*NE Indicates differences were detected. (Summary record priority: 2)

*NS Indicates that the attribute is not supported on one of the systems. Will not cause a global not equal condition. (Summary record priority: 5)

*RCYSBM Indicates that MIMIX AutoGuard submitted an automatic audit recovery action that must be processed through the user journal replication processes. The database apply (DBAPY) will attempt the recovery and send an *ERROR or *INFO notification to indicate the outcome of the recovery attempt.

*RCYFAILED Used to indicate that automatic recovery attempts via MIMIX AutoGuard failed to recover the detected difference.

*RECOVERED Indicates that recovery for this object was successful. (Summary record priority: 1)

*SJ Unable to process selected member. The source file is not journaled. (Summary record priority: 1)

*SP Unable to process selected member. See messages preceding message LVE3D42 in job log. (Summary record priority: 1)

*UA Object status is unknown due to object activity. If an object difference is found and the comparison has a value specified on the Maximum replication lag prompt, the difference is seen as unknown due to object activity. This status is only displayed in the summary record. (Summary record priority: 2)

Note: The Maximum replication lag prompt is only valid when a data group is specified on the command.

*UN Indicates that the object’s synchronization status is unknown. (Summary record priority: 4)

Notes:
1. Not all values may be possible for every Compare command.
2. Priorities are used to determine the value shown in output files for Compare Attribute commands.

For most attributes, when the outfile is viewed from a 5250 emulator, when a detailed row contains blanks in either of the System 1 Indicator or System 2 Indicator fields, MIMIX determines the value of the Difference Indicator field according to Table 55.


For example, if the System 1 Indicator is *NOTFOUND and the System 2 Indicator is blank (Object found), the resultant Difference Indicator is *NE.

When viewed through Vision Solutions Portal, data group directionality is automatically resolved so that differences are viewed as Source and Target instead of System1 and System2.

For a small number of specific attributes, the comparison is more complex. The results returned vary according to parameters specified on the compare request and MIMIX configuration values. For more information about comparison results for journal status and other journal attributes, auxiliary storage pool ID (*ASP), user profile status (*USRPRFSTS), and user profile password (*PRFPWDIND), see the MIMIX Administrator Reference book.

Table 55. Difference Indicator values that are derived from System Indicator values.

In the table, each row gives the System 2 Indicator value and each column gives the System 1 Indicator value.

System 2 Indicator   Object found      *NOTCMPD    *NOTFOUND   *NOTSPT     *RTVFAILED   *DAMAGED
                     (blank value)
Object found         *EQ / *NE /       *NA         *NE         *NS         *UN          *NE
(blank value)        *UA / *EC / *NC
*NOTCMPD             *NA               *NA         *NE         *NS         *UN          *NE
*NOTFOUND            *NE / *UA         *NE / *UA   *EQ         *NE / *UA   *NE / *UA    *NE
*NOTSPT              *NS               *NS         *NE         *NS         *UN          *NE
*RTVFAILED           *UN               *UN         *NE         *UN         *UN          *NE
*DAMAGED             *NE               *NE         *NE         *NE         *NE          *NE

Where was the difference detected

The System 1 Indicator (SYS1IND) and System 2 (SYS2IND) fields show the status of the attribute on each system as determined by the compare request. Table 56 identifies the possible values. These fields are available in both summary and detail rows in the output file.

Table 56. Possible values for output file fields SYS1IND and SYS2IND

Value Description Summary Record1

Priority

<blank> No special conditions exist for this object. 5

*DAMAGED Object damaged condition. 3

*MBRNOTFND Member not found. 2

*NOTCMPD Attribute not compared. Due to MIMIX configuration settings, this attribute cannot be compared.

N/A2

308

Interpreting results of audits that compare attributes

For comparisons which include a data group, the Data Source (DTASRC) field identifies which system is configured as the source for replication.

What attributes were compared

In each detailed row, the Compared Attribute (CMPATR) field identifies a compared attribute. For more information about identifying attributes that can be compared by each command and the possible values returned, see the MIMIX Administrator Reference book.

“Attributes compared and expected results - #FILATR, #FILATRMBR audits” on page 689

*NOTFOUND Object not found. 1

*NOTSPT Attribute not supported. Not all attributes are supported on all IBM i releases. This is the value that is used to indicate an unsupported attribute has been specified.

N/A2

*RTVFAILED Unable to retrieve the attributes of the object. Reason for failure may be a lock condition.

4

1. The priority indicates the order of precedence MIMIX uses when setting the system indicators fields in the summary record.

2. This value is not used in determining the priority of summary level records.

Table 56. Possible values for output file fields SYS1IND and SYS2IND

Value Description Summary Record1

Priority


APPENDIX B IBM Power™ Systems operations that affect MIMIX

The following topics describe how to protect the integrity of your MIMIX environment when you perform operations such as IPLs and IBM i operating system upgrades. Only basic procedures for a standard one-to-one MIMIX installation are covered. If you are operating in a complex environment—if you have cluster, SAP R/3, IBM WebSphere MQ, or other application considerations, for example—contact your Certified MIMIX Consultant. Ultimately, you must tailor these procedures to suit the needs of your particular environment.

These topics describe MIMIX-specific steps only. Refer to the user manuals that correspond to any additional applications installed in your environment. For instructions on performing IBM Power™ Systems operations, consult your IBM manuals or the IBM Information Center at http://publib.boulder.ibm.com/pubs/html/as400/infocenter.html.

The following topics are included:

• “MIMIX procedures when performing an initial program load (IPL)” on page 310 includes the MIMIX-specific steps for performing an initial program load (IPL) to help ensure the integrity of your MIMIX environment is not compromised.

• “MIMIX procedures when performing an operating system upgrade” on page 312 describes when and how to perform recommended MIMIX-specific steps while performing a standard upgrade of IBM i.

• “MIMIX procedures when upgrading hardware without a disk image change” on page 319 describes MIMIX prerequisites and procedures for performing a hardware upgrade without a disk image change.

• “MIMIX procedures when performing a hardware upgrade with a disk image change” on page 321 describes prerequisites for saving and restoring MIMIX software when upgrading from one system to another.

• “Handling MIMIX during a system restore” on page 326 includes prerequisites for restoring MIMIX software within a MIMIX system pair, to one system from a save of the other system when an environment meets the conditions specified.

MIMIX procedures when performing an initial program load (IPL)

An initial program load (IPL) loads the operating system and prepares the system for user operations. Performing the recommended MIMIX-specific steps can help ensure that objects are not damaged during the IPL and that the integrity of your MIMIX environment is not compromised.

Note: This procedure describes an IPL performed under normal circumstances. It does not address IPL considerations for system switching environments.


Before beginning this procedure, review your startup procedures to determine whether subsystems will start automatically after the IPL. The startup program is defined in the QSTRUPPGM system value.
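For example, a quick way to check which startup program is configured is the standard IBM i command shown below (a sketch only; it is not a MIMIX-specific step):

DSPSYSVAL SYSVAL(QSTRUPPGM)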

To perform an IPL in a MIMIX environment, do the following:

1. End MIMIX, including the MIMIX managers, and end the RJ links using the following command:

ENDMMX ENDOPT(*IMMED) ENDRJLNK(*YES)

Note: For more information about the ENDMMX command, see “Commands for ending replication” on page 184.

2. If the VSP server is running on the system you are about to IPL, use the following command to end the VSP server:

VSI001LIB/ENDVSISVR

3. Ensure that all MIMIX jobs are ended before performing this step. End the MIMIX subsystem on the system you are about to IPL. Type the following on a command line and press Enter:

ENDSBS SBS(MIMIXSBS) OPTION(*IMMED)

Note: If you are running VSP in the MIMIX subsystem for any product, be aware that all VSP processes will end.

4. Perform the IPL.

5. If your subsystems do not start during the startup procedures defined in the QSTRUPPGM system value, start the MIMIX subsystems on both the source and target systems. On each system, type the following on a command line and press Enter:

STRSBS SBSD(MIMIXQGPL/MIMIXSBS)

6. Verify that the communications links start by using the Verify Communications Link (VFYCMNLNK) command. For more information about the VFYCMNLNK command, see “Verifying a communications link for system definitions” on page 281.

If the communications link is not active, you may need to start the port job. On a command line type the following and press Enter: STRSVR HOST(host-name-or-address) PORT(port-number)
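For example, assuming a hypothetical host address of 192.0.2.10 and port 50410 (substitute the host and port defined in your transfer definition):

STRSVR HOST('192.0.2.10') PORT(50410)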

7. Start MIMIX from either the source or target system. The Start MIMIX (STRMMX) command starts the MIMIX processes for the installation, including the MIMIX managers and the data groups.

For more information about the STRMMX command, see “Starting MIMIX” on page 179.

8. If you ended the VSP server, restart it using the following command:

VSI001LIB/STRVSISVR


MIMIX procedures when performing an operating system upgrade

This topic describes when and how to perform recommended MIMIX-specific steps while performing a standard upgrade of the IBM i operating system (slip-install, where the IBM i release is upgraded without a restore of the user libraries). Performing these recommended steps can help ensure that MIMIX products start properly once the operating system upgrade is complete.

Table 57 indicates which procedures are needed for different upgrade scenarios. Use these instructions in conjunction with the instructions provided by IBM for upgrading from one IBM i release to another IBM i release.

Table 57. IBM i operating system upgrade scenarios and recommended processes for handling MIMIX during the upgrade

To upgrade the backup system only:

1. Perform the preparation steps described in “Prerequisites for performing an OS upgrade on either system” on page 313.

2. Follow the procedure in “MIMIX-specific steps for an OS upgrade on a backup system” on page 313.

To upgrade the production system only:

1. Perform the preparation steps described in “Prerequisites for performing an OS upgrade on either system” on page 313.

2. Perform one of the following procedures:

• If you need to maintain user access to production applications during the upgrade, perform a planned switch as described in “MIMIX-specific steps for an OS upgrade on a production system with switching” on page 315. Your production operations will be temporarily running on the backup system.

• If you have more flexibility with scheduling downtime, you can perform the upgrade without switching as described in “MIMIX-specific steps for an OS upgrade on the production system without switching” on page 317.

To upgrade both the backup and production systems:

1. Perform the preparation steps described in “Prerequisites for performing an OS upgrade on either system” on page 313.

2. Upgrade the backup system first following the “MIMIX-specific steps for an OS upgrade on a backup system” on page 313. By doing this first, you can ensure that the backup system supports all the capabilities of the production system and you can work through problems or custom operations before affecting your production environment.

3. Once you have verified that the backup system is upgraded and operating as desired, perform one of the following procedures to upgrade IBM i on the production system:

• If you need to maintain user access to production applications during the upgrade, perform a planned switch as described in “MIMIX-specific steps for an OS upgrade on a production system with switching” on page 315. Your production operations will be temporarily running on the backup system.

• If you have more flexibility with scheduling downtime, you can perform the upgrade without switching as described in “MIMIX-specific steps for an OS upgrade on the production system without switching” on page 317.


Prerequisites for performing an OS upgrade on either system

Before you start an upgrade of the IBM i operating system on either system, do the following:

1. Access Support information on the web as you perform the following steps to ensure that the system is ready to upgrade:

a. Check the compatibility of the operating systems on the production and backup systems, ensuring the systems will meet the requirements of a MIMIX-supported environment once the IBM i operating system upgrade has occurred.

b. Ensure the recommended IBM i PTFs have been applied according to your IBM i version.

c. Ensure the recommended MIMIX service packs have been applied according to your MIMIX version. Review the Read Me document that corresponds to the MIMIX service pack, and check the website for relevant Technical Alerts and FAQs.

2. Review your startup procedures to understand how your environment is configured to start after an IPL. This startup program is defined in the QSTRUPPGM system value. An IBM i upgrade may include rebuilding access paths, converting formats, or performing other operations that must be complete before MIMIX or other applications are started. The upgrade may not complete successfully if your QSTRUPPGM procedures start MIMIX or other applications during an IPL. Ensure that these processes are disabled before continuing with the IBM i upgrade.

MIMIX-specific steps for an OS upgrade on a backup system

Use this procedure to upgrade the operating system on the backup system.

Notes:

• If you plan to upgrade both the production and backup systems during the same scheduled maintenance period, upgrade the backup system first.

• In the following steps, the terms production and backup always refer to the original roles of the systems before upgrading the operating system on either system. The icons at the beginning of some steps show the state of the systems and replication as a result of the action in the step. The arrow in the icon indicates the direction and state of replication for a classic production to backup environment.

When performing an operating system upgrade of a backup system in a MIMIX environment, do the following:

1. Ensure that you have completed any prerequisite tasks for your upgrade scenario. See Table 57 for a list of required tasks for different upgrade scenarios.

2. End all user applications, user interfaces, and operations actively running on the backup system. Disarm any monitors and all job schedulers.

3. End MIMIX, including the MIMIX managers, and end the RJ links using the following command:


ENDMMX ENDOPT(*IMMED) ENDRJLNK(*YES)

Note: For more information about the ENDMMX command, see “Commands for ending replication” on page 184.

4. If the VSP server is running on the system you are about to upgrade, use the following command to end the VSP server:

VSI001LIB/ENDVSISVR

5. Ensure that all MIMIX jobs are ended before performing this step. End the MIMIX subsystem on the system you are about to upgrade. Type the following on a command line and press Enter:

ENDSBS SBS(MIMIXSBS) OPTION(*IMMED)

Note: If you are running VSP in the MIMIX subsystem for any product, be aware that all VSP processes will end.

6. Complete the operating system upgrade. Allow any upgrade conversions and access path rebuilds to complete before continuing with the next step.

Note: During the IBM i upgrade, make sure you perform a system save on the system being upgraded. This step will provide you with a backup of existing data.

7. Ensure the names of the journal receivers match the journal definitions:

a. From the backup system, specify the command: installation-name/WRKJRNDFN JRNDFN(QAUDJRN *LOCAL)

b. Next to the JRNDFN(QAUDJRN *LOCAL) journal definition, specify 14 (Build) and press F4. Type *JRNDFN for the Source for values parameter and press Enter.
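For example, if the MIMIX installation library were named MIMIX (a hypothetical name; substitute your own installation library), the command in step a would be entered as:

MIMIX/WRKJRNDFN JRNDFN(QAUDJRN *LOCAL)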

8. Start the MIMIX subsystems on both the production and backup systems using the following command from each system:

STRSBS SBSD(MIMIXQGPL/MIMIXSBS)

9. Verify that the communications links start by using the Verify Communications Link (VFYCMNLNK) command. For more information about the VFYCMNLNK command, see “Verifying a communications link for system definitions” on page 281.

If the communications link is not active, you may need to start the port job. On a command line type the following and press Enter: STRSVR HOST(host-name-or-address) PORT(port-number)

10. Perform a normal start of the data groups from either system using the STRMMX command. This step also starts the MIMIX managers.

11. Perform your normal process for validating the IBM i release upgrade.

Notes:

• At your convenience, schedule a switch to verify that your applications function on the new operating system on the backup system.

• MIMIX supports replication for up to two version level differences. If running different OS versions, Vision Solutions recommends that the backup node run the higher OS. The following restrictions and limitations may also apply:

• All objects must be compiled using the Target Release parameter to specify the IBM i version of the lower level operating system.

• Possible inherent restrictions include those specific to new functionality in the OS. There may be features in a higher OS release that would not be available in a lower one, such as new command parameters or APIs.

• Some errors may be encountered during or following a role swap due to the different OS versions.

12. If you ended the VSP server, restart it using the following command:

VSI001LIB/STRVSISVR

MIMIX-specific steps for an OS upgrade on a production system with switching

Use this procedure if you need to maintain user access to production applications during the production system upgrade. This procedure temporarily switches production activity to the backup system before the upgrade and switches back to normal operations after the production system upgrade is complete.

Notes:

• In the following steps, the terms production and backup always refer to the original roles of the systems before upgrading the operating system on either system. The icons at the beginning of some steps show the state of the systems and replication as a result of the action in the step. The arrow in the icon indicates the direction and state of replication for a classic production to backup environment.

• MIMIX supports replication for up to two version level differences. If running different OS versions, Vision Solutions recommends that the backup node run the higher OS. The following restrictions and limitations may also apply:

• All objects must be compiled using the Target Release parameter to specify the IBM i version of the lower level operating system.

• Possible inherent restrictions include those specific to new functionality in the OS. There may be features in a higher OS release that would not be available in a lower one, such as new command parameters or APIs.

• Some errors may be encountered during or following a role swap due to the different OS versions.

To perform an operating system upgrade of the production system in a MIMIX environment while maintaining availability, do the following:

1. Ensure that you have completed any prerequisite tasks for your upgrade scenario. See Table 57 for a list of required tasks for different upgrade scenarios.

2. If applicable, disable auditing, including prioritized audits, to avoid having audits start before ending MIMIX or immediately after restarting MIMIX during the upgrade. For instructions to disable audits, see “Preventing audits from running” on page 45.

3. Use the procedures in your Runbook to perform a planned switch to the backup system.

Note: Do not perform steps to synchronize data and start replication from the backup system to the original production system.

If you do not have a Runbook, you need to follow your processes for the following:

a. End all user applications, user interfaces, and operations actively running on the production system. Disarm any monitors and all job schedulers.

b. Perform a planned switch to the backup system.

c. Start user applications on the backup system and allow users to access their applications from the backup system.

4. End MIMIX, including the MIMIX managers, and end the RJ links using the following command:

ENDMMX ENDOPT(*IMMED) ENDRJLNK(*YES)

For more information about the ENDMMX command, see “Commands for ending replication” on page 184.

5. If the VSP server is running on the system you are about to upgrade, use the following command to end the VSP server:

VSI001LIB/ENDVSISVR

6. Ensure that all MIMIX jobs are ended before performing this step. End the MIMIX subsystem on the system you are about to upgrade. Type the following on a command line and press Enter:

ENDSBS SBS(MIMIXSBS) OPTION(*IMMED)

Note: If you are running VSP in the MIMIX subsystem for any product, be aware that all VSP processes will end.

7. On the production system, complete the operating system upgrade. Allow any upgrade conversions and access path rebuilds to complete before continuing with the next step.

Note: During the IBM i upgrade, make sure you perform a system save on the system being upgraded. This step will provide you with a backup of existing data.

8. Ensure the names of the journal receivers match the journal definitions:

a. From the original production system, specify the command: installation-name/WRKJRNDFN JRNDFN(QAUDJRN *LOCAL)

b. Next to the JRNDFN(QAUDJRN *LOCAL) journal definition, specify 14 (Build) and press F4. Type *JRNDFN for the Source for values parameter and press Enter.

9. Start the MIMIX subsystem using the following command:

STRSBS SBSD(MIMIXQGPL/MIMIXSBS)

10. Verify that the communications links start by using the Verify Communications Link (VFYCMNLNK) command. For more information about the VFYCMNLNK command, see “Verifying a communications link for system definitions” on page 281.

If the communications link is not active, you may need to start the port job. On a command line type the following and press Enter: STRSVR HOST(host-name-or-address) PORT(port-number)

11. Start the MIMIX managers and collector services with the following command:

STRMMXMGR SYSDFN(*ALL) MGR(*SYS) COLSRV(*YES)

12. If applicable, start the VSP server using the command:

VSI001LIB/STRVSISVR

13. Follow your Runbook procedures to start replication (sync). If you do not have a Runbook, follow your processes for starting data groups or application groups.

14. Ensure that no backlog exists. See “Identifying replication processes with backlogs” on page 115.

15. Re-enable all audits.

16. When you are ready to switch back to the production system and start replication, follow your Runbook procedures. If you do not have a Runbook, follow your processes to switch replication so that you return to your normal replication environment.

MIMIX-specific steps for an OS upgrade on the production system without switching

Use this procedure if you have more flexibility with scheduling downtime and can perform the upgrade without switching.

Note: In the following steps, the terms production and backup always refer to the original roles of the systems before upgrading the operating system on either system. The icons at the beginning of some steps show the state of the systems and replication as a result of the action in the step. The arrow in the icon indicates the direction and state of replication for a classic production to backup environment.

To perform an operating system upgrade of the production system in a MIMIX environment without switching, do the following:

1. Ensure that you have completed any prerequisite tasks for your upgrade scenario. See Table 57 for a list of required tasks for different upgrade scenarios.

2. End all user applications, user interfaces, and operations actively running on the production system. Disarm any monitors and job schedulers.

For more information, refer to your Runbook and your applications’ user manuals.

3. End the data groups from either system using the command:

ENDDG DGDFN(*ALL) ENDOPT(*CNTRLD)

For more information about ending data groups see “Commands for ending replication” on page 184.

4. Wait until the status of each data group becomes inactive (red) by monitoring the status on the Work with Data Groups (WRKDG) display.

For more information about the WRKDG display, see “The Work with Data Groups display” on page 99.

5. If you have applications that use commitment control, ensure there are no open commit cycles. For more information, see “Checking for open commit cycles” on page 183.

If an open commit cycle exists, restart the data group and repeat Step 3, Step 4, and Step 5 until there is no open commit cycle for any apply session.

6. Use the following command to end other MIMIX products in the installation library, end the MIMIX managers, and end the RJ link:

ENDMMX ENDOPT(*CNTRLD) ENDRJLNK(*YES)

7. End the MIMIX subsystems on the production system and on the backup system. On each system, type the following on a command line and press Enter:

ENDSBS SBS(MIMIXSBS) OPTION(*IMMED)

8. Complete the operating system upgrade. Allow any upgrade conversions and access path rebuilds to complete before continuing with the next step.

Note: During the IBM i upgrade, make sure you perform a system save on the system being upgraded. This step will provide you with a backup of existing data.

9. Start the MIMIX subsystems on the production system and the backup system as you would during the synchronization phase of a switch. From each system, type the following on a command line and press Enter:

STRSBS SBSD(MIMIXQGPL/MIMIXSBS)

10. Ensure the names of the journal receivers match the journal definitions:

a. From the production system, specify the command: installation-name/WRKJRNDFN JRNDFN(QAUDJRN *LOCAL)

b. Next to the JRNDFN(QAUDJRN *LOCAL) journal definition, specify 14 (Build) and press F4. Type *JRNDFN for the Source for values parameter and press Enter.

c. Record the newly attached journal receiver name by placing the cursor on the posted message and pressing F1 or Help.

11. Using the information you gathered in Step 10, start each data group as follows (this step also starts the MIMIX managers):

a. From the WRKDG display, type a 9 (Start DG) next to the data group and press Enter. The Start Data Group display appears.

b. At the Object journal receiver prompt, specify the receiver name recorded in Step 10c.

c. At the Object large sequence number prompt, specify *FIRST.

d. At the Clear pending prompt, specify *YES.

12. Start any applications that you disabled prior to completing the IBM i upgrade according to your Runbook instructions. These applications are normally started in the program defined in the QSTRUPPGM system value. Allow users back on the production system.

Note: MIMIX supports replication for up to two version level differences. If running different OS versions, Vision Solutions recommends that the backup node run the higher OS. The following restrictions and limitations may also apply:

• All objects must be compiled using the Target Release parameter to specify the IBM i version of the lower level operating system.

• Possible inherent restrictions include those specific to new functionality in the OS. There may be features in a higher OS release that would not be available in a lower one, such as new command parameters or APIs.

• Some errors may be encountered during or following a role swap due to the different OS versions.

MIMIX procedures when upgrading hardware without a disk image change

This topic describes MIMIX prerequisites and procedures for a hardware upgrade without a disk image change that will change a model, feature, or serial number and require a new license key. Performing these steps can ensure that MIMIX products start properly once the hardware upgrade is complete.

Considerations for performing a hardware system upgrade without a disk image change

Before you start a hardware upgrade on either system, consider the following:

• Ensure the new system is compatible with and meets the requirements for a MIMIX-supported environment. For more information, see the Supported Environments Matrix in the Technical Documents section of Support Central.

• Apply the latest MIMIX fixes on both systems. The fixes are available by product in the Downloads section of Support Central.

• Obtain new MIMIX product license keys. These codes are required for products when a model, feature, or serial number changes. For more information, see “Working with license keys” in the License and Availability Manager book.

• Determine whether a planned switch is required prior to the hardware upgrade. For example, a switch would be necessary if the source system is being upgraded and users need to continue working while the upgrade takes place. To perform a switch, follow the steps in your runbook. For more information, see “Switching” on page 244.

• Determine if the transfer definitions need to be changed. For example, transfer definitions would need to be changed if the IP addresses or host names change. For more information, see “Configuring transfer definitions” in the MIMIX Administrator Reference book.


MIMIX-specific steps for a hardware upgrade without a disk image change

Use this procedure to restart your MIMIX installation when updating your hardware. If you have special considerations, contact your Certified MIMIX Consultant for assistance. Before you begin, ensure that the considerations in “Considerations for performing a hardware system upgrade without a disk image change” on page 319 have been reviewed and completed where applicable.

Hardware upgrade without a disk image change - preliminary steps

To perform this portion of the upgrade process, do the following on the system prior to the upgrade:

1. Ensure MIMIX is operating normally before performing the upgrade. There should be no files or objects in error and all transactions should be caught up. See “Resolving common replication problems” on page 207 for more information about resolving problems.

2. If upgrading a production system, ensure users are logged off the system and perform a controlled end of all MIMIX data groups. For more information, see “Ending a data group in a controlled manner” on page 195.

3. Optional step: If upgrading the source system and performing a planned switch, follow the steps in your runbook. For more information, see “Switching” on page 244.

4. Use the following command to end all MIMIX products in the installation library, end the MIMIX managers, and end the RJ links:

ENDMMX ENDOPT(*CNTRLD) ENDRJLNK(*YES)

For more information, see “Ending MIMIX” on page 179.

5. Record status information for each data group in case it is needed later. Do the following:

WRKDG DGDFN(*ALL) OUTPUT(*OUTFILE) OUTFILE(MIMIXQGPL/SWITCH)
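If you later need to review the recorded status, one standard IBM i way to display the outfile created above (assuming the outfile name shown) is:

RUNQRY QRY(*NONE) QRYFILE((MIMIXQGPL/SWITCH))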

6. Optional step: Save the MIMIX software and Vision Solutions Portal by doing a full system save or by saving the following MIMIX installation libraries and IFS directories (an example save command follows this list):

• LAKEVIEW

• MIMIXQGPL

• MIMIX-installation-library

• MIMIX-installation-library_0

• MIMIX-installation-library_1

• VSI001LIB

• /LakeviewTech (directory tree)

• /visionsolutions/http/vsisvr
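As a rough sketch only (assuming a hypothetical tape device named TAP01 and default save options; your MIMIX installation library names will differ), the libraries and directories above could be saved with standard IBM i save commands such as:

SAVLIB LIB(LAKEVIEW MIMIXQGPL VSI001LIB) DEV(TAP01)

SAV DEV('/QSYS.LIB/TAP01.DEVD') OBJ(('/LakeviewTech') ('/visionsolutions/http/vsisvr'))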


Hardware upgrade without a disk image change - subsequent steps

To perform this portion of the upgrade process, do the following on the system after the upgrade has been completed:

1. Optional step: Update any transfer definitions that require changes. For more information, see “Configuring transfer definitions” in the MIMIX Administrator Reference book.

2. Enter the new product license key on the system. Do the following:

a. From the MIMIX main menu select option 31 (Product Management Menu). The License Manager Main Menu appears.

b. Select option 1 (Update license key). The Update License Keys (UPDLICKEY) command appears. Follow the instructions displayed for obtaining license keys. For more information, see “Obtaining license keys using UPDLICKEY command” in the License and Availability Manager book.

3. Confirm that communications work between the new system and other systems in the MIMIX environment. For more information, see “Verifying a communications link for system definitions” on page 281.

4. Optional step: If you need to keep users active, perform a planned switch to the backup system by following the steps in your runbook. See “Considerations for performing a hardware system upgrade without a disk image change” on page 319 to determine if a switch is required.

If you do not have a Runbook, you need to follow your processes for the following:

a. End all user applications, user interfaces, and operations actively running on the production system.

b. Perform a planned switch to the backup system.

c. Start user applications on the backup system and allow users to access their applications from the backup system.

5. Start MIMIX from either the source or target system. The Start MIMIX (STRMMX) command starts the MIMIX processes for the installation, including the MIMIX managers, data groups, and application groups. For more information about the STRMMX command, see “Starting MIMIX” on page 179.

6. Run your MIMIX audits to verify the systems are synchronized. See “Running an audit immediately” on page 131 for more information about running audits.

MIMIX procedures when performing a hardware upgrade with a disk image change

When a hardware upgrade is being performed on a system, MIMIX software may need to be saved from the system being replaced and then restored to the system that is its replacement. The saved MIMIX information must be restored on a system that performs the same role within MIMIX operations. For example, if the network system is being replaced, MIMIX software must be saved from the network system and restored on the new network system. A network system cannot be restored to a new management system.

This topic describes steps to consider prior to saving and restoring MIMIX software when upgrading from one system to another. Performing these steps can ensure that MIMIX products start properly once the hardware upgrade is complete.

IMPORTANT! To ensure the integrity of your data, contact your Certified MIMIX Consultant for assistance performing a hardware upgrade.

Considerations for performing a hardware system upgrade with a disk image change

Before you start a hardware upgrade on either system, consider the following:

• Contact your Certified MIMIX Consultant prior to performing the upgrade for instructions that may be specific to your environment.

• Ensure the new system is compatible with and meets the requirements for a MIMIX-supported environment. For more information, see the Supported Environments Matrix in the Technical Documents section of Support Central.

• Apply the latest MIMIX fixes on both systems. The fixes are available by product in the Downloads section of Support Central.

• Obtain new MIMIX product license keys. These codes are required for products when a model, feature, or serial number changes. For more information, see “Working with license keys” in the License and Availability Manager book.

• Determine whether a planned switch is required prior to the hardware upgrade. For example, a switch would be necessary if the source system is being upgraded and users need to continue working while the upgrade takes place. To perform a switch, follow the steps in your runbook. For more information, see “Switching” on page 244.

• Determine if the transfer definitions need to be changed. For example, transfer definitions would need to be changed if the IP addresses or host names change. For more information, see “Configuring transfer definitions” in the MIMIX Administrator Reference book.

• Copy all automation for MIMIX to the new machine, including exit programs.

• Transfer any modifications of programs such as QSTARTUP to the new system. Modifications may be needed to start the MIMIX subsystem after an IPL. Refer to your Runbook for an overview of the required automation changes that need to be performed on the system.

MIMIX-specific steps for a hardware upgrade with a disk image change

Use this procedure to save and restore your MIMIX installation when updating your hardware with a disk image change. If you have special considerations, contact your Certified MIMIX Consultant for assistance. Before you begin, ensure that the considerations in “Considerations for performing a hardware system upgrade with a disk image change” on page 322 have been reviewed and completed where applicable.


Hardware upgrade with a disk image change - preliminary steps

To perform the save portion of the upgrade process, do the following on the old system prior to the upgrade:

1. Ensure MIMIX is operating normally before performing the upgrade. There should be no files or objects in error and all transactions should be caught up. See “Resolving common replication problems” on page 207 for more information about resolving problems.

2. Optional step: Perform a switch by following the steps in your runbook. See “Considerations for performing a hardware system upgrade with a disk image change” on page 322 to determine if a switch is required.

3. Ensure users are logged off the system and all applications have ended. Perform a controlled end of all MIMIX data groups. For more information, see “Ending a data group in a controlled manner” on page 192.

4. Use the following command to end all MIMIX products in the installation library, end the MIMIX managers, and end the RJ links:

ENDMMX ENDOPT(*CNTRLD) ENDRJLNK(*YES)

For more information, see “Ending MIMIX” on page 179.

5. Ensure there are no open commit cycles. For more information, see “Checking for open commit cycles” on page 183.

If open commit cycles exist, restart the data group and repeat Step 4 to end all MIMIX products.

6. Print the status information for each data group by doing the following:

a. From the Work with Data Groups display, type 8 (Display detail status) next to each data group.

b. Press Enter.

c. Press F7 for object status and print the display. Keep the information for later use.

d. Press F8 for database status and print the display. Keep the information for later use.

7. Print the list of system values. Type the following on a command line and press Enter: WRKSYSVAL SYSVAL(*ALL) OUTPUT(*PRINT)

8. Optional step: Save the MIMIX software and Vision Solutions Portal from the old system by doing a full system save or by saving the following MIMIX installation libraries and IFS directories:

• LAKEVIEW

• MIMIXQGPL

• MIMIX-installation-library

• MIMIX-installation-library_0

• MIMIX-installation-library_1


• VSI001LIB

• /LakeviewTech (directory tree)

• /visionsolutions/http/vsisvr

Hardware upgrade with a disk image change - subsequent steps

To perform this portion of the upgrade process, do the following after you have upgraded and restored all user data, including all MIMIX libraries:

Note: To ensure that journaling is properly started, restore journals and journal receivers before restoring user data.

1. Ensure the following system values are set the same way on the new system as they were on the old system: QAUDCTL, QAUDLVL, QALWOBJRST, QALWUSRDMN, and QLIBLCKLVL.

2. On a command line, type LAKEVIEW/UPDINSPRD and press Enter.

3. Enter the new product license key on the system. Do the following:

a. From the MIMIX main menu select option 31 (Product Management Menu). The License Manager Main Menu appears.

b. Select option 1 (Update license key). The Update License Key (UPDLICKEY) command appears. Follow the instructions displayed for obtaining license keys. For more information, see “Obtaining license keys using UPDLICKEY command” in the License and Availability Manager book.

4. On a command line, type CALL MXXPREG and press Enter to register the MIMIX exit points in the system registry.

5. Update any transfer definitions that require changes. For more information, see “Considerations for performing a hardware system upgrade with a disk image change” on page 322.

6. Confirm that communications work between the new system and other systems in the MIMIX environment. For more information, see “Verifying a communications link for system definitions” on page 281.

7. Ensure all automation, including MIMIX exit programs, for MIMIX is available and configured on the new system.

8. Make any necessary modifications to the QSTARTUP program. This may need to be modified to start the MIMIX subsystem after an IPL. For more information, see “Considerations for performing a hardware system upgrade with a disk image change” on page 322.

9. Start the MIMIX subsystem with the following command:

STRSBS SBSD(MIMIXQGPL/MIMIXSBS)

10. Optional step: Perform a data group switch by following the steps in your runbook, then skip to Step 13. See “Considerations for performing a hardware system upgrade with a disk image change” on page 322 to determine if a switch is required.

11. Start the system manager with the following command:


STRMMXMGR SYSDFN(*ALL) MGR(*SYS)

12. Start MIMIX on the upgraded system using the appropriate instructions in this step.

If the source system was upgraded, start MIMIX as follows:

a. On the source system, type WRKJRNDFN JRNDFN(*ALL *LOCAL) on a command line, and press Enter.

b. Press F10 to verify the Receiver Prefix, Library, and all other parameters (option 5) are correct. Make any necessary changes from the MIMIX management system before continuing.

c. For each journal definition that has an RJ Link parameter value of *SRC or *NONE, do the following:

• Type option 14 and press F4=PROMPT.

• Type *JRNDFN for the Source for values parameter and press Enter.

• Record the newly attached journal receiver name by placing the cursor on the posted message and pressing F1 or Help.

d. Run the appropriate Verify Journaling command to ensure the objects are journaled to the correct journal:

For each data group, run: VFYJRNFE DGDFN(DGNAME) FILE1(*ALL)

For each IFS file entry, run: VFYJRNIFSE DGDFN(DGNAME)

For each object entry, run: VFYJRNOBJE DGDFN(DGNAME)

e. Start the MIMIX managers and collector services with the following command:

STRMMXMGR SYSDFN(*ALL) MGR(*SYS) COLSRV(*YES)

f. Start the VSP server using the command:

VSI001LIB/STRVSISVR

g. Start the data groups with a clear pending start from the receivers recorded in Step c of this procedure:

STRDG DGDFN(data-group-name) DBJRNRCV(user-journal-receiver) DBSEQNBR2(*FIRST) OBJJRNRCV(security-journal-receiver) OBJSEQNBR2(*FIRST) CLRPND(*YES) DTACRG(*YES)
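For example, with a hypothetical data group named MYDG and hypothetical receiver names RCV0000123 (user journal) and ADRCV0045 (security audit journal) recorded in Step c:

STRDG DGDFN(MYDG) DBJRNRCV(RCV0000123) DBSEQNBR2(*FIRST) OBJJRNRCV(ADRCV0045) OBJSEQNBR2(*FIRST) CLRPND(*YES) DTACRG(*YES)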

h. Delete any old receivers with different library or prefix names.

i. User and application activity can be resumed on the system.

If the target system was upgraded, start MIMIX as follows:

a. On the target system, type WRKJRNDFN JRNDFN(*ALL *LOCAL) on a command line, and press Enter.

b. Press F10 to verify the Receiver Prefix, Library, and all other parameters (option 5) are correct. Make any necessary changes from the MIMIX management system before continuing.


c. Type option 14 for each journal definition that has an RJ Link parameter value of *SRC or *NONE. Do not press Enter.

d. On the command line, type the following and press Enter to build a new journal receiver for the journal definitions:

JRNVAL(*JRNDFN)

e. On a command line, type the following and press Enter:

WRKJRNDFN JRNDFN(*ALL *LOCAL) RJLNK(*TGT)

f. For each journal definition listed, do the following:

• Type option 17 (Work with jrn attributes) and press Enter.

• Type option 15 (Work with receiver directory) and press Enter.

• Type option 4 (Delete) for all receivers in the list. If message CPA7025 is issued, reply with an “I”.

g. Run the appropriate Verify Journaling command to ensure the objects are journaled to the correct journal:

For each data group, run: VFYJRNFE DGDFN(DGNAME) FILE1(*ALL)

For each IFS file entry, run: VFYJRNIFSE DGDFN(DGNAME)

For each object entry, run: VFYJRNOBJE DGDFN(DGNAME)

h. Start the MIMIX managers and collector services with the following command:

STRMMXMGR SYSDFN(*ALL) MGR(*SYS) COLSRV(*YES)

i. Start the VSP server using the command:

VSI001LIB/STRVSISVR

j. Use this Start Data Group (STRDG) command to start all data groups with the information collected in Step 6 of “Hardware upgrade with a disk image change - preliminary steps” on page 323:

STRDG DGDFN(data-group-name) DBJRNRCV(last-processed-database-journal-receiver) DBSEQNBR2(last-processed-database-sequence-number) OBJJRNRCV(last-processed-object-journal-receiver) OBJSEQNBR2(last-processed-object-sequence-number) CLRPND(*YES) DTACRG(*YES)

k. Delete any old receivers with different library or prefix names.

13. Run your MIMIX audits to verify the systems are synchronized. See “Running an audit immediately” on page 131 for more information about running audits.

Handling MIMIX during a system restore

Occasionally, an entire system may need to be restored because of a system failure, for example, a processor or operating system (OS) failure. This topic includes prerequisites for restoring MIMIX software within a MIMIX system pair (two systems using the same MIMIX installation) to one system from a save of the other system. A system restore may need to be performed when the following conditions exist:

• The original production system, including the Licensed Internal Code and the OS, has been recovered from the backup system by tape.

• The IBM installed release level is the same on each system.

IMPORTANT! To ensure the integrity of your data, contact your Certified MIMIX Consultant for assistance performing a system restore.

For information about MIMIX-supported environments, see the Supported Environments Matrix in the Technical Documents section of Support Central.

Prerequisites for performing a restore of MIMIX

Before you restore MIMIX on a system, consider the following steps which help ensure that MIMIX products start properly once the restore is complete:

• Contact your Certified MIMIX Consultant prior to performing the restore for instructions that may be specific to your environment.

• Locate your MIMIX product license keys. These codes may be required after the restore. For more information, see “Working with license keys” in the Using License Manager book.


Index

Symbols*ATTN

application group 65managers for node 67monitors 62replication 69

*CANCEL, step status of 88*CANCELED, procedure status of 82, 91*FAILED activity entry 224, 227*FAILED status

procedure 82, 91step 88

*HLDfile entry 210tracking entry 219

*HLDERRfile entry 210tracking entry 219

*HLDRLTD file entry 210*INACTIVE

application group 66node managers 67replication 69

*MSGW statusprocedure 81step 87

*UNKNOWN 66

Aaccessing

MIMIX Availability Status display 93MIMIX Main Menu 24

activity entries, objectconfirm delay/retry cycle 228failed, resolving 224remove history 229retrying 227

additional resources 13application group

resolving reported problems 64status of 60

application group definition 17, 20application node status 65applications, reducing contention with 279audit

#DGFE considerations 127#DLOATR considerations 127#FILDTA considerations 127#IFSATR considerations 127

#MBRRCDCNT considerations 127after a configuration change 126authority level to run 23automatic starting of 124before switching 126best practice 23, 126, 145bi-directional environment considerations 27change history retention criteria 43changing schedule 41compare phase 123comparison levels 53compliance 144compliance threshold 52, 53definition of 17differences, resolving 133displaying compliance status 145displaying history 137displaying runtime status 129displaying schedule 147displaying time of next scheduled run 147displaying when automatic audits run 147ending 136history 137job log 135last performed 145last successful run 144no objects selected 139objects compared 139policies which affect 36policies, runtime behavior 36policies, submitting automatically 37prevent from running 45priority selection example 139priority, default settings of 37problems reported in installation 99recovery phase 123results 133results recommendations 127retain history of 54rule name 39running immediately 131schedule 147schedule, changing 41scheduled, default settings of 37status from 5250 emulator 96status, compliance 144status, runtime 129summary 129three or more node considerations 27when not to audit 28

328

audit historychange criteria 43

audit levelbest practice 53changing before switch 43

audit results 133#DGFE rule 300#FILDTA rule 303#MBRRCDCNT rule 303interpreting, attribute comparisons 306interpreting, file data comparisons 303resolving problems 133, 300troubleshooting 135

auditing level, objectset when starting a data group 174used for replication 231

authority levelfor product access 23

AutoGuard, MIMIX 17automatic error recovery

replication, policies for 32system journal replication 34user journal replication 33

automatic recoveryaudits 50concept 17system journal replication 50user journal replication 50

AutoNotify feature, MIMIX 163Availability Status display, MIMIX 93

Bbacklog

starting shared object send job 175system manager 151

backlog, identifying a 115backup node sequence

changing 71examples of changing 73verifying 70

backup system 18best practice

audit frequency 145audit level 53, 126audit level before switch 43, 53, 126audit threshold 52, 53switch frequency 244, 253switch threshold 56switching 245

best practicesauditing 126

bi-directional environment policy consider-ations 27

Ccancel

procedure 92clear error entries

processing 175when to 181

clear pending entriescheck for open commits 183open commit cycle prevents 183processing 175resolving open commits before 183when to 181

cluster services 21cold start, replacement for 175collector services 21

ending 153starting 152status 149

collector services status 67command, by name

Work with Audit History 137commands, by mnemonic

CHGDG 270CHGPROCSTS 89CHKDGFE 283, 300CNLPROC 92CRTDGTSP 291DLTDGTSP 293DSPDGSTS 105DSPDGTSP 293DSPMMXMSGQ 208DSPRJLNK 264ENDAG 169ENDDG 169, 184, 190ENDJRNFE 236ENDJRNIFSE 239ENDJRNOBJE 242ENDJRNPF 236ENDMMX 169, 184, 192ENDRJLNK 274ENDSVR 261HLDDGLOG 294MIMIX 24RLSDGLOG 294

329

RUNPROC 90STRAG 169, 171STRDG 169, 171, 174STRJRNFE 235STRJRNIFSE 238STRJRNOBJE 241STRMMX 169, 171, 179STRRJLNK 274STRSVR 260SWTDG 255, 257VFYCMNLNK 281, 282VFYJRNFE 237VFYJRNIFSE 240VFYJRNOBJE 243VFYKEYATR 289WRKAG 60WRKAUDHST 137WRKAUDOBJ 139WRKAUDOBJH 142WRKCPYSTS 263WRKDG 99WRKDGACT 224WRKDGACTE 225WRKDGFE 210WRKDGIFSTE 219WRKDGOBJTE 219WRKDGTSP 291WRKDTARGE 68WRKMMXSTS 93, 164WRKMSGLOG 209WRKNFY 160WRKNODE 66WRKPROCSTS 78WRKRJLNK 265, 267WRKSTEPSTS 83

commands, by nameCancel Procedure 92Change Data Group 270Change Procedure Status 89Check Data Group File Entries 283, 300Create Data Group Timestamps 291Delete DG Timestamps 293Display Data Group Status 105Display Data Group Timestamps 293Display MIMIX Message Queue 208Display RJ Link 264End Application Group 169End Data Group 169, 184, 190End Journal Physical File 236End Journaling File Entry 236

End Journaling IFS Entries 239End Journaling Obj Entries 242End Lakeview TCP Server 261End MIMIX 169, 184, 192End RJ Link 274Hold Data Group Log 294MIMIX 24MIMIX Availability Status 93Release Data Group Log 294Run Procedure 90Start Application Group 169, 171Start Data Group 169, 171, 174Start Journaling File Entry 235Start Journaling IFS Entries 238Start Journaling Obj Entries 241Start Lakeview TCP Server 260Start MIMIX 169, 171, 179Start RJ Link 274Switch Data Group 255, 257Verify Communications Link 281, 282Verify Journaling File Entry 237Verify Journaling IFS Entries 240Verify Journaling Obj Entries 243Verify Key Attributes 289Work with Application Groups 60Work with Audited Obj. History 142Work with Audited Objects 139Work with Copy Status 263Work with Data Group Activity 224Work with Data Groups 99Work with Data Rsc. Grp. Ent. 68Work with DG Activity Entries 225Work with DG File Entries 210Work with DG IFS Tracking Ent. 219Work with DG Obj Tracking Ent. 219Work with DG Timestamps 291Work with Message Log 209Work with MIMIX Availability Status 164Work with Node Entries 66Work with Notifications 160Work with Procedure Status 78Work with RJ Links 265, 267Work with Step Status 83

commit cycleseffect on audit comparison 303, 304

commit cycles, openchecking for 183checking for after a controlled end 196preventing problems with 187preventing STRDG request 183

330

commit mode changeprevents starting with open commits 183

communicationsending TCP sever 261starting TCP sever 260

compare phase 123compliance

audit 144concept 125switch 253switch, policies for 49

compliance statusswitch 253

conceptsauditing 122MIMIX 17

configurationaudit after changing 126determining data areas and data queues 272determining, IFS objects 271results of #DGFE audit after changing 300

configuration changes deployed 174contacting Vision Solutions 14contention with applications, reducing 279controlled end

confirm end 196description 186procedure 195wait time 187

cooperative processing 20copying active files 263correcting

file-level errors 216record-level errors 217

CustomerCare 14

Ddata areas and data queues

determining configuration of 272holding user journal entries for 221resolving problems 220tracking entries 219verifying journaling 243

data group 17backlogs 115controlled vs. immediate end 186definition 20determining if RJ link used 267disabling 270

enabling 270ending considerations 190ending controlled 195ending immediately 198ending selected processes 198indication of disabled state 269recovery point cleared 190starting selected processes 181state, disabled or enabled 269status from 5250 emulator 95status, database view 112status, detailed 105status, merged view 106status, object view 110status, summary 99switching 249, 255timestamps 291when to exclude from auditing 28

data group entrydescription 20

data resource group 68replication status summary 68

database apply (DBAPY) status 113database apply cache policy 51database error recovery, automatic 33definition

application group 20data group 20journal 20remote journal (RJ) link 20system 20transfer 20

definitionsapplication group 17

delay/retry cycle, confirm object in a 228differences, resolving audit 133disabled data group 269displaying

data group spooled file information 262data group status details 105long IFS object names 262RJ link 264RJ link status 265status 93

documents, MIMIX 11

Eending

audit 136

331

collector services 153MIMIX managers 152MIMIXSBS subsystem 193system and journal managers 152target journal inspection 154TCP server 261

ending data groupclears recovery point 190considerations when ending 190controlled end 195controlled end wait time 187controlled vs. immediate 186how to confirm end 196immediate end 198processes 187processes, effect of 203processes, specifying selected 198when to end RJ link 188

ending journalingdata areas and data queues 242files 236IFS objects 239IFS tracking entry 239object tracking entry 242

ending MIMIX 192controlled vs. immediate 186end subsystem, when to also 193follow up after 193included processes 188using default values 192using specified values 192when to end RJ link 188

ending replication 169choices 184controlled vs. immediate 186

ending RJ linkindependently from data group 274when to end 188

errorsfile level 216record level 217system journal replicated objects 224target journal of RJ link 288user journal replicated files 210user journal replicated objects 219

examplepriority audit object selection 139

exampleschanging backup node sequence 73

Ffile

file-level errors 216hold journal entries 214new 231not journaled 102record-level errors 217replicated 210

file identifiers (FIDs) 273file in error

examine held journal entries 213resolving 210

file on holdrelease and apply held entries 215release and clear entries 216release at synchronization point 215

Hhardware upgrade

MIMIX-specific steps 320no disk image change 319prerequisites 319with a disk image change 321

held error (*HLDERR)file entry 210preferred action for entry 211, 220tracking entry 219

historyaudited object 142completed audits 137displaying audit 137

history log, removing completed entries 229history of, retaining

procedures 31hold (*HLD)

preferred action for held entry 211, 220put file entry on hold 214put tracking entry on hold 221release a held file entry 215release a held tracking entry 223

hold ignore (*HLDIGN)preferred action for ignored entry 211, 220put file entry on hold ignore 214put tracking entry on hold ignore 222

hold related (*HLDRLTD) 211hot backup 15

Ii5/OS upgrade 312

332

IFS objectsdetermining configuration 271file IDs (FIDs) 273hold user journal entries for 221path names 262resolving problems 220tracking entries for 219verifying journaling 240

immediate enddescription 186incomplete tracking entry 186

information and additional resources 13inspection

target journal 22installation, status of

from 5250 emulator 93IPL 310

Jjob log

for audit 135jobs

used by procedures 77used by procedures, status of 83

journal 19inspection on target system 22

journal at createrequirements 231requirements and restrictions 232

journal cache or stateresolving problems 103, 119status 117

journal definition 20defined to RJ Link 268

journal entrydescription 19unconfirmed 286

journal manager 21ending 152resolving problems 149starting 152status 149

journal receiver 19journaling 19

cannot end 236data group problem with 101ending for data areas and data queues 242ending for IFS objects 239ending for physical files 236

implicitly started 231requirements for starting 231starting for data areas and data queues 241starting for IFS objects 238starting for physical files 235starting, ending, and verifying 230verifying for data areas and data queues 243verifying for IFS objects 240verifying for physical files 237

journaling statusdata areas and data queues 241files 235IFS objects 238

Kkeyed replication

verifying file attributes 289

Llast audit performed 144last switch performed 253log space 22long IFS path names 262

M
management system 19
manager
    journal 21
    system 21

manager status 67
menu
    MIMIX Main 24
message queue, primary and secondary 208
messages
    ENDMMX 172
    STRMMX 172

MIMIX AutoGuard 17
MIMIX CDP feature
    exclude from audit 28
    recovery point cleared 190

MIMIX installation 17
MIMIX managers
    checking for a backlog 151
    ending 152
    resolving problems 149, 151
    starting 152

MIMIX Model Switch Framework 22, 249
    policy default 56

MIMIX rules 122


MIMIX subsystem (MIMIXSBS)
    starting 179
    when to end 193

MIMIX Switch Assistant 22
    setting default switch framework 48
    setting switch compliance policies 49

MMNFYNEWE monitor 163
monitor for newly created objects 163
monitors
    nodes where needed 62
    status of 61

N
names, displaying long 262
network system 19
new hardware upgrade 321
    MIMIX-specific steps 322
    prerequisites 322

new objects
    IFS object journal at create requirements 231
    journal at create selection criteria 232

newly created objects, notification of 163
node entries 66
node status
    application group 65
    data resource group 68

nodes, policy considerations for multiple 27
notification status 63
notifications
    definition 18, 159
    displaying 160, 164
    new problems in installation 99
    severity level 125, 161
    status 160

O
object
    audited history 142
object auditing
    concept 19
    setting level with STRDG 174
    used for replication 231

object error recovery, automatic 34
object send process
    considerations for starting a shared 175
objects
    audited object list 139
    configuration of non-file 271
    displaying long IFS names 262
    displaying objects in error 108
    displaying objects with active entries 108
    in error, resolving 224
    new 231
    reducing contention 279
    tracking entries for data areas and data queues 219

open commit cycles
    audit results 303, 304
    prevent problems with 187
    resolving before starting replication 183
    shown in status 196
    when starting a data group 183

operations
    common, where to start 98
    less common 259

orphaned recoveries 167
output file fields
    Difference Indicator 303, 306
    System 1 Indicator field 308
    System 2 Indicator field 308

P
path names, IFS 262
planned switch 245
policies 18
    audit, automatically submitting 37
    audit, runtime behavior of 36
    changing values 29
    for auditing 36
    for replication 32
    for switching 48
    installation-level only 31
    introduction 26
    multi-node and bi-directional environment considerations 27

policy
    action for running audits 54
    audit action threshold 53
    audit history retention 54
    audit level 53
    audit notify on success 50
    audit rule 50
    audit schedule 57
    audit warning threshold 52
    automatic audit recovery 50
    automatic database recovery 50
    automatic object recovery 50
    CMPRCDCNT commit threshold 56
    data group definition 50
    database apply cache 51
    default model switch framework 56
    independent ASP library ratio 56
    journaling attribute difference action 51
    maximum rule runtime 52
    notification severity 50
    object only on target 51
    prioritized audit in effect 147
    procedure history retention 57
    run rule on system 54
    switch action threshold 56
    switch warning threshold 56
    synchronize threshold size 55
    system journal recovery success 50
    third delay retry interval 56
    third delay retry interval, number of 55
    user journal apply threshold 51
    user journal recovery success 50

PPRC replication status 65, 68
problems
    reporting a problem 278
    troubleshoot 276

problems, journaling
    data areas and data queues 241
    files 235
    IFS objects 238

problems, resolving
    audit results 133, 300
    common errors 207
    common system level errors 149
    data group cannot end 280
    data group cannot start 285
    files in error 210
    files not journaled 102
    journal cache or state 103, 119
    objects in error 224
    open commits when starting data group 183
    RJ link cannot end 286
    RJ link cannot start 286
    switch compliance 254
    system level processes 149
    tracking entries 219

procedure
    acknowledging failed or canceled 89
    begin at step 90, 173, 248
    canceling 92
    defined 22
    displaying status 78
    history retention 57
    how to run 90
    last run of all 78
    multiple jobs 77
    multiple jobs, status of 83
    overriding step attributes 91
    resolve problems 80
    resuming canceled or failed 91
    run type *USER 90
    run type other than *USER 90
    status 80
    status history of a 79
    step status 83

procedure history
    change criteria 31

procedures 77
    change history retention criteria 31
    history retention 31

processes
    system level 149

production system 18
publications, IBM 13

Q
QDFTJRN data area
    restrictions 232
    role in processing new objects 232

QSTRUPPGM system value 311, 313

R
recommendations
    auditing 126
    before planned switch 245
    checking audit results 127
    policies in bi-directional environment 27
    policies in three or more node environment 27
    starting shared object send 175

recoveries
    active in installation 99
    definition 18, 160
    detected database errors 33
    displaying details 164
    occurring in installation 164
    orphaned 167
    orphaned, identifying 167
    orphaned, removing 168

recovery domain
    changing backup sequence 71
    verifying sequence 70


recovery phase 123
recovery point
    cleared by ENDDG 190
release (*RLS)
    held file entry 215
    held tracking entry 223

release clear (*RLSCLR)
    file entry 216
    tracking entry 223

release wait (*RLSWAIT)
    file entry 215
    tracking entry 222

remote journal
    i5/OS function 19

remote journal (RJ) link 20
remote journal environment
    processes ended by ENDDG 203
    processes started by STRDG 199
    unconfirmed journal entry 286

removing
    activity history entries 229
    duplicate tracking entries 223
    unconfirmed entries 286

reorganizing, active files 263
replication
    automatic error recovery 32
    backlogs, identifying 115
    before starting 171
    commands for ending 184
    commands for starting 171
    direction of 18
    ending 169
    policies that affect 32
    resolve replication errors 207
    starting 169
    status from 5250 emulator 95
    status summary 65, 68
    supported paths 15
    switching 244
    system journal 15, 21
    user journal 15, 21

replication path 21
replication, problems
    troubleshoot 276
    where to start 207

requirements
    audits 126
    journal at create 231
    journaling 231

resolving problems
    application group 64
    application group *ATTN status 65
    application group other problem status values 66
    common replication errors 207
    data resource group status 68
    node entry status 66
    system level jobs 149
    troubleshooting 276

resource group, data 68
    status 68

restore MIMIX
    prerequisites 327

restrictions
    journal at create 232
    QDFTJRN data area 232

retry objects in error 227
retrying, data group activity entries 227
RJ link 20
    displaying 264
    ending independently 274
    errors for target journal of 288
    identifying data groups that use 267
    journal definitions by an 268
    operating without a data group 274
    removing unconfirmed entries 286
    status 265
    when to end 188

rule
    #DGFE 39
    #DLOATR 39
    #FILATR 39
    #FILATRMBR 39
    #FILDTA 39
    #IFSATR 39
    #OBJATR 39

rules
    MIMIX 122

rules, MIMIX
    descriptions 39

run
    procedure 90

running
    audits immediately 131

S
schedule
    automatically submitted audits 37
    changing audit 41


scheduler
    auditing 124

servers
    ending TCP 261
    starting TCP 260

service
    cluster 21
    status collector 21

services
    collector, ending 153
    collector, starting 152
    status from 5250 emulator 96

severity level, notification 161
source system 18
spooled files, displaying MIMIX-created 262
standby journaling
    IBM i5/OS option 42 117
    overview 117

starting
    collector services 152
    MIMIX managers 152
    procedure at step 90, 173, 248
    RJ link independently 274
    system and journal managers 152
    target journal inspection 153
    TCP server 260

starting data group
    at specified journal location 175
    deploy configuration 174
    prevented by open commit cycles 183
    procedure 181
    processes, effect of 199
    set object auditing level 174
    when to clear entries 181

starting journaling
    data areas and data queues 241
    file entry 235
    files 235
    IFS objects 238
    IFS tracking entry 238
    object tracking entry 241

starting MIMIX
    included processes 171
    procedure 179

starting replication 169
    before 171
    choices 171

status 60
    active file operations 263
    application group 60
    audit compliance 145
    audits 129
    audits (runtime) 96
    checking from 5250 emulator 93
    collector services 67
    data group detail 105
    data group summary 95
    database apply (DBAPY) 113
    installation summary 93
    journal cache or state 117
    journaling data areas and data queues 241
    journaling files 235
    journaling IFS objects 238
    journaling tracking entries 238, 241
    monitors 61
    node entries 66
    notification 160
    notification new in installation 96
    notifications 63
    procedures 78
    recoveries active in installation 164
    replication 95
    replication, application group level 65
    replication, data resource group level 68
    replication, logical 68
    replication, PPRC 68
    RJ link 265
    services 96
    steps in a procedure 83
    switch compliance 253
    switching 104
    system and journal managers 67
    system-level processes 149
    target journal inspection processes 155
    Work with Data Groups display 99

step
    begin procedure at 90, 173, 248
    defined 22
    resolve problems 85
    status 83, 85

subsystem, MIMIXSBS
    ended 217
    ending 193
    starting 179

Switch Assistant, MIMIX 22
switch framework
    disable policy when not used 49
    specify a default 48

switching 244
    application group 250
    best practice 23, 244, 245, 253
    change audit level before 43, 53
    compliance 253
    conditions that end 257
    description, planned switch 245
    description, unplanned switch 246
    journal analysis after unplanned switch 295
    last switch field 253
    phases of a 245
    policies for 48
    problems checking compliance 254
    reasons for 245
    setting switch compliance policies 49
    setting switch framework policy 48
    switch framework vs. SWTDG command 249
    SWTDG command details 257
    unplanned, actions to complete an 246
    using option 6 on MIMIX Basic Main Menu 251
    using STRDG command 255

synchronize
    file entry 211
    objects, system journal replicated 226
    tracking entries 221

system definition 20
system journal replication 15, 21
    detailed status 110
    errors automatically recovered 34
    journaling requirements 231

system level processes 149
system manager 21
    backlog 151
    ending 152
    resolving problems 149, 151
    starting 152
    status 149

system roles
    management or network 19
    production or backup 18
    source or target 18

T
target journal inspection 22
    last entry inspected 158
    results 156
    starting 153
    status 149, 155

target system 18
threshold
    audit action 53
    audit warning 52
    CMPRCDCNT open commit 56
    switch action 56
    switch warning 56
    synchronize size 55
    user journal apply 51

timestamps 291
    automatically created 291
    creating additional 291
    deleting 293
    displaying 293
    printing 293

tips
    displaying data group spooled files 262
    displaying long IFS object names 262
    removing journaled changes 294
    working with active file operations 263

tracking entry 21
    file identifiers (FIDs) 273
    IFS 219
    incomplete 186
    not journaled 102
    object 219
    removing duplicate 223

transfer definition 20

U
unconfirmed journal entries, removing 286
unplanned switch 246
    performing journal analysis 295
unprocessed entries 196
upgrade
    hardware, no disk image change 319
    hardware, with a disk image change 321
    new hardware 321
    OS/400 312

user journal replication 15, 21
    detailed status 112
    errors automatically recovered 33
    journaling requirements 231
    non-file objects 271
    tracking entries 219
    tracking entry 21

V
verifying
    communications link 281, 282
    journaling, IFS tracking entries 240
    journaling, object tracking entries 243
    journaling, physical files 237
    key attributes 289

viewing status, active file operations 263

W
wait time, data group controlled end 187
wait time, data group controlled end during switch 257
