Restart Logic in DB2


Upload: dukkasrinivasflex

Post on 29-Nov-2014


Page 1: Restart Logic in DB2

Restart logic is especially important for batch programs that update large DB2 databases. I don't know how many of us use this logic in our programs, but I think every mainframe programmer should know something about it. As far as I know, tools such as SMART/RESTART are available for restart logic, but we can also implement it on our own as follows. If you find time, please go through this mail and let me know if you have more information on this.

Checkpoint / Restart Scenario

HERE'S THE SCENARIO: Suppose a batch program that reads an input file and posts updates/inserts/deletes to DB2 tables abended before the end of the job for some reason. Is it possible to tell:
- How many input records were processed?
- Were any of the updates committed to the database, or can the job be started from the beginning?

Assume that COMMIT logic was not coded for a large batch job that processes millions of records.
- If an ABEND occurs, all database updates are rolled back and the job can be resubmitted from the beginning.
- If the ABEND occurs near the end of the process, all the updates are still rolled back, so almost the entire run has to be repeated.
- DB2 also maintains a large number of locks for a long period of time, reducing concurrency in the system.
- In fact, the program may ABEND if it tries to acquire more than the installation-defined maximum number of locks.

In a base sysplex or Parallel Sysplex, a program without COMMIT logic causes excessive locking and excessive consumption of memory. This can no longer continue if DB2 data sharing is to provide workload balancing: such applications cause the coupling facility to be overcommitted with a large number of locks and huge storage requirements.

To avoid these difficulties, COMMIT-RESTART logic is recommended for all batch programs performing updates/inserts/deletes.

This involves setting up a batch-restart control table (CHECKPOINT_RESTART in our case) to store the last input record processed and other control information. The restart control table can also be used as an instrumentation table to control the execution, commit frequency, locking protocol and termination of batch jobs.

One of the problems with restart is synchronizing DB2 tables and output files. DB2 rolls back all work on DB2 tables to the last commit point, but for output files we have to delete all the records written after the last commit point ourselves. (One option is to do this via a global temporary table, FILE_POSITION_GTT. See the FILE RE-POSITIONING section for further details.)

COMMIT Function:

The COMMIT statement ends a unit of recovery and commits the relational database changes that were made in that unit of recovery. If relational databases are the only recoverable resources used by the application process, COMMIT also ends the unit of work. The unit of recovery in which the statement is executed is ended and a new unit of recovery is effectively started for the process. All changes made by ALTER, COMMENT ON, CREATE, DELETE, DROP, EXPLAIN, GRANT, INSERT, LABEL ON, RENAME, REVOKE and UPDATE statements executed during the unit of recovery are committed.

SQL connections are ended when any of the following apply:
- The connection is in the release pending state.
- The connection is not in the release pending state but it is a remote connection and:
    - the DISCONNECT(AUTOMATIC) bind option is in effect, or
    - the DISCONNECT(CONDITIONAL) bind option is in effect and an open WITH HOLD cursor is not associated with the connection.

For existing connections:
- All open cursors that were declared without the WITH HOLD option are closed.
- All open cursors that were declared with the WITH HOLD option are preserved, along with any SELECT statements that were prepared for those cursors.
- All other prepared statements are destroyed unless dynamic caching is enabled.
- If dynamic caching is enabled, all prepared SELECT, INSERT, UPDATE and DELETE statements that are bound with KEEPDYNAMIC(YES) are kept past the commit. Prepared statements cannot be kept past a commit if:
    - SQL RELEASE has been issued for that site, or
    - bind option DISCONNECT(AUTOMATIC) was used, or
    - bind option DISCONNECT(CONDITIONAL) was used and there are no hold cursors.
- All implicitly acquired locks are released, except for those required for the cursors that were not closed.
- All rows of every global temporary table of the application process are deleted.
- The rows of a global temporary table are not deleted if any program in the application process has an open WITH HOLD cursor that depends on that temporary table.
- In addition, if RELEASE(COMMIT) is in effect, the logical work files for those temporary tables whose rows are deleted are also deleted.

CHECKPOINT/RESTART LOGIC:

To allow an interrupted program to be restarted from the last unit of recovery (COMMIT), or at a point other than the beginning of the program, we need checkpoint/restart logic. Basically, we need:
- A place to store the details (the CHECKPOINT-COMMIT record) of the current execution of the program, such as the various counts (number of inserts/deletes/updates/selects), number of records processed, processing dates, and other details that the program needs after a RESTART.
- Reliable FILE RE-POSITIONING logic with minimal changes to the existing PROCJCL.
- Flexibility to modify the commit frequency without changing the program code.

Where can we store this CHECKPOINT-COMMIT record? We can store the CHECKPOINT-COMMIT record, the COMMIT-FREQUENCY and other relevant information in a DB2 table.

CHECKPOINT_RESTART TABLE DESCRIPTION:

Database    Table name            Tablespace             DCLGEN
DBMPDBII    CHECKPOINT_RESTART    DBMTS002 (MAXROW=1)    DBMDG002

COLUMN NAME       DCLGEN NAME       SIZE          DESCRIPTION
PROGRAM_NAME      PROGRAM-NAME      X(08)         Program name to identify
CALL_TYPE         CALL-TYPE         X(04)         Not used
CHECKPOINT_ID     CHECKPOINT-ID     X(08)         Not used
RESTART_IND       RESTART-IND       X(01)         Indicates that the program needs to be restarted
RUN_TYPE          RUN-TYPE          X(01)         Prime time or not
COMMIT_FREQ       COMMIT-FREQ       S9(9) COMP    Number of records between commits
COMMIT_SECONDS    COMMIT-SECONDS    S9(9) COMP    Number of seconds between commits
COMMIT_TIME       COMMIT-TIME       X(26)         Update timestamp
SAVE_AREA         SAVE-AREA-LEN     S9(4) COMP    Length of the commit record save area
                  SAVE-AREA-TEXT    X(4006)       Commit record save area

FILE RE-POSITIONING:

At restart, all records written to the output file since the last commit have to be removed. To accomplish this, the FILE_POSITION_GTT global temporary table is used.

SQL statements that use global temporary tables can run faster because:
- DB2 does not log changes to global temporary tables.
- Global temporary tables do not experience lock contention.
- DB2 creates an instance of the temporary table only for OPEN, SELECT, INSERT and DELETE statements.

An instance of a temporary table exists at the current server until one of the following actions occurs:
- The remote server connection under which the instance was created terminates.
- The unit of work under which the instance was created completes. For a ROLLBACK statement, DB2 deletes the instance of the temporary table. For a COMMIT statement, DB2 deletes the instance of the temporary table unless a cursor for accessing the temporary table is defined WITH HOLD and is open.
- The application process ends.

File re-positioning logic:
- Open the output file in INPUT mode.
- INSERT the records from the output file into the FILE_POSITION_GTT global temporary table, up to the last record that had been written at the time of the last commit.
- Close the output file.
- Open the output file in OUTPUT mode.
- FETCH all rows from the FILE_POSITION_GTT global temporary table and write them to the output file.
- At the next commit, the rows of the FILE_POSITION_GTT global temporary table are deleted automatically.

FILE_POSITION_GTT Global Temp Table:

Database    Table name           Tablespace    DCLGEN
DSNDB06     FILE_POSITION_GTT    SYSPKAGE      DSNDG006

COLUMN NAME      DCLGEN NAME               SIZE          DESCRIPTION
RECORD_NUMBER    FPG-RECORD-NUMBER         S9(9) COMP    Record number
RECORD_DETAIL    FPG-RECORD-DETAIL-LEN     S9(4) COMP    Output file length
                 FPG-RECORD-DETAIL-TEXT    X(4000)       Output file record information

CHECKPOINT/RESTART Implementation:

STEP1: Create the CHECKPOINT-COMMIT record in the WORKING-STORAGE SECTION to hold the data that is needed for the next unit of recovery.

STEP2: In the MAIN paragraph of the PROCEDURE DIVISION, first check the restart status flag, i.e. RESTART-IND of the CHECKPOINT_RESTART table.

    If RESTART-IND = 'N' then
        if any output file exists
            open the output file in OUTPUT mode
        start the normal process
    end
    If RESTART-IND = 'Y' then
        Move the SAVE-AREA information to the CHECKPOINT-COMMIT record
        if any output file exists
            do the FILE REPOSITION:
                Open the output file in INPUT mode.
                Repeatedly
                    read the output record and INSERT it into the global temp table FILE_POSITION_GTT
                until the write count of the last unit of recovery is reached.
                Close the output file.
                Open the output file in OUTPUT mode.
                Open a cursor on the table FILE_POSITION_GTT.
                Repeatedly
                    fetch from the cursor and write the record information into the output file
                until end of cursor.
                Close the cursor.
        end
        If input for the program is from a cursor then skip the rows until COMMIT-KEY.
        If input for the program is from a file then skip the records until COMMIT-KEY.
    End.

Note: If there is more than one output file, delete the rows of the GTT after repositioning each output file.

STEP3: Count each insert/update/delete in the RECORDS-PROCESSED-UOR variable.

STEP4: Go through the logic and find the appropriate place where COMMIT WORK can be issued. There, check the commit frequency:

    IF RECORDS-PROCESSED-UOR > COMMIT-FREQ
        MOVE key (input) value of the program    TO COMMIT-KEY
        MOVE checkpoint-commit record length     TO SAVE-AREA-LEN
        MOVE checkpoint-commit record            TO SAVE-AREA-TEXT
        Update the CHECKPOINT_RESTART table with this information and issue COMMIT WORK
    END-IF

STEP5: Before the STOP RUN statement, reset the restart flag of the CHECKPOINT_RESTART table, i.e.

    MOVE 'N' TO RESTART-IND
    Update the CHECKPOINT_RESTART table with the above information.
Sample COBOL code for CHECKPOINT/RESTART Logic:

CHECKPOINT-COMMIT RECORD DEFINITION:

      ************************************************************
      * GLOBAL TEMPORARY TABLE CURSOR DECLARATION & OPEN
      ************************************************************
           EXEC SQL
               DECLARE FPG-FPOS CURSOR FOR
               SELECT RECORD_NUMBER
                     ,RECORD_DETAIL
               FROM FILE_POSITION_GTT
               ORDER BY RECORD_NUMBER
           END-EXEC.
      ************************************************************
      * CHECK-POINT RESTART DATA DEFINITIONS
      ************************************************************
       01  COMMIT-REC.

           02  FILLER              PIC X(16) VALUE 'REC. PROCESSED: '.
           02  COMMIT-KEY          PIC 9(06) VALUE 0.
           02  FILLER              PIC X(14) VALUE 'TOTAL COUNTS: '.
           02  COMMIT-COUNTS.
               03  WS-REC-READ     PIC 9(06) VALUE 0.
               03  WS-REC-REJT     PIC 9(06) VALUE 0.
               03  WS-REC-WRIT     PIC 9(06) VALUE 0.
               03  WS-RECP-READ    PIC 9(06) VALUE 0.
               03  WS-RECP-UPDT    PIC 9(06) VALUE 0.
       01  CHKPRSL-VARS.
           02  RECORDS-PROCESSED-UOR PIC S9(09) COMP VALUE +0.
      ************************************************************
      * CHECK POINT RESTART LOGIC SECTION
      ************************************************************
       RESTART-CHECK.
           MOVE 'XXXXXX' TO PROGRAM-NAME.
           PERFORM RESTART-SELECT.
           IF RESTART-IND = 'Y'
               MOVE SAVE-AREA-TEXT TO COMMIT-REC
      *        If input is from a cursor, skip rows until COMMIT-KEY.
      *        If input is from a file, skip records until COMMIT-KEY.
           END-IF.
      ************************************************************
      * CHECK RESTART STATUS
      ************************************************************
       RESTART-SELECT.
           MOVE 0 TO RECORDS-PROCESSED-UOR.
           EXEC SQL
               SELECT RESTART_IND
                     ,COMMIT_FREQ
                     ,RUN_TYPE
                     ,SAVE_AREA
               INTO  :RESTART-IND
                     ,:COMMIT-FREQ
                     ,:RUN-TYPE
                     ,:SAVE-AREA
               FROM CHECKPOINT_RESTART
               WHERE PROGRAM_NAME = :PROGRAM-NAME
           END-EXEC.
           EVALUATE SQLCODE

               WHEN 0
                   IF RESTART-IND = 'Y'
                       DISPLAY
                   '* * * * * * * * * * * * * * * * * * * * * * * * * *'
                       DISPLAY '***PROGRAM - ' PROGRAM-NAME
                               ' RESTARTED***'
                       DISPLAY
                   '* * * * * * * * * * * * * * * * * * * * * * * * * *'
                       DISPLAY ' '
                   END-IF
               WHEN 100
                   PERFORM RESTART-INSERT
               WHEN OTHER
                   MOVE 'RESTART-SELECT' TO WS-PARA-NAME
                   MOVE 'CHECKPOINT_RESTART SELECT ERR' TO WS-PARA-MSG
                   PERFORM EXCEPTION-ROUTINE
           END-EVALUATE.
      /
      ************************************************************
      * INSERT THE NEW RESTART STATUS RECORD
      ************************************************************
       RESTART-INSERT.
           MOVE SPACES TO CALL-TYPE.
           MOVE SPACES TO CHECKPOINT-ID.
           MOVE 'N'    TO RESTART-IND.
           MOVE 'B'    TO RUN-TYPE.
           MOVE +500   TO COMMIT-FREQ.
           MOVE ZEROES TO COMMIT-SECONDS.
           MOVE +4006  TO SAVE-AREA-LEN.
           MOVE SPACES TO SAVE-AREA-TEXT.
           EXEC SQL
               INSERT INTO CHECKPOINT_RESTART
               ( PROGRAM_NAME
                ,CALL_TYPE
                ,CHECKPOINT_ID
                ,RESTART_IND
                ,RUN_TYPE
                ,COMMIT_FREQ
                ,COMMIT_SECONDS
                ,COMMIT_TIME
                ,SAVE_AREA
               )
               VALUES
               ( :PROGRAM-NAME
                ,:CALL-TYPE

                ,:CHECKPOINT-ID
                ,:RESTART-IND
                ,:RUN-TYPE
                ,:COMMIT-FREQ
                ,:COMMIT-SECONDS
                ,CURRENT TIMESTAMP
                ,:SAVE-AREA
               )
           END-EXEC.
           EVALUATE SQLCODE
               WHEN 0
                   CONTINUE
               WHEN OTHER
                   MOVE 'RESTART-INSERT' TO WS-PARA-NAME
                   MOVE 'CHECKPOINT_RESTART INSERT' TO WS-PARA-MSG
                   PERFORM EXCEPTION-ROUTINE
           END-EVALUATE.
      /
      ************************************************************
      * UPDATE THE CHECKPOINT RECORD
      ************************************************************
       RESTART-COMMIT.
           MOVE 'Y' TO RESTART-IND.
           EXEC SQL
               UPDATE CHECKPOINT_RESTART
               SET RESTART_IND = :RESTART-IND
                  ,SAVE_AREA   = :SAVE-AREA
                  ,COMMIT_TIME = CURRENT TIMESTAMP
               WHERE PROGRAM_NAME = :PROGRAM-NAME
           END-EXEC.
           EVALUATE SQLCODE
               WHEN 0
                   EXEC SQL COMMIT WORK END-EXEC
                   EVALUATE SQLCODE
                       WHEN 0
                           CONTINUE
                       WHEN OTHER
                           MOVE 'RESTART-COMMIT' TO WS-PARA-NAME
                           MOVE 'COMMIT ERROR' TO WS-PARA-MSG
                           PERFORM EXCEPTION-ROUTINE
                   END-EVALUATE
                   MOVE 0 TO RECORDS-PROCESSED-UOR

               WHEN OTHER
                   MOVE 'RESTART-COMMIT' TO WS-PARA-NAME
                   MOVE 'CHECKPOINT_RESTART UPDATE ERR' TO WS-PARA-MSG
                   PERFORM EXCEPTION-ROUTINE
           END-EVALUATE.
      ************************************************************
      * RESET THE RESTART FLAG AT THE END OF PROGRAM
      ************************************************************
       RESTART-RESET.
           MOVE 0 TO RECORDS-PROCESSED-UOR.
           MOVE 'N' TO RESTART-IND.
           EXEC SQL
               UPDATE CHECKPOINT_RESTART
               SET RESTART_IND = :RESTART-IND
                  ,COMMIT_TIME = CURRENT TIMESTAMP
               WHERE PROGRAM_NAME = :PROGRAM-NAME
           END-EXEC.
           EVALUATE SQLCODE
               WHEN 0
                   EXEC SQL COMMIT WORK END-EXEC
               WHEN OTHER
                   MOVE 'RESTART-RESET' TO WS-PARA-NAME
                   MOVE 'CHECKPOINT_RESTART UPDATE ERR' TO WS-PARA-MSG
                   PERFORM EXCEPTION-ROUTINE
           END-EVALUATE.
      /
      ************************************************************
      * OUTPUT FILE REPOSITION LOGIC SECTION
      ************************************************************
      ************************************************************
      * GLOBAL TEMPORARY TABLE CURSOR DECLARATION & OPEN
      ************************************************************
       FPG-OPEN.
           EXEC SQL
               OPEN FPG-FPOS
           END-EXEC.
           EVALUATE SQLCODE
               WHEN 0
                   CONTINUE
               WHEN OTHER

                   MOVE 'FPG-OPEN' TO WS-PARA-NAME
                   MOVE 'GLOBAL TEMP TABLE OPEN ERR' TO WS-PARA-MSG
                   PERFORM EXCEPTION-ROUTINE
           END-EVALUATE.
      ************************************************************
      * GLOBAL TEMPORARY TABLE CURSOR FETCH
      ************************************************************
       FPG-FETCH.
           EXEC SQL
               FETCH FPG-FPOS
               INTO :FPG-RECORD-NUMBER
                   ,:FPG-RECORD-DETAIL
           END-EXEC.
           EVALUATE SQLCODE
               WHEN 0
                   CONTINUE
               WHEN +100
                   MOVE 0 TO FPG-RECORD-NUMBER
               WHEN OTHER
                   MOVE 'FPG-FETCH' TO WS-PARA-NAME
                   MOVE 'GLOBAL TEMP TABLE FETCH ERR' TO WS-PARA-MSG
                   PERFORM EXCEPTION-ROUTINE
           END-EVALUATE.
      ************************************************************
      * GLOBAL TEMPORARY TABLE CURSOR CLOSE
      ************************************************************
       FPG-CLOSE.
           EXEC SQL
               CLOSE FPG-FPOS
           END-EXEC.
           EVALUATE SQLCODE
               WHEN 0
                   MOVE 0 TO FPG-RECORD-NUMBER
               WHEN OTHER
                   MOVE 'FPG-FPOS-CLOSE' TO WS-PARA-NAME
                   MOVE 'GLOBAL TEMP TABLE CLOSE ERR' TO WS-PARA-MSG
                   PERFORM EXCEPTION-ROUTINE
           END-EVALUATE.
      ************************************************************
      * GLOBAL TEMPORARY TABLE INSERTS
      ************************************************************
       FPG-INSERT.

           ADD 1 TO FPG-RECORD-NUMBER.
           EXEC SQL
               INSERT INTO FILE_POSITION_GTT
               ( RECORD_NUMBER
                ,RECORD_DETAIL
               )
               VALUES
               ( :FPG-RECORD-NUMBER
                ,:FPG-RECORD-DETAIL
               )
           END-EXEC.
           EVALUATE SQLCODE
               WHEN 0
                   CONTINUE
               WHEN OTHER
                   MOVE 'FPG-INSERT' TO WS-PARA-NAME
                   MOVE 'GLOBAL TEMP TABL INSERT ERR' TO WS-PARA-MSG
                   PERFORM EXCEPTION-ROUTINE
           END-EVALUATE.
      /
       RESTART-FILE-REPOSITION.
      *    output-file, output-record and last-commit-write-count are
      *    placeholders: substitute the program's own output file, its
      *    record, and the output record count as of the last commit.
           OPEN INPUT output-file.
           MOVE LENGTH OF output-record TO FPG-RECORD-DETAIL-LEN.
      *    Start the GTT record numbering from zero.
           MOVE 0 TO FPG-RECORD-NUMBER.
           READ output-file INTO FPG-RECORD-DETAIL-TEXT.
           PERFORM UNTIL FPG-RECORD-NUMBER >= last-commit-write-count
               PERFORM FPG-INSERT
               READ output-file INTO FPG-RECORD-DETAIL-TEXT
           END-PERFORM.
           CLOSE output-file.
           OPEN OUTPUT output-file.
           PERFORM FPG-OPEN.
           PERFORM FPG-FETCH.
           PERFORM UNTIL FPG-RECORD-NUMBER = 0
               WRITE output-record FROM FPG-RECORD-DETAIL-TEXT
               PERFORM FPG-FETCH
           END-PERFORM.
           PERFORM FPG-CLOSE.
      *    Skip the input file until the last commit.
           DISPLAY '*** ALREADY ' COMMIT-KEY ' RECORDS PROCESSED ***'.
           DISPLAY ' '

           DISPLAY ' '.
      /
      ************************************************************
      * E X C E P T I O N   R O U T I N E
      ************************************************************
       EXCEPTION-ROUTINE.
           MOVE SQLCODE TO WS-SQL-RET-CODE.
           DISPLAY '*************************************************'.
           DISPLAY '****      E R R O R   M E S S A G E S        ****'.
           DISPLAY '*************************************************'.
           DISPLAY '*ERROR IN PARA....: ' WS-PARA-NAME.
           DISPLAY '*MESSAGES.........: ' WS-PARA-MSG.
           DISPLAY '*'.
           DISPLAY '*SQL RETURN CODE..: ' WS-SQL-RET-CODE.
           DISPLAY '*************************************************'.
           CALL CDCABEND USING ABEND-CODE.

Output file Disposition in JCL:
- In the JCL, the disposition must be given as DISP=(NEW,CATLG,CATLG) or DISP=(OLD,KEEP,KEEP).
- An override statement is needed for the output files if the job abended:
    1. GDG with DISP=(NEW,CATLG,CATLG). Override statement:
        - Change the +1 generation to the 0 (current) generation
        - DISP=(OLD,KEEP,KEEP)
    2. GDG with DISP=(OLD,KEEP,KEEP). Override statement:
        - Change the +1 generation to the 0 (current) generation

Output file with Disposition MOD:
- If the output file already exists and the program is appending records to it, the file repositioning must be handled in a different way, according to the requirements.

Internal Sort:
- If a commit-restart program has an internal sort, remove it and use an external sort instead.

POINTS TO REMEMBER
- All update programs must take the COMMIT frequency from the CHECKPOINT_RESTART table only.
- Avoid internal sorts.
- Avoid mass updates. (Instead, use a cursor with the FOR UPDATE clause and update one record at a time.)
- The on-call analyst should back up all the output files before a restart. (The procedure should be documented in APCDOC.)
- Reports for dispatch should be written to a flat file; send the file to dispatch upon successful completion of the job.
- Save only the working storage variables that are required for RESTART in the CHECKPOINT_RESTART table.
- RESET the RESTART_IND flag at the end of the program.
- If COMMIT-RESTART logic is introduced in an existing program, make the relevant changes to the PROCJCL.
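The file re-positioning logic from the FILE RE-POSITIONING section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a plain Python list stands in for the FILE_POSITION_GTT global temporary table, the output file is assumed to hold one record per line, and `reposition_output` and `committed_count` are names invented for the example (the count would come from the saved commit record).

```python
def reposition_output(path, committed_count):
    """Trim an output file back to the records written as of the last commit.

    Returns the number of records kept.
    """
    kept = []  # stands in for FILE_POSITION_GTT
    # Open the output file in INPUT mode and copy the committed records.
    with open(path, "r", encoding="utf-8") as out_file:
        for line in out_file:
            if len(kept) >= committed_count:
                break  # everything after this was written after the commit
            kept.append(line)
    # Reopen the same file in OUTPUT mode and write the saved records back.
    with open(path, "w", encoding="utf-8") as out_file:
        out_file.writelines(kept)
    return len(kept)
```

After this call the program can continue writing from record `committed_count + 1`, so the output file stays in step with the DB2 tables that were rolled back to the same commit point.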

I have coded a COBOL program that includes checkpoint logic. The JCL that runs this program unloads records from a table into a flat file, which is then used as input to the program. The program simply reads this input flat file and writes to the output flat file specified in the JCL.

After the unload, the input to the program has 333 records. If the program abends at the 50th record while it is running, the checkpoint logic in my program captures the record that was processed successfully just before the abend. (We specify a FREQUENCY as input to the checkpoint logic, e.g. FREQ(002).) An entry is made in a table called CKPT-TABLE; with FREQ(002), the 48th record is the one captured in this table.

So when we RESTART the program after fixing the problem with the 50th record that caused the abend, the program runs from the 49th record. (FREQ(002) means that after every two records we COMMIT to CKPT-TABLE to capture that record. That is, during processing every 2nd record is captured in the table: when the program reaches the 2nd record it is saved in the table; when it reaches the 4th record, the 2nd record is replaced by the 4th in CKPT-TABLE; and in the same way the 48th record was stored in the table.)

My question is: when we restart the program, how does the checkpoint logic know that it has to start from the 49th record?

According to a document I referred to, after every COMMIT the records are stored in something called the DB2 buffer pool. In the example above, the 48th record was committed, so all 48 records would be stored in the buffer pool. The records would not be written to the output file specified in the JCL until all 333 records had been processed successfully; until then they would keep accumulating in the buffer pool.

So how does the checkpoint logic know that it has to start from the 49th record? Does it refer to the buffer pool or to the input file when restarting?

Q.2) HOW TO SET THE RESTART LOGIC IN DB2? Most shops have some kind of restart table. When you start the program, make an entry in that table with your program name, job name, table name and counters. Whenever you commit, update this entry, setting the counter to the number of rows you have committed so far. For example, if your table has 200 rows and you commit after processing 35 of them, put 35 in the counter, along with the other information such as program name and table name. Once the whole execution completes, remember to delete this row. Whenever the program runs, it should first read that table: if there is an entry for that particular program, this is not the first run (i.e. it is a restart); otherwise it is a fresh execution. If the entry is present, skip as many records as the counter column indicates and process the rest.
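To make the counter approach concrete, here is a small Python sketch using the numbers from the question above: 333 input records, a commit every 2 records, and an abend at record 50. A dictionary stands in for the restart table, and `run_job`, `Abend` and `fail_at` are hypothetical names introduced for the example. It tracks only the counter; in a real job DB2 would also roll back the uncommitted work on record 49.

```python
class Abend(Exception):
    """Simulated abnormal end of the job."""


def run_job(total_records, restart_table, program, freq, fail_at=None):
    """Process records 1..total_records and return the record numbers handled."""
    if program not in restart_table:
        restart_table[program] = 0       # fresh run: make the entry
    committed = restart_table[program]   # restart: rows already committed
    handled = []
    # Skip as many input records as the counter says, then process the rest.
    for number in range(committed + 1, total_records + 1):
        if number == fail_at:
            raise Abend(f"abend at record {number}")
        handled.append(number)
        if number - restart_table[program] >= freq:
            restart_table[program] = number  # counter updated at each COMMIT
    del restart_table[program]           # normal end: delete the row
    return handled
```

With `freq=2` and `fail_at=50`, the counter is left at 48 when the job abends, so the rerun skips 48 input records and resumes at record 49. The restart position comes from the counter saved in the restart table at the last commit, applied against the input file.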

Checkpoint Restart in DB2 Part - II

The first part : Checkpoint Restart in DB2 Part - I

In first part we understood what is check point restart and why we use it. We also covered the problem associated with Check point restart and solutions to those problems.

Now, in this post we will see the step by step implementation of check point restart logic.

CHECKPOINT/RESTART Implementation:

STEP1: 

Create the CHECKPOINT-COMMIT record in the working storage section, to store 

Page 14: Restart Logic in DB2

the data, which is needed for the next unit of recovery. 

STEP2:

In the procedure division MAIN para: First check the restart status flag i.e. RESTART-IND of CHECKPOINT_RESTART table. 

If RESTART-IND = ‘N’ then   if any output file exists  open output file in OUTPUT mode    start the normal process end If RESTART-IND = ‘Y’ then Move the SAVE-AREA information to CHECKPOINT-COMMIT record  if any output file exists   do the FILE REPOSITION:     Open the output file in INPUT mode.     Repeatedly        Read the output record and INSERT it into GLOBAL temp table       FILE_POSITION_GTT     Until the last unit of recovery write count.     Close the output file.    Open the output file in OUTPUT mode. open a cursor for a table FILE_POSITION_GTTrepeatedly fetch a cursor and write the record information into the output file  until end of cursorclose a cursor end       If input for the program is from cursor then skip the rows until COMMIT-KEY.       If input for the program is from file then skip the records until COMMIT-KEY. End. 

Note: For more than one output files, delete GTT after repositioning each output file.

STEP3: 

Make a count for each  Insert’s/Update’s/Deletes in  RECORDS-PROCESSEDUOR variable. 

STEP4:

Go thro’ the logic and find out the appropriate place where COMMIT WORK can be hosted. There check the frequency of COMMITS: 

IF RECORDS-PROCESSED-UOR > COMMIT-FREQ

    KEY (input) value of the program                 TO COMMIT-KEY       

Page 15: Restart Logic in DB2

    MOVE checkpoint-commit record length  TO SAVE-AREA-LEN     MOVE checkpoint-commit record       TO SAVE-AREA-TEXT     Update the CHECKPOINT_RESTART table with this information 

END-COMMIT 

STEP5: 

Before STOP RUN statement; reset the RESTART flag of the CHECKPOINT_RESTART table. i.e. MOVE ‘N’ TO RESTART-IND     Update the CHECKPOINT_RESTART table with the above information.

Sample COBOL code for CHECKPOINT/RESTART Logic:  

CHECKPOINT-COMMIT RECORD DEFINITION

 **************************************************************** GLOBAL TEMPORARY TABLE CURSOR DECLARATION & OPEN *****************************************************************

EXEC SQL 

   DECLARE FPG-FPOS CURSOR FOR    SELECT RECORD_NUMBER ,RECORD_DETAIL    FROM FILE_POSITION_GTT    ORDER BY RECORD_NUMBEREND-EXEC. 

******************************************************************* CHECK-POINT RESTART DATA DEFINITIONS  ***** **************************************************************

01 COMMIT-REC. 02 FILLER  PIC X(16) VALUE 'REC. PROCESSED: '.02 COMMIT-KEY  PIC 9(06) VALUE 0. 02 FILLER  PIC X(14) VALUE 'TOTAL COUNTS: '.02 COMMIT-COUNTS.                                                                 03 WS-REC-READ           PIC 9(06) VALUE 0.                    03 WS-REC-REJT             PIC 9(06) VALUE 0.                    03 WS-REC-WRIT            PIC 9(06) VALUE 0.                    03 WS-RECP-READ         PIC 9(06) VALUE 0.                    03 WS-RECP-UPDT         PIC 9(06) VALUE 0. 01 CHKPRSL-VARS. 02 RECORDS-PROCESSED-UOR PIC S9(09) COMP VALUE +0. 

Page 16: Restart Logic in DB2

************************************************************** *****  CHECK POINT RESTART LOGIC SECTION   ***** ************************************************************** 

RESTART-CHECK.             MOVE 'XXXXXX  ' TO PROGRAM-NAME.                              PERFORM RESTART-SELECT. IF RESTART-IND = 'Y' MOVE SAVE-AREA-TEXT TO COMMIT-REC    If input is from cursor the skip until the commit-key    If input is from file then skip the records until the commit-key                          END-IF.

************************************************** ***** CHECK RESTART STATUS  ***** ************************************************** 

RESTART-SELECT. 

MOVE 0 TO RECORD-PROCESSED-UOR. 

EXEC SQL 

   SELECT RESTART_IND ,COMMIT_FREQ ,RUN_TYPE ,SAVE_AREA    INTO :RESTART-IND ,:COMMIT-FREQ ,:RUN-TYPE ,:SAVE-AREA    FROM CHECKPOINT_RESTART    WHERE PROGRAM_NAME = :PROGRAM-NAME                      END-EXEC. 

EVALUATE SQLCODE 

    WHEN 0                                                                 IF RESTART-IND = 'Y'            DISPLAY '* * * * * * * * * * * * * * * * * * * * * * * * * **********'            DISPLAY ' ***PROGRAM - ' PROGRAM-NAME ' RESTARTED***'            DISPLAY '* * * * * * * * * * * * * * * * * * * * * * * * * **********'             DISPLAY ' '         END-IF 

    WHEN 100                                                               PERFORM RESTART-INSERT

     WHEN OTHER                                                    

Page 17: Restart Logic in DB2

        MOVE 'RESTART-SELECT  ' TO WS-PARA-NAME           MOVE 'CHECKPOINT_RESTART SELECT ERR' TO WS-PARA-MSG            PERFORM EXCEPTION-ROUTINE END-EVALUATE. 

/   ************************************************************** ***** INSERT THE NEW RESTART STATUS RECORD ***** ************************************************************** 

RESTART-INSERT. MOVE SPACES  TO CALL-TYPE. MOVE SPACES  TO CHECKPOINT-ID.             MOVE 'N'  TO RESTART-IND.             MOVE 'B'  TO RUN-TYPE.             MOVE +500  TO COMMIT-FREQ.                                    MOVE ZEROES              TO COMMIT-SECONDS.                                 MOVE +4006               TO SAVE-AREA-LEN.                      MOVE SPACES  TO SAVE-AREA-TEXT. EXEC SQL     INSERT INTO CHECKPOINT_RESTART     (      PROGRAM_NAME ,CALL_TYPE ,CHECKPOINT_ID ,RESTART_IND     ,RUN_TYPE,COMMIT_FREQ ,COMMIT_SECONDS ,COMMIT_TIME      ,SAVE_AREA      ) VALUES    (    :PROGRAM-NAME ,:CALL-TYPE ,:CHECKPOINT-ID ,:RESTART-IND ,:RUN-TYPE    ,:COMMIT-FREQ  ,:COMMIT-SECONDS, CURRENT TIMESTAMP ,:SAVE-AREA    ) 

END-EXEC. 

EVALUATE SQLCODE    WHEN 0                                                                  CONTINUE    WHEN OTHER                                                               MOVE 'RESTART-INSERT  ' TO WS-PARA-NAME                  MOVE 'CHECKPOINT_RESTART INSERT' TO WS-PARA-MSG                   PERFORM EXCEPTION-ROUTINE END-EVALUATE.

********************************************************** ***** UPDATE THE CHECKPOINT RECORD ***** ********************************************************** 

Page 18: Restart Logic in DB2

       RESTART-COMMIT.
           MOVE 'Y' TO RESTART-IND.
           EXEC SQL
               UPDATE CHECKPOINT_RESTART
                  SET RESTART_IND = :RESTART-IND
                     ,SAVE_AREA   = :SAVE-AREA
                     ,COMMIT_TIME = CURRENT TIMESTAMP
                WHERE PROGRAM_NAME = :PROGRAM-NAME
           END-EXEC.
           EVALUATE SQLCODE
               WHEN 0
                   EXEC SQL COMMIT WORK END-EXEC
                   EVALUATE SQLCODE
                       WHEN 0
                           CONTINUE
                       WHEN OTHER
                           MOVE 'RESTART-COMMIT' TO WS-PARA-NAME
                           MOVE 'COMMIT ERROR'   TO WS-PARA-MSG
                           PERFORM EXCEPTION-ROUTINE
                   END-EVALUATE
                   MOVE 0 TO RECORD-PROCESSED-UOR
               WHEN OTHER
                   MOVE 'RESTART-COMMIT' TO WS-PARA-NAME
                   MOVE 'CHECKPOINT_RESTART UPDATE ERR' TO WS-PARA-MSG
                   PERFORM EXCEPTION-ROUTINE
           END-EVALUATE.

      *******************************************************************
      *****    RESET THE RESTART FLAG AT THE END OF PROGRAM         *****
      *******************************************************************

       RESTART-RESET.
           MOVE 0   TO RECORD-PROCESSED-UOR.
           MOVE 'N' TO RESTART-IND.
           EXEC SQL
               UPDATE CHECKPOINT_RESTART
                  SET RESTART_IND = :RESTART-IND
                     ,COMMIT_TIME = CURRENT TIMESTAMP
                WHERE PROGRAM_NAME = :PROGRAM-NAME
           END-EXEC.
           EVALUATE SQLCODE
               WHEN 0
                   EXEC SQL COMMIT WORK END-EXEC
               WHEN OTHER
                   MOVE 'RESTART-RESET' TO WS-PARA-NAME
                   MOVE 'CHECKPOINT_RESTART UPDATE ERR' TO WS-PARA-MSG
                   PERFORM EXCEPTION-ROUTINE
           END-EVALUATE.
      /
      ******************************************************************
      *****        OUTPUT FILE REPOSITION LOGIC SECTION            *****
      ******************************************************************

      ************************************************************************
      *****  GLOBAL TEMPORARY TABLE CURSOR DECLARATION & OPEN           *****
      ************************************************************************
       FPG-OPEN.
           EXEC SQL
               OPEN FPG-FPOS
           END-EXEC.
           EVALUATE SQLCODE
               WHEN 0
                   CONTINUE
               WHEN OTHER
                   MOVE 'FPG-OPEN'                    TO WS-PARA-NAME
                   MOVE 'GLOBAL TEMP TABLE OPEN  ERR' TO WS-PARA-MSG
                   PERFORM EXCEPTION-ROUTINE
           END-EVALUATE.

      ****************************************************************
      *****      GLOBAL TEMPORARY TABLE CURSOR FETCH             *****
      ****************************************************************
       FPG-FETCH.
           EXEC SQL
               FETCH FPG-FPOS
                INTO :FPG-RECORD-NUMBER
                    ,:FPG-RECORD-DETAIL
           END-EXEC.
           EVALUATE SQLCODE
               WHEN 0
                   CONTINUE
               WHEN +100
                   MOVE 0 TO FPG-RECORD-NUMBER
               WHEN OTHER
                   MOVE 'FPG-FETCH '                  TO WS-PARA-NAME
                   MOVE 'GLOBAL TEMP TABLE FETCH ERR' TO WS-PARA-MSG
                   PERFORM EXCEPTION-ROUTINE
           END-EVALUATE.

      ****************************************************************
      *****      GLOBAL TEMPORARY TABLE CURSOR CLOSE             *****
      ****************************************************************
       FPG-CLOSE.
           EXEC SQL
               CLOSE FPG-FPOS
           END-EXEC.
           EVALUATE SQLCODE
               WHEN 0
                   MOVE 0 TO FPG-RECORD-NUMBER
               WHEN OTHER
                   MOVE 'FPG-CLOSE'                   TO WS-PARA-NAME
                   MOVE 'GLOBAL TEMP TABLE CLOSE ERR' TO WS-PARA-MSG
                   PERFORM EXCEPTION-ROUTINE
           END-EVALUATE.

      ***********************************************************
      *****      GLOBAL TEMPORARY TABLE INSERTS             *****
      ***********************************************************
       FPG-INSERT.
           ADD 1 TO FPG-RECORD-NUMBER.
           EXEC SQL
               INSERT INTO FILE_POSITION_GTT
               (
                 RECORD_NUMBER ,RECORD_DETAIL
               )
               VALUES
               (
                 :FPG-RECORD-NUMBER ,:FPG-RECORD-DETAIL
               )
           END-EXEC.
           EVALUATE SQLCODE
               WHEN 0
                   CONTINUE
               WHEN OTHER
                   MOVE 'FPG-INSERT  '                TO WS-PARA-NAME
                   MOVE 'GLOBAL TEMP TABL INSERT ERR' TO WS-PARA-MSG
                   PERFORM EXCEPTION-ROUTINE
           END-EVALUATE.
      /
       RESTART-FILE-REPOSITION.
      *    output-file and output-record are placeholders for the
      *    program's own file and record names.
      *    Copy the output records written up to the last commit
      *    into the global temporary table.
           OPEN INPUT output-file.
           MOVE LENGTH OF output-record TO FPG-RECORD-DETAIL-LEN.
           READ output-file INTO FPG-RECORD-DETAIL-TEXT.
           PERFORM UNTIL FPG-RECORD-NUMBER >=
                         output record count of last commit
               PERFORM FPG-INSERT
               READ output-file INTO FPG-RECORD-DETAIL-TEXT
           END-PERFORM.
           CLOSE output-file.
      *    Rewrite the output file from the temporary table.
           OPEN OUTPUT output-file.
           PERFORM FPG-OPEN.
           PERFORM FPG-FETCH.
           PERFORM UNTIL FPG-RECORD-NUMBER = 0
               WRITE output-record FROM FPG-RECORD-DETAIL-TEXT
               PERFORM FPG-FETCH
           END-PERFORM.
           PERFORM FPG-CLOSE.
      *--------- skip input file until the last commit ---------
           DISPLAY '  *** ALREADY ' COMMIT-KEY
                   ' RECORDS PROCESSED ***'.
           DISPLAY ' '.
           DISPLAY ' '.
      /
      ***********************************************************
      *****      E X C E P T I O N   R O U T I N E          *****
      ***********************************************************

       EXCEPTION-ROUTINE.
           MOVE SQLCODE TO WS-SQL-RET-CODE.
           DISPLAY '*************************************************'.
           DISPLAY '****  E R R O R   M E S S A G E S  ****'.
           DISPLAY '*************************************************'.
           DISPLAY '* ERROR IN PARA.....: ' WS-PARA-NAME.
           DISPLAY '*        MESSAGES.....: ' WS-PARA-MSG.
           DISPLAY '*'.
           DISPLAY '* SQL RETURN  CODE..: ' WS-SQL-RET-CODE.
           DISPLAY '*************************************************'.

Output file Disposition in JCL:  


♦  In JCL, the disposition must be given as DISP=(NEW,CATLG,CATLG) or DISP=(OLD,KEEP,KEEP).
♦  An override statement is needed for the output files if the job abended:
        1.  GDG with DISP=(NEW,CATLG,CATLG)
              Override stmt:
                 •  Change +1 generation to 0 (current) generation
                 •  DISP=(OLD,KEEP,KEEP)
        2.  GDG with DISP=(OLD,KEEP,KEEP)
              Override stmt:
                 •  Change +1 generation to 0 (current) generation

Output file with Disposition MOD:  

•  If the output file already exists and the program appends records to it, file repositioning must be handled differently, according to the requirements.

Internal Sort: 

If a commit-restart program uses an internal sort, remove it and use an external sort instead.

POINTS TO REMEMBER

♦  All the update programs must use the COMMIT frequency from the CHECKPOINT_RESTART table only.
♦  Avoid internal sorts.
♦  Avoid mass updates (instead, use a cursor with the FOR UPDATE clause and update one record at a time).
♦  The on-call analyst should back up all the output files before restart (the procedure should be documented in APCDOC).
♦  Reports to dispatch should be sent to a flat file; send the file to dispatch upon successful completion of the job.
♦  Save only the working-storage variables that are required for RESTART in the CHECKPOINT_RESTART table.
♦  RESET the RESTART_IND flag at the end of the program.
♦  If COMMIT-RESTART logic is introduced in an existing program, make the relevant changes to the PROCJCL.


Re: Checkpointing

Hi, this is the checkpoint/restart logic in DB2:

Scenario: a batch program reads an input file and posts updates/inserts/deletes to DB2 tables. If it abends before the end of the job, is it possible to tell how many records were processed? Were any of the updates committed, or does the job have to be started again from the beginning?

Assume that commit logic was not coded for large batch jobs that process millions of records. If an abend occurs, all database updates are rolled back and the job can be resubmitted from the beginning. Even if the abend occurs near the end of the process, all the updates are still rolled back. Also, DB2 will hold a large number of locks for a long period of time, reducing concurrency in the system. In fact, the program may abend if it tries to acquire more than the installation-defined maximum number of locks.

A program without commit logic causes excessive memory consumption, which works against workload balancing. Such applications cause the coupling facility to be overcommitted with a large number of locks and huge storage requirements. To avoid these difficulties, commit-restart logic is recommended for all batch programs that update the database. This involves setting up a batch-restart control table (checkpoint_restart) to store the last input record processed and other control information.

Checkpoint/restart logic: to allow an interrupted program to be restarted from the last unit of recovery (commit), or at a point other than the beginning of the program, we need checkpoint/restart logic. Basically, we need:

·  A place to store the details (checkpoint-commit record) pertaining to the current execution of the program, such as various counts (number of inserts/deletes/updates/selects), number of records processed, processing dates, and other details that the program needs after a restart.
·  Reliable file-repositioning logic with minimal changes to the existing PROCJCL.
·  Flexibility to modify the commit frequency without changing the program code.

Where can we store this checkpoint-commit record? We can store the checkpoint-commit record, the commit frequency and other relevant information in a DB2 table.

Checkpoint_restart table description:

database    tablename             tablespace            dclgen
dbmpdbii    checkpoint_restart    dbmts002 (maxrow=1    dbmdg002

column name       dclgen name       size          description
program_name      program-name      x(08)         program name to identify
call_type         call-type         x(04)         not used
checkpoint_id     checkpoint-id     x(08)         not used
restart_ind       restart-ind       x(01)         indicate that pgm needs to be restarted
run_type          run-type          x(01)         prime time or not
commit_freq       commit-freq       s9(9) comp    no. of records intervals to commit
commit_seconds    commit-seconds    s9(9) comp    no. of seconds intervals to commit
commit_time       commit-time       x(26)         update timestamp
save_area         save-area-len     s9(4) comp    length of commit record save area
                  save-area-text    x(4006)       commit record save area
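The column layout above maps onto a COBOL host structure roughly as sketched here. This is written by hand from the sizes in the table and is not actual DCLGEN output. The 01-level name, the PROGRAM-ID and the small PROCEDURE DIVISION are assumptions added only so the fragment compiles on its own. SAVE-AREA is shown as a group with two level-49 items, which is the usual DB2 COBOL shape for a variable-length column (a halfword length followed by the text).

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CKPTLAY.
      * Sketch only: host structure for CHECKPOINT_RESTART, built
      * by hand from the column table (not real DCLGEN output).
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  DCLCHECKPOINT-RESTART.
           05  PROGRAM-NAME        PIC X(08).
           05  CALL-TYPE           PIC X(04).
           05  CHECKPOINT-ID       PIC X(08).
           05  RESTART-IND         PIC X(01).
           05  RUN-TYPE            PIC X(01).
           05  COMMIT-FREQ         PIC S9(9) COMP.
           05  COMMIT-SECONDS      PIC S9(9) COMP.
           05  COMMIT-TIME         PIC X(26).
      *    Variable-length save area: length, then text.
           05  SAVE-AREA.
               49  SAVE-AREA-LEN   PIC S9(4) COMP.
               49  SAVE-AREA-TEXT  PIC X(4006).
       PROCEDURE DIVISION.
      *    Same initial length as RESTART-INSERT uses.
           MOVE +4006 TO SAVE-AREA-LEN.
           DISPLAY 'SAVE AREA CAPACITY: ' SAVE-AREA-LEN.
           GOBACK.
```

In the SQL statements the group is referenced as :SAVE-AREA, while the program fills SAVE-AREA-LEN and SAVE-AREA-TEXT separately.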

Checkpoint/restart implementation:

Step 1: Create the checkpoint-commit record in the working-storage section to hold the data that is needed for the next unit of recovery.

Step 2: In the procedure division main para, first check the restart status flag, i.e. restart-ind of the checkpoint_restart table.

   if restart-ind = 'N' then
      if any output file exists
         open the output file in output mode
      start the normal process
   end
   if restart-ind = 'Y' then
      move the save-area information to the checkpoint-commit record
      if any output file exists, do the file reposition:
         open the output file in input mode
         repeatedly read the output record and insert it into the global temp table file_position_gtt, up to the write count of the last unit of recovery
         close the output file
         open the output file in output mode
         open a cursor on the table file_position_gtt
         repeatedly fetch from the cursor and write the record to the output file until end of cursor
         close the cursor
      end
      if input for the program is from a cursor then skip the rows until commit-key
      if input for the program is from a file then skip the records until commit-key
   end

Note: for more than one output file, delete the rows from file_position_gtt after repositioning each output file.
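The Step 2 decision can be sketched in plain COBOL as follows. This is only an illustration of the control flow, under several assumptions: the values that a real program would SELECT from CHECKPOINT_RESTART are hard-coded, the input file is simulated by a counter, COMMIT-KEY is treated as a count of input records already processed, and the names RSTFLOW, WS-INPUT-COUNT and SKIP-INPUT are made up for the example. Output file repositioning is left out; RESTART-FILE-REPOSITION covers it.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. RSTFLOW.
      * Sketch of the Step 2 decision. Values normally read from
      * CHECKPOINT_RESTART are hard-coded and the input file is
      * simulated by a counter (assumptions, not the real program).
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  RESTART-IND             PIC X(01)  VALUE 'Y'.
       01  CHECKPOINT-COMMIT-RECORD.
           05  COMMIT-KEY          PIC 9(09)  VALUE 1000.
       01  WS-INPUT-COUNT          PIC 9(09)  VALUE 0.
       PROCEDURE DIVISION.
       MAIN-PARA.
           EVALUATE RESTART-IND
               WHEN 'N'
      *            Normal start: open output files in OUTPUT mode.
                   DISPLAY 'NORMAL START'
               WHEN 'Y'
      *            Restart: the save area has been restored, so
      *            skip the input already covered by a commit.
                   DISPLAY 'RESTART FROM KEY ' COMMIT-KEY
                   PERFORM SKIP-INPUT
                       UNTIL WS-INPUT-COUNT >= COMMIT-KEY
               WHEN OTHER
                   DISPLAY 'INVALID RESTART-IND: ' RESTART-IND
           END-EVALUATE.
           DISPLAY 'RESUME AFTER RECORD ' WS-INPUT-COUNT.
           GOBACK.
       SKIP-INPUT.
      *    Stand-in for READ input-file; one call per record.
           ADD 1 TO WS-INPUT-COUNT.
```

If the input key is not a simple record count (for example a business key from a cursor), the skip loop would compare that key against COMMIT-KEY instead of a counter.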

Step 3: Keep a count of each insert/update/delete in the record-processed-uor variable.

Step 4: Go through the logic and find the appropriate place where COMMIT WORK can be issued. There, check the commit frequency:

   if record-processed-uor > commit-freq
      move key (input) value of the program to commit-key
      move checkpoint-commit record length to save-area-len
      move checkpoint-commit record to save-area-text
      update the checkpoint_restart table with this information
      commit
   end-if
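A minimal sketch of Steps 3 and 4 in plain COBOL, assuming a simulated input of 1200 records and a commit frequency of 500. RESTART-COMMIT here is a stub that only counts calls; in the real program it is the paragraph that issues the UPDATE on CHECKPOINT_RESTART followed by COMMIT WORK. The names CMTFREQ, WS-INPUT-COUNT, WS-COMMIT-COUNT and PROCESS-RECORD are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CMTFREQ.
      * Sketch of Steps 3 and 4. RESTART-COMMIT is a stub for the
      * UPDATE plus COMMIT WORK; the input is simulated by a
      * counter (assumptions, not the real program).
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  COMMIT-FREQ             PIC S9(9) COMP VALUE +500.
       01  RECORD-PROCESSED-UOR    PIC S9(9) COMP VALUE 0.
       01  COMMIT-KEY              PIC 9(09) VALUE 0.
       01  WS-INPUT-COUNT          PIC 9(09) VALUE 0.
       01  WS-COMMIT-COUNT         PIC 9(04) VALUE 0.
       PROCEDURE DIVISION.
       MAIN-PARA.
           PERFORM PROCESS-RECORD 1200 TIMES.
      *    The last partial unit of recovery is committed by
      *    RESTART-RESET at the end of the job (Step 5).
           DISPLAY 'COMMITS TAKEN: ' WS-COMMIT-COUNT.
           DISPLAY 'LAST COMMIT KEY: ' COMMIT-KEY.
           GOBACK.
       PROCESS-RECORD.
           ADD 1 TO WS-INPUT-COUNT.
      *    Step 3: count each insert/update/delete.
           ADD 1 TO RECORD-PROCESSED-UOR.
      *    Step 4: commit once the count exceeds COMMIT-FREQ.
           IF RECORD-PROCESSED-UOR > COMMIT-FREQ
               MOVE WS-INPUT-COUNT TO COMMIT-KEY
               PERFORM RESTART-COMMIT
           END-IF.
       RESTART-COMMIT.
      *    Real code: UPDATE CHECKPOINT_RESTART, then COMMIT WORK.
           ADD 1 TO WS-COMMIT-COUNT.
           MOVE 0 TO RECORD-PROCESSED-UOR.
```

With the > test from Step 4, the commit fires on the record after COMMIT-FREQ is reached; a >= test would commit exactly every COMMIT-FREQ records. Because COMMIT-FREQ comes from the table, either behaviour can be tuned without recompiling the program.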

Step 5: Before the STOP RUN statement, reset the restart flag of the checkpoint_restart table, i.e. move 'N' to restart-ind and update the checkpoint_restart table with that value.