october 20, 2012 author rajesh kurapatiminisites.qaiglobalservices.com/stc2012/paper_...
TRANSCRIPT
Introduction
Batch Job
A batch job is a scheduled program that runs without user intervention. Corporations use batch jobs to automate tasks
that they need to perform on a regular basis. Batch jobs usually run during off-peak hours, when systems are not being
used for online processing. (For example, batch jobs can run to update files, create printed reports, or purge files.) Batch
jobs that need to be processed on a regular basis are incorporated into batch schedules.
Batch jobs usually run overnight, during the so-called "batch window". During this time, online activity is restricted or
even completely forbidden. So a good first reason for testing batch jobs is to be sure that the batch window will end before the
agreed online start time. This is becoming more and more important because batch activity is still growing at many sites
while the batch window is normally shrinking due to 24x7 online demands.
The intent of batch testing is to find errors in job setup or output distribution and correct them before go-live. There is no
approach or method that can guarantee a schedule completely free of defects.
The information contained in this presentation is proprietary.
Copyright ©2011 Capgemini. All rights reserved.
Importance of batch job Performance testing
• Irrespective of the volume of the data, performance of the batch should be good enough to complete the
processing within the allowed time frame.
• Execution of batch job should never affect the performance of other processes.
Batch Performance Testing Approach
• Determine which jobs and jobsets currently used in Production will need to be set up for the new server/environment and
subsequently tested during the appropriate phases.
• Determine any new jobs to be added to the existing schedule as a result of new development/functionality.
• Determine which jobs and jobsets residing on the Production server are obsolete and should be deleted before the testing
phase.
• Submit documentation for jobs/jobsets to be copied to the testing environment.
Possible ways to predict
There are multiple ways to predict batch performance. A single option, or a hybrid of several, can be
used. Following are some of the options.
1. Do a proof of concept (PoC) of batches and extrapolate based on 'to be' data volume as per workload model.
2. Get benchmark numbers of the batches from core application vendor (if applicable) and extrapolate.
3. Get performance metrics from other similar projects and extrapolate as per workload model.
4. Profile the batches to get details of SQLs executed, apply base response time and overhead and estimate as
per workload model.
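As a rough illustration of options 1 and 4, the sketch below estimates a batch duration in two ways: by scaling a PoC measurement linearly with volume, and by summing profiled SQL response times per record with an overhead allowance. The `SqlProfile` shape, the function names, the 20% overhead factor, and all figures are hypothetical, and linear scaling with data volume is itself an assumption that a real workload model may not support.

```python
from dataclasses import dataclass


@dataclass
class SqlProfile:
    """One SQL statement observed while profiling a batch (hypothetical shape)."""
    name: str
    executions_per_record: float  # how often the statement runs per input record
    base_response_ms: float       # measured base response time per execution


def estimate_from_profile(profile, records, overhead_factor=0.2):
    """Option 4: sum SQL time per record, add overhead, scale to the target volume.

    overhead_factor is an assumed allowance for non-SQL work, not a measured value.
    Returns the estimated duration in seconds.
    """
    sql_ms_per_record = sum(
        s.executions_per_record * s.base_response_ms for s in profile
    )
    return records * sql_ms_per_record * (1 + overhead_factor) / 1000.0


def extrapolate_linear(measured_seconds, measured_records, target_records):
    """Option 1: scale a measured duration linearly with data volume (assumption)."""
    if measured_records <= 0:
        raise ValueError("measured_records must be positive")
    return measured_seconds * target_records / measured_records


# Illustrative numbers only.
profile = [
    SqlProfile("select_account", executions_per_record=1, base_response_ms=2.0),
    SqlProfile("update_balance", executions_per_record=1, base_response_ms=3.0),
]
profile_estimate = estimate_from_profile(profile, records=1_000_000)
poc_estimate = extrapolate_linear(600, 100_000, 1_000_000)
```

Comparing the two estimates against the batch window gives an early signal of whether the 'to be' volume is likely to fit.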
Environment Setup
• An environment similar to the production environment has to be set up to performance test the batch
job. This includes a similar system configuration, all the expected concurrent processes
running, and a data setup similar to production.
• In most cases, since replicating a production-like environment is expensive, a lower-capacity
system is set up for testing, and the expected throughput is adjusted proportionately
during the test.
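The proportionate adjustment can be illustrated with a small calculation. This is a minimal sketch that assumes throughput scales linearly with a single capacity measure (CPU cores here); the function name and the figures are hypothetical, and real systems rarely scale perfectly linearly.

```python
def scaled_throughput_target(prod_target_per_min, prod_capacity, test_capacity):
    """Scale the production throughput target down to the test environment.

    Capacity can be any single agreed unit (CPU cores in the example below).
    Linear scaling is an assumption, not a guarantee.
    """
    if prod_capacity <= 0 or test_capacity <= 0:
        raise ValueError("capacities must be positive")
    return prod_target_per_min * test_capacity / prod_capacity


# Production must process 12,000 records/min on 16 cores; the test box has 4 cores.
target = scaled_throughput_target(12_000, prod_capacity=16, test_capacity=4)
```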
Figure: Batch Performance Testing Process
Performance Monitoring
All system resources have to be utilized efficiently to obtain the best performance from the batch process.
Throughput – The number of records processed per minute (or per second) has to be collected. A script can be executed or a
DB query lookup can be made to collect throughput metrics of the batch process over the given time period.
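One way such a script might work is to sample the cumulative count of processed records at intervals (for example, from a DB query that counts processed rows) and convert consecutive samples into rates. The sketch below assumes the samples are already available as `(elapsed_seconds, cumulative_records)` pairs; the function name and the sample values are hypothetical.

```python
def throughput_per_minute(samples):
    """Convert cumulative record counts into per-interval throughput.

    samples: list of (elapsed_seconds, cumulative_records_processed), in time order.
    Returns a list of (elapsed_seconds, records_per_minute), one per interval.
    """
    rates = []
    for (t0, n0), (t1, n1) in zip(samples, samples[1:]):
        if t1 <= t0:
            raise ValueError("samples must be strictly increasing in time")
        rates.append((t1, (n1 - n0) * 60.0 / (t1 - t0)))
    return rates


# Illustrative samples taken once a minute.
samples = [(0, 0), (60, 1200), (120, 2500), (180, 3100)]
rates = throughput_per_minute(samples)

# Overall throughput across the whole sampled period.
overall = (samples[-1][1] - samples[0][1]) * 60.0 / (samples[-1][0] - samples[0][0])
```

Per-interval rates make a mid-run slowdown visible (the last interval above), which a single overall average would hide.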
CPU & Memory Utilization – These metrics should always stay within the limits of what the batch job is expected to
consume. Unlike other applications, a high CPU utilization of 90% is acceptable for a batch process, provided no other concurrent tasks
will run and the application is finely tuned. It indicates that the batch is making optimal use of resources, which results in
maximum throughput. If CPU utilization is higher than expected, it is advisable to collect server-level
statistics, which make it easier to dig into the issue.
DB Query Response Time – This is one of the most important monitoring counters to capture during batch job
performance testing, because most batch jobs do DB updates. If query response is slow, no code or
configuration change will bring any appreciable improvement in batch performance.
Application-level statistics – Keep track of the number of concurrent threads or sessions during the batch run. These
values have to be increased or decreased as the situation demands.
Monitoring Tools
1. For applications running on Windows, perfmon is used to collect system statistics. These include total CPU utilization,
memory utilization, the process-level breakdown of processor utilization, context switches, queue length, etc.
2. For UNIX and AIX, scripts execute commands at regular intervals to collect all the metrics during the test run.
There are commands that can collect CPU, memory, and network statistics, DB connections, etc.
Issues
A few performance issues are frequently seen during performance testing of a batch process:
1. Throughput
2. CPU Utilization
3. SQL Queries time out
4. Memory Issues
Tuning
CPU Utilization
If CPU utilization drops intermittently during the run, the likely cause is a slow-responding backend. Whenever the backend responds
slowly, the processor has to wait for the response, which brings down CPU utilization. Data or queries that take a long time have to be
analysed and tuned so that they respond as quickly as the other queries.
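To find the periods worth investigating, the CPU samples collected during the run can be scanned for windows where utilization falls well below the expected level. The following is a minimal sketch; the function name, the 90% expected level, the 20-point tolerance, and the sample data are assumptions for illustration.

```python
def find_cpu_dips(samples, expected_pct=90.0, tolerance_pct=20.0):
    """Return (start, end) time windows where CPU stayed below expected - tolerance.

    samples: list of (timestamp, cpu_pct) in time order.
    """
    threshold = expected_pct - tolerance_pct
    dips = []
    start = last = None
    for ts, cpu in samples:
        if cpu < threshold:
            if start is None:
                start = ts      # a new dip begins
            last = ts           # extend the current dip
        elif start is not None:
            dips.append((start, last))
            start = None
    if start is not None:       # the run ended while still in a dip
        dips.append((start, last))
    return dips


# Illustrative samples: (seconds into the run, CPU %).
cpu = [(0, 91), (10, 88), (20, 45), (30, 40), (40, 89), (50, 92), (60, 55)]
dips = find_cpu_dips(cpu)
```

The resulting windows can then be matched against query timings from the same period to identify the slow statements.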
Memory Issues
Memory issues in batch jobs are similar to those in other applications. Whenever the application runs out of memory, either the application
needs additional memory or there is a memory leak in the application. In the first case, the memory required by
the process has to be allocated. In the case of a memory leak, the leaking objects in the batch process have to be identified and fixed using tools like
Heap dump analyser, JProbe Memory debugger etc.
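Before reaching for a heap dump, a quick look at the heap usage trend can hint at which of the two cases applies. The sketch below assumes heap usage (in MB, ideally measured after garbage collection) was sampled at regular intervals during the run; a steady upward slope suggests a possible leak, while a flat trend suggests a sizing issue. The function names, the 1 MB-per-sample threshold, and the data are hypothetical, and the result is only a screening heuristic, not a diagnosis.

```python
def heap_growth_per_sample(heap_mb):
    """Least-squares slope of heap usage (MB) against sample index."""
    n = len(heap_mb)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2.0
    mean_y = sum(heap_mb) / n
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(heap_mb))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def classify_memory_trend(heap_mb, leak_slope_mb=1.0):
    """Label a series 'possible leak' if it grows by at least leak_slope_mb per sample."""
    slope = heap_growth_per_sample(heap_mb)
    return "possible leak" if slope >= leak_slope_mb else "stable"


# Illustrative series only.
steady = [510, 498, 505, 502, 507, 500]
growing = [500, 540, 585, 620, 668, 705]
steady_label = classify_memory_trend(steady)
growing_label = classify_memory_trend(growing)
```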