Scott Chapman – American Electric Power
Paper 9015, Session 331
Agenda
• What I mean by “critical path”
• My simple way of finding it
• Review some sample code
• Questions
• Bonus Material
Geek-required xkcd reference
http://xkcd.com/399/
Critical path
• Simple definition: how long is this going to take?
• Longest sequence of activities
  • In a project
  • In a batch schedule
• Need to look at:
  • Predecessor-successor relationships
  • Durations
  • Time dependencies
Ideas originated in Project Management
Project Management History
• Critical Path Method (CPM) originated at DuPont in the 1950s
  • Used to manage chemical plant maintenance projects
• Critical path is the sequence of events which determines the duration of the project
• Delays in tasks on the CP delay the entire project
• CP must be managed to stay on schedule
• To finish the project earlier, tasks on the CP must somehow be shortened
CPM process
• Identify activities
• Identify sequence and dependencies
• Draw network diagram of activities
• Estimate duration of activities
• Identify the longest path in the network (critical path)
• Monitor & update as project progresses
  • Delays in tasks not on CP may change the CP!
Batch Windows & CP
• Business runs on cycles
  • Daily, weekly, monthly processes
• Large applications have large batch schedules
• Batch schedules can be drawn as a network diagram
  • Predecessor – successor relationships between jobs
• Like projects, we like our batch windows to finish on time!
• Like a project, a batch schedule has a CP
CP Calculation
• Formally:
  • CP is the path with no slack for any task
  • Slack = difference between earliest & latest start or finish time of task
  • Latest finish = latest time task can finish without delaying project
  • And how do you figure that???
• Informally:
  • Find all the paths through the network diagram
  • Add up task duration on each path
  • Select the longest path
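One way to "figure that" is a forward pass followed by a backward pass. The sketch below is only an illustration, assuming the six-task network and durations from the simple example in this talk; the `tasks` table and the `computeSlack` name are made up for the sketch and are not from the paper or the CD.

```javascript
// Durations (minutes) and predecessors for the six-task example network.
const tasks = {
  A: { dur: 15, preds: [] },
  D: { dur: 30, preds: [] },
  B: { dur: 20, preds: ["A"] },
  C: { dur: 15, preds: ["A", "D"] },
  E: { dur: 10, preds: ["B", "C"] },
  F: { dur: 5, preds: ["C"] },
};

// Returns { taskName: slackInMinutes } for every task.
function computeSlack(tasks) {
  const names = Object.keys(tasks);

  // Forward pass: earliest finish = latest earliest-finish of preds + own duration.
  const earliestFinish = {};
  const ef = (n) => {
    if (!(n in earliestFinish)) {
      const start = Math.max(0, ...tasks[n].preds.map(ef));
      earliestFinish[n] = start + tasks[n].dur;
    }
    return earliestFinish[n];
  };
  names.forEach(ef);
  const projectEnd = Math.max(...names.map((n) => earliestFinish[n]));

  // Invert the predecessor lists to get successors.
  const succs = {};
  names.forEach((n) => (succs[n] = []));
  names.forEach((n) => tasks[n].preds.forEach((p) => succs[p].push(n)));

  // Backward pass: latest finish = earliest "latest start" among successors.
  const latestFinish = {};
  const lf = (n) => {
    if (!(n in latestFinish)) {
      latestFinish[n] = Math.min(
        projectEnd,
        ...succs[n].map((s) => lf(s) - tasks[s].dur)
      );
    }
    return latestFinish[n];
  };
  names.forEach(lf);

  const slack = {};
  names.forEach((n) => (slack[n] = latestFinish[n] - earliestFinish[n]));
  return slack;
}

const slack = computeSlack(tasks);
const criticalTasks = Object.keys(slack).filter((n) => slack[n] === 0);
console.log(slack);         // A: 10, D: 0, B: 10, C: 0, E: 0, F: 5
console.log(criticalTasks); // [ 'D', 'C', 'E' ]
```

The zero-slack tasks are the same ones the informal "longest path" approach picks out; the formal version just needs more bookkeeping.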
Simple example
Five paths through this simple example
[Network diagram: Tasks A, B, C, D, E and F arranged between Start and End]
Simple example
Calculate path durations from task durations
[Network diagram: Tasks A through F between Start and End, labelled with durations]
Task A: 15 mins
Task D: 30 mins
Task B: 20 mins
Task C: 15 mins
Task E: 10 mins
Task F: 5 mins
A+B+E = 45
A+C+E = 40
A+C+F = 35
D+C+E = 55
D+C+F = 50
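The informal method (find every path, add up the durations, take the longest) can be sketched in a few lines. The `successors` table below is inferred from the five paths listed on the slide, and the names `allPaths`, `paths` and `longest` are illustrative only.

```javascript
// Task durations in minutes, as shown on the slide.
const durations = { A: 15, D: 30, B: 20, C: 15, E: 10, F: 5 };

// Successor lists inferred from the five paths on the slide.
const successors = {
  Start: ["A", "D"],
  A: ["B", "C"],
  D: ["C"],
  B: ["E"],
  C: ["E", "F"],
  E: ["End"],
  F: ["End"],
};

// Depth-first walk from Start to End, collecting every path of tasks.
function allPaths(node, path = []) {
  if (node === "End") return [path];
  const next = node === "Start" ? path : [...path, node];
  return successors[node].flatMap((s) => allPaths(s, next));
}

// Add up the task durations on each path.
const paths = allPaths("Start").map((p) => ({
  path: p.join("+"),
  minutes: p.reduce((sum, t) => sum + durations[t], 0),
}));

// Select the longest path.
const longest = paths.reduce((a, b) => (b.minutes > a.minutes ? b : a));

console.log(paths.map((p) => `${p.path} = ${p.minutes}`));
console.log(longest); // { path: 'D+C+E', minutes: 55 }
```

This works for six tasks, but the number of paths grows quickly with the size of the network, which is part of why it does not scale to a real schedule.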
So that’s simple enough!
Everybody ready to leave?
Real world complications
• Hundreds or thousands of batch jobs
  • Managed by a batch scheduler package
• Time-of-day dependencies
• Extraneous dependencies
  • New jobs added without cleaning up obsolete dependencies
• Variable execution times
  • Variation in data to be processed
  • Contention with other processes
  • External waits
  • Job failures
What that might look like…
Winding your way through that mess is a bit more complicated!
Tooling Options
• Package from scheduler vendor
  + Should be well integrated
  - Cost?
• Microsoft Project
  - Not really meant for this purpose
  + See CMG Proceedings: Schwarz/Aurand, 1999 and Zaslavsky, 2001
• SAS/OR
  - Cost and effort?
• Roll your own
  + Can make output exactly what you want
  - Time / effort
  + Sample code on your CD!
What is the real question?
1. What is the longest path through the schedule?
   - prediction of the critical path
   - usually one-time analysis
2. Why did job X finish late last night?
   - an ongoing question / process
   - requires the CP for job X
Fortunately, #2 is much easier!
Use what you know
• Predecessors (from batch scheduler)
• End times (from actual executions)
• We are answering a question, not predicting the future
• We just need to explain what happened
  • Look at the jobs in the critical path for job X for anomalies
Critical path simplified
• Start at job X
• Find the predecessor that ended last – that was the critical predecessor to X
  • Call that job W
• Find last predecessor of W, call it V
• Repeat until:
  • You go back some number of levels, or
  • You reach a time dependency
• Resulting list is the critical path, for the day under study, for job X
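Simple example
[Network diagram: Tasks A through F between Start (19:00) and End, labelled with durations and end times]
Task A: 15 mins, ends 19:15
Task D: 30 mins, ends 19:30
Task B: 20 mins, ends 19:35
Task C: 15 mins, ends 19:45
Task E: 10 mins, ends 19:55
Task F: 5 mins, ends 19:50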
Working backwards…
[Network diagram repeated with end times: A 19:15, D 19:30, B 19:35, C 19:45, E 19:55, F 19:50]
E ended last
What is E’s last predecessor?
C ended after B
What is C’s last predecessor?
D ended after A
Critical Path is E – C – D
Tasks A, B, F had no direct bearing on the end time
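The walk backwards can be sketched as a short loop. This is a minimal illustration using the example's predecessors and end times; `criticalPath`, `toMinutes` and the `maxLevels` limit are names chosen for the sketch, and the "stop at a time dependency" rule is left out because the example has none.

```javascript
// Predecessors (from the scheduler) and end times (from the actual run).
const preds = { A: [], D: [], B: ["A"], C: ["A", "D"], E: ["B", "C"], F: ["C"] };
const endTimes = { A: "19:15", D: "19:30", B: "19:35", C: "19:45", E: "19:55", F: "19:50" };

// "HH:MM" -> minutes since midnight (this sketch ignores runs that cross midnight).
const toMinutes = (hhmm) => {
  const [h, m] = hhmm.split(":").map(Number);
  return h * 60 + m;
};

// Walk back from `job`, always following the predecessor that ended last.
// Stops when a job has no predecessors or after `maxLevels` jobs.
function criticalPath(job, maxLevels = 50) {
  const path = [job];
  let current = job;
  while (path.length < maxLevels && preds[current].length > 0) {
    current = preds[current].reduce((latest, p) =>
      toMinutes(endTimes[p]) > toMinutes(endTimes[latest]) ? p : latest
    );
    path.push(current);
  }
  return path;
}

console.log(criticalPath("E").join(" – ")); // E – C – D
```

Note that no durations and no path enumeration are needed: only who preceded whom and when each job actually ended.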
Complicating simplicity…
• Schedule changes every day
  • Weekly / monthly processing
  • Application changes
• Schedule relationships may not be pristine
• Jobs may be run multiple times; be sure to use the correct instance
• If you want to graph the entire schedule it gets more complicated
Why bother finding the CP?
• Limit the data you need to look at to investigate a late-finishing job
  • The cause is on the CP
• Find changes
  • If the CP changes day to day: why?
• Investigate impact of periodic schedule differences
  • Addition of monthly processing jobs may change the CP
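Once each day's critical path is a simple list of job names, spotting a change is a set comparison. The sketch below is illustrative only; `diffCriticalPaths` and the job names are hypothetical.

```javascript
// Compare two critical paths (arrays of job names, end job first).
function diffCriticalPaths(first, second) {
  const inFirst = new Set(first);
  const inSecond = new Set(second);
  return {
    onlyFirst: first.filter((j) => !inSecond.has(j)),
    onlySecond: second.filter((j) => !inFirst.has(j)),
    common: first.filter((j) => inSecond.has(j)),
  };
}

// Hypothetical job names: a normal day versus a day with monthly processing.
const normalDay = ["JOBX", "JOBW", "JOBV"];
const monthEnd = ["JOBX", "JOBM2", "JOBM1", "JOBV"];

const diff = diffCriticalPaths(normalDay, monthEnd);
console.log(diff.onlySecond); // [ 'JOBM2', 'JOBM1' ]
```

Jobs that show up only on the later day are the first place to look when asking why the path moved.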
What I do
• Capture job stats every day to a performance database
  • Standard practice
  • Extract history daily to XML file
• Extract schedule once per day and store for 45 days
  • Saved as XML files
  • Allows historical investigation
• JavaScript browser application pulls both data sources and allows for investigation
Sample application
Input #1: Schedule XML file
<?xml version='1.0'?>
<?xml-stylesheet type='text/xsl'?>
<opc>
<app id='#AMCSMISCBILL'>
  <op id='87' job='#AMCS331' arr='1930'>
    <wkstn>CPUJ</wkstn>
    <desc>Online Bill Image xtract</desc>
    <pred><aid>#AMCSMISCBILL</aid><opid>81</opid></pred>
    …
    <pred><aid>#AMCSMISCBILL</aid><opid>78</opid></pred>
    <succ><aid>#SMCSDAILYRPTS</aid><opid>3</opid></succ>
    …
    <succ><aid>#AMCSDLYBKUP</aid><opid>6</opid></succ>
  </op>
  <op id='90' job='#AMX1358' arr='1930'>
    <wkstn>CPUJ</wkstn>
    <desc>Load O/L Bill Image</desc>
    <pred><aid>#AMCSMISCBILL</aid><opid>87</opid></pred>
    <succ><aid>#AMCSMISCBILL</aid><opid>91</opid></succ>
  </op>
</app>
A grouping of jobs is an “application”
A job is an “operation”
Each job has predecessors and successors
Arrival time is the earliest the job can run
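The arrival time is what makes a "time dependency" detectable: if every predecessor had already ended before the arrival time, the job was waiting on the clock, not on a predecessor. The sketch below is one plausible reading of that rule, not code from the CD. It assumes `arr` is HHMM, copies one `<op>` by hand into a plain object, and compares times on the same clock day, so a real version would need to handle batch windows that cross midnight.

```javascript
// One <op> from the schedule XML, copied by hand into a plain object.
// Only the two predecessors visible on the slide are listed; the rest are elided there.
const op = {
  app: "#AMCSMISCBILL",
  id: "87",
  job: "#AMCS331",
  arr: "1930",
  preds: [
    { aid: "#AMCSMISCBILL", opid: "81" },
    { aid: "#AMCSMISCBILL", opid: "78" },
  ],
};

// "1930" -> minutes since midnight (assumes an HHMM string).
function arrivalMinutes(arr) {
  const padded = arr.padStart(4, "0");
  return Number(padded.slice(0, 2)) * 60 + Number(padded.slice(2));
}

// True when the latest predecessor ended at or before the arrival time,
// i.e. the job was held by the time dependency rather than by a predecessor.
function waitedOnArrival(op, latestPredEndMinutes) {
  return latestPredEndMinutes <= arrivalMinutes(op.arr);
}

console.log(arrivalMinutes(op.arr));           // 1170
console.log(waitedOnArrival(op, 19 * 60 + 5)); // true: preds were done by 19:05
console.log(waitedOnArrival(op, 20 * 60));     // false: a pred ran until 20:00
```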
Sample application
Input #2: Job Data XML file
<job id="#AMCS331">
  <sys>COCJ</sys><cls>0</cls><desc>MCSX4000</desc><cnt>24</cnt>
  <acpu>2.18</acpu><aet>13.8</aet><mcpu>2.70</mcpu><met>26.7</met>
  <run i="2009-05-09 2:01:55" rse=" 1:59:46, 2:01:56, 2:02:09">
    <et>13.7</et><cp>2.16</cp><io>21632</io>
  </run>
  <run i="2009-05-08 0:51:46" rse=" 0:51:45, 0:51:48, 0:52:02">
    <et>16.5</et><cp>2.11</cp><io>21690</io>
  </run>
  <run i="2009-05-07 1:24:41" rse=" 1:22:04, 1:24:41, 1:24:52">
    <et>11.4</et><cp>2.12</cp><io>21740</io><sys>CO1J</sys>
  </run>
</job>
Jobs by name here
Norms, averages, maximums
May have multiple runs per day
Read, start, end times
Note relatively compact format designed to reduce the file size; unfortunately that increases the complexity of interpreting the data
Stats for single run
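The compact `rse` attribute has to be unpacked before end times can be compared. A minimal sketch, assuming the three values are read, start and end times in h:mm:ss as the annotation suggests; `toSeconds` and `parseRse` are illustrative names, not functions from the CD.

```javascript
// "h:mm:ss" (possibly with leading spaces) -> seconds since midnight.
function toSeconds(hms) {
  const [h, m, s] = hms.trim().split(":").map(Number);
  return h * 3600 + m * 60 + s;
}

// Split an rse attribute into its read, start and end times.
function parseRse(rse) {
  const [read, start, end] = rse.split(",").map((t) => toSeconds(t));
  return { read, start, end };
}

// First run of job #AMCS331 from the slide.
const run = parseRse(" 1:59:46, 2:01:56, 2:02:09");
console.log(run);                  // { read: 7186, start: 7316, end: 7329 }
console.log(run.start - run.read); // 130 seconds between read and start
console.log(run.end - run.start);  // 13 seconds between start and end
```

Working in seconds since midnight keeps comparisons simple, but a run that crosses midnight would need the date from the `i` attribute as well.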
Sample application output
Compare days in two windows
Why later on 12/29?
Completely different CP here
Critical path comes back together here
Here’s an ET difference of >1 hour and CPU >2x increase!
Sample application details
• On CD at back of room
• Very simple example coded quickly one Sunday afternoon
  • May not be bug free
  • Will not satisfy all your needs
  • For illustrative purposes only – easier to understand than the examples in the paper
• HTML / JavaScript application
  • Data is in XML files
  • Internet Explorer only
  • Uses XPath & XSLT
    • Beyond the scope of this presentation
Application flow
• When HTML page loads
  • Calls init() to load the XML files
  • Selection criteria populated in HTML
• When user clicks “Find It” button
  • Calls findIt() to find the critical path, which in turn calls:
    • getRunsDate(job, date) – returns array of executions of a job on a given batch date
    • getLatestPred(job, app) – returns predecessor that ran last
getRunsDate is an example of using DOM functions to extract data from XML
Critical parts of the code
findIt()
• Get runs for this job and date
• Save stats for this job to array
• Then find preds and add them to the array
• Loop through array and build HTML table
getLatestPred(cjob, capp)
• XPath to get array of preds
• Use DOM calls to get data from XML
• XPath to find job name from op id
• Get the runs for a pred job
• Check run to see if it is the latest pred so far
• If it is, save it
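The code on the slides is not reproduced in this transcript, so here is a rough sketch of the same steps over plain in-memory objects instead of XPath and DOM calls. The function names come from the slides; everything else (the data layout, the extra `date` parameter, the job and application names, matching the batch date by simple string equality) is an assumption made for illustration.

```javascript
// Hypothetical schedule: application -> operation id -> job name and predecessors.
const schedule = {
  APP1: {
    "10": { job: "JOBA", preds: [] },
    "20": { job: "JOBB", preds: [] },
    "30": {
      job: "JOBC",
      preds: [{ aid: "APP1", opid: "10" }, { aid: "APP1", opid: "20" }],
    },
  },
};

// Hypothetical job data: runs per job, end time in seconds since midnight.
const runs = {
  JOBA: [{ date: "2009-05-09", end: 7000 }],
  JOBB: [
    { date: "2009-05-09", end: 7100 },
    { date: "2009-05-09", end: 7300 }, // second run on the same batch date
    { date: "2009-05-08", end: 9000 }, // previous day, must be ignored
  ],
  JOBC: [{ date: "2009-05-09", end: 7400 }],
};

// Executions of a job on a given batch date.
function getRunsDate(job, date) {
  return (runs[job] || []).filter((r) => r.date === date);
}

// Predecessor of job `cjob` in application `capp` whose run ended last on `date`.
function getLatestPred(cjob, capp, date) {
  const op = Object.values(schedule[capp]).find((o) => o.job === cjob);
  if (!op) return null;
  let latest = null;
  for (const { aid, opid } of op.preds) {
    const predJob = schedule[aid][opid].job; // op id -> job name
    for (const r of getRunsDate(predJob, date)) {
      // Keep this run if it is the latest predecessor run seen so far.
      if (latest === null || r.end > latest.end) {
        latest = { job: predJob, app: aid, end: r.end };
      }
    }
  }
  return latest;
}

const latestPred = getLatestPred("JOBC", "APP1", "2009-05-09");
console.log(latestPred); // { job: 'JOBB', app: 'APP1', end: 7300 }
```

Calling `getLatestPred` repeatedly on its own result is all `findIt()` needs to build the path.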
That’s essentially it!
getRunsDate(job, date) is nothing special – simply retrieves list of runs from the XML file
Typical housekeeping code initializing variables, etc.
Sample in the paper was much more complicated due to it being pulled from the application that does the graphing
Summary
• Critical Path Analysis for Performance Analysis usually involves answering why, not predicting the future
• In such a case, start at the end and work backwards
• That type of analysis is easy to code
• Record a snapshot of the schedule daily as well as the job performance
Questions / comments ?
Bonus material: Export to Excel!
• Sometimes you need to play with your data
• Copy to Excel to
  • Re-sort
  • Filter
  • Pivot tables
  • Graph
  • Summarize
• Cut and paste HTML tables works well
• Even better: automate it
New & improved:
sendExcel() function on CD
• If Excel not open, open it
• If previous workbook not open
  • Open a new workbook
• Add new sheet to workbook with data
• Requires IE and Excel
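Where IE automation is not an option, a lighter alternative is to build tab-separated text that pastes straight into a worksheet. This is not the sendExcel() function from the CD, just a small sketch; `toTsv` and the sample rows are hypothetical.

```javascript
// Turn an array of row objects into tab-separated text, header row first.
function toTsv(rows, columns) {
  // Tabs and line breaks inside a value would break the grid, so flatten them.
  const clean = (v) => String(v ?? "").replace(/[\t\r\n]+/g, " ");
  const lines = [columns.join("\t")];
  for (const row of rows) {
    lines.push(columns.map((c) => clean(row[c])).join("\t"));
  }
  return lines.join("\n");
}

// Hypothetical critical-path rows.
const tsv = toTsv(
  [
    { job: "JOBC", end: "02:02:09", et: 13.7 },
    { job: "JOBB", end: "02:01:40", et: 16.5 },
  ],
  ["job", "end", "et"]
);
console.log(tsv);
```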
Useful references
• XPath, XSLT, XML quick reference cards: http://www.mulberrytech.com/quickref/index.html
• Browser Book for Web Designers: http://www.visibone.com/products/browserbook.html
• Humor for geeks: http://xkcd.com/