TRANSCRIPT
Compuware Corporation
E2E Performance Monitoring to the Mth Tier (Mainframe Integrated)
The New Industry Standard via the Apdex Alliance
SCCMG, November 2, 2007
Thomas A. Halinski
Yuri Grinshteyn
End-to-End Performance Monitoring to the Mth Tier (Mainframe Integrated) – The New Industry Standard via the Apdex Alliance

How many reports do you use to determine whether your end-to-end systems (including the mainframe) need tuning? Would it help if you could convert many measurements into one number? Are you sure you are looking at application performance from the same perspective as your end user (i.e. user satisfaction with enterprise applications)? Is your enterprise interested in an integrated approach to monitoring, identifying, and tuning end-user E2E transactions?

Here is what the Apdex Alliance and the industry are now suggesting, plus a step beyond, with “dashboards” for both IT and business management.
Defining Service Management and the Apdex Standard

- Service Management Problem
- Apdex Approach
- Performance Measurement Problem
- Industry Guideline / A Better Mousetrap
- The Cost of Poor Application Quality

© 1990-2006 NetForecast, Inc., All rights reserved.
The IT Value Chain

- Half of enterprises are providing poor performance or do not know how well they are serving their users (business success). (NetForecast/BCR survey)
- Half of enterprises are postponing launching new applications due to performance concerns (curtailing business). (Network World survey)

Which half are you?

Performance is measured by the success of the business, i.e. the user experience.

[Diagram: the IT value chain connecting Users, Business, and Infrastructure.]
The 80/20 Rules Have Flipped

Old 80/20 Rules
- 80% of your users are in your primary offices
- 80% of your traffic is inside your network
- Therefore, if you deliver good service to the 80% you know, you are well ahead of the game

New 80/20 Rules
- 80% of the users are outside your primary offices
- 73% of application service problems are reported by end users, not by the IT department (Forrester Research)
- 82% of enterprises say that poor performance is impairing employee productivity (Network World survey)
E2E Performance Monitoring, End User Perspective – Definition

An “end-to-end performance monitoring” view of an enterprise is based on the industry-standard “end user perspective” proposed by the Apdex Alliance*, and includes the “application” perspective. It means that “Performance Is The User Experience”, which determines business value and the bottom line ($).

* The Apdex Alliance is a group of companies collaborating to promote an application performance metric called Apdex. Apdex is a numerical measure of user satisfaction with the performance of enterprise applications, and reflects the effectiveness of IT investments in contributing to business objectives. See www.apdex.org for details.
Today’s Problem: Many Numbers, Little Insight

Measured Response Time (seconds)

                 App A   App B   App C   App D   App E
Day Average        6.0    12.5     3.1     8.4     2.0
Best Hour          5.0     6.8     2.8     4.1     1.7
Worst Hour        18.6    18.9     8.6    19.3     6.5
95th Percentile    8.1    17.3    10.7    12.9     9.5

Which application is in trouble?
Example: 100 Numbers

Start with what you have: your measurement tool produced 100 samples.
The samples are:
- From a single application
- User-level response time measurements
- A one-hour period of observation

Is the application operating well?
6.45 16.89 3.36 54.50
59.55 13.25 3.33 2.51
16.67 4.50 2.22 4.75
12.56 8.44 9.76 3.84
2.99 4.75 13.20 11.98
14.55 8.83 3.73 2.94
7.37 3.78 3.28 3.99
2.78 3.54 4.90 4.29
7.38 6.39 6.21 23.56
19.69 21.33 22.50 18.10
1.61 1.46 2.15 10.46
5.60 3.67 2.20 2.35
1.64 2.13 15.35 2.48
3.87 4.90 4.64 3.42
2.02 1.99 3.69 3.22
6.09 2.32 3.83 16.37
3.74 2.70 2.95 30.08
30.54 1.76 4.53 1.46
2.76 1.74 5.33 4.11
7.50 1.36 2.49 2.77
2.38 6.38 7.98 3.85
5.85 2.20 7.57 1.77
15.00 6.02 1.26 14.83
3.28 3.34 3.46 1.87
1.80 2.24 2.65 5.20
Numbers Beget Numbers

[Histogram: number of samples per incremental time period (sec), bins from 2 to 60 seconds]

Average: 7.6 sec
Median: 3.9
Mode: 4.8
Standard Deviation: 9.4
95th Percentile: 22.5
Minimum: 1.3
Maximum: 59.6
Now you have 137 numbers. Can you answer the question, “Is the application operating well?”
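The descriptive statistics on this slide are straightforward to generate, which is exactly the slide's point: they add more numbers without answering the question. A minimal standard-library sketch (the nearest-rank 95th-percentile method is an assumption; measurement tools differ):

```python
import statistics

def summarize(samples):
    """Reduce raw response-time samples (seconds) to common summary statistics."""
    s = sorted(samples)
    rank = max(1, round(0.95 * len(s)))  # nearest-rank 95th percentile
    return {
        "average": statistics.mean(s),
        "median": statistics.median(s),
        "stdev": statistics.stdev(s),
        "p95": s[rank - 1],
        "min": s[0],
        "max": s[-1],
    }

# Example with a small, made-up set of response times (seconds)
print(summarize([1.3, 2.0, 3.9, 4.8, 7.6, 22.5, 59.6]))
```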
Defining Performance and the Apdex Standard
Apdex Defined

Apdex is a numerical measure of user satisfaction with the performance of enterprise applications. It defines a method that converts many measurements into one number:
- Uniform 0–1 scale: 0 = no users satisfied, 1 = all users satisfied
- Standardized method: a comparable metric across all applications, and across enterprises
Deconstructing Application Transactions

Session = the period of time that a user is “connected” to an application, from starting the application to ending or suspending it.

Task = each interaction with the application during the session: the user types or chooses and enters or clicks, the system responds while the user waits, then the user reads or thinks (type, wait, read), and the cycle repeats.

Process = a group of user interactions that accomplish a goal: get new email, add an employee, check on inventory status, etc. Between processes the session is idle.
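The session/task/process hierarchy can be modeled directly. The class and field names below are this document's own illustration, not part of the Apdex specification:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Task:
    """One interaction: the user enters or clicks, the system responds, the user waits."""
    name: str
    response_time: float  # seconds the user waits before proceeding

@dataclass
class Process:
    """A group of Tasks that accomplish a goal, e.g. 'check on inventory status'."""
    goal: str
    tasks: List[Task] = field(default_factory=list)

@dataclass
class Session:
    """Period of time a user is 'connected' to an application."""
    user: str
    processes: List[Process] = field(default_factory=list)

    def response_times(self) -> List[float]:
        # Flattened Task times: the samples a measurement tool would collect
        return [t.response_time for p in self.processes for t in p.tasks]

s = Session("user1", [Process("check inventory", [Task("search", 2.3), Task("open item", 5.1)])])
print(s.response_times())  # [2.3, 5.1]
```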
The Task Defined

Task response time is the elapsed time required for an application system to respond to a human user input such that the user can effectively proceed with the process they are trying to accomplish.
- The time when the user is waiting in order to proceed
- The user feels the responsiveness of the application
- A long Task time makes the user less productive

The Task is what a user can time with a stopwatch.
How Users View Application Task Performance

Satisfied
- User maintains concentration
- Performance is not a factor in the user experience
- The time limit threshold is unknowingly set by users and is consistent

Tolerating
- Concentration is impaired
- Performance is now a factor in the user experience
- The user will notice how long it is taking

Frustrated
- Performance is typically called unacceptable
- A casual user may abandon the process
- A production user is very likely to stop working
How Apdex Works

- Start with a sufficient number of Task measurement samples.
- The target response time T defines the satisfied zone (0–T sec).
  - T is shown as a subscript of all Apdex values (for example, 0.80_T).
- Count the number of samples within three performance zones: Satisfied, Tolerating, Frustrated.
- Given target response time T and sufficient response time measurement samples, then:

    Apdex_T = (Satisfied count + Tolerating count / 2) / Total samples

- Note: Frustrated samples are not in the numerator but are counted in the total samples.
- Index: 0 = failure; 1 = perfection (all users satisfied).
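The formula is easy to compute directly from raw samples. A minimal sketch (the function name and zone logic are this document's illustration, assuming the frustration threshold F = 4T that the deck defines later):

```python
def apdex(samples, t):
    """Apdex_T = (satisfied + tolerating / 2) / total samples.

    Zones: satisfied <= T, tolerating in (T, 4T], frustrated > 4T.
    Frustrated samples count only in the denominator.
    """
    if not samples:
        raise ValueError("Apdex needs a sufficient number of samples")
    satisfied = sum(1 for s in samples if s <= t)
    tolerating = sum(1 for s in samples if t < s <= 4 * t)
    return (satisfied + tolerating / 2) / len(samples)

# 8 samples at T = 4 s: 4 satisfied, 2 tolerating, 2 frustrated
print(apdex([1, 2, 3, 4, 5, 16, 17, 40], 4))  # 0.625
```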
Putting It All Together

    Apdex_T = (Satisfied + Tolerating / 2) / Total samples

Rating scale:
    0.94–1.00  Excellent
    0.85–0.94  Good
    0.70–0.85  Fair
    0.50–0.70  Poor
    0.00–0.50  Unacceptable

[Diagram: existing Task response time measurement samples for a Report Group (application, user group, time period) are classified against T and F into the Satisfied, Tolerating, and Frustrated zones and fed into the Apdex formula; the numbered steps below are keyed to the diagram.]
1. Define T for the application.
   T = the application target time (the threshold between satisfied and tolerating users).
   F = the threshold between tolerating and frustrated users; it is calculated as F = 4T.
2. Define a Report Group (the available details are tool-dependent).
3. Extract the data set from existing measurements for the Report Group.
4. Count the number of samples in the three performance zones.
5. Calculate the Apdex formula.
6. Display the Apdex result (T is always shown as part of the result).
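The final display step maps the score onto the rating bands (Excellent/Good/Fair/Poor/Unacceptable with boundaries 0.94/0.85/0.70/0.50). A minimal sketch, with the assumption that a boundary value belongs to the higher band:

```python
def apdex_rating(score):
    """Map an Apdex score (0-1 scale) to its rating band."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("Apdex scores lie on a 0-1 scale")
    if score >= 0.94:
        return "Excellent"
    if score >= 0.85:
        return "Good"
    if score >= 0.70:
        return "Fair"
    if score >= 0.50:
        return "Poor"
    return "Unacceptable"

print(apdex_rating(0.87))  # Good
print(apdex_rating(0.42))  # Unacceptable
```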
The Apdex View of the 100 Numbers

User productivity is impaired if the application responds in more than 8 seconds:
- T = 8 sec

Apdex for the 100 measurements = 0.87_8
- The application is barely providing “Good” performance (the Good band starts at 0.85)
- 100 numbers reduce to one number: 0.87_8
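This result can be checked from the zone counts. Classifying the 100 samples printed earlier at T = 8 s (so F = 32 s) gives 76 satisfied, 22 tolerating, and 2 frustrated samples (counts tallied by hand from the sample table; treat them as illustrative):

```python
# Zone counts from classifying the 100 samples at T = 8 s (F = 32 s).
# Tallied by hand from the sample table; the remaining 2 samples are frustrated.
satisfied, tolerating, total = 76, 22, 100

score = (satisfied + tolerating / 2) / total
print(score)  # 0.87 -- barely "Good", since the Good band starts at 0.85
```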
Defining Performance and the Apdex Standard
Proper performance management is delivered in two steps:
1. Ongoing end-user performance measurement, reporting, and tracking.
2. Flow-based performance measurements and diagnostics to identify which silo the issue resides in.
Service Monitoring

[Diagram: response times are measured from the end user perspective (clients) and from the application layer (servers), across web servers or fat clients, application servers, and mainframe/database servers.]

Mth Tier - APM: mainframe monitoring can be triggered at the application code level by watching SLAs set on the Java or .NET network methods that are used to communicate with the mainframe. This is only available with “integrated” E2E performance monitoring.
APPLICATION SERVICE MANAGEMENT
Monitoring Individual Tiers
©2006, NetForecast, Inc. and Apdex Alliance. All rights reserved.
Typical Silo Management

[Diagram: seven silos (Desktop, Windows, WAN, UNIX, Mainframe, DBMS/Mainframe, SAN), each producing its own report, combined through report aggregation.]
First Reason Why Silo Performance Does Not Equal User Performance

Performance between “silos” is missing when monitoring is non-integrated.

[Diagram: the seven silo reports (Desktop, Windows, WAN, UNIX, Mainframe, DBMS/Mainframe, SAN), with the gaps between silos marked as unmeasured.]
Second Reason Why Silo Performance Does Not Equal User Performance

Some tier performance tools miss or aggregate actual data.

[Diagram: the assumed path of a user’s Task versus the actual path across the Desktop, Windows, WAN, UNIX, Mainframe, DBMS/Mainframe, and SAN silos.]
Defining Performance and the Apdex Standard
E2E Integrated Solution

[Diagram: Vantage E2E integrated performance analysis correlates measurement data across the business model (applications, locations, business transactions, users) and four sources:
- Client: clients, applications, transaction types, transaction response
- Network: session response, traffic/utilization, latency, IP end points, WAN links
- Server: servers, counter data, WMI, methods, SQL calls, mainframe, faults
- Device: WMI, counter data, SNMP]
End User Monitoring Approaches

- Agentless: network-attached appliance; monitors all transactions from all users
- Active: dedicated workstation that executes synthetic transactions; flexible scripting
- Passive: agent on the end user workstation

[Screenshot: end user experience monitoring]
Server Monitoring

Integrated with end user monitoring; focused on the application:
- Web server tier
- Application server tier (J2EE & .NET)
  - Heap, CPU, connection pools
  - Long-running and CPU-intensive methods, transactions, and database calls
  - Pinpoint the source of memory leaks
- Database server tier
  - Hundreds of Oracle, SQL Server, and DB2 metrics
  - Performance of individual SQL calls

Agent-based and agentless implementation approaches.

[Screenshot: server monitoring]
APPLICATION CHARACTERIZATION
• Quickly identify transactions that are having problems and when the problems occurred
END-TO-END VISIBILITY
• Real-time status and usage of applications, systems, and network application performance
Integrated E2E for the Mth Tier - Mainframe Monitoring Software

- Monitor application performance
- Identify excessive resource consumption (CPU & wait)
- Resolve excessive resource consumption (CPU & wait)
- Improve “end user” transactions and reduce costs
Network monitoring
Transaction profiling tools allow you to see application calls from a Windows or UNIX server to the mainframe. If a call exceeds its threshold, a mainframe measurement is triggered for that alert.
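The trigger pattern described above can be sketched as a wrapper around a cross-tier call. Everything here (the names `measure_call` and `on_breach`, the threshold value, the callback) is a hypothetical illustration of the pattern, not the product's actual API:

```python
import time

def measure_call(method, fn, *args, threshold=2.0, on_breach=print):
    """Time one call toward the mainframe tier; fire a deep-dive hook on SLA breach.

    `on_breach` stands in for the real tool's action: triggering a
    mainframe-side measurement (e.g. a CPU and wait breakdown) for this call.
    """
    start = time.perf_counter()
    result = fn(*args)
    elapsed = time.perf_counter() - start
    if elapsed > threshold:
        on_breach(f"ALERT: {method} took {elapsed:.3f}s (SLA {threshold}s)")
    return result, elapsed

# Example: a deliberately slow call against a tight 10 ms SLA
result, elapsed = measure_call("getAccount", lambda: time.sleep(0.05) or "ok", threshold=0.01)
```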
Mth tier monitoring

[Screenshot: transaction profile showing CPU 10.53 sec]
Mth tier monitoring – identify excessive CPU usage
Mth tier monitoring – resolve excessive CPU usage
“…using an INSPECT verb in conjunction with a reserved word, such as SPACES or LOW-VALUES, causes an exit from the user application program. Changing the word SPACES to [an actual space] eliminates the CPU time…”
Defining Performance and the Apdex Standard
Service Management Overview

- Maps business services to IT infrastructure; determines root cause and business impact in real time
- Communicates service compliance status via dashboards and reports
- Supports ITIL and Six Sigma best practices
- Acquires data from any technical and business sources
- Monetizes the value of IT components
Real-Time Service Visualization
Questions