© Copyright IBM Corporation 2010

IBM WebSphere performance tuning and IBM Tivoli Monitoring

Best practices for a POWER7 WebSphere Application Server 7 infrastructure

Jason Meiers December 21, 2010

Discover best practices and tools for continuously improving transaction response times, and for evaluating performance during initial hardware procurement, for IBM WebSphere Application Server 7 and POWER7 architectures with IBM Tivoli Monitoring.

Increasing transactions per second (TPS) requires tuning for response times as well as lower CPU utilization, for both financial transactions and web-based transactions. TPS directly affects user response times and the cost of your hardware infrastructure. This article shows how to tune IBM WebSphere Application Server 7 on IBM POWER7® hardware for better performance.

Architecture topologies

The hamburger service shown in Figure 1 describes the IBM POWER profile configuration of POWER7 LPARs for the allocation of resources. Each request controller LPAR has four physical processors (eight logical CPUs) with 8GB of RAM to process the incoming workload. Other services (for example, the onion service profile) are configured with two physical processors (four logical CPUs) and 8GB of RAM. The load requirements of each service determine whether it gets more processors. In this case, the request controller service is called for each request, whereas the cheese and onion services are not.

Frequently used acronyms

• I/O: Input/output
• JDK: Java software development kit
• LPAR: Logical partition
• SOA: Service-oriented architecture
• XML: Extensible Markup Language

The application layer refers to the internal layer, which includes a request controller service to process service-based transactions as well as a list of other Apache Tuscany (Open SCA) services based on WebSphere Application Server version 7. In this example, the application creates a hamburger. Based on customers' requests, specific services are executed (for example, hamburger grill service, lettuce service, onion service, packaging service, tomato service, and cheese service).

Figure 1. The hamburger service

Topologies for systems monitoring and management include IBM Tivoli Monitoring, where monitoring data is consolidated for performance views and management. The Tivoli topology includes IBM Tivoli Enterprise Monitoring Server as the primary hub for all systems monitoring, a backup Tivoli Enterprise Monitoring Server in case the primary fails, IBM Tivoli Enterprise Portal Server, and a number of remote servers that handle and load balance the systems monitoring agents.

Figure 2 shows the configuration of each instance within a POWER7 LPAR with one physical processor (two logical CPUs) and 4GB of RAM. This configuration covers the collection of systems monitoring data for infrastructures of up to 500 servers.

Figure 2. The Tivoli Monitoring version 6.2 architecture

Figure 3 shows the topology diagram for IBM Tivoli Composite Application Manager for Application Diagnostics version 7.1, which collects application performance data and provides diagnostics tools for performance tuning. The Tivoli Composite Application Manager services distributed across the LPARs are the visualization engine, kernel, publish server, global publish server, message dispatcher, and archive agent. The publish server provides scalability and availability for distributed application profiling.

Figure 3. Tivoli Composite Application Manager version 7.1 architecture

Requirements

Performance requirements for transaction response times (that is, TPS) include the median response time (150ms), the average response time (170ms), the response time 95th percentile (180ms), and 20-35 percent CPU utilization on a single POWER7 core.
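The response time targets above can be checked mechanically against a load-test sample. The following is a minimal sketch, assuming response times have already been collected in milliseconds. The function names, the nearest-rank percentile method, and the sample values are illustrative assumptions and are not part of the original test setup.

```python
import math
import statistics

# Targets from the requirements above, in milliseconds.
TARGETS_MS = {"median": 150.0, "average": 170.0, "p95": 180.0}


def percentile(samples, pct):
    """Nearest-rank percentile (an assumed method; other tools may interpolate)."""
    ordered = sorted(samples)
    rank = math.ceil(pct / 100.0 * len(ordered))
    return ordered[max(rank, 1) - 1]


def summarize(samples_ms):
    """Return the three statistics named in the requirements."""
    return {
        "median": statistics.median(samples_ms),
        "average": statistics.mean(samples_ms),
        "p95": percentile(samples_ms, 95),
    }


def failed_targets(samples_ms, targets=TARGETS_MS):
    """List the statistics that exceed their target."""
    stats = summarize(samples_ms)
    return [name for name, limit in targets.items() if stats[name] > limit]


if __name__ == "__main__":
    # Hypothetical response times (ms) from one load run.
    run = [120, 130, 140, 145, 150, 150, 155, 160, 175, 400]
    print(summarize(run))
    print("Exceeded:", failed_targets(run))
```

A single slow outlier, such as the first request after a restart, is enough to push the average and the 95th percentile over target while the median still passes, which is why all three values are listed as separate requirements.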

Scalability

You implement vertical scaling by adding multiple Java™ Virtual Machine (JVM) instances on a single LPAR, all using the same processors and memory. You can use this architecture if your application is serialized to such a degree that running multiple JVMs can increase your workload. Use this configuration if LPAR processors and memory are still available.

You implement horizontal scaling by adding multiple LPARs, each with one or many instances of WebSphere Application Server from the same service or application clusters. (This configuration is best used for processor- and memory-intensive services.) In Figure 1, the request controller invoked for each service request is an example of horizontal scaling, because controller patterns tend to be processor intensive and memory use is determined by the payload of each service request.

Usage and performance management best practices

This section describes tools you can use to determine where performance bottlenecks in the transaction are located.

Problem determination

Tivoli Composite Application Manager provides many tools and graphs for monitoring your infrastructure:

• JVM CPU utilization graph: This Tivoli Composite Application Manager graph (see Figure 4) provides utilization metrics of a single JVM rather than simply system CPU utilization. In this case, you see that for this application there is minimal JVM-specific processor utilization, between 10 and 25 percent.

Figure 4. JVM CPU utilization (Percent, last hour)

• JVM memory utilization graph: This Tivoli Composite Application Manager graph (see Figure 5) provides insights into the utilization of a single JVM rather than simply system memory utilization. In this case, you can see that there is a 30 percent jump in memory use when load is generated for the application.

Figure 5. JVM memory utilization (Percent, last hour)

• Application-level throughput: Throughput for a request is available in Tivoli Composite Application Manager for availability as well as problem determination (see Figure 6). Without throughput and utilization side by side, performance bottlenecks are often difficult to correlate with throughput, transaction response times, and CPU or memory utilization. For this application, there are on average 240 transactions processed per minute, or about 4 TPS.

Figure 6. Throughput (request/min, last hour)

• Response times: Response times are critical performance indicators both for your business and for your customers. In this case, Tivoli Composite Application Manager shows initial transactions with extremely high response times in the 12-second range (see Figure 7). Once all resources have been loaded, response times improve to sub-second levels of around a few hundred milliseconds.

Figure 7. Response time (seconds/min, last hour)

• Web container thread pool: This thread pool is initially set to a minimum of 50 and a maximum of 50; in most cases, this setting is sufficient. For this example, 10 concurrent requests were sent, and 12 web container threads were used. As Figure 8 shows, there is a maximum of 50 threads, and 24 percent (12 of 50) are actually being consumed.

Figure 8. Thread pools

By default in WebSphere Application Server, asynchronous web request dispatching is not enabled. Enable this setting to process asynchronous web requests by clicking AppServers > Server > Web container > Asynchronous Request Dispatching. Then, on the Configuration tab, select Allow Asynchronous Request Dispatching.

In clustered infrastructure profiles, it makes sense to monitor Distribution & Consistency Services (DCS) threads (see Figure 9). The DCS threads indicate network connections between each member of a cluster, including synchronization of configuration updates. For production configurations, IBM recommends disabling DCS in the WebSphere integrated console, because this feature is only required during configuration.

Figure 9. TCP DCS threads

• Transaction failure rate: The transaction failure rate (see Figure 10) helps you quickly identify, while performance tuning, whether transactions are failing. Such failures can occur if, for instance, the database or other service that the transaction requires is unavailable. In this small load performance example, zero transactions are failing, which indicates that all the metrics gathered are valid for capture in a tuning comparison.

Figure 10. Transaction failure rate

• Database connection pools: In Tivoli Composite Application Manager, the use of database connection pools (see Figure 11) helps you determine usage and whether thresholds for the database are employed. If the transaction cannot access a data source or the request thread needs to wait on database availability, then this can directly affect response times and throughput.

Figure 11. Database connection pools

The agent that Tivoli Monitoring provides for system-level monitoring includes network activity to help determine, based on performance load tests, how much data is passing through the network (see Figure 12). In this example, the aggregate packets per second is 50. This value can increase based on the payload (XML request and response) of the applications and services.

Figure 12. Network activity

Other metrics include CPU load averages, which can include idle time. In this case, totals over 15 minutes average 100 percent. Also provided are the user nice CPU, system CPU, and I/O wait percentages. In the capture interval for Tivoli Monitoring, you can see 100 percent idle time in Figure 13. Tivoli Monitoring system monitoring agents also provide trend analysis to help you make better decisions over time.

Figure 13. Tivoli Monitoring CPU utilization

• The nmon analyzer: In certain cases, performance tuning may require more real-time CPU utilization data for the applications and services running on WebSphere Application Server. The nmon analyzer, an IBM freeware tool, provides this real-time data. In this example, the POWER7 LPAR has been configured for two physical processors (four logical CPUs), displayed in nmon as four processors (see Figure 14). Each logical CPU and its utilization is shown, and the total average is 50 percent. Keep in mind that the CPU utilization measurement is taken at time-based intervals and is never exact; therefore, be sure you use multiple tools and modes.

Figure 14. nmon CPU utilization

nmon also provides an l option to show the total average CPU utilization over a longer period of time. In this example, CPU utilization reaches 90 percent with recurring drops to 0 percent. This result is based on the client load performance tool sending 10 concurrent requests and waiting on blocked threads. You can see this in Figure 15, as well as in a JavaCore file and in the web container threads monitored by Tivoli Composite Application Manager.

Figure 15. CPU utilization in nmon with the l option

• top, topas, and prstat: On most UNIX® and Linux® systems, top, topas, or prstat is available to display utilization, including memory and CPU. In the example in Figure 16, the H option is set to display thread-level utilization. Each thread owned by the was user is a thread within WebSphere Application Server and, in some cases, is consuming 14 percent of the CPU. This consumption can be the result of several things: a web container thread processing the actual service request; DCS threads synchronizing with other members of a cluster; or other WebSphere Application Server-specific threads.

Figure 16. CPU utilization in top

• vmstat: In vmstat, you can see utilization as well as other metrics to determine system-level bottlenecks caused by the application. The typical CPU columns us, sys, id, and wa also appear in the other tools mentioned earlier.

If you're working with vmstat on Red Hat Linux running on POWER7, you'll also get the steal information in the far right column. This metric indicates the share of CPU time that the hypervisor gave to other work while this partition was ready to run, rather than CPU time consumed by user processes. In the example shown in Figure 17, although performance load is running, you can see a high level of steal and only 18–21 percent user CPU utilization for WebSphere Application Server. This result may indicate that a two-physical CPU LPAR configuration profile is too high for this application running on a single JVM because of context switching and processing of non-WebSphere Application Server-specific methods.

Figure 17. CPU utilization in vmstat

The processor data in the r column indicates the number of threads waiting to be processed. A high number indicates a thread bottleneck and many waiting threads. Because the load ran only briefly (the first four rows), there were no waiting threads.
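The vmstat columns discussed above (r, us, and the steal column) can be read programmatically when you capture output during a load run. The following is a minimal sketch, assuming Linux-style vmstat output whose column-name line starts with r and whose steal column is labeled st. The sample rows, the function names, and both thresholds (run queue above the logical CPU count, steal above 10 percent) are illustrative assumptions and are not taken from Figure 17.

```python
def parse_vmstat(text):
    """Parse vmstat output into a list of {column: int} rows."""
    rows, columns = [], None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "r":  # the column-name header line
            columns = fields
        elif (
            columns
            and len(fields) == len(columns)
            and all(f.lstrip("-").isdigit() for f in fields)
        ):
            rows.append(dict(zip(columns, map(int, fields))))
    return rows


def flag_rows(rows, cpus, steal_limit=10):
    """Return (row index, reason) pairs for samples worth a closer look."""
    flagged = []
    for i, row in enumerate(rows):
        if row["r"] > cpus:  # assumed threshold: more runnable threads than CPUs
            flagged.append((i, "run queue exceeds logical CPUs"))
        if row.get("st", 0) > steal_limit:  # assumed threshold, in percent
            flagged.append((i, "high steal"))
    return flagged


# Hypothetical capture; real output varies by platform and vmstat version.
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 512000  20000 300000    0    0     0     4  200  300 20  3 50  0 27
 6  0      0 511000  20000 300000    0    0     0     0  250  320 18  2 79  1  0
"""

if __name__ == "__main__":
    for index, reason in flag_rows(parse_vmstat(SAMPLE), cpus=4):
        print(f"row {index}: {reason}")
```

Because the parser maps values by the header names it finds, it tolerates platforms that omit the steal column; in that case only the run queue check applies.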

Load

Generating load is critical to successful application tuning. You must establish a baseline with a load tool before each tuning change. Figure 18 shows JMeter being used to generate WebSphere Application Server load. To initiate load, the requirements for JMeter are:

• A web service URL
• An XML request payload
• The number of concurrent users

Figure 18. Load testing in JMeter
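To make the three load inputs concrete (a web service URL, an XML request payload, and a number of concurrent users), the following is a minimal Python sketch of a load driver. It is not JMeter and is not the tool used in this article. The function names, the timeout, and the URL and payload in the usage comment are hypothetical.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def post_xml(url, payload):
    """Send one XML request over HTTP POST and return the response body."""
    request = urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read()


def run_load(url, payload, users, requests_per_user, send=post_xml):
    """Run `users` concurrent workers; return one (elapsed_seconds, ok) per request."""

    def worker(_user_index):
        results = []
        for _ in range(requests_per_user):
            start = time.perf_counter()
            try:
                send(url, payload)
                ok = True
            except Exception:
                ok = False  # count every failure so a failure rate can be derived
            results.append((time.perf_counter() - start, ok))
        return results

    with ThreadPoolExecutor(max_workers=users) as pool:
        batches = list(pool.map(worker, range(users)))
    return [item for batch in batches for item in batch]


# Example call (hypothetical endpoint and payload):
# results = run_load("http://localhost:9080/HamburgerService",
#                    "<order><item>hamburger</item></order>",
#                    users=10, requests_per_user=20)
```

The `send` parameter exists so the driver can be exercised without a server; the returned timings can then be compared against the baseline recorded before a tuning change.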

soapUI, shown in Figure 19, can also generate performance load and is especially useful for a single request whose response payload you need to inspect. In some cases, you may want simply to confirm that the transaction is successful and validate the response payload. In this example, you can see high response times at the beginning of the application transactions, the result of loading initial classes, data sources, and caching. Before running high volumes of concurrent users, it may be helpful to initiate a few single-thread transactions and view the payload before running load against the newly changed tuning parameter.

Figure 19. Service testing in soapUI

Diagnostics

Debugging problems in a development environment can be quick and efficient for a single application, but a multiple-service application can be challenging to troubleshoot. Tivoli Composite Application Manager for Application Diagnostics correlates Java 2 Platform, Enterprise Edition (J2EE) to J2EE and/or J2EE to WebSphere MQ transactions spanning multiple LPARs. This helps to identify the transaction flow and provide insights into the code to determine response times at the method level. As Figure 20 shows, the trace report of Tivoli Composite Application Manager for Application Diagnostics has identified methods in the application code with high response times. Methods 3, 7, 12, and 13 have been marked for double- and even triple-digit millisecond response times. Furthermore, you can drill down into each method to find an additional breakdown, if you need to know which section of the code requires a fix.

Figure 20. A transaction drill-down in Tivoli Composite Application Manager

Transaction execution paths are also available with Jinsight to identify the cause of high response times. The elapsed CPU time of each method is profiled, which quickly identifies the code responsible for execution problems (see Figure 21).

Figure 21. Xtrace method profiling

This figure shows the execution pattern for the HamburgerService servlet and which methods are causing high response times. To capture a single Jinsight transaction, begin by deploying Jinsight. To do so, add the libjinsight-pLinux.so file to the WebSphere classpath:

JVM Arguments
-agentlib:jinsight-pLinux=localFileName=/tmp/trace.trc,localID=10

Start Jinsight Trace
jinctl start 10

Stop Jinsight Trace
jinctl stop 10

Once a method has been identified with high response times and before you submit a request for an application or service patch, run Xtrace for that method to confirm that it's actually the problem. The benefit of running Xtrace is that it enables profiling for a single method rather than the complete application or service. Therefore, you can run load against the service and determine more accurate total response times, including the response times of a single method that has been identified as a suspect for high response times.

To implement Xtrace in WebSphere Application Server 7, add the following line to the JVM arguments and restart the application server to include the change:

-Xtrace:methods={com/ibmtools/*},print=mt

When you have set the methods in Xtrace, you will see in the native_err.out file each method entry invocation marked with > and each method exit marked with <. The method parameters and attributes are marked with -, although this may not provide as much value as the execution times of entry and exit methods.
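The entry and exit markers described above can be paired to estimate how long each traced method took. The following is a minimal sketch. The exact line layout is an assumption (a HH:MM:SS.mmm timestamp, an optional *, a hexadecimal thread ID, a trace point ID, the > or < marker, then the method name), so compare it with your own trace output and adjust the pattern before relying on it. The function name and the sample lines are hypothetical.

```python
import re
from collections import defaultdict

# Assumed layout: "10:02:42.281*0x9e900   mt.0   > pkg/Class.method()V ..."
LINE = re.compile(
    r"^\s*(\d\d):(\d\d):(\d\d)\.(\d+)\*?\s*(0x[0-9a-fA-F]+)\s+\S+\s+([<>])\s+(\S+)"
)


def method_times(lines):
    """Pair each '>' entry with its '<' exit per thread; return {method: [ms, ...]}."""
    stacks = defaultdict(list)     # thread ID -> stack of (method, entry time)
    durations = defaultdict(list)  # method -> elapsed milliseconds
    for line in lines:
        match = LINE.match(line)
        if not match:
            continue  # parameter lines ('-') and anything unrecognized
        hh, mm, ss, frac, thread, marker, method = match.groups()
        stamp = int(hh) * 3600 + int(mm) * 60 + int(ss) + float("0." + frac)
        if marker == ">":
            stacks[thread].append((method, stamp))
        elif stacks[thread] and stacks[thread][-1][0] == method:
            _, started = stacks[thread].pop()
            durations[method].append(round((stamp - started) * 1000, 3))
    return dict(durations)


if __name__ == "__main__":
    # Hypothetical trace lines for a method under com/ibmtools.
    sample = [
        "10:02:42.281*0x9e900      mt.0         > com/ibmtools/Grill.cook()V Bytecode method, This = 8dacc8",
        "10:02:42.296 0x9e900      mt.6         < com/ibmtools/Grill.cook()V Bytecode method",
    ]
    print(method_times(sample))
```

Keeping one stack per thread ID matters under load, because entries and exits from different web container threads interleave in the same file.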

Java garbage collection policy affects the performance of your application and by default is set to optthruput. This configuration keeps all objects in a single heap area and therefore has performance-impacting garbage collection cycles. To improve performance in many applications, use the gencon (generational concurrent) setting for the IBM JDK for garbage collection:

-Xgcpolicy:gencon -Xmnx124M -Xmns124M -Xmos900M

Because the generational concurrent garbage collection policy splits the heap into two sections, it may be useful to specify the sizes of these sections of the heap. One section is named nursery and contains objects leaving the heap in the next scavenger or global garbage collection cycle. The other section is named tenured and contains objects that leave only after global garbage collections.
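If you try several nursery and tenured sizes across tuning runs, it can help to generate the argument string rather than edit it by hand. The following is a minimal sketch that reproduces the argument line shown earlier. The function name and the optional max_heap_mb check are illustrative assumptions; the check simply guards against requesting sections larger than a maximum heap you specify.

```python
def gencon_args(nursery_mb, tenured_initial_mb, max_heap_mb=None):
    """Build IBM JDK arguments for a fixed-size nursery and an initial tenured size."""
    if nursery_mb <= 0 or tenured_initial_mb <= 0:
        raise ValueError("sizes must be positive")
    if max_heap_mb is not None and nursery_mb + tenured_initial_mb > max_heap_mb:
        raise ValueError("nursery plus tenured exceeds the maximum heap")
    return [
        "-Xgcpolicy:gencon",
        f"-Xmnx{nursery_mb}M",          # maximum nursery size
        f"-Xmns{nursery_mb}M",          # initial nursery size (same value fixes the size)
        f"-Xmos{tenured_initial_mb}M",  # initial tenured (old) space size
    ]


if __name__ == "__main__":
    print(" ".join(gencon_args(124, 900)))
```

Setting the initial and maximum nursery sizes to the same value, as the article's example does, keeps the nursery from resizing between load runs, which makes successive measurements easier to compare.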

The nursery is specified in the sample application, and you can see in the native_err.out file the scavenger garbage collection responses. In line 1 of Figure 22, the intervalms is 13072.241ms, which is exceptional and could possibly be a concern for nursery garbage collection. In high-volume scenarios with tighter intervals (for example, between 50 and 100ms), high CPU utilization for garbage collections is a concern. Tuning the sizes of the nursery and tenured heap sections is a must.
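The intervalms values can be pulled out of the garbage collection output to spot tight scavenge intervals. The following is a minimal sketch, assuming the verbose garbage collection output contains elements of the form <af type="nursery" ... intervalms="..."> as in IBM JDK 6-era logs. That element layout, the function names, and the sample log lines are assumptions. The 100ms limit follows the range mentioned above.

```python
import re

# Assumed element layout: <af type="nursery" id="1" timestamp="..." intervalms="13072.241">
INTERVAL = re.compile(r'<af type="nursery"[^>]*\bintervalms="([0-9.]+)"')


def nursery_intervals(log_text):
    """Return the time between nursery collections, in milliseconds."""
    return [float(value) for value in INTERVAL.findall(log_text)]


def tight_intervals(intervals_ms, limit_ms=100.0):
    """Return the intervals short enough to suggest heavy scavenging."""
    return [value for value in intervals_ms if value < limit_ms]


if __name__ == "__main__":
    # Hypothetical log excerpt.
    log = (
        '<af type="nursery" id="1" timestamp="Dec 21 10:00:00 2010" intervalms="13072.241">\n'
        '<af type="nursery" id="2" timestamp="Dec 21 10:00:01 2010" intervalms="62.5">\n'
    )
    values = nursery_intervals(log)
    print(values, tight_intervals(values))
```

A growing share of tight intervals after a configuration change is a hint to revisit the nursery size before drawing conclusions from response times alone.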

Figure 22. Output from a thread dump

JavaCore files indicate where threads are blocked or waiting and serve as the initial point of performance investigation. You can generate this file from Tivoli Composite Application Manager or, on Linux or UNIX, with kill -3 <pid>. Using Tivoli Composite Application Manager as a centralized performance management tool for JavaCore helps, because no additional remote connections or copies of files across the infrastructure are required.
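When you script a load run, the same signal can be sent from code at a chosen moment, for example while threads appear blocked. The following is a minimal sketch for Linux or UNIX only. The function name and the injectable send parameter are illustrative assumptions. Signal 3 is SIGQUIT, which is what kill -3 sends.

```python
import os
import signal


def request_javacore(pid, send=os.kill):
    """Send SIGQUIT (signal 3) to a JVM process, the equivalent of `kill -3 <pid>`."""
    if pid <= 0:
        # Zero and negative values address process groups, not a single JVM.
        raise ValueError("pid must be a positive process ID")
    send(pid, signal.SIGQUIT)


# Example call (hypothetical process ID of the application server JVM):
# request_javacore(12345)
```

The JVM keeps running after it receives the signal, so you can request several files a few seconds apart and compare which threads stay blocked.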

Summary

In this article, you learned about tuning a WebSphere Application Server 7 and POWER7 deployment running Open SCA services. Based on the list of methods and tools provided here, you can develop a process for improving response times, CPU utilization, and hardware cost. Financial transactions and web-based requests will improve even more with added features and SOA services.

© Copyright IBM Corporation 2010 (www.ibm.com/legal/copytrade.shtml)
Trademarks (www.ibm.com/developerworks/ibm/trademarks/)