Parallel Performance Tools in Visual Studio 2010
Hazim Shafi
Architect, Parallel Computing Platform
Microsoft
Outline
  Motivation
  Brief introduction to VS2010 parallel computing technologies
  Overview of performance tools
  Demo
Motivation (1 of 2)
  Multi-core processors are here
    Exploiting parallelism is becoming critical for performance
    Enables compute-intensive scenarios
  Developer Challenges
    Identify candidates for parallelization
    Performance tuning of parallel apps
    Reasoning about the performance of multithreaded applications is difficult
    This is not a new problem, but is exacerbated by the multi-core shift
Motivation (2 of 2)
  Profiling tools need enhancements
    Temporal relationships
    Interactions with OS, libraries, I/O
    Visualization can be valuable
    Focus on parallel execution
Primary Goals
  Improve productivity of parallel development and performance tuning
  Native and managed support
  Windows Vista, Windows Server 2008, Windows 7, and Windows Server 2008 R2
  32-bit and 64-bit
Visual Studio 2010
Tools / Programming Models / Runtimes
  [Diagram; key: Managed, Native]
  Tools: Parallel Debugger Toolwindows, Profiler Concurrency Analysis
  Programming Models: Parallel Pattern Library, Agents Library, Task Parallel Library, PLINQ, Data Structures
  Concurrency Runtime: Task Scheduler, Resource Manager
  Thread Pool: Task Scheduler, Resource Manager
  Operating System: Threads
Detailed Goals
  Identify opportunities for parallelism
  Expose causes of inefficiency
  Reach out to non-expert developers
  Provide actionable data by linking behavior to source code when possible
  IDE integration allows more efficient performance tuning
Key Features
  Concurrency analysis
  Thread blocking analysis
  Thread migration and core utilization
  Inter-thread dependencies
  Multi-process scenario support
Typical Uses
  Identifying opportunities for parallelism
    CPU-bound phases: speedup
    Synchronous I/O: hide latency
  Improving parallel performance
    Load balancing
    Synchronization
    NUMA cache effects, thread migrations (usually server scenarios)
    Application responsiveness (hiding latency)
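As an illustration of the "Synchronous I/O: hide latency" item, here is a minimal Python sketch of overlapping blocking reads. It is not taken from the talk: `fake_read` is a hypothetical stand-in that sleeps instead of touching a disk, and the worker count is an arbitrary assumption.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_read(name, delay=0.05):
    """Hypothetical blocking read: sleeps instead of doing real I/O."""
    time.sleep(delay)
    return f"contents of {name}"


def read_sequential(names):
    """Issue the blocking reads one after another (latencies add up)."""
    return [fake_read(name) for name in names]


def read_overlapped(names, workers=4):
    """Issue the blocking reads on a thread pool so their waits overlap."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the inputs.
        return list(pool.map(fake_read, names))
```

Both functions return the same results. The overlapped version simply spends less wall-clock time blocked, which is the kind of difference a thread blocking timeline is meant to make visible.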
Concurrency View (1 of 2)
  Purpose:
    Learn or confirm the degree of concurrency in a particular scenario
    Tune or determine opportunities for parallelism
    Understand the degree of CPU contention with other processes
    Use as an entry point into more detailed analysis
Concurrency View (2 of 2)
  How?
    Core utilization for process over time
    Number of idle cores
    Number of cores used by “System” process
    Number of cores used by “other” processes
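The breakdown above can be sketched as a small calculation over execution intervals. This Python example is an illustrative assumption, not the profiler's implementation: the `(start, end, category)` tuple format, the category names `target`, `system`, and `other`, and the function name `core_usage` are all hypothetical.

```python
def core_usage(intervals, num_cores, t_start, t_end, bucket):
    """Average number of cores used per category in each time bucket.

    intervals: iterable of (start, end, category) tuples, one per stretch
    of time during which some thread ran on some core. The category is
    "target", "system", or "other" (hypothetical names).
    """
    n_buckets = (t_end - t_start + bucket - 1) // bucket
    rows = [{"target": 0.0, "system": 0.0, "other": 0.0} for _ in range(n_buckets)]
    for start, end, category in intervals:
        for i in range(n_buckets):
            b_lo = t_start + i * bucket
            b_hi = min(b_lo + bucket, t_end)
            overlap = min(end, b_hi) - max(start, b_lo)
            if overlap > 0:
                # Fraction of this bucket during which one core was busy.
                rows[i][category] += overlap / (b_hi - b_lo)
    for row in rows:
        # Whatever is not accounted for is idle core time.
        row["idle"] = num_cores - sum(row.values())
    return rows


# Two cores over 20 ms, in 10 ms buckets.
# Core 0 runs the target process throughout; core 1 is shared.
trace = [
    (0, 20, "target"),
    (0, 5, "other"),
    (10, 15, "target"),
    (15, 20, "system"),
]
rows = core_usage(trace, num_cores=2, t_start=0, t_end=20, bucket=10)
# rows[0] -> target 1.0, other 0.5, system 0.0, idle 0.5
# rows[1] -> target 1.5, system 0.5, other 0.0, idle 0.0
```

Plotting these rows as a stacked area chart over time gives the kind of view the slide describes.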
Thread Blocking Analysis (1 of 2)
  Purpose:
    Assist developers in understanding the causes of thread blocking events
    Provide actionable information to allow users to act upon some problems
    Aggregate costs of blocking call stacks
Thread Blocking Analysis (2 of 2)
  How?
    Threads as swim lanes (channels) in a timeline
    Map thread state to a category
    Disk accesses with associated files and delays
    For each blocking event
      Display root cause when possible with call stack and delay duration
    Provide various summary reports/stats
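Aggregating the cost of blocking call stacks, as listed in the purpose of this view, can be sketched as a group-by over blocking events. The event tuple layout, the category labels, and the function names in the sample stacks are hypothetical and chosen only for illustration; the actual tool's data model is not described in these slides.

```python
from collections import defaultdict


def aggregate_blocking(events):
    """Sum blocking time per (category, call stack), costliest first.

    events: iterable of (thread_id, category, stack, duration_ms), where
    stack is a tuple of frame names from outermost to innermost.
    """
    totals = defaultdict(lambda: [0, 0])  # key -> [total_ms, event_count]
    for _thread_id, category, stack, duration_ms in events:
        entry = totals[(category, stack)]
        entry[0] += duration_ms
        entry[1] += 1
    report = [
        (category, stack, total_ms, count)
        for (category, stack), (total_ms, count) in totals.items()
    ]
    report.sort(key=lambda row: row[2], reverse=True)
    return report


# Hypothetical blocking events from three threads.
events = [
    (1, "synchronization", ("main", "update", "lock_acquire"), 40),
    (2, "synchronization", ("main", "update", "lock_acquire"), 25),
    (3, "io", ("main", "load", "read_file"), 30),
    (1, "io", ("main", "load", "read_file"), 10),
]
report = aggregate_blocking(events)
# The lock_acquire stack comes first: 65 ms across 2 events.
```

A summary report of this shape points at the call stacks worth investigating first, which is what makes the data actionable.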
Thread Migration (1 of 2)
  Purpose:
    Inform user when threads exhibit a large number of thread migrations across cores
    Verify thread affinity impact on actual execution
    Identify threads contending for a core while other cores are available
    Allow users to identify a region of interest:
      Threads exhibiting poor behavior
      Region in time
    Use the thread blocking view to drill into root causes
Thread Migration (2 of 2)
  How?
    Cores as swim lanes in a timeline view
    Associate a color per thread on the timeline
    Show how threads execute on cores
    Display statistics
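One statistic such a view could display is the number of cross-core migrations per thread. The sketch below assumes a hypothetical trace of `(time, thread_id, core)` records, one per occasion a thread starts running on a core. It illustrates the idea and does not describe how the tool computes its statistics.

```python
def count_migrations(run_events):
    """Count, per thread, how often it resumes on a different core.

    run_events: iterable of (time, thread_id, core) tuples, one for each
    time a thread is scheduled onto a core (hypothetical trace format).
    """
    last_core = {}
    migrations = {}
    for _time, thread_id, core in sorted(run_events):
        migrations.setdefault(thread_id, 0)
        if thread_id in last_core and last_core[thread_id] != core:
            migrations[thread_id] += 1
        last_core[thread_id] = core
    return migrations


# Thread 1 bounces between cores 0 and 1; thread 2 stays on core 1.
trace = [(0, 1, 0), (5, 2, 1), (10, 1, 1), (15, 1, 0), (20, 2, 1)]
stats = count_migrations(trace)
# stats == {1: 2, 2: 0}
```

A thread with a high count relative to its run time is a candidate for the region-of-interest workflow described on the previous slide.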
Inter-Thread Dependencies (1 of 2)
  Purpose:
    Allow developers to understand inter-thread dependencies by exposing “blocker” to “blocked” thread relationships
    Provide actionable data to correlate dependencies to code involved
    Sometimes, the dependency is a long chain
Inter-Thread Dependencies (2 of 2)
  How?
    Modified thread blocking view, with “unblocking” events displayed and lines connecting dependent threads
    Users can follow a dependence chain using the timeline user interface
    Call stack of blocking thread shown in selection tab when available
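Following a dependence chain amounts to walking "blocked" to "blocker" links until reaching a thread that is not itself blocked. The Python sketch below assumes a hypothetical `blocked_on` mapping captured at one instant; the thread names and the function name are illustrative only.

```python
def dependency_chain(blocked_on, start):
    """Follow blocker links from `start` until an unblocked thread.

    blocked_on: dict mapping each blocked thread to the thread that will
    unblock it (hypothetical snapshot at one point in time).
    Returns the list of threads visited. If the walk returns to a thread
    already seen, that thread is appended once more and the walk stops,
    which indicates a cycle such as a deadlock.
    """
    chain = [start]
    seen = {start}
    current = start
    while current in blocked_on:
        current = blocked_on[current]
        chain.append(current)
        if current in seen:
            break  # cycle detected
        seen.add(current)
    return chain


# The UI thread waits on Worker1, which in turn waits on Worker2.
snapshot = {"UI": "Worker1", "Worker1": "Worker2"}
chain = dependency_chain(snapshot, "UI")
# chain == ["UI", "Worker1", "Worker2"]
```

The last element of the chain is the thread whose work is actually holding everything else up, so its call stack is usually the one to inspect.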
VS2010 Parallel Performance Analysis
Demo
Simple Usage Scenario
Call to Action
  We need your feedback on tools
  Download Visual Studio Team System Beta 1
    http://go.microsoft.com/fwlink/?LinkId=147407
  Resources:
    http://www.msdn.com/concurrency
    http://channel9.msdn.com/tags/pdc2008.parallelism
  Contact:
    http://blogs.msdn.com/hshafi
© 2009 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.