Performance Analysis Tools for High-Performance Computing
Daniel Becker
25-03-2010
German Research School for Simulation Sciences
• Joint venture between Forschungszentrum Jülich and RWTH Aachen University
– Research and education in the simulation sciences
– International Master’s program
– Ph.D. program
Jülich Supercomputing Centre
Research in
• Computational Science
• Computer Science
• Mathematics
Jülich BG/P 294,912 cores
Jülich Nehalem Cluster 26,304 cores
Scalasca
• Scalable performance-analysis toolset for parallel codes
• Integrated performance analysis process
– Performance overview on call-path level via runtime summarization
– In-depth study of application behavior via event tracing
– Switching between both options without recompilation or relinking
• Supported programming models
– MPI-1, MPI-2 one-sided communication
– OpenMP (basic features)
• Available under the New BSD license
– http://www.scalasca.org/
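The difference between runtime summarization and event tracing can be illustrated with a toy measurement layer. This is only a sketch under stated assumptions, not Scalasca's implementation or API: the names `Measurement`, `enter`, `exit` and `run_app` are hypothetical. The instrumented application calls the same hooks in both modes, and a runtime setting decides whether the hooks aggregate time per call path or record individual timestamped events.

```python
import time
from collections import defaultdict


class Measurement:
    """Toy measurement layer (hypothetical, not Scalasca's API)."""

    def __init__(self, mode="summary", clock=time.perf_counter):
        if mode not in ("summary", "trace"):
            raise ValueError("mode must be 'summary' or 'trace'")
        self.mode = mode
        self.clock = clock
        self.stack = []   # open regions as (name, enter_time)
        # Summary mode: call path -> visit count and inclusive time.
        self.profile = defaultdict(lambda: {"visits": 0, "time": 0.0})
        # Trace mode: chronological list of (timestamp, kind, region).
        self.trace = []

    def enter(self, region):
        now = self.clock()
        self.stack.append((region, now))
        if self.mode == "trace":
            self.trace.append((now, "ENTER", region))

    def exit(self):
        now = self.clock()
        # The call path is the chain of regions that are currently open.
        path = tuple(name for name, _ in self.stack)
        region, start = self.stack.pop()
        if self.mode == "trace":
            self.trace.append((now, "EXIT", region))
        else:
            entry = self.profile[path]
            entry["visits"] += 1
            entry["time"] += now - start


def run_app(m):
    """Stand-in for an instrumented application; identical in both modes."""
    m.enter("main")
    for _ in range(2):
        m.enter("solve")
        m.enter("MPI_Send")
        m.exit()
        m.exit()
    m.exit()


if __name__ == "__main__":
    summary = Measurement(mode="summary")
    run_app(summary)
    for path, data in summary.profile.items():
        print("/".join(path), data["visits"], data["time"])

    tracing = Measurement(mode="trace")
    run_app(tracing)
    print(len(tracing.trace), "events recorded")
```

In summary mode the memory footprint grows with the number of distinct call paths, while in trace mode it grows with the number of events. This is one reason to start with the overview and trace only when an in-depth study is needed.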
Research Challenges
• Scalability
– Collection and representation of necessary runtime information
– Analysis and visualization of performance behavior
• Analysis of asynchronous tasks
– Examples: OpenMP 3.0, StarSs, CUDA, OpenCL, ...
– Threads and tasks - different dimensions of parallelism
– Identification of performance properties
– Representation of asynchronous performance data with respect to call-path profile data
– Measurement, analysis and result presentation
– Integration with current analysis approaches