speedup for multi-level parallel computing
TRANSCRIPT
![Page 1: Speedup for Multi-Level Parallel Computing](https://reader031.vdocuments.mx/reader031/viewer/2022012217/61dfc754e479cf577d41dd18/html5/thumbnails/1.jpg)
Speedup for Multi-Level Parallel Computing
School of Computer Engineering
Nanyang Technological University
21st May 2012
Shanjiang Tang, Bu-Sung Lee, Bingsheng He
![Page 2: Speedup for Multi-Level Parallel Computing](https://reader031.vdocuments.mx/reader031/viewer/2022012217/61dfc754e479cf577d41dd18/html5/thumbnails/2.jpg)
OutLine
• Background & Motivation • Multi-Level Parallel Speedup • Evaluation • Conclusion
![Page 3: Speedup for Multi-Level Parallel Computing](https://reader031.vdocuments.mx/reader031/viewer/2022012217/61dfc754e479cf577d41dd18/html5/thumbnails/3.jpg)
Multi-Level Computing Architecture and Paradigm
![Page 4: Speedup for Multi-Level Parallel Computing](https://reader031.vdocuments.mx/reader031/viewer/2022012217/61dfc754e479cf577d41dd18/html5/thumbnails/4.jpg)
Multi-Level Computing Architecture and Paradigm
• MPI+OpenMP • MPI+CUDA • MPI+OpenMP+CUDA …..
![Page 5: Speedup for Multi-Level Parallel Computing](https://reader031.vdocuments.mx/reader031/viewer/2022012217/61dfc754e479cf577d41dd18/html5/thumbnails/5.jpg)
Multi-Level Parallel Computing Model
L Lm L3 L2 L1
LLLLLLLL
Notes: Sequential Part Parallel Part
PE2,
2
PE1,
1 PE2,
1
PE3,
1
PE3,
2
PE3,
3
PE3,
4
PE3,
5
PE3,
6
PE3,
7
PE3,
8
![Page 6: Speedup for Multi-Level Parallel Computing](https://reader031.vdocuments.mx/reader031/viewer/2022012217/61dfc754e479cf577d41dd18/html5/thumbnails/6.jpg)
Parallel Speedup
• Definition • Classification
Ø Absolute Speedup
Ø Relative Speedup
SequentialExecutionTimeSpeedupParallelExecutionTime
=
BestSequentialALGExecutionTimeSpeedupParallelALGExecutionTime
=
ParallelALGSequentialExecutionTimeSpeedupParallelALGExecutionTime
=
![Page 7: Speedup for Multi-Level Parallel Computing](https://reader031.vdocuments.mx/reader031/viewer/2022012217/61dfc754e479cf577d41dd18/html5/thumbnails/7.jpg)
Relative Speedup Model
• Fixed-size Speedup
Ø Amdahl’s Law
where is parallel fraction workload of the program, is the
number of processors.
• Fixed-time Speedup
Ø Gustafson’s Law
pmeparallelTiTimesequentialSpeedup
αα +−
==1
1
ppmeparallelTiTimesequentialSpeedup αα
αααα
+−=+−
+−== 111
α p
![Page 8: Speedup for Multi-Level Parallel Computing](https://reader031.vdocuments.mx/reader031/viewer/2022012217/61dfc754e479cf577d41dd18/html5/thumbnails/8.jpg)
Motivation Example—NAS Benchmark (MPI+OpenMP)
![Page 9: Speedup for Multi-Level Parallel Computing](https://reader031.vdocuments.mx/reader031/viewer/2022012217/61dfc754e479cf577d41dd18/html5/thumbnails/9.jpg)
Motivation Example—NAS Benchmark (MPI+OpenMP)
Amdahl’s Law is UNSUITABLE for Multi-Level Parallel Computing
![Page 10: Speedup for Multi-Level Parallel Computing](https://reader031.vdocuments.mx/reader031/viewer/2022012217/61dfc754e479cf577d41dd18/html5/thumbnails/10.jpg)
OutLine
• Background & Motivation • Multi-Level Parallel Speedup • Evaluation • Conclusion
![Page 11: Speedup for Multi-Level Parallel Computing](https://reader031.vdocuments.mx/reader031/viewer/2022012217/61dfc754e479cf577d41dd18/html5/thumbnails/11.jpg)
E-Amdahl’s Law
• Awareness of Different Grained-Level Parallelism
L Lm L3 L2 L1
LLLLLLLL
Notes:
Sequential Part
Parallel Part
PE2,
2
PE1,
1 PE2,
1
PE3,
1
PE3,
2
PE3,
3
PE3,
4
PE3,
5
PE3,
6
PE3,
7
PE3,
8
⎪⎪⎪⎪
⎩
⎪⎪⎪⎪
⎨
⎧
<≤
++−
=+−
=
)1(
)1()()()(1
1
)(
)()()(1
1
)(
mi
ispipifif
mi
mpmfmf
isp
![Page 12: Speedup for Multi-Level Parallel Computing](https://reader031.vdocuments.mx/reader031/viewer/2022012217/61dfc754e479cf577d41dd18/html5/thumbnails/12.jpg)
E-Amdahl’s Law
• Two-Level Parallelism Speedup Model (MPI+OpenMP)
where is the parallel fraction of coarse-grained (MPI-level) parallelism. is the parallel fraction of fine-grained (OpenMP-level) parallelism. is the number of processes spawned. is the number of threads spawned per process.
pt
pt
tpsp)1(
1
1),,,(β
βαα
βα+−
+−
=
αβ
![Page 13: Speedup for Multi-Level Parallel Computing](https://reader031.vdocuments.mx/reader031/viewer/2022012217/61dfc754e479cf577d41dd18/html5/thumbnails/13.jpg)
E-Gustafson’s Law
• Awareness of Different Grained-Level Parallelism
L Lm L3 L2 L1
LLLLLLLL
Notes:
Sequential Part
Parallel Part
PE2,
2
PE1,
1 PE2,
1
PE3,
1
PE3,
2
PE3,
3
PE3,
4
PE3,
5
PE3,
6
PE3,
7
PE3,
8
⎩⎨⎧
<≤++−
=+−=
)1()1()()()(1)()()()(1
)(miispipififmimpmfmf
isp
![Page 14: Speedup for Multi-Level Parallel Computing](https://reader031.vdocuments.mx/reader031/viewer/2022012217/61dfc754e479cf577d41dd18/html5/thumbnails/14.jpg)
OutLine
• Background & Motivation • Multi-Level Parallel Speedup • Evaluation • Conclusion
![Page 15: Speedup for Multi-Level Parallel Computing](https://reader031.vdocuments.mx/reader031/viewer/2022012217/61dfc754e479cf577d41dd18/html5/thumbnails/15.jpg)
Experiment Setup
• Platform and Configuration
Ø A linux cluster consisting of eight computing nodes each with two quad-core chips Ø Configuration: One thread per CPU core
• Benchmarks
NAS Parallel Benchmark (NPB) Multi-Zone (MZ) Version: Ø BT-MZ (Unbalanced Workload Partitioning) Ø SP-MZ (balanced Workload Partitioning) Ø LU-MZ (balanced Workload Partitioning)
![Page 16: Speedup for Multi-Level Parallel Computing](https://reader031.vdocuments.mx/reader031/viewer/2022012217/61dfc754e479cf577d41dd18/html5/thumbnails/16.jpg)
Performance Prediction
![Page 17: Speedup for Multi-Level Parallel Computing](https://reader031.vdocuments.mx/reader031/viewer/2022012217/61dfc754e479cf577d41dd18/html5/thumbnails/17.jpg)
Prediction Result Comparison
![Page 18: Speedup for Multi-Level Parallel Computing](https://reader031.vdocuments.mx/reader031/viewer/2022012217/61dfc754e479cf577d41dd18/html5/thumbnails/18.jpg)
OutLine
• Background & Motivation • Multi-Level Parallel Speedup • Evaluation • Conclusion
![Page 19: Speedup for Multi-Level Parallel Computing](https://reader031.vdocuments.mx/reader031/viewer/2022012217/61dfc754e479cf577d41dd18/html5/thumbnails/19.jpg)
Conclusion
• Traditional speedup models are unsuitable for multi-level parallelism
– Unable to be awareness of different granularities of parallelism for multi-level parallel computing.
• Multi-level Parallelism Model
– A guidance model for multi-level optimization. – A prediction model for multi-level parallelism.
![Page 20: Speedup for Multi-Level Parallel Computing](https://reader031.vdocuments.mx/reader031/viewer/2022012217/61dfc754e479cf577d41dd18/html5/thumbnails/20.jpg)
![Page 21: Speedup for Multi-Level Parallel Computing](https://reader031.vdocuments.mx/reader031/viewer/2022012217/61dfc754e479cf577d41dd18/html5/thumbnails/21.jpg)
Argument Estimation
![Page 22: Speedup for Multi-Level Parallel Computing](https://reader031.vdocuments.mx/reader031/viewer/2022012217/61dfc754e479cf577d41dd18/html5/thumbnails/22.jpg)
Speedup Under E-Amdahl’s Law
![Page 23: Speedup for Multi-Level Parallel Computing](https://reader031.vdocuments.mx/reader031/viewer/2022012217/61dfc754e479cf577d41dd18/html5/thumbnails/23.jpg)
Speedup Under E-Gustafson’s Law