Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II

KAUST Supercomputing Laboratory. Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II. George Markomanolis, Computational Scientist. June 23rd, 2016

Upload: george-markomanolis

Post on 15-Feb-2017


Page 1: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II


Page 2: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II

Outline

KAUST King Abdullah University of Science and Technology 2

❖  Introduction

❖  Test case

❖  Reveal

Page 3: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II

Introduction - Components of CrayPat

❖  Module perftools-base

•  pat_build – Instruments the program to be analyzed
•  pat_report – Generates text reports from the performance data captured during program execution and exports data for use in other programs
•  Cray Apprentice2 – A graphical analysis tool that can be used to visualize and explore the performance data captured during program execution
•  Reveal – A graphical source code analysis tool that can be used to correlate performance analysis data with annotated source code listings, to identify key opportunities for optimization (it works only with the Cray compiler)

Page 4: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II

Case study

❖  Application from the seismic group: an acoustic wave solver

•  Why this application? A user asked for it
•  MPI application
•  Tested on 3 nodes with a total of 96 cores on Shaheen II

Page 5: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II

Prepare for the tutorial

•  Connect to Shaheen II and copy the material:

•  ssh -X [email protected]

•  cp /scratch/tmp/model_reveal.tgz .

•  tar zxvf model_reveal.tgz

•  cd model_reveal

•  Reservation name: k1056_141

Page 6: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II

Reveal

A tool to port your application to the OpenMP programming model

Page 7: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II

Reveal

❖  Reveal is Cray's next-generation integrated performance analysis and code optimization tool.

•  Source code navigation using whole-program analysis (data provided by the Cray compilation environment only)
•  Coupling with performance data collected during execution by CrayPat, to understand which high-level serial loops could benefit from parallelism
•  Enhanced loop mark listing functionality
•  Dependency information for targeted loops
•  Helps users optimize code by providing variable scoping feedback and suggested compiler directives

Page 8: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II

Prepare for Reveal

❖  Load Perftools
•  module unload darshan
•  module load perftools-base/6.3.2
•  module load perftools/6.3.2

❖  Execute the MPI version
•  cd model_reveal
•  make clean
•  make
•  In the submit.sh file, change the account number to your own and submit the job
§  sbatch submit.sh
•  tail -n 10 testdata.XXX.err
§  1m46.361s (execution time of the MPI version)

Reservation: k1056_141

Page 9: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II

Prepare the application for Reveal

❖  Compile the version for the Reveal tool
•  make clean -f Makefile_reveal
•  In the Makefile_reveal file:
§  $(CC) -h profile_generate -hpl=data.pl -h noomp $< -o $@ $(CFLAGS)
§  ${CC} -h profile_generate -hpl=data.pl -h noomp -c $< CrayData.c
§  Reveal needs the object files, so modify the Makefile if needed
•  make -f Makefile_reveal
•  The data.pl folder is created in the current folder
•  Instrument your application
§  pat_build -w CrayData.exe
§  The new executable is called CrayData.exe+pat; use it in submit.sh in place of the original executable

Page 10: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II

Submit the job for the Reveal tool

❖  Submit your job script and do not forget the reservation name (--reservation=…)
•  sbatch submit.sh

❖  A performance file (extension .xf) is created; if not, something went wrong in the previous steps

❖  Generate the report and the ap2 file
•  pat_report -o report.txt CrayData.exe+pat+58072-37t.xf

❖  Execute Reveal
•  reveal data.pl CrayData.exe+pat+58072-37t.ap2

Page 11: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II

Reveal – Loop Performance

Page 12: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II

Reveal – Scoping

Page 13: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II

Reveal – Program View

Page 14: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II

Reveal – Function View

Page 15: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II

Reveal – Array View

Page 16: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II

Reveal – Compiler Messages

Page 17: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II

Reveal – Loop Performance

Page 18: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II

Reveal – Scoping Tool

Page 19: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II

Reveal – Scoping Results

Page 20: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II

Reveal – OpenMP pragmas

Page 21: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II

Reveal – Inserted OpenMP pragmas

Page 22: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II

Clean the code from unresolved issues and observe OpenMP pragmas

❖  vim CrayData.c
❖  Remove the lines with unresolved, but only if you are sure.

#pragma omp parallel for default(none) \
        private (i1,i2,u) \
        shared (nxpad,nzpad)

#pragma omp parallel for default(none) \
        private (ix,ib,ibz) \
        shared (nxpad,nb,nzpad,bndr,p0) \
        lastprivate (w)

Page 23: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II

Check an OpenMP pragma and its validation

#pragma omp parallel for default(none) private (ix,ib,ibz) \
        shared (nxpad,nb,nzpad,bndr,p0) \
        lastprivate (w)
for(ix=0; ix<nxpad; ix++) {
    for(ib=0; ib<nb; ib++) {
        w = bndr[nb-ib-1];
        ibz = nzpad-ib-1;
        p0[ix][ib ] *= w; /* top sponge */
        p0[ix][ibz] *= w; /* bottom sponge */
    }
}
for(ib=0; ib<nb; ib++) {
    ibx = nxpad-ib-1;
    for(iz=0; iz<nzpad; iz++) {
        p0[ib ][iz] *= w; /* left sponge */
        p0[ibx][iz] *= w; /* right sponge */
    }
}
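The scoping in this pragma can be checked by hand. The sketch below wraps the same loops in a self-contained function so the reasoning is easy to follow. The function name apply_sponge, the flat row-major storage of p0, and the return value are assumptions made for illustration; they are not taken from CrayData.c.

```c
#include <assert.h>
#include <stddef.h>

/* Damp the edges of an nxpad x nzpad wavefield. Element [ix][iz] is stored
 * at p0[ix*nzpad + iz] (flat layout assumed for this sketch only).
 * bndr holds nb sponge weights. Returns the value of w after the loops. */
float apply_sponge(float *p0, const float *bndr, int nxpad, int nzpad, int nb)
{
    int ix, iz, ib, ibx, ibz;
    float w = 1.0f;

    assert(nb > 0 && nb <= nxpad && nb <= nzpad);

    /* ib and ibz are overwritten in every iteration, so each thread needs
     * its own copy (private). w is overwritten too, but it is read again
     * after the loop, so it must be lastprivate: the value from the
     * sequentially last iteration is copied back to the original w. */
#pragma omp parallel for default(none) private(ix, ib, ibz) \
        shared(nxpad, nb, nzpad, bndr, p0) lastprivate(w)
    for (ix = 0; ix < nxpad; ix++) {
        for (ib = 0; ib < nb; ib++) {
            w = bndr[nb - ib - 1];
            ibz = nzpad - ib - 1;
            p0[(size_t)ix * nzpad + ib]  *= w;  /* top sponge */
            p0[(size_t)ix * nzpad + ibz] *= w;  /* bottom sponge */
        }
    }

    /* Serial loops: w still holds bndr[0], set by the last iteration
     * (ix = nxpad-1, ib = nb-1) of the parallel loop. */
    for (ib = 0; ib < nb; ib++) {
        ibx = nxpad - ib - 1;
        for (iz = 0; iz < nzpad; iz++) {
            p0[(size_t)ib * nzpad + iz]  *= w;  /* left sponge */
            p0[(size_t)ibx * nzpad + iz] *= w;  /* right sponge */
        }
    }
    return w;
}
```

With private (w) instead of lastprivate (w), the value of w seen by the serial loops would be unspecified, so the left and right sponges could differ from the serial result. Built without OpenMP support, the pragma is ignored and the function gives the serial reference result.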

Page 24: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II

Clean the code from unresolved issues, compile and run

❖  vim CrayData.c
❖  Remove the lines with unresolved if you are sure.
❖  Compile your application with MPI and OpenMP
•  make -f Makefile_omp
•  The new executable is called CrayData_omp.exe
•  Comment out the active srun line in submit.sh and uncomment the next srun call
•  Also uncomment the line with OMP_NUM_THREADS=2
•  Now we will execute the application with 48 MPI processes (ntasks) and 2 threads per MPI process (cpus-per-task)
•  srun --ntasks=48 --ntasks-per-node=16 --ntasks-per-socket=8 --hint=nomultithread --cpus-per-task=2 ./CrayData_omp.exe
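OMP_NUM_THREADS and --cpus-per-task have to agree, otherwise threads share cores or cores stay idle. A minimal sketch of such a consistency check follows. The function names are hypothetical, and it assumes the job environment exposes the per-task core count in the SLURM_CPUS_PER_TASK variable, which Slurm sets when --cpus-per-task is given.

```c
#include <stdlib.h>

/* Parse a positive decimal integer.
 * Returns -1 for NULL, empty, or malformed input. */
static long parse_positive(const char *s)
{
    char *end;
    long v;

    if (s == NULL || *s == '\0')
        return -1;
    v = strtol(s, &end, 10);
    return (*end == '\0' && v > 0) ? v : -1;
}

/* Return 1 when the OpenMP thread count equals the number of cores
 * reserved per MPI process, 0 otherwise (including unset values). */
int threads_match_cpus(const char *omp_num_threads, const char *cpus_per_task)
{
    long threads = parse_positive(omp_num_threads);
    long cpus = parse_positive(cpus_per_task);

    return threads > 0 && threads == cpus;
}
```

Inside the application it could be called as threads_match_cpus(getenv("OMP_NUM_THREADS"), getenv("SLURM_CPUS_PER_TASK")) before the first parallel region.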

Page 25: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II

Different cases and results

❖  Results for 2 threads
•  Change accordingly:
§  export OMP_NUM_THREADS=2
§  srun --ntasks=48 --ntasks-per-node=16 --ntasks-per-socket=8 --hint=nomultithread --cpus-per-task=2 ./CrayData_omp.exe
•  51.211s (2.08X relative to the MPI run of 1m46.361s = 106.361s)

❖  Results for 4 threads
•  Change accordingly:
§  export OMP_NUM_THREADS=4
§  srun --ntasks=24 --ntasks-per-node=8 --ntasks-per-socket=4 --hint=nomultithread --cpus-per-task=4 ./CrayData_omp.exe
•  24.815s (4.29X)

Page 26: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II

Different cases and results

❖  Results for 8 threads
•  12.222s (8.70X relative to the 106.361s MPI run)

❖  Results for 16 threads
•  Change accordingly:
§  export OMP_NUM_THREADS=16
§  srun --ntasks=6 --ntasks-per-node=2 --ntasks-per-socket=1 --hint=nomultithread --cpus-per-task=16 ./CrayData_omp.exe
•  8.895s (16 threads per MPI process; 11.96X)
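All the srun lines follow one rule: the number of MPI processes times the threads per process stays at 96 cores. A small sketch of that arithmetic is shown below. It assumes 3 nodes with 2 sockets of 16 cores each (consistent with the srun options above) and that each MPI process with its threads stays inside one socket. The type and function names are hypothetical.

```c
#include <assert.h>

/* Node geometry assumed for the runs: 3 nodes x 2 sockets x 16 cores = 96. */
enum { NODES = 3, SOCKETS_PER_NODE = 2, CORES_PER_SOCKET = 16 };

/* The four srun options that change with the thread count. */
typedef struct {
    int ntasks;
    int ntasks_per_node;
    int ntasks_per_socket;
    int cpus_per_task;
} hybrid_layout;

/* Derive the srun options for a given number of OpenMP threads per
 * MPI process, keeping every core busy with exactly one thread. */
hybrid_layout layout_for_threads(int threads)
{
    hybrid_layout l;

    /* A process and its threads must fill a socket evenly. */
    assert(threads > 0 && CORES_PER_SOCKET % threads == 0);

    l.cpus_per_task = threads;
    l.ntasks_per_socket = CORES_PER_SOCKET / threads;
    l.ntasks_per_node = l.ntasks_per_socket * SOCKETS_PER_NODE;
    l.ntasks = l.ntasks_per_node * NODES;
    return l;
}
```

For example, layout_for_threads(4) gives ntasks=24, ntasks-per-node=8, ntasks-per-socket=4, matching the 4-thread srun line.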

Page 27: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II

The original version was improved 19.19 times

[Bar chart: Execution time, Time (in sec.)]
Original version: 170.67
Optimized MPI version: 106.36
MPI+OpenMP: 8.895
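The timings in this tutorial appear both in minutes-and-seconds form (1m46.361s, the format printed by the shell's time command) and in plain seconds. A minimal sketch of the conversion and of the speedup ratio follows. Both function names are hypothetical helpers, not part of the tutorial material.

```c
#include <stdio.h>

/* Convert a duration written as "<minutes>m<seconds>s", for example
 * "1m46.361s", to seconds. Returns -1.0 when the text does not match. */
double duration_to_seconds(const char *text)
{
    int minutes;
    double seconds;
    char unit;

    if (sscanf(text, "%dm%lf%c", &minutes, &seconds, &unit) != 3 || unit != 's')
        return -1.0;
    return 60.0 * minutes + seconds;
}

/* Speedup of a new run over a baseline run, both in seconds. */
double speedup(double baseline_seconds, double new_seconds)
{
    return baseline_seconds / new_seconds;
}
```

Note that 1m46.361s is 106.361 s, the value plotted for the optimized MPI version. Dividing the original version's 170.67 s by the 8.895 s of the hybrid run gives the 19.19 of the title.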

Page 28: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II

Validation

[Panels: Original version, Optimized MPI+OpenMP]

Page 29: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II

Summary

❖  Reveal is an easy-to-use tool

❖  The user should be careful, though, and pay attention to the compiler messages

❖  You can achieve a great speedup with this tool

❖  We need to investigate more complicated applications

Page 30: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II

KAUST Supercomputing Laboratory
