Portable MPI and Related Parallel Development Tools Rusty Lusk Mathematics and Computer Science Division Argonne National Laboratory (The rest of our group: Bill Gropp, Rob Ross, David Ashton, Brian Toonen, Anthony Chan)


Page 1: Portable MPI and Related Parallel Development Tools

Rusty Lusk
Mathematics and Computer Science Division
Argonne National Laboratory

(The rest of our group: Bill Gropp, Rob Ross, David Ashton, Brian Toonen, Anthony Chan)

Page 2: Outline

• MPI
  – What is it?
  – Where did it come from?
  – One implementation
  – Why has it “succeeded”?

• Case study: an MPI application
  – Portability
  – Libraries
  – Tools

• Future developments in parallel programming
  – MPI development
  – Languages
  – Speculative approaches

Page 3: What is MPI?

• A message-passing library specification
  – extended message-passing model
  – not a language or compiler specification
  – not a specific implementation or product

• For parallel computers, clusters, and heterogeneous networks

• Full-featured

• Designed to provide access to advanced parallel hardware for
  – end users
  – library writers
  – tool developers

Page 4: Where Did MPI Come From?

• Early vendor systems (NX, EUI, CMMD) were not portable.

• Early portable systems (PVM, p4, TCGMSG, Chameleon) were mainly research efforts.
  – Did not address the full spectrum of message-passing issues
  – Lacked vendor support
  – Were not implemented at the most efficient level

• The MPI Forum organized in 1992 with broad participation by vendors, library writers, and end users.

• MPI Standard (1.0) released June, 1994; many implementation efforts.

• MPI-2 Standard (1.2 and 2.0) released July, 1997.

Page 5: Informal Status Assessment

• All MPP vendors now have MPI-1. (1.0, 1.1, or 1.2)

• Public implementations (MPICH, LAM, CHIMP) support heterogeneous workstation networks.

• MPI-2 implementations are being undertaken now by all vendors.

• MPI-2 is harder to implement than MPI-1 was.

• MPI-2 implementations will appear piecemeal, with I/O first.

Page 6: MPI Sources

• The Standard itself:
  – at http://www.mpi-forum.org
  – All MPI official releases, in both PostScript and HTML

• Books on MPI and MPI-2:
  – Using MPI: Portable Parallel Programming with the Message-Passing Interface (2nd edition), by Gropp, Lusk, and Skjellum, MIT Press, 1999.
  – Using MPI-2: Extending the Message-Passing Interface, by Gropp, Lusk, and Thakur, MIT Press, 1999.
  – MPI: The Complete Reference, volumes 1 and 2, MIT Press, 1999.

• Other information on the Web:
  – at http://www.mcs.anl.gov/mpi
  – pointers to lots of material, including other talks and tutorials, a FAQ, and other MPI pages

Page 7: The MPI Standard Documentation

Page 8: Tutorial Material on MPI, MPI-2

Page 9: The MPICH Implementation of MPI

• As a research project: exploring tradeoffs between performance and portability; conducting research in implementation issues.

• As a software project: providing a free MPI implementation on most machines; enabling vendors and others to build complete MPI implementations on their own communication services.

• MPICH 1.2.2 just released, with complete MPI-1, parts of MPI-2 (I/O and C++), and a port to Windows 2000.

• Available at http://www.mcs.anl.gov/mpi/mpich

Page 10: Lessons From MPI: Why Has It Succeeded?

• The MPI Process

• Portability

• Performance

• Simplicity

• Modularity

• Composability

• Completeness

Page 11: The MPI Process

• Started with an open invitation to all those interested in standardizing the message-passing model

• Participation from
  – Parallel computing vendors
  – Computer scientists
  – Application scientists

• Open process
  – All invited, but hard work required
  – All deliberations available at all times

• Reference implementation developed during the design process
  – Helped debug the design
  – Immediately available when the design was completed

Page 12: Portability

• Most important property of a programming model for high-performance computing
  – Application lifetimes: 5 to 20 years
  – Hardware lifetimes much shorter
  – (not to mention corporate lifetimes!)

• Need not lead to a lowest-common-denominator approach

• Example: MPI semantics allow direct copy of data from a user-space send buffer to a user-space receive buffer
  – Might be implemented by a hardware data mover
  – Might be implemented by network hardware
  – Might be implemented by sockets

• The hard part: portability with performance

Page 13: Performance

• MPI can help manage the crucial memory hierarchy
  – Local vs. remote memory is explicit
  – A received message is likely to be in cache

• MPI provides collective operations for both communication and computation that hide the complexity or non-portability of scalable algorithms from the programmer.

• Can interoperate with optimizing compilers

• Promotes use of high-performance libraries

• Doesn’t provide performance portability
  – This problem is still too hard, even for the best compilers
  – E.g., BLAS
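The kind of scalable algorithm a collective call hides can be sketched without MPI at all. The pure-Python simulation below (names and structure are mine, not MPI's) shows the recursive-doubling pattern often used for an allreduce: P values are combined in log2(P) pairwise-exchange rounds rather than P−1 sequential steps, which is exactly the complexity the programmer never has to see.

```python
def allreduce_sum(values):
    """Simulate a recursive-doubling allreduce over `values`,
    one value per simulated process.

    After log2(P) rounds of pairwise exchanges, every "process"
    holds the global sum.  Assumes P is a power of two, for brevity.
    """
    vals = list(values)
    p = len(vals)
    dist = 1
    rounds = 0
    while dist < p:
        # In each round, process i exchanges with partner i XOR dist,
        # and both sides fold in the partner's current partial sum.
        vals = [vals[i] + vals[i ^ dist] for i in range(p)]
        dist *= 2
        rounds += 1
    return vals, rounds

totals, rounds = allreduce_sum([1, 2, 3, 4, 5, 6, 7, 8])
# Every process ends with the global sum 36, after log2(8) = 3 rounds.
```

An MPI implementation chooses among several such algorithms per platform and message size; the application just calls the collective.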

Page 14: Simplicity

• Simplicity is in the eye of the beholder
  – MPI-1 has about 125 functions
    • Too big!
    • Too small!
  – MPI-2 has about 150 more
  – Even this is not very many by comparison

• Few applications use all of MPI
  – But few MPI functions go unused

• One can write serious MPI programs with as few as six functions
  – Other programs with a different six…

• Economy of concepts
  – Communicators encapsulate both process groups and contexts
  – Datatypes both enable heterogeneous communication and allow non-contiguous message buffers

• Symmetry helps make MPI easy to understand.
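The non-contiguous-buffer point can be made concrete without MPI: the function below (an illustration of mine, not an MPI call) computes the byte offsets that a strided datatype in the style of MPI_Type_vector(count, blocklen, stride) describes, which is how a single send can pick one column out of a row-major matrix.

```python
def vector_offsets(count, blocklen, stride, elem_size):
    """Element byte offsets described by an MPI_Type_vector-style
    layout: `count` blocks of `blocklen` contiguous elements, with
    successive blocks starting `stride` elements apart."""
    return [(b * stride + e) * elem_size
            for b in range(count)
            for e in range(blocklen)]

# One column of a 4x4 matrix of 8-byte doubles stored row-major:
# 4 blocks of 1 element each, blocks 4 elements apart.
offsets = vector_offsets(count=4, blocklen=1, stride=4, elem_size=8)
# offsets == [0, 32, 64, 96]
```

In MPI proper, the implementation walks this layout itself, so the user neither packs a temporary buffer nor issues one message per element.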

Page 15: Modularity

• Modern applications often combine multiple parallel components.

• MPI supports component-oriented software through its use of communicators.

• Support of libraries means applications may contain no MPI calls at all.
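Why communicators make libraries safe to compose can be sketched with a toy model (pure Python; this is not the MPI API): each communicator carries a private context id, and a receive matches only sends posted on the same communicator, so a library that duplicates the application's communicator can never intercept the application's messages even if source and tag collide.

```python
import itertools

_next_context = itertools.count()

class ToyComm:
    """Toy communicator: a process group plus a private context id."""
    def __init__(self):
        self.context = next(_next_context)   # fresh, system-assigned
        self.pending = []                    # queued (source, tag, payload)

    def dup(self):
        """Like MPI_Comm_dup: same group, brand-new context."""
        return ToyComm()

    def send(self, source, tag, payload):
        self.pending.append((source, tag, payload))

    def recv(self, source, tag):
        # Matching is by (communicator, source, tag): messages sent on
        # another communicator are invisible here.
        for i, (s, t, p) in enumerate(self.pending):
            if s == source and t == tag:
                return self.pending.pop(i)[2]
        return None

app = ToyComm()
lib = app.dup()        # a library dups the communicator on entry

app.send(source=0, tag=99, payload="application data")
lib.send(source=0, tag=99, payload="library internal")

# Same source, same tag, but the contexts differ, so no confusion:
assert lib.recv(source=0, tag=99) == "library internal"
assert app.recv(source=0, tag=99) == "application data"
```

This is the mechanism that lets an application sit entirely on top of libraries like Paramesh and HDF5 without any message-matching interference.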

Page 16: Composability

• MPI works with other tools
  – Compilers
    • Since it is a library
  – Debuggers
    • Debugging interface used by MPICH, TotalView, others
  – Profiling tools
    • The MPI “profiling interface” is part of the standard

• MPI-2 provides precise interaction with multi-threaded programs
  – MPI_THREAD_SINGLE
  – MPI_THREAD_FUNNELED (OpenMP loops)
  – MPI_THREAD_SERIALIZED (OpenMP single)
  – MPI_THREAD_MULTIPLE

• The interface provides for both portability and performance
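The profiling interface works by name shifting: the standard requires every MPI_X to also be callable as PMPI_X, so a tool can supply its own MPI_X that records something and forwards to PMPI_X, with no source changes to the application. The real mechanism is done at link time in C or Fortran; the Python sketch below only illustrates the interception shape, and the function names in it are stand-ins.

```python
import functools
import time

def profiled(fn, stats):
    """Wrap fn so every call is counted and timed, then forwarded --
    the same shape as a tool's MPI_Send that calls PMPI_Send."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        t0 = time.perf_counter()
        try:
            return fn(*args, **kwargs)   # the "PMPI_" layer does the work
        finally:
            stats["calls"] = stats.get("calls", 0) + 1
            stats["seconds"] = stats.get("seconds", 0.0) + (time.perf_counter() - t0)
    return wrapper

def pmpi_send(buf, dest):                # stand-in for PMPI_Send
    return len(buf)

stats = {}
mpi_send = profiled(pmpi_send, stats)    # stand-in for the tool's MPI_Send

mpi_send(b"hello", dest=1)
mpi_send(b"world", dest=1)
# stats["calls"] == 2; the application code is unchanged.
```

Jumpshot's logging library and Vampir both rely on exactly this hook, which is why they work with any conforming MPI implementation.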

Page 17: Completeness

• MPI provides a complete programming model.
• Any parallel algorithm can be expressed.
• Collective operations operate on subsets of processes.
• Easy things are not always easy, but
• Hard things are possible.

Page 18: The Center for the Study of Astrophysical Thermonuclear Flashes

• To simulate matter accumulation on the surface of compact stars, nuclear ignition of the accreted (and possibly underlying stellar) material, and the subsequent evolution of the star’s interior, surface, and exterior
  – X-ray bursts (on neutron star surfaces)
  – Novae (on white dwarf surfaces)
  – Type Ia supernovae (in white dwarf interiors)

Page 19: FLASH Scientific Results

• Wide range of compressibility
• Wide range of length and time scales
• Many interacting physical processes
• Only indirect validation possible
• Rapidly evolving computing environment
• Many people in collaboration

Simulations shown:
• Cellular detonations
• Compressible turbulence
• Helium burning on neutron stars
• Rayleigh-Taylor instability
• Richtmyer-Meshkov instability
• Flame-vortex interactions
• Laser-driven shock instabilities
• Nova outbursts on white dwarfs

Gordon Bell prize at SC2000

Page 20: The FLASH Code: MPI in Action

• Solves complex systems of equations for hydrodynamics and nuclear burning

• Written primarily in Fortran-90

• Uses Paramesh library for adaptive mesh refinement; Paramesh is implemented with MPI

• I/O (for checkpointing, visualization, other purposes) done with HDF-5 library, which is implemented with MPI-IO

• Debugged with TotalView, using standard debugger interface

• Tuned with Jumpshot and Vampir, using MPI profiling interface

• Gordon Bell prize winner in 2000

• Portable to all parallel computing environments (because it is built on MPI)

Page 21: FLASH Scaling Runs

Page 22: X-Ray Burst on the Surface of a Neutron Star

Page 23: Showing the AMR Grid

Page 24: MPI Performance Visualization with Jumpshot

• For detailed analysis of parallel program behavior, timestamped events are collected into a log file during the run.

• A separate display program (Jumpshot) aids the user in conducting a post mortem analysis of program behavior.

• Log files can become large, making it impossible to inspect the entire program at once.

• The FLASH Project motivated an indexed file format (SLOG) that uses a preview to select a time of interest and quickly display an interval.
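The core idea behind an indexed log format like SLOG can be sketched simply: bucket the timestamped states into fixed-width time frames so the viewer touches only the frames overlapping the requested interval, instead of scanning the whole file. The code below is a deliberately simplified illustration of that indexing idea, not the SLOG format itself.

```python
from collections import defaultdict

def build_index(events, frame):
    """Bucket (start, end, name) states into fixed-width time frames.
    A state spanning several frames is listed in each one it touches."""
    index = defaultdict(list)
    for ev in events:
        start, end, _name = ev
        for f in range(int(start // frame), int(end // frame) + 1):
            index[f].append(ev)
    return index

def query(index, frame, t0, t1):
    """Return states overlapping [t0, t1), reading only the frames involved."""
    hits = []
    for f in range(int(t0 // frame), int(t1 // frame) + 1):
        for ev in index.get(f, []):
            if ev[0] < t1 and ev[1] > t0 and ev not in hits:
                hits.append(ev)
    return hits

events = [(0.0, 1.0, "MPI_Send"), (0.5, 4.0, "compute"), (9.0, 9.5, "MPI_Recv")]
index = build_index(events, frame=1.0)
hits = query(index, 1.0, 0.0, 1.0)
# hits contains "MPI_Send" and "compute" but not "MPI_Recv"
```

A preview pane built on such an index lets the user jump straight to an interesting interval of a multi-gigabyte run.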

[Diagram: processes emit timestamped events into a logfile, which Jumpshot reads to drive the display.]

Page 25: Removing Barriers From Paramesh

Page 26: Using Jumpshot

• MPI functions and messages automatically logged
• User-defined states
• Nested states
• Zooming and scrolling
• Spotting opportunities for optimization

Page 27: Future Developments in Parallel Programming: MPI and Beyond

• MPI is not perfect

• Any widely-used replacement will have to share the properties that made MPI a success.

• Some directions (in increasing order of speculativeness):
  – Improvements to MPI implementations
  – Improvements to the MPI definition
  – Continued evolution of libraries
  – Research and development for parallel languages
  – Further out: radically different programming models for radically different architectures

Page 28: MPI Implementations

• Implementations beget implementation research
  – Datatypes, I/O, memory-motion elimination

• On most platforms, better collective operations
  – Most MPI implementations build collectives on point-to-point, which is too high-level
  – Need stream-oriented methods that understand MPI datatypes

• Optimize for new hardware
  – In progress for VIA, InfiniBand
  – Need more emphasis on collective operations
  – Off-loading message processing onto the NIC

• Scaling beyond 10,000 processes

• Parallel I/O
  – Clusters
  – Remote

• Fault tolerance
  – Intercommunicators provide an approach

• Working with multithreading approaches

Page 29: Improvements to MPI Itself

• Better remote-memory-access interface
  – Simpler for some simple operations
  – Atomic fetch-and-increment

• Some minor fixup already in progress
  – MPI 2.1

• Building on experience with MPI-2

• Interactions with compilers

Page 30: Libraries and Languages

• General Libraries– Global Arrays– PETSc– ScaLAPACK

• Application-specific libraries

• Most are built on MPI, at least for the portable version.

Page 31: More Speculative Approaches

• HTMT for Petaflops
• Blue Gene
• PIMs
• MTA
• All will need a programming model that explicitly manages a deep memory hierarchy.
• Exotic + small benefit = dead

Page 32: Summary

• MPI is a successful example of a community defining, implementing, and adopting a standard programming methodology.

• It happened because of the open MPI process, the MPI design itself, and early implementation.

• MPI research continues to refine implementations on modern platforms, and this is the “main road” ahead.

• Tools that work with MPI programs are thus a good investment.

• MPI provides portability and performance for complex applications on a variety of architectures.