
Page 1: MPI and OpenMP

MPI and OpenMP

Page 2: MPI and OpenMP

How to Get and Install MPI

http://www-unix.mcs.anl.gov/mpi/

mpich2-doc-install.pdf   // installation guide
mpich2-doc-user.pdf      // user guide

Page 3: MPI and OpenMP

Outline

Message-passing model
Message Passing Interface (MPI)
Coding MPI programs
Compiling MPI programs
Running MPI programs
Benchmarking MPI programs
OpenMP

Page 4: MPI and OpenMP

Message-passing Model

Page 5: MPI and OpenMP

Processes

Number is specified at start-up time
Remains constant throughout execution of program
All execute same program
Each has unique ID number
Alternately performs computations and communicates

Page 6: MPI and OpenMP

Circuit Satisfiability

(Figure: example circuit with a particular input combination; the output is 0, i.e., this combination is not satisfied.)

Page 7: MPI and OpenMP

Solution Method

Circuit satisfiability is NP-complete
No known algorithm solves it in polynomial time
We seek all solutions
We find them through exhaustive search: 16 inputs, so 65,536 combinations to test

Page 8: MPI and OpenMP

Partitioning: Functional Decomposition

Embarrassingly parallel: no channels between tasks

Page 9: MPI and OpenMP

Agglomeration and Mapping

Properties of the parallel algorithm:
Fixed number of tasks
No communication between tasks
Time needed per task is variable

Consult the mapping strategy decision tree: map tasks to processors in a cyclic fashion

Page 10: MPI and OpenMP

Cyclic (interleaved) Allocation

Assume p processes
Each process gets every pth piece of work
Example: 5 processes and 12 pieces of work (a code sketch reproducing this mapping follows the list below)

P0: 0, 5, 10

P1: 1, 6, 11

P2: 2, 7

P3: 3, 8

P4: 4, 9
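As a minimal sketch (not from the original slides), the cyclic mapping above can be reproduced with a simple strided loop; the names n, p, and id are illustrative only:

#include <stdio.h>

int main (void) {
   int n = 12;                               /* total pieces of work (illustrative) */
   int p = 5;                                /* number of processes (illustrative) */
   for (int id = 0; id < p; id++) {          /* one line of output per process */
      printf ("P%d:", id);
      for (int i = id; i < n; i += p)        /* every pth piece, starting at piece id */
         printf (" %d", i);
      printf ("\n");
   }
   return 0;
}

Running it prints exactly the P0 through P4 assignments listed above.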

Page 11: MPI and OpenMP

Summary of Program Design

Program will consider all 65,536 combinations of 16 boolean inputs

Combinations allocated in cyclic fashion to processes

Each process examines each of its combinations

If it finds a combination that satisfies the circuit, it prints it

Page 12: MPI and OpenMP

Include Files

MPI header file

#include <mpi.h>

Standard I/O header file

#include <stdio.h>

Page 13: MPI and OpenMP

Local Variables

int main (int argc, char *argv[]) {
   int i;
   int id;                        /* Process rank */
   int p;                         /* Number of processes */
   void check_circuit (int, int);

Include argc and argv: they are needed to initialize MPI

One copy of every variable exists for each process running this program

Page 14: MPI and OpenMP

Initialize MPI

First MPI function called by each process
Not necessarily the first executable statement
Allows system to do any necessary setup

MPI_Init (&argc, &argv);

Page 15: MPI and OpenMP

Communicators

Communicator: opaque object that provides a message-passing environment for processes

MPI_COMM_WORLD: the default communicator, includes all processes

Possible to create new communicators (done in Chapters 8 and 9)

Page 16: MPI and OpenMP

Communicator

(Figure: the communicator MPI_COMM_WORLD containing six processes with ranks 0 through 5.)

Page 17: MPI and OpenMP

Determine Number of Processes

First argument is the communicator
Number of processes is returned through the second argument

MPI_Comm_size (MPI_COMM_WORLD, &p);

Page 18: MPI and OpenMP

Determine Process Rank

First argument is the communicator
Process rank (in range 0, 1, …, p-1) is returned through the second argument

MPI_Comm_rank (MPI_COMM_WORLD, &id);
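A minimal sketch (not part of the original slides) combining the calls introduced so far, plus MPI_Finalize from a later slide; each process prints its own rank and the total number of processes:

#include <mpi.h>
#include <stdio.h>

int main (int argc, char *argv[]) {
   int id;                                   /* Process rank */
   int p;                                    /* Number of processes */

   MPI_Init (&argc, &argv);                  /* set up the MPI environment */
   MPI_Comm_rank (MPI_COMM_WORLD, &id);
   MPI_Comm_size (MPI_COMM_WORLD, &p);
   printf ("Process %d of %d\n", id, p);
   MPI_Finalize ();                          /* release MPI resources */
   return 0;
}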

Page 19: MPI and OpenMP

Replication of Automatic Variables

(Figure: six copies of the program running as six processes; each has its own copy of id, with values 0 through 5, and its own copy of p, equal to 6 in every process.)

Page 20: MPI and OpenMP

What about External Variables?

int total;

int main (int argc, char *argv[]) {
   int i;
   int id;
   int p;
   …

Where is variable total stored?

Page 21: MPI and OpenMP

Cyclic Allocation of Work

for (i = id; i < 65536; i += p)
   check_circuit (id, i);

Parallelism is outside function check_circuit

It can be an ordinary, sequential function

Page 22: MPI and OpenMP

Shutting Down MPI

Call after all other MPI library calls
Allows system to free up MPI resources

MPI_Finalize();

Page 23: MPI and OpenMP

Our Call to MPI_Reduce()

MPI_Reduce (&count, &global_count, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

Only process 0 will get the result

if (!id) printf ("There are %d different solutions\n", global_count);
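Assembled from the fragments on the preceding slides, a sketch of the complete program might look as follows. The body of check_circuit is a hypothetical placeholder (the actual 16-input circuit from the example is not reproduced here), and the external variables count and global_count are assumptions inferred from the MPI_Reduce call above:

#include <mpi.h>
#include <stdio.h>

int count = 0;                               /* solutions found by this process (assumed external variable) */

/* Extract bit n of integer z (0 or 1) */
#define EXTRACT_BIT(z,n) (((z) >> (n)) & 1)

void check_circuit (int id, int z) {
   int v[16];
   int i;
   for (i = 0; i < 16; i++)
      v[i] = EXTRACT_BIT (z, i);
   /* hypothetical placeholder circuit, NOT the circuit from the slides */
   if ((v[0] || v[1]) && (!v[1] || !v[3]) && (v[2] || v[3])) {
      printf ("Process %d found solution %d\n", id, z);
      count++;
   }
}

int main (int argc, char *argv[]) {
   int i;
   int id;                                   /* Process rank */
   int p;                                    /* Number of processes */
   int global_count;                         /* Total number of solutions */

   MPI_Init (&argc, &argv);
   MPI_Comm_rank (MPI_COMM_WORLD, &id);
   MPI_Comm_size (MPI_COMM_WORLD, &p);

   for (i = id; i < 65536; i += p)           /* cyclic allocation of work */
      check_circuit (id, i);

   MPI_Reduce (&count, &global_count, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

   if (!id)
      printf ("There are %d different solutions\n", global_count);

   MPI_Finalize ();
   return 0;
}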

Page 24: MPI and OpenMP

Benchmarking the Program

MPI_Barrier: barrier synchronization
MPI_Wtick: timer resolution
MPI_Wtime: current time
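A minimal sketch (an assumption, not from the slides) of how these three calls are typically combined to time the parallel portion of a program:

#include <mpi.h>
#include <stdio.h>

int main (int argc, char *argv[]) {
   double elapsed_time;

   MPI_Init (&argc, &argv);
   MPI_Barrier (MPI_COMM_WORLD);             /* synchronize so all processes start timing together */
   elapsed_time = -MPI_Wtime ();             /* record the start time (negated) */

   /* ... work to be benchmarked goes here ... */

   elapsed_time += MPI_Wtime ();             /* adding the end time yields the elapsed time */
   printf ("Elapsed time %10.6f sec (timer resolution %g sec)\n",
           elapsed_time, MPI_Wtick ());
   MPI_Finalize ();
   return 0;
}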

Page 25: MPI and OpenMP

How to form a Ring?

To form a ring of systems, first execute the following command on one machine:

[nayan@MPI_system1 ~]$ mpd &

[1] 9152

This makes MPI_system1 the master system of the ring.

[nayan@MPI_system1 ~]$ mpdtrace -l

MPI_system1_32958 (172.16.1.1)

Page 26: MPI and OpenMP

How to form a Ring? (cont'd)

Then run the following command in a terminal on every other system:

[nayan@MPI_system1 ~]$ mpd -h MPI_system1 -p 32958 &

Here -h gives the host name of the master system and -p gives the port number reported by the master system.

Page 27: MPI and OpenMP

How to kill a Ring?

To shut down the ring, run the following command on the master system:

[nayan@MPI_system1 ~]$ mpdallexit

Page 28: MPI and OpenMP

Compiling MPI Programs

To compile and execute the above program, follow these steps.

First, compile sat1.c by executing the following command on the master system:

[nayan@MPI_system1 ~]$ mpicc -o sat1.out sat1.c

Here mpicc is the MPI compiler wrapper used to compile sat1.c, and sat1.out is the name of the output executable.

Page 29: MPI and OpenMP

Running MPI Programs

To run this executable, type the following command on the master system (the -n option sets the number of processes):

[nayan@MPI_system1 ~]$ mpiexec -n 1 ./sat1.out

Page 30: MPI and OpenMP

Benchmarking Results

(Chart: satisfiability problem; time taken in seconds, from 0 to 0.035, versus number of processors, 1 through 10.)

Page 31: MPI and OpenMP

OpenMP

OpenMP: an application programming interface (API) for parallel programming on multiprocessors
Compiler directives
Library of support functions

OpenMP works in conjunction with Fortran, C, or C++

Page 32: MPI and OpenMP

Shared-memory Model

(Figure: four processors connected to a single shared memory.)

Processors interact and synchronize with each other through shared variables.

Page 33: MPI and OpenMP

Fork/Join Parallelism

Initially only the master thread is active
Master thread executes sequential code
Fork: master thread creates or awakens additional threads to execute parallel code
Join: at end of parallel code, created threads die or are suspended

Page 34: MPI and OpenMP

Fork/Join Parallelism

(Figure: timeline in which the master thread runs alone, forks other threads at the start of each parallel region, and joins them again at the end.)
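As an illustrative sketch (not from the slides), the fork/join behaviour can be observed with a simple OpenMP parallel region; the thread count of 4 is just an example:

#include <omp.h>
#include <stdio.h>

int main (void) {
   printf ("Before the parallel region: only the master thread runs\n");

   #pragma omp parallel num_threads(4)       /* fork: 4 threads execute this block */
   {
      printf ("Hello from thread %d of %d\n",
              omp_get_thread_num (), omp_get_num_threads ());
   }                                         /* join: threads synchronize here */

   printf ("After the parallel region: only the master thread runs again\n");
   return 0;
}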

Page 35: MPI and OpenMP

Shared-memory Model vs. Message-passing Model

Shared-memory model: number of active threads is 1 at start and finish of the program and changes dynamically during execution

Message-passing model: all processes are active throughout execution of the program

Page 36: MPI and OpenMP

Parallel for Loops

C programs often express data-parallel operations as for loops:

for (i = first; i < size; i += prime)
   marked[i] = 1;

OpenMP makes it easy to indicate when the iterations of a loop may execute in parallel
Compiler takes care of generating code that forks/joins threads and allocates the iterations to threads

Page 37: MPI and OpenMP

Pragmas

Pragma: a compiler directive in C or C++
Stands for "pragmatic information"
A way for the programmer to communicate with the compiler
Compiler is free to ignore pragmas
Syntax:

#pragma omp <rest of pragma>

Page 38: MPI and OpenMP

Parallel for Pragma

Format:

#pragma omp parallel for
for (i = 0; i < N; i++)
   a[i] = b[i] + c[i];

Compiler must be able to verify the run-time system will have information it needs to schedule loop iterations
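A self-contained sketch of the pragma in use (the array size and contents are illustrative, not from the slides); it compiles with gcc -fopenmp as shown on a later slide:

#include <stdio.h>

#define N 8

int main (void) {
   int i;
   int a[N], b[N], c[N];

   for (i = 0; i < N; i++) {                 /* initialize the input arrays */
      b[i] = i;
      c[i] = 10 * i;
   }

   #pragma omp parallel for                  /* iterations are divided among the threads */
   for (i = 0; i < N; i++)
      a[i] = b[i] + c[i];

   for (i = 0; i < N; i++)
      printf ("a[%d] = %d\n", i, a[i]);
   return 0;
}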

Page 39: MPI and OpenMP

Function omp_get_num_procs

Returns the number of physical processors available for use by the parallel program

int omp_get_num_procs (void)

Page 40: MPI and OpenMP

Function omp_set_num_threads

Uses the parameter value to set the number of threads to be active in parallel sections of code
May be called at multiple points in a program

void omp_set_num_threads (int t)
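A small sketch (an assumption, not from the slides) combining the two functions: it asks how many processors are available and then requests that many threads for subsequent parallel regions:

#include <omp.h>
#include <stdio.h>

int main (void) {
   int t = omp_get_num_procs ();             /* physical processors available */
   omp_set_num_threads (t);                  /* use one thread per processor */

   #pragma omp parallel
   {
      #pragma omp single                     /* only one thread reports */
      printf ("Running with %d threads on %d processors\n",
              omp_get_num_threads (), t);
   }
   return 0;
}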

Page 41: MPI and OpenMP

Comparison

Characteristic                          OpenMP   MPI
Suitable for multiprocessors            Yes      Yes
Suitable for multicomputers             No       Yes
Supports incremental parallelization    Yes      No
Minimal extra code                      Yes      No
Explicit control of memory hierarchy    No       Yes

Page 42: MPI and OpenMP

C+MPI vs. C+MPI+OpenMP

(Figure: comparison of the C + MPI version and the C + MPI + OpenMP version of the program.)

Page 43: MPI and OpenMP

Benchmarking Results

Page 44: MPI and OpenMP

Example Programs

C File               Description
omp_hello.c          Hello world
omp_workshare1.c     Loop work-sharing
omp_workshare2.c     Sections work-sharing
omp_reduction.c      Combined parallel loop reduction
omp_orphan.c         Orphaned parallel loop reduction
omp_mm.c             Matrix multiply
omp_getEnvInfo.c     Get and print environment information

Page 45: MPI and OpenMP

How to run an OpenMP program

$ export OMP_NUM_THREADS=2

$ gcc –fopenmp filename.c

$ ./a.out

Page 46: MPI and OpenMP

For more information

http://www.cs.ccu.edu.tw/~naiwei/cs5635/cs5635.html

http://www.nersc.gov/nusers/help/tutorials/mpi/intro/print.php

http://www-unix.mcs.anl.gov/mpi/tutorial/mpiintro/ppframe.htm

http://www.mhpcc.edu/training/workshop/mpi/MAIN.html

http://www.openmp.org