Parallel Programming - labsticc.univ-brest.fr/~lemarch/eng/cours/algop1pareng.pdf


1

Parallel Programming

Laurent Lemarchand, LISyC/UBO
Laurent.Lemarchand@univ-brest.fr

2

Course outline

- What is parallelism
- Parallel computer architectures
- Distributed and parallel applications
- Message passing model
- PVM: functionalities
- Using PVM

- Combinatorial optimization techniques
- Exact methods
- Heuristic methods
- Parallelization

3

What is parallelism

Execute an algorithm with a set of processors instead of a single one.

Divide the algorithm into a set of tasks that can be run independently on disjoint processors.

Goal: decrease the runtime needed to solve the problem.

3 abstraction levels:
- Architectures
- Algorithms
- Programming model

4

Parallel Architectures

Parallel computer: more than one processor, with support for parallel programming.

Different architecture types:
- Distributed
- Centralized

Architecture models (classification of parallel computers):
- SIMD
- MIMD

5

Parallel Algorithms

Approaches for solving problems with parallel support.

Algorithmic models: simplified models of program execution that facilitate program design.

They allow theoretical performance analysis.

6

Parallel programming

Programming languages which allow the expression of parallel execution.

Different abstraction levels:
- Automatic parallelization of sequential programs: the dream, but not efficient in general for now
- Programming style depends (partially) on the targeted architecture

7

Parallelism sources

Definitions of the three schemes:
- control: "many actions at the same time"
- data: "same action on similar data"
- pipeline: "workflow"

Examples: control command, movement detection, matrix manipulation.

8

Some vocabulary tips

Parallel vs. distributed computing:
- Parallel: homogeneous, high level of dependency among tasks
- Distributed: heterogeneous, physical and logical independence among tasks, client-server

Parallel vs. concurrent computing:
- Parallel: processes cooperate on solving a problem
- Concurrent: processes compete for the resources

9

Parallelism grain

- Large grain (task level): the program. Example: Task i-1, Task i, Task i+1
- Average grain (control level): the function. Example: func1(){...}, func2(){...}, func3(){...}
- Fine grain (data level): the loop. Example: a(0)=.. b(0)=.. / a(1)=.. b(1)=.. / a(2)=.. b(2)=..
- Very fine grain (multiple levels): the hardware. Example: +, x, Load
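As an illustration of the fine grain (data-level) case, a minimal POSIX-threads sketch, not from the original slides: each thread applies the same action to its own slice of an array, mirroring the a(i) = .. loop above.

/* build: cc grain.c -lpthread */
#include <pthread.h>
#include <stdio.h>

#define N 8
#define NTHREADS 4

static double a[N];

struct slice { int lo, hi; };

/* Each worker performs the same action on its own chunk of the
   data: data-level, fine-grain parallelism. */
static void *worker(void *arg)
{
    struct slice *s = (struct slice *)arg;
    for (int i = s->lo; i < s->hi; i++)
        a[i] = i * 10.8;
    return NULL;
}

int main(void)
{
    pthread_t th[NTHREADS];
    struct slice sl[NTHREADS];
    int chunk = N / NTHREADS;

    for (int t = 0; t < NTHREADS; t++) {
        sl[t].lo = t * chunk;
        sl[t].hi = (t == NTHREADS - 1) ? N : (t + 1) * chunk;
        pthread_create(&th[t], NULL, worker, &sl[t]);
    }
    for (int t = 0; t < NTHREADS; t++)
        pthread_join(th[t], NULL);

    for (int i = 0; i < N; i++)
        printf("a[%d] = %g\n", i, a[i]);
    return 0;
}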

10

Parallelism: architectural grain

- Vector, pipeline, ...: intra-processor (computing units)
- SMP: multi-processor (processes/threads)
- Cluster: many processors, dedicated network (Giganet, Myrinet)
- Constellation: cluster grid
- Grid: many processors, public network (Ethernet)
- Meta computing: on-demand computation time

11

Architectures: Flynn classification

Computers are classified by instruction flow (single/multiple) versus data flow (single/multiple):
- SISD (von Neumann): single instruction, single data
- SIMD (vector): single instruction, multiple data
- MISD (systolic): multiple instruction, single data
- MIMD: multiple instruction, multiple data; SPMD is a programming style within MIMD

12

Architectures: memory classes

Flynn's classification does not take memory organization into account:
- Shared memory: simple model, but the memory becomes a bottleneck
- Distributed memory: scalable, but harder programming aspects

(Figure: processors P sharing one memory vs. processors each with a local memory.)

13

Clusters vs. networks of workstations

Network of workstations (NOW):
- Set of heterogeneous computers
- Geographic dispersion
- Non-dedicated use
- Parallel computing is an optional use of the computers
- Standard network connection such as Ethernet

Cluster:
- Set of homogeneous computers
- Single location
- Dedicated computing nodes
- One parallel computing resource
- High-performance network (Fast/Gigabit Ethernet, Myrinet)

14

Parallel computer size

"Small" parallel computers:
- 2 < #nodes < 64
- Typically multiprocessors (SMP)
- Global memory, with a cache consistency mechanism

"Large" parallel computers:
- 64 < #nodes < a few hundred or thousand
- Typically multi-computers
- Often SMP clusters

15

Parallel computer size

Rank | Site | Computer | Manuf. | Country | Year | Rmax (GFlops) | #Procs | OS | Arch
1 | DOE/NNSA/LANL | BladeCenter QS22/LS21 Cluster, PowerXCell 8i 3.2 GHz / Opteron DC 1.8 GHz, Voltaire Infiniband | IBM | US | 2008 | 1026000 | 122400 | Linux | Cluster
2 | DOE/NNSA/LLNL | eServer Blue Gene Solution | IBM | US | 2007 | 478200 | 212992 | CNK/SLES 9 | MPP
3 | Argonne | Blue Gene/P Solution | IBM | US | 2007 | 450300 | 163840 | CNK/SLES 9 | MPP
4 | U. of Texas | SunBlade x6420, Opteron Quad 2 GHz, Infiniband | Sun | US | 2008 | 326000 | 62976 | Linux | Cluster
5 | DOE/Oak Ridge | Cray XT4 QuadCore 2.1 GHz | Cray | US | 2008 | 205000 | 30976 | CNL | MPP
6 | FZJ | Blue Gene/P Solution | IBM | Germany | 2007 | 180000 | 65536 | CNK/SLES 9 | MPP
7 | NMCAC | SGI Altix ICE 8200, Xeon quad core 3.0 GHz | SGI | US | 2007 | 133200 | 14336 | SLES10 + SGI ProPack 5 | MPP
8 | TATA SONS | Cluster Platform 3000 BL460c, Xeon 53xx 3 GHz, Infiniband | HP | India | 2008 | 132800 | 14384 | Linux | Cluster
9 | IDRIS | Blue Gene/P Solution | IBM | France | 2008 | 112500 | 40960 | CNK/SLES 9 | MPP
10 | Total Exploration Production | SGI Altix ICE 8200EX, Xeon quad core 3.0 GHz | SGI | France | 2008 | 106100 | 10240 | SLES10 + SGI ProPack 5 | MPP

Source: http://www.top500.org/lists/2008/06. Rmax = maximal LINPACK performance achieved (GFlops).

16

Parallel computer applications

Main fields:

Science:
- Plasma physics, quantum mechanics
- Molecular chemistry

Engineering:
- Telephony, networks
- Microelectronics design
- Mechanical design
- Air traffic control, command and control

Forecasting:
- Weather, earthquakes
- Social and economic models

Exploration:
- Oceanic exploration, mineral exploration, oil
- Satellite pictures

17

Parallel computer applications

Main fields:

Military:
- Multi-warhead rockets
- Radar and sonar tracking and analysis
- Mapping, strategic data warehouses

Clinical:
- Scanning, medical imaging
- Protein synthesis
- Genetics

AI:
- Robotics, autonomous vehicles
- Vision, planning
- Speech recognition

Visualization:
- 3D (cinema, video games)
- Pattern recognition

18

Applications: MIMD/SPMD

On a grid or cluster:
- Distributed memory
- Cheap, widely available
- Client/server or P2P

Functional or programming level:
- SPMD (1 program)
- Client/server (MIMD, 2 programs)

19

Grid applications: installation models

Client-server:
- Centralized or distributed memory
- Caching to avoid congestion
- Centralized information

Peer-to-peer (P2P):
- Each peer is both client and server
- Work balancing
- Distributed information

(Figures: a client-server topology where clients c reach servers srv across a congestion zone on the Internet, and a P2P topology where every node is both client and server, c/s.)

20

Grid applications: client/server

Information grid (cloud):
- Data storage and diffusion
- Web, NFS, ...

Computing grid:
- Exploit available computing power
- Meta computing: book time on supercomputers
- Internet computing (SETI, Decrypton, ...)

(Figure: web servers and a search engine as typical information-grid services.)

21

Applications: P2P grids

Information grids:
- Distributed
- Napster, Gnutella, Freenet

Computing grids:
- Migrate client/server applications to P2P
- CGP2P

22

Introducing message passing

The message passing model:
- A set of parallel processes
- Processes run on separate processors
- Each processor owns its own memory (not shared)
- Process communications rely on messages (send and receive)

(Figure: four processes P1..P4, each with a private memory M, exchanging messages through send (E) and receive (R) operations.)

23

Introducing message passing

Main functions offered by message passing:
- Data exchange
- Synchronization

Suitable for distributed-memory architectures (MIMD type):
- Multi-computers
- Clusters and NOW

24

Model characteristics

Drawbacks:
- Programming is not easy
- Explicit management of data distribution, process scheduling, and process communication

Consequences:
- Potentially long development cycle
- Errors
- High development costs

25

Model characteristics

Advantages:

Efficiency:
- The programmer has many degrees of freedom
- Optimization and "fine-tuning" of the application
- Consequence: the best use of resources for maximal efficiency is possible

Portability, a long-term model:
- The model has been well known and popular for a long time: standard programming
- Many environment implementations already realized
- Consequences: easily portable, but code portability <> performance portability

26

Designing applications based on the message passing model

2 main tools:
- Message Passing Interface (MPI)
- Parallel Virtual Machine (PVM)

Most message passing applications deployed today are based on one of these two libraries.

They extend existing programming languages:
- C
- C++
- Fortran

27

Parallel Virtual Machine

Programming model:
- Loosely coupled network
- Asynchronous MPMD model (SPMD)
- Heterogeneous network (clusters/grids/constellations)

Services:
- Offered as a library, used from a C or Fortran host language

28

Programming model: communicating processes

- Asynchronous communicating processes
- The model is independent from the resources:
  - Process localization
  - Communication hardware
- Communications:
  - Point-to-point
  - Or group

(Figure: processes P1..P4 exchanging messages.)

29

Parallel Virtual Machine characteristics

Runs on a loosely connected heterogeneous network.

Services:
- Creation/destruction of processes
- Communications (XDR transport):
  - Asynchronous point-to-point, FIFO, or multicast
  - Synchronous point-to-point (rendezvous) or group (synchronization barrier)
- Remote signals
- Machine and process management

31

Parallel Virtual Machine: virtual computer

Unified view of the hardware:
- Various computing resources
- Various communication resources

(Figure: applications a1..a3, made of processes p1..p5, mapped by the virtual machine onto computers c1..c5 connected by networks r1 and r2.)

32

Parallel Virtual Machine: virtual computer

An application:
- A few components (processes) interacting together freely
- Simultaneous applications are possible

A virtual machine:
- Preliminary enrollment of hardware resources
- Dynamic management is possible

33

Parallel Virtual Machine: composition

For applications:
- Resource access through an API

For hardware:
- Enrollment within the PVM machine
- A local daemon (pvmd) runs on each host

(Figure: processes p1 and p2 on computers c1 and c2, each attached to its local pvmd; the pvmd daemons communicate over the network.)

34

Parallel Virtual Machine: setup

PVM console:
- Machine management
- Process management

% pvm
pvm> conf
localhost
pvm> add distHost

% pvm
pvm> ps -ef
...
pvm> reset
pvm> halt
pvm> quit

35

Parallel Virtual Machine: console

% pvm
pvm> help
add       Add hosts to virtual machine
alias     Define/list command aliases
conf      List virtual machine configuration
delete    Delete hosts from virtual machine
echo      Echo arguments
export    Add environment variables to spawn export list
getopt    Display PVM options for the console task
halt      Stop pvmds
help      Print helpful information about a command
id        Print console task id
jobs      Display list of running jobs
kill      Terminate tasks
mstat     Show status of hosts
names     List message mailbox names
ps        List tasks
pstat     Show status of tasks
put       Add entry to message mailbox
quit      Exit console
reset     Kill all tasks, delete leftover mboxes
setenv    Display or set environment variables
setopt    Set PVM options - for the console task *only*!
sig       Send signal to task
spawn     Spawn task
trace     Set/display trace event mask
unalias   Undefine command alias
unexport  Remove environment variables from spawn export list
version   Show libpvm version

36

PVM processes

Choice of location: by machine, by architecture, or don't care.

Each task has a unique tid, similar to a Unix PID.

(Figure: the master task /home/master (tid 10214) has spawned three /home/slave tasks with tids 7419, 7418, and 657.)

numPs = pvm_spawn("/home/slave", ..., 3, tabTids);
// tabTids: {7419, 7418, 657}
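For reference, a minimal sketch (not from the original slides) spelling out the full pvm_spawn() argument list; the slave path is taken from the slide, everything else is illustrative.

#include <stdio.h>
#include "pvm3.h"

int main(void)
{
    int tabTids[3];

    /* Spawn 3 instances of the slave executable. PvmTaskDefault lets
       PVM pick the hosts; PvmTaskHost or PvmTaskArch together with a
       non-empty "where" argument would restrict the placement. */
    int numPs = pvm_spawn("/home/slave", (char **)0, PvmTaskDefault,
                          "", 3, tabTids);

    if (numPs < 3)
        fprintf(stderr, "only %d slaves could be started\n", numPs);

    pvm_exit();
    return 0;
}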

37

PVM processes

- The basis of applications
- Identified by a unique tid over the whole PVM
- Correspond to Unix process executions

(Figure: a task /home/myAppli with tid 10214.)

mtid = pvm_mytid();   // mtid: 10214
prt  = pvm_parent();  // prt: parent's tid

38

PVM communications

- Communication by message passing
- Point-to-point FIFO communication
- Communications are buffered by PVM

(Figure: messages m1 and m2 sent from /home/p1 (tid 10214) to /home/p2 (tid 7419), delivered in FIFO order.)

39

PVM message composition

- Who is the recipient? A tid
- Which kind (tag) of message? Enables filtering at the destination
- Which data?
  - Typed (heterogeneous computers)
  - Composite messages are possible

(Figure: a task sends messages with different tags ("circle", "square") to tid 10214; a composite message mixes data such as 2/50/10, 'hello', 20.658/1.008.)

40

PVM sending - initialisation

myAppli:
...
pvm_initsend(PvmDataDefault);

Initializes the libpvm communication buffer.

Data encoding is free or XDR, chosen according to the composition of the PVM.
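A sketch of the three encoding options that pvm_initsend() accepts; PvmDataRaw skips the XDR conversion, which is only safe when all hosts share the same data representation.

/* XDR encoding: works on any heterogeneous virtual machine. */
pvm_initsend(PvmDataDefault);

/* No encoding: faster, but only valid between identical architectures. */
pvm_initsend(PvmDataRaw);

/* Data left in place until pvm_send(): avoids a copy, but the packed
   variables must not be modified before the send. */
pvm_initsend(PvmDataInPlace);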

41

PVM sending - message assembly

myApp:
...
pvm_initsend(PvmDataDefault);
pvm_pkint(tab, 10, 1);
pvm_pkstr("toto");

Copies the data (here the array tab = 1, 2, ... and the string "toto") into the libpvm buffer.

- One packing function per data type: pvm_pkint, pvm_pkfloat, pvm_pkstr, ...
- Several consecutive packings are possible

42

PVM sending - effective sending

myApp:
...
pvm_initsend(PvmDataDefault);
pvm_pkint(tab, 10, 1);
pvm_pkstr("toto");
pvm_send(10214, 45);

The message is sent with its kind tag (45) to the recipient (tid 10214).

- The buffer is copied into the daemon's memory space
- The send is non-blocking

43

PVM blocking reception - filtering

myApp:
...
pvm_recv(exp, gre);

Filtering:
- On the sender exp
- On the tag gre associated with the message
- Don't-care values (-1) are possible:

pvm_recv(-1, gre);
pvm_recv(exp, -1);

44

PVM non-blocking reception - test

...
pvm_nrecv(exp, gre);

...
pvm_probe(exp, gre);

Both return a value < 0 on error; pvm_nrecv() also receives the message if one has arrived, while pvm_probe() only tests for its arrival.
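A minimal polling sketch, assuming a task that overlaps computation with message tests; exp and gre keep the slide's names, and do_some_work() is a hypothetical application function.

int bufid;
int truc[10];

/* Test for a pending message without blocking. */
while ((bufid = pvm_nrecv(exp, gre)) == 0) {
    do_some_work();            /* hypothetical: overlap computation */
}
if (bufid > 0) {
    pvm_upkint(truc, 10, 1);   /* unpack in the order it was packed */
}
/* bufid < 0 signals an error */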

45

PVM reception - unpacking

myApp:
...
pvm_recv(exp, gre);
pvm_upkint(truc, 10, 1);
pvm_upkstr(ptr);

Copies the data locally (here into truc and ptr).

Unpacking must follow the same order as the packing performed on the sending side.
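Putting the sending and reception slides together, a minimal sender/receiver sketch; the tag 45 and the packed values come from the slides, while the function names and destination tid are illustrative.

#include <stdio.h>
#include "pvm3.h"

/* Sender side: initialise, pack, send. */
void send_message(int dest_tid)
{
    int tab[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};

    pvm_initsend(PvmDataDefault);  /* initialise buffer, XDR encoding */
    pvm_pkint(tab, 10, 1);         /* pack 10 ints, stride 1 */
    pvm_pkstr("toto");             /* pack a string */
    pvm_send(dest_tid, 45);        /* non-blocking send, tag 45 */
}

/* Receiver side: blocking receive, then unpack in the same order. */
void recv_message(void)
{
    int  truc[10];
    char ptr[64];

    pvm_recv(-1, 45);              /* -1: accept any sender */
    pvm_upkint(truc, 10, 1);
    pvm_upkstr(ptr);
    printf("got %d..%d and \"%s\"\n", truc[0], truc[9], ptr);
}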

46

PVM process groups

...
pvm_joingroup("group");

...
pvm_lvgroup("group");

Both return a value < 0 on error.

- Used for global synchronization/communications
- Implemented using the available hardware

47

PVM direct array sending

pvm_psend(int tid, int msgtag, void *vp, int cnt, int type)

pvm_precv(int tid, int msgtag, void *vp, int cnt, int type, int *rtid, int *rtag, int *rcnt)

- vp, cnt: the data and its element count; type: the array's element type
- On reception, returns the same information as pvm_bufinfo(): actual sender tid, effective tag, and message size

Element types:
PVM_STR PVM_FLOAT PVM_BYTE PVM_CPLX PVM_SHORT PVM_DOUBLE PVM_INT PVM_DCPLX PVM_LONG PVM_UINT PVM_USHORT PVM_ULONG
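A minimal sketch of these pack-free calls; dest is assumed to hold a valid task id and the tag 45 is illustrative.

double buf[100];
int rtid, rtag, rcnt;

/* Send 100 doubles in one call, with no explicit initsend/packing. */
pvm_psend(dest, 45, buf, 100, PVM_DOUBLE);

/* Matching receive; rtid, rtag and rcnt report the actual sender,
   tag and item count, as pvm_bufinfo() would. */
pvm_precv(-1, 45, buf, 100, PVM_DOUBLE, &rtid, &rtag, &rcnt);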

48

PVM group communications

pvm_initsend(...);
pvm_pk...(...);
pvm_bcast("group", genre);

...
pvm_barrier("group", n);

- pvm_barrier(): synchronization; every caller is blocked until n members of the group have made the call
- pvm_bcast(): sends one message to multiple recipients at the same time; message typing works as with pvm_send()
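A minimal sketch combining the group calls, assuming nworkers tasks all run this code; in standard PVM installations the group functions live in a separate library (-lgpvm3).

#include "pvm3.h"

void sync_and_share(int nworkers, int *data)
{
    /* Enroll; pvm_joingroup() returns this task's instance number. */
    int inum = pvm_joingroup("workers");

    /* Block until nworkers group members have reached the barrier. */
    pvm_barrier("workers", nworkers);

    /* Instance 0 broadcasts a value to the other members
       (pvm_bcast() does not deliver to the sender itself). */
    if (inum == 0) {
        pvm_initsend(PvmDataDefault);
        pvm_pkint(data, 1, 1);
        pvm_bcast("workers", 99);
    } else {
        pvm_recv(-1, 99);
        pvm_upkint(data, 1, 1);
    }

    pvm_lvgroup("workers");
}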

49

Master/Slave principle

The master controls the processes and the global data.

PROGRAM
  IF (process = master) THEN
    master-code
  ELSE
    slave-code
  ENDIF
END

50

PVM example - application

Create n processes from 2 distinct programs:
- First the master (master.c), which will create the other processes
- Then n-1 slaves (slave.c), all running the same processing, but on distinct data

(Figure: /home/master spawning several /home/slave tasks.)

51

PVM example - execution

A PVM with 2 computers: build the virtual machine, run the application, then shut the PVM down.

[on comp1]% pvm
pvm> add comp2
pvm> quit
[on comp1]%

[on comp1]% master
...trace...
[on comp1]% pvm
pvm> halt
[on comp1]%

52

PVM example – master code

#include <stdio.h>
#include "pvm3.h"
#define SLAVENAME "/home/slave"

main()
{
    int tids[32];            /* slave task ids */
    int n, nproc, numt, i, who, msgtype, nhost, narch;
    float data[100], result[32];

    puts("How many slave programs (1-32)?");
    scanf("%d", &nproc);

    /* start up slave tasks */
    numt = pvm_spawn(SLAVENAME, (char**)0, 0, "", nproc, tids);

    /* Begin user program -- dummy data */
    n = 100;
    for (i = 0; i < n; i++) { data[i] = i * 10.8; }

    /* Broadcast initial data to slave tasks */
    pvm_initsend(PvmDataDefault);
    pvm_pkint(&nproc, 1, 1);
    pvm_pkint(tids, nproc, 1);
    pvm_pkint(&n, 1, 1);
    pvm_pkfloat(data, n, 1);
    pvm_mcast(tids, nproc, 0);

    /* Wait for results from the slaves */
    msgtype = 5;
    for (i = 0; i < nproc; i++) {
        pvm_recv(-1, msgtype);
        pvm_upkint(&who, 1, 1);
        pvm_upkfloat(&result[who], 1, 1);
        printf("I got %f from %d\n", result[who], who);
    }

    /* Program finished, exit PVM before stopping */
    pvm_exit();
}

53

PVM example – slave code

#include <stdio.h>
#include "pvm3.h"

main()
{
    int mytid;               /* my task id */
    int tids[32];            /* task ids */
    int n, me, i, nproc, master, msgtype;
    float data[100], result;

    /* enroll in pvm */
    mytid = pvm_mytid();

    /* Receive data from the master */
    msgtype = 0;
    pvm_recv(-1, msgtype);
    pvm_upkint(&nproc, 1, 1);
    pvm_upkint(tids, nproc, 1);
    pvm_upkint(&n, 1, 1);
    pvm_upkfloat(data, n, 1);

    /* Determine which slave I am (0 .. nproc-1) */
    for (i = 0; i < nproc; i++)
        if (mytid == tids[i]) { me = i; break; }

    /* Do calculations with the data */
    result = work( ... );

    /* Send the result to the master */
    pvm_initsend(PvmDataDefault);
    pvm_pkint(&me, 1, 1);
    pvm_pkfloat(&result, 1, 1);
    msgtype = 5;
    master = pvm_parent();
    pvm_send(master, msgtype);

    /* Program finished. Exit PVM before stopping */
    pvm_exit();
}
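To build and run the example, a typical installation links against libpvm3; the exact paths depend on the local setup (PVM_ROOT and PVM_ARCH are the usual environment variables), so treat these commands as a sketch:

% cc -I$PVM_ROOT/include master.c -L$PVM_ROOT/lib/$PVM_ARCH -lpvm3 -o /home/master
% cc -I$PVM_ROOT/include slave.c  -L$PVM_ROOT/lib/$PVM_ARCH -lpvm3 -o /home/slave
% pvm                    # start the virtual machine, add hosts
pvm> add comp2
pvm> quit
% /home/master           # run the application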
