MPI computing


Page 1: MPI computing

2.1

MPI computing

张少强 [email protected]

http://bioinfo.uncc.edu/szhang

Lecture 3

Page 2: MPI computing

2.2

Message-Passing Computing

More MPI routines:
• Collective routines
• Synchronous routines
• Non-blocking routines

The previous lecture covered point-to-point message passing; this lecture covers the routines listed above.

Page 3: MPI computing

2.3

Collective message-passing routines

Routines that send a message to a group of processes or receive messages from a group of processes.

More efficient than the equivalent point-to-point routines.

Page 4: MPI computing

2.4

Collective Communication

Involve a group of processes within a communicator. No message tags are used.
• MPI_Bcast() - broadcasts a message from the root process to the other processes
• MPI_Gather() - one process gathers one value from each process in a group
• MPI_Scatter() - scatters the root process's data to the individual processes
• MPI_Alltoall() - sends data from all processes to all processes
• MPI_Reduce() - combines the processes' values into a single value
• MPI_Reduce_scatter() - combines values and then scatters the results
• MPI_Scan() - computes a prefix reduction of data across the processes
• MPI_Barrier() - a means of synchronizing processes: each process is held up until all processes have reached a particular "barrier" call.

Page 5: MPI computing

2.5

MPI broadcast operation
Sending the same message to all processes in a communicator.
Multicast - sending the same message to a defined group of processes.

Page 6: MPI computing

2.6

MPI_Bcast() parameters

[Figure: the parameters of MPI_Bcast(); the datatype argument can be, e.g., MPI_INT, MPI_FLOAT, or MPI_CHAR.]
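As a small illustration (a minimal sketch, not from the slides; the broadcast value 42 and the variable names are assumptions made here), the program below has the root broadcast one integer to every process in MPI_COMM_WORLD:

#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    int rank, value = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0)
        value = 42;    /* only the root sets the value (42 is arbitrary) */

    /* every process calls MPI_Bcast(); afterwards all ranks hold the value */
    MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);

    printf("Process %d has value %d\n", rank, value);
    MPI_Finalize();
    return 0;
}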

Page 7: MPI computing

2.7

Basic MPI scatter operation
Sending each element of an array in the root process to a separate process. The contents of the i-th location of the array are sent to the i-th process.

Page 8: MPI computing

2.8

MPI_Scatter() parameters

Page 9: MPI computing

2.9

MPI_Scatter()

• The simplest scatter sends each element of an array to a different process (see the sketch after this list).

• The MPI_Scatter() routine can also send a fixed number of contiguous array elements to each process.
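A minimal sketch of the simplest case (an assumed example, not from the slides; the buffer contents are arbitrary): the root scatters one integer to each process.

#include <stdio.h>
#include <stdlib.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    int rank, size, recv, i;
    int *sendbuf = NULL;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {                     /* only the root needs the send buffer */
        sendbuf = (int *)malloc(size * sizeof(int));
        for (i = 0; i < size; i++)
            sendbuf[i] = i * i;          /* element i goes to process i */
    }

    /* each process receives exactly one element */
    MPI_Scatter(sendbuf, 1, MPI_INT, &recv, 1, MPI_INT, 0, MPI_COMM_WORLD);

    printf("Process %d received %d\n", rank, recv);

    if (rank == 0) free(sendbuf);
    MPI_Finalize();
    return 0;
}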

Page 10: MPI computing

2.10

Scattering contiguous groups of elements to each process

Page 11: MPI computing

2.11

Example
In the following code, the size of the send buffer is given by 100 * <number of processes>, and 100 contiguous elements are sent to each process:

int main(int argc, char *argv[]) {
    int size, *sendbuf, recvbuf[100];                /* 100 elements per process */
    MPI_Init(&argc, &argv);                          /* initialize MPI */
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    sendbuf = (int *)malloc(size*100*sizeof(int));   /* allocate send buffer */
    .
    MPI_Scatter(sendbuf, 100, MPI_INT, recvbuf, 100, MPI_INT, 0,
                MPI_COMM_WORLD);
    .
    MPI_Finalize();                                  /* terminate MPI */
}

Page 12: MPI computing

2.12

Gather

Having one process collect individual values from a set of processes.

Page 13: MPI computing

2.13

MPI_Gather() parameters

Page 14: MPI computing

2.14

Gather example
To gather items from a group of processes into process 0, using dynamically allocated memory in the root process:

int data[10];    /* data to be gathered from the processes */

MPI_Comm_rank(MPI_COMM_WORLD, &myrank);    /* find rank */

if (myrank == 0) {
    MPI_Comm_size(MPI_COMM_WORLD, &grp_size);        /* find group size */
    buf = (int *)malloc(grp_size*10*sizeof(int));    /* allocate memory */
}

MPI_Gather(data, 10, MPI_INT, buf, 10, MPI_INT, 0, MPI_COMM_WORLD);
/* recvcount (10 here) is the number of items received from each process,
   not the total number received */

MPI_Gather() gathers from all processes, including root.

Page 15: MPI computing

2.15

Reduce
A gather operation combined with a specified arithmetic/logical operation.
Example: values could be gathered and then added together by the root:

Page 16: MPI computing

2.16

MPI_Reduce() parameters

Page 17: MPI computing

2.17

Reduce - operations

MPI_Reduce(*sendbuf, *recvbuf, count, datatype, op, root, comm)

Parameters:
*sendbuf     send buffer address
*recvbuf     receive buffer address
count        number of send buffer elements
datatype     data type of send elements
op           reduce operation
root         rank of the root process for the result
comm         communicator

The op argument supports several operations, including:
MPI_MAX      maximum
MPI_MIN      minimum
MPI_SUM      sum
MPI_PROD     product

Page 18: MPI computing

2.18

#include "mpi.h"#include <stdio.h>#include <string.h>#include <stddef.h>#include <stdlib.h>#include <math.h>#define NUM 40int main(int argc, char *argv[]){ int myid, i,numprocs,x,low,high,myresult, result; int data[40]={10, 13, 34, 45, 56, 76, 83, 54, 55, 60, 33, 54, 62, 10, 45, 32, 40, 54, 55, 60, 32, 43, 62, 44, 12, 31, 56, 91, 22, 90, 32, 44, 67, 76, 89, 99, 18, 77, 40, 21}; MPI_Init(&argc,&argv); MPI_Comm_size(MPI_COMM_WORLD, &numprocs); MPI_Comm_rank(MPI_COMM_WORLD, &myid); MPI_Bcast(data, NUM, MPI_INT, 0, MPI_COMM_WORLD); x=NUM/numprocs; myresult=0; low=myid *x; high=low+x; for(i=low; i<high; i++) myresult+=data[i]; printf("I got %d from process %d\n", myresult, myid); MPI_Reduce(&myresult, &result, 1, MPI_INT, MPI_SUM, 0,MPI_COMM_WORLD); if (myid == 0) printf("The sum is %d \n",result); MPI_Finalize();}

一个集合通信的实例 (求和的并行程序)

Page 19: MPI computing

2.19

Output of the previous example

Save the file as bcast_reduce.c, compile it with mpicc, and then run it with mpiexec using 4 processes.

Page 20: MPI computing

2.20

Running on multiple nodes

Page 21: MPI computing

2.21

Main features of collective routines

• Operate on a group of processes within a communicator
• Can substitute for a sequence of point-to-point communications
• Communication is locally blocking
• Synchronization is not guaranteed (implementation dependent)
• A root process may need to be defined
• The amounts of data sent and received must match exactly
• No message tags are needed

Page 22: MPI computing

2.22

Barrier
Blocks a process until all processes have called it. A synchronous operation.

MPI_Barrier(comm)
where comm is the communicator.
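A minimal sketch of MPI_Barrier() (an assumed example, not from the slides): each process prints a line, then waits at the barrier until all processes have reached it.

#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    printf("Process %d reached the barrier\n", rank);

    /* blocks until every process in the communicator has called it */
    MPI_Barrier(MPI_COMM_WORLD);

    printf("Process %d passed the barrier\n", rank);

    MPI_Finalize();
    return 0;
}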

Page 23: MPI computing

2.23

Synchronous Message Passing

Routines that return when the message transfer has completed.

Synchronous send routine
• Waits until the complete message can be accepted by the receiving process before sending the message. In MPI, the MPI_Ssend() routine.

Synchronous receive routine
• Waits until the message it is expecting arrives. In MPI, this is actually the regular MPI_Recv() routine.

Page 24: MPI computing

2.24

Synchronous Message Passing

Synchronous message-passing routines intrinsically perform two actions:

• They transfer data, and
• They synchronize processes.

Page 25: MPI computing

2.25

Synchronous Ssend() and recv() using 3-way protocol

[Figure: timing diagrams of the three-way protocol.
(a) When Ssend() occurs before recv(): process 1 issues a request to send and suspends; once process 2 calls recv(), an acknowledgment is returned, the message is transferred, and both processes continue.
(b) When recv() occurs before Ssend(): process 2 suspends on recv() until process 1's request to send arrives; the acknowledgment and message then follow, and both processes continue.]

Page 26: MPI computing

2.26

Parameters of synchronous send (same as blocking send)

MPI_Ssend(buf, count, datatype, dest, tag, comm)

buf          address of send buffer
count        number of items to send
datatype     datatype of each item
dest         rank of destination process
tag          message tag
comm         communicator

Page 27: MPI computing

2.27

An example

//ssend_rcev.c
#include <stdio.h>
#include "mpi.h"
int main(int argc, char *argv[])
{
    int rank, size, i;
    int buffer[10];
    MPI_Status status;
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (size < 2) {
        printf("Please run with two processes.\n");
        fflush(stdout);    /* flush the stdout buffer */
        MPI_Finalize();
        return 0;
    }
    if (rank == 0) {
        for (i = 0; i < 10; i++)
            buffer[i] = i;
        printf("Process: %d sending ...\n", rank);
        MPI_Ssend(buffer, 10, MPI_INT, 1, 123, MPI_COMM_WORLD);
    }
    if (rank == 1) {
        for (i = 0; i < 10; i++)
            buffer[i] = -1;
        printf("Process: %d receiving ...\n", rank);
        MPI_Recv(buffer, 10, MPI_INT, 0, 123, MPI_COMM_WORLD, &status);
        for (i = 0; i < 10; i++) {
            if (buffer[i] != i)
                printf("Error: buffer[%d]=%d but is expected to be %d\n", i, buffer[i], i);
            else
                printf("Process %d: buffer[%d]=%d as expected\n", rank, i, i);
        }
        fflush(stdout);
    }
    MPI_Finalize();
    return 0;
}

Page 28: MPI computing

2.28

Output of running the program

Page 29: MPI computing

2.29

Asynchronous Message Passing

• Routines that do not wait for actions to complete before returning. Usually require local storage for messages.

• More than one version exists, depending upon the actual semantics for returning.

• In general, they do not synchronize processes but allow processes to move forward sooner.

• Must be used with care.

Page 30: MPI computing

2.30

MPI blocking and non-blocking

• Blocking - return after their local actions are complete, though the message transfer may not have been completed. Sometimes called locally blocking. For a blocking send, local completion means that the buffer holding the message can be reused by other statements or routines, or its contents changed, without affecting the message being transferred.

• Non-blocking - return immediately (asynchronous). Non-blocking assumes that the data storage used for the transfer is not modified by subsequent statements before it is used for the transfer, and it is left to the programmer to ensure this.

Blocking/non-blocking terms may have different interpretations in other systems.

Page 31: MPI computing

2.31

MPI Nonblocking Routines

• Non-blocking send - MPI_Isend() - will return "immediately", even before the source location is safe to be altered.

• Non-blocking receive - MPI_Irecv() - will return even if there is no message to accept.

Page 32: MPI computing

2.32

Non-blocking routine formats

MPI_Isend(buf, count, datatype, dest, tag, comm, request)

MPI_Irecv(buf, count, datatype, source, tag, comm, request)

Completion is detected by MPI_Wait() and MPI_Test().

MPI_Wait() waits until the operation has completed and then returns.

MPI_Test() returns immediately, with a flag set indicating whether the operation had completed at that time.

Both need to know which particular operation to check; this is determined by the request parameter.
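As a hedged sketch of polling with MPI_Test() (not from the slides; the tag, value, and loop structure are assumptions), process 0 posts a non-blocking send and repeatedly tests for completion, which leaves room for other work inside the loop, while process 1 receives with a blocking MPI_Recv(). Run with at least two processes.

#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    int rank, x = 0, flag = 0;
    MPI_Request req;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        x = 123;
        MPI_Isend(&x, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &req);
        while (!flag) {
            /* other useful computation could be overlapped here */
            MPI_Test(&req, &flag, &status);   /* flag becomes nonzero once the send has completed */
        }
        printf("Process 0: send completed\n");
    } else if (rank == 1) {
        MPI_Recv(&x, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
        printf("Process 1: received %d\n", x);
    }

    MPI_Finalize();
    return 0;
}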

Page 33: MPI computing

2.33

Example

To send an integer x from process 0 to process 1 and allow process 0 to continue:

MPI_Request req1;
MPI_Status status;

MPI_Comm_rank(MPI_COMM_WORLD, &myrank);    /* find rank */

if (myrank == 0) {
    int x;
    MPI_Isend(&x, 1, MPI_INT, 1, msgtag, MPI_COMM_WORLD, &req1);
    compute();                    /* overlap computation with the send */
    MPI_Wait(&req1, &status);     /* block until the send has completed */
} else if (myrank == 1) {
    int x;
    MPI_Recv(&x, 1, MPI_INT, 0, msgtag, MPI_COMM_WORLD, &status);
}

Page 34: MPI computing

2.34

How message-passing routines can return before the message transfer has completed

A message buffer is needed between the source and destination to hold the message:

[Figure: process 1 calls send() and continues as soon as the message has been copied into the message buffer; process 2 later calls recv() and reads the message from the buffer.]

Page 35: MPI computing

2.35

Other MPI features will be introduced as we need them.