2.1
Collective Communication
Involves a set of processes, defined by an intra-communicator. Message tags are not present. Principal collective operations:
• MPI_BCAST() - broadcast from the root to all other processes
• MPI_GATHER() - gather values from a group of processes
• MPI_SCATTER() - scatter a buffer in parts to a group of processes
• MPI_REDUCE() - combine values from all processes into a single value
2.2
Broadcast
Sending the same message to all processes concerned with the problem. Multicast - sending the same message to a defined group of processes.
[Figure: Process 0 through Process p - 1 each call bcast(); the root's buf is copied into data on every process. The figure rows show the Action, the Code, and the MPI form.]
2.3
Broadcast Illustrated
2.4
Broadcast (MPI_BCAST)
One-to-all communication: the same data is sent from the root process to all others in the communicator

MPI_BCAST(data, size, MPI_Datatype, root, MPI_COMM)

• All processes must specify the same root rank and comm
2.5
Reduction (MPI_REDUCE)
The reduction operation allows you to:
• Collect data from each process
• Reduce the data to a single value
• Store the result on the root process (MPI_REDUCE), or
• Store the result on all processes (MPI_ALLREDUCE)
• The reduction function also works with arrays
• Operations: sum, product, min, max, and, …
COMPE472 Parallel Computing 2.6
Reduction Operation (SUM)
2.7
MPI_REDUCE
• MPI_REDUCE(snd_buf, rcv_buf, count, type, op, root, comm, ierr)
– snd_buf: input array of type type containing the local values
– rcv_buf: output array of type type containing the global result
– count (INTEGER): number of elements in snd_buf and rcv_buf
– type (INTEGER): MPI type of snd_buf and rcv_buf
– op (INTEGER): parallel operation to be performed
– root (INTEGER): MPI rank of the process storing the result
– comm (INTEGER): communicator of the processes involved in the operation
– ierr (INTEGER): output, error code (if ierr = 0, no error occurred)

• MPI_ALLREDUCE(snd_buf, rcv_buf, count, type, op, comm, ierr)
– The root argument is missing; the result is stored on all processes.
2.8
Predefined Reduction Operations
2.9
Example

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>
#include <mpi.h>

#define MAXSIZE 100

int main(int argc, char **argv)
{
    int myid, numprocs;
    int data[MAXSIZE], i, x, low, high, myresult = 0, result;
    char fn[255];
    FILE *fp;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &myid);
2.10
Example cont.

    if (myid == 0) {
        /* open input file and initialize data */
        strcpy(fn, getenv("PWD"));
        strcat(fn, "/rand_data.txt");
        if ((fp = fopen(fn, "r")) == NULL) {
            printf("Can't open the input file: %s\n\n", fn);
            exit(1);
        }
        for (i = 0; i < MAXSIZE; i++) {
            fscanf(fp, "%d", &data[i]);
        }
    }
2.11
Example cont.

    /* broadcast data */
    MPI_Bcast(data, MAXSIZE, MPI_INT, 0, MPI_COMM_WORLD);

    /* add portion of data */
    x = MAXSIZE / numprocs;   /* must be an integer */
    low = myid * x;
    high = low + x;
    for (i = low; i < high; i++) {
        myresult += data[i];
    }
    printf("I got %d from %d\n", myresult, myid);

    /* compute global sum */
    MPI_Reduce(&myresult, &result, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
    if (myid == 0) {
        printf("The sum is %d.\n", result);
    }

    MPI_Finalize();
    return 0;
}
2.12
MPI_Scatter
• One-to-all communication: different data sent from the root process to all others in the communicator

MPI_SCATTER(sndbuf, sndcount, sndtype, rcvbuf, rcvcount, rcvtype, root, comm, ierr)
– Argument definitions are like those of other MPI subroutines
– sndcount is the number of elements sent to each process, not the size of sndbuf, which should be sndcount times the number of processes in the communicator
– The sender arguments are significant only at the root
2.13
MPI_Scatter Example
• Suppose there are four processes including the root (process 0). A 16-element array on the root should be distributed among the processes.
• Every process should include the following line:
2.14
MPI_Gather
• Different data collected by the root process from all other processes in the communicator
• It is the opposite of Scatter

MPI_GATHER(sndbuf, sndcount, sndtype, rcvbuf, rcvcount, rcvtype, root, comm, ierr)
– Argument definitions are like those of other MPI subroutines
– rcvcount is the number of elements collected from each process, not the size of rcvbuf, which should be rcvcount times the number of processes in the communicator
– The receiver arguments are significant only at the root
2.15
Scatter/Gather
2.16
Scatter/Gather Example

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int myid, j, data[100], tosum[25], sums[4];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myid);

    if (myid == 0) {
        for (j = 0; j < 100; j++)
            data[j] = j + 1;
        printf("The data to sum : ");
        for (j = 0; j < 100; j++)
            printf(" %d", data[j]);
        printf("\n");
    }
2.17
Scatter/Gather Example

    MPI_Scatter(data, 25, MPI_INT, tosum, 25, MPI_INT, 0, MPI_COMM_WORLD);

    printf("Node %d has numbers to sum :", myid);
    for (j = 0; j < 25; j++)
        printf(" %d", tosum[j]);
    printf("\n");

    sums[myid] = 0;
    for (j = 0; j < 25; j++)
        sums[myid] += tosum[j];
    printf("Node %d computes the sum %d\n", myid, sums[myid]);
2.18
Scatter/Gather Example

    MPI_Gather(&sums[myid], 1, MPI_INT, sums, 1, MPI_INT, 0, MPI_COMM_WORLD);

    if (myid == 0) {  /* after the gather, sums contains the four sums */
        printf("The four sums : ");
        printf("%d", sums[0]);
        for (j = 1; j < 4; j++)
            printf(" + %d", sums[j]);
        for (j = 1; j < 4; j++)
            sums[0] += sums[j];
        printf(" = %d, which should be 5050.\n", sums[0]);
    }

    MPI_Finalize();
    return 0;
}
2.19
MPI_Barrier()
• Stops processes until all processes within a communicator reach the barrier
• Almost never required in a parallel program
• Occasionally useful in measuring performance and load balancing
• C:
– int MPI_Barrier(MPI_Comm comm)
2.20
Barrier
2.21
Barrier routine
• A means of synchronizing processes by stopping each one until they all have reached a specific “barrier” call.
2.22
Barrier example
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#include <unistd.h>

// some time-consuming functionality
int function(int t)
{
    sleep(3 * t + 1);
    return 0;
}
2.23
Barrier example cont.

int main(int argc, char **argv)
{
    int MyRank;
    double s_time, l_time, g_time;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &MyRank);

    s_time = MPI_Wtime();
    function(MyRank);
    l_time = MPI_Wtime();

    // wait for all to come together
    MPI_Barrier(MPI_COMM_WORLD);
    g_time = MPI_Wtime();

    printf("Processor %d: LocalTime = %lf, GlobalTime = %lf\n",
           MyRank, l_time - s_time, g_time - s_time);

    MPI_Finalize();
    return 0;
}
2.24
Evaluating Parallel Programs
2.25
Sequential execution time, ts: Estimate by counting computational steps of best sequential algorithm.
Parallel execution time, tp: In addition to number of computational steps, tcomp, need to estimate communication overhead, tcomm:
tp = tcomp + tcomm
Elapsed parallel time
• MPI_Wtime() returns the number of seconds that have elapsed since some arbitrary time in the past.
2.27
Communication Time
Many factors, including network structure and network contention. As a first approximation, use
tcomm = tstartup + n × tdata
tstartup is startup time, essentially time to send a message with no data. Assumed to be constant. tdata is transmission time to send one data word, also assumed constant, and there are n data words.
2.28
Idealized Communication Time

[Figure: communication time grows linearly with the number of data items (n); the intercept on the time axis is the startup time.]
2.29
Benchmark Factors
With ts, tcomp, and tcomm, we can establish the speedup factor and the computation/communication ratio for a particular algorithm/implementation:

Both are functions of the number of processors, p, and the number of data elements, n.
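The equations themselves were on the slide's figure; reconstructed from the standard definitions used elsewhere in these slides (speedup relative to ts, with tp = tcomp + tcomm), they read:

```latex
S(p) \;=\; \frac{t_s}{t_p} \;=\; \frac{t_s}{t_{comp} + t_{comm}}
\qquad\qquad
\text{computation/communication ratio} \;=\; \frac{t_{comp}}{t_{comm}}
```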
2.30
These factors give an indication of the scalability of the parallel solution with an increasing number of processors and increasing problem size.

The computation/communication ratio highlights the effect of communication as problem size and system size increase.