Introduction to Parallel and Distributed Systems - INZ0277Wcl – 5 ECTS
Teacher: Jan Kwiatkowski, Office 201/15, D-2 COMMUNICATION
• For questions, email [email protected] with the subject set to your name. Make sure to email from an account I can reply to.
• All course information will be available at https://www.ii.pwr.edu.pl/~kwiatkowski/
Grading Policy
The distribution of grades will be as follows: quizzes - 10%, final exam - 40%, laboratory projects - 25%, homework - 25%. Incremental grades (+/-) and curving are at my discretion. I reserve the right to fail anyone whose test average is failing.
To pass the course you need at least 40% of the points from each activity.
- 5.0 for >= 90%
- 4.0 for >= 75%
- 3.0 for >= 60%
- 2.0 otherwise
TEXTBOOKS
• V. Kumar et al., "Introduction to Parallel Computing", The Benjamin/Cummings Pub., New York 2003.
• Foster I., “Designing and Building Parallel Programs”,
http://www.mcs.anl.gov/dbpp/text/book.html
• Writing Message-Passing Parallel Programs with MPI, Course Notes, http://www.zib.de/zibdoc/mpikurs/mpi-course.pdf
Requirements
• All graded work, in and out of class, must be done independently. Projects may be discussed up to the design stage, but no code may be shared.
• Students are expected to attend class. A penalty of 1% per missed lecture may be imposed.
• No make-up will be given for missed tests. Special circumstances will be discussed individually.
• All programming assignments are expected to be completed on time. A penalty may be imposed for late work.
A taxonomy of parallel architectures
- control mechanism
- SIMD
- MIMD
- address-space organization
- message-passing architecture
- shared-address-space architecture
- UMA
- NUMA
- interconnection networks
- static
- dynamic
- processor granularity
- coarse-grain computers
- medium-grain computers
- fine-grain computers
Flynn's classification
– Single-instruction-stream, single-data-stream (SISD) computers
- typical uniprocessors
- parallelism through pipelining
– Multiple-instruction-stream, single-data-stream (MISD) computers
- rarely used in practice
– Single-instruction-stream, multiple-data-stream (SIMD) computers
- vector and array processors
– Multiple-instruction-stream, multiple-data-stream (MIMD) computers
- multiprocessors
[Figure: Typical shared-address-space architectures. Left: a uniform-memory-access (UMA) computer - processors (P) connected through an interconnection network to shared memory modules (M). Right: a non-uniform-memory-access (NUMA) computer with local memory only - each processor has its own memory, and the processors are connected by an interconnection network.]
[Figure: A typical message-passing architecture - processor/memory pairs (P = processor, M = memory) connected by an interconnection network.]
Dynamic interconnection networks
- Crossbar switching networks
- Bus-based networks
- Multistage interconnection networks
[Figure: A completely non-blocking crossbar switch connecting p processors to b memory banks; each crossing point is a switching element.]
[Figure: A typical bus-based architecture with no cache - processors connected to a global memory over a shared bus.]
Multistage interconnection network
[Figure: p processors (0 to p-1) connected to b memory banks (0 to b-1) through a multistage interconnection network (stage 1 through stage n).]
Omega network
[Figure: An 8x8 omega network connecting inputs 000-111 to outputs 000-111; each 2x2 switching element is set to either pass-through or cross-over.]
[Figure: An example of blocking in an omega network - two messages that need the same link at the same stage cannot be routed simultaneously.]
Cost and performance
[Figure: Cost and performance as a function of the number of processors. Cost grows fastest for the crossbar, then the multistage network, then the bus; performance is highest for the crossbar, then the multistage network, then the bus.]
Static interconnection networks
- Completely-connected networks
- Star-connected networks
- Linear array
- Ring
- Mesh (2D, 3D, wraparound)
- Hypercube
[Figure: Examples of static interconnection networks - a linear array, a ring, a two-dimensional mesh, a two-dimensional wraparound mesh, a completely-connected network, and a star-connected network.]
[Figure: Examples of static interconnection networks - a three-dimensional mesh, and a complete binary tree network (processors at the leaves, switching elements at the internal nodes) with message routing through the tree.]
Hypercube
[Figure: Hypercubes of dimension 0 through 4 (0-D to 4-D); nodes are labelled with binary strings, and two nodes are connected exactly when their labels differ in one bit.]
Evaluating Static Interconnection Networks
Diameter - the maximum distance between any two processors in the network.
Connectivity - a measure of the multiplicity of paths between any two processors.
Arc connectivity - the minimum number of arcs that must be removed from the network to break it into two disconnected networks.
Bisection width - the minimum number of communication links that have to be removed to partition the network into two equal halves.
Channel width - the number of bits that can be communicated simultaneously over a link connecting two processors.
Bisection bandwidth - the minimum volume of communication allowed between any two halves of the network with an equal number of processors.
Cost - for example, the number of communication links.
Message passing parallel programming paradigm
- several instances of the sequential paradigm are considered together
- separate workers or processes
- interact by exchanging information
[Figure: Processor/memory pairs (P = processor, M = memory) connected by a communication network.]
Message Passing Interface – MPI
- an extended message-passing model
- for parallel computers, clusters, and heterogeneous networks
- not a language or compiler specification
- not a specific implementation or product
- supports send/receive primitives for communicating with other workers
- in-order data transfer without data loss
- several point-to-point and collective communications
- MPI supports the development of parallel libraries
- MPI does not make any restrictive assumptions about the underlying hardware architecture
Message Passing Interface – MPI
MPI is very large (125 functions) - MPI’s extensive functionality requires many functions
MPI is very small and simple (6 functions) - many parallel programs can be written with just 6 basic functions
• MPI_Init()
• MPI_Finalize()
• MPI_Comm_rank()
• MPI_Comm_size()
• MPI_Send()
• MPI_Recv()
An example
#include "mpi.h"
#include <stdio.h>
int main(int argc, char** argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("Hello, world! I'm %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
Two main functions
Initializing MPI
– every MPI program must call this routine once, before any other MPI routines
MPI_Init(&argc, &argv);
Clean-up of MPI
– every MPI program must call this routine when all communications have completed
MPI_Finalize();
Communicator
Communicator – MPI processes can only communicate if they share a communicator.
– MPI_COMM_WORLD
- the predefined default communicator, set up by the MPI_Init() call
Communicator
How do you identify different processes?
– an MPI process can query a communicator for information about the group
– a communicator returns the rank of the calling process
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
How many processes are contained within a communicator?
– a communicator returns the # of processes in the communicator
MPI_Comm_size(MPI_COMM_WORLD, &size);
Sending Messages
- communication between only 2 processes
- the source process sends a message to the destination process
- the destination process is identified by its rank in the communicator
[Figure: A communicator containing processes 0-5; a source process sends a message to a destination process identified by rank.]
MPI types
Message
– an array of elements of a particular MPI datatype
Basic MPI datatypes (and the C types they correspond to):
- MPI_CHAR / MPI_UNSIGNED_CHAR : signed / unsigned char
- MPI_SHORT / MPI_UNSIGNED_SHORT : signed / unsigned short int
- MPI_INT / MPI_UNSIGNED : signed / unsigned int
- MPI_LONG / MPI_UNSIGNED_LONG : signed / unsigned long int
- MPI_FLOAT : float
- MPI_DOUBLE : double
Send Message
MPI_Send(void *buf, int count, MPI_Datatype datatype,
         int dest, int tag, MPI_Comm comm);
/* (IN) buf : address of the data to be sent */
/* (IN) count : # of elements of the MPI datatype */
/* (IN) datatype : MPI datatype of each element */
/* (IN) dest : rank of the destination process */
/* (IN) tag : marker used to distinguish message types */
/* (IN) comm : communicator, e.g. MPI_COMM_WORLD */
Receive Message
MPI_Recv(void *buf,
         int count,
         MPI_Datatype datatype,
         int source, /* or MPI_ANY_SOURCE */
         int tag, /* or MPI_ANY_TAG */
         MPI_Comm comm,
         MPI_Status *status);
/* (OUT) buf : address where the received data should be placed */
/* (IN) count : maximum # of elements of the MPI datatype to receive */
/* (IN) source : rank of the source of the message */
/* (IN) tag : receive only a message having this tag */
/* (OUT) status : information about the received message (e.g. actual source and tag), for later use */
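Putting MPI_Send and MPI_Recv together, a minimal sketch (assuming the program is launched with at least two processes, e.g. mpirun -np 2): rank 0 sends one integer to rank 1, and rank 1 inspects the status for the actual source.

```c
#include "mpi.h"
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0 && size > 1) {
        int value = 42;
        /* send one MPI_INT to rank 1 with tag 0 */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        int value;
        MPI_Status status;
        /* receive one MPI_INT from rank 0; status records the
         * actual source and tag of the matched message */
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
        printf("rank 1 received %d from rank %d\n",
               value, status.MPI_SOURCE);
    }
    MPI_Finalize();
    return 0;
}
```

Note that MPI_Recv blocks until a matching message arrives, so the receive completes only after rank 0's send has been matched.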