
Parallel Programming Patterns

Moreno Marzolla
Dip. di Informatica—Scienza e Ingegneria (DISI)
Università di Bologna

http://www.moreno.marzolla.name/

McCool et al., Chapter 3


Parallel Programming Patterns 2


Parallel Programming Patterns 3

What is a pattern?

● A design pattern is “a general solution to a recurring engineering problem”

● A design pattern is not a ready-made solution to a given problem...

● ...rather, it is a description of how a certain kind of problem can be solved


Parallel Programming Patterns 4

Architectural patterns

● The term “architectural pattern” was first used by the architect Christopher Alexander to denote common design decisions that have been used by architects and engineers to realize buildings and constructions in general

Christopher Alexander (1936–), A Pattern Language: Towns, Buildings, Construction


Parallel Programming Patterns 5

Example

● Building a bridge across a river
● You do not “invent” a brand new type of bridge each time
  – Instead, you adapt an already existing type of bridge

Parallel Programming Patterns 6

Example

Parallel Programming Patterns 7

Example

Example

Parallel Programming Patterns 9

Parallel Programming Patterns

● Embarrassingly Parallel
● Partition
● Master-Worker
● Stencil
● Reduce
● Scan


Parallel Programming Patterns 10

Parallel programming patterns: Embarrassingly parallel


Parallel Programming Patterns 11

Embarrassingly Parallel

● Applies when the computation can be decomposed into independent tasks that require little or no communication

● Examples:
  – Vector sum
  – Mandelbrot set
  – 3D rendering
  – Brute force password cracking
  – ...

[Figure: vector sum a[] + b[] = c[]; the three arrays are split into blocks handled by Processor 0, Processor 1 and Processor 2]
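A minimal OpenMP sketch of the vector sum above (function and array names are illustrative, not taken from the course code): every iteration is independent, so the loop can be split among threads with a single directive. Compile with -fopenmp.

/* c[i] = a[i] + b[i]: each iteration is independent, so iterations
   can be distributed among threads in any order. */
void vec_sum(const double *a, const double *b, double *c, int n)
{
#pragma omp parallel for
    for (int i = 0; i < n; i++) {
        c[i] = a[i] + b[i];
    }
}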


Parallel Programming Patterns 12

Parallel programming patterns: Partition


Parallel Programming Patterns 13

Partition

● The input data space (in short, domain) is split into disjoint regions called partitions
● Each processor operates on one partition
● This pattern is particularly useful when the application exhibits locality of reference
  – i.e., when processors can refer to their own partition only and need little or no communication with other processors


Parallel Programming Patterns 14

Example

[Figure: matrix A[][] partitioned into four horizontal blocks assigned to Core 0–3; A x = b]

● Matrix-vector product Ax = b
● Matrix A[][] is partitioned into P horizontal blocks
● Each processor
  – operates on one block of A[][] and on a full copy of x[]
  – computes a portion of the result b[] (see the sketch below)
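A sketch of the partitioned matrix-vector product (names are illustrative): with schedule(static), OpenMP assigns one contiguous block of rows to each thread, which corresponds to the horizontal-block partitioning shown above, while x[] is shared by all threads.

/* b = A x, with A stored row-major as an n*n array. */
void mat_vec(const double *A, const double *x, double *b, int n)
{
#pragma omp parallel for schedule(static)
    for (int i = 0; i < n; i++) {
        double s = 0.0;
        for (int j = 0; j < n; j++) {
            s += A[i*n + j] * x[j];
        }
        b[i] = s;   /* each thread writes only its own rows of b */
    }
}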


Parallel Programming Patterns 15

Partition

● Types of partition
  – Regular: the domain is split into partitions of roughly the same size and shape. E.g., matrix-vector product
  – Irregular: partitions do not necessarily have the same size or shape. E.g., heat transfer on irregular solids
● Size of partitions (granularity)
  – Fine-grained: a large number of small partitions
  – Coarse-grained: a few large partitions


Parallel Programming Patterns 16

1-D Partitioning

● Block
● Cyclic

[Figure: the elements of an array assigned to Core 0–3, either as contiguous blocks (block) or round-robin (cyclic) — see the sketch below]
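To make the two distributions concrete, here is a small sketch (hypothetical helper functions, not from the course code) that computes which of P cores owns element i of an array of n elements:

/* Block distribution: core 0 gets the first ceil(n/P) elements,
   core 1 the next ones, and so on. */
int block_owner(int i, int n, int P)
{
    int chunk = (n + P - 1) / P;   /* ceiling of n/P */
    return i / chunk;
}

/* Cyclic distribution: elements are dealt out round-robin,
   so consecutive elements go to different cores. */
int cyclic_owner(int i, int P)
{
    return i % P;
}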


Parallel Programming Patterns 17

2-D Block Partitioning

[Figure: 2-D block partitioning of a matrix among Core 0–3: (Block, *), (*, Block), and (Block, Block)]


Parallel Programming Patterns 18

2-D Cyclic Partitioning

[Figure: (Cyclic, *) and (*, Cyclic) partitioning]


Parallel Programming Patterns 19

2-D Cyclic Partitioning

[Figure: cyclic-cyclic partitioning]


Parallel Programming Patterns 20

Irregular partitioning example

● A lake surface is approximated with a triangular mesh

● Colors indicate the mapping of mesh elements to processors


Parallel Programming Patterns 21

Fine-grained vs. coarse-grained partitioning

● Fine-grained partitioning
  – Better load balancing, especially if combined with the master-worker pattern (see later)
  – If granularity is too fine, the computation / communication ratio might become too low (communication dominates computation)
● Coarse-grained partitioning
  – In general improves the computation / communication ratio
  – However, it might cause load imbalance
● The "optimal" granularity is sometimes problem-dependent; in other cases the user must choose which granularity to use

[Figure: time spent in computation vs. communication for fine-grained and coarse-grained partitioning]


Parallel Programming Patterns 22

Example: Mandelbrot set

● The Mandelbrot set is the set of points c on the complex plane such that the sequence z_n(c), defined as

      z_n(c) = 0                      if n = 0
      z_n(c) = z_{n-1}(c)^2 + c       otherwise

  does not diverge when n → +∞


Parallel Programming Patterns 23

Mandelbrot set in color

● If the modulus of z_n(c) does not exceed 2 after nmax iterations, the pixel is black (the point is assumed to be part of the Mandelbrot set)
● Otherwise, the color depends on the number of iterations required for the modulus of z_n(c) to become > 2


Parallel Programming Patterns 24

Pseudocode

maxit = 1000
for each point (cx, cy) {
    x = y = 0;
    it = 0;
    while ( it < maxit AND x*x + y*y ≤ 2*2 ) {
        xnew = x*x - y*y + cx;
        ynew = 2*x*y + cy;
        x = xnew;
        y = ynew;
        it = it + 1;
    }
    plot(cx, cy, it);
}

Embarrassingly parallel structure: the color of each pixel can be computed independently from the other pixels

Source: http://en.wikipedia.org/wiki/Mandelbrot_set#For_programmers


Parallel Programming Patterns 25

Mandelbrot set

● A regular partitioning can result in uneven load distribution
  – Black pixels require maxit iterations
  – Other pixels require fewer iterations


Parallel Programming Patterns 26

Load balancing

● Ideally, each processor should perform the same amount of work
  – If the tasks synchronize at the end of the computation, the execution time will be that of the slowest task

[Figure: tasks 0–3, each busy for a different amount of time and then idle until a barrier synchronization]


Parallel Programming Patterns 27

Load balancing HowTo

● The workload is balanced if each processor performs more or less the same amount of work
● Ways to achieve load balancing:
  – Use fine-grained partitioning
    ● ...but beware of the possible communication overhead if the tasks need to communicate
  – Use dynamic task allocation (master-worker paradigm)
    ● ...but beware that dynamic task allocation might incur higher overhead than static task allocation


Parallel Programming Patterns 28

Master-worker paradigm (process farm, work pool)

● Apply a fine-grained partitioning
  – number of tasks >> number of cores
● The master assigns a task to the first available worker (see the sketch below)

[Figure: a master holding a bag of tasks of possibly different duration, dispatching them to Worker 0, Worker 1, …, Worker P-1]
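A minimal shared-memory sketch of the work-pool idea (illustrative, not the actual course code): a shared counter plays the role of the bag of tasks, and each worker thread repeatedly grabs the next unprocessed task until the bag is empty.

#include <stdio.h>

/* Stand-in for the real task; tasks may have very different durations. */
static void process_task(int t)
{
    printf("task %d done\n", t);
}

void work_pool(int ntasks)
{
    int next = 0;                 /* shared "bag of tasks": next..ntasks-1 */
#pragma omp parallel
    {
        for (;;) {
            int t;
            /* atomically take the next task from the bag */
#pragma omp atomic capture
            t = next++;
            if (t >= ntasks)
                break;            /* bag is empty */
            process_task(t);
        }
    }
}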


Parallel Programming Patterns 29

Choosing the partition size

Too small = higher scheduling overhead
Too large = unbalanced workload


[Figures: coarse-grained decomposition with static task assignment; block size = 64 with static task assignment]


Parallel Programming Patterns 31

Example: omp-mandelbrot.c

● Coarse-grained partitioning
  – OMP_SCHEDULE="static" ./omp-mandelbrot
● Cyclic, fine-grained partitioning (64 rows per block)
  – OMP_SCHEDULE="static,64" ./omp-mandelbrot
● Dynamic, fine-grained partitioning (64 rows per block)
  – OMP_SCHEDULE="dynamic,64" ./omp-mandelbrot
● Dynamic, fine-grained partitioning (1 row per block)
  – OMP_SCHEDULE="dynamic" ./omp-mandelbrot
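OMP_SCHEDULE only takes effect for loops declared with schedule(runtime), so omp-mandelbrot.c presumably parallelizes over the image rows with a loop of this shape (a hedged sketch; image[], xsize, ysize, maxit and the pixel-to-plane mapping xmin, ymin, dx, dy are illustrative):

/* Rows are distributed among threads according to OMP_SCHEDULE,
   because the loop uses schedule(runtime). */
#pragma omp parallel for schedule(runtime)
for (int py = 0; py < ysize; py++) {
    for (int px = 0; px < xsize; px++) {
        double cx = xmin + px * dx, cy = ymin + py * dy;
        double x = 0.0, y = 0.0;
        int it = 0;
        while (it < maxit && x * x + y * y <= 2.0 * 2.0) {
            double xnew = x * x - y * y + cx;
            double ynew = 2.0 * x * y + cy;
            x = xnew;
            y = ynew;
            it++;
        }
        image[py * xsize + px] = it;   /* the color is derived from it */
    }
}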


Parallel Programming Patterns 35

Parallel programming patterns: Stencil


Parallel Programming Patterns 36

Stencil

● Stencil computations involve a grid whose values are updated according to a fixed pattern called stencil
  – Example: the Gaussian smoothing of an image updates the color of each pixel with the weighted average of the previous colors of the 5 × 5 neighborhood

[Figure: weights of a 5 × 5 Gaussian convolution kernel applied to each pixel's neighborhood]


Parallel Programming Patterns 37

2D Stencils

● 5-point 2-axis 2D stencil (von Neumann neighborhood)
● 9-point 2-axis 2D stencil
● 9-point 1-plane 2D stencil (Moore neighborhood)


Parallel Programming Patterns 38

3D Stencils

7-point 3-axis 3D stencil

13-point 3-axis 3D stencil


39

Stencils

● Stencil computations usually employ two domains to keep the current and next values
  – Values are read from the current domain
  – New values are written to the next domain
  – current and next are exchanged at the end of each step


Parallel Programming Patterns 40

Ghost Cells

● How do we handle cells on the border of the domain?
  – For some applications, cells outside the domain have some fixed, application-dependent value
  – In other cases, we may assume periodic boundary conditions
● In either case, we can extend the domain with ghost cells, so that cells on the border do not require any special treatment (see the sketch below)

[Figure: a domain surrounded by a frame of ghost cells]

https://blender.stackexchange.com/questions/39735/how-could-i-animate-a-plane-into-a-pipe-and-then-a-pipe-into-a-torus
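A small sketch for the one-dimensional case (array name and layout are illustrative): a domain of n cells is stored in cur[1..n], with cur[0] and cur[n+1] acting as ghost cells; with periodic boundary conditions they are filled from the opposite ends before each update.

/* cur has n+2 entries: cur[0] and cur[n+1] are ghost cells. */
void fill_ghost_cells(double *cur, int n)
{
    cur[0]     = cur[n];   /* left ghost  <- rightmost true cell */
    cur[n + 1] = cur[1];   /* right ghost <- leftmost true cell  */
}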

Parallel Programming Patterns 41

Periodic boundary conditions: How to fill ghost cells

[Figure: ghost cells are filled with copies of the cells on the opposite sides of the periodic domain, shown step by step]

Parallel Programming Patterns 44

Periodic boundary conditions: Another way to fill ghost cells

[Figure: an alternative sequence of copies for filling the ghost cells, shown step by step]


Parallel Programming Patterns 50

Parallelizing stencil computations

● Computing the next domain from the current one has embarrassingly parallel structure

Initialize current domain
while (!terminated) {
    Init ghost cells
    Compute next domain in parallel
    Exchange current and next domains
}
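A shared-memory sketch of this skeleton for a one-dimensional 3-point stencil (illustrative names; the averaging rule is just an example of an update): the inner loop is embarrassingly parallel, and the two buffers are exchanged by swapping pointers rather than copying.

/* Runs nsteps updates of a 3-point stencil on a periodic 1-D domain of n
   cells stored in cur[1..n]; cur[0] and cur[n+1] are ghost cells.
   Returns the buffer holding the final values. */
double *stencil_steps(double *cur, double *next, int n, int nsteps)
{
    for (int s = 0; s < nsteps; s++) {
        /* init ghost cells (periodic boundary conditions) */
        cur[0] = cur[n];
        cur[n + 1] = cur[1];
        /* compute next domain in parallel (embarrassingly parallel) */
#pragma omp parallel for
        for (int i = 1; i <= n; i++) {
            next[i] = (cur[i - 1] + cur[i] + cur[i + 1]) / 3.0;
        }
        /* exchange current and next domains */
        double *tmp = cur;
        cur = next;
        next = tmp;
    }
    return cur;
}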


Parallel Programming Patterns 51

Stencil computations on distributed-memory architectures

● Ghost cells are essential to efficiently implement stencil computations on distributed-memory architectures

Parallel Programming Patterns 52

Example: 2D (Block, *) partitioning with 5P stencil, periodic boundary

[Figure: the domain is split into horizontal blocks owned by processes P0, P1, P2; ghost rows are filled with copies of the neighboring processes' boundary rows, wrapping around at the top and bottom (shown over several animation frames)]


Parallel Programming Patterns 61

2D Stencil Example: Game of Life

● 2D cyclic domain; each cell has two possible states
  – 0 = dead
  – 1 = alive
● The state of a cell at time t + 1 depends on
  – the state of that cell at time t
  – the number of alive cells at time t among the 8 neighbors
● Rules (see the sketch below):
  – Alive cell with fewer than two alive neighbors → dies
  – Alive cell with two or three alive neighbors → lives
  – Alive cell with more than three alive neighbors → dies
  – Dead cell with three alive neighbors → lives
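A sketch of the update rule for a single cell (illustrative names; the course's game-of-life.c may be organized differently), assuming the grid has ghost cells so that the eight neighbors can always be read:

/* Returns the next state of cell (i, j) in a grid of the given width
   (including ghost cells), applying the rules listed above. */
int next_state(const unsigned char *cur, int i, int j, int width)
{
    int alive = 0;
    for (int di = -1; di <= 1; di++) {
        for (int dj = -1; dj <= 1; dj++) {
            if (di != 0 || dj != 0) {
                alive += cur[(i + di) * width + (j + dj)];
            }
        }
    }
    if (cur[i * width + j]) {
        return (alive == 2 || alive == 3);  /* survives with 2 or 3 neighbors */
    } else {
        return (alive == 3);                /* birth with exactly 3 neighbors */
    }
}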


Parallel Programming Patterns 62

Example: Game of Life

● See game-of-life.c


Parallel Programming Patterns 63

Parallel programming patterns: Reduce


Parallel Programming Patterns 64

Reduce

● A reduction is the application of an associative binary operator (e.g., sum, product, min, max...) to the elements of an array [x0, x1, … xn-1]
  – sum-reduce( [x0, x1, … xn-1] ) = x0 + x1 + … + xn-1
  – min-reduce( [x0, x1, … xn-1] ) = min { x0, x1, … xn-1 }
  – …
● A reduction can be realized in O(log2 n) parallel steps
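In shared-memory code a reduction is usually expressed directly with the reduction clause; a minimal sum-reduce sketch over x[0..n-1] (illustrative names):

double sum_reduce(const double *x, int n)
{
    double s = 0.0;
    /* each thread accumulates a private partial sum; the partial sums
       are then combined, e.g. pairwise as in the tree on the next slides */
#pragma omp parallel for reduction(+:s)
    for (int i = 0; i < n; i++) {
        s += x[i];
    }
    return s;
}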

Parallel Programming Patterns 65

Example: sum

[Figure: tree-structured sum reduction shown step by step; pairs of values are added at each level until the final result (34) remains]

Parallel Programming Patterns 70

Example: sum

int d, i;
/* compute largest power of two < n */
for (d=1; 2*d < n; d *= 2)
    ;
/* do reduction */
for ( ; d>0; d /= 2 ) {
    for (i=0; i<d; i++) {
        if (i+d<n)
            x[i] += x[i+d];
    }
}
return x[0];

See reduction.c


Parallel Programming Patterns 71

Work efficiency

● How many sums are computed by the parallel reduction algorithm?
  – n / 2 sums at the first level
  – n / 4 sums at the second level
  – …
  – n / 2^j sums at the j-th level
  – …
  – 1 sum at the (log2 n)-th level
● Total: O(n) sums
  – The tree-structured reduction algorithm is work-efficient, meaning that it performs the same amount of “work” as the optimal serial algorithm


Parallel Programming Patterns 72

Parallel programming patterns: Scan


Parallel Programming Patterns 73

Scan (Prefix Sum)

● A scan computes all prefixes of an array [x0, x1, … xn-1] using a given associative binary operator op (e.g., sum, product, min, max...)

[y0, y1, … yn-1] = inclusive-scan( op, [x0, x1, … xn-1] )

where

y0 = x0
y1 = x0 op x1
y2 = x0 op x1 op x2
…
yn-1 = x0 op x1 op … op xn-1


Parallel Programming Patterns 74

Scan (Prefix Sum)

● A scan computes all prefixes of an array [x0, x1, … xn-1] using a given associative binary operator op (e.g., sum, product, min, max...)

[y0, y1, … yn-1] = exclusive-scan( op, [x0, x1, … xn-1] )

where

y0 = 0   (the neutral element of the binary operator: zero for sum, 1 for product, ...)
y1 = x0
y2 = x0 op x1
…
yn-1 = x0 op x1 op … op xn-2


Parallel Programming Patterns 75

Example

x[] =                  1  -3  12   6   2  -3   7  -10
inclusive-scan(+, x) = 1  -2  10  16  18  15  22   12
exclusive-scan(+, x) = 0   1  -2  10  16  18  15   22



Parallel Programming Patterns 78

Serial implementation

void inclusive_scan(int *x, int *s, int n) // n must be > 0
{
    int i;
    s[0] = x[0];
    for (i=1; i<n; i++) {
        s[i] = s[i-1] + x[i];
    }
}

void exclusive_scan(int *x, int *s, int n) // n must be > 0
{
    int i;
    s[0] = 0;
    for (i=1; i<n; i++) {
        s[i] = s[i-1] + x[i-1];
    }
}
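A quick check of the two functions on the array from the example slide (a small test driver, not part of the original code; it assumes the two functions above are in scope):

#include <stdio.h>

int main(void)
{
    int x[] = {1, -3, 12, 6, 2, -3, 7, -10};
    int n = 8, inc[8], exc[8], i;

    inclusive_scan(x, inc, n);  /* expected: 1 -2 10 16 18 15 22 12 */
    exclusive_scan(x, exc, n);  /* expected: 0  1 -2 10 16 18 15 22 */

    for (i = 0; i < n; i++) printf("%d ", inc[i]);
    printf("\n");
    for (i = 0; i < n; i++) printf("%d ", exc[i]);
    printf("\n");
    return 0;
}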


Parallel Programming Patterns 79

Exclusive scan: Up-sweep

x[0] x[1] x[2] x[3] x[4] x[5] x[6] x[7]

x[0] ∑x[0..1] x[2] ∑x[2..3] x[4] ∑x[4..5] x[6] ∑x[6..7]

x[0] ∑x[0..1] x[2] ∑x[0..3] x[4] ∑x[4..5] x[6] ∑x[4..7]

x[0] ∑x[0..1] x[2] ∑x[0..3] x[4] ∑x[4..5] x[6] ∑x[0..7]

for ( d=1; d<n/2; d *= 2 ) {
    for ( k=0; k<n; k += 2*d ) {
        x[k+2*d-1] = x[k+d-1] + x[k+2*d-1];
    }
}

O(n) additions

http://http.developer.nvidia.com/GPUGems3/gpugems3_ch39.html


Parallel Programming Patterns 80

Exclusive scan: Down-sweep

x[0] ∑x[0..1] x[2] ∑x[0..3] x[4] ∑x[4..5] x[6] ∑x[0..7]

x[0] ∑x[0..1] x[2] ∑x[0..3] x[4] ∑x[4..5] x[6] 0

x[0] ∑x[0..1] x[2] 0 x[4] ∑x[4..5] x[6] ∑x[0..3]

x[0] 0 x[2] ∑x[0..1] x[4] ∑x[0..3] x[6] ∑x[0..5]

0 x[0] ∑x[0..1] ∑x[0..2] ∑x[0..3] ∑x[0..4] ∑x[0..5] ∑x[0..6]

http://http.developer.nvidia.com/GPUGems3/gpugems3_ch39.html

x[n-1] = 0;
for ( ; d > 0; d >>= 1 ) {
    for ( k=0; k<n; k += 2*d ) {
        float t = x[k+d-1];
        x[k+d-1] = x[k+2*d-1];
        x[k+2*d-1] = t + x[k+2*d-1];
    }
}

O(n) additions

See prefix-sum.c


Parallel Programming Patterns 81

Example: Line of Sight

● n peaks of heights h[0], … h[n - 1]; the distance between consecutive peaks is one

● Which peaks are visible from peak 0?

[Figure: peaks h[0] … h[7]; some peaks are visible from peak 0, others are not]


Parallel Programming Patterns 82

Line of sight

Source: Guy E. Blelloch, Prefix Sums and Their Applications

Parallel Programming Patterns 83

Line of sight

h[0] h[1] h[2] h[3] h[4] h[5] h[6] h[7]


Parallel Programming Patterns 92

Serial algorithm

● For each i = 0, … n – 1
  – Let a[i] be the slope of the line connecting peak 0 to peak i
  – a[0] ← -∞
  – a[i] ← arctan( ( h[i] – h[0] ) / i ), if i > 0
● For each i = 0, … n – 1
  – amax[0] ← -∞
  – amax[i] ← max { a[0], a[1], … a[i – 1] }, if i > 0
● For each i = 0, … n – 1
  – If a[i] ≥ amax[i] then peak i is visible
  – otherwise peak i is not visible


Parallel Programming Patterns 93

Serial algorithm

bool[0..n-1] Line-of-sight( double h[0..n-1] )
    bool v[0..n-1]
    double a[0..n-1], amax[0..n-1]
    a[0] ← -∞
    for i ← 1 to n-1 do
        a[i] ← arctan( ( h[i] – h[0] ) / i )
    endfor
    amax[0] ← -∞
    for i ← 1 to n-1 do
        amax[i] ← max{ a[i-1], amax[i-1] }
    endfor
    for i ← 0 to n-1 do
        v[i] ← ( a[i] ≥ amax[i] )
    endfor
    return v


Parallel Programming Patterns 94

Serial algorithm

The pseudocode is the same as on the previous slide; the loops that compute a[i] and v[i] are embarrassingly parallel, while the loop that computes amax[] carries a dependency between iterations.


Parallel Programming Patterns 95

Parallel algorithm

bool[0..n-1] Parallel-line-of-sight( double h[0..n-1] )
    bool v[0..n-1]
    double a[0..n-1], amax[0..n-1]
    a[0] ← -∞
    for i ← 1 to n-1 do in parallel
        a[i] ← arctan( ( h[i] – h[0] ) / i )
    endfor
    amax ← exclusive-scan( max, a )
    for i ← 0 to n-1 do in parallel
        v[i] ← ( a[i] ≥ amax[i] )
    endfor
    return v
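A compact C version of this algorithm (illustrative; the exclusive max-scan is written serially here, but it is exactly the scan pattern that can be parallelized as shown earlier). -INFINITY from <math.h> plays the role of -∞, and C99 variable-length arrays are used for brevity.

#include <math.h>
#include <stdbool.h>

/* v[i] becomes true if peak i is visible from peak 0 (n > 0). */
void line_of_sight(const double *h, bool *v, int n)
{
    double a[n], amax[n];
    int i;

    a[0] = -INFINITY;
    /* embarrassingly parallel */
#pragma omp parallel for
    for (i = 1; i < n; i++) {
        a[i] = atan((h[i] - h[0]) / i);
    }

    /* exclusive max-scan of a[] (serial here; parallelizable as a scan) */
    amax[0] = -INFINITY;
    for (i = 1; i < n; i++) {
        amax[i] = (a[i-1] > amax[i-1]) ? a[i-1] : amax[i-1];
    }

    /* embarrassingly parallel */
#pragma omp parallel for
    for (i = 0; i < n; i++) {
        v[i] = (a[i] >= amax[i]);
    }
}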


Parallel Programming Patterns 96

Conclusions

● A parallel programming pattern defines:
  – a partitioning of the input data
  – a communication structure among parallel tasks
● Parallel programming patterns can help to define efficient algorithms
  – Many problems can be solved using one or more known patterns