Parallel Patterns - Reduce & Scan

Upload: hila

Post on 22-Feb-2016


TRANSCRIPT

Page 1: Parallel  Patterns Reduce & Scan

Parallel Patterns - Reduce & Scan 1

PARALLEL PATTERNS REDUCE & SCAN

6/16/2010

Page 2: Parallel  Patterns Reduce & Scan

Programming Patterns For Parallelism
• Some patterns repeat in many different contexts
  • e.g. searching for an element in an array
• Identifying such patterns is important
  • Solve a problem once and reuse the solution
  • Split a hard problem into individual problems
  • Helps define interfaces

Page 4: Parallel  Patterns Reduce & Scan

We Have Already Seen Some Patterns
• Divide and Conquer
  • Split a problem into n subproblems
  • Recursively solve the subproblems
  • Merge the solutions
• Data Parallelism
  • Apply the same function to all elements in a collection or array
  • Parallel.For, Parallel.ForEach
  • Also called "map" in functional programming

Page 5: Parallel  Patterns Reduce & Scan

Map
• Given a function f : (A) => B
• A collection a: A[]
• Generates a collection b: B[], where b[i] = f( a[i] )
• Parallel.For, Parallel.ForEach
  • Where each loop iteration is independent
[Figure: f applied to each element of A, producing B]
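As a rough illustration of Map, here is a minimal Python sketch. The slides refer to .NET's Parallel.For and Parallel.ForEach; Python's thread pool stands in for them here, and the name `parallel_map` is hypothetical. The point is only that every call to f is independent, so the calls may run in any order while the results keep their positions.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_map(f, a, workers=4):
    """Apply f to every element of a; b[i] = f(a[i]) (hypothetical helper)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in input order, even if the
        # individual calls finish out of order.
        return list(pool.map(f, a))


squares = parallel_map(lambda x: x * x, [1, 2, 3, 4])
print(squares)
```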

Page 7: Parallel  Patterns Reduce & Scan

Reduce And Scan
• In practice, parallel loops have to work together to generate an answer
• Reduce and Scan patterns capture common cases of processing results of Map
• Note: Map and Reduce are similar to but not the same as MapReduce
  • MapReduce is a framework for distributed computing

Page 9: Parallel  Patterns Reduce & Scan

Reduce
• Given a function f: (A, B) => B
• A collection a: A[]
• An initial value b0: B
• Generate a final value b: B
• Where b = f(a[n-1], … f(a[1], f(a[0], b0)) )
• Only consider the case where A and B are the same type
[Figure: chain of f applications over the elements of A, from b0 to b]

Page 10: Parallel  Patterns Reduce & Scan

Reduce

    B acc = b_0;
    for( i = 0; i < n; i++ ) {
        acc = f( a[i], acc );
    }
    b = acc;
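The sequential loop above translates almost line for line into Python. This is a minimal sketch; `reduce_seq` is a hypothetical name, and it keeps the slide's argument order (element first, accumulator second). Python's built-in `functools.reduce` passes the accumulator first instead.

```python
def reduce_seq(f, a, b0):
    """Sequential reduce: b = f(a[n-1], ... f(a[1], f(a[0], b0)))."""
    acc = b0
    for x in a:
        acc = f(x, acc)  # element first, accumulator second, as on the slide
    return acc


total = reduce_seq(lambda x, acc: x + acc, [3, 1, 4, 1, 5], 0)
largest = reduce_seq(max, [3, 1, 4, 1, 5], float("-inf"))
print(total, largest)
```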

Page 12: Parallel  Patterns Reduce & Scan

Associativity of the Reduce function
• Reduce is parallelizable if f is associative
    f(a, f(b, c)) = f(f(a, b), c)
• E.g. addition: (a + b) + c = a + (b + c)
  • Where + is integer addition (with modulo arithmetic)
  • But not when + is floating point addition
• Max, min, multiply, …
• Set union, intersection, …
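The difference between integer and floating point addition can be checked directly. The sketch below is illustrative only: `is_associative_on` is a hypothetical helper that tests the associativity equation for one specific triple, and 32-bit wraparound is an assumed choice of modulus.

```python
def is_associative_on(f, a, b, c):
    """Check f(a, f(b, c)) == f(f(a, b), c) for one triple of values."""
    return f(a, f(b, c)) == f(f(a, b), c)


def int_add(x, y):
    # Integer addition with modulo (wraparound) arithmetic; 32 bits assumed.
    return (x + y) % 2**32


def float_add(x, y):
    return x + y


print(is_associative_on(int_add, 2**31, 2**31, 5))   # wraparound stays associative
print(is_associative_on(float_add, 0.1, 0.2, 0.3))   # rounding breaks it
print(is_associative_on(max, 3, 9, 4))
```

A single passing triple does not prove associativity, but a single failing triple, as with the floating point values here, is enough to disprove it.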

Page 13: Parallel  Patterns Reduce & Scan

We can use Divide and Conquer
• Reduce(f, A[1..n], b_0) = f( Reduce(f, A[1..n/2], b_0), Reduce(f, A[n/2+1..n], I) )
  where I is the identity element of f
[Figure: two chains of f applications over the halves of A, one starting from b0 and one from I, combined by a final f into b]
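The divide-and-conquer formulation can be sketched as a recursive function. This is an illustration under assumptions, not the lecture's implementation: the names `reduce_seq` and `reduce_dc` and the `cutoff` parameter are hypothetical, and the recursion runs sequentially here. The two recursive calls share no data, so they are the part that could execute in parallel. One detail: with the element-first argument order used in these slides, the result of the right half goes in as the first argument of the final f. For a commutative f this is the same as the formula above.

```python
def reduce_seq(f, a, b0):
    acc = b0
    for x in a:
        acc = f(x, acc)
    return acc


def reduce_dc(f, a, b0, identity, cutoff=2):
    """Divide-and-conquer reduce; f must be associative with the given identity."""
    if len(a) <= cutoff:
        return reduce_seq(f, a, b0)          # small base case: plain loop
    mid = len(a) // 2
    left = reduce_dc(f, a[:mid], b0, identity, cutoff)         # independent of
    right = reduce_dc(f, a[mid:], identity, identity, cutoff)  # each other
    # Later elements sit on the outside of the nested f calls,
    # so the right half is passed first.
    return f(right, left)


concat = lambda x, acc: x + acc
print(reduce_dc(concat, list("abcdefgh"), "", ""))
print(reduce_dc(lambda x, acc: x + acc, list(range(1, 101)), 0, 0))
```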

Page 16: Parallel  Patterns Reduce & Scan

Implementation Optimizations
• Switch to sequential Reduce for the base k elements
• Do k way splits instead of two way splits
• Maintain a thread-local accumulated value
  • A task updates the value of the thread it executes in
  • Requires that the reduce function is also commutative
      f(a, b) = f(b, a)
  • Thread local values are then merged in a separate pass
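The thread-local accumulator idea might look roughly like the sketch below. Everything here is an assumption made for illustration: the name `reduce_thread_local`, the chunk size, and the use of a shared work queue. Because each thread picks up whichever chunk is available next, elements are combined in an unpredictable order, which is why f has to be commutative as well as associative. On standard CPython builds the global interpreter lock prevents a real speedup for this kind of work, so the code shows the structure rather than the performance.

```python
import queue
import threading


def reduce_thread_local(f, a, identity, workers=4, chunk=1000):
    """Reduce with one private accumulator per thread, merged at the end."""
    tasks = queue.Queue()
    for start in range(0, len(a), chunk):
        tasks.put(a[start:start + chunk])

    partials = [identity] * workers  # one slot per thread, no sharing

    def worker(slot):
        acc = identity               # thread-local accumulated value
        while True:
            try:
                block = tasks.get_nowait()
            except queue.Empty:
                break
            for x in block:
                acc = f(x, acc)
        partials[slot] = acc

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Separate merge pass over the per-thread values.
    result = identity
    for p in partials:
        result = f(p, result)
    return result


print(reduce_thread_local(lambda x, acc: x + acc, list(range(10001)), 0))
```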

Page 17: Parallel  Patterns Reduce & Scan

Scan
• Given a function f: (A, B) => B
• A collection a: A[]
• An initial value b0: B
• Generate a collection b: B[]
• Where b[i] = f(a[i-1], … f(a[1], f(a[0], b0)) )
[Figure: chain of f applications over the elements of A, starting from b0]

Page 18: Parallel  Patterns Reduce & Scan

Scan

    B acc = b_0;
    for( i = 0; i < n; i++ ) {
        b[i] = acc;            // b[i] combines b_0 with a[0..i-1]
        acc = f( a[i], acc );
    }
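A runnable version of the sequential scan, following the definition b[i] = f(a[i-1], … f(a[0], b0)), so that b[0] is b0 itself. This is a minimal sketch and `scan_seq` is a hypothetical name. The result is checked against `itertools.accumulate` from the standard library, which takes the accumulator as its first argument and, with `initial=`, returns one extra trailing element.

```python
from itertools import accumulate
from operator import add


def scan_seq(f, a, b0):
    """Sequential scan: b[i] is the reduce of a[0..i-1] starting from b0."""
    b = []
    acc = b0
    for x in a:
        b.append(acc)    # record the value before a[i] is folded in
        acc = f(x, acc)
    return b


prefix_sums = scan_seq(lambda x, acc: x + acc, [1, 2, 3, 4], 0)
print(prefix_sums)

# Same result from the standard library, dropping the final total.
assert prefix_sums == list(accumulate([1, 2, 3, 4], add, initial=0))[:-1]
```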

Page 20: Parallel  Patterns Reduce & Scan

Scan is Efficiently Parallelizable
• When f is associative
• Scan(f, A[1..n], b_0) = Scan(f, A[1..n/2], b_0), Scan(f, A[n/2+1..n], ____)
[Figure: chain of f applications over the elements of A, starting from b0; the starting value for the second half is marked "?"]

Page 21: Parallel  Patterns Reduce & Scan

Scan is Efficiently Parallelizable
• When f is associative
• Scan(f, A[1..n], b_0) = Scan(f, A[1..n/2], b_0), Scan(f, A[n/2+1..n], Reduce(f, A[1..n/2], b_0))
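The recurrence above can be written out as a short recursive sketch. The names `reduce_seq`, `scan_seq` and `scan_dc` and the `cutoff` parameter are hypothetical, and the code runs sequentially. Once the starting value for the right half is known, the two half-scans do not depend on each other, and when f is associative that starting value can itself be computed by a parallel Reduce. This simple version recomputes the reduce at every level, so it does more total work than an optimized scan would.

```python
def reduce_seq(f, a, b0):
    acc = b0
    for x in a:
        acc = f(x, acc)
    return acc


def scan_seq(f, a, b0):
    b, acc = [], b0
    for x in a:
        b.append(acc)
        acc = f(x, acc)
    return b


def scan_dc(f, a, b0, cutoff=2):
    """Divide-and-conquer scan following the recurrence on the slide."""
    if len(a) <= cutoff:
        return scan_seq(f, a, b0)
    mid = len(a) // 2
    seed = reduce_seq(f, a[:mid], b0)       # starting value for the right half
    left = scan_dc(f, a[:mid], b0, cutoff)  # these two scans are independent
    right = scan_dc(f, a[mid:], seed, cutoff)
    return left + right


print(scan_dc(lambda x, acc: x + acc, [1, 2, 3, 4, 5, 6, 7, 8], 0))
```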

Page 23: Parallel  Patterns Reduce & Scan

Scan is useful in many places
• Radix Sort
• Ray Tracing
• …

Page 25: Parallel  Patterns Reduce & Scan

Computing Line of Sight
• Given x0, … xn with altitudes alt[0], … alt[n]
• Which of the points are visible from x0?
• angle[i] = arctan( (alt[i] - alt[0]) / i )
  • The division by i assumes evenly spaced points, with xi at horizontal distance i from x0
• xi is visible from x0 if all points between them have a smaller angle than angle[i]

Page 26: Parallel  Patterns Reduce & Scan

Solution
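The content of this slide did not survive in the transcript. One way to apply the visibility criterion from the previous slide, offered here as an assumption rather than the lecture's own solution, is a scan with max: the running maximum of the angles before point i is compared with angle[i]. The function name `visible_points` is hypothetical, and the scan is computed sequentially with `itertools.accumulate` for brevity.

```python
import math
from itertools import accumulate


def visible_points(alt):
    """Return visibility flags for x1..xn as seen from x0 (unit spacing assumed)."""
    angle = [math.atan((alt[i] - alt[0]) / i) for i in range(1, len(alt))]
    # Scan with max: prev_max[k] is the largest angle among the points
    # strictly before the k-th one (minus infinity when there are none).
    prev_max = list(accumulate(angle, max, initial=-math.inf))[:-1]
    return [ang > m for ang, m in zip(angle, prev_max)]


print(visible_points([0, 1, 1, 6, 2]))
```

Both steps, computing the angles (a Map) and taking the running maximum (a Scan with an associative function), fit the patterns described earlier.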

Page 30: Parallel  Patterns Reduce & Scan

Radix Sort

Input:                       5 = 101, 7 = 111, 2 = 010, 4 = 100, 5 = 101, 3 = 011, 1 = 001
After pass on bit 0 (LSB):   2 = 010, 4 = 100, 5 = 101, 7 = 111, 5 = 101, 3 = 011, 1 = 001
After pass on bit 1:         4 = 100, 5 = 101, 5 = 101, 1 = 001, 2 = 010, 7 = 111, 3 = 011
After pass on bit 2 (MSB):   1 = 001, 2 = 010, 3 = 011, 4 = 100, 5 = 101, 5 = 101, 7 = 111
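The passes shown above can be reproduced with a small sketch: each pass is a stable partition on one bit, starting with the least significant. The name `radix_sort_passes` is hypothetical, and the partition here is written with plain list comprehensions rather than a parallel primitive.

```python
def radix_sort_passes(values, bits):
    """Return the array after each pass of a binary least-significant-bit radix sort."""
    passes = []
    current = list(values)
    for bit in range(bits):
        # Stable partition: elements with a 0 in this bit keep their
        # relative order and move ahead of elements with a 1.
        zeros = [v for v in current if not (v >> bit) & 1]
        ones = [v for v in current if (v >> bit) & 1]
        current = zeros + ones
        passes.append(current)
    return passes


for step in radix_sort_passes([5, 7, 2, 4, 5, 3, 1], bits=3):
    print(step)
```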

Page 31: Parallel  Patterns Reduce & Scan

Basic Primitive: Pack
• Given an array A and an array F of flags
  • A = [5 7 2 4 5 3 1]
  • F = [1 1 0 0 1 1 1]
• Pack all elements with flag = 0 before elements with flag = 1
  • A’ = [2 4 5 7 5 3 1]

Page 32: Parallel  Patterns Reduce & Scan

Solution
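The content of this slide did not survive in the transcript. A common way to build Pack from Scan, sketched here as an assumption rather than the lecture's own solution, is to compute each element's destination with a prefix sum: a scan over the inverted flags gives the slot of every flag-0 element, and a scan over the flags gives the offset of every flag-1 element after the zeros. The names `exclusive_sum` and `pack` are hypothetical, and the scans run sequentially here.

```python
from itertools import accumulate


def exclusive_sum(xs):
    """Prefix sums where position i holds the sum of xs[0..i-1]."""
    return list(accumulate(xs, initial=0))[:-1]


def pack(a, flags):
    """Move flag-0 elements before flag-1 elements, keeping relative order."""
    zero_marks = [1 - f for f in flags]
    zero_pos = exclusive_sum(zero_marks)  # destination among the zeros
    one_pos = exclusive_sum(flags)        # destination among the ones
    num_zeros = sum(zero_marks)
    out = [None] * len(a)
    # Every element's destination is known up front, so these writes are
    # independent of each other (a Map over the indices).
    for i, (x, f) in enumerate(zip(a, flags)):
        dest = zero_pos[i] if f == 0 else num_zeros + one_pos[i]
        out[dest] = x
    return out


print(pack([5, 7, 2, 4, 5, 3, 1], [1, 1, 0, 0, 1, 1, 1]))
```

Using bit k of each element as its flag turns one call to `pack` into one pass of the radix sort shown earlier.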

Page 33: Parallel  Patterns Reduce & Scan

Other Applications of Scan
• Radix Sort
• Computing Line of Sight
• Adding multi-precision numbers
• Quick Sort
• To search for regular expressions
  • Parallel grep
• …

Page 34: Parallel  Patterns Reduce & Scan

High Level Points
• Minimize dependence between parallel loops
  • Unintended dependences = data races
  • Next lecture
• Carefully analyze remaining dependences
• Use Reduce and Scan patterns where applicable