cilk - a forerunning language for parallel...

114
Cilk A Forerunning Language for Parallel Programming Olivier Aumage STORM Team Inria – LaBRI Technological Development’s Tuesdays ST RM Static Optimizations – Runtime Methods

Upload: others

Post on 18-Aug-2020

13 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

CilkA Forerunning Language for Parallel Programming

Olivier Aumage – STORM TeamInria – LaBRI

Technological Development’s Tuesdays

ST RMStatic Optimizations – Runtime Methods

Page 2: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Olivier Aumage – STORM Team – Cilk 2

1Introduction

Page 3: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Introduction

The two “all-time” goals in parallel programming

Programming parallel applications– Easily

Running parallel applications– Efficiently

The Cilk language and framework played an anticipative rolein reaching these goals for some classes of applications

Olivier Aumage – STORM Team – Cilk – 1. Introduction 3

Page 4: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Cilk

Cilk in a Few Words

A programming environment– A language and compiler: keyword-based extension of C– An execution model and a run-time system

Developed at the MIT– Supertech Research Group– Charles E. Leiserson’s team– Mid-90’s

Olivier Aumage – STORM Team – Cilk – 1. Introduction 4

Page 5: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

History

Academic Era — Cilk– 1994: Cilk 1– 1998: The Implementation of the Cilk-5 Multithreaded Language paper

by Matteo Frigo, Charles E. Leiserson, and Keith H. Randall, at PLDI’98

Start-up Era — Cilk++– 2006: Launch of “Cilk Arts” company– 2008: Cilk++ version 1.0

Intel Era — Cilk Plus– 2009: Intel acquires Cilk Arts– 2010: Intel Cilk Plus released as part of the Intel C++ Compiler– 2012: Release of the Cilk Plus support for the GNU GCC Compiler,

implemented by Intel

Olivier Aumage – STORM Team – Cilk – 1. Introduction 5

Page 6: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Context

Middle of nineties

HardwareSMP: Symmetric Multi-ProcessorsNeed for parallel programming models

SoftwareNotion of threads: concurrent processing contexts within single processHow to efficiently/easily express application parallelism using threads?

Program Easily?Parallel program quickly derived from sequential programConcurrency expressed safely (correctness, consistency)

Run Efficiently?No over/under-subscriptionLoad-balancingLow overhead

Olivier Aumage – STORM Team – Cilk – 1. Introduction 6

Page 7: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Context

Middle of nineties

HardwareSMP: Symmetric Multi-ProcessorsNeed for parallel programming models

SoftwareNotion of threads: concurrent processing contexts within single processHow to efficiently/easily express application parallelism using threads?

Program Easily?Parallel program quickly derived from sequential programConcurrency expressed safely (correctness, consistency)

Run Efficiently?No over/under-subscriptionLoad-balancingLow overhead

Olivier Aumage – STORM Team – Cilk – 1. Introduction 6

Page 8: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Context

Middle of nineties

HardwareSMP: Symmetric Multi-ProcessorsNeed for parallel programming models

SoftwareNotion of threads: concurrent processing contexts within single processHow to efficiently/easily express application parallelism using threads?

Program Easily?Parallel program quickly derived from sequential programConcurrency expressed safely (correctness, consistency)

Run Efficiently?No over/under-subscriptionLoad-balancingLow overhead

Olivier Aumage – STORM Team – Cilk – 1. Introduction 6

Page 9: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Context

Middle of nineties

HardwareSMP: Symmetric Multi-ProcessorsNeed for parallel programming models

SoftwareNotion of threads: concurrent processing contexts within single processHow to efficiently/easily express application parallelism using threads?

Program Easily?Parallel program quickly derived from sequential programConcurrency expressed safely (correctness, consistency)

Run Efficiently?No over/under-subscriptionLoad-balancingLow overhead

Olivier Aumage – STORM Team – Cilk – 1. Introduction 6

Page 10: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Context

Nowadays

Hardware

SMP: Symmetric Multi-Processors

Multi-core processors

Hardware multi-threading (SMT, Hyperthreading, etc.)

Even greater need for parallel programming models!

Revived interest for Cilk-like approaches

Olivier Aumage – STORM Team – Cilk – 1. Introduction 7

Page 11: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Context

Nowadays

Hardware

SMP: Symmetric Multi-Processors

Multi-core processors

Hardware multi-threading (SMT, Hyperthreading, etc.)

Even greater need for parallel programming models!

Revived interest for Cilk-like approaches

Olivier Aumage – STORM Team – Cilk – 1. Introduction 7

Page 12: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Olivier Aumage – STORM Team – Cilk 8

2Concept and Ideas

Page 13: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Concept and Ideas

C extension

Task Parallelism paradigm

Work Stealing paradigm

THE Algorithm

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 9

Page 14: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

C Extension

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 10

Cilk is a faithful extension of C

Keyword based language– Per opposition to pragma based

languages

Main keywords (original Cilk):– cilk: declaration of a potentially

parallel routine– spawn: launch of a potentially

parallel routine– sync: wait for completion of

launched routines– inlet : special function to

aggregate results (reduction)A faithful extension of C

– The C elision of a Cilk program isa valid C program

Page 15: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

C Extension

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 10

Cilk is a faithful extension of C

Keyword based language– Per opposition to pragma based

languagesMain keywords (original Cilk):

– cilk: declaration of a potentiallyparallel routine

– spawn: launch of a potentiallyparallel routine

– sync: wait for completion oflaunched routines

– inlet : special function toaggregate results (reduction)

A faithful extension of C– The C elision of a Cilk program is

a valid C program

1 #include < c i l k . h>2

3 c i l k i n t f i b ( i n t n ) {4 i f ( n < 2) {5 return n ;6 } else {7 i n t x , y ;8 x = spawn f i b ( n−1) ;9 y = spawn f i b ( n−2) ;

10 sync ;11 return x+y ;12 }13 }

Page 16: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

C Extension

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 10

Cilk is a faithful extension of C

Keyword based language– Per opposition to pragma based

languagesMain keywords (original Cilk):

– cilk: declaration of a potentiallyparallel routine

– spawn: launch of a potentiallyparallel routine

– sync: wait for completion oflaunched routines

– inlet : special function toaggregate results (reduction)

A faithful extension of C– The C elision of a Cilk program is

a valid C program

1

2

3 i n t f i b ( i n t n ) {4 i f ( n < 2) {5 return n ;6 } else {7 i n t x , y ;8 x = f i b ( n−1) ;9 y = f i b ( n−2) ;

10

11 return x+y ;12 }13 }

Page 17: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

C Extension

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 10

Cilk is a faithful extension of C

Keyword based language– Per opposition to pragma based

languagesMain keywords (original Cilk):

– cilk: declaration of a potentiallyparallel routine

– spawn: launch of a potentiallyparallel routine

– sync: wait for completion oflaunched routines

– inlet : special function toaggregate results (reduction)

A faithful extension of C– The C elision of a Cilk program is

a valid C program

1 #include < c i l k . h>2

3 c i l k i n t f i b ( i n t n ) {4 i f ( n < 2) {5 return n ;6 } else {7 i n t x , y ;8 x = spawn f i b ( n−1) ;9 y = spawn f i b ( n−2) ;

10 sync ;11 return x+y ;12 }13 }

Page 18: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Task Parallelism Paradigm

Potential Parallelism

The programmer declares what can be run in parallel

Cilk’s runtime decides what to run in parallel

Work / Worker DecouplingWorkers: threads launched by Cilk

Work: tasks expressed by the Application

RationalizationGeneric worker threads instead of special purpose threads

Externalized common parallelism managementExternalized resource allocation policy

– Potential parallelism vs number of concurrent contexts

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 11

Page 19: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Task Parallelism Paradigm

Potential Parallelism

The programmer declares what can be run in parallel

Cilk’s runtime decides what to run in parallel

Work / Worker DecouplingWorkers: threads launched by Cilk

Work: tasks expressed by the Application

RationalizationGeneric worker threads instead of special purpose threads

Externalized common parallelism managementExternalized resource allocation policy

– Potential parallelism vs number of concurrent contexts

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 11

Page 20: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Task Parallelism Paradigm

Potential Parallelism

The programmer declares what can be run in parallel

Cilk’s runtime decides what to run in parallel

Work / Worker DecouplingWorkers: threads launched by Cilk

Work: tasks expressed by the Application

RationalizationGeneric worker threads instead of special purpose threads

Externalized common parallelism managementExternalized resource allocation policy

– Potential parallelism vs number of concurrent contexts

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 11

Page 21: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Tasks

Notion of frameState of the current cilk function being executed

Live local variables, function arguments

“Program Counter” (PC)

A Frame is a Task

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 12

Page 22: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Tasks

Notion of frameState of the current cilk function being executed

Live local variables, function arguments

“Program Counter” (PC)

A Frame is a Task

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 12

Page 23: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Task Lists

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 13

Notion of frame dequeOne task list per workerImplemented as a “deque”(doubly-ended queue of frames)

– Head H– Tail T

Page 24: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Task Lists

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 13

Notion of frame dequeOne task list per workerImplemented as a “deque”(doubly-ended queue of frames)

– Head H– Tail T

H = T

Worker

8

7

6

5

4

3

1

10

2

9

Page 25: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Task Lists

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 13

Notion of frame dequeOne task list per workerImplemented as a “deque”(doubly-ended queue of frames)

– Head H– Tail T

A worker thread pushes/pops framesat the Tail side of its deque

T

H

Worker

8

7

6

5

4

3

1

10

2

9

Page 26: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Task Lists

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 13

Notion of frame dequeOne task list per workerImplemented as a “deque”(doubly-ended queue of frames)

– Head H– Tail T

A worker thread pushes/pops framesat the Tail side of its deque

T

H

Worker

8

7

6

5

4

3

1

10

2

9

Page 27: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Task Lists

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 13

Notion of frame dequeOne task list per workerImplemented as a “deque”(doubly-ended queue of frames)

– Head H– Tail T

A worker thread pushes/pops framesat the Tail side of its deque

T

H

Worker

8

7

6

5

4

3

1

10

2

9

Page 28: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Task Lists

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 13

Notion of frame dequeOne task list per workerImplemented as a “deque”(doubly-ended queue of frames)

– Head H– Tail T

A worker thread pushes/pops framesat the Tail side of its deque

T

H

Worker

8

7

6

5

4

3

1

10

2

9

Page 29: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Task Spawn

Upon a spawn, the worker. . .. . . suspends the parent function

. . . saves the state of the parent function in its frame

. . . pushes the parent frame on its task list

. . . before calling the spawned child function

When the child function completes and returns, the worker. . .. . . pops the parent frame

. . . restores the state of the parent function from its frame

. . . resumes the parent function

This is more or less what regular functions do. . .

Where is the parallelism?

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 14

Page 30: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Task Spawn

Upon a spawn, the worker. . .. . . suspends the parent function

. . . saves the state of the parent function in its frame

. . . pushes the parent frame on its task list

. . . before calling the spawned child function

When the child function completes and returns, the worker. . .. . . pops the parent frame

. . . restores the state of the parent function from its frame

. . . resumes the parent function

This is more or less what regular functions do. . .

Where is the parallelism?

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 14

Page 31: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Task Spawn

Upon a spawn, the worker. . .. . . suspends the parent function

. . . saves the state of the parent function in its frame

. . . pushes the parent frame on its task list

. . . before calling the spawned child function

When the child function completes and returns, the worker. . .. . . pops the parent frame

. . . restores the state of the parent function from its frame

. . . resumes the parent function

This is more or less what regular functions do. . .

Where is the parallelism?

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 14

Page 32: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Task Spawn

Upon a spawn, the worker. . .. . . suspends the parent function

. . . saves the state of the parent function in its frame

. . . pushes the parent frame on its task list

. . . before calling the spawned child function

When the child function completes and returns, the worker. . .. . . pops the parent frame

. . . restores the state of the parent function from its frame

. . . resumes the parent function

This is more or less what regular functions do. . .

Where is the parallelism?

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 14

Page 33: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Example: Deque Management on Spawn

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 15

1 #include < c i l k . h>2

3 c i l k i n t f i b ( i n t n ) {4 i f ( n < 2) {5 return n ;6 } else {7 i n t x , y ;8 x = spawn f i b ( n−1) ;9 y = spawn f i b ( n−2) ;

10 sync ;11 return x+y ;12 }13 }

Page 34: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Example: Deque Management on Spawn

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 15

1 #include < c i l k . h>2

3 c i l k i n t f i b ( i n t n ) {4 i f ( n < 2) {5 return n ;6 } else {7 i n t x , y ;8 x = spawn f i b ( n−1) ;9 y = spawn f i b ( n−2) ;

10 sync ;11 return x+y ;12 }13 }

H = T

Worker

8

7

6

5

4

3

1

10

2

9

Page 35: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Example: Deque Management on Spawn

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 15

1 #include < c i l k . h>2

3 c i l k i n t f i b ( i n t n ) {4 i f ( n < 2) {5 return n ;6 } else {7 i n t x , y ;8 x = spawn f i b ( n−1) ;9 y = spawn f i b ( n−2) ;

10 sync ;11 return x+y ;12 }13 }

Fib n, step 2

T

H

Worker

8

7

6

5

4

3

1

10

2

9

Page 36: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Example: Deque Management on Spawn

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 15

1 #include < c i l k . h>2

3 c i l k i n t f i b ( i n t n ) {4 i f ( n < 2) {5 return n ;6 } else {7 i n t x , y ;8 x = spawn f i b ( n−1) ;9 y = spawn f i b ( n−2) ;

10 sync ;11 return x+y ;12 }13 }

Fib n−1, step 2

Fib n, step 2

T

H

Worker

8

7

6

5

4

3

1

10

2

9

Page 37: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Example: Deque Management on Spawn

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 15

1 #include < c i l k . h>2

3 c i l k i n t f i b ( i n t n ) {4 i f ( n < 2) {5 return n ;6 } else {7 i n t x , y ;8 x = spawn f i b ( n−1) ;9 y = spawn f i b ( n−2) ;

10 sync ;11 return x+y ;12 }13 }

Fib n−2, step 2

Fib n−1, step 2

Fib n, step 2

T

H

Worker

8

7

6

5

4

3

1

10

2

9

Page 38: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Example: Deque Management on Spawn

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 15

1 #include < c i l k . h>2

3 c i l k i n t f i b ( i n t n ) {4 i f ( n < 2) {5 return n ;6 } else {7 i n t x , y ;8 x = spawn f i b ( n−1) ;9 y = spawn f i b ( n−2) ;

10 sync ;11 return x+y ;12 }13 }

. . .

Fib n−2, step 2

Fib n−1, step 2

Fib n, step 2

T

H

Worker

8

7

6

5

4

3

1

10

2

9

Page 39: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Example: Deque Management on Spawn

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 15

1 #include < c i l k . h>2

3 c i l k i n t f i b ( i n t n ) {4 i f ( n < 2) {5 return n ;6 } else {7 i n t x , y ;8 x = spawn f i b ( n−1) ;9 y = spawn f i b ( n−2) ;

10 sync ;11 return x+y ;12 }13 }

Fib 2, step 2

. . .

Fib n−2, step 2

Fib n−1, step 2

Fib n, step 2

T

H

Worker

8

7

6

5

4

3

1

10

2

9

Page 40: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Example: Deque Management on Spawn

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 15

1 #include < c i l k . h>2

3 c i l k i n t f i b ( i n t n ) {4 i f ( n < 2) {5 return n ;6 } else {7 i n t x , y ;8 x = spawn f i b ( n−1) ;9 y = spawn f i b ( n−2) ;

10 sync ;11 return x+y ;12 }13 }

. . .

Fib n−2, step 2

Fib n−1, step 2

Fib n, step 2

T

H

Worker

8

7

6

5

4

3

1

10

2

9

Page 41: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Example: Deque Management on Spawn

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 15

1 #include < c i l k . h>2

3 c i l k i n t f i b ( i n t n ) {4 i f ( n < 2) {5 return n ;6 } else {7 i n t x , y ;8 x = spawn f i b ( n−1) ;9 y = spawn f i b ( n−2) ;

10 sync ;11 return x+y ;12 }13 }

Fib 2, step 3

. . .

Fib n−2, step 2

Fib n−1, step 2

Fib n, step 2

T

H

Worker

8

7

6

5

4

3

1

10

2

9

Page 42: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Parallelism

Work Stealing paradigmIdle workers steal work. . .

. . . from other worker’s queues

Work stolen as frame/taskA thief resumes a suspended parent task. . .

. . . while a victim runs its child

Load balancing: Idle workers steal from busy workers

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 16

Page 43: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Parallelism

Work Stealing paradigmIdle workers steal work. . .

. . . from other worker’s queues

Work stolen as frame/taskA thief resumes a suspended parent task. . .

. . . while a victim runs its child

Load balancing: Idle workers steal from busy workers

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 16

Page 44: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Parallelism

Work Stealing paradigmIdle workers steal work. . .

. . . from other worker’s queues

Work stolen as frame/taskA thief resumes a suspended parent task. . .

. . . while a victim runs its child

Load balancing: Idle workers steal from busy workers

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 16

Page 45: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Task Spawn [UPDATED]

Upon a spawn, the worker. . .. . . suspends the parent function

. . . saves the state of the parent function in its frame

. . . pushes the parent frame on its task list

. . . before calling the spawned child function

When the child function completes and returns, the worker. . .. . . attempts to pop the parent frameif it succeeds, it. . .

– . . . restores the state of the parent function from its frame– . . . resumes the parent function

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 17

Page 46: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Task Spawn [UPDATED]

Upon a spawn, the worker. . .. . . suspends the parent function

. . . saves the state of the parent function in its frame

. . . pushes the parent frame on its task list

. . . before calling the spawned child function

When the child function completes and returns, the worker. . .. . . attempts to pop the parent frameif it succeeds, it. . .

– . . . restores the state of the parent function from its frame– . . . resumes the parent function

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 17

Page 47: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Work Stealing Implementation

Task lists implementation. . .Doubly-ended queue

Head H

Tail T

. . . with the following rulesWorkers push/pop work at the Tail side T of their own deque

An idle worker (thief) steals work at the Head side H of another worker(victim) deque

T >= H under normal conditions

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 18

Page 48: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Work Stealing Implementation

Task lists implementation. . .Doubly-ended queue

Head H

Tail T

. . . with the following rulesWorkers push/pop work at the Tail side T of their own deque

An idle worker (thief) steals work at the Head side H of another worker(victim) deque

T >= H under normal conditions

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 18

Page 49: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Example: Work Stealing

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 19

1 #include < c i l k . h>2

3 c i l k i n t f i b ( i n t n ) {4 i f ( n < 2) {5 return n ;6 } else {7 i n t x , y ;8 x = spawn f i b ( n−1) ;9 y = spawn f i b ( n−2) ;

10 sync ;11 return x+y ;12 }13 }

Fib 2, step 3

. . .

Fib n−2, step 2

Fib n−1, step 2

Fib n, step 2

T

H

Worker

8

7

6

5

4

3

1

10

2

9

Page 50: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Example: Work Stealing

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 19

1 #include < c i l k . h>2

3 c i l k i n t f i b ( i n t n ) {4 i f ( n < 2) {5 return n ;6 } else {7 i n t x , y ;8 x = spawn f i b ( n−1) ;9 y = spawn f i b ( n−2) ;

10 sync ;11 return x+y ;12 }13 }

Fib 2, step 3

. . .

Fib n−2, step 2

Fib n−1, step 2

T

H

Worker

8

7

6

5

4

3

1

10

2

9

Page 51: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Example: Work Stealing

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 19

1 #include < c i l k . h>2

3 c i l k i n t f i b ( i n t n ) {4 i f ( n < 2) {5 return n ;6 } else {7 i n t x , y ;8 x = spawn f i b ( n−1) ;9 y = spawn f i b ( n−2) ;

10 sync ;11 return x+y ;12 }13 }

Fib 2, step 3

. . .

Fib n−2, step 2

T

H

Worker

8

7

6

5

4

3

1

10

2

9

Page 52: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Work Stealing Implementation Discussion

Doubly-ended queue benefitsTail-side worker push/pop

– Depth-first– Locality

Head side steal– Breadth-first– Potentially favor big steals– Potentially steal less often

Potentially rare deque interferences between worker and thieves

Low overhead, high efficiency for Divide and Conquer algorithms

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 20

Page 53: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Example: Conflict

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 21

1 #include < c i l k . h>2

3 c i l k i n t f i b ( i n t n ) {4 i f ( n < 2) {5 return n ;6 } else {7 i n t x , y ;8 x = spawn f i b ( n−1) ;9 y = spawn f i b ( n−2) ;

10 sync ;11 return x+y ;12 }13 }

Fib 2, step 3

. . .

Fib n−2, step 2

Fib n−1, step 2

Fib n, step 2

T

H

Worker

8

7

6

5

4

3

1

10

2

9

Page 54: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Example: Conflict

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 21

1 #include < c i l k . h>2

3 c i l k i n t f i b ( i n t n ) {4 i f ( n < 2) {5 return n ;6 } else {7 i n t x , y ;8 x = spawn f i b ( n−1) ;9 y = spawn f i b ( n−2) ;

10 sync ;11 return x+y ;12 }13 }

Fib 2, step 3

. . .

Fib n−2, step 2

Fib n−1, step 2

T

H

Worker

8

7

6

5

4

3

1

10

2

9

Page 55: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Example: Conflict

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 21

1 #include < c i l k . h>2

3 c i l k i n t f i b ( i n t n ) {4 i f ( n < 2) {5 return n ;6 } else {7 i n t x , y ;8 x = spawn f i b ( n−1) ;9 y = spawn f i b ( n−2) ;

10 sync ;11 return x+y ;12 }13 }

Fib 2, step 3

. . .

Fib n−2, step 2

T

H

Worker

8

7

6

5

4

3

1

10

2

9

Page 56: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Example: Conflict

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 21

1 #include < c i l k . h>2

3 c i l k i n t f i b ( i n t n ) {4 i f ( n < 2) {5 return n ;6 } else {7 i n t x , y ;8 x = spawn f i b ( n−1) ;9 y = spawn f i b ( n−2) ;

10 sync ;11 return x+y ;12 }13 }

Fib 2, step 3

T

H

Worker

8

7

6

5

4

3

1

10

2

9

Page 57: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

TH(E) Algorithm

GeneralitiesT.H.(E.): Tail, Head, (Exception)Optimization of the deque synchronization mechanism

– Favor the potentially frequent worker pushes/pops. . .– . . . over the potentially infrequent thief steals

PrinciplesThe deque is protected by a expensive lock LThe thieves only modify H

– Thieves always take the lock L before changing H

The worker only modifies TConflicts between a worker and a thief arise when. . .

– . . . the invariant T >= H is broken

The worker speculatively modifies T without taking the lock L!

If the speculation fails, the worker retries with the L lock taken

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 22

Page 58: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

TH(E) Algorithm

GeneralitiesT.H.(E.): Tail, Head, (Exception)Optimization of the deque synchronization mechanism

– Favor the potentially frequent worker pushes/pops. . .– . . . over the potentially infrequent thief steals

PrinciplesThe deque is protected by a expensive lock LThe thieves only modify H

– Thieves always take the lock L before changing H

The worker only modifies TConflicts between a worker and a thief arise when. . .

– . . . the invariant T >= H is broken

The worker speculatively modifies T without taking the lock L!

If the speculation fails, the worker retries with the L lock taken

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 22

Page 59: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

TH(E) Algorithm

GeneralitiesT.H.(E.): Tail, Head, (Exception)Optimization of the deque synchronization mechanism

– Favor the potentially frequent worker pushes/pops. . .– . . . over the potentially infrequent thief steals

PrinciplesThe deque is protected by a expensive lock LThe thieves only modify H

– Thieves always take the lock L before changing H

The worker only modifies TConflicts between a worker and a thief arise when. . .

– . . . the invariant T >= H is broken

The worker speculatively modifies T without taking the lock L!

If the speculation fails, the worker retries with the L lock taken

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 22

Page 60: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

TH(E) Algorithm

GeneralitiesT.H.(E.): Tail, Head, (Exception)Optimization of the deque synchronization mechanism

– Favor the potentially frequent worker pushes/pops. . .– . . . over the potentially infrequent thief steals

PrinciplesThe deque is protected by a expensive lock LThe thieves only modify H

– Thieves always take the lock L before changing H

The worker only modifies TConflicts between a worker and a thief arise when. . .

– . . . the invariant T >= H is broken

The worker speculatively modifies T without taking the lock L!

If the speculation fails, the worker retries with the L lock taken

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 22

Page 61: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

TH(E) Algorithm Pseudocode

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 23

1 /∗ Worker ∗ /2 push ( ) {3 T++;4 }5

6 pop ( ) {7 T−−;8 __mem_fence__ ( ) ;9 i f (H > T) {

10 T++;11 lock ( L ) ;12 T−−;13 i f (H > T) {14 T++;15 unlock ( L ) ;16 return FAILURE ;17 }18 unlock ( L ) ;19 return SUCCESS;20 }

1 /∗ Th ie f ∗ /2 s t e a l ( ) {3 lock ( L ) ;4 H++;5 __mem_fence__ ( ) ;6 i f (H > T) {7 H−−;8 unlock ( L ) ;9 return FAILURE ;

10 }11 unlock ( L ) ;12 return SUCCESS;13 }

Page 62: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Cilk’s Keywords Summary

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 24

Rehearsal before looking at someexamples

cilk: declaration of a potentiallyparallel routine

spawn: launch of a potentiallyparallel routine

sync: wait for completion oflaunched routines

inlet : special function to aggregateresults (reduction)

1 #include < c i l k . h>2

3 c i l k i n t f i b ( i n t n ) {4 i f ( n < 2) {5 return n ;6 } else {7 i n t x , y ;8 x = spawn f i b ( n−1) ;9 y = spawn f i b ( n−2) ;

10 sync ;11 return x+y ;12 }13 }

Page 63: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Cilk’s Keywords Summary

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 24

Rehearsal before looking at someexamples

cilk: declaration of a potentiallyparallel routine

spawn: launch of a potentiallyparallel routine

sync: wait for completion oflaunched routines

inlet : special function to aggregateresults (reduction)

1 #include < c i l k . h>2

3 c i l k i n t f i b ( i n t n ) {4 i f ( n < 2) {5 return n ;6 } else {7 i n t x , y ;8 x = spawn f i b ( n−1) ;9 y = spawn f i b ( n−2) ;

10 sync ;11 return x+y ;12 }13 }

Page 64: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Cilk’s Keywords Summary

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 24

Rehearsal before looking at someexamples

cilk: declaration of a potentiallyparallel routine

spawn: launch of a potentiallyparallel routine

sync: wait for completion oflaunched routines

inlet : special function to aggregateresults (reduction)

1 #include < c i l k . h>2

3 c i l k i n t f i b ( i n t n ) {4 i n t x = 0;5

6 i n l e t void acc ( i n t res ) {7 x += res ;8 }9

10 i f ( n < 2) {11 return n ;12 } else {13 acc (spawn f i b ( n−1) ) ;14 acc (spawn f i b ( n−2) ) ;15 sync ;16 return x ;17 }18 }

Page 65: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Examples

Fib

Knapsack

Heat

LU

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 25

Page 66: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Code Generation Example

Fib

Two-clone StrategyFast clone executed by worker

– “Streamlined” C functionSlow clone executed by a thief after a steal

– Post-steal logic to restore saved state

Olivier Aumage – STORM Team – Cilk – 2. Concept and Ideas 26

Page 67: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Olivier Aumage – STORM Team – Cilk 27

3Cilk Plus

Page 68: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Intel Cilk Plus

URL: http://www.cilkplus.org/

ChangesSupports C and C++

No need to declare Cilk functions

Main keywordscilk_spawn: similar to original Cilk’s spawn

cilk_sync: similar to original Cilk’s synccilk :: reducer <...>

– Template parameterized with a reduction op– Replacement for inlets

cilk_for: parallel loop

Fortran inspired Array Notation

Olivier Aumage – STORM Team – Cilk – 3. Cilk Plus 28

Page 69: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Intel Cilk Plus

URL: http://www.cilkplus.org/

ChangesSupports C and C++

No need to declare Cilk functions

Main keywordscilk_spawn: similar to original Cilk’s spawn

cilk_sync: similar to original Cilk’s synccilk :: reducer <...>

– Template parameterized with a reduction op– Replacement for inlets

cilk_for: parallel loop

Fortran inspired Array Notation

Olivier Aumage – STORM Team – Cilk – 3. Cilk Plus 28

Page 70: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Cilk Plus Parallel Loop

Olivier Aumage – STORM Team – Cilk – 3. Cilk Plus 29

Work-Stealing Loop

cilk_for keyword

Potentially parallel loop

Recursively divided range

Work-stealing load balancing

1 i n t i ;2

3 for ( i =0; i <N; i ++) {4 f ( i ) ;5 }6

7 /∗ − − − − ∗ /8

9 for ( i =0; i <N; i ++) {10 cilk_spawn f ( i ) ;11 }12 cilk_sync ;13

14 /∗ − − − − ∗ /15

16 c i l k_ fo r ( i =0; i <N; i ++) {17 f ( i ) ;18 }

Page 71: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Intel Cilk Plus Arrays

Array notationInspired by FortranHelps compiler auto-vectorization

– May use SIMD instruction sets (SSE, AVX)

Syntax:

1 double a [ lower_bound : leng th : s t r i d e ] ;

Implicit vector loop:

1 for ( i = lower ; i < ( len∗ s t r i d e ) + lower ; i += s t r i d e )2 . . . a [ i ] . . .

Shortcut syntax to access all elements (static arrays only):

1 a [ : ]

Olivier Aumage – STORM Team – Cilk – 3. Cilk Plus 30

Page 72: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Intel Cilk Plus Arrays

Array notationInspired by FortranHelps compiler auto-vectorization

– May use SIMD instruction sets (SSE, AVX)

Syntax:

1 double a [ lower_bound : leng th : s t r i d e ] ;

Implicit vector loop:

1 for ( i = lower ; i < ( len∗ s t r i d e ) + lower ; i += s t r i d e )2 . . . a [ i ] . . .

Shortcut syntax to access all elements (static arrays only):

1 a [ : ]

Olivier Aumage – STORM Team – Cilk – 3. Cilk Plus 30

Page 73: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Intel Cilk Plus Arrays

Array notationInspired by FortranHelps compiler auto-vectorization

– May use SIMD instruction sets (SSE, AVX)

Syntax:

1 double a [ lower_bound : leng th : s t r i d e ] ;

Implicit vector loop:

1 for ( i = lower ; i < ( len∗ s t r i d e ) + lower ; i += s t r i d e )2 . . . a [ i ] . . .

Shortcut syntax to access all elements (static arrays only):

1 a [ : ]

Olivier Aumage – STORM Team – Cilk – 3. Cilk Plus 30

Page 74: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Intel Cilk Plus Arrays

Array notationInspired by FortranHelps compiler auto-vectorization

– May use SIMD instruction sets (SSE, AVX)

Syntax:

1 double a [ lower_bound : leng th : s t r i d e ] ;

Implicit vector loop:

1 for ( i = lower ; i < ( len∗ s t r i d e ) + lower ; i += s t r i d e )2 . . . a [ i ] . . .

Shortcut syntax to access all elements (static arrays only):

1 a [ : ]

Olivier Aumage – STORM Team – Cilk – 3. Cilk Plus 30

Page 75: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Intel Cilk Plus Arrays

Example expressions1D arrays:

1 a [ : ] = 1 ;2 a [ 2 : 4 ] = 3 ;3 a [ 1 : 1 0 : 2 ] = 7 ; / / se t odd ind i ces

Multi-dimensional arrays:

1 c [ : ] [ : ] = 17;2 c [ 3 ] [ : ] = a [ : ] ; / / a f f e c t an ar ray to some row3 c [ : ] [ 4 ] = a [ : ] ; / / a f f e c t an ar ray to some column4 c [ 0 : 1 0 : 2 ] [ : ] = 1 ; / / se t even rows

Olivier Aumage – STORM Team – Cilk – 3. Cilk Plus 31

Page 76: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Intel Cilk Plus Arrays

Example expressions1D arrays:

1 a [ : ] = 1 ;2 a [ 2 : 4 ] = 3 ;3 a [ 1 : 1 0 : 2 ] = 7 ; / / se t odd ind i ces

Multi-dimensional arrays:

1 c [ : ] [ : ] = 17;2 c [ 3 ] [ : ] = a [ : ] ; / / a f f e c t an ar ray to some row3 c [ : ] [ 4 ] = a [ : ] ; / / a f f e c t an ar ray to some column4 c [ 0 : 1 0 : 2 ] [ : ] = 1 ; / / se t even rows

Olivier Aumage – STORM Team – Cilk – 3. Cilk Plus 31

Page 77: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Intel Cilk Plus Arrays

Advanced example expressionsConditions:

1 i f ( a [ : ] == 1) {2 b [ : ] = 42;3 } else {4 b [ : ] = 0 ;5 }

Scalar function call:

1 void f ( double v ) {2 p r i n t f ( "%l f \ n " , v ) ;3 }4

5 f ( a [ : ] ) ;

Olivier Aumage – STORM Team – Cilk – 3. Cilk Plus 32

Page 78: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Intel Cilk Plus Arrays

Advanced example expressionsConditions:

1 i f ( a [ : ] == 1) {2 b [ : ] = 42;3 } else {4 b [ : ] = 0 ;5 }

Scalar function call:

1 void f ( double v ) {2 p r i n t f ( "%l f \ n " , v ) ;3 }4

5 f ( a [ : ] ) ;

Olivier Aumage – STORM Team – Cilk – 3. Cilk Plus 32

Page 79: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Intel Cilk Plus Arrays

Advanced example expressions (cont’d)Gather:

1 c [ : ] = a [ b [ : ] ] ;

Scatter:

1 a [ b [ : ] ] = c [ : ] ;

Reductions:

1 __sec_reduce_add ( a [ : ] ) ;2 __sec_reduce_mul ( a [ : ] ) ;3 . . .

Olivier Aumage – STORM Team – Cilk – 3. Cilk Plus 33

Page 80: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Intel Cilk Plus Arrays

Advanced example expressions (cont’d)Gather:

1 c [ : ] = a [ b [ : ] ] ;

Scatter:

1 a [ b [ : ] ] = c [ : ] ;

Reductions:

1 __sec_reduce_add ( a [ : ] ) ;2 __sec_reduce_mul ( a [ : ] ) ;3 . . .

Olivier Aumage – STORM Team – Cilk – 3. Cilk Plus 33

Page 81: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Intel Cilk Plus Arrays

Advanced example expressions (cont’d)Gather:

1 c [ : ] = a [ b [ : ] ] ;

Scatter:

1 a [ b [ : ] ] = c [ : ] ;

Reductions:

1 __sec_reduce_add ( a [ : ] ) ;2 __sec_reduce_mul ( a [ : ] ) ;3 . . .

Olivier Aumage – STORM Team – Cilk – 3. Cilk Plus 33

Page 82: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Other Cilk Plus Ports

Cilk Plus / GCCIntegrated in GCC 4.9.2

– Tasks– Array notation– No cilk_for keyword yet

Usage

1 g++ − f c i l k p l u s − l c i l k r t s −o f i b f i b . cpp

Cilk Plus / LLVMURL: http://cilkplus.github.io/

Olivier Aumage – STORM Team – Cilk – 3. Cilk Plus 34

Page 83: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Examples using Cilk Plus

Tasks

Arrays

Olivier Aumage – STORM Team – Cilk – 3. Cilk Plus 35

Page 84: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Olivier Aumage – STORM Team – Cilk 36

4Conclusion

Page 85: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Conclusion

Cilk: A Forerunning Language for Parallel Programming

From the programmer point of viewEasy to program with (only 3-4 keywords)

Efficient on Divide and Conquer application classes

From the HPC research point of viewKey precursory effort in promoting task parallelism

Inspirational for many related works over the last 20 years

The Implementation of the Cilk-5 Multithreaded LanguageMatteo Frigo, Charles E. Leiserson, and Keith H. Randall, PLDI’98

Olivier Aumage – STORM Team – Cilk – 4. Conclusion 37

Page 86: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Olivier Aumage – STORM Team – Cilk 38

5Related Works

Page 87: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Variants and Evolutions of the Task Scheduling Model

Some related works Inspired by CilkOpenMP 3.0 tasks

Intel Threading Building Blocks (TBB)

Evolutions of the task modelDependent Tasks

Heterogeneous Tasks

Olivier Aumage – STORM Team – Cilk – 5. Related Works 39

Page 88: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Variants and Evolutions of the Task Scheduling Model

Some related works Inspired by CilkOpenMP 3.0 tasks

Intel Threading Building Blocks (TBB)

Evolutions of the task modelDependent Tasks

Heterogeneous Tasks

Olivier Aumage – STORM Team – Cilk – 5. Related Works 39

Page 89: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Dependent Tasks

More flexibility in specifying task relationshipsVarious possibilities

– Task – Task dependencies– Task – Data dependencies

Various expressiveness vs automatism trade-offs– Expressed dependencies– Inferred dependencies

Some examplesStarPU (STORM Team)

OpenMP 4.0 tasks

OmpSs (Barcelona Supercomputing Center)

Kaapi (Inria Team MOAIS)

DAGuE / PARSeC (UTK + Inria Team HiePACS)

etc.

Olivier Aumage – STORM Team – Cilk – 5. Related Works 40

Page 90: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Dependent Tasks

More flexibility in specifying task relationshipsVarious possibilities

– Task – Task dependencies– Task – Data dependencies

Various expressiveness vs automatism trade-offs– Expressed dependencies– Inferred dependencies

Some examplesStarPU (STORM Team)

OpenMP 4.0 tasks

OmpSs (Barcelona Supercomputing Center)

Kaapi (Inria Team MOAIS)

DAGuE / PARSeC (UTK + Inria Team HiePACS)

etc.

Olivier Aumage – STORM Team – Cilk – 5. Related Works 40

Page 91: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Ex.: Sequential Cholesky Decomposition

Olivier Aumage – STORM Team – Cilk – 5. Related Works 41

for (j = 0; j < N; j++) {POTRF ( A[j][j]);for (i = j+1; i < N; i++)

TRSM ( A[i][j], A[j][j]);for (i = j+1; i < N; i++) {

SYRK ( A[i][i], A[i][j]);for (k = j+1; k < i; k++)

GEMM ( A[i][k],A[i][j], A[k][j]);

}}

Page 92: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Ex.: Task-Based Cholesky Decomposition

Olivier Aumage – STORM Team – Cilk – 5. Related Works 42

for (j = 0; j < N; j++) {POTRF (RW,A[j][j]);for (i = j+1; i < N; i++)

TRSM (RW,A[i][j], R,A[j][j]);for (i = j+1; i < N; i++) {

SYRK (RW,A[i][i], R,A[i][j]);for (k = j+1; k < i; k++)

GEMM (RW,A[i][k],R,A[i][j], R,A[k][j]);

}}__wait__();

Page 93: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Ex.: Inferred Dependencies

Olivier Aumage – STORM Team – Cilk – 5. Related Works 43

for (j = 0; j < N; j++) {POTRF (RW,A[j][j]);for (i = j+1; i < N; i++)

TRSM (RW,A[i][j], R,A[j][j]);for (i = j+1; i < N; i++) {

SYRK (RW,A[i][i], R,A[i][j]);for (k = j+1; k < i; k++)

GEMM (RW,A[i][k],R,A[i][j], R,A[k][j]);

}}__wait__();

POTRF

GEMM

TRSM

SYRK

Page 94: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Ex.: Inferred Dependencies

Olivier Aumage – STORM Team – Cilk – 5. Related Works 43

for (j = 0; j < N; j++) {POTRF (RW,A[j][j]);for (i = j+1; i < N; i++)

TRSM (RW,A[i][j], R,A[j][j]);for (i = j+1; i < N; i++) {

SYRK (RW,A[i][i], R,A[i][j]);for (k = j+1; k < i; k++)

GEMM (RW,A[i][k],R,A[i][j], R,A[k][j]);

}}__wait__();

POTRF

GEMM

TRSM

SYRK

Page 95: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Ex.: Inferred Dependencies

Olivier Aumage – STORM Team – Cilk – 5. Related Works 43

for (j = 0; j < N; j++) {POTRF (RW,A[j][j]);for (i = j+1; i < N; i++)

TRSM (RW,A[i][j], R,A[j][j]);for (i = j+1; i < N; i++) {

SYRK (RW,A[i][i], R,A[i][j]);for (k = j+1; k < i; k++)

GEMM (RW,A[i][k],R,A[i][j], R,A[k][j]);

}}__wait__();

POTRF

GEMM

TRSM

SYRK

Page 96: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Ex.: Inferred Dependencies

Olivier Aumage – STORM Team – Cilk – 5. Related Works 43

for (j = 0; j < N; j++) {POTRF (RW,A[j][j]);for (i = j+1; i < N; i++)

TRSM (RW,A[i][j], R,A[j][j]);for (i = j+1; i < N; i++) {

SYRK (RW,A[i][i], R,A[i][j]);for (k = j+1; k < i; k++)

GEMM (RW,A[i][k],R,A[i][j], R,A[k][j]);

}}__wait__();

POTRF

GEMM

TRSM

SYRK

Page 97: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Ex.: Inferred Dependencies

Olivier Aumage – STORM Team – Cilk – 5. Related Works 43

for (j = 0; j < N; j++) {POTRF (RW,A[j][j]);for (i = j+1; i < N; i++)

TRSM (RW,A[i][j], R,A[j][j]);for (i = j+1; i < N; i++) {

SYRK (RW,A[i][i], R,A[i][j]);for (k = j+1; k < i; k++)

GEMM (RW,A[i][k],R,A[i][j], R,A[k][j]);

}}__wait__();

POTRF

GEMM

TRSM

SYRK

Page 98: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Ex.: Inferred Dependencies

Olivier Aumage – STORM Team – Cilk – 5. Related Works 43

for (j = 0; j < N; j++) {POTRF (RW,A[j][j]);for (i = j+1; i < N; i++)

TRSM (RW,A[i][j], R,A[j][j]);for (i = j+1; i < N; i++) {

SYRK (RW,A[i][i], R,A[i][j]);for (k = j+1; k < i; k++)

GEMM (RW,A[i][k],R,A[i][j], R,A[k][j]);

}}__wait__();

POTRF

GEMM

TRSM

SYRK

Page 99: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Ex.: Inferred Dependencies

Olivier Aumage – STORM Team – Cilk – 5. Related Works 43

for (j = 0; j < N; j++) {POTRF (RW,A[j][j]);for (i = j+1; i < N; i++)

TRSM (RW,A[i][j], R,A[j][j]);for (i = j+1; i < N; i++) {

SYRK (RW,A[i][i], R,A[i][j]);for (k = j+1; k < i; k++)

GEMM (RW,A[i][k],R,A[i][j], R,A[k][j]);

}}__wait__();

POTRF

GEMM

TRSM

SYRK

Page 100: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Ex.: Inferred Dependencies

Olivier Aumage – STORM Team – Cilk – 5. Related Works 43

for (j = 0; j < N; j++) {POTRF (RW,A[j][j]);for (i = j+1; i < N; i++)

TRSM (RW,A[i][j], R,A[j][j]);for (i = j+1; i < N; i++) {

SYRK (RW,A[i][i], R,A[i][j]);for (k = j+1; k < i; k++)

GEMM (RW,A[i][k],R,A[i][j], R,A[k][j]);

}}__wait__();

POTRF

GEMM

TRSM

SYRK

Page 101: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Ex.: Inferred Dependencies

Olivier Aumage – STORM Team – Cilk – 5. Related Works 43

for (j = 0; j < N; j++) {POTRF (RW,A[j][j]);for (i = j+1; i < N; i++)

TRSM (RW,A[i][j], R,A[j][j]);for (i = j+1; i < N; i++) {

SYRK (RW,A[i][i], R,A[i][j]);for (k = j+1; k < i; k++)

GEMM (RW,A[i][k],R,A[i][j], R,A[k][j]);

}}__wait__();

POTRF

GEMM

TRSM

SYRK

Page 102: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Ex.: Inferred Dependencies

Olivier Aumage – STORM Team – Cilk – 5. Related Works 43

for (j = 0; j < N; j++) {POTRF (RW,A[j][j]);for (i = j+1; i < N; i++)

TRSM (RW,A[i][j], R,A[j][j]);for (i = j+1; i < N; i++) {

SYRK (RW,A[i][i], R,A[i][j]);for (k = j+1; k < i; k++)

GEMM (RW,A[i][k],R,A[i][j], R,A[k][j]);

}}__wait__();

POTRF

GEMM

TRSM

SYRK

Page 103: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Ex.: Inferred Dependencies

Olivier Aumage – STORM Team – Cilk – 5. Related Works 43

for (j = 0; j < N; j++) {POTRF (RW,A[j][j]);for (i = j+1; i < N; i++)

TRSM (RW,A[i][j], R,A[j][j]);for (i = j+1; i < N; i++) {

SYRK (RW,A[i][i], R,A[i][j]);for (k = j+1; k < i; k++)

GEMM (RW,A[i][k],R,A[i][j], R,A[k][j]);

}}__wait__();

POTRF

GEMM

TRSM

SYRK

Page 104: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Ex.: Inferred Dependencies

Olivier Aumage – STORM Team – Cilk – 5. Related Works 43

for (j = 0; j < N; j++) {POTRF (RW,A[j][j]);for (i = j+1; i < N; i++)

TRSM (RW,A[i][j], R,A[j][j]);for (i = j+1; i < N; i++) {

SYRK (RW,A[i][i], R,A[i][j]);for (k = j+1; k < i; k++)

GEMM (RW,A[i][k],R,A[i][j], R,A[k][j]);

}}__wait__();

POTRF

GEMM

TRSM

SYRK

Page 105: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Ex.: Inferred Dependencies

Olivier Aumage – STORM Team – Cilk – 5. Related Works 43

for (j = 0; j < N; j++) {POTRF (RW,A[j][j]);for (i = j+1; i < N; i++)

TRSM (RW,A[i][j], R,A[j][j]);for (i = j+1; i < N; i++) {

SYRK (RW,A[i][i], R,A[i][j]);for (k = j+1; k < i; k++)

GEMM (RW,A[i][k],R,A[i][j], R,A[k][j]);

}}__wait__();

POTRF

GEMM

TRSM

SYRK

Page 106: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Heterogeneous Tasks

Olivier Aumage – STORM Team – Cilk – 5. Related Works 44

Special purpose accelerator devices(or general purpose GPUs)

(initially) a discrete expansion card

Rationale: dye area trade-off

A single control unit. . .

. . . for several computing units

Allows flows to diverge

. . . but better avoid it!

Page 107: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Heterogeneous Tasks

Olivier Aumage – STORM Team – Cilk – 5. Related Works 44

Special purpose accelerator devices(or general purpose GPUs)

(initially) a discrete expansion card

Rationale: dye area trade-off

Single Instruction Multiple Threads (SIMT)

A single control unit. . .

. . . for several computing units

Allows flows to diverge

. . . but better avoid it!

Page 108: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Heterogeneous Tasks

Olivier Aumage – STORM Team – Cilk – 5. Related Works 44

Special purpose accelerator devices(or general purpose GPUs)

(initially) a discrete expansion card

Rationale: dye area trade-off

Single Instruction Multiple Threads (SIMT)

A single control unit. . .

. . . for several computing units

Allows flows to diverge

. . . but better avoid it!

GPU

DRAM

Control

Control Scalar Cores

(Streaming Processors)

Streaming Multiprocessor

R1 + R2

R5 / R2

Scalar Cores

Page 109: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Heterogeneous Tasks

Olivier Aumage – STORM Team – Cilk – 5. Related Works 44

Special purpose accelerator devices(or general purpose GPUs)

(initially) a discrete expansion card

Rationale: dye area trade-off

Single Instruction Multiple Threads (SIMT)

A single control unit. . .

. . . for several computing units

Allows flows to diverge

. . . but better avoid it!

GPU

Control Scalar Cores

(Streaming Processors)

Streaming Multiprocessor

R1 + R2

...if(cond){

... ... ...

} else { ... ...}...

Page 110: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

GPU Hardware Model

CPU vs GPU

Multiple strategies for multiple purposesCPU

– Strategy– Large caches– Large control

– Purpose– Complex codes, branching– Complex memory access patterns

– World Rally Championship carGPU

– Strategy– Lot of computing power– Simplified control

– Purpose– Regular data parallel codes– Simple memory access patterns

– Formula One car

Olivier Aumage – STORM Team – Cilk – 5. Related Works 45

CPU

GPU

Control ALU ALU

ALU ALU

Cache

DRAM

DRAM

Page 111: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Heterogeneous Tasks

Main issuesOffload or not offload?

– Computation kernel efficiency on accelerator wrt on the main CPU?Discrete accelerator memory

– Data transfer management– Data transfer cost

Some examplesStarPU (STORM Team)

OmpSs (Barcelona Supercomputing Center)

DAGuE / PARSeC (UTK + Inria Team HiePACS)

Olivier Aumage – STORM Team – Cilk – 5. Related Works 46

Page 112: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

StarPU Showcase with the MAGMA Linear Algebra Library

UTK, INRIA HIEPACS, INRIA RUNTIME

QR decomposition on 16 CPUs (AMD) + 4 GPUs (C1060)

Olivier Aumage – STORM Team – Cilk – 5. Related Works 47

0

200

400

600

800

1000

0 5000 10000 15000 20000 25000 30000 35000 40000

Gflo

p/s

Matrix order

4 GPUs + 16 CPUs4 GPUs + 4 CPUs3 GPUs + 3 CPUs2 GPUs + 2 CPUs1 GPUs + 1 CPUs

Expected increase: +12 CPUs ~150 Gflops

Measured increase: +12 CPUs ~200 GFlops

Page 113: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Showcase with the MAGMA Linear Algebra Library

QR kernel propertiesKernel SGEQRTCPU: 9 GFlop/s GPU: 30 GFlop/s Speed-up: 3Kernel STSQRTCPU: 12 GFlop/s GPU: 37 GFlop/s Speed-up: 3Kernel SOMQRTCPU: 8.5 GFlop/s GPU: 227 GFlop/s Speed-up: 27Kernel SSSMQCPU: 10 GFlop/s GPU: 285 GFlop/s Speed-up: 28

ConsequencesTask distribution

– SGEQRT: 20% Tasks on GPU– SSSMQ: 92% tasks on GPU

StarPU takes advantage of heterogeneity!– Only use the accelerator for kernels it is good for– Don’t slow down the accelerator with kernels it is not good for

Olivier Aumage – STORM Team – Cilk – 5. Related Works 48

Page 114: Cilk - A Forerunning Language for Parallel Programmingsed.bordeaux.inria.fr/seminars/cilk_20150324.pdf · Cilk Cilk in a Few Words A programming environment – A language and compiler:

Upcoming Tutorial

PRACE PATC Training session on Runtime SystemsAt INRIA in Bordeaux, France

June 4-5, 2015

In partnership with La Maison de la Simulation

http://www.maisondelasimulation.fr/

ProgramHwloc: locality management tool/lib

StarPU: task scheduler

EZtrace: trace-based debugging framework

Kaapi: task scheduler

Olivier Aumage – STORM Team – Cilk – 5. Related Works 49