a brief introduction to openmp - university of...

75
A brief introduction to OpenMP Alejandro Duran Barcelona Supercomputing Center

Upload: others

Post on 07-Aug-2020

6 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

A brief introduction to OpenMP

Alejandro Duran

Barcelona Supercomputing Center

Page 2: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Outline

1 Introduction

2 Writing OpenMP programs

3 Data-sharing attributes

4 Synchronization

5 Worksharings

6 Task parallelism

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 2 / 47

Page 3: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Introduction

Outline

1 Introduction

2 Writing OpenMP programs

3 Data-sharing attributes

4 Synchronization

5 Worksharings

6 Task parallelism

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 3 / 47

Page 4: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Introduction

What is OpenMP?

It’s an API extension to the C, C++ and Fortran languages to writeparallel programs for shared memory machines

Current version is 3.1 (June 2010)Supported by most compiler vendors

Intel,IBM,PGI,Oracle,Cray,Fujitsu,HP,GCC,...

Natural fit for multicores as it was designed for SMPs

Maintained by the Architecture Review Board (ARB), a consortiumof industry and academia

http://www.openmp.org

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 4 / 47

Page 5: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Introduction

A bit of historyO

penM

PFo

rtra

n1.

0

1997

Ope

nMP

C/C

++1.

0

1998

Ope

nMP

Fort

ran

1.1

1999

Ope

nMP

Fort

ran

2.0

2000

Ope

nMP

C/C

++2.

0

2002

Ope

nMP

2.5

2005

Ope

nMP

3.0

2008

Ope

nMP

3.1

2010

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 5 / 47

Page 6: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Introduction

Target machines

Shared Multiprocessors

Chip Chip Chip

Memory interconnect

Memory

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 6 / 47

Page 7: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Introduction

Shared memory

Cpu1 Cpu2x=x=5 x=5

Memory is shared acrossdifferent processorsCommunication andsynchronization happenimplicitely through sharedmemory

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 7 / 47

Page 8: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Introduction

Including...

Multicores/SMTs

Core

L1 Caches

Core

L1 Caches

L2 Cache

Chip

Off-chipCache

Memory

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 8 / 47

Page 9: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Introduction

More commonly

NUMA

Chip

Chip

Chip

Chip

Memory Memory

Memory interconnect

Access to memory addresses is not uniform

Memory migration and locality are very important

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 9 / 47

Page 10: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Introduction

Why OpenMP?

Mature standard and implementationsStandardizes practice of the last 20 years

Good performance and scalabilityPortable across architecturesIncremental parallelizationMaintains sequential version(mostly) High level language

Some people may say a medium level language :-)

Supports both task and data parallelismCommunication is implicit

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 10 / 47

Page 11: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Introduction

Why not OpenMP?

Communication is implicitbeware false sharing

Flat memory modelcan lead to poor performance in NUMA machines

Incremental parallelization creates false sense of glory/failureNo support for acceleratorsNo error recovery capabilitiesDifficult to composePipelines are difficult

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 11 / 47

Page 12: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Writing OpenMP programs

Outline

1 Introduction

2 Writing OpenMP programs

3 Data-sharing attributes

4 Synchronization

5 Worksharings

6 Task parallelism

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 12 / 47

Page 13: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Writing OpenMP programs

OpenMP at a glance

OpenMP components

CPU CPU CPU CPU CPU CPU SMP

OS Threading Libraries

OpenMP Runtime Library ICVs

OpenMP Exec

Compiler

Constructs

OpenMP API EnvironmentVariables

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 13 / 47

Page 14: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Writing OpenMP programs

OpenMP directives syntax

In FortranThrough a specially formatted comment:

s e n t i n e l cons t ruc t [ c lauses ]

where sentinel is one of:!$OMP or C$OMP or *$OMP in fixed format!$OMP in free format

In C/C++Through a compiler directive:

#pragma omp cons t ruc t [ c lauses ]

OpenMP syntax is ignored if the compiler does not recognizeOpenMP

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 14 / 47

Page 15: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Writing OpenMP programs

Hello world!

Example

i n t i d ;char ∗message = "Hello world!" ;

#pragma omp parallel private ( i d ){

i d = omp_get_thread_num ( ) ;p r i n t f ("Thread %d says: %s\n" , id , message ) ;

}

Creates a parallel region of OMP_NUM_THREADS

All threads execute the same code

id is private to each thread

Each thread gets its id in the teammessage is shared among all threads

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 15 / 47

Page 16: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Writing OpenMP programs

Hello world!

Example

i n t i d ;char ∗message = "Hello world!" ;

#pragma omp parallel private ( i d ){

i d = omp_get_thread_num ( ) ;p r i n t f ("Thread %d says: %s\n" , id , message ) ;

}

Creates a parallel region of OMP_NUM_THREADS

All threads execute the same code

id is private to each thread

Each thread gets its id in the teammessage is shared among all threads

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 15 / 47

Page 17: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Writing OpenMP programs

Hello world!

Example

i n t i d ;char ∗message = "Hello world!" ;

#pragma omp parallel private ( i d ){

i d = omp_get_thread_num ( ) ;p r i n t f ("Thread %d says: %s\n" , id , message ) ;

}

Creates a parallel region of OMP_NUM_THREADS

All threads execute the same code

id is private to each thread

Each thread gets its id in the team

message is shared among all threads

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 15 / 47

Page 18: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Writing OpenMP programs

Hello world!

Example

i n t i d ;char ∗message = "Hello world!" ;

#pragma omp parallel private ( i d ){

i d = omp_get_thread_num ( ) ;p r i n t f ("Thread %d says: %s\n" , id , message ) ;

}

Creates a parallel region of OMP_NUM_THREADS

All threads execute the same code

id is private to each thread

Each thread gets its id in the team

message is shared among all threads

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 15 / 47

Page 19: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Writing OpenMP programs

Execution model

Fork-join modelOpenMP uses a fork-join model

The master thread spawns a team of threads that joins at the end ofthe parallel regionThreads in the same team can collaborate to do work

Parallel Region Parallel Region

Nested Parallel Region

Master Thread

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 16 / 47

Page 20: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Writing OpenMP programs

Memory model

OpenMP defines a weak relaxed memory modelThreads can see different values for the same variableMemory consistency is only guaranteed at specific points

syncronization constructs, parallelism creation points, . . .

Luckily, the default points are usually enough

Variables can have shared or private visibility for each thread

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 17 / 47

Page 21: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Data-sharing attributes

Outline

1 Introduction

2 Writing OpenMP programs

3 Data-sharing attributes

4 Synchronization

5 Worksharings

6 Task parallelism

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 18 / 47

Page 22: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Data-sharing attributes

Data environment

When creating a new parallel region (and in other cases) a new dataenvironment needs to be constructed for the threads. This is definedby means of clauses in the construct:

shared

private

firstprivate

default

threadprivate

. . .Not a clause!

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 19 / 47

Page 23: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Data-sharing attributes

Data-sharing attributes

SharedWhen a variable is marked as shared all threads see the samevariable

Not necessarily the same valueUsually need some kind of synchronization to update themcorrectly

PrivateWhen a variable is marked as private, the variable inside theconstruct is a new variable of the same type with an undefined value.

Can be accessed without any kind of synchronization

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 20 / 47

Page 24: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Data-sharing attributes

Data-sharing attributes

FirstprivateWhen a variable is marked as firstprivate, the variable inside theconstruct is a new variable of the same type but it is initialized to theoriginal variable value.

In a parallel construct this means all threads have a differentvariable with the same initial valueCan be accessed without any kind of synchronization

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 21 / 47

Page 25: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Data-sharing attributes

Data-sharing attributes

Example

i n t x=1 ,y=1 ,z =1;#pragma omp parallel shared ( x ) private ( y ) firstprivate ( z ) \

num_threads ( 2 ){

x++; y++; z++;p r i n t f ("%d\n" , x ) ;p r i n t f ("%d\n" , y ) ;p r i n t f ("%d\n" , z ) ;

}

The parallel region will have only two threads

Prints 2 or 3. Unsafe update!Prints any numberPrints 2

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 22 / 47

Page 26: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Data-sharing attributes

Data-sharing attributes

Example

i n t x=1 ,y=1 ,z =1;#pragma omp parallel shared ( x ) private ( y ) firstprivate ( z ) \

num_threads ( 2 ){

x++; y++; z++;p r i n t f ("%d\n" , x ) ;p r i n t f ("%d\n" , y ) ;p r i n t f ("%d\n" , z ) ;

}

The parallel region will have only two threads

Prints 2 or 3. Unsafe update!Prints any numberPrints 2

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 22 / 47

Page 27: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Data-sharing attributes

Data-sharing attributes

Example

i n t x=1 ,y=1 ,z =1;#pragma omp parallel shared ( x ) private ( y ) firstprivate ( z ) \

num_threads ( 2 ){

x++; y++; z++;p r i n t f ("%d\n" , x ) ;p r i n t f ("%d\n" , y ) ;p r i n t f ("%d\n" , z ) ;

}

The parallel region will have only two threads

Prints 2 or 3. Unsafe update!

Prints any numberPrints 2

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 22 / 47

Page 28: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Data-sharing attributes

Data-sharing attributes

Example

i n t x=1 ,y=1 ,z =1;#pragma omp parallel shared ( x ) private ( y ) firstprivate ( z ) \

num_threads ( 2 ){

x++; y++; z++;p r i n t f ("%d\n" , x ) ;p r i n t f ("%d\n" , y ) ;p r i n t f ("%d\n" , z ) ;

}

The parallel region will have only two threads

Prints 2 or 3. Unsafe update!

Prints any number

Prints 2

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 22 / 47

Page 29: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Data-sharing attributes

Data-sharing attributes

Example

i n t x=1 ,y=1 ,z =1;#pragma omp parallel shared ( x ) private ( y ) firstprivate ( z ) \

num_threads ( 2 ){

x++; y++; z++;p r i n t f ("%d\n" , x ) ;p r i n t f ("%d\n" , y ) ;p r i n t f ("%d\n" , z ) ;

}

The parallel region will have only two threads

Prints 2 or 3. Unsafe update!Prints any number

Prints 2

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 22 / 47

Page 30: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Data-sharing attributes

Threadprivate storage

The threadprivate constructHow to parallelize:

Global variablesStatic variablesClass-static members

Use threadprivate storageAllows to create a per-thread copy of “global” variables.

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 23 / 47

Page 31: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Data-sharing attributes

Threaprivate storage

Example

char∗ foo ( ){

s t a t i c char b u f f e r [ BUF_SIZE ] ;#pragma omp t h r e a d p r i v a t e ( b u f f e r )

. . .

return b u f f e r ;}

Creates one staticcopy of buffer per

thread

Now foo can be called bymultiple threads at the same

time

Simpler than redefining theinterface. More costly

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 24 / 47

Page 32: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Data-sharing attributes

Threaprivate storage

Example

char∗ foo ( ){

s t a t i c char b u f f e r [ BUF_SIZE ] ;#pragma omp t h r e a d p r i v a t e ( b u f f e r )

. . .

return b u f f e r ;}

Creates one staticcopy of buffer per

thread

Now foo can be called bymultiple threads at the same

time

Simpler than redefining theinterface. More costly

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 24 / 47

Page 33: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Data-sharing attributes

Threaprivate storage

Example

char∗ foo ( ){

s t a t i c char b u f f e r [ BUF_SIZE ] ;#pragma omp t h r e a d p r i v a t e ( b u f f e r )

. . .

return b u f f e r ;}

Creates one staticcopy of buffer per

thread

Now foo can be called bymultiple threads at the same

time

Simpler than redefining theinterface. More costly

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 24 / 47

Page 34: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Synchronization

Outline

1 Introduction

2 Writing OpenMP programs

3 Data-sharing attributes

4 Synchronization

5 Worksharings

6 Task parallelism

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 25 / 47

Page 35: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Synchronization

Why synchronization?

MechanismsThreads need to synchronize to impose some ordering in thesequence of actions of the threads. OpenMP provides differentsynchronization mechanisms:

barrier

critical

atomic

taskwait

low-level locks

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 26 / 47

Page 36: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Synchronization

Barrier

Example

#pragma omp parallel{

foo ( ) ;#pragma omp barrier

bar ( ) ;}

Syncronizes all threads of the team

Forces all foo occurrences toohappen before all bar occurrences

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 27 / 47

Page 37: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Synchronization

Barrier

Example

#pragma omp parallel{

foo ( ) ;#pragma omp barrier

bar ( ) ;}

Syncronizes all threads of the team

Forces all foo occurrences toohappen before all bar occurrences

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 27 / 47

Page 38: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Synchronization

Critical construct

Example

i n t x =1;#pragma omp parallel num_threads ( 2 ){#pragma omp critical

x++;}p r i n t f ("%d\n" , x ) ;

Only one thread at a time here

Prints 3!

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 28 / 47

Page 39: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Synchronization

Critical construct

Example

i n t x =1;#pragma omp parallel num_threads ( 2 ){#pragma omp critical

x++;}p r i n t f ("%d\n" , x ) ;

Only one thread at a time here

Prints 3!

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 28 / 47

Page 40: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Synchronization

Critical construct

Example

i n t x =1;#pragma omp parallel num_threads ( 2 ){#pragma omp critical

x++;}p r i n t f ("%d\n" , x ) ;

Only one thread at a time here

Prints 3!

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 28 / 47

Page 41: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Synchronization

Atomic construct

Example

i n t x =1;#pragma omp parallel num_threads ( 2 ){#pragma omp atomic

x++;}p r i n t f ("%d\n" , x ) ;

Specially supported by hardware primitivesOnly one thread at a time updates x here

Prints 3!

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 29 / 47

Page 42: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Synchronization

Atomic construct

Example

i n t x =1;#pragma omp parallel num_threads ( 2 ){#pragma omp atomic

x++;}p r i n t f ("%d\n" , x ) ;

Specially supported by hardware primitives

Only one thread at a time updates x here

Prints 3!

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 29 / 47

Page 43: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Synchronization

Atomic construct

Example

i n t x =1;#pragma omp parallel num_threads ( 2 ){#pragma omp atomic

x++;}p r i n t f ("%d\n" , x ) ;

Specially supported by hardware primitives

Only one thread at a time updates x here

Prints 3!

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 29 / 47

Page 44: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Synchronization

Atomic construct

Example

i n t x =1;#pragma omp parallel num_threads ( 2 ){#pragma omp atomic

x++;}p r i n t f ("%d\n" , x ) ;

Specially supported by hardware primitivesOnly one thread at a time updates x here

Prints 3!

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 29 / 47

Page 45: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Synchronization

Locks

OpenMP provides lock primitives for low-level synchronizationomp_init_lock Initialize the lockomp_set_lock Acquires the lockomp_unset_lock Releases the lockomp_test_lock Tries to acquire the lock (won’t block)omp_destroy_lock Frees lock resources

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 30 / 47

Page 46: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Worksharings

Outline

1 Introduction

2 Writing OpenMP programs

3 Data-sharing attributes

4 Synchronization

5 Worksharings

6 Task parallelism

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 31 / 47

Page 47: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Worksharings

Worksharings

Worksharing constructs divide the execution of a code region amongthe threads of a team

Threads cooperate to do some workBetter way to split work than using thread-ids

In OpenMP, there are four worksharing constructs:loop worksharingsinglesectionworkshare

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 32 / 47

Restriction: worksharings cannot be nested

Page 48: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Worksharings

The for construct

Example

void foo ( i n t ∗m, i n t N, i n t M){

i n t i ;#pragma omp parallel#pragma omp for private ( j )for ( i = 0 ; i < N; i ++ )

for ( j = 0 ; j < M; j ++ )m[ i ] [ j ] = 0 ;

}

Loop iterations must be independentThe i variable is automatically privatizedMust be explicitly privatized

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 33 / 47

Page 49: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Worksharings

The for construct

Example

void foo ( i n t ∗m, i n t N, i n t M){

i n t i ;#pragma omp parallel#pragma omp for private ( j )for ( i = 0 ; i < N; i ++ )

for ( j = 0 ; j < M; j ++ )m[ i ] [ j ] = 0 ;

}

New created threads cooperate to exe-cute all the iterations of the loop

Loop iterations must be independentThe i variable is automatically privatizedMust be explicitly privatized

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 33 / 47

Page 50: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Worksharings

The for construct

Example

void foo ( i n t ∗m, i n t N, i n t M){

i n t i ;#pragma omp parallel#pragma omp for private ( j )for ( i = 0 ; i < N; i ++ )

for ( j = 0 ; j < M; j ++ )m[ i ] [ j ] = 0 ;

}

Loop iterations must be independent

The i variable is automatically privatizedMust be explicitly privatized

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 33 / 47

Page 51: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Worksharings

The for construct

Example

void foo ( i n t ∗m, i n t N, i n t M){

i n t i ;#pragma omp parallel#pragma omp for private ( j )for ( i = 0 ; i < N; i ++ )

for ( j = 0 ; j < M; j ++ )m[ i ] [ j ] = 0 ;

}

Loop iterations must be independent

The i variable is automatically privatized

Must be explicitly privatized

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 33 / 47

Page 52: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Worksharings

The for construct

Example

void foo ( i n t ∗m, i n t N, i n t M){

i n t i ;#pragma omp parallel#pragma omp for private ( j )for ( i = 0 ; i < N; i ++ )

for ( j = 0 ; j < M; j ++ )m[ i ] [ j ] = 0 ;

}

Loop iterations must be independentThe i variable is automatically privatized

Must be explicitly privatized

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 33 / 47

Page 53: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Worksharings

The reduction clause

Example

i n t vector_sum ( i n t n , i n t v [ n ] ){

i n t i , sum = 0;#pragma omp parallel for

for ( i = 0 ; i < n ; i ++ )sum += v [ i ] ;

return sum;}

Common pattern. Allthreads accumulate to a

shared variable

Efficiently solved with the reduction clause

Private copy initialized here to the identity value

Shared variable updated here with the partial values of each thread

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 34 / 47

Page 54: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Worksharings

The reduction clause

Example

i n t vector_sum ( i n t n , i n t v [ n ] ){

i n t i , sum = 0;#pragma omp parallel for reduction(+:sum)

for ( i = 0 ; i < n ; i ++ )sum += v [ i ] ;

return sum;}

Common pattern. Allthreads accumulate to a

shared variable

Efficiently solved with the reduction clause

Private copy initialized here to the identity value

Shared variable updated here with the partial values of each thread

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 34 / 47

Page 55: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Worksharings

The reduction clause

Example

i n t vector_sum ( i n t n , i n t v [ n ] ){

i n t i , sum = 0;#pragma omp parallel for

for ( i = 0 ; i < n ; i ++ )sum += v [ i ] ;

return sum;}

Common pattern. Allthreads accumulate to a

shared variableEfficiently solved with the reduction clause

Private copy initialized here to the identity value

Shared variable updated here with the partial values of each thread

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 34 / 47

Page 56: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Worksharings

The schedule clause

The schedule clause determines which iterations are executed byeach thread.

Importart to choose for performance reasons onlyThere are several possible options as schedule:

STATIC

STATIC,chunk

DYNAMIC[,chunk]

GUIDED[,chunk]

AUTO

RUNTIME

Good locality, low overhead, load imbalance

Bad locality, higher overhead, load balance

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 35 / 47

Page 57: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Worksharings

The single construct

Example

i n t main ( i n t argc , char ∗∗argv ){

#pragma omp parallel{

#pragma omp single{

p r i n t f ("Hello world!\n" ) ;}

}}

This program outputs justone “Hello world”

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 36 / 47

Page 58: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Worksharings

The single construct

Example

i n t main ( i n t argc , char ∗∗argv ){

#pragma omp parallel{

#pragma omp single{

p r i n t f ("Hello world!\n" ) ;}

}}

This program outputs justone “Hello world”

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 36 / 47

Page 59: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Task parallelism

Outline

1 Introduction

2 Writing OpenMP programs

3 Data-sharing attributes

4 Synchronization

5 Worksharings

6 Task parallelism

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 37 / 47

Page 60: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Task parallelism

Task parallelism in OpenMP

Task parallelism model

Team Task pool

Parallelism is extracted from “several” pieces of codeAllows to parallelize very unstructured parallelism

Unbounded loops, recursive functions, ...

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 38 / 47

Page 61: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Task parallelism

What is a task in OpenMP ?

Tasks are work units whose execution may be deferredthey can also be executed immediately

Tasks are composed of:code to executea data environment

Initialized at creation time

internal control variables (ICVs)

Threads of the team cooperate to execute them

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 39 / 47

Page 62: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Task parallelism

When are task created?

Parallel regions create tasksOne implicit task is created and assigned to each thread

So all task-concepts have sense inside the parallel region

Each thread that encounters a task constructPackages the code and dataCreates a new explicit task

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 40 / 47

Page 63: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Task parallelism

List traversal

Example

void t r a v e r s e _ l i s t ( L i s t l ){

Element e ;for ( e = l−> f i r s t ; e ; e = e−>next )

#pragma omp taskprocess ( e ) ;

}e is firstprivate

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 41 / 47

Page 64: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Task parallelism

Taskwait

Example

void t r a v e r s e _ l i s t ( L i s t l ){

Element e ;for ( e = l−> f i r s t ; e ; e = e−>next )

#pragma omp taskprocess ( e ) ;

#pragma omp taskwait

}

Suspends current task until all children are completed

All tasks guaranteed to be completed here

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 42 / 47

Page 65: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Task parallelism

Taskwait

Example

void t r a v e r s e _ l i s t ( L i s t l ){

Element e ;for ( e = l−> f i r s t ; e ; e = e−>next )

#pragma omp taskprocess ( e ) ;

#pragma omp taskwait

}

Suspends current task until all children are completed

All tasks guaranteed to be completed here

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 42 / 47

Page 66: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Task parallelism

Taskwait

Example

void t r a v e r s e _ l i s t ( L i s t l ){

Element e ;for ( e = l−> f i r s t ; e ; e = e−>next )

#pragma omp taskprocess ( e ) ;

#pragma omp taskwait

}

Suspends current task until all children are completedAll tasks guaranteed to be completed here

Now we need some threadsto execute the tasks

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 42 / 47

Page 67: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Task parallelism

List traversalCompleting the picture

Example

L i s t l

#pragma omp parallelt r a v e r s e _ l i s t ( l ) ;

This will generate multiple traversalsWe need a way to have a singlethread execute traverse_list

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 43 / 47

Page 68: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Task parallelism

List traversalCompleting the picture

Example

L i s t l

#pragma omp parallelt r a v e r s e _ l i s t ( l ) ; This will generate multiple traversals

We need a way to have a singlethread execute traverse_list

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 43 / 47

Page 69: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Task parallelism

List traversalCompleting the picture

Example

L i s t l

#pragma omp parallelt r a v e r s e _ l i s t ( l ) ;

This will generate multiple traversals

We need a way to have a singlethread execute traverse_list

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 43 / 47

Page 70: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Task parallelism

List traversalCompleting the picture

Example

L i s t l

#pragma omp parallel#pragma omp single

t r a v e r s e _ l i s t ( l ) ;

One thread creates the tasks of the traversalAll threads cooperate to execute them

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 44 / 47

Page 71: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Task parallelism

List traversalCompleting the picture

Example

L i s t l

#pragma omp parallel#pragma omp single

t r a v e r s e _ l i s t ( l ) ; One thread creates the tasks of the traversal

All threads cooperate to execute them

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 44 / 47

Page 72: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Task parallelism

List traversalCompleting the picture

Example

L i s t l

#pragma omp parallel#pragma omp single

t r a v e r s e _ l i s t ( l ) ;

One thread creates the tasks of the traversal

All threads cooperate to execute them

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 44 / 47

Page 73: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Task parallelism

Another exampleSearch problem

Example

void search ( i n t n , i n t j , bool ∗s ta te ){

i n t i , res ;

i f ( n == j ) {/∗ good so lu t i on , count i t ∗ /mysolut ions ++;return ;

}

/∗ t r y each poss ib le s o l u t i o n ∗ /for ( i = 0 ; i < n ; i ++)#pragma omp task{

bool ∗new_state = a l l o c a ( sizeof ( bool )∗n ) ;memcpy( new_state , s ta te , sizeof ( bool )∗n ) ;new_state [ j ] = i ;i f ( ok ( j +1 , new_state ) ) {

search ( n , j +1 , new_state ) ;}

}

#pragma omp taskwait}

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 45 / 47

Page 74: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Task parallelism

Summary

OpenMP...allows to incrementally parallelize applications for SMPhas good support for data and task parallelismrequires you to pay attention to localityhas many other features beyond this short presentation

http://www.openmp.org

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 46 / 47

Page 75: A brief introduction to OpenMP - University of Washingtoncourses.cs.washington.edu/courses/csep524/13wi/omp...Introduction Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing

Task parallelism

The End

Thanks for your attention!

Alex Duran (BSC) A brief introduction to OpenMP 10th October 2011 47 / 47