
Page 1: Tutorial Presentation 8

OpenMP

Arash Bakhtiari ([email protected])

2012-12-18 Tue

Page 2: Tutorial Presentation 8

Introduction

- Chip manufacturers are rapidly moving to multi-core CPUs.

Figure: Quad-core processor (Intel Sandy Bridge)

Page 3: Tutorial Presentation 8

Shared Memory Model

- All processors can access all memory in a global address space.
- Threads model: a single process can have multiple, concurrent execution paths.
- On a multi-core system, the threads run at the same time, with each core running a particular thread or task.

Figure: Shared Memory Model [1]

Page 4: Tutorial Presentation 8

What is OpenMP?

- An Application Program Interface (API)
- Used to explicitly direct multi-threaded, shared-memory parallelism
- Provides a portable, scalable model
- Supports C/C++ and Fortran on a wide variety of architectures

Page 5: Tutorial Presentation 8

Fork-Join Model

- An OpenMP program starts as a single thread.
- Additional threads (the team) are created when the master hits a parallel region.
- When all threads have finished the parallel region, the new threads are given back to the runtime or operating system.
- The master continues after the parallel region (see the sketch below).
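As a minimal sketch (not part of the original slides), the fork-join behaviour can be made visible by printing the team size reported by omp_get_num_threads() before, inside, and after a parallel region:

#include <iostream>
#include <omp.h>

int main()
{
  // Before the parallel region only the master thread exists.
  std::cout << "before: " << omp_get_num_threads() << " thread(s)\n";

#pragma omp parallel
  {
    // Inside the region the master has forked a team of threads;
    // only the master prints, to avoid duplicate output.
#pragma omp master
    std::cout << "inside: " << omp_get_num_threads() << " thread(s)\n";
  }

  // After the region the extra threads are given back and only the
  // master continues.
  std::cout << "after:  " << omp_get_num_threads() << " thread(s)\n";
  return 0;
}

Compiled as in Listing 2, the "inside" line reports the team size (for example the value of OMP_NUM_THREADS), while "before" and "after" report 1.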

Page 6: Tutorial Presentation 8

Fork-Join Model (cont.)

Figure: Fork-Join Model [1]

Page 7: Tutorial Presentation 8

OpenMP API

Primary API components:

- Compiler directives:

  #pragma omp parallel

- Run-time library routines:

  int omp_get_num_threads(void);

- Environment variables:

  export OMP_NUM_THREADS=2

Page 8: Tutorial Presentation 8

Example

Listing 1: OpenMP Hello World!

#include <iostream>
#include <omp.h>

int main(int argc, char *argv[])
{
#pragma omp parallel
  {
    std::cout << "THREAD: " << omp_get_thread_num() << "\tHello, World!\n";
  }
  return 0;
}

Listing 2: Compiling

g++ -o hello hello.c -fopenmp
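With, for example, export OMP_NUM_THREADS=2 (see the environment variables on page 7), the program prints one "Hello, World!" line per thread; the order of the lines, and even the interleaving of their parts, is nondeterministic because all threads write to std::cout concurrently.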

Page 9: Tutorial Presentation 8

Classification of Variables

- private(var-list):
  - Variables in var-list are private.
- shared(var-list):
  - Variables in var-list are shared.
- default(private | shared | none):
  - Sets the default for all variables in this region.

Page 10: Tutorial Presentation 8

Example

Listing 3: OpenMP Private Variable

#include <iostream>
#include <omp.h>

int main(int argc, char *argv[])
{
  int i, j;
  i = 1;
  j = 2;

  std::cout << "BEFORE: i, j = " << i << ", " << j << std::endl;

#pragma omp parallel private(i)
  {
    i = 3;
    j = 5;
    std::cout << "IN-LOOP: i, j = " << i << ", " << j << std::endl;
  }

  std::cout << "AFTER: i, j = " << i << ", " << j << std::endl;
  return 0;
}
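In this example i is private, so each thread works on its own copy (uninitialized on entry), and the i printed after the region is unaffected by the assignments inside it; j, by contrast, is shared by default, so all threads write to the same variable (a data race when more than one thread runs) and the value 5 is visible after the region.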

Page 11: Tutorial Presentation 8

Work-Sharing Constructs

- Work-sharing constructs distribute the specified work to all threads within the current team.
- Types (the parallel loop is shown in Listing 4; a sketch of the remaining constructs follows below):
  - Parallel loop
  - Parallel section
  - Master region
  - Single region
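The listings later in the tutorial only exercise the parallel loop, so here is a minimal sketch (not part of the original slides) of the other three constructs, compiled as in Listing 2:

#include <iostream>
#include <omp.h>

int main()
{
#pragma omp parallel
  {
    // Parallel sections: each section is executed by one thread of the team.
#pragma omp sections
    {
#pragma omp section
      std::cout << "section A on thread " << omp_get_thread_num() << "\n";
#pragma omp section
      std::cout << "section B on thread " << omp_get_thread_num() << "\n";
    }

    // Single region: executed by exactly one (arbitrary) thread,
    // with an implicit barrier at the end.
#pragma omp single
    std::cout << "single region on thread " << omp_get_thread_num() << "\n";

    // Master region: executed only by the master thread, no barrier.
#pragma omp master
    std::cout << "master region on thread " << omp_get_thread_num() << "\n";
  }
  return 0;
}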

Page 12: Tutorial Presentation 8

Parallel Loop

- Syntax:

  #pragma omp for [clause ...]

- The iterations of the loop are distributed to the threads.
- The scheduling of loop iterations: static, dynamic, guided, or runtime.

Page 13: Tutorial Presentation 8

Scheduling Strategies

- Schedule clause:

  schedule(type [, size])

- static: Chunks of the specified size are assigned in a round-robin fashion to the threads.
- dynamic: The iterations are broken into chunks of the specified size. When a thread finishes the execution of a chunk, the next chunk is assigned to that thread.
- guided: Similar to dynamic, but the size of the chunks decreases exponentially. The size parameter specifies the smallest chunk; the initial chunk size is implementation dependent.
- runtime: The scheduling type and the chunk size are determined via environment variables (see the sketch below).
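The runtime strategy is the only one not used in Listing 4, so here is a minimal, hypothetical sketch in which the schedule is taken from the OMP_SCHEDULE environment variable:

#include <iostream>
#include <omp.h>

int main()
{
  const int n = 16;

  // The actual scheduling type and chunk size are read from
  // OMP_SCHEDULE at run time.
#pragma omp parallel for schedule(runtime)
  for (int i = 0; i < n; i++)
    std::cout << "iteration " << i << " on thread "
              << omp_get_thread_num() << "\n";

  return 0;
}

Running it with, for example, export OMP_SCHEDULE="dynamic,4" selects dynamic scheduling with a chunk size of 4 without recompiling; export OMP_SCHEDULE="static" switches back to static scheduling.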

Page 14: Tutorial Presentation 8

Example

Listing 4: OpenMP Parallel Loop with Dynamic Scheduling

#include <iostream>
#include <omp.h>
#include <cstdlib>
#include <ctime>

#define CHUNKSIZE 100
#define N 1000

int main()
{
  int i, chunk;
  double a[N], b[N], c[N];
  srand(time(NULL));
  // generate_random_double is assumed to be defined elsewhere;
  // its definition is not shown on the slide.
  for (i = 0; i < N; i++) {
    a[i] = generate_random_double(0.0, 10.0);
    b[i] = generate_random_double(0.0, 10.0);
  }
  chunk = CHUNKSIZE;

#pragma omp parallel shared(a, b, c, chunk) private(i)
  {
#pragma omp for schedule(dynamic, chunk) nowait
    for (i = 0; i < N; i++)
      c[i] = a[i] + b[i];
  }
  return 0;
}
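The nowait clause removes the implicit barrier at the end of the for construct, so a thread that finishes its chunks does not wait for the others; the barrier at the end of the enclosing parallel region still applies.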

Page 15: Tutorial Presentation 8

References

[1] Blaise Barney, Lawrence Livermore National Laboratory, https://computing.llnl.gov/tutorials/openMP/