OpenMP Programming 2
Advanced OpenMP Programming

Berk ONAT
İTÜ Bilişim Enstitüsü

21 June 2012

Outline

OpenMP Prog. & App. 21.06.2012 2/36  

•  OpenMP Synchronization Constructs
   –  Single, Critical, Atomic, Barrier
•  OpenMP Data Scope Clauses
   –  Firstprivate, Lastprivate
•  Advanced OpenMP Directives
   –  Flush, Threadprivate, Copyin
•  Runtime library routines
•  OpenMP "Danger Zones"
   –  Race Conditions
   –  Deadlock
   –  Livelock

Synchronization

•  Synchronization directives help to organize accesses to shared data by multiple threads.
•  Types
   –  Barrier
   –  Single
   –  Master
   –  Ordered
   –  Critical
   –  Atomic
   –  Locks
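As a minimal sketch of how some of these directives fit together (single, critical, barrier and master), consider the C function below. The name `sync_demo` and its parameters are hypothetical and only serve the illustration; the serial fallback lets the file build without OpenMP as well.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallback so the sketch also builds without OpenMP support. */
static int omp_get_num_threads(void) { return 1; }
#endif

/* Hypothetical helper: returns 2 * (team size), stores the team size in *team. */
int sync_demo(int nthreads, int *team)
{
    int total = 0;   /* shared accumulator */
    int ready = 0;   /* shared flag set by exactly one thread */

    #pragma omp parallel num_threads(nthreads)
    {
        /* SINGLE: one thread runs this block; implicit barrier at its end. */
        #pragma omp single
        {
            ready = 1;
            *team = omp_get_num_threads();
        }

        /* CRITICAL: threads update the shared value one at a time. */
        #pragma omp critical
        total += ready;

        /* BARRIER: wait until every thread has added its contribution. */
        #pragma omp barrier

        /* MASTER: only thread 0 runs this; no implied barrier. */
        #pragma omp master
        total *= 2;
    }
    return total;
}
```

With GCC this can be built with `gcc -fopenmp`; calling `sync_demo(4, &team)` should return twice the number of threads that actually formed the team.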

Data Scope Clauses

•  The OpenMP Data Scope Attribute Clauses are used to explicitly define how variables should be scoped.
•  These clauses provide the ability to control the data environment during execution of parallel constructs.
   o  They define how and which data variables in the serial section of the program are transferred to the parallel sections of the program (and back).
   o  They define which variables will be visible to all threads in the parallel sections and which variables will be privately allocated to each thread.
•  Types
   –  Firstprivate
   –  Lastprivate
   –  Reduction
   –  Copyin

Data Scope Clauses (firstprivate)

•  FIRSTPRIVATE Clause:
   –  It combines the behavior of the PRIVATE clause with automatic initialization of the variables.
   –  Variables that are declared to be "firstprivate" are private variables.
   –  Listed variables are initialized according to the value of their original objects prior to entry into the parallel or work-sharing construct.

C/C++                 Fortran
firstprivate (list)   FIRSTPRIVATE (list)

Example: Check the firstprivate.c example code.

Data Scope Clauses (lastprivate)

•  LASTPRIVATE Clause:
   –  It combines the behavior of the PRIVATE clause with a copy from the last loop iteration or section to the original variable object.
   –  A performance penalty is likely to be associated with the use of lastprivate, because the OpenMP library needs to keep track of which thread executes the last iteration. For a static workload distribution scheme this is relatively easy to do, but for a dynamic scheme this is more costly.

C/C++                Fortran
lastprivate (list)   LASTPRIVATE (list)

Example: Check the lastprivate.c example code.

Data Scope Clauses (lastprivate)

•  LASTPRIVATE Clause: …
   –  If the lastprivate clause is used on a sections construct, the object gets assigned the value that it has at the end of the lexically last section construct.
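A small C sketch of the loop case, assuming at least one iteration (n >= 1). The function name `lastprivate_demo` is hypothetical and this is not the course's lastprivate.c.

```c
/* Hypothetical helper: returns the value assigned in the sequentially
 * last iteration (i == n - 1), regardless of which thread ran it.
 * Assumes n >= 1. */
int lastprivate_demo(int n)
{
    int last = -1;

    #pragma omp parallel for lastprivate(last)
    for (int i = 0; i < n; i++) {
        last = i * i;   /* every thread writes its own private copy */
    }

    /* The copy from iteration n - 1 was written back to the original. */
    return last;
}
```

For n = 10 the function returns 81, the value from iteration 9, whether or not the code is compiled with OpenMP enabled.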

Data Scope Clauses (reduction)

•  REDUCTION Clause:
   –  It provides some forms of recurrence calculations (involving mathematically associative and commutative operators) so that they can be performed in parallel without code modification.
   –  The result is stored in the original shared variable, so reduction variables do not need to be declared as shared.

C/C++                         Fortran
reduction (operator : list)   REDUCTION (operator : list)

Example: Check the reduction.c or reduction.F example code.
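A minimal C sketch of two reductions on one loop, using the `+` and `^` operators. The function name `reduce_sum_xor` and its signature are assumptions for illustration, not taken from reduction.c.

```c
/* Hypothetical helper: computes the sum and the bitwise XOR of y[0..n-1]. */
void reduce_sum_xor(const int *y, int n, long *sum, int *xor_all)
{
    long s = 0;   /* identity value for + */
    int x = 0;    /* identity value for ^ */

    /* Each thread gets private copies of s and x, initialized to the
     * operator's identity; they are combined at the end of the loop. */
    #pragma omp parallel for reduction(+:s) reduction(^:x)
    for (int i = 0; i < n; i++) {
        s += y[i];
        x ^= y[i];
    }

    *sum = s;
    *xor_all = x;
}
```

For the input {1, 2, 3, 4, 5} this gives a sum of 15 and an XOR of 1, independent of the number of threads.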

Data Scope Clauses (reduction)

•  REDUCTION operator values: (table of C/C++ and Fortran operators, shown as a figure on the original slide)

Example: Check the reduction.c example code.

Data Scope Clauses (reduction)

SUBROUTINE REDUCTION(A, B, C, D, X, Y, N)
  REAL :: X(*), A, D
  INTEGER :: Y(*), N, B, C
  INTEGER :: I
  A = 0
  B = 0
  C = Y(1)
  D = X(1)
!$OMP PARALLEL DO PRIVATE(I) SHARED(X, Y, N) REDUCTION(+:A) &
!$OMP& REDUCTION(IEOR:B) REDUCTION(MIN:C) REDUCTION(MAX:D)
  DO I = 1, N
    A = A + X(I)
    B = IEOR(B, Y(I))
    C = MIN(C, Y(I))
    IF (D < X(I)) D = X(I)
  END DO
END SUBROUTINE REDUCTION

Fortran intrinsic since OpenMP 2.5; C/C++ since OpenMP 3.1 (July 2011)

Data Scope Clauses (copyin)

•  COPYIN Clause:
   –  It provides a means for assigning the same value to threadprivate variables for all threads in the team.
   –  The list contains the names of variables to copy. In Fortran, the list can contain both the names of common blocks and named variables.
   –  The master thread variable is used as the copy source. The team threads are initialized with its value upon entry into the parallel construct.

C/C++           Fortran
copyin (list)   COPYIN (list)

Advanced OpenMP Constructs

•  THREADPRIVATE Directive:
   –  It controls whether global data (static in C/C++ and common blocks in Fortran) is private or shared.
   –  By default global data is shared, but sometimes it needs to be private to each thread.
   –  Supports pointers in C/C++ and Fortran, and allocatables in Fortran.
   –  The COPYIN clause can be used to initialize the global data variable.

C/C++                              Fortran
#pragma omp threadprivate (list)   !$OMP THREADPRIVATE (/cmn/, list...)

Example: Check the threadprivate.c or threadprivate.F example code.
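The combination of THREADPRIVATE and COPYIN can be sketched in C as below. The names `counter` and `copyin_demo` are hypothetical, and this is not the course's threadprivate.c.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallback so the sketch also builds without OpenMP support. */
static int omp_get_thread_num(void) { return 0; }
#endif

/* File-scope data: one separate copy per thread. */
static int counter = 0;
#pragma omp threadprivate(counter)

/* Hypothetical helper: returns 1 if every thread started from the
 * master's value and the master's copy was not disturbed by the others. */
int copyin_demo(int nthreads, int seed)
{
    int all_match = 1;

    counter = seed;   /* sets the master thread's copy */

    #pragma omp parallel num_threads(nthreads) copyin(counter) reduction(&&:all_match)
    {
        /* COPYIN broadcast the master's value to each thread's copy. */
        all_match = all_match && (counter == seed);
        /* Each thread now changes only its own copy. */
        counter += omp_get_thread_num();
    }

    /* The master is thread 0, so its copy still equals seed. */
    return all_match && (counter == seed);
}
```

Without the `copyin(counter)` clause, the copies of the non-master threads would keep whatever value they had before, here the initial 0.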

Advanced OpenMP Constructs

•  THREADPRIVATE Directive:
   –  Each thread gets its own copy of the variable/common block, so data written by one thread is not visible to other threads.
   –  Restrictions (for the values to persist from one parallel region to the next):
      •  The parallel regions must be executed by the same number of threads. Each of the threads will continue to work on one of the sets of data previously produced.
      •  The value of the dyn-var internal control variable is false at entry to the first parallel region and remains false until entry to the second parallel region.

Advanced OpenMP Constructs

•  PRIVATE vs. THREADPRIVATE

                 PRIVATE                                  THREADPRIVATE
Data Item        C/C++: variable                          C/C++: variable
                 Fortran: variable or common block        Fortran: common block
Where Declared   Start of region or work-sharing group    In declarations of each routine using block or global file scope
Persistent       NO                                       YES
Initialize       FIRSTPRIVATE                             COPYIN

Clause/Directives Summary

Runtime Library Routines

•  OMP_SET_NUM_THREADS
   –  Sets the number of threads that will be used in the next parallel region. The argument must be a positive integer.
   –  Notes
      •  This routine can only be called from the serial portions of the code.
      •  This call has precedence over the OMP_NUM_THREADS environment variable.

C/C++:    #include <omp.h>
          void omp_set_num_threads(int n)
Fortran:  USE omp_lib
          SUBROUTINE OMP_SET_NUM_THREADS(N)

Runtime Library Routines

•  OMP_GET_NUM_THREADS
   –  Returns the number of threads that are currently in the team executing the parallel region from which it is called.
   –  Notes
      •  If this call is made from a serial portion of the program, or a nested parallel region that is serialized, it will return 1.

C/C++:    #include <omp.h>
          int omp_get_num_threads(void)
Fortran:  USE omp_lib
          INTEGER FUNCTION OMP_GET_NUM_THREADS()

Runtime Library Routines

•  OMP_GET_MAX_THREADS
   –  Returns the maximum value that can be returned by a call to the OMP_GET_NUM_THREADS function.
   –  Notes
      •  Generally reflects the number of threads as set by the OMP_NUM_THREADS environment variable or the OMP_SET_NUM_THREADS() library routine.

C/C++:    #include <omp.h>
          int omp_get_max_threads(void)
Fortran:  USE omp_lib
          INTEGER FUNCTION OMP_GET_MAX_THREADS()

Example: Check the omp_getEnvInfo.c example code.

Runtime Library Routines

•  OMP_GET_THREAD_NUM
   –  Returns the thread number of the thread, within the team, making this call. This number will be between 0 and OMP_GET_NUM_THREADS-1. The master thread of the team is thread 0.
   –  Notes
      •  If called from a serial region, or from a nested parallel region that is serialized, this function will return 0.

C/C++:    #include <omp.h>
          int omp_get_thread_num(void)
Fortran:  USE omp_lib
          INTEGER FUNCTION OMP_GET_THREAD_NUM()
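The four routines introduced so far can be exercised together in a short C sketch. The function name `team_id_sum` is hypothetical, and this is not the course's omp_getEnvInfo.c; the serial fallbacks are assumptions that mimic what the routines return in a one-thread program.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks so the sketch also builds without OpenMP support. */
static void omp_set_num_threads(int n) { (void)n; }
static int omp_get_max_threads(void) { return 1; }
static int omp_get_num_threads(void) { return 1; }
static int omp_get_thread_num(void) { return 0; }
#endif

/* Hypothetical helper: requests a team size, then returns the sum of all
 * thread numbers in the team. Also reports the team size actually used and
 * the upper bound seen before the region. */
int team_id_sum(int requested, int *team_size, int *max_threads)
{
    int sum = 0;

    omp_set_num_threads(requested);         /* called from the serial part */
    *max_threads = omp_get_max_threads();   /* bound for the next region */

    #pragma omp parallel reduction(+:sum)
    {
        #pragma omp single
        *team_size = omp_get_num_threads(); /* size of the current team */

        sum += omp_get_thread_num();        /* 0 .. team_size - 1 */
    }
    return sum;
}
```

Since thread numbers run from 0 to team_size - 1, the returned sum should equal team_size * (team_size - 1) / 2, and a call to `omp_get_num_threads()` outside any parallel region returns 1.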

Runtime Library Routines

•  OMP_GET_WTIME
   –  Provides a portable wall clock timing routine. Returns seconds in double precision.
   –  Usually used in "pairs", with the value of the first call subtracted from the value of the second call to obtain the elapsed time for a block of code.

C/C++:    #include <omp.h>
          double omp_get_wtime(void)
Fortran:  USE omp_lib
          DOUBLE PRECISION FUNCTION OMP_GET_WTIME()

Example: Check the omp_wtime.F example code.

The GNU Fortran compiler implements this routine correctly.

USE: gfortran -fopenmp omp_wtime.F -o omp_wtime.x
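The "pairs" usage can be sketched in C as follows. The function name `timed_sum` is hypothetical; the fallback based on `clock()` is an assumption for builds without OpenMP and measures CPU time rather than wall clock time.

```c
#ifdef _OPENMP
#include <omp.h>
#else
#include <time.h>
/* Fallback for builds without OpenMP: CPU time, not wall clock time. */
static double omp_get_wtime(void) { return (double)clock() / CLOCKS_PER_SEC; }
#endif

/* Hypothetical helper: sums 1/i for i = 1..n and reports the elapsed time. */
double timed_sum(long n, double *elapsed)
{
    double sum = 0.0;
    double t_start = omp_get_wtime();        /* first call of the pair */

    #pragma omp parallel for reduction(+:sum)
    for (long i = 1; i <= n; i++) {
        sum += 1.0 / (double)i;
    }

    *elapsed = omp_get_wtime() - t_start;    /* second call minus first */
    return sum;
}
```

The measured time covers thread start-up as well as the loop itself, so very short regions tend to be dominated by overhead.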

Advanced OpenMP Constructs

•  FLUSH Directive:
   –  If a thread updates shared data,
      •  the new values will first be saved in a register
      •  then stored back to the local cache
   –  The updates are thus not necessarily immediately visible to other threads.
   –  On a cache-coherent machine, the modification to cache is broadcast to other processors to make them aware of changes.
   –  It depends on the platform!

Advanced OpenMP Constructs

•  FLUSH Directive:
   –  The OpenMP standard specifies that all modifications are written back to main memory, and are thus available to all threads, at synchronization points in the program.
   –  Sometimes updated values of shared variables must become visible to other threads in between synchronization points.
   –  The FLUSH directive is used for this purpose.
   –  The purpose of the flush directive is to make a thread's temporary view of shared data consistent with the values in memory.

Advanced OpenMP Constructs

•  FLUSH Directive:
   –  Implicit FLUSH operations
      •  All explicit and implicit barriers (e.g., at the end of a parallel region or work-sharing construct)
      •  Entry to and exit from critical regions
      •  Entry to and exit from lock routines

C/C++                      Fortran
#pragma omp flush (list)   !$OMP FLUSH (list)

Example: Check the omp_flush_prod_cons.c example code.
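A producer/consumer handshake is the usual illustration of FLUSH. The sketch below is not the course's omp_flush_prod_cons.c; the function name `produce_consume` is hypothetical, and the flag is accessed with `atomic read` and `atomic write`, which assumes an OpenMP 3.1 compiler. It also assumes that two threads are available or, failing that, that the sections run in lexical order; otherwise the consumer would spin forever.

```c
/* Hypothetical helper: one section produces a value, the other waits for
 * a flag and then reads the value. Returns the value the consumer saw. */
int produce_consume(void)
{
    int data = 0;      /* payload written by the producer */
    int flag = 0;      /* set to 1 once the payload is ready */
    int result = -1;   /* what the consumer read */

    #pragma omp parallel sections num_threads(2) shared(data, flag, result)
    {
        #pragma omp section
        {   /* Producer */
            data = 42;
            #pragma omp flush          /* make data visible before the flag */
            #pragma omp atomic write
            flag = 1;
            #pragma omp flush          /* push the flag out */
        }

        #pragma omp section
        {   /* Consumer */
            int ready = 0;
            while (!ready) {
                #pragma omp flush      /* refresh this thread's view */
                #pragma omp atomic read
                ready = flag;
            }
            #pragma omp flush          /* make sure data is re-read */
            result = data;
        }
    }
    return result;
}
```

The order matters: the producer flushes `data` before it sets the flag, and the consumer flushes again after it has seen the flag and before it reads `data`.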

Conditional Parallel Regions

•  IF Clause:
   –  It is used to specify conditional execution: the region runs in parallel only when the condition is true, otherwise it is executed by a single thread.
   –  Supported on the parallel construct (OpenMP 3.0 also accepts it on the task construct).
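A minimal C sketch of a conditional parallel region, where the team is created only when the problem size exceeds a threshold. The function name `threads_used` and both parameters are hypothetical.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallback so the sketch also builds without OpenMP support. */
static int omp_get_num_threads(void) { return 1; }
#endif

/* Hypothetical helper: returns the size of the team that executed the
 * region for a problem of size n. */
int threads_used(int n, int threshold)
{
    int used = 1;

    /* Parallel only if the work is large enough to pay for the overhead;
     * otherwise the region is executed by a team of one thread. */
    #pragma omp parallel if(n > threshold) num_threads(4)
    {
        #pragma omp single
        used = omp_get_num_threads();
    }
    return used;
}
```

Calling `threads_used(10, 100)` returns 1 because the condition is false, while `threads_used(1000, 100)` returns the size of the team the runtime actually granted (up to 4).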

OpenMP Danger Zones

•  There are three major SMP programming errors:
   –  Race Conditions
      •  A race condition exists when two unsynchronized threads access the same shared variable with at least one thread modifying the variable.
   –  Deadlock
      •  Deadlock describes a condition where two or more threads are blocked (hang) forever, waiting for each other.
   –  Livelock
      •  Multiple threads work on individual tasks which the ensemble cannot finish.

OpenMP Danger Zones

•  Race Conditions
   –  Another common mistake is the use of uninitialized variables. Remember that private variables do not have initial values upon entering a parallel construct. Use the firstprivate clause to initialize them, and the lastprivate clause to copy the final value back, only when necessary, because doing so adds extra overhead.
   –  Debug: set the Intel C++ Compiler specific environment variable KMP_LIBRARY=serial, or just compile without the -openmp flag.
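A race on a shared counter, and one way to remove it, can be sketched in C as follows. The function name `count_hits_atomic` is hypothetical; the racy variant is shown only as a comment because its result is unpredictable.

```c
/* Hypothetical helper: increments a shared counter n times in parallel.
 *
 * Racy variant (do not use): without the atomic directive, "hits++" is an
 * unsynchronized read-modify-write on a shared variable, so updates from
 * different threads can be lost and the result changes from run to run.
 */
long count_hits_atomic(long n)
{
    long hits = 0;   /* shared by all threads */

    #pragma omp parallel for
    for (long i = 0; i < n; i++) {
        /* ATOMIC makes the update of the shared variable indivisible. */
        #pragma omp atomic
        hits++;
    }
    return hits;
}
```

For a plain counter like this one, `reduction(+:hits)` would usually be the cheaper fix, since it avoids synchronizing on every iteration.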

OpenMP Danger Zones

•  Global Data:

      …
      include "global.h"
      …
!$omp parallel private(j)
!$omp do
      do j=1, n
         call suba(j)
      end do
!$omp end do
!$omp end parallel

      subroutine suba(j)
      …
      include "global.h"
      …
      do i=1, m
         b(i) = j
      end do
      return
      end

global.h:

      common /work/ a(m,n), b(m)

Race condition: b is shared by all threads through the common block.

OpenMP Danger Zones

•  Global Data: RACE CONDITION !!!

Thread 1                     Thread 2

subroutine suba(j=1)         subroutine suba(j=2)
…                            …
include "global.h"           include "global.h"
…                            …
do i=1, m                    do i=1, m
   b(i) = 1                     b(i) = 2
end do                       end do
return                       return
end                          end

Both threads change the same variable.

OpenMP Danger Zones

•  Global Data: SOLUTION 1

      …
      include "global.h"
      …
!$omp parallel private(j)
!$omp do
      do j=1, n
         call suba(j)
      end do
!$omp end do
!$omp end parallel

      subroutine suba(j)
      …
      include "global.h"
      TID = omp_get_thread_num()+1
      do i=1, m
         b(i, TID) = j
      end do
      return
      end

global.h:

      common /work/ a(m,n)
      common /tprivate/ b(m,nthreads)

Extend b so that each thread reaches its own unique storage area.

OpenMP Danger Zones

•  Global Data: SOLUTION 2

      …
      include "global.h"
      …
!$omp parallel private(j)
!$omp do
      do j=1, n
         call suba(j)
      end do
!$omp end do
!$omp end parallel

      subroutine suba(j)
      …
      include "global.h"
      …
      do i=1, m
         b(i) = j
      end do
      return
      end

global.h:

      common /work/ a(m,n) /tprivate/ b(m)
!$omp threadprivate (/tprivate/)

The compiler creates a private copy of b for each thread.

OpenMP Danger Zones

•  Race Conditions
   –  Lab: see racecond.F
      •  The result varies unpredictably based on the specific order of execution for each section.
      •  Wrong answers are produced without warning!
   –  Fixed version: racecond-fixed.F
      •  Choose an "IC" counter and check it at every calculation. IC forces the order of calculations. FLUSH forces the update of the shared variable.

OpenMP Danger Zones

•  Deadlock
   –  Two or more threads in a process are blocked forever, each waiting for a resource (for example a lock) that another one holds.
   –  Types …
      •  'potential deadlock': a deadlock that did not occur in a given run, but can occur in different runs of the program depending on the timings of the requests for locks by the threads.
      •  'actual deadlock': one that actually occurred in a given run of the program. An actual deadlock causes the threads involved to hang, but may or may not cause the whole process to hang.

OpenMP Danger Zones

•  Deadlock

#pragma omp parallel
{
    int me;                      /* declared inside the region, so private to each thread */
    me = omp_get_thread_num();
    if (me == 0) goto MASTER;    /* thread 0 skips the barrier */

    #pragma omp barrier          /* all other threads wait here */

MASTER:
    #pragma omp single           /* has an implicit barrier at its end */
    printf("done");
}

In this example deadlock occurs because the threads arrive at different barriers. If one thread skips a barrier, it generally causes deadlock. Nested CRITICAL sections or LOCKs can also cause deadlock.

OpenMP Danger Zones

•  Livelock
   –  A livelock is similar to a deadlock, except that the states of the processes involved in the livelock constantly change with regard to one another, none progressing.

!$OMP PARALLEL PRIVATE(ID)
      ID = OMP_GET_THREAD_NUM()
      N = OMP_GET_NUM_THREADS()
 1000 CONTINUE
      PHASES(ID) = UPDATE(U, ID)
!$OMP SINGLE
      RES = MATCH(PHASES, N)
!$OMP END SINGLE
      IF (RES**2 .LT. TOL) GOTO 2000
      GOTO 1000
 2000 CONTINUE
!$OMP END PARALLEL

If the square of RES is never smaller than TOL, the program spins endlessly in livelock.

OpenMP Death Traps

•  Are you using thread-safe libraries?
•  I/O inside a parallel region can interleave unpredictably.
•  Make sure you understand what your constructors are doing with private objects.
•  Private variables can mask global ones.
•  Understand when shared memory is coherent. When in doubt, use FLUSH.
•  NOWAIT removes implied barriers.

OpenMP 3.0: task directive

•  TASK Directive:
   –  Runs a defined subroutine, function or code block marked with omp task as a separate unit of work, which may execute on another thread.
   –  Before OpenMP 3.0, tasking was vendor specific (Intel taskq).
   –  The task directive is part of the OpenMP 3.0 standard (the Intel 10.1 and gcc 4.4 compilers implemented task in late 2008).

C/C++              Fortran
#pragma omp task   !$OMP TASK

Example: Check the omp_linked_list.c and omp_task_omp3.c example code.
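Linked-list traversal is a typical use of tasks, because the number of nodes is not known in advance. The C sketch below is not the course's omp_linked_list.c; the type `node_t` and the functions `process` and `process_list` are hypothetical.

```c
#include <stddef.h>

/* Hypothetical list node used only for this illustration. */
typedef struct node {
    int value;
    struct node *next;
} node_t;

/* The work done for one node: square its value in place. */
static void process(node_t *p)
{
    p->value *= p->value;
}

/* One thread walks the list and creates a task per node; the other
 * threads of the team pick up and execute the tasks. */
void process_list(node_t *head)
{
    #pragma omp parallel
    {
        #pragma omp single
        {
            for (node_t *p = head; p != NULL; p = p->next) {
                /* firstprivate: each task captures its own node pointer. */
                #pragma omp task firstprivate(p)
                process(p);
            }
            /* Wait for all tasks created above to finish. */
            #pragma omp taskwait
        }
    }
}
```

Each node is touched by exactly one task, so no further synchronization is needed here; tasks that update shared data would need critical or atomic protection.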

OpenMP 3.0: Intel taskq

C/C++                    Fortran
#pragma intel omp task   !$INTEL OMP TASKQ

Example: Check the omp_task_intel.c example code.

bash $ icc omp_task_intel.c -o omp_task_intel.x -openmp -openmp-task intel

TASK BARRIERS: taskwait

*Ref. 4

TASK BARRIERS: taskgroup

*Ref. 4

TASK SYNCHRONIZATION: taskyield

*Ref. 4

Task, Workshare or Nested ?

Alignment Evaluation Program

*Ref. 4

Task, Workshare or Nested ?

SparseLU Program

*Ref. 4

Lab: Exercise 1

Add the correct OpenMP pragmas to parallelize the serial matrix multiplication code.

$ ../openmp-application/ matrixmultp.c

Lab: Exercise 2

Calculate

$$\int_0^1 \frac{4}{1+x^2}\,dx \approx h \sum_{i=1}^{N} \frac{4}{1+x_i^2} = \sum_{i=1}^{N} h\,\frac{4}{1+\left(h\left(i-\frac{1}{2}\right)\right)^2}$$

1.  Write a simple serial code first.
2.  Add OpenMP pragmas to your code.
3.  Are your threads synchronized?

Lab: Exercise 3

•  Write A=LU decomposition code.
•  Back substitution is not necessary for now. Just parallelize the loops given below.
•  Initialize the matrix with any numbers (Ex: A[i][j]=1.0+(i*size)+j).
•  Print L and A after the LU decomposition. (A will be the new U.)
•  Can you implement partial pivoting and scaling?

LU-factorization algorithm:

for k=1 to n-1
   for i=k+1 to n
      L(i,k)=A(i,k)/A(k,k)
      for j=k+1 to n
         A(i,j)=A(i,j)-L(i,k)*A(k,j)
      end for
   end for
end for

References:

1.  Blaise Barney, OpenMP Tutorial, https://computing.llnl.gov/tutorials/openMP/
2.  Rohit Chandra, Parallel Programming in OpenMP, 2000, Morgan Kaufmann
3.  Michael J. Quinn, Parallel Programming in C with MPI and OpenMP, 2003, McGraw Hill
4.  Eduard Ayguadé, Alejandro Duran, Jay Hoeflinger, Federico Massaioli and Xavier Teruel, An Experimental Evaluation of the New OpenMP Tasking Model, Languages and Compilers for Parallel Computing, Lecture Notes in Computer Science, 2008, Volume 5234/2008, 63-77