Lecture 7: POSIX Threads - Pthreads

Upload: mercy-collins

Post on 13-Dec-2015


Parallel Programming Models

Parallel Programming Models:

Data parallelism / Task parallelism
Explicit parallelism / Implicit parallelism
Shared memory / Distributed memory
Other programming paradigms

• Object-oriented

• Functional and logic

Parallel Programming Models

Shared Memory
The programmer's task is to specify the activities of a set of processes that communicate by reading and writing shared memory.
• Advantage: the programmer need not be concerned with data-distribution issues.
• Disadvantage: high-performance implementations may be difficult on computers that lack hardware support for shared memory, and race conditions tend to arise more easily.

Distributed Memory
Processes have only local memory and must use some other mechanism (e.g., message passing or remote procedure call) to exchange information.
• Advantage: programmers have explicit control over data distribution and communication.

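Shared vs Distributed Memory

[Figure: Shared memory: processors (P) attached to a single memory through a bus. Distributed memory: processors (P), each with its own local memory (M), connected through a network.]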

Parallel Programming Models

Parallel Programming Tools:

Parallel Virtual Machine (PVM)
• Distributed memory, explicit parallelism

Message-Passing Interface (MPI)
• Distributed memory, explicit parallelism

PThreads
• Shared memory, explicit parallelism

OpenMP
• Shared memory, explicit parallelism

High-Performance Fortran (HPF)
• Implicit parallelism

Parallelizing Compilers
• Implicit parallelism

Parallel Programming Models

Shared Memory Model

Used on Shared memory MIMD architectures

Program consists of many independent threads

Concurrently executing threads all share a single, common address space.

Threads can exchange information by reading and writing to memory using normal variable assignment operations

Parallel Programming Models

Memory Coherence Problem

The problem: ensuring that the latest value of a variable updated in one thread is the value seen when that same variable is accessed in another thread.

Hardware support and compiler support are required

Cache-coherency protocol

[Figure: Thread 1 and Thread 2 accessing the same shared variable X]

Parallel Programming Models

Distributed Shared Memory (DSM) Systems

Implement Shared memory model on Distributed memory MIMD architectures

Concurrently executing threads all share a single, common address space.

Threads can exchange information by reading and writing to memory using normal variable assignment operations

Use a message-passing layer as the means for communicating updated values throughout the system.

Parallel Programming Models

Synchronization operations in Shared Memory Model

Monitors
Locks
Critical sections
Condition variables
Semaphores
Barriers

PThreads

POSIX Threads – Pthreads

www.pthreads.org/

PThreads

In the UNIX environment a thread:

Exists within a process and uses the process resources
Has its own independent flow of control
Duplicates only the essential resources it needs to be independently schedulable
May share the process resources with other threads
Dies if the parent process dies
Is "lightweight" because most of the overhead has already been accomplished through the creation of its process.

PThreads

Because threads within the same process share resources:

Changes made by one thread to shared system resources will be seen by all other threads.

Two pointers having the same value point to the same data.

Reading and writing to the same memory locations is possible, and therefore requires explicit synchronization by the programmer.

PThreads

pthread_create(thread, attr, start_routine, arg): creates a new thread of control
• thread: unique identifier of the new thread, returned through this argument
• attr: used to set thread attributes (NULL for the defaults)
• start_routine: the C routine that the thread will execute once it is created
• arg: a single argument passed to start_routine by reference, as a pointer cast to void * (NULL if there is no argument)

pthread_exit(): terminates the calling thread. A thread terminates when the function it is executing returns or when it calls pthread_exit() explicitly.

PThread Code

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#define NUM_THREADS 5

/* Thread body: the thread number arrives packed into the pointer argument. */
void *PrintHello(void *threadid) {
    long tid;
    tid = (long)threadid;
    printf("Hello World! It's me, thread #%ld!\n", tid);
    pthread_exit(NULL);
}

int main (int argc, char *argv[]) {
    pthread_t threads[NUM_THREADS];
    int rc;
    long t;

    for(t=0; t<NUM_THREADS; t++){
        printf("In main: creating thread %ld\n", t);
        rc = pthread_create(&threads[t], NULL, PrintHello, (void *)t);
        if (rc){
            printf("ERROR; return code from pthread_create() is %d\n", rc);
            exit(-1);
        }
    }
    /* Exit only the main thread so the workers can finish. */
    pthread_exit(NULL);
}

PThreads

The data-oriented synchronization routines are based on the use of a mutex (mutual exclusion).

A mutex is a data structure of type pthread_mutex_t that is initialized (statically or with pthread_mutex_init()) and then passed as an argument to the synchronization routines.

pthread_mutex_lock() and pthread_mutex_unlock(): once a thread has locked a specific mutex with pthread_mutex_lock, subsequent pthread_mutex_lock calls on that mutex by other threads block until the owner calls pthread_mutex_unlock on it.

PThreads

Condition variables allow a thread to wait until a Boolean predicate that depends on the contents of one or more shared-memory locations becomes true.

A condition variable associates a mutex with the desired predicate. Before the program makes its test, it obtains a lock on the associated mutex. Then it evaluates the predicate.

If the predicate evaluates to false, the thread can execute a pthread_cond_wait() operation, which atomically suspends the calling thread, puts the thread record on a waiting list that is part of the condition variable, and releases the mutex. The thread scheduler is now free to use the processor to execute another thread.

PThreads

If the predicate evaluates to true, the thread simply releases its lock and continues on its way.

If a thread changes the value of any shared variables associated with a condition variable predicate, it needs to cause any threads that may be waiting on this condition variable to be rescheduled. The pthread_cond_signal() causes one of the threads waiting on the condition variable to become unblocked, returning from the pthread_cond_wait that caused it to block in the first place. The mutex is automatically reobtained as part of the return from the wait, so the thread is in the position to reevaluate the predicate immediately.

Parallel Programming Models

Example: Pi calculation

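$$\pi = \int_0^1 f(x)\,dx = \int_0^1 \frac{4}{1+x^2}\,dx \approx w \sum_{i=1}^{n} f(x_i)$$

$$f(x) = \frac{4}{1+x^2}, \qquad n = 10, \qquad w = \frac{1}{n}, \qquad x_i = w\,(i - 0.5)$$

[Figure: f(x) plotted against x on [0, 1], divided into n intervals of width w with sample points x_i]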
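The step that makes this integral equal to π is not spelled out on the slide. A short derivation, using the slide's symbols, is given here; the "midpoint" reading of x_i is an added observation, not part of the original.

```latex
\int_0^1 \frac{4}{1+x^2}\,dx
  = 4\arctan(x)\,\Big|_0^1
  = 4\left(\frac{\pi}{4} - 0\right)
  = \pi

% Interval i covers [(i-1)w,\ iw]; its midpoint is
x_i = \frac{(i-1)w + iw}{2} = w\,(i - 0.5), \qquad i = 1, \dots, n
```

Each term w f(x_i) is therefore the area of a rectangle of width w whose height is f evaluated at the midpoint of interval i.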

Parallel Programming Models

Sequential Code

#include <stdio.h>

#define f(x) (4.0/(1.0+(x)*(x)))

int main(void) {
    int n, i;
    float w, x, sum, pi;

    printf("n?\n");
    scanf("%d", &n);
    w = 1.0/n;
    sum = 0.0;
    for (i = 1; i <= n; i++) {
        x = w*(i-0.5);      /* midpoint of interval i */
        sum += f(x);
    }
    pi = w*sum;
    printf("%f\n", pi);
    return 0;
}


Parallel Virtual Machine (PVM)

Data Distribution

[Figure: f(x) plotted against x on [0, 1], with the sample points x_i marked]

PThread Code

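#include <pthread.h>
#include <stdio.h>

#define f(x) (4.0/(1.0+(x)*(x)))
#define NUM_THREADS 4            /* maximum number of worker threads */

float pi = 0.0;                  /* shared result, protected by m1 */
pthread_mutex_t m1;

/* Each worker sums every p-th interval, starting at interval id. */
void *worker(void *arg) {
    int *args = (int *)arg;
    int i, p, n, id;
    float sum, w, x;

    p = args[0];
    n = args[1];
    id = args[2];
    sum = 0.0;
    w = 1.0/n;
    for (i = id; i < n; i += p) {
        x = (i+0.5)*w;
        sum += f(x);
    }
    sum = sum*w;
    pthread_mutex_lock(&m1);     /* only one thread updates pi at a time */
    pi += sum;
    pthread_mutex_unlock(&m1);
    return NULL;
}

int main (int argc, char *argv[]) {
    pthread_t threads[NUM_THREADS];
    int i, n, nproc;
    int args[NUM_THREADS][3];    /* one argument block per thread */

    if (scanf("%d", &nproc) != 1 || scanf("%d", &n) != 1 || n < 1)
        return 1;
    if (nproc < 1 || nproc > NUM_THREADS)
        nproc = NUM_THREADS;
    pthread_mutex_init(&m1, NULL);
    for (i = 0; i < nproc; i++) {
        args[i][0] = nproc;
        args[i][1] = n;
        args[i][2] = i;
        pthread_create(&threads[i], NULL, worker, (void *)args[i]);
    }
    for (i = 0; i < nproc; i++) {
        pthread_join(threads[i], NULL);
    }
    printf("Pi=%f\n", pi);       /* print once, after all workers have joined */
    return 0;
}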