CSE 160 - Lecture 15
![Page 1: CSE 160 - Lecture 15](https://reader035.vdocuments.mx/reader035/viewer/2022062803/5681465b550346895db37b1f/html5/thumbnails/1.jpg)
CSE 160 - Lecture 15
Introduction to Threads, Synchronization, and Mutual Exclusion
![Page 2: CSE 160 - Lecture 15](https://reader035.vdocuments.mx/reader035/viewer/2022062803/5681465b550346895db37b1f/html5/thumbnails/2.jpg)
Heavyweight Processes
• Complete stand-alone programs
  – Code segment
  – Data segment
    • Static data
  – Heap
    • Malloc’ed data
  – Stack
  – Registers
![Page 3: CSE 160 - Lecture 15](https://reader035.vdocuments.mx/reader035/viewer/2022062803/5681465b550346895db37b1f/html5/thumbnails/3.jpg)
How can two heavyweight processes communicate?
[Diagram: Process 1 and Process 2 each hold a pointer, myshmPtr, into a Shared Memory Segment; or the two processes are connected by a Communication Socket.]
![Page 4: CSE 160 - Lecture 15](https://reader035.vdocuments.mx/reader035/viewer/2022062803/5681465b550346895db37b1f/html5/thumbnails/4.jpg)
Shared Memory Segment
• Works only on a single CPU or a shared-memory multiprocessor
• A “named” segment of memory that processes attach to
  – shmat() function call on Unix
• Processes are given pointers to the beginning of the shared memory segment
  – The structure of the segment contents is not specified
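A minimal sketch of attaching to a shared segment with the System V calls shmget(), shmat(), shmdt(), and shmctl(). The struct layout, the function name shared_segment_demo, and the value 42 are assumptions made for illustration. The sketch uses IPC_PRIVATE and fork() to stay self-contained; unrelated processes would instead agree on a key to name the segment.

```c
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical layout: the segment is raw bytes, so the processes
 * must agree on a structure themselves. */
struct shared_ints {
    int x;
    int y;
    int z;
};

/* Creates a segment, lets a child process write y, and returns the
 * value the parent reads back. Returns -1 on any error. */
int shared_segment_demo(void)
{
    int shmid = shmget(IPC_PRIVATE, sizeof(struct shared_ints),
                       IPC_CREAT | 0600);
    if (shmid == -1)
        return -1;

    /* shmat() returns a pointer to the beginning of the segment. */
    struct shared_ints *myshmPtr = shmat(shmid, NULL, 0);
    if (myshmPtr == (void *) -1) {
        shmctl(shmid, IPC_RMID, NULL);
        return -1;
    }
    myshmPtr->y = 0;

    int result = -1;
    pid_t pid = fork();          /* the child inherits the attachment */
    if (pid == 0) {
        myshmPtr->y = 42;        /* write is visible to the parent */
        shmdt(myshmPtr);
        _exit(0);
    }
    if (pid > 0) {
        waitpid(pid, NULL, 0);   /* wait so the read below is ordered */
        result = myshmPtr->y;
    }

    shmdt(myshmPtr);                /* detach */
    shmctl(shmid, IPC_RMID, NULL);  /* mark the segment for removal */
    return result;
}
```

Here waitpid() provides the ordering between the two processes; without it the parent could read y before the child writes it.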
![Page 5: CSE 160 - Lecture 15](https://reader035.vdocuments.mx/reader035/viewer/2022062803/5681465b550346895db37b1f/html5/thumbnails/5.jpg)
Concurrent Access Problem
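Shared Memory Segment (both processes reach it through myshmPtr):
    int x;
    int y;
    int z;

Process 1:
    /* myshmPtr is assumed to be a char * to the start of the segment */
    ptrY = (int *) (myshmPtr + sizeof (int));
    *ptrY = 1;
    if (*ptrY > 0)
        (*ptrY)--;

Process 2:
    ptrY = (int *) (myshmPtr + sizeof (int));
    *ptrY = 1;
    if (*ptrY > 0)
        (*ptrY)--;

What value is y after these programs execute?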
![Page 6: CSE 160 - Lecture 15](https://reader035.vdocuments.mx/reader035/viewer/2022062803/5681465b550346895db37b1f/html5/thumbnails/6.jpg)
Mutual Exclusion
• In general, the temporal (time) order in which processes execute code relative to each other is unknown
• Portions of code that modify shared variables are called critical sections
  – Access to critical shared variables must be regulated so that only one process at a time may have access to the section
• This is called serialization of access or mutual exclusion
![Page 7: CSE 160 - Lecture 15](https://reader035.vdocuments.mx/reader035/viewer/2022062803/5681465b550346895db37b1f/html5/thumbnails/7.jpg)
Implementing Mutual Exclusion
• Spin locks
    while (lock == 1)
        /* wait */ ;
    lock = 1;
    <critical section>
    lock = 0;
• Busy waiting is inefficient
• Naïve implementation has pitfalls (how?)
![Page 8: CSE 160 - Lecture 15](https://reader035.vdocuments.mx/reader035/viewer/2022062803/5681465b550346895db37b1f/html5/thumbnails/8.jpg)
Atomic Operations
• Implementing locks, semaphores, and monitors requires atomic building blocks
    again:
        load  r0, <lock>
        cmp   r0, 0
        jne   again
        add   r0, 1
        store <lock>, r0
• A second process could be swapped in between the load and the store (or run simultaneously on an SMP)
• Need to make sure all operations complete without interruption (atomically)
![Page 9: CSE 160 - Lecture 15](https://reader035.vdocuments.mx/reader035/viewer/2022062803/5681465b550346895db37b1f/html5/thumbnails/9.jpg)
Test and Set
• CPU designers recognize this need and provide special hardware instructions
  – test and set
    • test for zero, set if zero, as one indivisible step
  – fetch and increment
    • fetch location and add one
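A sketch of how these two primitives can be reached from C, assuming a C11 compiler with <stdatomic.h> and POSIX threads. atomic_flag_test_and_set and atomic_fetch_add are the standard C11 operations; the names spin_flag, shared_total, tickets, and run_spin_demo are made up for this example.

```c
#include <pthread.h>
#include <stdatomic.h>

static atomic_flag spin_flag = ATOMIC_FLAG_INIT; /* clear means unlocked */
static long shared_total = 0;                    /* protected by spin_flag */
static atomic_int tickets = 0;                   /* updated without a lock */

static void spin_lock(void)
{
    /* Test and set: atomically sets the flag and returns its old value.
     * An old value of "set" means another thread holds the lock. */
    while (atomic_flag_test_and_set_explicit(&spin_flag, memory_order_acquire))
        ;  /* busy wait */
}

static void spin_unlock(void)
{
    atomic_flag_clear_explicit(&spin_flag, memory_order_release);
}

static void *worker(void *arg)
{
    (void) arg;
    for (int i = 0; i < 100000; i++) {
        spin_lock();
        shared_total++;                 /* critical section */
        spin_unlock();
        atomic_fetch_add(&tickets, 1);  /* fetch and increment */
    }
    return NULL;
}

/* Runs two workers and returns the lock-protected total. */
long run_spin_demo(void)
{
    pthread_t t[2];
    shared_total = 0;
    atomic_store(&tickets, 0);
    for (int i = 0; i < 2; i++)
        pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);
    return shared_total;
}
```

Because the test and the set happen as one step, two threads can never both observe the flag as clear, which is the pitfall of the naive spin lock.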
![Page 10: CSE 160 - Lecture 15](https://reader035.vdocuments.mx/reader035/viewer/2022062803/5681465b550346895db37b1f/html5/thumbnails/10.jpg)
Semaphores
• Introduced by Dijkstra.
  – Give a higher-level test-and-set semantic
• Two operations, P and V.
  – P(semaphore): if > 0, decrement semaphore; otherwise, wait
  – V(semaphore): increment semaphore by one
  – Semaphore initialized > 0
• Provides the functionality needed to implement mutual exclusion
• Standard OS construct
  – semget(), semctl(), semop() system calls
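To make the P and V semantics concrete, here is a sketch of a counting semaphore built from a Pthreads mutex and condition variable. This is not the System V semget()/semop() interface named above; the type counting_sem and the functions sem_setup, add_many, and run_semaphore_demo are assumptions for illustration only.

```c
#include <pthread.h>

typedef struct {
    int value;                 /* current semaphore count */
    pthread_mutex_t mutex;     /* protects value */
    pthread_cond_t positive;   /* signalled when value becomes > 0 */
} counting_sem;

void sem_setup(counting_sem *s, int initial)
{
    s->value = initial;
    pthread_mutex_init(&s->mutex, NULL);
    pthread_cond_init(&s->positive, NULL);
}

/* P: if the count is > 0, decrement it; otherwise wait. */
void P(counting_sem *s)
{
    pthread_mutex_lock(&s->mutex);
    while (s->value == 0)
        pthread_cond_wait(&s->positive, &s->mutex);
    s->value--;
    pthread_mutex_unlock(&s->mutex);
}

/* V: increment the count by one and wake one waiter, if any. */
void V(counting_sem *s)
{
    pthread_mutex_lock(&s->mutex);
    s->value++;
    pthread_cond_signal(&s->positive);
    pthread_mutex_unlock(&s->mutex);
}

static counting_sem guard;     /* initialized to 1: mutual exclusion */
static long shared_count;

static void *add_many(void *arg)
{
    (void) arg;
    for (int i = 0; i < 50000; i++) {
        P(&guard);
        shared_count++;        /* critical section */
        V(&guard);
    }
    return NULL;
}

/* Two threads increment the counter under the semaphore. */
long run_semaphore_demo(void)
{
    pthread_t t[2];
    sem_setup(&guard, 1);
    shared_count = 0;
    for (int i = 0; i < 2; i++)
        pthread_create(&t[i], NULL, add_many, NULL);
    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);
    return shared_count;
}
```

Initializing the count to 1 gives mutual exclusion; a larger initial count would admit that many holders at once.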
![Page 11: CSE 160 - Lecture 15](https://reader035.vdocuments.mx/reader035/viewer/2022062803/5681465b550346895db37b1f/html5/thumbnails/11.jpg)
More Mutual Exclusion
• Monitors
  – Higher-level than semaphores, making them less prone to error
  – To gain access to a shared resource, programs must always go through the monitor.
• Condition variables
  – Gain access to a resource when a particular condition occurs (more later).
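C has no monitor construct, so the following is only a sketch of the discipline: the shared state and its lock live in one structure, and callers touch the state only through functions that take the lock. The account example and every name in it are hypothetical.

```c
#include <pthread.h>

/* Monitor-style object: the balance is never touched directly,
 * only through the functions below. */
typedef struct {
    pthread_mutex_t mutex;
    long balance;
} account_monitor;

void account_init(account_monitor *a, long opening)
{
    pthread_mutex_init(&a->mutex, NULL);
    a->balance = opening;
}

void account_deposit(account_monitor *a, long amount)
{
    pthread_mutex_lock(&a->mutex);
    a->balance += amount;
    pthread_mutex_unlock(&a->mutex);
}

/* Returns 1 on success, 0 if funds are insufficient. The check and
 * the update happen under one lock, so they cannot be interleaved. */
int account_withdraw(account_monitor *a, long amount)
{
    int ok = 0;
    pthread_mutex_lock(&a->mutex);
    if (a->balance >= amount) {
        a->balance -= amount;
        ok = 1;
    }
    pthread_mutex_unlock(&a->mutex);
    return ok;
}

long account_balance(account_monitor *a)
{
    pthread_mutex_lock(&a->mutex);
    long b = a->balance;
    pthread_mutex_unlock(&a->mutex);
    return b;
}
```

Compared with bare semaphores, callers cannot forget a P or V because the locking is inside each entry point.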
![Page 12: CSE 160 - Lecture 15](https://reader035.vdocuments.mx/reader035/viewer/2022062803/5681465b550346895db37b1f/html5/thumbnails/12.jpg)
Threads
• For SMP, could always use heavyweight processes
  – Performance penalties
  – More burden on the programmer to manage shared structures (“pointer hell”)
• Threads allow concurrency within a single process
  – Lighter-weight access
![Page 13: CSE 160 - Lecture 15](https://reader035.vdocuments.mx/reader035/viewer/2022062803/5681465b550346895db37b1f/html5/thumbnails/13.jpg)
Processes and Threads
• Process includes address space.
• Thread is program counter and stack pointer.
• Process may have many threads.
• All the threads share the same address space.
• Processes are heavyweight, threads are lightweight.
• Processes/threads need not map one-to-one onto processors.
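A small sketch of threads sharing one address space, assuming POSIX threads. Each thread writes into a global array that the creating thread then reads, with no shared memory segment or pointer setup required. The names shared_slots, fill_slot, and run_sharing_demo are illustrative.

```c
#include <pthread.h>
#include <stdint.h>

#define NUM_THREADS 3

/* Global data: one copy, visible to every thread in the process. */
static int shared_slots[NUM_THREADS];

static void *fill_slot(void *arg)
{
    /* id is a local variable, so it lives on this thread's own stack. */
    int id = (int) (intptr_t) arg;
    shared_slots[id] = id * 10;   /* each thread writes a different slot */
    return NULL;
}

/* Starts the threads, waits for them, and sums what they wrote. */
int run_sharing_demo(void)
{
    pthread_t threads[NUM_THREADS];
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_create(&threads[i], NULL, fill_slot, (void *) (intptr_t) i);
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(threads[i], NULL);

    int total = 0;
    for (int i = 0; i < NUM_THREADS; i++)
        total += shared_slots[i];
    return total;
}
```

No lock is needed here only because each thread writes a distinct element and the reads happen after pthread_join().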
![Page 14: CSE 160 - Lecture 15](https://reader035.vdocuments.mx/reader035/viewer/2022062803/5681465b550346895db37b1f/html5/thumbnails/14.jpg)
Three Threads Within a Process
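[Diagram: one process with shared code (function f, function g), data, and heap; three threads, each with its own program counter (PC1, PC2, PC3), stack pointer (SP1, SP2, SP3), and stack (stack 1, stack 2, stack 3).]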
![Page 15: CSE 160 - Lecture 15](https://reader035.vdocuments.mx/reader035/viewer/2022062803/5681465b550346895db37b1f/html5/thumbnails/15.jpg)
Thread Execution Model
[Diagram: a pool of threads scheduled onto a pool of processors]
• Each thread of control can be scheduled by the OS when it is in a runnable state.
• Threads within one process can run concurrently
  • Mutual exclusion is very important
![Page 16: CSE 160 - Lecture 15](https://reader035.vdocuments.mx/reader035/viewer/2022062803/5681465b550346895db37b1f/html5/thumbnails/16.jpg)
Thread Execution Model: Key Points
• Pool of processors, pool of threads.
• Threads are peers.
• Dynamic thread creation.
• Can support many more threads than processors.
• Threads dynamically switch between processors.
• Threads share access to memory.
• Synchronization needed between threads.
![Page 17: CSE 160 - Lecture 15](https://reader035.vdocuments.mx/reader035/viewer/2022062803/5681465b550346895db37b1f/html5/thumbnails/17.jpg)
Why Use Threads?
• Representing Concurrent Entities
  – Concurrency is part of the problem specification.
  – Examples: systems programming and user interfaces.
  – Single or multiple processors.
  – This kind of multithreaded programming is difficult.
• Multiprocessing for Performance
  – Concurrency is under programmer’s control.
  – Programs could be written sequentially.
  – This kind of multithreaded programming should be easier.
![Page 18: CSE 160 - Lecture 15](https://reader035.vdocuments.mx/reader035/viewer/2022062803/5681465b550346895db37b1f/html5/thumbnails/18.jpg)
Commercial Thread Libraries
• Win32 threads (Windows NT and Windows 95).
• Pthreads (POSIX Thread Interface) (SGI IRIX, Sun Solaris, HP-UX, IBM AIX, Linux, etc.).
• Solaris threads (SunOS 5.x).
• All designed primarily for systems programming.
![Page 19: CSE 160 - Lecture 15](https://reader035.vdocuments.mx/reader035/viewer/2022062803/5681465b550346895db37b1f/html5/thumbnails/19.jpg)
Example: Pthreads
• POSIX Threads: available on many platforms
• Thread Management: pthread_create(), pthread_join(), pthread_exit(), pthread_kill(), pthread_cancel()
• Mutexes: pthread_mutex_init(), pthread_mutex_lock(), pthread_mutex_unlock(), pthread_mutex_trylock()
• Events: pthread_cond_init(), pthread_cond_wait(), pthread_cond_timedwait(), pthread_cond_signal()
• Scheduling: pthread_setschedparam(), pthread_attr_setschedpolicy()
![Page 20: CSE 160 - Lecture 15](https://reader035.vdocuments.mx/reader035/viewer/2022062803/5681465b550346895db37b1f/html5/thumbnails/20.jpg)
Condition Variables
• Would like to be “woken up” when a particular condition occurs
  – Calling pthread_cond_wait(cond, mutex) releases exclusive access to the mutex. Thread sleeps.
  – When the condition is signalled, the thread wakes up and is given access back to the mutex
![Page 21: CSE 160 - Lecture 15](https://reader035.vdocuments.mx/reader035/viewer/2022062803/5681465b550346895db37b1f/html5/thumbnails/21.jpg)
Conditional Waiting
action()
{
    lock();
    while (x != 0)
        wait (s);
    unlock();
}

counter()
{
    lock();
    x--;
    if (x==0)
        signal(s);
    unlock();
}

Both signal(s) and unlock() in counter() must occur before wait() returns
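The action()/counter() pattern above can be written with real Pthreads calls roughly as follows. This is a sketch: the starting value 3, the action_done flag, and the driver run_countdown_demo are assumptions added so the example can run on its own.

```c
#include <pthread.h>

static pthread_mutex_t lock_x = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t s = PTHREAD_COND_INITIALIZER;
static int x = 3;             /* shared countdown, protected by lock_x */
static int action_done = 0;   /* set once action() has seen x == 0 */

static void *action(void *arg)
{
    (void) arg;
    pthread_mutex_lock(&lock_x);
    /* pthread_cond_wait releases lock_x while sleeping and reacquires
     * it before returning; the loop rechecks the condition each time. */
    while (x != 0)
        pthread_cond_wait(&s, &lock_x);
    action_done = 1;
    pthread_mutex_unlock(&lock_x);
    return NULL;
}

static void counter(void)
{
    pthread_mutex_lock(&lock_x);
    x--;
    if (x == 0)
        pthread_cond_signal(&s);
    pthread_mutex_unlock(&lock_x);  /* the waiter can only return after this */
}

/* Starts one waiter, counts x down to zero, and reports whether the
 * waiter got through. */
int run_countdown_demo(void)
{
    pthread_t waiter;
    x = 3;
    action_done = 0;
    pthread_create(&waiter, NULL, action, NULL);
    for (int i = 0; i < 3; i++)
        counter();
    pthread_join(waiter, NULL);
    return action_done;
}
```

The while loop, rather than an if, matters: if the waiter starts late and x is already 0 it never sleeps, and if it is woken early it simply checks again.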
![Page 22: CSE 160 - Lecture 15](https://reader035.vdocuments.mx/reader035/viewer/2022062803/5681465b550346895db37b1f/html5/thumbnails/22.jpg)
A Simple Example: Array Summation
int array_sum(int n, int data[])
{
    int mid;
    int low_sum, high_sum;

    mid = n/2;
    low_sum = 0; high_sum = 0;
    #pragma multithreadable
    {
        for (int i = 0; i < mid; i++)
            low_sum = low_sum + data[i];
        for (int j = mid; j < n; j++)
            high_sum = high_sum + data[j];
    }
    return low_sum + high_sum;
}
![Page 23: CSE 160 - Lecture 15](https://reader035.vdocuments.mx/reader035/viewer/2022062803/5681465b550346895db37b1f/html5/thumbnails/23.jpg)
#include <pthread.h>

typedef struct {
    int n, *data, mid;
    int *high_sum, *low_sum;
} args_block;

/* Thread routine: sum the low half, data[0 .. mid-1]. */
void *sum_0(void *arg)
{
    args_block *args = arg;
    for (int i = 0; i < args->mid; i++)
        *args->low_sum = *args->low_sum + args->data[i];
    return NULL;
}

/* Thread routine: sum the high half, data[mid .. n-1]. */
void *sum_1(void *arg)
{
    args_block *args = arg;
    for (int j = args->mid; j < args->n; j++)
        *args->high_sum = *args->high_sum + args->data[j];
    return NULL;
}

int array_sum(int n, int data[])
{
    int mid;
    int low_sum = 0, high_sum = 0;   /* the threads accumulate into these */
    args_block args;
    pthread_t threads[2];

    mid = n/2;
    args.n = n; args.data = data; args.mid = mid;
    args.low_sum = &low_sum; args.high_sum = &high_sum;

    /* pthread_create(thread handle, attributes, routine to execute, thread args) */
    pthread_create(&threads[0], NULL, sum_0, &args);
    pthread_create(&threads[1], NULL, sum_1, &args);

    for (int i = 0; i < 2; i++)      /* wait for threads to complete */
        pthread_join(threads[i], NULL);
    return low_sum + high_sum;
}
![Page 24: CSE 160 - Lecture 15](https://reader035.vdocuments.mx/reader035/viewer/2022062803/5681465b550346895db37b1f/html5/thumbnails/24.jpg)
Commodity Multithreaded Applications
• Example Problems: Spreadsheets, CAD/CAM, simulation, video/photo editing and production, games, voice/handwriting recognition, real-time 3D rendering, job scheduling, etc.
• Need to run as fast as sequential on one processor.
• Need to run significantly faster on multiprocessors.
• No recompilation, no relinking, no reconfiguration.
• Need to adapt dynamically to changing resources.
• Need to be reliable and timely.
![Page 25: CSE 160 - Lecture 15](https://reader035.vdocuments.mx/reader035/viewer/2022062803/5681465b550346895db37b1f/html5/thumbnails/25.jpg)
Last Thoughts on Threading
• Threads provide a way to expose parallelism within a task.
• Advantages
  – Straightforward parallelism
  – Common construction (Java, Win32, Pthreads)
  – Shared variables eliminate copying
• Disadvantages
  – Mutual exclusion hard to think about
  – Not scalable to outside of a single SMP
    • (Active research to eliminate this)
![Page 26: CSE 160 - Lecture 15](https://reader035.vdocuments.mx/reader035/viewer/2022062803/5681465b550346895db37b1f/html5/thumbnails/26.jpg)
An Aside: Automatic Parallelization?
• Write a sequential program.
• Compiler transforms sequential program into efficient parallel (multithreaded) program
• A very very very very very very very difficult problem.
• Decades of work on this problem.
• Some success with some regular scientific programs.
• Not a general solution (and probably never will be).
• Not applicable to large, irregular, dynamic programs.
• Compilers must overuse locking to ensure correctness
• Compilers need help determining what code blocks can operate independently
  – OpenMP directives
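As one illustration of the kind of help a directive gives the compiler, here is a sketch of the array summation written with a standard OpenMP directive. The function name array_sum_omp is made up for this example, and the code assumes a compiler with OpenMP support enabled (for example, the -fopenmp flag with GCC); without it the pragma is ignored and the loop runs sequentially with the same result.

```c
/* Sums data[0 .. n-1]. The directive tells the compiler that the loop
 * iterations are independent and that the per-thread partial sums
 * should be combined with + when the loop finishes. */
int array_sum_omp(int n, const int data[])
{
    int sum = 0;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; i++)
        sum += data[i];
    return sum;
}
```

The reduction clause replaces the explicit low_sum/high_sum split and the thread management: the programmer states what is independent, and the compiler and runtime decide how to divide the work.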