threads concepts
2004 Deitel & Associates, Inc. All rights reserved.
Chapter 4 – Thread Concepts
Outline
4.1 Introduction
4.2 Definition of Thread
4.3 Motivation for Threads
4.4 Thread States: Life Cycle of a Thread
4.5 Thread Operations
4.6 Threading Models
  4.6.1 User-Level Threads
  4.6.2 Kernel-Level Threads
  4.6.3 Combining User- and Kernel-Level Threads
4.7 Thread Implementation Considerations
  4.7.1 Thread Signal Delivery
  4.7.2 Thread Termination
4.8 POSIX and Pthreads
4.9 Linux Threads
4.10 Windows XP Threads
4.11 Java Multithreading Case Study, Part 1: Introduction to Java Threads
Objectives
After reading this chapter, you should understand:
• the motivation for creating threads.
• the similarities and differences between processes and threads.
• the various levels of support for threads.
• the life cycle of a thread.
• thread signaling and cancellation.
• the basics of POSIX, Linux, Windows XP and Java threads.
Recent Developments in Processors
1. 32-bit processors are replaced by 64-bit processors. Software lags behind hardware by 2-4 years.
2. Multi-core processors will be dominant. It is much easier to increase the number of cores than to increase processor clock frequencies.
3. Netflix uses about 30% of Internet traffic at night.
Each core can run two threads currently.
Double Data Rate 3
Architecture of Intel 486 processor
Intel 32-bit 386 fixed-point processor
Intel 32-bit 387 floating-point processor
8KByte Cache memory & controller
Intel 486
Intel Pentium (P5): two ALUs and one floating-point unit
Data TLB
Instruction TLB
FADD Floating-point ADD
Intel Core Architecture
The Problem with Threads, Edward A. Lee, UC Berkeley, 2006[2]
Although threads seem to be a small step from sequential computation, in fact, they represent a huge step. They discard the most essential and appealing properties of sequential computation: understandability, predictability, and determinism. Threads, as a model of computation, are wildly non-deterministic, and the job of the programmer becomes one of pruning that non-determinism.
Single CPU
Massive Parallel Computer
Google Data Center
Hardware vs. Software
Hardware
• Inherently parallel
• Application specific
• Higher speed
Software
• Mostly serial
• Flexible
• Lower speed
Observations:
1. Transmitting a 2-hour MPEG movie of about 2 Gbytes requires a software server to send more than 1 million packets, because each packet carries at most 1,500 bytes of data (2 Gbytes / 1,500 bytes is roughly 1.3 million packets).
2. Higher-definition movies require 40 Gbytes or more.
3. In late 2013, the Federal Health Exchange website limited the number of users logged on at any one time to 50,000.
4.1 Introduction
• General-purpose languages such as Java, C#, Visual C++ .NET, Visual Basic .NET and Python have made concurrency primitives available to application programmers
• Multithreading
  – The programmer specifies that an application contains threads of execution
  – Each thread designates a portion of a program that may execute concurrently with other threads
Three-thread Word Processor
Kernel
Word processor
keyboard
mouse
printer
1. Accept inputs from a keyboard or mouse
2. Display text and graphics on the video monitor
3. Send outputs to a printer
Three-thread Word Processor
• Thread 1 interacts with the user.
• Thread 2 handles document formatting in the background.
• Thread 3 handles interfacing with a printer.
• As soon as a user deletes a sentence from page 1, Thread 1 tells Thread 2 to reformat the whole document. Meanwhile, Thread 1 continues to listen for input from the keyboard and mouse and to respond to the user’s commands, while Thread 2 computes the reformatting in the background.
• By the time the user wants to display another page for editing, Thread 2 might already have completed the reformatting.
• If the program were single-threaded, a printing task would cause commands from the keyboard and mouse to be ignored until the printing is done. (Assume that the printing does not use an interrupt-driven programming model.)
Three-thread Word Processor
• Thread 1 interacts with a user.
• Thread 2 reformats the document when commanded by Thread 1.
• Thread 3 outputs to a printer when commanded by Thread 1.
• Three separate processes would not work here, because all three need access to the same document in memory.
• Because the 3 threads share a common memory, all of them have access to the document being edited.
Two-thread Web Server Process
Kernel
Web server process
Web page cache
Dispatcher thread
worker thread
Network connection
Two-thread Web Server Process
• Requests for pages come in and the requested pages are sent back to the client.
• At most web sites, some pages are more commonly accessed than other pages. Web servers store these heavily used pages in the main (cache) memory to eliminate the need to go to a hard disk to get them.
• The Dispatcher Thread reads the requests for work from the network. After examining a request, it chooses an idle Worker Thread to handle it by writing a pointer to the message into a special word associated with each thread.
• The Dispatcher Thread then wakes up the sleeping Worker Thread, moving it from the blocked state to the ready state.
Two-thread Web Server Process
• When the Worker Thread wakes up, it checks whether the request can be satisfied from the Web page cache, to which all threads have access. If not, it starts a readDisk operation to get the page from the disk and blocks until the disk operation is complete.
• This model allows the server to be written as a collection of sequential threads.
• If the web server were written as a single-threaded program, its main loop would get a request, examine it, and carry it out before getting the next one. While waiting for the disk, the program would be idle and would not process any other incoming requests.
Two-thread Web Server Process
Dispatcher thread
while (TRUE) {
    get_next_request(&buf);
    handoff_work(&buf);
}
Worker thread
while (TRUE) {
    wait_for_work(&buf);
    look_for_page_in_cache(&buf, &page);
    if (page_not_in_cache(&page))
        read_page_from_disk(&page);
    return_page(&page);
}
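The dispatcher/worker outline above can be turned into a small runnable sketch. The version below is one possible rendering, not the slide author's implementation: it assumes a shared request queue protected by a mutex and a condition variable in place of the per-thread "special word", and the names `WebServer`, `serve_all`, and the fake `read_page_from_disk` are hypothetical stand-ins for a real network and disk.

```cpp
#include <cassert>
#include <condition_variable>
#include <map>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <vector>

// Hypothetical in-memory stand-ins for the network, the page cache and the disk.
struct WebServer {
    std::mutex m;
    std::condition_variable work_ready;
    std::queue<std::string> requests;          // work handed off by the dispatcher
    std::map<std::string, std::string> cache;  // Web page cache shared by all threads
    std::vector<std::string> served;           // pages "returned" to clients
    bool shutting_down = false;

    // Dispatcher side: queue a request and wake one sleeping worker.
    void handoff_work(const std::string& url) {
        {
            std::lock_guard<std::mutex> lock(m);
            requests.push(url);
        }
        work_ready.notify_one();
    }

    // Stand-in for a blocking disk read (assumption: no real I/O here).
    static std::string read_page_from_disk(const std::string& url) {
        return "<html>" + url + "</html>";
    }

    // Worker side: block until work arrives, then serve it.
    void worker() {
        for (;;) {
            std::unique_lock<std::mutex> lock(m);
            work_ready.wait(lock, [this] { return shutting_down || !requests.empty(); });
            if (requests.empty()) return;  // shutting down and nothing left to do
            std::string url = requests.front();
            requests.pop();
            std::string page;
            auto hit = cache.find(url);
            if (hit != cache.end()) {
                page = hit->second;
            } else {
                lock.unlock();  // do not hold the lock while "reading the disk"
                page = read_page_from_disk(url);
                lock.lock();
                cache[url] = page;
            }
            served.push_back(page);
        }
    }
};

// The calling thread plays the dispatcher; n_workers threads play the workers.
std::vector<std::string> serve_all(const std::vector<std::string>& urls, int n_workers) {
    assert(n_workers > 0);
    WebServer s;
    std::vector<std::thread> workers;
    for (int i = 0; i < n_workers; ++i) workers.emplace_back(&WebServer::worker, &s);
    for (const auto& u : urls) s.handoff_work(u);
    {
        std::lock_guard<std::mutex> lock(s.m);
        s.shutting_down = true;
    }
    s.work_ready.notify_all();
    for (auto& t : workers) t.join();
    return s.served;
}
```

Calling `serve_all({"/a", "/b", "/a"}, 2)` returns three pages; the order depends on how the workers are scheduled, which is the non-determinism the earlier quotation warns about.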
4.2 Definition of Thread
• Thread
  – Lightweight process (LWP)
  – Thread of instructions or thread of control
  – Shares address space and other global information with its process
  – Registers, stack, signal masks and other thread-specific data are local to each thread
• Threads may be managed by the operating system or by a user application
• Examples: Win32 threads, C-threads, Pthreads
Figure 4.1 Thread Relationship to Processes.
4.2 Definition of Thread
TSD = thread-specific data
4.3 Motivation for Threads
Threads have become prominent due to trends in
  – Software design
    • More naturally expresses inherently parallel tasks
  – Performance
    • Scales better to multiprocessor systems (each thread can be executed by a processor)
  – Cooperation
    • Shared address space incurs less overhead than IPC
Benefits of Threads
1. Responsive to users’ inputs: a multi-thread process will continue to run even if part of it (a thread) is blocked or it is performing a lengthy operation.
2. Resource Sharing: Threads share the memory and the resources of the process
3. Economy: it takes less processor time to create and manage threads than processes.
4. Scalability: threads can run concurrently on different processing cores, while a process with a single thread can run on only one core.
4.3 Motivation for Threads
• Each thread transitions among a series of discrete thread states
• Threads and processes have many operations in common (e.g. create, exit, resume, and suspend)
• Thread creation does not require the operating system to initialize resources that are shared between a parent process and its threads
  – Reduces the overhead of thread creation and termination compared to process creation and termination
4.4 Thread States: Life Cycle of a Thread
Thread states
  – Born state
  – Ready state (runnable state)
  – Running state
  – Dead state
  – Blocked state
  – Waiting state
  – Sleeping state
    • Sleep interval specifies for how long a thread will sleep
4.4 Thread States: Life Cycle of a Thread
Example 1: Multi-threaded sorting application
Original array: 7 12 19 3 18 4 2 6 15 8
Sort thread 0 (first half): 7 12 19 3 18 -> 3 7 12 18 19
Sort thread 1 (second half): 4 2 6 15 8 -> 2 4 6 8 15
Merge thread: 2 3 4 6 7 8 12 15 18 19
Example 1: Multi-threaded sorting application
• Assume that an array a[n] with n entries is to be sorted. It is stored in a global array to be accessed by all threads.
• Sort Thread 0 sorts the first half of the array, a[0] to a[n/2 - 1].
• Sort Thread 1 sorts the second half of the array, a[n/2] to a[n - 1].
• The merge thread combines the two sorted sub-arrays into one array, b[n], which is another global array.
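void merge(int a[], int n, int b[])
{
    // Purpose: merge two sorted sub-arrays, a[0 .. n/2-1] and a[n/2 .. n-1], into array b.
    int i0 = 0;      // next unread element of the first half
    int i1 = n / 2;  // next unread element of the second half
    for (int i = 0; i < n; i++)
    {
        // Take from the first half if the second half is exhausted, or if
        // both halves have elements left and the first half's is not larger.
        if (i1 >= n || (i0 < n / 2 && a[i0] <= a[i1]))
            b[i] = a[i0++];
        else
            b[i] = a[i1++];
    }
}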
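As a rough illustration of how the two sort threads and the merge thread fit together, here is a sketch using C++ `std::thread`. It assumes the standard library's `std::sort` and `std::merge` in place of hand-written routines, and the function names `sort_range` and `threaded_sort` are made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <thread>
#include <vector>

// Sorts a[lo, hi) in place. Each sort thread gets a disjoint half,
// so the two threads never touch the same element.
static void sort_range(std::vector<int>& a, std::size_t lo, std::size_t hi) {
    std::sort(a.begin() + lo, a.begin() + hi);
}

std::vector<int> threaded_sort(std::vector<int> a) {
    const std::size_t n = a.size();
    const std::size_t mid = n / 2;
    std::vector<int> b(n);

    std::thread sort0(sort_range, std::ref(a), std::size_t{0}, mid);  // first half
    std::thread sort1(sort_range, std::ref(a), mid, n);               // second half
    sort0.join();  // the merge must not start until both halves are sorted
    sort1.join();

    std::thread merger([&] {
        std::merge(a.begin(), a.begin() + mid, a.begin() + mid, a.end(), b.begin());
    });
    merger.join();

    assert(std::is_sorted(b.begin(), b.end()));
    return b;
}
```

With the array from the slide, `threaded_sort({7, 12, 19, 3, 18, 4, 2, 6, 15, 8})` yields `2 3 4 6 7 8 12 15 18 19`. The two joins before the merge are the only synchronization needed, because the sort threads work on disjoint ranges.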
Example 2 Multi-threaded Sudoku Solution Validator
Thread to check that each row contains 1 to 9.
Thread to check that each column contains 1 to 9.
Thread to check that each 3*3 block contains 1 to 9.
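bool checkDigit(int a[][9], int rowStart, int rowEnd, int columnStart, int columnEnd)
{
    // Purpose: check that a given row, column or 3*3 block contains each of 1 to 9 exactly once.
    // The region covers rows rowStart .. rowEnd-1 and columns columnStart .. columnEnd-1.
    int count[9];
    for (int i = 0; i < 9; i++) count[i] = 0;
    for (int row = rowStart; row < rowEnd; row++)
        for (int column = columnStart; column < columnEnd; column++)
        {
            int digit = a[row][column];
            if (digit < 1 || digit > 9) return false;  // avoid indexing outside count[]
            count[digit - 1]++;
        }
    for (int i = 0; i < 9; i++) if (count[i] != 1) return false;
    return true;
}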
Check row, column and block
(rowEnd and columnEnd are exclusive, matching the "<" comparisons in the loops of checkDigit)
                     rowStart  rowEnd  columnStart  columnEnd
Check Row 0             0        1         0           9
Check Column 0          0        9         0           1
Check Column 8          0        9         8           9
Check first block       0        3         0           3
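The three-thread validator can be sketched as follows. This is an illustrative version under a few assumptions: the grid is a `std::array` rather than a raw 2-D array, region bounds are half-open, and the names `Grid`, `region_ok`, and `sudoku_valid` are invented for the example. Each thread writes only its own result flag, so no lock is needed.

```cpp
#include <array>
#include <cassert>
#include <thread>

using Grid = std::array<std::array<int, 9>, 9>;

// True if the half-open region [r0, r1) x [c0, c1) holds each of 1..9 exactly once.
static bool region_ok(const Grid& g, int r0, int r1, int c0, int c1) {
    assert((r1 - r0) * (c1 - c0) == 9);  // a row, a column, or a 3*3 block
    int count[9] = {0};
    for (int r = r0; r < r1; ++r)
        for (int c = c0; c < c1; ++c) {
            int v = g[r][c];
            if (v < 1 || v > 9) return false;
            ++count[v - 1];
        }
    for (int k = 0; k < 9; ++k)
        if (count[k] != 1) return false;
    return true;
}

bool sudoku_valid(const Grid& g) {
    bool rows_ok = true, cols_ok = true, blocks_ok = true;  // one flag per thread

    std::thread rows([&] {
        for (int r = 0; r < 9; ++r) rows_ok = rows_ok && region_ok(g, r, r + 1, 0, 9);
    });
    std::thread cols([&] {
        for (int c = 0; c < 9; ++c) cols_ok = cols_ok && region_ok(g, 0, 9, c, c + 1);
    });
    std::thread blocks([&] {
        for (int r = 0; r < 9; r += 3)
            for (int c = 0; c < 9; c += 3)
                blocks_ok = blocks_ok && region_ok(g, r, r + 3, c, c + 3);
    });

    rows.join();
    cols.join();
    blocks.join();
    return rows_ok && cols_ok && blocks_ok;  // flags are read only after the joins
}
```

The grid is only read by the threads, which is why sharing it without synchronization is safe here.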
Example 3: MPEG 8*8 Block Discrete Cosine Transform
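Example 3: MPEG 8*8 Block Discrete Cosine Transform
• A 640*480 image is divided into 80*60 blocks of 8 rows * 8 pixels/row.
• Each 8*8 block is transformed by its own DCT, independently of the other blocks. Thus each 8*8 DCT can be performed by one thread.
X(k_1, k_2) = \frac{1}{4} C(k_1) C(k_2) \sum_{n_1=0}^{7} \sum_{n_2=0}^{7} x(n_1, n_2) \cos\frac{(2 n_1 + 1) k_1 \pi}{16} \cos\frac{(2 n_2 + 1) k_2 \pi}{16}, with C(0) = 1/\sqrt{2} and C(k) = 1 for k > 0
• Some blocks are 16 rows * 16 pixels/row.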
Amdahl’s Law
If an N-core system runs an application in which a fraction S of the work is serial,
Speedup <= 1/[S + (1-S)/N]
Example: an application with a 40% serial component,
2-core: speedup <= 1/[0.4 + 0.6/2] = 1/0.7 ≈ 1.43
4-core: speedup <= 1/[0.4 + 0.6/4] = 1/0.55 ≈ 1.82
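The bound is easy to evaluate for other core counts. The helper below is a minimal sketch; the name `amdahl_speedup` is chosen for this example only.

```cpp
#include <cassert>

// Upper bound on speedup for serial fraction s (between 0 and 1) on n cores.
double amdahl_speedup(double s, int n) {
    assert(s >= 0.0 && s <= 1.0 && n >= 1);
    return 1.0 / (s + (1.0 - s) / n);
}
```

For the 40% serial example, `amdahl_speedup(0.4, 2)` is about 1.43 and `amdahl_speedup(0.4, 4)` is about 1.82. As the number of cores grows, the bound approaches 1/S = 2.5, so the serial portion caps the benefit of adding cores.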
Example of Serial Code
Simulation programs are mostly serial.
Fibonacci sequence
F[n] = F[n-1] + F[n-2]
F[0] =0, F[1] =1,
Using the above recursive definition, F[n] can’t be calculated before F[n-1] and F[n-2] are calculated.
void fibonacci(int F[], int n)
{
    F[0] = 0;
    F[1] = 1;
    for (int i = 2; i <= n; i++)
        F[i] = F[i-1] + F[i-2];
}
Dependency diagram: F[2] depends on F[1] and F[0]; F[3] depends on F[2] and F[1].
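Example of Parallel Code
Serial loop:
for (int i = 0; i < n; i++) a[i] = b[i] + c[i];
First half of the iterations:
for (int i = 0; i < n/2; i++)
    a[i] = b[i] + c[i];
Second half of the iterations:
for (int i = n/2; i < n; i++)
    a[i] = b[i] + c[i];
For n = 4, the first loop computes a[0] = b[0] + c[0] and a[1] = b[1] + c[1], while the second computes a[2] = b[2] + c[2] and a[3] = b[3] + c[3].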
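Because no iteration depends on another, the two half-loops can run on separate threads. The sketch below shows one way to do that with `std::thread`; the names `add_range` and `parallel_add` are assumptions made for this example.

```cpp
#include <cassert>
#include <functional>
#include <thread>
#include <vector>

// Computes a[i] = b[i] + c[i] for i in [lo, hi).
static void add_range(const std::vector<int>& b, const std::vector<int>& c,
                      std::vector<int>& a, std::size_t lo, std::size_t hi) {
    for (std::size_t i = lo; i < hi; ++i) a[i] = b[i] + c[i];
}

std::vector<int> parallel_add(const std::vector<int>& b, const std::vector<int>& c) {
    assert(b.size() == c.size());
    const std::size_t n = b.size();
    std::vector<int> a(n);
    // Each thread writes a disjoint half of a, so no locking is required.
    std::thread t0(add_range, std::cref(b), std::cref(c), std::ref(a), std::size_t{0}, n / 2);
    std::thread t1(add_range, std::cref(b), std::cref(c), std::ref(a), n / 2, n);
    t0.join();
    t1.join();
    return a;
}
```

Contrast this with the Fibonacci loop, where each iteration needs the results of the two before it and so cannot be split this way.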
Matrix Multiplication A = B * C
// A is n x m, B is n x p and C is p x m; in the square case n = m = p = N.
int A[N][N], B[N][N], C[N][N];
for (int i = 0; i < n; i++) {
    for (int j = 0; j < m; j++)
    {
        A[i][j] = 0;
        for (int k = 0; k < p; k++)
            A[i][j] += B[i][k] * C[k][j];
    }
}
Parallel Code for Matrix Multiplication A = B * C
int A[N][N], B[N][N], C[N][N];
First half of the rows of A:
for (int i = 0; i < n/2; i++) {
    for (int j = 0; j < m; j++)
    {
        A[i][j] = 0;
        for (int k = 0; k < p; k++)
            A[i][j] += B[i][k] * C[k][j];
    }
}
Second half of the rows of A:
for (int i = n/2; i < n; i++) {
    for (int j = 0; j < m; j++)
    {
        A[i][j] = 0;
        for (int k = 0; k < p; k++)
            A[i][j] += B[i][k] * C[k][j];
    }
}
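The row split generalizes from two halves to any number of threads. The following sketch assumes matrices stored as vectors of rows and uses illustrative names (`Matrix`, `multiply_rows`, `parallel_multiply`); it is one possible arrangement, not the only one.

```cpp
#include <cassert>
#include <thread>
#include <vector>

using Matrix = std::vector<std::vector<int>>;

// Computes rows [row_lo, row_hi) of A = B * C.
static void multiply_rows(const Matrix& B, const Matrix& C, Matrix& A,
                          std::size_t row_lo, std::size_t row_hi) {
    const std::size_t m = C[0].size(), p = C.size();
    for (std::size_t i = row_lo; i < row_hi; ++i)
        for (std::size_t j = 0; j < m; ++j) {
            int sum = 0;
            for (std::size_t k = 0; k < p; ++k) sum += B[i][k] * C[k][j];
            A[i][j] = sum;
        }
}

Matrix parallel_multiply(const Matrix& B, const Matrix& C, unsigned n_threads) {
    assert(n_threads > 0 && !B.empty() && !C.empty() && B[0].size() == C.size());
    const std::size_t n = B.size();
    Matrix A(n, std::vector<int>(C[0].size(), 0));
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < n_threads; ++t) {
        // Thread t owns a contiguous band of rows; bands do not overlap.
        std::size_t lo = n * t / n_threads, hi = n * (t + 1) / n_threads;
        pool.emplace_back([&, lo, hi] { multiply_rows(B, C, A, lo, hi); });
    }
    for (auto& th : pool) th.join();
    return A;
}
```

With `n_threads = 2` this reproduces the two-loop split shown above. B and C are only read, and each thread writes its own rows of A, so the threads need no locks.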
4.5 Thread Operations
• Threads and processes have common operations
  – Create
  – Exit (terminate)
  – Suspend
  – Resume
  – Sleep
  – Wake
4.5 Thread Operations
• Thread operations do not correspond precisely to process operations
  – Cancel
    • Indicates that a thread should be terminated, but does not guarantee that the thread will be terminated
    • Threads can mask the cancellation signal
  – Join
    • A primary thread can wait for all other threads to exit by joining them
    • The joining thread blocks until the thread it joined exits
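Standard C++ threads have no cancel operation, so the sketch below only approximates the idea: an atomic flag stands in for the cancellation request, and the worker decides when to honor it. The function name `run_until_cancelled` is hypothetical. The join at the end shows the primary thread blocking until the worker exits.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Returns how many work units the worker finished before honoring the request.
long run_until_cancelled(std::chrono::milliseconds run_for) {
    assert(run_for.count() >= 0);
    std::atomic<bool> cancel_requested{false};
    long units_done = 0;

    std::thread worker([&] {
        // The worker checks the flag only between units of work, so a
        // request never interrupts a unit halfway through.
        do {
            ++units_done;
        } while (!cancel_requested.load());
    });

    std::this_thread::sleep_for(run_for);
    cancel_requested.store(true);  // a request, not a forced termination
    worker.join();                 // block until the worker actually exits
    return units_done;             // safe to read: join() synchronizes with the worker
}
```

Letting the worker choose its own stopping point is what keeps shared data consistent, at the cost of a delay between the request and the actual exit.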
4.6 Threading Models
• Three most popular threading models
  – User-level threads
  – Kernel-level threads
  – Combination of user- and kernel-level threads
4.6.1 User-level Threads
• User-level threads perform threading operations in user space
  – Threads are created by runtime libraries that cannot execute privileged instructions or access kernel primitives directly
• User-level thread implementation
  – Many-to-one thread mappings
    • Operating system maps all threads in a multithreaded process to a single execution context
• Advantages
  – A user-level library can schedule its threads to optimize performance
  – Synchronization is performed outside the kernel, which avoids context switches
  – More portable
• Disadvantage
  – Kernel views a multithreaded process as a single thread of control
    • Can lead to suboptimal performance if a thread issues I/O
    • Cannot be scheduled on multiple processors at once
Figure 4.3 User-level threads.
4.6.1 User-level Threads
4.6.2 Kernel-level Threads
• Kernel-level threads attempt to address the limitations of user-level threads by mapping each thread to its own execution context
  – Kernel-level threads provide a one-to-one thread mapping
    • Advantages: Increased scalability, interactivity, and throughput
    • Disadvantages: Overhead due to context switching and reduced portability due to OS-specific APIs
• Kernel-level threads are not always the optimal solution for multithreaded applications
Figure 4.4 Kernel-level threads.
4.6.2 Kernel-level Threads
4.6.3 Combining User- and Kernel-level Threads
• The combination of user- and kernel-level thread implementation
  – Many-to-many thread mapping (m-to-n thread mapping)
    • Number of user and kernel threads need not be equal
    • Can reduce overhead compared to one-to-one thread mappings by implementing thread pooling
• Worker threads
  – Persistent kernel threads that occupy the thread pool
  – Improves performance in environments where threads are frequently created and destroyed
  – Each new thread is executed by a worker thread
• Scheduler activation
  – Technique that enables a user-level library to schedule its threads
  – Occurs when the operating system calls a user-level threading library that determines if any of its threads need rescheduling
Figure 4.5 Hybrid threading model.
4.6.3 Combining User- and Kernel-level Threads
4.7.1 Thread Signal Delivery
• Two types of signals
  – Synchronous
    • Occur as a direct result of program execution
    • Should be delivered to the currently executing thread
  – Asynchronous
    • Occur due to an event typically unrelated to the current instruction
    • Threading library must determine each signal’s recipient so that asynchronous signals are delivered properly
• Each thread is usually associated with a set of pending signals that are delivered when it executes
• A thread can mask all signals except those that it wishes to receive
Figure 4.6 Signal masking.
4.7.1 Thread Signal Delivery
4.7.2 Thread Termination
• Thread termination (cancellation)
  – Differs between thread implementations
  – Prematurely terminating a thread can cause subtle errors in processes because multiple threads share the same address space
  – Some thread implementations allow a thread to determine when it can be terminated, to prevent the process from entering an inconsistent state
4.8 POSIX and Pthreads
• Threads that use the POSIX threading API are called Pthreads
  – POSIX states that processor registers, stack and signal mask are maintained individually for each thread
  – POSIX specifies how operating systems should deliver signals to Pthreads in addition to specifying several thread-cancellation modes
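For a feel of the Pthreads API itself, here is a minimal sketch that creates a few threads with `pthread_create` and waits for them with `pthread_join`. The `Task` struct, the `square` start routine, and `sum_of_squares` are illustrative names, and on some systems the program must be built with the `-pthread` compiler option.

```cpp
#include <cassert>
#include <pthread.h>

struct Task {
    int input;
    int output;
};

// Thread start routine: Pthreads passes a single void* argument.
static void* square(void* arg) {
    Task* t = static_cast<Task*>(arg);
    t->output = t->input * t->input;
    return nullptr;
}

const int kThreads = 4;

// Squares 1..kThreads, one value per thread, and returns the sum.
int sum_of_squares() {
    pthread_t tid[kThreads];
    Task task[kThreads];
    for (int i = 0; i < kThreads; ++i) {
        task[i] = Task{i + 1, 0};
        int rc = pthread_create(&tid[i], nullptr, square, &task[i]);
        assert(rc == 0);  // pthread_create returns 0 on success
        (void)rc;
    }
    int sum = 0;
    for (int i = 0; i < kThreads; ++i) {
        pthread_join(tid[i], nullptr);  // wait for thread i before reading its result
        sum += task[i].output;
    }
    return sum;
}
```

Each thread receives a pointer to its own `Task`, so the threads share the process's address space without ever writing to the same location.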
4.9 Linux Threads
• Linux allocates the same type of process descriptor to processes and threads (tasks)
• Linux uses the UNIX-based system call fork to spawn child tasks
• To enable threading, Linux provides a modified version named clone
  – clone accepts arguments that specify which resources to share with the child task
Figure 4.7 Linux task state-transition diagram.
4.9 Linux Threads
4.10 Windows XP Threads
• Threads
  – Actual unit of execution dispatched to a processor
  – Execute a piece of the process’s code in the process’s context, using the process’s resources
  – Execution context contains
    • Runtime stack
    • State of the machine’s registers
    • Several attributes
4.10 Windows XP Threads
• Windows XP threads can create fibers
  – A fiber is scheduled for execution by the thread that creates it, rather than by the scheduler
• Windows XP provides each process with a thread pool that consists of a number of worker threads, which are kernel threads that execute functions specified by user threads
Figure 4.8 Windows XP thread state-transition diagram.
4.10 Windows XP Threads
4.11 Java Multithreading Case Study, Part I: Introduction to Java Threads
• Java allows the application programmer to create threads that can port to many computing platforms
• Threads
  – Created by class Thread
  – Execute code specified in a Runnable object’s run method
• Java supports operations such as naming, starting and joining threads
Figure 4.9 Java threads being created, starting, sleeping and printing. (Parts 1 to 4 of 4.)
Reference
• Andrew Tanenbaum, “Modern Operating Systems”, 2nd Edition, Prentice-Hall, 2001.