
Page 1: CS 620 Advanced Operating Systems

CS 620 Advanced Operating Systems

Lecture 5 – Processes

Professor Timothy Arndt

BU 331

Page 2: CS 620 Advanced Operating Systems

Threads

• Motivation: keep programs interactive.
  • Programs may block waiting for I/O.
  • This is a bad thing if the program is interactive.
• Background save (sketched below).
  • Consider saving a large file to a slow disk system.
  • The user has to wait for the save operation to finish before proceeding.
• Multiple clients.
  • Consider also a server program with multiple clients.
  • We must fork off a new process for each client.
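A minimal sketch of the background-save case using POSIX threads: the interactive thread hands the slow disk write to a worker thread and returns immediately. The structure and names (save_request, do_save, start_background_save) are illustrative, not from the lecture.

```c
/* Minimal sketch (POSIX threads): run a long save in a background
 * thread so the interactive part of the program is not blocked. */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

struct save_request {
    const char *path;      /* where to write */
    const void *data;      /* buffer to save */
    size_t      len;
};

static void *do_save(void *arg)
{
    struct save_request *req = arg;
    FILE *f = fopen(req->path, "wb");
    if (f) {
        fwrite(req->data, 1, req->len, f);   /* slow disk write happens here */
        fclose(f);
    }
    free(req);
    return NULL;
}

/* Called from the interactive thread: returns immediately. */
int start_background_save(const char *path, const void *data, size_t len)
{
    struct save_request *req = malloc(sizeof *req);
    if (!req)
        return -1;
    req->path = path; req->data = data; req->len = len;

    pthread_t tid;
    if (pthread_create(&tid, NULL, do_save, req) != 0) {
        free(req);
        return -1;
    }
    pthread_detach(tid);   /* no join needed; the thread cleans up on exit */
    return 0;
}
```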

Page 3: CS 620 Advanced Operating Systems

Threads

• Basic idea: multiple lightweight processes.
  • Traditional processes are created very slowly and use many system resources.
  • Threads are similar to processes: they have their own stack and PC.
• Single address space.
  • By sharing a single address space, there is no need to have separate (costly) address spaces.
  • Since threads of a single program cooperate (unlike processes of separate programs), it is possible for them to share an address space.

Page 4: CS 620 Advanced Operating Systems

Thread Usage in Nondistributed Systems

• Context switching as the result of IPC

Page 5: CS 620 Advanced Operating Systems

Threads

• Threads also share: open files, child processes, timers, etc.
• Synchronization primitives are available.
  • Necessary since threads share a common memory.
  • Semaphores, mutexes, etc.
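A minimal sketch of why these primitives are needed and how one is used: a pthread mutex serializing updates to a counter that all threads of the process share. Names are illustrative.

```c
/* Minimal sketch: a mutex protecting a counter shared by all threads
 * in one address space. */
#include <pthread.h>

static long shared_counter = 0;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

void increment_shared_counter(void)
{
    pthread_mutex_lock(&counter_lock);    /* serialize access to shared memory */
    shared_counter++;
    pthread_mutex_unlock(&counter_lock);
}
```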

Page 6: CS 620 Advanced Operating Systems

Threads

• So what’s a thread?
  • A stack
  • A program counter
  • What about I/O, signals, and global variables (like errno)?

Page 7: CS 620 Advanced Operating Systems

Threads

• Are threads managed at the user or the kernel level?
  • User threads have a very cheap task context switch.
  • Kernel threads handle blocking I/O cleanly.
  • In order for user threads not to block, we need extended models for I/O.
    • E.g. select() indicates which files are ready to transfer data, so we don’t block (sketched below).
  • A hybrid is also possible.
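A minimal sketch of the select() idea: a user-level thread package can poll whether a descriptor is ready instead of letting the whole process block in read(). The helper name and timeout handling are illustrative; error handling is abbreviated.

```c
/* Minimal sketch: use select() to ask which descriptors are ready,
 * so a user-level thread can avoid a blocking read(). */
#include <sys/select.h>
#include <unistd.h>

/* Returns 1 if fd has data to read within the timeout, 0 if not, -1 on error. */
int fd_ready_to_read(int fd, int timeout_ms)
{
    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    struct timeval tv = {
        .tv_sec  = timeout_ms / 1000,
        .tv_usec = (timeout_ms % 1000) * 1000,
    };

    int n = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (n < 0)
        return -1;                      /* select() failed */
    return FD_ISSET(fd, &readfds) ? 1 : 0;
}
```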

Page 8: CS 620 Advanced Operating Systems

Thread Implementation

• Combining kernel-level lightweight processes and user-level threads.

Page 9: CS 620 Advanced Operating Systems

Threads

• Preemption can be implemented via signals.
• Should user-level threads be preempted?
  • It is an easier programming model if threads yield() the processor.
  • But it is a nuisance to program with extra yield() calls.
  • Preemption can be controlled with special no-preempt regions.

Page 10: CS 620 Advanced Operating Systems

Threads

• So how do you use threads?
  • User interface, computation, and I/O handled separately (think of a browser).
  • Pop-up server threads.
  • On multiprocessor systems, we can have threads working in parallel on the multiple processors as an alternative to shared-memory IPC.

Page 11: CS 620 Advanced Operating Systems

Multithreaded Servers (1)

• A multithreaded server organized in a dispatcher/worker model.
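A minimal sketch of the dispatcher/worker organization, assuming POSIX threads and sockets: the dispatcher accepts connections and hands each one to a worker thread. handle_client is an illustrative name, and a production server would likely use a thread pool rather than one thread per request.

```c
/* Minimal dispatcher/worker sketch: the dispatcher thread accepts
 * connections; each connection is handled by a detached worker thread. */
#include <pthread.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <unistd.h>

static void *handle_client(void *arg)
{
    int client_fd = *(int *)arg;
    free(arg);
    /* ... read the request, do the work, write the reply ... */
    close(client_fd);
    return NULL;
}

/* Dispatcher loop: listen_fd is an already bound and listening socket. */
void dispatcher(int listen_fd)
{
    for (;;) {
        int fd = accept(listen_fd, NULL, NULL);
        if (fd < 0)
            continue;

        int *arg = malloc(sizeof *arg);
        if (!arg) { close(fd); continue; }
        *arg = fd;

        pthread_t worker;
        if (pthread_create(&worker, NULL, handle_client, arg) == 0) {
            pthread_detach(worker);       /* worker cleans up on exit */
        } else {
            free(arg);
            close(fd);
        }
    }
}
```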

Page 12: CS 620 Advanced Operating Systems

Threads in Windows

• Each process in Windows contains one or more threads.
  • Threads are the executable units in Windows, not processes.
• Threads have the following attributes:
  • The PID of the process that owns the thread.
  • A numeric base priority specifying its importance relative to other threads.
  • A dynamic priority.
  • Its execution time so far.
  • An allowed processor set.
  • An exit status.

Page 13: CS 620 Advanced Operating Systems

Threads in Windows

• The Windows Kernel schedules threads and handles interrupts and exceptions.
  • The Kernel schedules or dispatches threads for the processor in order of priority.
  • It also preempts threads of lower priority in favor of threads of higher priority.
  • It can force context switches, directing the processor to drop one task and pick up another.
  • Therefore code operating in this system must be reentrant (able to be interrupted and resumed unharmed, and shared by different threads executing the code on different processors).

Page 14: CS 620 Advanced Operating Systems

Threads in Windows

• The Kernel’s own code does not, technically, run in threads.
  • Hence it is the only part of the OS that is not preemptible or pageable.
  • The rest of the threads in Windows are preemptible and fully reentrant.
  • Code which is non-reentrant can cause serialization, damaging the performance of the OS on SMP machines.
• The Kernel schedules ready threads for processor time based upon their dynamic priority, a number from 1 to 31.

Page 15: CS 620 Advanced Operating Systems

Threads in Windows

• The highest priority thread always runs on the processor, even if this requires that a lower-priority thread be interrupted.

• The base priority class of a process establishes a range for the base priority of the process and its threads. The base priority classes are:
  • Idle
  • Normal
  • High
  • Real-Time
• The base priority of a process varies within the range established by its base priority class.

Page 16: CS 620 Advanced Operating Systems

Threads in Windows

• When a user interacts with a process (the process window is at the top of the window stack), Windows boosts the priority of the process to maximize its response.

• The base priority of a thread is a function of the base priority of the process in which it runs. It varies within +/- 2 from the base priority of the process.

• The dynamic priority of a thread is a function of its base priority. Windows continually adjusts the dynamic priority of threads within the range established by its base priority.

• The base priority class of a running process can be changed by using Task Manager.
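A minimal Win32 sketch of adjusting priorities from a program rather than Task Manager: raise the process's base priority class, then move one thread up within the +/- 2 range around the process base priority. Error handling is omitted.

```c
/* Minimal Win32 sketch: process priority class and relative thread priority.
 * THREAD_PRIORITY_LOWEST..THREAD_PRIORITY_HIGHEST correspond to the
 * +/- 2 offsets around the process base priority described above. */
#include <windows.h>

void boost_current_thread(void)
{
    /* Raise the whole process into the High priority class. */
    SetPriorityClass(GetCurrentProcess(), HIGH_PRIORITY_CLASS);

    /* Move this thread +2 relative to the process base priority. */
    SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_HIGHEST);
}
```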

Page 17: CS 620 Advanced Operating Systems

Threads in Windows

• The Windows Kernel takes maximum advantage of multiprocessor configurations by implementing symmetric multiprocessing (SMP) and soft affinity.
  • SMP allows the threads of any process, including the OS, to run on any processor.
  • The threads of a single process can run on different processors at the same time.
  • With soft affinity, the Kernel attempts to run a thread on the last processor it ran on.
  • Applications can restrict threads to run only on certain processors (hard affinity).
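A minimal Win32 sketch of hard affinity: restrict the current thread to processors 0 and 1 via an affinity bit mask. The mask value is illustrative.

```c
/* Minimal Win32 sketch of hard affinity: pin the current thread to
 * processors 0 and 1 (bit mask 0x3). Returns 0 on failure. */
#include <windows.h>

int pin_current_thread_to_first_two_cpus(void)
{
    DWORD_PTR mask = 0x3;   /* bits 0 and 1 -> CPUs 0 and 1 */
    return SetThreadAffinityMask(GetCurrentThread(), mask) != 0;
}
```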

Page 18: CS 620 Advanced Operating Systems

Threads in Windows

• The Kernel manages two types of objects:
  • Dispatcher objects have a signal state (signaled or nonsignaled) and control dispatching and synchronization of system operations.
    • Semaphores, mutexes, events, etc.
  • Control objects are used to control the operation of the Kernel but do not affect dispatching.
    • Processes, interrupts, etc.
• The I/O Manager (a component of the Windows Executive) supports asynchronous I/O.
  • Asynchronous I/O allows an application to continue working while an I/O operation completes.

Page 19: CS 620 Advanced Operating Systems

Threads in Windows

• A thread may wait for the I/O to complete, or we may use an asynchronous procedure call (APC) that the I/O Manager invokes when the I/O completes, or we may use a synchronization object (e.g. an event) that the I/O system sets to the signaled state when the I/O completes.
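A minimal Win32 sketch of the synchronization-object variant: start an overlapped read, do other work, then wait on an event the I/O system signals on completion. The file handle is assumed to have been opened with FILE_FLAG_OVERLAPPED; error handling is abbreviated.

```c
/* Minimal Win32 sketch of asynchronous (overlapped) I/O synchronized
 * with an event object. 'file' must be opened with FILE_FLAG_OVERLAPPED. */
#include <windows.h>

DWORD read_async(HANDLE file, void *buf, DWORD len)
{
    OVERLAPPED ov = {0};
    ov.hEvent = CreateEvent(NULL, TRUE, FALSE, NULL);  /* manual-reset, nonsignaled */

    DWORD got = 0;
    if (!ReadFile(file, buf, len, NULL, &ov) &&
        GetLastError() != ERROR_IO_PENDING) {
        CloseHandle(ov.hEvent);
        return 0;                       /* read failed immediately */
    }

    /* ... the thread is free to do other work here ... */

    WaitForSingleObject(ov.hEvent, INFINITE);   /* set to signaled on completion */
    GetOverlappedResult(file, &ov, &got, FALSE);
    CloseHandle(ov.hEvent);
    return got;
}
```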

• The Process Manager creates and deletes processes and tracks process objects and thread objects.
  • The subsystems define the rules for threads and processes.

Page 20: CS 620 Advanced Operating Systems

System Model

• We look at three models:
  • Workstations (zero-cost solution)
  • Clusters (a.k.a. NOW, a.k.a. COW, a.k.a. LAMP, a.k.a. Beowulf, a.k.a. pool)
  • Hybrid
• Workstation model
  • Connect the workstations in a department via a LAN.
  • Includes personal workstations and public ones.
  • We often have dedicated file servers.

Page 21: CS 620 Advanced Operating Systems

Workstation Model

Page 22: CS 620 Advanced Operating Systems

Workstation Model - UMich

Page 23: CS 620 Advanced Operating Systems

Workstations

• The workstations can be diskless.
  • Not so popular anymore (disks are cheap).
  • Maintenance is easy.
  • Must have some startup code in ROM.
• If you have a disk on the workstation, you can use it for:
  1. Paging and temporary files
  2. 1 + (some) system executables
  3. 2 + file caching
  4. A full file system

Page 24: CS 620 Advanced Operating Systems

Workstations

• Case 1 is often called dataless.
  • Just as easy to maintain (software) as diskless.
  • We still need startup code in ROM.
• Case 2
  • Reduces load more and speeds up program start time.
  • Adds maintenance, since new releases of programs must be loaded onto the workstations.
• Case 3
  • We can have a very few executables permanently on the disk.
  • Must keep the caches consistent.
    • Not trivial for data files with multiple writers; this issue comes up for NFS as well.
  • Should you cache whole files or blocks?

Page 25: CS 620 Advanced Operating Systems

Workstations

• Case 4
  • You can work if just your machine is up.
  • But you lose location transparency.
  • Also requires the most maintenance.
• Using idle workstations
  • Early systems did this manually via rsh; this is still used today.
  • How do you find idle workstations?

Page 26: CS 620 Advanced Operating Systems

Workstations

• Idle = no mouse or keyboard activity and a low load average (see the sketch below).
• A workstation can announce that it is idle, and this is recorded by all.
• A job looking for a machine can inquire.
  • Must worry about race conditions.
• Some jobs want a bunch of machines, so they look for many idle machines.
• We can also have a centralized solution: a processor server.
  • The usual tradeoffs apply here.
• What about the local environment?
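A minimal sketch of the idle test described above: low load average plus no recent input. getloadavg() is a real BSD/Linux call; seconds_since_last_input() is a hypothetical helper (stubbed here) that a real system might build on the X screen-saver extension, and the thresholds are illustrative.

```c
/* Minimal sketch of an "is this workstation idle?" test. */
#define _DEFAULT_SOURCE          /* for getloadavg() on glibc */
#include <stdlib.h>
#include <stdbool.h>

#define LOAD_LIMIT      0.3      /* "low load average" threshold */
#define INPUT_IDLE_SECS (5 * 60) /* no input for 5 minutes       */

/* Hypothetical helper: seconds since the last keyboard/mouse event.
 * Stubbed here; a real version might query the X screen-saver extension. */
static long seconds_since_last_input(void) { return 600; }

bool workstation_is_idle(void)
{
    double load[1];
    if (getloadavg(load, 1) != 1)
        return false;                        /* can't tell: assume busy */
    return load[0] < LOAD_LIMIT &&
           seconds_since_last_input() > INPUT_IDLE_SECS;
}
```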

Page 27: CS 620 Advanced Operating Systems

Workstations

• Files on servers are no problem.
• Requests for local files must be sent home.
  • ... but this is not needed for temporary files.
• System calls for memory or process management probably need to be executed on the remote machine.
• Time is a bit of a mess unless we have time synchronized by a system like NTP.
• If a program is interactive, we must deal with devices.
  • Mouse, keyboard, display.
• What if the borrowed machine becomes non-idle (i.e. the owner returns)?

Page 28: CS 620 Advanced Operating Systems

Workstations

• Detect the presence of the user.
• Kill off the guest processes.
  • Helpful if we made checkpoints (or ran short jobs).
  • Erase files, etc.
• We could try to migrate the guest processes to other hosts, but this must be very fast or the owner will object.
• Our goal is to make the owner unaware of our presence.
  • This may not be possible, since we may have paged out the basic environment (shell, editor, X server, window manager) that the owner left running when s/he stopped using the machine.

Page 29: CS 620 Advanced Operating Systems

Clusters

• A bunch of workstations without displays, sitting in a machine room and connected by a network.
• They are quite popular now.
  • Indeed, some clusters are packaged by their manufacturer into a serious compute engine.
  • The Ohio Supercomputing Center replaced MPP and vector supercomputers with clusters.
• Used to solve large problems using many processors at one time.
• Pluses of a large time-sharing system vs. small individual machines.

Page 30: CS 620 Advanced Operating Systems

A Cluster System

Page 31: CS 620 Advanced Operating Systems

A Cluster System

Page 32: CS 620 Advanced Operating Systems

Clusters

• Also the minuses of timesharing.
• We can use easy queuing theory to show that a large fast server is better in some cases than many slower personal machines.
• Hybrid
  • Each user has a workstation and uses the pool for big jobs.
  • It is the dominant model for cluster-based machines.

Page 33: CS 620 Advanced Operating Systems

Virtualization

• Virtualization has a long history.
  • It was important in the 1960s/70s.
  • It faded during the 1980s/90s.
  • It has increasing importance nowadays:
    • Security
    • Ease of management
• Different types of virtualization:
  • Process virtual machine
  • Virtual machine monitor (hardware virtual machine)

Page 34: CS 620 Advanced Operating Systems

Virtualization

• Process virtual machine
  • JVM
  • Macromedia Flash Player
  • Wine
• VMM
  • VMware
  • Parallels
  • VirtualBox
  • Microsoft Virtual PC

Page 35: CS 620 Advanced Operating Systems

The Role of Virtualization in Distributed Systems

• (a) General organization between a program, interface, and system. (b) General organization of virtualizing system A on top of system B.

Page 36: CS 620 Advanced Operating Systems

Architectures of Virtual Machines

• Interfaces at different levels:
  • An interface between the hardware and software, consisting of machine instructions that can be invoked by any program.
  • An interface between the hardware and software, consisting of machine instructions that can be invoked only by privileged programs, such as an operating system.

Page 37: CS 620 Advanced Operating Systems

Architectures of Virtual Machines

• Interfaces at different levels (continued):
  • An interface consisting of system calls, as offered by an operating system.
  • An interface consisting of library calls, generally forming what is known as an application programming interface (API).
    • In many cases, the aforementioned system calls are hidden by an API.

Page 38: CS 620 Advanced Operating Systems

Architectures of Virtual Machines

• Figure 3-6. Various interfaces offered by computer systems.

Page 39: CS 620 Advanced Operating Systems

Architectures of Virtual Machines

• A process virtual machine, with multiple instances of (application, runtime) combinations.

Page 40: CS 620 Advanced Operating Systems

Architectures of Virtual Machines

• A virtual machine monitor, with multiple instances of (applications, operating system) combinations.

Page 41: CS 620 Advanced Operating Systems

Processor Allocation

• Processor allocation: decide which processes should run on which processors.
  • Could also be called process allocation.
  • We assume that any process can run on any processor.

Page 42: CS 620 Advanced Operating Systems

Processor Allocation

• Often the only difference between different processors is:
  • CPU speed
  • CPU speed and amount of memory
• What if the processors are not homogeneous?
  • Assume that we have binaries for all the different architectures.
• What if not all machines are directly connected?
  • Send the process via intermediate machines.

Page 43: CS 620 Advanced Operating Systems

Processor Allocation

• If we have only PowerPC binaries, restrict the process to PowerPC machines.

• If we need machines very close for fast communication, restrict the processes to a group of close machines.

• Can you move a running process, or are processor allocations done at process creation time?
  • Migratory allocation algorithms vs. non-migratory.

Page 44: CS 620 Advanced Operating Systems

Processor Allocation

• What is the figure of merit, i.e. what do we want to optimize in order to find the best allocation of processes to processors?
  • This is similar to CPU scheduling in centralized operating systems.
• Minimizing response time is one possibility.

Page 45: CS 620 Advanced Operating Systems

Processor Allocation

• We are not assuming all machines are equally fast.
  • Consider two processes: P1 executes 100 million instructions, P2 executes 10 million instructions.
  • Both processes enter the system at time t = 0.
  • Consider two machines: A executes 100 MIPS, B executes 10 MIPS.
  • If we run P1 on A and P2 on B, each takes 1 second, so the average response time is 1 sec.
  • If we run P1 on B and P2 on A, P1 takes 10 seconds and P2 takes 0.1 sec., so the average response time is 5.05 sec.
  • If we run P2 then P1, both on A, they finish at times 0.1 and 1.1, so the average response time is 0.6 seconds!!
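A minimal sketch reproducing the arithmetic above, with response time measured as completion time minus arrival time (arrival is t = 0):

```c
/* Average response times for the three placements discussed above. */
#include <stdio.h>

int main(void)
{
    double p1 = 100e6, p2 = 10e6;          /* instructions */
    double a  = 100e6, b  = 10e6;          /* machine speeds, instructions/sec */

    /* P1 on A, P2 on B: both start at t = 0 on their own machine. */
    double avg1 = ((p1 / a) + (p2 / b)) / 2.0;             /* (1 + 1) / 2    = 1.00 */

    /* P1 on B, P2 on A. */
    double avg2 = ((p1 / b) + (p2 / a)) / 2.0;             /* (10 + 0.1) / 2 = 5.05 */

    /* P2 then P1, both on A: P1 waits for P2 to finish. */
    double avg3 = ((p2 / a) + (p2 / a + p1 / a)) / 2.0;    /* (0.1 + 1.1)/2  = 0.60 */

    printf("%.2f %.2f %.2f\n", avg1, avg2, avg3);
    return 0;
}
```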

Page 46: CS 620 Advanced Operating Systems

Processor Allocation

• Minimize the response ratio.
  • The response ratio is the time to run on some machine divided by the time to run on a standardized (benchmark) machine, assuming the benchmark machine is unloaded.
  • This takes into account the fact that long jobs should take longer.
• Maximize CPU utilization.
• Maximize throughput.
  • Jobs per hour
  • Weighted jobs per hour

Page 47: CS 620 Advanced Operating Systems

Processor Allocation

• If the weighting is CPU time, we get CPU utilization.
  • This is the way to justify CPU utilization (user centric).
• Design issues
  • Deterministic vs. heuristic
    • Use deterministic for embedded applications, when all requirements are known a priori.
      • Patient monitoring in a hospital
      • Nuclear reactor monitoring
  • Centralized vs. distributed
    • We have a tradeoff of accuracy vs. fault tolerance and bottlenecks.

Page 48: CS 620 Advanced Operating Systems

Processor Allocation

• Optimal vs. best effort
  • Optimal normally requires off-line processing.
  • Similar requirements as for deterministic.
  • Usual tradeoff of system effort vs. result quality.
• Transfer policy
  • Does a process decide to shed jobs just based on its own load, or does it have (and use) knowledge of other loads?
  • Also called local vs. global.
  • Usual tradeoff of system effort (gathering data) vs. result quality.

Page 49: CS 620 Advanced Operating Systems

Processor Allocation

• Location policy
  • Sender vs. receiver initiated.
    • Sender initiated: uploading programs to a compute server.
    • Receiver initiated: downloading Java applets.
  • Look for help vs. look for work. Both are done.

Page 50: CS 620 Advanced Operating Systems

Processor Allocation

• Implementation issues
  • Determining local load
    • Normally use a weighted mean of recent loads, with more recent loads weighted higher.
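A minimal sketch of such a weighted mean: an exponential moving average in which the newest sample counts most. The smoothing parameter ALPHA is illustrative.

```c
/* Minimal sketch of a load estimate as a weighted mean of recent samples. */
#define ALPHA 0.25   /* weight given to the newest sample */

static double load_estimate = 0.0;

/* Call periodically with the latest instantaneous load sample
 * (e.g. run-queue length). */
void update_load_estimate(double sample)
{
    load_estimate = ALPHA * sample + (1.0 - ALPHA) * load_estimate;
}
```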

Page 51: CS 620 Advanced Operating Systems

Processor Allocation

• Example algorithms
• Min-cut deterministic algorithm
  • Define a graph with processes as nodes and IPC traffic as arcs.
  • Goal: cut the graph (i.e. some arcs) into pieces so that:
    • All nodes in one piece can be run on one processor, subject to:
      • Memory constraints
      • Processor completion times
    • The values on the cut arcs are minimized.

Page 52: CS 620 Advanced Operating Systems

Processor Allocation

• Minimize the max: minimize the maximum traffic for a process pair.
• Minimize the sum: minimize total traffic.
• Minimize the sum to/from a piece: don't overload a processor.
• Minimize the sum between pieces: minimize the traffic for a processor pair.
• This tends to get hard as you get more realistic.

Page 53: CS 620 Advanced Operating Systems

Processor Allocation

• Up-down centralized algorithm
  • A centralized table keeps "usage" data for each user; the users are defined to be the workstation owners. Call this the score for the user.
  • The goal is to give each user a fair share.
  • When a user requests a remote job, if a workstation is available it is assigned.
  • For each process a user has running remotely, the user's score increases by a fixed amount each time interval.

Page 54: CS 620 Advanced Operating Systems

Processor Allocation

• When a user has an unsatisfied request pending (and none being satisfied), the score decreases (it can go negative).
• If no requests are pending and none are being satisfied, the score is bumped towards zero.
• When a processor becomes free, assign it to the requesting user with the lowest score.
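A minimal sketch of the up-down bookkeeping from the last two slides; the constants and field names are illustrative, not from the lecture:

```c
/* Minimal sketch of the up-down score update, run once per time interval. */
#define SCORE_UP    3   /* added per remotely running process per interval */
#define SCORE_DOWN  5   /* subtracted while a request waits unsatisfied    */
#define SCORE_DECAY 1   /* step toward zero when the user is inactive      */

struct user_entry {
    int score;             /* can go negative */
    int remote_processes;  /* processes currently running remotely */
    int pending_requests;  /* requests waiting for a workstation   */
};

/* Called once per time interval for every user in the central table. */
void update_score(struct user_entry *u)
{
    if (u->remote_processes > 0)
        u->score += SCORE_UP * u->remote_processes;
    else if (u->pending_requests > 0)
        u->score -= SCORE_DOWN;          /* waiting with nothing running */
    else if (u->score > 0)
        u->score -= SCORE_DECAY;         /* idle: bump toward zero */
    else if (u->score < 0)
        u->score += SCORE_DECAY;
}
```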

Page 55: CS 620 Advanced Operating Systems

Processor Allocation

• Hierarchical algorithm
  • Goal: assign multiple processors to a job.
  • Quick idea of the algorithm:
    • Processors are arranged in a tree.
    • Requests go up the tree until a subtree has enough resources.
    • The request is split, and the parts go back down the tree.
  • Arrange the processors in a hierarchy (tree).
    • This is a logical tree, independent of how the machines are physically connected.
    • Each node keeps (imperfect) track of how many available processors are below it.
    • If a processor can run more than one process, this must be more sophisticated: we must keep track of how many processes can be allocated (without overload) in the subtree below.

Page 56: CS 620 Advanced Operating Systems

Processor Allocation

• If a new request appears in the tree, the current node sees if it can be satisfied by the processors below (plus itself).
  • If so, do it.
  • If not, pass the request up the tree (sketched below).
  • Actually, since machines may be down or the data on availability may be out of date, you try to find more processors than requested.
• Once a request has gone high enough to be satisfied, the current node splits the request into pieces and sends each piece to the appropriate child.
• What if a node dies?
  • Promote one of its children, say C.
  • Now C's children are peers with the previous peers of C.
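A minimal sketch of the upward-routing step only (the split on the way down and the failure handling are omitted); the structure, fields, and SLACK margin for stale data are illustrative:

```c
/* Minimal sketch: route a request up the hierarchy until some subtree
 * (according to possibly stale counts) has enough processors. */
#include <stddef.h>

#define SLACK 2   /* ask for extra processors since counts may be stale */

struct node {
    struct node  *parent;
    struct node **children;
    int           n_children;
    int           avail;    /* (imperfect) count of available processors below */
};

/* Returns the node whose subtree will satisfy (and then split) the request. */
struct node *route_request(struct node *n, int wanted)
{
    while (n->parent != NULL && n->avail < wanted + SLACK)
        n = n->parent;              /* not enough below: pass the request up */
    return n;
}
```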

Page 57: CS 620 Advanced Operating Systems

Processor Allocation

• If this is considered too unbalanced, we can promote one of C's children to take C's place.
• How can we decide which child C to promote?
  • The peers of the dead node have an election.
  • The children of the dead node have an election.
  • The parent of the dead node decides.
• What if the root dies?
  • Must use the children, since there are no peers or parent.
  • If we want to use peers, then we do not have a single root, i.e. the top level of the hierarchy is a collection of roots that communicate. This is a forest, not a tree.
• What if multiple requests are generated simultaneously?

Page 58: CS 620 Advanced Operating Systems

Processor Allocation

• This gets hard fast, as information gets stale and race conditions and deadlocks become possible.
• Distributed heuristic algorithm
  • Goal: find a lightly loaded processor to migrate the job to.
  • Send a probe to a random processor.
  • If the remote load is low, ship the job.
  • If the remote load is high, try another random probe.
  • If, after k probes (k is a parameter of the implementation), all say the load is too high, give up and run the job locally.
  • This has been modelled analytically and seen to work fairly well.
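A minimal sketch of this probing heuristic. The helpers standing in for the real communication layer are placeholder stubs, and the threshold and probe limit are illustrative:

```c
/* Minimal sketch of the sender-initiated random-probe heuristic. */
#include <stdio.h>
#include <stdlib.h>

#define MAX_PROBES     4      /* "k": implementation parameter */
#define LOAD_THRESHOLD 2.0    /* remote load considered "low"  */
#define NUM_HOSTS      16

/* Placeholder stubs for the real communication layer. */
static int    pick_random_host(void)       { return rand() % NUM_HOSTS; }
static double probe_load(int host)         { (void)host; return 1.0; }
static void   ship_job(int host, int job)  { printf("job %d -> host %d\n", job, host); }
static void   run_locally(int job)         { printf("job %d runs locally\n", job); }

void place_job(int job)
{
    for (int i = 0; i < MAX_PROBES; i++) {
        int host = pick_random_host();
        if (probe_load(host) < LOAD_THRESHOLD) {
            ship_job(host, job);      /* found a lightly loaded processor */
            return;
        }
    }
    run_locally(job);                 /* every probe reported high load */
}

int main(void) { place_job(1); return 0; }
```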

Page 59: CS 620 Advanced Operating Systems

Scheduling

• The general goal is to have processes that communicate frequently run simultaneously.
• If they don’t, and we use busy waiting for messages, we will have a huge disaster.
• Even if we use context switching, we may have a small disaster, as only one message transfer can occur per scheduling time slot.
• Co-scheduling (a.k.a. gang scheduling): processes belonging to a job are scheduled together.

Page 60: CS 620 Advanced Operating Systems

Scheduling

• Time slots are coordinated among the processors.
• Some slots are for gangs; other slots are for regular processes.
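A minimal sketch of how coordinated slots might be represented: a table whose rows are time slots and whose columns are processors, with every member of a gang placed in the same row so its threads run simultaneously. Sizes and contents are illustrative.

```c
/* Minimal sketch of a gang-scheduling slot table: rows are time slots,
 * columns are processors; a whole row is switched at each slot boundary. */
#define NUM_SLOTS 4
#define NUM_CPUS  4
#define IDLE      0

/* gang/process id (or IDLE) scheduled in each (slot, cpu) cell. */
static int slot_table[NUM_SLOTS][NUM_CPUS] = {
    {1, 1, 1, 1},      /* slot 0: gang 1 uses all four processors         */
    {2, 2, 3, 4},      /* slot 1: gang 2 on two CPUs, regular procs 3 & 4 */
    {1, 1, 1, 1},      /* slot 2: gang 1 again                            */
    {5, 6, 7, IDLE},   /* slot 3: regular (non-gang) processes            */
};

/* At each slot boundary, every processor switches to the process named
 * in its column of the current row. */
int scheduled_process(int slot, int cpu)
{
    return slot_table[slot % NUM_SLOTS][cpu];
}
```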