Concurrency, Threads, and Events Ken Birman (Based on a slide set prepared by Robbert van Renesse)

Post on 22-Dec-2015



Summary Paper 1

Using Threads in Interactive Systems: A Case Study (Hauser et al., 1993)
• Analyzes two interactive computing systems
• Classifies thread usage
• Finds that programmers are still struggling (pre-Java)
  • Limited scheduling support
  • Priority inversion

Summary Paper 2

SEDA: An Architecture for Well-Conditioned, Scalable Internet Services (Welsh et al., 2001)
• Analyzes threads vs. event-based systems, finds problems with both
• Suggests trade-off: stage-driven architecture
• Evaluated for two applications
  • Easy to program and performs well

What is a thread?

• A traditional “process” is an address space and a thread of control.
• Now add multiple threads of control
  • Share address space
  • Individual program counters and stacks
• Same as multiple processes sharing an address space.

Thread Switching

To switch from thread T1 to T2:
• Thread T1 saves its registers (including pc) on its stack
• Scheduler remembers T1’s stack pointer
• Scheduler restores T2’s stack pointer
• T2 restores its registers
• T2 resumes

Two models: preemptive/non-preemptive

Thread Scheduler

• Maintains the stack pointer of each thread
• Decides what thread to run next
  • E.g., based on priority or resource usage
• Decides when to pre-empt a running thread
  • E.g., based on a timer
• May need to deal with multiple CPUs
  • But not usually
• “fork” creates a new thread
• Blocking or calling “yield” lets scheduler run
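The non-preemptive model can be illustrated with a minimal sketch in Python, assuming generators stand in for threads: each generator keeps its own "stack" and "program counter", and a bare `yield` plays the role of calling “yield”. The names `worker` and `run_round_robin` are hypothetical and not from the slides; a real scheduler would save and restore machine registers instead.

```python
from collections import deque

def worker(name, steps, log):
    """A cooperative 'thread': each yield hands control back to the scheduler."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # the generator object preserves this thread's state until resumed

def run_round_robin(threads):
    """Non-preemptive scheduler: runs each ready thread until it yields or ends."""
    ready = deque(threads)
    while ready:
        current = ready.popleft()
        try:
            next(current)          # resume the thread where it left off
            ready.append(current)  # still runnable, so it goes to the back
        except StopIteration:
            pass                   # the thread has finished

log = []
run_round_robin([worker("T1", 2, log), worker("T2", 3, log)])
print(log)
```

Because a thread only loses the CPU when it yields, a worker that never yields would starve the others; that is the trade-off a timer-driven, preemptive scheduler removes.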

Synchronization Primitives

• Semaphores
  • P(S): block if semaphore is “taken”
  • V(S): release semaphore
• Monitors:
  • Only one thread active in a module at a time
  • Threads can block waiting for some condition using the WAIT primitive
  • Threads need to signal using NOTIFY or BROADCAST
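As a rough illustration of the monitor idea, here is a sketch using Python's standard `threading` module, where `Condition.wait()` and `Condition.notify()` correspond approximately to WAIT and NOTIFY (and `notify_all()` to BROADCAST). The `BoundedBuffer` class is a hypothetical example, not something taken from the slides. For semaphores, `threading.Semaphore.acquire()` and `release()` play the roles of P and V.

```python
import threading

class BoundedBuffer:
    """Monitor-style buffer: one lock ensures a single thread is active inside."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []
        self.lock = threading.Lock()
        # Both conditions share the monitor lock.
        self.not_empty = threading.Condition(self.lock)
        self.not_full = threading.Condition(self.lock)

    def put(self, item):
        with self.lock:
            while len(self.items) >= self.capacity:
                self.not_full.wait()      # WAIT: releases the lock while blocked
            self.items.append(item)
            self.not_empty.notify()       # NOTIFY one waiting consumer

    def get(self):
        with self.lock:
            while not self.items:
                self.not_empty.wait()
            item = self.items.pop(0)
            self.not_full.notify()        # NOTIFY one waiting producer
            return item

buf = BoundedBuffer(capacity=2)
received = []

def consumer():
    for _ in range(5):
        received.append(buf.get())

t = threading.Thread(target=consumer)
t.start()
for i in range(5):
    buf.put(i)
t.join()
print(received)
```

The `while` loops around `wait()` matter: a woken thread rechecks its condition, since another thread may have changed the buffer between the notify and the wake-up.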

Uses of threads

• To exploit CPU parallelism
  • Run two CPUs at once in the same program
• To exploit I/O parallelism
  • Run I/O while computing, or do multiple I/O
  • Listen to the “window” while also running code, e.g. allow commands during an interactive game
• For program structuring
  • E.g., timers
• To avoid deadlock in RPC-based applications

Hauser’s categorization

• Defer Work: asynchronous activity
  • Print, e-mail, create new window, etc.
• Pumps: pipeline components
  • Wait on input queue; send to output queue
  • E.g., slack process: add latency for buffering
• Sleepers & one-shots
  • Periodic activity & timers
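A pump can be sketched as a thread that loops between an input queue and an output queue. The version below is a minimal illustration using Python's `queue` and `threading` modules; the `pump` function, the `STOP` sentinel, and the two transformations are assumptions made for the example.

```python
import queue
import threading

STOP = object()  # hypothetical sentinel that shuts the pipeline down

def pump(inbox, outbox, transform):
    """Pump thread: wait on the input queue, send results to the output queue."""
    while True:
        item = inbox.get()           # blocks until something arrives
        if item is STOP:
            outbox.put(STOP)         # pass the shutdown signal downstream
            return
        outbox.put(transform(item))

source, middle, sink = queue.Queue(), queue.Queue(), queue.Queue()
stages = [
    threading.Thread(target=pump, args=(source, middle, str.upper)),
    threading.Thread(target=pump, args=(middle, sink, lambda s: s + "!")),
]
for t in stages:
    t.start()

for word in ["a", "b"]:
    source.put(word)
source.put(STOP)
for t in stages:
    t.join()

results = []
while True:
    item = sink.get()
    if item is STOP:
        break
    results.append(item)
print(results)
```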

Categorization, cont’d

• Deadlock Avoiders
  • Avoid deadlock through ordered acquisition of locks
  • When needing more locks, roll-back and re-acquire
• Task Rejuvenation: recovery
  • Start new thread when old one dies, say because of uncaught exception
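Ordered lock acquisition can be shown with a small sketch. It assumes every lock-protected object carries a unique, comparable identifier (`ident` here) and that all code paths acquire locks in ascending identifier order; the `Account` class and `transfer` function are hypothetical.

```python
import threading

class Account:
    def __init__(self, ident, balance):
        self.ident = ident            # assumed unique; defines the global lock order
        self.balance = balance
        self.lock = threading.Lock()

def transfer(src, dst, amount):
    """Always lock the lower-numbered account first, whatever the direction.

    Assumes src and dst are different accounts.
    """
    first, second = sorted((src, dst), key=lambda account: account.ident)
    with first.lock, second.lock:
        src.balance -= amount
        dst.balance += amount

a, b = Account(1, 100), Account(2, 100)
threads = [threading.Thread(target=transfer, args=(a, b, 2)) for _ in range(20)]
threads += [threading.Thread(target=transfer, args=(b, a, 1)) for _ in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(a.balance, b.balance)
```

Without the sort, a transfer from a to b and a concurrent transfer from b to a could each hold one lock and wait forever for the other.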

Categorization, cont’d

• Serializers: event loop
    for (;;) {
        get_next_event();
        handle_event();
    }
• Concurrency Exploiters
  • Use multiple CPUs
• Encapsulated Forks
  • Hidden threads used in library packages
  • E.g., menu-button queue

Common Problems

• Priority Inversion
  • High priority thread waits for low priority thread
  • Solution: temporarily push priority up (rejected??)
• Deadlock
  • X waits for Y, Y waits for X
• Incorrect Synchronization
  • Forgetting to release a lock
• Failed “fork”
• Tuning
  • E.g. timer values in different environment
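The "forgetting to release a lock" problem is one that later languages address with scoped locking. As a hedged illustration (the systems Hauser studied had no such construct; `risky_update` is a made-up example), Python's `with` statement releases the lock even when the protected code raises:

```python
import threading

lock = threading.Lock()

def risky_update(state, value):
    """The lock is released on every exit path, including exceptions."""
    with lock:
        if value < 0:
            raise ValueError("negative value")
        state.append(value)

state = []
risky_update(state, 1)
try:
    risky_update(state, -1)   # raises while holding the lock
except ValueError:
    pass

# If the lock had leaked, this non-blocking acquire would fail.
lock_is_free = lock.acquire(blocking=False)
if lock_is_free:
    lock.release()
print(state, lock_is_free)
```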

Problems he neglects

• Implicit need for ordering of events
  • E.g. thread A is supposed to run before thread B does, but something delays A
• Non-reentrant code
• Languages lack “monitor” features and users are perhaps surprisingly weak at detecting and protecting concurrently accessed data

Criticism of Hauser

• He assumes superb programmers and seems to believe that “most” programmers won’t use threads (his example systems are really platforms, not applications)
• Systems old but/and not representative
  • Pre-Java and C#
  • And now there are some tools that can help discover problems

What is an Event?

• An object queued for some module
• Operations:
  • create_event_queue(handler) → EQ
  • enqueue_event(EQ, event-object)
    • Invokes, eventually, handler(event-object)
• Handler is not allowed to block
  • Blocking could cause entire system to block
  • But page faults, garbage collection, …
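The two operations named on this slide can be sketched in a few lines of Python. Only the names `create_event_queue` and `enqueue_event` come from the slides; the `EventQueue` class, the `run_until_idle` scheduler loop, and the two example handlers are assumptions for illustration. Handlers run to completion and never block; they communicate only by enqueueing further events.

```python
from collections import deque

class EventQueue:
    def __init__(self, handler):
        self.handler = handler
        self.pending = deque()

_queues = []  # every queue known to the (single-threaded) scheduler

def create_event_queue(handler):
    eq = EventQueue(handler)
    _queues.append(eq)
    return eq

def enqueue_event(eq, event):
    eq.pending.append(event)  # handler(event) is invoked later, not now

def run_until_idle():
    """Scheduler: visit queues in turn, run one handler to completion each time."""
    progress = True
    while progress:
        progress = False
        for eq in _queues:
            if eq.pending:
                eq.handler(eq.pending.popleft())
                progress = True

output = []
printer = create_event_queue(lambda text: output.append(text))
doubler = create_event_queue(
    lambda n: enqueue_event(printer, f"{n} doubled is {2 * n}")
)
enqueue_event(doubler, 3)
enqueue_event(doubler, 5)
run_until_idle()
print(output)
```

A handler that blocked inside `run_until_idle` would stall every queue, which is why the slide insists that handlers must not block.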

Example Event System

(Also common in telecommunications industry, where it’s called “workflow programming”)

Event Scheduler

• Decides which event queue to handle next.
  • Based on priority, CPU usage, etc.
• Never pre-empts event handlers!
  • No need for a stack per event handler
• May need to deal with multiple CPUs

Synchronization?

• Handlers cannot block → no synchronization
• Handlers should not share memory
  • At least not in parallel
• All communication through events

Uses of Events

• CPU parallelism
  • Different handlers on different CPUs
• I/O concurrency
  • Completion of I/O signaled by event
  • Other activities can happen in parallel
• Program structuring
  • Not so great…
  • But can use multiple programming languages!

Hauser’s categorization ?!

• Defer Work: asynchronous activity
  • Send event to printer, etc.
• Pumps: pipeline components
  • Natural use of events!
• Sleepers & one-shots
  • Periodic events & timer events

Categorization, cont’d

• Deadlock Avoiders
  • Ordered lock acquisition still works
• Task Rejuvenation: recovery
  • Watchdog events?

Categorization, cont’d

• Serializers: event loop
  • Natural use of events and handlers!
• Concurrency Exploiters
  • Use multiple CPUs
• Encapsulated Events
  • Hidden events used in library packages
  • E.g., menu-button queue

Common Problems

Priority inversion, deadlock, etc. much the same with events

Threaded Server Throughput

Event-driven Server Throughput

Threads vs. Events

• Event-based systems use fewer resources
  • Better performance (particularly scalability)
• Event-based systems harder to program
  • Have to avoid blocking at all cost
  • Block-structured programming doesn’t work
  • How to do exception handling?
• In both cases, tuning is difficult

Both?

• In practice, many kinds of systems need to support both threads and events
  • Threaded programs in Unix are the common example of these, because window systems use events
  • The programmer uses cthreads or pthreads
• Major problem: the UNIX kernel interface wasn’t designed with threads in mind!

Why does this cause problems?

• Many system calls block the “process”
  • File read or write, for example
• And many libraries aren’t reentrant
• So when the user employs threads
  • The application may block unexpectedly
    • Limited work-around: add “kernel threads”
  • And the user might stumble into a reentrancy bug

Events as seen in Unix

• Window systems use small messages…
• But the “old” form of events is signals
  • Kernel basically simulates an interrupt into the user’s address space
  • The “event handler” then runs… But can it launch new threads?
  • Some system calls can return EINTR
  • Very limited options to “block” signals in critical sections

How do people work around this?

• They try not to do blocking I/O
  • Use asynchronous system calls… or select… or some mixture of the two
• Or try to turn the whole application into an event-driven one using pools of threads, in the SEDA model (more or less)
  • One dedicated thread per I/O “channel”, to turn signal-style events into events on the event queue for the processing stage
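The "one dedicated thread per I/O channel" idea might look roughly like the sketch below. It assumes a pipe created with `os.pipe()` stands in for the I/O channel; `channel_reader` and the `(name, payload)` event format are hypothetical. Only the reader thread ever blocks on the channel, while the processing loop sees plain events on a queue.

```python
import os
import queue
import threading

def channel_reader(name, fd, events):
    """Dedicated thread: blocks on one channel and posts each line as an event."""
    with os.fdopen(fd, "r") as stream:
        for line in stream:                    # blocking read, in this thread only
            events.put((name, line.rstrip("\n")))
    events.put((name, None))                   # end-of-stream marker

events = queue.Queue()
read_fd, write_fd = os.pipe()
threading.Thread(
    target=channel_reader, args=("pipe", read_fd, events), daemon=True
).start()

# Simulate the outside world writing to the channel, then closing it.
with os.fdopen(write_fd, "w") as writer:
    writer.write("hello\nworld\n")

handled = []
while True:                                    # the event-driven processing stage
    name, payload = events.get()
    if payload is None:
        break
    handled.append((name, payload))
print(handled)
```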

This can be hard, but it works

Must write the whole program and have a way to review any libraries it uses!

One learns, the hard way, that pretty much nothing else works

Unix programs built by inexperienced developers are often riddled with concurrency bugs!

SEDA

Mixture of models of threads and (small message-style) events

Events, queues, and “pools of event handling threads”.

Pools can be dynamically adjusted as need arises.

Similar to JavaBeans and EventListeners?
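A very reduced sketch of a SEDA-like stage follows: an event queue, a handler, and a small pool of threads that drain the queue and forward results to the next stage. The `Stage` class and its parameters are assumptions for illustration, and the pool size here is fixed, whereas SEDA adjusts it dynamically.

```python
import queue
import threading

class Stage:
    """One stage: an incoming event queue served by a pool of handler threads."""

    def __init__(self, name, handler, workers=2, next_stage=None):
        self.name = name
        self.handler = handler
        self.next_stage = next_stage
        self.inbox = queue.Queue()
        for _ in range(workers):
            threading.Thread(target=self._run, daemon=True).start()

    def enqueue(self, event):
        self.inbox.put(event)

    def _run(self):
        while True:
            event = self.inbox.get()
            try:
                result = self.handler(event)
                if self.next_stage is not None:
                    self.next_stage.enqueue(result)   # stages talk only via queues
            finally:
                self.inbox.task_done()

results = queue.Queue()
respond = Stage("respond", results.put, workers=1)
parse = Stage("parse", lambda raw: raw.strip().upper(), workers=2, next_stage=respond)

for raw in [" get /a ", " get /b "]:
    parse.enqueue(raw)
parse.inbox.join()      # wait until the parse stage has forwarded everything
respond.inbox.join()    # then wait for the respond stage to drain

replies = sorted(results.get_nowait() for _ in range(2))
print(replies)
```

The replies are sorted because two parse workers may finish in either order; a stage gives no ordering guarantee across its pool.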

SEDA Stage

Authors: “Best of both worlds”

• Ease of programming of threads
  • Or even better
• Performance of events
  • Or even better

Threads Considered Harmful

• Like goto, transfer to some entry in program
  • In any scope
  • Destroys structure of programs
• Primitive Synchronization Primitives
  • Too low-level
  • Too coarse-grained
  • Too error-prone
  • Prone to over-specification

Example: create file

1. Create file
2. Read current directory (may be cached)
3. Update and write back directory
4. Write file

Thread Implementations

1. Serialize: op1; op2; op3; op4
   • Simplest and most common
2. Use threads
   • Requires at least two semaphores!
   • Results in complicated program
3. Simplified threads
   a) Create file and read directory in parallel
   b) Barrier
   c) Write file and write directory in parallel
   • Over-specification!
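Option 3 can be sketched as two parallel phases separated by a barrier. In this illustration the four operations are simulated by appending their names to a log; `run_parallel` and `create_file_with_threads` are hypothetical names, and joining all threads of a phase serves as the barrier.

```python
import threading

def run_parallel(*operations):
    """Start every operation in its own thread, then wait for all of them."""
    threads = [threading.Thread(target=op) for op in operations]
    for t in threads:
        t.start()
    for t in threads:
        t.join()          # acts as the barrier between phases

def create_file_with_threads(log):
    log_lock = threading.Lock()

    def op(name):
        def run():
            with log_lock:
                log.append(name)   # stand-in for the real file-system work
        return run

    # Phase (a): create file and read directory in parallel, then barrier (b).
    run_parallel(op("create file"), op("read directory"))
    # Phase (c): write file and write directory in parallel.
    run_parallel(op("write file"), op("update and write back directory"))

log = []
create_file_with_threads(log)
print(log)
```

The barrier makes both second-phase operations wait for both first-phase ones, even if one of them only needs one of the earlier results; this is presumably the over-specification the slide refers to.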

Event Implementation

Create a dummy handler that awaits the file creation and directory read events and then sends an event to update the directory.

Not great…
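The dummy handler described above is essentially a join on two events. A minimal sketch, assuming string-named events on a single queue (the `JoinHandler` class and the event names are made up for the example):

```python
from collections import deque

pending = deque()   # the single event queue
log = []            # order in which events were handled

class JoinHandler:
    """Dummy handler: waits for a set of events, then posts a follow-up event."""

    def __init__(self, awaited, follow_up):
        self.missing = set(awaited)
        self.follow_up = follow_up
        self.fired = False

    def __call__(self, event):
        self.missing.discard(event)
        if not self.missing and not self.fired:
            self.fired = True
            pending.append(self.follow_up)

join = JoinHandler({"file created", "directory read"}, "update directory")

# The two completion events may arrive in either order.
pending.extend(["directory read", "file created"])
while pending:
    event = pending.popleft()
    log.append(event)
    if event in ("file created", "directory read"):
        join(event)
print(log)
```

The bookkeeping object exists only to remember which events have arrived, which illustrates why the slide calls this approach "not great".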

GOP: Discussion

• Specifies dependencies at a high level
  • No semaphores, condition variables, etc.
  • No explicit threads nor events
• Can easily be supported by many languages
  • C, Java, etc.
• Top-down specification
  • Compare with make, Prolog, theorem prover
• Exception handling easily supported
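The slides do not show GOP's notation, so the following is only a loose analogy in the spirit of the "compare with make" remark: the create-file example is written as a dependency table and `graphlib.TopologicalSorter` (standard library, Python 3.9+) reports what may run next. The dependency table itself is an assumption: here the directory update needs both earlier steps, while writing the file needs only the file creation.

```python
from graphlib import TopologicalSorter

# Assumed dependencies: each task maps to the set of tasks it must wait for.
deps = {
    "create file": set(),
    "read directory": set(),
    "update and write back directory": {"create file", "read directory"},
    "write file": {"create file"},
}

sorter = TopologicalSorter(deps)
sorter.prepare()

first = set(sorter.get_ready())          # tasks with no dependencies
sorter.done("create file")
after_create = set(sorter.get_ready())   # newly runnable once the file exists
sorter.done("read directory")
after_read = set(sorter.get_ready())     # newly runnable once the directory is read

print(first, after_create, after_read)
```

Unlike the barrier version, "write file" becomes runnable as soon as "create file" completes, without waiting for the directory read; no semaphores, threads, or events appear in the specification.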

Conclusion

• Threads still problematic
  • As a code structuring mechanism
  • High resource usage
• Events also problematic
  • Hard to code, but efficient
• SEDA and GOP address shortcomings
  • But neither can be said to have taken hold

Issues not discussed

Kernel vs. User-level implementation

Shared memory and protection tradeoffs

Problems seen in demanding applications that may launch enormous numbers of threads