
Post on 18-Jan-2018


1

Read-Copy Update

Paul E. McKenney
Linux Technology Center, IBM Beaverton
pmckenne@us.ibm.com, http://www.rdrop.com/users/paulmck

Jonathan Appavoo
Department of Electrical and Computer Engineering, University of Toronto
jonathan@eecg.toronto.edu

Andi Kleen
SuSE Labs
ak@suse.de

Orran Krieger
IBM T. J. Watson Research Center
okrieg@us.ibm.com, http://www.eecg.toronto.edu/~okrieg

Rusty Russell
RustCorp
rusty@rustcorp.com.au

Dipankar Sarma
Linux Technology Center, IBM India Software Lab
dipankar.sarma@in.ibm.com

Maneesh Soni
Linux Technology Center, IBM India Software Lab
smaneesh@in.ibm.com

Liao, Hsiao-Win

2

Outline
Introduction
Toy Example
Simple Infrastructure to Support RCU
Application


4

Traditional OS locking designs
  Very complex
  Poor concurrency
  Fail to take advantage of the event-driven nature of operating systems

5

Race Between Teardown and Use of Service

Events: code executed, interrupts taken, memory error-correction events

6

Read-Copy Update Handling Race

quiescent state


7

Read-copy update works best when
  an update can be divided into two phases
  common-case operations can proceed on stale data (e.g. continuing to handle operations by a module being unloaded)
  destructive updates are very infrequent

8

Implementations of Quiescent State
  DYNIX/ptx 2.1 (1993) and Rusty Russell's first wait_for_rcu() patch [Russell01a] simply execute on each CPU in turn.
  DYNIX/ptx 4.0 (1994) and Dipankar Sarma's RCU patch for Linux use the following as quiescent states: context switch, execution in the idle loop, execution in user mode, system call entry, trap from user mode, and CPU offline (this last for DYNIX/ptx only).

9

Implementations of Quiescent State
  Rusty Russell's second wait_for_rcu() patch [Russell01b] uses voluntary context switch as the sole quiescent state.
  Tornado's and K42's "generation" facility tracks the beginnings and ends of operations.
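The quiescent-state idea on these two slides can be illustrated with a toy model. The sketch below is not any of the implementations named above; it assumes one counter per simulated CPU, and the class and method names (QuiescentTracker, note_quiescent_state, grace_period_elapsed) are hypothetical. A grace period is treated as elapsed once every CPU has passed through at least one quiescent state since a snapshot was taken.

```python
class QuiescentTracker:
    """Toy model: one quiescent-state counter per simulated CPU (hypothetical)."""

    def __init__(self, n_cpus):
        self.counts = [0] * n_cpus

    def note_quiescent_state(self, cpu):
        # Would be called at a context switch, in the idle loop, on entry
        # to user mode, and so on, depending on the chosen quiescent states.
        self.counts[cpu] += 1

    def snapshot(self):
        # Taken at the moment a grace period starts.
        return list(self.counts)

    def grace_period_elapsed(self, snap):
        # True once every CPU has advanced its counter since the snapshot.
        return all(now > then for now, then in zip(self.counts, snap))


tracker = QuiescentTracker(n_cpus=3)
snap = tracker.snapshot()
tracker.note_quiescent_state(0)
tracker.note_quiescent_state(1)
before = tracker.grace_period_elapsed(snap)  # CPU 2 has not been quiescent yet
tracker.note_quiescent_state(2)
after = tracker.grace_period_elapsed(snap)
print(before, after)
```

The choice of which events count as quiescent states only changes where note_quiescent_state() would be called; the detection logic stays the same.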

10


12

Reference-count vs. read-copy search() and delete()

Read-copy functions
  avoid all cacheline bouncing for reading tasks
  can return references to deleted elements
  cannot hold a reference to elements across a voluntary context switch

13

Typical RCU update sequence
  Remove pointers to a data structure.
  Wait for all previous readers to complete their RCU read-side critical sections.
  At this point, there cannot be any readers that hold a reference to the data structure, so it may now safely be reclaimed.
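The three steps of the update sequence can be sketched on a singly linked list. This is a single-threaded toy model rather than kernel code: the Node class, the search() and unlink() helpers, and the reclaimed flag (standing in for actually freeing memory) are all assumptions made for illustration, and the grace period is simulated by the reader simply finishing.

```python
class Node:
    def __init__(self, key, nxt=None):
        self.key = key
        self.next = nxt
        self.reclaimed = False  # stands in for freeing memory in this toy model


def search(head, key):
    # Reader side: walks the list without taking any lock.
    node = head
    while node is not None:
        if node.key == key:
            return node
        node = node.next
    return None


def unlink(head, key):
    # Step 1: remove the pointer to the element. The element itself is
    # left intact, so a reader already holding it can keep traversing.
    prev, node = None, head
    while node is not None and node.key != key:
        prev, node = node, node.next
    if node is None:
        return head, None
    if prev is None:
        head = node.next
    else:
        prev.next = node.next
    return head, node


head = Node("A", Node("B", Node("C")))
stale = search(head, "B")          # a reader picks up B before the update
head, victim = unlink(head, "B")   # step 1: remove pointers to B
still_walkable = stale.next.key    # the stale reader can still reach C
stale = None                       # step 2: the pre-existing reader finishes
victim.reclaimed = True            # step 3: B may now be reclaimed

keys = []
node = head
while node is not None:
    keys.append(node.key)
    node = node.next
print(keys)
```

New readers that start after step 1 cannot find B at all, which is why only readers that began earlier need to be waited for.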

14

Read-Copy Deletion (delete B)

15

Figure: the first phase of the update

16

Read-Copy Deletion

Figure: first phase

17

Read-Copy Search

The task sees the table data

18

Read-Copy Deletion

Figure: second phase

19

Read-Copy Deletion


20

Read-Copy Deletion

21

Assumptions
  Read intensive
    the update fraction f < 1/|CPU|
  Grace period
    reading tasks can see stale data
  Requires that the modification be compatible with lock-free access
    linked-list insertion, deletion, and replacement are compatible


23

Simple Implementation
  wait_for_rcu()
    waits for a grace period to expire
  kfree_rcu()
    waits for a grace period before freeing a specified block of memory

24

Read-Copy Update Grace Period

Non-preemptible kernel execution
Quiescent-state execution

25

Simple Grace-Period Detection

26

Rusty Russell's wait_for_rcu() I

27

Rusty Russell's wait_for_rcu() II

28

Shortcomings
  1. Does not work in a preemptible kernel unless preemption is suppressed in all read-side critical sections
  2. Cannot be called from an interrupt handler
  3. Cannot be called while holding a spinlock or with interrupts disabled
  4. Relatively slow

29

Addressing the Shortcomings
  The K42 and Tornado implementations of RCU allow read-side critical sections to block as well as to be preempted: solves 1
  call_rcu(): solves 2, 3
  kfree_rcu(): solves 2, 3
  High-performance design for RCU: solves 2, 3, 4

30

K42 and Tornado implementations of RCU maintain two generation counters
  current generation
  non-current generation

Operations (next page)

31

Operations
  When an operation begins
    increment the current generation counter
    store a pointer to that counter in the task
  When the operation ends
    decrement the generation counter
  Periodically, the non-current generation is checked to see if it is zero
    when it is zero, reverse the current and non-current generations
    a token is handed from one CPU to the next
  When the token returns to a given CPU
    all operations across the entire system have terminated
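The two-counter scheme described on this slide can be sketched for a single CPU. This is a minimal model under stated assumptions, not the K42 or Tornado code: the Generations class and its begin(), end(), and try_swap() methods are hypothetical names, an index stands in for the "pointer to that counter", and the token passing between CPUs is left out.

```python
class Generations:
    """Toy single-CPU model of current and non-current generation counters."""

    def __init__(self):
        self.counters = [0, 0]
        self.current = 0  # index of the current generation

    def begin(self):
        # Operation begins: count it in the current generation and return
        # the index, which the task would store until the operation ends.
        idx = self.current
        self.counters[idx] += 1
        return idx

    def end(self, idx):
        # Operation ends: decrement the counter it was counted in.
        self.counters[idx] -= 1

    def try_swap(self):
        # Periodic check: if the non-current generation has drained to
        # zero, reverse the roles of the two generations.
        non_current = 1 - self.current
        if self.counters[non_current] == 0:
            self.current = non_current
            return True
        return False


g = Generations()
op1 = g.begin()              # counted in generation 0
first_swap = g.try_swap()    # generation 1 is empty, so the roles reverse
op2 = g.begin()              # counted in generation 1
blocked_swap = g.try_swap()  # generation 0 still holds op1
g.end(op1)
second_swap = g.try_swap()   # generation 0 has drained, roles reverse again
print(first_swap, blocked_swap, second_swap)
```

In this model, two successful swaps on a CPU imply that every operation that was running there before the first swap has ended; the token on the slide extends that guarantee across all CPUs.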

32

Non-Blocking Grace-Period Detection

Queues callbacks onto a list

Invokes all the pending callbacks after forcing a grace period
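The queue-then-invoke pattern can be sketched as follows. This is an illustration only: the CallbackQueue class and its invoke_pending() method are hypothetical, the call_rcu() method merely borrows the name used in these slides, and the grace period is represented by a caller-supplied function rather than real detection.

```python
class CallbackQueue:
    """Toy model of non-blocking deferral: queue now, invoke later."""

    def __init__(self):
        self.pending = []

    def call_rcu(self, func, arg):
        # Never blocks: only records the callback and its argument.
        self.pending.append((func, arg))

    def invoke_pending(self, force_grace_period):
        # Force one grace period, then run every callback queued so far.
        force_grace_period()
        batch, self.pending = self.pending, []
        for func, arg in batch:
            func(arg)
        return len(batch)


freed = []
grace_periods = []
queue = CallbackQueue()
queue.call_rcu(freed.append, "block-1")
queue.call_rcu(freed.append, "block-2")
freed_before = list(freed)  # nothing has been freed at queueing time
invoked = queue.invoke_pending(lambda: grace_periods.append("gp"))
print(invoked, freed, grace_periods)
```

Because queueing does no waiting, a function of this shape could be called from contexts where blocking is not allowed, and a single grace period covers the whole batch.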

33

High-Performance Design
  Defers frees of kmem_cache_alloc() memory
  Detects and identifies overly long lock-hold durations
  "Batching" grace-period-measurement requests
  Maintaining per-CPU request lists
  Providing a less-costly algorithm for measuring grace-period duration

34

Simple Deferred Free
  A simple implementation of a deferred-free function named kfree_rcu()
  Low performance

kfree_rcu() → wait_for_rcu()
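A toy model shows why building the deferred free directly on a blocking wait performs poorly. The function names follow the slide, but the bodies are assumptions: the grace-period wait is simulated by visiting each CPU in a list, and "freeing" just records an event.

```python
events = []  # records the order of operations in this toy model


def wait_for_rcu(cpus):
    # Toy stand-in for a blocking grace-period wait: visit each simulated
    # CPU in turn.
    for cpu in cpus:
        events.append(("ran_on", cpu))


def kfree_rcu(block, cpus):
    # Simple deferred free: block for a full grace period, then free.
    wait_for_rcu(cpus)
    events.append(("freed", block))


kfree_rcu("block-1", cpus=[0, 1])
kfree_rcu("block-2", cpus=[0, 1])
print(events)
```

Each call pays for its own complete grace period, so two frees cost two full passes over the CPUs.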


36

Application
  Distributed lock manager
  TCP/IP
  Storage-area network (SAN)
  Application regions manager (which is a workload-management subsystem)
  Process management
  LAN drivers

37

Thank you for listening
