Read-Copy Update

Paul E. McKenney, Linux Technology Center, IBM Beaverton, http://www.rdrop.com/users/paulmck
Jonathan Appavoo, Department of Electrical and Computer Engineering, University of Toronto
Andi Kleen, SuSE Labs
Orran Krieger, IBM T. J. Watson Research Center, http://www.eecg.toronto.edu/~okrieg
Rusty Russell, RustCorp
Dipankar Sarma, Linux Technology Center, IBM India Software Lab
Maneesh Soni, Linux Technology Center, IBM India Software Lab

Liao, Hsiao-Win




Outline

- Introduction
- Toy Example
- Simple Infrastructure to Support RCU
- Application



Traditional OS locking designs

- Very complex
- Poor concurrency
- Fail to take advantage of the event-driven nature of operating systems


Race Between Teardown and Use of Service

Code executed, interrupts taken, memory error-correction events


Read-Copy Update Handling Race

[Figure; surviving label: quiescent state]


Read-copy update works best when

- an update can be divided into two phases
- readers can proceed on stale data for common-case operations (e.g., continuing to handle operations by a module being unloaded)
- destructive updates are very infrequent


Implementations of Quiescent State

- DYNIX/ptx 2.1 (1993) and Rusty Russell's first wait_for_rcu() patch [Russell01a] simply execute on each CPU in turn.
- DYNIX/ptx 4.0 (1994) and Dipankar Sarma's RCU patch for Linux use context switch, execution in the idle loop, execution in user mode, system call entry, trap from user mode, and CPU offline (this last for DYNIX/ptx only) as the quiescent states.


Implementations of Quiescent State (continued)

- Rusty Russell's second wait_for_rcu() patch [Russell01b] uses voluntary context switch as the sole quiescent state.
- Tornado's and K42's "generation" facility tracks the beginnings and ends of operations.


Outline

- Introduction
- Toy Example
- Simple Infrastructure to Support RCU
- Application


Reference-Count vs. Read-Copy search() and delete()

- Read-copy functions avoid all cacheline bouncing for reading tasks
- Read-copy functions can return references to deleted elements
- Read-copy functions cannot hold a reference to elements across a voluntary context switch


Typical RCU update sequence

1. Remove pointers to a data structure.
2. Wait for all previous readers to complete their RCU read-side critical sections.
3. At this point, there cannot be any readers who hold a reference to the data structure, so it may now safely be reclaimed.
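
The update sequence above can be pictured with a toy, single-threaded Python model. This is only a sketch: `ToyRCU`, `read_lock`, `read_unlock`, and `grace_period_elapsed` are hypothetical names chosen for illustration, not kernel APIs, and readers are modeled as explicit tokens rather than real concurrent tasks.

```python
class ToyRCU:
    """Toy model: tracks readers currently inside read-side critical sections."""

    def __init__(self):
        self.active_readers = set()
        self.next_id = 0

    def read_lock(self):
        # Enter a read-side critical section; return a token for this reader.
        self.next_id += 1
        self.active_readers.add(self.next_id)
        return self.next_id

    def read_unlock(self, token):
        # Leave the read-side critical section.
        self.active_readers.discard(token)

    def grace_period_elapsed(self, snapshot):
        # True once every reader recorded in the snapshot has finished.
        return not (snapshot & self.active_readers)


rcu = ToyRCU()
shared = {"ptr": {"value": 42}}   # global pointer to a data structure
reclaimed = []

reader = rcu.read_lock()
seen = shared["ptr"]              # a reader picks up a reference

# Step 1: remove pointers to the data structure.
old = shared["ptr"]
shared["ptr"] = None

# Step 2: wait for all pre-existing readers to complete.
snapshot = set(rcu.active_readers)
can_reclaim_early = rcu.grace_period_elapsed(snapshot)   # reader still active
rcu.read_unlock(reader)
can_reclaim_late = rcu.grace_period_elapsed(snapshot)

# Step 3: no pre-existing reader holds a reference, so reclaim.
if can_reclaim_late:
    reclaimed.append(old)
```

In this model the reader keeps a usable reference (`seen`) after the pointer is removed, and reclamation is allowed only once that reader has left its critical section.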


Read-Copy Deletion (delete B)


The first phase of the update


Read-Copy Deletion (first phase)


Read-Copy Search

The task sees the table data


Read-Copy Deletion (second phase)


Read-Copy Deletion


Read-Copy Deletion
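
Since the deletion figures did not survive in this transcript, the following Python sketch illustrates the idea of deleting element B from a linked list while a reader may still hold a reference to it. It is a toy under stated assumptions: `Node`, `search`, `unlink`, and `keys` are hypothetical helpers, and everything runs in a single thread, so no real race is exercised.

```python
class Node:
    def __init__(self, key, nxt=None):
        self.key = key
        self.next = nxt


def search(head, key):
    # Lock-free search: follow next pointers without taking any lock.
    node = head
    while node is not None:
        if node.key == key:
            return node
        node = node.next
    return None


def unlink(head, key):
    # First phase of deletion: make the element unreachable for new searches.
    # The removed node keeps its own next pointer, so a reader that already
    # stands on it can still continue down the list.
    prev, node = None, head
    while node is not None and node.key != key:
        prev, node = node, node.next
    if node is None:
        return head, None
    if prev is None:
        head = node.next
    else:
        prev.next = node.next
    return head, node


def keys(head):
    # Collect the keys reachable from head, in list order.
    out = []
    node = head
    while node is not None:
        out.append(node.key)
        node = node.next
    return out


head = Node("A", Node("B", Node("C")))
stale = search(head, "B")            # a reader obtains a reference to B
head, removed = unlink(head, "B")    # the updater deletes B (first phase)
```

After `unlink`, new searches no longer find B, while the earlier reader still holds a valid, if stale, reference. The second phase, freeing B after a grace period, corresponds here to simply dropping the last reference to `removed`.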


Assumptions

- Read intensive
  - the update fraction f < 1/|CPU|
- Grace period
  - reading tasks can see stale data
  - requires that the modification be compatible with lock-free access
  - linked-list insertion, deletion, and replacement are compatible


Outline

- Introduction
- Toy Example
- Simple Infrastructure to Support RCU
- Application


Simple Implementation

- wait_for_rcu(): waits for a grace period to expire
- kfree_rcu(): waits for a grace period before freeing a specified block of memory


Read-Copy Update Grace Period

[Figure legend: non-preemptible kernel execution; quiescent-state execution]


Simple Grace-Period Detection
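
The content of this slide was lost in extraction. As a rough illustration of what simple grace-period detection involves, the sketch below assumes that each CPU bumps a counter whenever it passes through a quiescent state (such as a context switch), and that a grace period is over once every counter has advanced past a snapshot. `QuiescentTracker` and its methods are hypothetical names, not taken from the slides or from any kernel patch.

```python
class QuiescentTracker:
    def __init__(self, num_cpus):
        # One counter per simulated CPU, bumped at each quiescent state.
        self.counts = [0] * num_cpus

    def quiescent_state(self, cpu):
        # Called when the simulated CPU context-switches, idles, and so on.
        self.counts[cpu] += 1

    def start_grace_period(self):
        # Snapshot the counters; the grace period ends once all have advanced.
        return list(self.counts)

    def grace_period_done(self, snapshot):
        return all(now > then for now, then in zip(self.counts, snapshot))


tracker = QuiescentTracker(num_cpus=3)
snap = tracker.start_grace_period()
tracker.quiescent_state(0)
tracker.quiescent_state(1)
done_partial = tracker.grace_period_done(snap)   # CPU 2 has not yet passed one
tracker.quiescent_state(2)
done_full = tracker.grace_period_done(snap)
```

The grace period completes only after the slowest CPU reports a quiescent state, which is why a single long-running CPU can delay reclamation for everyone.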


Rusty Russell's wait_for_rcu() I


Rusty Russell's wait_for_rcu() II


Shortcomings

1. Does not work in a preemptible kernel unless preemption is suppressed in all read-side critical sections
2. Cannot be called from an interrupt handler
3. Cannot be called while holding a spinlock or with interrupts disabled
4. Relatively slow


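Addressing the shortcomings

- In the K42 and Tornado implementations of RCU, read-side critical sections can block as well as be preempted (addresses shortcoming 1, preemptible kernels)
- call_rcu() (addresses shortcomings 2 and 3, interrupt handlers and spinlocks)
- kfree_rcu() (addresses shortcomings 2 and 3)
- High-performance design for RCU (addresses shortcomings 2, 3, and 4)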


K42 and Tornado implementations of RCU

- Maintain two generation counters
  - current generation
  - non-current generation
- Operations (next slide)


Operation

- An operation begins
  - increment the current generation counter
  - store a pointer to that counter in the task
- The operation ends
  - decrement the generation counter
- Periodically, the non-current generation is checked to see if it is zero
  - if so, reverse the current and non-current generations
  - a token is handed from one CPU to the next
- The token returns to a given CPU
  - all operations across the entire system have terminated
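
The generation-counter scheme can be sketched as a toy Python model. This is a simplified reading of the slide, not the K42 or Tornado code: `GenerationCPU` and `TokenRing` are hypothetical names, each simulated CPU is assumed to own its own pair of counters, and the "pointer stored in the task" is represented by the index returned from `begin_operation`.

```python
class GenerationCPU:
    """Toy model of one CPU's pair of generation counters."""

    def __init__(self):
        self.counters = [0, 0]
        self.current = 0               # index of the current generation

    def begin_operation(self):
        # Increment the current counter and remember which one was used.
        gen = self.current
        self.counters[gen] += 1
        return gen

    def end_operation(self, gen):
        # Decrement the counter the operation started under.
        self.counters[gen] -= 1

    def try_swap(self):
        # Periodic check: if the non-current generation has drained to zero,
        # reverse the roles of the two counters.
        non_current = 1 - self.current
        if self.counters[non_current] == 0:
            self.current = non_current
            return True
        return False


class TokenRing:
    def __init__(self, cpus):
        self.cpus = cpus
        self.holder = 0                # index of the CPU holding the token
        self.laps = 0                  # completed trips around all CPUs

    def poll(self):
        # Only the token holder is checked; the token moves on after a swap.
        if self.cpus[self.holder].try_swap():
            self.holder = (self.holder + 1) % len(self.cpus)
            if self.holder == 0:
                self.laps += 1


cpus = [GenerationCPU(), GenerationCPU()]
ring = TokenRing(cpus)
op = cpus[1].begin_operation()         # long-running operation on CPU 1
for _ in range(4):
    ring.poll()
laps_while_running = ring.laps         # the second lap is blocked at CPU 1
stuck_holder = ring.holder
cpus[1].end_operation(op)
ring.poll()
laps_after = ring.laps
```

In this model the token completes its first lap freely, but the second lap stalls at CPU 1 until the operation that started before the first swap has ended.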


Non-Blocking Grace-Period Detection

- Queue callbacks onto a list
- Invoke all the pending callbacks after forcing a grace period
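
A callback-based interface of this kind might look roughly like the following Python toy. The class `CallbackQueue` and its bookkeeping are assumptions made for illustration; the method is named `call_rcu` only to echo the function mentioned in the slides, and this model waits for quiescent states to be reported rather than forcing them.

```python
class CallbackQueue:
    def __init__(self, num_cpus):
        self.num_cpus = num_cpus
        self.pending = []              # callbacks waiting for a grace period
        self.seen_quiescent = set()    # CPUs that have passed a quiescent state

    def call_rcu(self, func, arg):
        # Non-blocking: queue the callback and return immediately.
        # Restarting the wait is a conservative simplification that gives the
        # new callback a full grace period of its own.
        self.pending.append((func, arg))
        self.seen_quiescent.clear()

    def quiescent_state(self, cpu):
        self.seen_quiescent.add(cpu)
        if len(self.seen_quiescent) == self.num_cpus:
            # Grace period over: invoke everything queued so far.
            batch, self.pending = self.pending, []
            self.seen_quiescent.clear()
            for func, arg in batch:
                func(arg)


freed = []
q = CallbackQueue(num_cpus=2)
q.call_rcu(freed.append, "old_a")
q.quiescent_state(0)
q.call_rcu(freed.append, "old_b")      # arrives while a wait is in progress
q.quiescent_state(0)
before = list(freed)                   # nothing has been invoked yet
q.quiescent_state(1)
```

Because the caller never blocks, an interface like this can be used from contexts where waiting is not allowed, and both callbacks here end up sharing a single grace period.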


High-Performance Design

- Defers frees of kmem_cache_alloc() memory
- Detects and identifies overly long lock-hold durations
- "Batches" grace-period-measurement requests
- Maintains per-CPU request lists
- Provides a less costly algorithm for measuring grace-period duration


Simple Deferred Free

- A simple implementation of a deferred-free function named kfree_rcu()
- Low performance
- kfree_rcu() → wait_for_rcu()
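
To see why this simple form performs poorly, consider the toy Python model below. It assumes that `wait_for_rcu()` works by visiting each CPU in turn and that every `kfree_rcu()` call pays for its own full wait; `ToyKernel` and its fields are hypothetical and stand in for kernel state.

```python
class ToyKernel:
    def __init__(self, num_cpus):
        self.num_cpus = num_cpus
        self.visited = []      # order in which simulated CPUs were visited
        self.freed = []

    def wait_for_rcu(self):
        # Blocking wait: run on each CPU in turn. Reaching a CPU implies that
        # it has context-switched, that is, passed through a quiescent state.
        for cpu in range(self.num_cpus):
            self.visited.append(cpu)

    def kfree_rcu(self, block):
        # Simple deferred free: one full grace period per freed block.
        self.wait_for_rcu()
        self.freed.append(block)


k = ToyKernel(num_cpus=4)
k.kfree_rcu("block_a")
k.kfree_rcu("block_b")
grace_periods = len(k.visited) // k.num_cpus
```

Two frees cost two complete passes over all CPUs in this model, which is the kind of overhead that batching requests is meant to avoid.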


Outline

- Introduction
- Toy Example
- Simple Infrastructure to Support RCU
- Application


Application

- Distributed lock manager
- TCP/IP
- Storage-area network (SAN)
- Application regions manager (which is a workload-management subsystem)
- Process management
- LAN drivers


Thank you for listening