IBM T. J. Watson Research Center
Memory Management Issues in Non-Blocking Synchronization
Maged Michael IBM T J Watson Research Center
ISMM 2009
Outline
Non-blocking synchronization
Dynamic memory solves problems in non-blocking algorithms
Dynamic memory raises problems in non-blocking algorithms
Memory management solutions and tradeoffs
System Model
Shared memory
Scheduler
Memory access primitives: Read, Write, Compare-and-swap, ...
Threads
The Scheduler
The scheduler decides when and if to let a ready thread take a step
Bad decisions by the scheduler can lead to the indefinite prevention of active threads from making progress
The scheduler does not know all dependencies among threads
In some cases (e.g., real-time applications, signal handlers, OS kernels) this is unacceptable as it may lead to deadlock, livelock, or delay of high priority operations
The scheduler can make very bad decisions
Example: Deadlock in Signal Handling
A thread acquires a lock to operate on some shared data
The scheduler decides to interrupt the thread to deliver a signal
The signal handler runs
The signal handler needs to acquire the lock
The interrupted thread will not run until the signal handler completes
The signal handler will not complete until the interrupted thread releases the lock
DEADLOCK
NO LOCKS IN SIGNAL HANDLERS
Non-Blocking Progress Guarantees
Three levels of non-blocking guarantees
An operation is wait-free if, whenever a thread executing the operation takes a finite number of steps, the thread must have completed the operation, regardless of the actions/inaction of other threads.
An operation is lock-free if, whenever a thread executing the operation takes a finite number of steps, some thread must have completed the operation, regardless of the actions/inaction of other threads.
An operation is obstruction-free if, whenever a thread executing the operation takes a finite number of steps alone, the thread must have completed the operation, regardless of where the other threads stopped.
wait-free: no starvation
lock-free: no livelock
obstruction-free: no blocking
Non-blocking is a property of operations
Non-blocking progress is a property of an operation in an implementation of an abstract shared data type
If all operations in an implementation of an abstract shared data type are non-blocking, then the whole implementation is non-blocking
E.g., A lock-free hash table implementation of a shared set
E.g., The lookup operation in a hash table implementation of a shared set is wait-free, while the insert and remove operations are blocking.
Non-blocking synchronization is not about ...
Non-blocking progress is not about fairness
Fair ≠ Non-blocking
Non-blocking synchronization is not just about not using locks
No locks ≠ Non-blocking
Non-blocking synchronization is all about ...
Delay of any number of threads does not prevent active threads from making progress
Simple Non-Blocking Example
Lock-Free Counter
Structures
  X : integer
Operations
  Read() : integer
  FetchAndIncrement() : integer
CAS(X,expval,newval)
  atomically
    r := (X == expval)
    if r
      X := newval
    return r
Read()
  return X
FetchAndIncrement()
  do
    oldval := X
  until CAS(X,oldval,oldval+1)
  return oldval
Read is wait-free. Completes in one step.
FetchAndIncrement is lock-free. Whenever one loop iteration (two steps) is executed, some operation must have completed.
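The counter pseudocode maps fairly directly onto C++ atomics. The following is a minimal sketch, assuming `std::atomic<int>` and its `compare_exchange_weak` member stand in for the slide's CAS; the function names are taken from the slide and the global `X` is only for illustration.

```cpp
#include <atomic>
#include <cassert>

// Shared counter X from the slide.
std::atomic<int> X{0};

// Wait-free: a single atomic load.
int Read() { return X.load(); }

// Lock-free: a failed CAS means another thread's CAS changed X,
// so some operation completed. (compare_exchange_weak may also fail
// spuriously on some hardware; the loop simply retries.)
int FetchAndIncrement() {
    int oldval = X.load();
    // On failure, compare_exchange_weak reloads the current value into oldval.
    while (!X.compare_exchange_weak(oldval, oldval + 1)) {
    }
    return oldval;
}
```

In production code `X.fetch_add(1)` would normally be used instead; the explicit loop is kept here only to mirror the slide.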
Dynamic Memory and Non-Blocking Algorithms
Dynamic memory solves problems
Atomic access to large blocks
ABA problem
Dynamic memory causes problems
Persistent pointers
Memory reclamation problem
Non-blocking allocation and deallocation
ABA problem
Atomic Access to Multiple Words
Some algorithms need to operate atomically on multiple or large locations that exceed the size of HW atomic primitives
E.g., Wide CAS on X
  atomically
    ret := (X == u)
    if ret
      X := v
A common solution in non-blocking algorithms
Place multi-word data in a dynamic block
Updates replace the block
  ptr := P
  ret := (*ptr == u)
  if ret
    newb := new Block(v)
    ret := CAS(P,ptr,newb)
    delete ret ? ptr : newb
Solved one problem
Created more problems
– unsafe access (reading *ptr)
– allocation (new Block)
– ABA problem (CAS on P)
– unsafe reclamation and deallocation (delete)
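As an illustration of the block-replacement idea, here is a minimal C++ sketch of a wide CAS. It is not the slide's code: to sidestep the unsafe access and ABA problems it deliberately never frees a replaced block (it leaks), which is an assumption made only to keep the example safe. The `Wide` type, `same` helper and `wide_cas` name are hypothetical.

```cpp
#include <atomic>
#include <cassert>

// A value wider than one hardware word (hypothetical example type).
struct Wide { long a; long b; };

inline bool same(const Wide& x, const Wide& y) { return x.a == y.a && x.b == y.b; }

// P points to an immutable dynamic block holding the current value.
std::atomic<const Wide*> P{new Wide{0, 0}};

// Atomically: if the current value equals u, replace it with v.
bool wide_cas(const Wide& u, const Wide& v) {
    const Wide* ptr = P.load();
    if (!same(*ptr, u)) return false;        // safe only because blocks are never freed
    const Wide* newb = new Wide(v);          // allocation on every update
    if (P.compare_exchange_strong(ptr, newb)) {
        // The replaced block is intentionally leaked: freeing it here would
        // reintroduce the unsafe access and ABA problems listed above.
        return true;
    }
    delete newb;                             // never published, so safe to free
    return false;
}
```

Because no block address is ever reused, the CAS on `P` cannot be fooled, at the cost of unbounded memory growth; the reclamation methods discussed later in the talk are what remove that cost.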
The ABA Problem
Example
  ptr := P                      [1]
  ret := (*ptr == u)            [2]
  if ret
    newb := new Block(v)        [6]
    ret := CAS(P,ptr,newb)      [7]
    delete ret ? ptr : newb
1. Thread i reads A from P
2. Thread i reads u from *A
3. Thread j sets P to B
4. Thread j reuses block A to hold value z
5. Thread j sets P to A again
6. Thread i allocates block C to hold value v
7. Thread i checks that P is equal to A. CAS succeeds although *P == z != u
INCORRECT OUTCOME
Problem: CAS cannot tell if P changed or not
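The interleaving can be replayed deterministically in a single thread. The sketch below is only an illustration: the steps of threads i and j are executed in order inside one function, block reuse is simulated by overwriting the block's value, and the integer values chosen for u, w, z and v are arbitrary assumptions.

```cpp
#include <atomic>
#include <cassert>

struct Block { int value; };

struct Outcome {
    bool cas_succeeded;   // did thread i's CAS succeed?
    int value_at_cas;     // what block A actually held at that moment
};

Outcome replay_aba() {
    const int u = 1, w = 2, z = 3, v = 4;
    Block A{u}, B{w}, C{0};
    std::atomic<Block*> P{&A};

    Block* ptr = P.load();                 // 1. thread i reads A from P
    bool ret = (ptr->value == u);          // 2. thread i reads u from *A
    P.store(&B);                           // 3. thread j sets P to B
    A.value = z;                           // 4. thread j reuses block A to hold z
    P.store(&A);                           // 5. thread j sets P to A again
    C.value = v;                           // 6. thread i prepares block C holding v
    Block* expected = ptr;
    bool swapped = ret && P.compare_exchange_strong(expected, &C);  // 7. CAS
    return {swapped, A.value};
}
```

The CAS compares only the pointer, so it succeeds even though the value it was meant to replace is no longer u.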
The ABA Problem
1. A thread i reads a value A from a shared variable X
2. Other threads change X to a different value B and then back to A again
3. Thread i checks X using a primitive that cannot tell if X changed, finds X equal to A, and acts as if X never changed
Primitives susceptible to the ABA problem include read and variants of CAS
This interleaving of events is a necessary but not sufficient condition for the ABA problem. In some cases, the effect is benign.
LIFO Linked List: Classic ABA Example
Introduced in IBM System 370 documentation in the 1970s
Pop
  do
    first := Anchor               [1]
    next := *first                [2]
  until CAS(Anchor,first,next)    [5]
  return first
1. Thread i reads A from Anchor
2. Thread i reads B from *A
3. Thread j pops A and B
4. Thread j pushes A back
5. Thread i checks that Anchor is equal to A, sets Anchor to B
The list is corrupted
[Figure: Anchor pointing to a list of nodes A, B, C]
Classic Solution: ABA Tags
Introduced in IBM System 370 documentation in 1983
Pack a tag with the shared variable. Increment tag upon every pop. Use double-width primitives
Pop
  do
    [first,tag] := Anchor                        [1]
    next := *first                               [2]
  until CASD(Anchor,[first,tag],[next,tag+1])    [5]
  return first
1. Thread i reads [A,tag] from Anchor
2. Thread i reads B from *A
3. Thread j pops A and B
4. Thread j pushes A back, sets Anchor to [A,tag+2]
5. Thread i finds Anchor != [A,tag] and CAS fails as it should
ABA problem prevented
[Figure: Anchor pointing to a list of nodes A, B, C; the Anchor tag goes from 100 to 102]
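A tagged Anchor can be sketched without a double-width CAS if nodes are referred to by array index rather than by pointer, so that index and tag fit in one 64-bit word. That packing is an assumption of this sketch, not the slide's CASD; `pack`, `index_of`, `tag_of`, the three-node pool and `stale_cas_succeeds` are hypothetical names used to replay the five steps above.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kNull = 0xFFFFFFFFu;   // "null" node index

struct Node { std::atomic<std::uint32_t> next{kNull}; };
Node nodes[3];                                 // node 0 = A, 1 = B, 2 = C

constexpr std::uint64_t pack(std::uint32_t index, std::uint32_t tag) {
    return (static_cast<std::uint64_t>(tag) << 32) | index;
}
constexpr std::uint32_t index_of(std::uint64_t a) { return static_cast<std::uint32_t>(a); }
constexpr std::uint32_t tag_of(std::uint64_t a) { return static_cast<std::uint32_t>(a >> 32); }

std::atomic<std::uint64_t> Anchor{pack(kNull, 100)};   // [empty, tag 100]

void push(std::uint32_t n) {                   // push leaves the tag unchanged
    std::uint64_t old = Anchor.load();
    do {
        nodes[n].next.store(index_of(old));
    } while (!Anchor.compare_exchange_weak(old, pack(n, tag_of(old))));
}

std::uint32_t pop() {                          // every pop increments the tag
    std::uint64_t old = Anchor.load();
    for (;;) {
        std::uint32_t first = index_of(old);
        if (first == kNull) return kNull;
        std::uint32_t next = nodes[first].next.load();
        if (Anchor.compare_exchange_weak(old, pack(next, tag_of(old) + 1))) return first;
    }
}

// Replays steps 1-5 in one thread; returns whether thread i's stale CAS succeeds.
bool stale_cas_succeeds() {
    push(2); push(1); push(0);                               // list: A -> B -> C
    std::uint64_t seen = Anchor.load();                      // 1. i reads [A,tag]
    std::uint32_t next = nodes[index_of(seen)].next.load();  // 2. i reads B from *A
    pop(); pop();                                            // 3. j pops A and B
    push(0);                                                 // 4. j pushes A back
    return Anchor.compare_exchange_strong(seen, pack(next, tag_of(seen) + 1));  // 5.
}
```

The stale CAS fails because the Anchor now carries tag 102 rather than 100, even though it again refers to A.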
Pros and Cons of ABA Tags
Pros
– Wait-free
– Low time and space overheads
Cons
– Not portable: Requires wide primitives when packed with a full word.
– Complicates/prevents reclamation of dynamic memory
– A theoretical chance of exact wraparound if tag size is exceeded
ABA-Immune Primitives
LoadLinked (LL), Validate (VL), StoreConditional (SC)
Inherently immune to the ABA problem
Only partially supported on real architectures
ABA solutions are often represented as LL/SC/VL implementations using practical primitives
LL(X) : value
  atomically return X
VL(X) : boolean
  atomically return X not written by others since last LL
SC(X,v) : boolean
  atomically
    r := VL(X)
    if (r) X := v
    return r
Pop
  do
    first := LL(Anchor)
    next := *first
  until SC(Anchor,next)
  return first
Benign ABA Cases
Example: Between the read of X and a successful CAS, the value of X might have changed and returned back to its old value, but the outcome is still correct
AtomicAdd(X,v)
  do
    old := X
  until CAS(X,old,old+v)
Another example is Push in a LIFO list
Push(block)
  do
    first := Anchor.ptr
    *block := first
  until CAS(Anchor.ptr,first,block)
LL/SC/VL are unnecessarily strong as they prevent benign cases
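Both benign cases can be written with a plain single-word CAS. The following is a minimal sketch, assuming `std::atomic` for X and Anchor; the `Block` layout is a hypothetical stand-in for the list node.

```cpp
#include <atomic>
#include <cassert>

struct Block { int value; Block* next; };

std::atomic<int> X{0};
std::atomic<Block*> Anchor{nullptr};

// If X went old -> other -> old before the CAS, adding v to old is still correct.
void AtomicAdd(int v) {
    int old = X.load();
    while (!X.compare_exchange_weak(old, old + v)) {
    }
}

// If Anchor went first -> other -> first before the CAS, block->next still
// equals the current top of the list, so the push is still correct.
void Push(Block* block) {
    Block* first = Anchor.load();
    do {
        block->next = first;
    } while (!Anchor.compare_exchange_weak(first, block));
}
```

Neither operation depends on X or Anchor having been untouched, only on the value being the same at the moment of the CAS, which is exactly what CAS checks.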
The Memory Reclamation Problem
Example
  ptr := P                      [1]
  ret := (*ptr == u)            [3]
  if ret
    newb := new Block(v)
    ret := CAS(P,ptr,newb)
    delete ret ? ptr : newb
1. Thread i reads pointer value A from P
2. Thread j sets P to B and frees A to OS
3. Thread i accesses free memory
ACCESS VIOLATION
The Memory Reclamation Problem
A thread i reads a pointer to a dynamic memory location
Another thread j removes the block and frees it
Thread i dereferences the pointer to access the freed block
– Thread i might read/write unmapped memory: access violation
– Thread i might read unrelated data from the recycled block: return incorrect result
– Thread i might write into the recycled node: corrupt some shared structure
How can dynamic memory blocks removed from non-blocking structures be reclaimed while guaranteeing that no thread will access the contents of free blocks?
Memory Reclamation and the ABA Problem
Two different but related problems
The ABA problem can occur even when no dynamic memory is used at all
– E.g., array-based structures
Memory reclamation is all about dynamic memory
No dynamic memory use implies no memory reclamation problem
No dynamic memory use does not imply no ABA problem
Solving the memory reclamation problem often prevents some but not all cases of the ABA problem
Complete ABA solutions can be constructed by using memory reclamation solutions
How does GC help?
Completely solves the memory reclamation problem
Prevents the ABA problem if
– The ABA problem only involves pointers to dynamic blocks
– Once a dynamic block is removed, it is not reinserted (in the same structure) before going through GC
– The contents of a dynamic block are never changed while it is globally reachable
Other ABA cases can use an extra level of indirection to be preventable by GC
Pop correct under GC
  do
    first := Anchor
    next := *first
  until CAS(Anchor,first,next)
  return first
[Figure: block lifecycle through allocated, inserted, removed, reclaimed; labels "may be reinserted" and "always reclaimed before reuse"]
Memory Reclamation Approaches
– Epoch-based
– Reference counting
– Hazard pointers
Epoch-Based Solutions
– E.g., RCU (read-copy-update), heavily used in the Linux kernel
– Depend on the notion of quiescence points, where a thread is guaranteed not to hold references to removable memory blocks
– Typically use per-thread timestamps
– A removed block is reclaimed only after each thread (that could have had access to it) has gone through at least one quiescence point after the block was removed
Pros:
– Fast reading (no time overhead per dereference)
– No reader interference. No writer starvation by readers.
Cons:
– In user level, either blocking or can result in an unbounded number of not-yet-reclaimed removed blocks
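The quiescence rule can be sketched with a global epoch counter and one timestamp per thread. This is a simplified illustration, not RCU's implementation: `EpochDomain`, its method names, the fixed thread limit, and the convention that a timestamp of 0 means "thread not participating" are all assumptions of the sketch.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

constexpr int kMaxThreads = 4;   // hypothetical fixed thread limit

struct EpochDomain {
    std::atomic<std::uint64_t> global{1};
    std::atomic<std::uint64_t> local[kMaxThreads];   // 0 = thread not participating

    EpochDomain() {
        for (auto& l : local) l.store(0);
    }

    // Thread tid announces a quiescence point: it holds no references
    // to removable blocks right now.
    void quiescent(int tid) { local[tid].store(global.load()); }

    // Call after unlinking a block. Returns the stamp to keep with the block
    // and advances the global epoch.
    std::uint64_t retire_stamp() { return global.fetch_add(1); }

    // A block is reclaimable once every participating thread has announced
    // a quiescence point after the block was removed.
    bool safe_to_reclaim(std::uint64_t stamp) const {
        for (const auto& l : local) {
            std::uint64_t seen = l.load();
            if (seen != 0 && seen <= stamp) return false;
        }
        return true;
    }
};
```

Readers pay nothing per dereference, but a single thread that never reaches a quiescence point keeps `safe_to_reclaim` false forever, which is the unbounded-memory (or blocking) downside listed under Cons.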
Per-Block Reference Counting
Threads increment or decrement a per-block reference counter whenever they create or destroy references to the block
Pros:
– Lock-free
– O(n) bound on not-yet-reclaimed removed blocks
Cons:
– Reader-reader contention
– Writer starvation by readers possible
– To reclaim blocks for arbitrary reuse, requires either
  • DCAS (CAS on two locations), or
  • Extra level of indirection and extra space per pointer
Hazard Pointers
A hazard pointer is a single-writer multi-reader pointer
Each hazard pointer has one owner (that can write to it)
By setting a hazard pointer to the address of a dynamic block, the owner thread is telling other threads: “if any of you remove this block after the last time I set this hazard pointer to this block, don’t reclaim this block until I change my hazard pointer”
Pop
  do
    do
      first := Anchor
      *myHP := first
    until Anchor == first
    next := *first
  until CAS(Anchor,first,next)
  *myHP := null
  return first
As long as *myHP remains equal to first
– safe access: first will not be freed
– no ABA: first will not be inserted
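The Pop pseudocode can be sketched in C++ as follows. Assumptions of this sketch: one hazard pointer per thread kept in a fixed global array indexed by a caller-supplied thread id (`hazard`, `kMaxThreads` and the `tid` parameter are hypothetical), default sequentially consistent atomics, and an added empty-list check that the slide omits.

```cpp
#include <atomic>
#include <cassert>

struct Node { int value; Node* next; };

constexpr int kMaxThreads = 4;            // hypothetical fixed thread limit
std::atomic<Node*> Anchor{nullptr};
std::atomic<Node*> hazard[kMaxThreads];   // static storage, so all start out null

Node* pop(int tid) {
    std::atomic<Node*>& myHP = hazard[tid];   // only thread tid writes this slot
    Node* first;
    for (;;) {
        do {
            first = Anchor.load();
            myHP.store(first);                // announce intent to access *first
        } while (Anchor.load() != first);     // re-validate after announcing
        if (first == nullptr) return nullptr; // empty list; myHP is already null
        Node* next = first->next;             // safe: first cannot be reclaimed now
        if (Anchor.compare_exchange_strong(first, next)) break;
    }
    myHP.store(nullptr);                      // release the protection
    return first;
}
```

The re-validation of Anchor after publishing the hazard pointer is what guarantees that any thread removing `first` later will see the hazard pointer during its scan.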
Reclaiming Blocks under Hazard Pointers
After accumulating a number of removed nodes
1. Read active hazard pointers. Keep private copy of non-null values
  • Private copy can be arranged in an efficient search structure, e.g., hash table with constant expected lookup time
2. For each removed block, do a lookup in the private structure
  • Found? Keep block for next scan of hazard pointers
  • Not found? It is safe to reclaim the block
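The two scan steps can be sketched as below, assuming a fixed global array of hazard pointers and a per-thread list of removed blocks; `hazards`, `kMaxHazards` and `scan` are hypothetical names, and `std::unordered_set` plays the role of the private hash table.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

struct Block { int value; };

constexpr int kMaxHazards = 8;            // hypothetical number of hazard pointers
std::atomic<Block*> hazards[kMaxHazards]; // static storage, so all start out null

// Frees every removed block that no hazard pointer protects.
// Protected blocks stay in `removed` for the next scan.
// Returns the number of blocks reclaimed.
std::size_t scan(std::vector<Block*>& removed) {
    // Step 1: private snapshot of the non-null hazard pointers.
    std::unordered_set<Block*> snapshot;
    for (auto& hp : hazards) {
        Block* p = hp.load();
        if (p != nullptr) snapshot.insert(p);
    }
    // Step 2: look up each removed block in the snapshot.
    std::vector<Block*> keep;
    std::size_t reclaimed = 0;
    for (Block* b : removed) {
        if (snapshot.count(b) != 0) {
            keep.push_back(b);            // found: keep for the next scan
        } else {
            delete b;                     // not found: safe to reclaim
            ++reclaimed;
        }
    }
    removed.swap(keep);
    return reclaimed;
}
```

Each lookup takes constant expected time, so a scan over a batch of removed blocks costs constant expected time per block once the batch is large relative to the number of hazard pointers.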
Hazard Pointers
Pros:
– Wait-free
– No atomic instructions needed
  • Even reads and writes to hazard pointers can be nonatomic
– Constant expected time per reclaimed block
– No reader interference, and no writer starvation
Cons:
– Worst case O(m·n) not-yet-reclaimed removed blocks
  • O(m) bound possible, but at the cost of O(n) time per reclaimed block
m is the number of active removing threads (readers of hazard pointers)
n is the max. number of active traversing threads (writers of hazard pointers)
The Persistent Pointers Problem
Some non-blocking algorithms require some pointers in removed blocks to retain their values (as long as there are direct or indirect references to the blocks)
This is done for simplicity
But it can lead to unbounded memory use
Example:
– Simple linked list traversal
– But pointers in removed blocks cannot be nullified
– Unbounded memory
Avoiding Persistent Pointers
Just don’t use persistent pointers in algorithms
Algorithms should be designed such that pointers in removed blocks are immediately nullifiable
But traversal becomes a bit more complicated
– Double-check that previous node still points to the current one before moving on to the next
Persistent Pointers and Memory Reclamation
Algorithms with persistent pointers are limited in the memory reclamation solutions/approaches that they can use
– The restricted reuse (no reclamation) approach? NO
  • No, because the approach implies the possibility of immediate reuse of removed blocks.
– GC/reference counting/epoch-based solutions? YES
  • Yes, because these methods do not reclaim blocks that are indirectly reachable from a private reference.
  • But this same feature can lead to unbounded memory use with persistent pointers
– Hazard pointers? NO in general
  • No, because hazard pointers allow the reclamation of blocks that are indirectly reachable from private references
Dynamic Memory Allocation and Deallocation
Non-blocking algorithms that use dynamic memory need a non-blocking allocator to manage the reuse of reclaimed blocks
The key challenge in building a non-blocking allocator is the capability to coalesce free blocks for arbitrary reuse or to be returned to the OS
High-Level Design of A Non-Blocking Allocator
Use coalescing units (superblocks) rather than arbitrary coalescing
Keep track of each superblock’s state to detect when its blocks become fully free.
– Use a separate descriptor to avoid memory reclamation problems.
Manage the free blocks in a superblock as a linked list
Manage both the free blocks list and the superblock state together atomically
[Figure: a superblock and its separate descriptor holding the state]
Allocation
First try a fast path of allocation from the active superblock (of the appropriate heap)
If there is no active superblock, then try to find a partially allocated superblock to make it active
If not, then allocate a new superblock of an appropriate size, divide it, and make it the active superblock after taking a block
Malloc (common case)
Identify heap based on requested block size and thread id
1. Read header
2. Read descriptor packed state
3. Recheck header
4. Read next pointer of first block
5. CAS changes to packed state
Done
[Figure: heap header with a pointer to the descriptor of the active superblock; the descriptor's packed state (head, count, state, ABA tag) goes from 6 ACTIVE to 5 ACTIVE; superblock of blocks 0 to 7 with blocks 0 and 2 allocated; label "5 new block"]
Deallocation
Push the freed block back into its superblock
If the superblock was fully allocated, now it becomes partially allocated and needs to be added to the set of partially allocated superblocks
If the superblock was partially allocated and now fully free, then remove it from the set of partially allocated superblocks and coalesce it
Free (common case)
The block header points to the descriptor of the original superblock
1. Read descriptor packed state
2. Set next pointer of freed block
3. CAS changes to packed state
Done
[Figure: heap header, descriptor and active superblock of blocks 0 to 7; blocks 0 and 2 are allocated and block 5 is to be freed; the packed state (labels "head", "unreserved", "count", "state") goes from 5 ACTIVE to 6 ACTIVE; label "5 free"]
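The common-case malloc and free paths both end in a single CAS on the descriptor's packed state. The sketch below illustrates only that part, under several assumptions that are not from the slides: the heap header steps and the ACTIVE/other state field are omitted, blocks are identified by index, the free-list links live in a side array instead of inside the blocks, and the field widths (8-bit head, 8-bit count, 16-bit tag) are arbitrary.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kBlocks = 8;     // blocks per superblock (as in the figure)
constexpr std::uint32_t kNone = 0xFF;    // end of the free list

// Packed state: head (bits 0-7) | count of free blocks (8-15) | ABA tag (16-31).
constexpr std::uint32_t pack(std::uint32_t head, std::uint32_t count, std::uint32_t tag) {
    return (head & 0xFF) | ((count & 0xFF) << 8) | ((tag & 0xFFFF) << 16);
}
constexpr std::uint32_t head_of(std::uint32_t s)  { return s & 0xFF; }
constexpr std::uint32_t count_of(std::uint32_t s) { return (s >> 8) & 0xFF; }
constexpr std::uint32_t tag_of(std::uint32_t s)   { return s >> 16; }

struct Descriptor {
    std::atomic<std::uint32_t> state{pack(0, kBlocks, 0)};
    std::atomic<std::uint32_t> next[kBlocks];   // stand-in for the link stored in each free block
    Descriptor() {
        for (std::uint32_t i = 0; i < kBlocks; ++i)
            next[i].store(i + 1 < kBlocks ? i + 1 : kNone);
    }
};

// Malloc, common case: read packed state, read next of first block, CAS.
int alloc_block(Descriptor& d) {
    std::uint32_t old = d.state.load();
    for (;;) {
        if (count_of(old) == 0) return -1;               // no free block here
        std::uint32_t head = head_of(old);
        std::uint32_t next = d.next[head].load();
        std::uint32_t desired = pack(next, count_of(old) - 1, tag_of(old) + 1);
        if (d.state.compare_exchange_weak(old, desired)) return static_cast<int>(head);
    }
}

// Free, common case: read packed state, set next of freed block, CAS.
void free_block(Descriptor& d, std::uint32_t block) {
    std::uint32_t old = d.state.load();
    for (;;) {
        d.next[block].store(head_of(old));
        std::uint32_t desired = pack(block, count_of(old) + 1, tag_of(old));
        if (d.state.compare_exchange_weak(old, desired)) return;
    }
}
```

Keeping head, count and tag in one word is what lets a single CAS update the free list and the superblock bookkeeping together, with the tag guarding the pop against ABA.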
Superblock Lifecycle
States
– ACTIVE: Active, any count
– BUSY: not Active, count = 0
– PARTIAL: not Active, 0 < count < total
– FREE: not Active, count = total
Transitions (figure labels): New superblock; Taking the last block; Freeing the first block; Freeing the last block; No Active superblock; Unmap or reuse arbitrarily
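The state conditions can be written as a small classification function. This sketch assumes that count is the number of free blocks in the superblock (so count = 0 corresponds to BUSY and count = total to FREE), which is inferred from the malloc and free figures rather than stated on this slide; `classify` is a hypothetical helper.

```cpp
#include <cassert>

enum class State { ACTIVE, BUSY, PARTIAL, FREE };

// count = number of free blocks, total = number of blocks in the superblock.
State classify(bool active, unsigned count, unsigned total) {
    if (active) return State::ACTIVE;          // Active, any count
    if (count == 0) return State::BUSY;        // every block is allocated
    if (count == total) return State::FREE;    // may be unmapped or reused arbitrarily
    return State::PARTIAL;                     // 0 < count < total
}
```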
Dealing with Memory Management in Non-Blocking Algorithms
First, abstract away memory management problems to focus on the core algorithm
– Memory reclamation: Assume perfect GC but with explicit deallocation
– ABA: Think in terms of ABA just not happening rather than LL/SC/VL
But, avoid abstractions that limit the memory management solutions or hide problems
After designing the core algorithm under these assumptions, the options for dealing with memory management remain open and it is easier to weigh the trade-offs among the solutions
Consider ABA and memory reclamation solutions together
Non-Blocking GC and Its Challenges
Can we build a pure user-level non-blocking GC without special scheduler support?
Yes. One can use memory reclamation methods as a foundation. But it will be slow
The biggest challenge for a non-blocking GC is for the collector to find out the private references of the mutators at any arbitrary point, and to do so efficiently
Non-blocking memory reclamation methods add per-reference overheads
The cost of adding these overheads to basically every load and store that may create or destroy a private reference may be prohibitively high
Concluding Remarks
Memory management solves problems and creates problems in the design of non-blocking algorithms
There were many advances in non-blocking memory management in this decade but there is space for more
Non-blocking synchronization is intertwined with memory management
The memory reclamation and ABA problems occur under blocking optimistic concurrency
THANK YOU