CAFÉ: Scalable Task Pool with Adjustable Fairness and Contention
CAFÉ: Scalable Task Pool with Adjustable Fairness and Contention
Dmitry Basin, Rui Fan, Idit Keidar, Ofer Kiselov, Dmitri Perelman
Technion, Israel Institute of Technology
Task Pools
Exist in most server applications:
Web servers, e.g., building block of SEDA architecture
Handling asynchronous requests
Ubiquitous programming pattern for parallel programs
Scalability is essential!
Task Pool
[Figure: producers and consumers access a task pool in shared memory]
Typical Implementation: FIFO Queue
Has an inherent scalability problem
[Figure: contention points in shared memory]
Do we really need FIFO? In many cases, no!
We would like to:
Relax the requirement
Control the degree of relaxation
CAFÉ: Contention and Fairness Explorer
Ordered list of scalable bounded non-FIFO pools (TreeContainers)
TreeContainer size controls the contention-fairness trade-off
[Figure: a list of TreeContainers in the Java VM, with a "garbage collected" label. Scale over tree height: height 0 is pure FIFO (more fairness, more contention); taller trees give less fairness and less contention]
TreeContainer (TC) Specification
Bounded container
A put operation can fail if no free space is found
A get operation returns a task, or null if the TC is empty
Randomized algorithms for put and get operations
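The contract on this slide can be written down as a small Java interface. This is only a sketch: the names `BoundedPool` and `ArrayBoundedPool` are hypothetical, and the toy implementation below uses random probing over a flat array of slots. It illustrates the put/get contract (put may fail, get may return null), not the TreeContainer algorithm, and it makes no linearizability claim.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical contract mirroring the slide: bounded, put may fail, get may return null.
interface BoundedPool<T> {
    boolean put(T task); // false if no free space was found
    T get();             // a task, or null if no task was found
}

// Toy stand-in (assumption, NOT the TreeContainer): a fixed array of slots,
// probed from a random starting index. A null slot means "free".
class ArrayBoundedPool<T> implements BoundedPool<T> {
    private final AtomicReferenceArray<T> slots;

    ArrayBoundedPool(int capacity) {
        if (capacity <= 0) throw new IllegalArgumentException("capacity must be positive");
        slots = new AtomicReferenceArray<T>(capacity);
    }

    public boolean put(T task) {
        int n = slots.length();
        int start = ThreadLocalRandom.current().nextInt(n);
        for (int i = 0; i < n; i++) {
            // CAS from null to task claims a free slot atomically.
            if (slots.compareAndSet((start + i) % n, null, task)) return true;
        }
        return false; // every probed slot was taken
    }

    public T get() {
        int n = slots.length();
        int start = ThreadLocalRandom.current().nextInt(n);
        for (int i = 0; i < n; i++) {
            int idx = (start + i) % n;
            T t = slots.get(idx);
            // CAS from t back to null removes the task; a failed CAS means another thread took it.
            if (t != null && slots.compareAndSet(idx, t, null)) return t;
        }
        return null; // no task found during this scan
    }
}
```

With capacity 2, two puts succeed and a third returns false; after two gets, a third get returns null.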
TreeContainer Data Structure
Complete binary tree
Node states (figure legend):
Free node
Used node without a task
Occupied node containing a task
Figure annotations: right sub-tree doesn't contain tasks; left sub-tree has tasks
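A minimal sketch of how such a tree could be laid out in Java, assuming one array per level (a later slide states that every level is implemented by an array of nodes). The enum constant names, the class name `TreeLayoutSketch`, and the convention that a tree of height h has levels 0 to h are assumptions made for illustration; the routing meta-data is left out.

```java
import java.util.Arrays;

// The three node states from the slide legend (constant names are illustrative).
enum NodeState { FREE, USED_WITHOUT_TASK, OCCUPIED }

class TreeLayoutSketch {
    // levels[d] holds the 2^d nodes of level d; level 0 is the root.
    final NodeState[][] levels;

    TreeLayoutSketch(int height) {
        levels = new NodeState[height + 1][];
        for (int d = 0; d <= height; d++) {
            levels[d] = new NodeState[1 << d];
            Arrays.fill(levels[d], NodeState.FREE);
        }
    }

    // Index arithmetic between adjacent levels of a complete binary tree.
    static int parent(int i)     { return i / 2; }     // index in level d-1
    static int leftChild(int i)  { return 2 * i; }     // index in level d+1
    static int rightChild(int i) { return 2 * i + 1; } // index in level d+1

    int nodeCount() {
        int n = 0;
        for (NodeState[] level : levels) n += level.length;
        return n;
    }
}
```

Under this convention, a tree with levels 0 to 3 has 1 + 2 + 4 + 8 = 15 nodes.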
Get/Put Operations
Step 1: find the target node
By navigating the tree
Step 2: perform put/get on that node
Need to handle races
Step 3: update routes
Update meta-data on the path to the node (tricky!)
Get() Operation
get(): start from the root
Step 1: navigate to a task by a random walk on the arrows graph
Step 2: extract the task
Step 3: update routes up to the root
Updates close to the root are rare
[Figure: TreeContainer; figure label: CAS]
Put() Operation
put(task):
Every level of the tree is implemented by an array of nodes
[Figure: TreeContainer with levels 0 to 3; random nodes are probed level by level: three are occupied, then a free node (√) is found]
Put() Operation
put(task):
Finished Step 1: found a free node
Go to the highest free predecessor
Step 2: occupy the free node
CAS operation
[Figure: TreeContainer with levels 0 to 3; the random node at level 3 is free (√)]
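Steps 1 and 2 of put() can be sketched roughly as follows. This is one possible reading of the slides, not the paper's algorithm: the class name `PutSketch` is hypothetical, "highest free predecessor" is interpreted as the free ancestor closest to the root, the "used node without a task" state and the route updates of Step 3 are omitted, and no linearizability or wait-freedom claim is made.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicReferenceArray;

class PutSketch<T> {
    // levels.get(d) holds the 2^d nodes of level d; a null slot means "free".
    private final List<AtomicReferenceArray<T>> levels = new ArrayList<AtomicReferenceArray<T>>();

    PutSketch(int height) {
        for (int d = 0; d <= height; d++) levels.add(new AtomicReferenceArray<T>(1 << d));
    }

    boolean put(T task) {
        for (int d = 0; d < levels.size(); d++) {
            // Step 1: probe one random node per level until a free one is found.
            int i = ThreadLocalRandom.current().nextInt(1 << d);
            if (levels.get(d).get(i) != null) continue; // occupied: try the next level

            // Climb towards the root; the last free ancestor seen is the highest one.
            int targetLevel = d, targetIndex = i;
            for (int ad = d - 1, ai = i / 2; ad >= 0; ad--, ai /= 2) {
                if (levels.get(ad).get(ai) == null) { targetLevel = ad; targetIndex = ai; }
            }

            // Step 2: occupy the chosen node with a CAS; on a lost race, move on.
            if (levels.get(targetLevel).compareAndSet(targetIndex, null, task)) return true;
        }
        return false; // no free space found by this probe sequence
    }

    // Number of occupied nodes (for inspection only).
    int size() {
        int n = 0;
        for (AtomicReferenceArray<T> level : levels)
            for (int i = 0; i < level.length(); i++)
                if (level.get(i) != null) n++;
        return n;
    }
}
```

In this sketch a put can fail even when free nodes remain, because each call probes only one random node per level. That is consistent with the specification's "a put operation can fail if no free space is found", but the actual failure condition is defined in the paper.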
Put() Operation
put(): returns true
Step 3: update routes
Close to the root, updates are rare
[Figure: TreeContainer]
Races
Concurrency issues are not trivial :)
Challenge:
guarantee linearizability
avoid updating all the meta-data up to the root upon each operation
See the paper
TreeContainer Properties
Put/Get operations are:
linearizable
wait-free
Under the worst-case thread scheduling:
Good step complexity of puts
When N nodes are occupied: O(log^2 N) w.h.p.
Does not depend on TC size
Good tree density (arbitrarily close to 2^h w.h.p.)
CAFÉ Data Structures
[Figure: a list of TCs, with GT and PT pointing into the list]
CAFÉ Data Structures
TC.Put(task) returns false
Allocate and connect a new TC
TC.Put(task) returns true
[Figure: a list of TCs with GT and PT]
CAFÉ: get() from Empty TC
TC.Get returns null
TC.Get returns a task
[Figure: a list of TCs with GT and PT; a TC is marked as garbage collected]
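The list-level behavior on the last two slides (a failed TC.Put allocates and connects a new TC; a TC.Get that returns null moves on to the next TC) can be sketched as below. This is a deliberately single-threaded illustration with hypothetical names (`CafeListSketch`, `Container`, `allocated`); the inner container is a plain bounded queue standing in for a TreeContainer. It ignores the races that the following slides discuss, so it is not a concurrent implementation.

```java
import java.util.ArrayDeque;

class CafeListSketch<T> {
    // Hypothetical stand-in for a TreeContainer: bounded, put may fail, get may return null.
    static final class Container<E> {
        final ArrayDeque<E> items = new ArrayDeque<E>();
        final int capacity;
        Container<E> next;
        Container(int capacity) { this.capacity = capacity; }
        boolean put(E e) {
            if (items.size() >= capacity) return false;
            items.add(e);
            return true;
        }
        E get() { return items.poll(); } // null if empty
    }

    private final int capacity;
    private Container<T> gt; // container that get() currently reads from
    private Container<T> pt; // container that put() currently writes to
    int allocated = 1;       // number of containers created so far

    CafeListSketch(int capacity) {
        if (capacity <= 0) throw new IllegalArgumentException("capacity must be positive");
        this.capacity = capacity;
        gt = pt = new Container<T>(capacity);
    }

    void put(T task) {
        while (!pt.put(task)) {            // TC.Put(task) returned false
            if (pt.next == null) {         // allocate and connect a new TC
                pt.next = new Container<T>(capacity);
                allocated++;
            }
            pt = pt.next;                  // retry on the next TC
        }
    }

    T get() {
        while (true) {
            T t = gt.get();
            if (t != null) return t;        // TC.Get returned a task
            if (gt.next == null) return null; // no further TC: the pool is empty
            gt = gt.next;                   // the old TC becomes unreachable and can be garbage collected
        }
    }
}
```

With capacity 2 and five puts, three containers are allocated, and tasks are returned container by container. Order among containers is preserved, which is the fairness property listed later in the talk.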
CAFÉ: Races
Suspended producer thread
The task is lost for consumers
[Figure: a list of TCs with GT and PT]
CAFÉ: Handling Races – Try 1
Check if GT bypassed the TC
Move GT back
[Figure: a list of TCs with GT and PT]
CAFÉ: Races (2)
Consumer thread: TC.Get returns null; going to move GT forward
Producer thread: TC.Put(task) returns true; consumers can access the task; can terminate
Task is lost
[Figure: a list of TCs with GT and PT]
CAFÉ: Handling Races – Try 2
GT = <prev, cur>
Read prev; if empty, read cur
Lock-free. To make it wait-free we do additional tricks.
[Figure: a list of TCs with GT and PT]
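One way to read the `<prev, cur>` idea in Java is to keep GT as an immutable pair that is replaced as a whole by a single CAS, and to have get() try prev before cur. The sketch below follows that reading only. The names `PairGetSketch`, `Container`, and `Pair` are hypothetical, the container is a plain concurrent queue standing in for a TreeContainer, and the additional wait-freedom tricks and the producer side of the paper's protocol are not shown.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicReference;

class PairGetSketch<T> {
    // Hypothetical stand-in for a TreeContainer; only poll() (get) matters here.
    static final class Container<E> {
        final ConcurrentLinkedQueue<E> items = new ConcurrentLinkedQueue<E>();
        volatile Container<E> next;
    }

    // Immutable <prev, cur> pair; GT is swapped as a whole with one CAS.
    static final class Pair<E> {
        final Container<E> prev, cur;
        Pair(Container<E> prev, Container<E> cur) { this.prev = prev; this.cur = cur; }
    }

    final AtomicReference<Pair<T>> gt;

    PairGetSketch(Container<T> first) {
        gt = new AtomicReference<Pair<T>>(new Pair<T>(first, first));
    }

    T get() {
        while (true) {
            Pair<T> p = gt.get();
            T t = p.prev.items.poll();             // read prev first
            if (t == null) t = p.cur.items.poll(); // if empty, read cur
            if (t != null) return t;
            Container<T> nxt = p.cur.next;
            if (nxt == null) return null;          // nothing further in the list
            // Advance <prev, cur> to <cur, next>; a failed CAS means another
            // thread already advanced GT, so simply retry with the new pair.
            gt.compareAndSet(p, new Pair<T>(p.cur, nxt));
        }
    }
}
```

Because prev is still consulted after GT has moved on, a task that a delayed producer inserts into the previous container remains reachable by the next get(), which is the situation the Races slides describe.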
CAFÉ: Properties
Safety:
Put()/Get() operations are linearizable
Wait-freedom:
Get() operations are deterministically wait-free
Put() operations are wait-free with probability 1
Fairness:
Preserves order among trees
Evaluation Setup
Compared pools:
LBQ: Java 6 FIFO blocking queue
CLQ: Java 6 FIFO non-blocking queue (M&S)
EDQ: non-FIFO Elimination-Diffraction Tree Queue
Evaluation server:
8 AMD Opteron quad-cores, 32 cores in total
CAFÉ Evaluation: Throughput
Throughput as a function of thread number
CAFÉ-13: CAFÉ with tree height 13
Factor of 30 over lock-free implementations
[Figure: throughput vs. number of threads for CAFÉ-13, LBQ, CLQ, and EDQ]
CAFÉ Evaluation: Throughput
CAFÉ throughput as a function of TreeContainer height
[Figure: throughput vs. TreeContainer height for CAFÉ, LBQ, CLQ, and EDQ]
CAFÉ Evaluation: CAS Failures
CAS failures per operation as a function of TreeContainer height
Summary
CAFÉ:
Efficient
Wait-free
With adjustable fairness and contention
Thank you