Phase Reconciliation for Contended In-Memory Transactions
![Page 1: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/1.jpg)
Phase Reconciliation for Contended In-Memory Transactions
Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris
MIT CSAIL and Harvard
![Page 2: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/2.jpg)
Cloud Computing and Databases
Two trends: multicore and in-memory databases
Multicore databases face increased contention due to skewed workloads and rising core counts
![Page 3: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/3.jpg)
BEGIN Transaction
  ADD(x, 1)
  ADD(y, 2)
END Transaction
![Page 4: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/4.jpg)
Throughput on a Contentious Transactional Workload
![Page 5: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/5.jpg)
Throughput on a Contentious Transactional Workload
![Page 6: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/6.jpg)
Concurrency Control Forces Serialization
core 0: ADD(x) ADD(y)
core 1: ADD(x) ADD(y)
core 2: ADD(x) ADD(y)
(horizontal axis: time)
![Page 7: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/7.jpg)
BEGIN Transaction
  ADD(x, 1)
  ADD(y, 2)
END Transaction
![Page 8: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/8.jpg)
Multicore OS Scalable Counters
core 0: x0 = x0+1;
core 1: x1 = x1+1;
core 2: x2 = x2+1;
(horizontal axis: time)
Kernel can apply increments in parallel using per-core counters
![Page 9: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/9.jpg)
Multicore OS Scalable Counters
core 0: x0 = x0+1;
core 1: x1 = x1+1;
core 2: x2 = x2+1;
print(x); x = x0+x1+x2; print(x)
(horizontal axis: time)
To read per-core data, the system has to stop all writes and reconcile x
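The per-core counter idea from the two slides above can be sketched as follows. This is a minimal illustration in Python, not kernel or Doppel code: the class name `PerCoreCounter` and its methods are made up for the example, and cores are simulated by list indices rather than real threads.

```python
class PerCoreCounter:
    """One slot per core; an increment touches only the caller's slot."""

    def __init__(self, num_cores):
        self.slots = [0] * num_cores  # x0, x1, x2, ... from the slides
        self.x = 0                    # reconciled, globally visible value

    def add(self, core, n=1):
        # Each core writes only its own slot, so in a real system
        # increments on different cores would not contend.
        self.slots[core] += n

    def read(self):
        # A read needs every core's contribution, so writes must pause
        # while the slots are folded into x and reset.
        self.x += sum(self.slots)
        self.slots = [0] * len(self.slots)
        return self.x


counter = PerCoreCounter(num_cores=3)
for core in range(3):
    counter.add(core)      # x0 = x0+1; x1 = x1+1; x2 = x2+1
print(counter.read())      # 3
```

The cost asymmetry is the point: `add` is cheap and local, while `read` has to visit every slot.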
![Page 10: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/10.jpg)
Can we use per-core data in complex database transactions?
![Page 11: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/11.jpg)
BEGIN Transaction
  ADD(x, 1)
  ADD(y, 2)
END Transaction

BEGIN Transaction
  GET(x)
  GET(y)
END Transaction
![Page 12: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/12.jpg)
Challenges
• Deciding what records should be split among cores
• Transactions operating on some contentious and some normal data
• Different kinds of operations on the same records
![Page 13: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/13.jpg)
Phase Reconciliation
• Database automatically detects contention to split a record among cores
• Database cycles through phases: split and joined
• OCC serializes access to non-split data
• Split records have assigned operations for split phases
Doppel, an in-memory transactional database
![Page 14: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/14.jpg)
Outline
1. Phases
2. Splittable operations
3. Doppel optimizations
4. Performance evaluation
![Page 15: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/15.jpg)
Split Phase
• A transaction can operate on split and non-split data
• Writes to split records go to per-core slices (x)
• The rest of the records use OCC (u, v, w)
• OCC ensures serializability for the non-split parts of the transaction
split phase:
  core 0: ADD(x) ADD(u) [x0=x0+1]
  core 1: ADD(x) ADD(v) [x1=x1+1]
  core 2: ADD(x) ADD(w) [x2=x2+1]
  GET(x) GET(v)
![Page 16: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/16.jpg)
Reconciliation
• Cannot correctly process a read of x in current state
• Abort read transaction
• Reconcile per-core data to global store
split phase:
  core 0: ADD(x) ADD(u) [x0=x0+1]
  core 1: ADD(x) ADD(v) [x1=x1+1]
  core 2: ADD(x) ADD(w) [x2=x2+1]
  aborted read: GET(x) GET(v)
reconciliation:
  core 0: x=x+x0; x0=0
  core 1: x=x+x1; x1=0
  core 2: x=x+x2; x2=0
![Page 17: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/17.jpg)
Joined Phase
• Wait until all processes have finished reconciliation
• Resume aborted read transactions using OCC
• Process all new transactions using OCC
• No split data
split phase:
  core 0: ADD(x) ADD(u) [x0=x0+1]
  core 1: ADD(x) ADD(v) [x1=x1+1]
  core 2: ADD(x) ADD(w) [x2=x2+1]
  aborted read: GET(x) GET(v)
reconciliation:
  core 0: x=x+x0; x0=0
  core 1: x=x+x1; x1=0
  core 2: x=x+x2; x2=0
joined phase:
  resumed read: GET(x) GET(v)
![Page 18: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/18.jpg)
Resume Split Phase
• When done, transition to split phase again
• Wait for all processes to acknowledge they have completed joined phase before writing to per-core data
split phase:
  core 0: ADD(x) ADD(u) [x0=x0+1]
  core 1: ADD(x) ADD(v) [x1=x1+1]
  core 2: ADD(x) ADD(w) [x2=x2+1]
  aborted read: GET(x) GET(v)
reconciliation:
  core 0: x=x+x0; x0=0
  core 1: x=x+x1; x1=0
  core 2: x=x+x2; x2=0
joined phase:
  resumed read: GET(x) GET(v)
split phase:
  core 1: ADD(x) ADD(v) [x1=x1+1]
  core 2: ADD(x) ADD(v) [x2=x2+1]
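The phase cycle on the preceding slides (split, reconciliation, joined, split again) can be sketched for a single split record. This is a simplified Python model under stated assumptions: `SplitRecord` and its method names are hypothetical, cores are list indices, and neither OCC nor the wait for all cores to acknowledge a phase change is modeled.

```python
class SplitRecord:
    """One record x that alternates between split and joined phases."""

    def __init__(self, num_cores):
        self.global_value = 0
        self.per_core = [0] * num_cores
        self.phase = "split"
        self.stashed = []            # cores whose reads were aborted

    def add(self, core, n):
        if self.phase == "split":
            self.per_core[core] += n     # x_i = x_i + n, core-local
        else:
            self.global_value += n       # joined phase: ordinary write

    def get(self, core):
        if self.phase == "split":
            self.stashed.append(core)    # x is not whole: abort the read
            return None
        return self.global_value

    def reconcile(self):
        # Every core merges its slice: x = x + x_i; x_i = 0.
        for core, delta in enumerate(self.per_core):
            self.global_value += delta
            self.per_core[core] = 0
        self.phase = "joined"

    def run_stashed(self):
        # Resume the aborted reads now that x holds the full value.
        results = [(core, self.get(core)) for core in self.stashed]
        self.stashed = []
        return results

    def resume_split(self):
        self.phase = "split"


x = SplitRecord(num_cores=3)
for core in range(3):
    x.add(core, 1)               # three ADD(x) in the split phase
first_read = x.get(0)            # None: the read is aborted and stashed
x.reconcile()
print(x.run_stashed())           # [(0, 3)]
x.resume_split()
```

In this model a read never observes a partially merged value, because `get` only returns data in the joined phase.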
![Page 19: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/19.jpg)
Outline
1. Phases
2. Splittable operations
3. Doppel Optimizations
4. Performance evaluation
![Page 20: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/20.jpg)
Transactions and Operations
BEGIN Transaction (y)
  ADD(x,1)
  ADD(y,2)
END Transaction
• Transactions are composed of one or more operations
• Only some operations are amenable to splitting
![Page 21: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/21.jpg)
Supported Operations
Splittable
  ADD(k,n)
  MAX(k,n)
  MULT(k,n)
  OPUT(k,v,o)
  TOPKINSERT(k,v,o)
Not Splittable
  GET(k)
  PUT(k,v)
![Page 22: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/22.jpg)
MAX Example
core 0: MAX(x,55) → x0 = MAX(x0,55); x0 = 55
core 1: MAX(x,10) → x1 = MAX(x1,10); x1 = 10
core 2: MAX(x,21) → x2 = MAX(x2,21); x2 = 21
• Each core keeps one piece of summarized state xi
• MAX is commutative so results can be determined out of order
![Page 23: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/23.jpg)
MAX Example
core 0: MAX(x,55) → x0 = MAX(x0,55); x0 = 55
core 1: MAX(x,10) → x1 = MAX(x1,10); MAX(x,27) → x1 = MAX(x1,27); MAX(x,2) → x1 = MAX(x1,2); x1 = 27
core 2: MAX(x,21) → x2 = MAX(x2,21); x2 = 21
reconciled: x = 55
• Each core keeps one piece of summarized state xi
• MAX is commutative so results can be determined out of order
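The MAX example can be replayed in a few lines of Python. The operation values and their assignment to cores come from the slide. The dictionary layout and variable names are only for illustration, and the sketch assumes the global x holds no earlier value above 55.

```python
from functools import reduce
from itertools import permutations

# Arguments of the MAX(x, n) calls, grouped by the core that runs them.
ops = {0: [55], 1: [10, 27, 2], 2: [21]}

# Each core keeps one summarized value x_i: the largest argument seen.
per_core = {core: max(values) for core, values in ops.items()}
print(per_core)                       # {0: 55, 1: 27, 2: 21}

# Reconciliation merges the per-core summaries with the same operation.
x = reduce(max, per_core.values())
print(x)                              # 55

# Commutativity check: any global order of the five calls gives 55.
all_args = [n for values in ops.values() for n in values]
print(all(reduce(max, order) == 55 for order in permutations(all_args)))  # True
```

Because each core stores only its running maximum, the per-core state stays constant in size no matter how many MAX calls arrive.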
![Page 24: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/24.jpg)
![Page 25: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/25.jpg)
What Can Be Split?
Operations that can be split must be:
  – Commutative
  – Pre-reconcilable
  – On a single key
![Page 26: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/26.jpg)
What Can’t Be Split?
• Operations that return a value
  – ADD_AND_GET(x,4)
• Operations on multiple keys
  – ADD(x,y)
• Different operations in the same phase, even if arguments make them commutative
  – ADD(x,0) and MULT(x,1)
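A short worked example shows why different operations on the same split record are ruled out. It uses plain Python arithmetic, and the helper names `add` and `mult` are illustrative stand-ins for ADD and MULT rather than Doppel functions.

```python
def add(x, n):
    return x + n


def mult(x, n):
    return x * n


x = 3

# In general the two operations do not commute with each other.
print(mult(add(x, 1), 2))   # (3 + 1) * 2 = 8
print(add(mult(x, 2), 1))   # 3 * 2 + 1 = 7

# With identity arguments they happen to commute.
print(mult(add(x, 0), 1) == add(mult(x, 1), 0))   # True
```

The last line is the ADD(x,0) and MULT(x,1) case from the slide. Per that slide, the pair is still not split: whether it commutes depends on the arguments, and an earlier slide says a split record has one assigned operation per split phase.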
![Page 27: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/27.jpg)
Outline
1. Phases
2. Splittable operations
3. Doppel Optimizations
4. Performance evaluation
![Page 28: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/28.jpg)
Batching Transactions
split phase:
  core 0: ADD(x) ADD(u) [x0=x0+1]
  core 1: ADD(x) ADD(v) [x1=x1+1]
  core 2: ADD(x) ADD(w) [x2=x2+1]
  aborted read: GET(x) GET(v)
reconciliation:
  core 0: x=x+x0; x0=0
  core 1: x=x+x1; x1=0
  core 2: x=x+x2; x2=0
joined phase:
  resumed read: GET(x) GET(v)
![Page 29: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/29.jpg)
Batching Transactions
• Don’t switch phases immediately; stash reads
• Wait to accumulate stashed transactions
• Batch for joined phase
split phase:
  core 0: ADD(x) ADD(u) [x0=x0+1]
  core 1: ADD(x) ADD(v) [x1=x1+1]
  core 2: ADD(x) ADD(w) [x2=x2+1]
  stashed read: GET(x) GET(v)
reconciliation:
  core 0: x=x+x0; x0=0
  core 1: x=x+x1; x1=0
  core 2: x=x+x2; x2=0
joined phase:
  resumed read: GET(x) GET(v)
additional transactions shown on the timeline: GET(x), GET(x), GET(x), GET(x); ADD(x) ADD(v) [x1=x1+1]
![Page 30: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/30.jpg)
Ideal World
• Transactions with contentious operations happen in the split phase
• Transactions with incompatible operations happen correctly in the joined phase
![Page 31: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/31.jpg)
How To Decide What Should Be Split Data
• Database starts out with no split data
• Count conflicts on records
  – Make key split if #conflicts > conflictThreshold
• Count stashes on records in the split phase
  – Move key back to non-split if #stashes too high
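The classification rule above can be sketched as a pair of counters. This is a hypothetical Python illustration, not Doppel's code. `conflict_threshold` mirrors the slide's conflictThreshold, while `stash_threshold` is an assumed parameter because the slide only says "too high". Resetting a counter when a key changes state is also an assumption.

```python
from collections import Counter


class SplitClassifier:
    def __init__(self, conflict_threshold, stash_threshold):
        self.conflict_threshold = conflict_threshold
        self.stash_threshold = stash_threshold   # assumed knob
        self.conflicts = Counter()
        self.stashes = Counter()
        self.split_keys = set()      # database starts with no split data

    def record_conflict(self, key):
        self.conflicts[key] += 1
        if self.conflicts[key] > self.conflict_threshold:
            self.split_keys.add(key)         # contended: split the key
            self.stashes[key] = 0

    def record_stash(self, key):
        if key not in self.split_keys:
            return
        self.stashes[key] += 1
        if self.stashes[key] > self.stash_threshold:
            self.split_keys.discard(key)     # too many stashed reads
            self.conflicts[key] = 0


classifier = SplitClassifier(conflict_threshold=2, stash_threshold=1)
for _ in range(3):
    classifier.record_conflict("x")
print(classifier.split_keys)    # {'x'}
for _ in range(2):
    classifier.record_stash("x")
print(classifier.split_keys)    # set()
```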
![Page 32: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/32.jpg)
Outline
1. Phases
2. Splittable operations
3. Data classification
4. Performance evaluation
![Page 33: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/33.jpg)
Performance
• Contention vs. parallelism
• What kinds of workloads benefit?
• A realistic application: RUBiS
Doppel implementation:
• Multithreaded Go server; worker thread per core
• All experiments run on an 80-core Intel server running 64-bit Linux 3.12
• Transactions are one-shot procedures written in Go
![Page 34: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/34.jpg)
Parallel Performance on Conflicting Workloads
[Bar chart: Throughput (millions txns/sec) for Doppel, OCC, and 2PL; y-axis ticks from 0 to 35000000]
20 cores, 1M 16-byte keys, transaction: ADD(x,1)
![Page 35: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/35.jpg)
Increasing Performance with More Cores
1M 16 byte keys, transaction: ADD(x,1) all writing same key
![Page 36: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/36.jpg)
Varying Number of Hot Keys
20 cores, transaction: ADD(x,1)
![Page 37: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/37.jpg)
How Much Stashing Is Too Much?
20 cores, transactions: LIKE read, LIKE write
![Page 38: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/38.jpg)
RUBiS
• Auction application modeled after eBay
  – Users bid on auctions, comment, list new items, search
• 1M users and 33K auctions
• 7 tables, 26 interactions
• 85% read-only transactions
• More contentious workload
  – Skewed distribution of bids
  – More writes
![Page 39: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/39.jpg)
StoreBid Transaction
BEGIN Transaction (bidder, amount, item)
  num_bids_item = num_bids_item + 1
  if amount > max_bid_item:
    max_bid_item = amount
    max_bidder_item = bidder
  bidder_item_ts = Bid{bidder, amount, item, ts}
END Transaction
![Page 40: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/40.jpg)
StoreBid Transaction
BEGIN Transaction (bidder, amount, item)
  ADD(num_bids_item, 1)
  MAX(max_bid_item, amount)
  OPUT(max_bidder_item, bidder, amount)
  PUT(new_bid_key(), Bid{bidder, amount, item, ts})
END Transaction
All commutative operations on potentially conflicting auction metadata
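The rewritten StoreBid can be checked for order independence with a small Python model. Several things here are assumptions: the store is a plain dictionary, the helper names are made up, a tuple key stands in for new_bid_key(), and OPUT(k,v,o) is taken to mean "keep the value with the highest order o", which the slides do not spell out. The bidders and amounts are invented sample data.

```python
def add(store, key, n):
    store[key] = store.get(key, 0) + n


def max_op(store, key, n):
    store[key] = max(store.get(key, n), n)


def oput(store, key, value, order):
    # Assumed semantics: the value with the highest order wins.
    current = store.get(key)
    if current is None or order > current[0]:
        store[key] = (order, value)


def store_bid(store, bidder, amount, item, ts):
    add(store, ("num_bids", item), 1)
    max_op(store, ("max_bid", item), amount)
    oput(store, ("max_bidder", item), bidder, amount)
    # PUT to a fresh key, so it cannot conflict with other bids.
    store[("bid", bidder, item, ts)] = (bidder, amount, item, ts)


bids = [("alice", 10, "lamp", 1), ("bob", 25, "lamp", 2), ("carol", 15, "lamp", 3)]

forward, backward = {}, {}
for bid in bids:
    store_bid(forward, *bid)
for bid in reversed(bids):
    store_bid(backward, *bid)

print(forward == backward)                # True
print(forward[("max_bidder", "lamp")])    # (25, 'bob')
```

Both orders give the same final state, which is what allows the three metadata updates to be applied per core and merged later. Ties between equal bid amounts are not handled in this sketch.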
![Page 41: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/41.jpg)
RUBiS Throughput
[Bar chart: Throughput (millions txns/sec) under Low Contention and High Contention; y-axis ticks from 0 to 4000000]
20 cores, 1M users, 33K auctions
![Page 42: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/42.jpg)
Related Work
• Split counters in multicore OSes
  – Linux kernel
• Commutativity in distributed systems and concurrency control
  – [Shapiro '11]
  – [Li '12]
  – [Lloyd '12]
  – [Weihl '88]
• Optimistic concurrency control
  – [Kung '81]
  – [Tu '13]
![Page 43: Phase Reconciliation for Contended In-Memory Transactions](https://reader036.vdocuments.mx/reader036/viewer/2022062517/5681351c550346895d9c7786/html5/thumbnails/43.jpg)
Summary
• Contention is rising with increased core counts
• Commutative operations are amenable to being applied in parallel, per-core
• We can get good parallel performance on contentious transactional workloads by combining per-core split data with concurrency control
Neha Narula
http://nehanaru.la
@neha