Load Balancing & Update Propagation in Tashkent+
Sameh Elnikety, Steven Dropsho, Willy Zwaenepoel
EPFL
Load Balancing
• Traditional: optimize for equal load on replicas
• MALB (Memory-Aware Load Balancing): optimize for in-memory execution

MALB Doubles Throughput
[Chart: TPC-W Ordering, 16 replicas; normalized throughput — single DBMS 1×, LeastLoaded 12×, MALB 25×, MALB+UF 37×]
Update Propagation
• Traditional: propagate updates everywhere
• Update Filtering: propagate updates only to where they are needed

MALB + UF Triples Throughput
[Chart: TPC-W Ordering, 16 replicas; normalized throughput — single DBMS 1×, LeastLoaded 12×, MALB 25×, MALB+UF 37×]
Contributions
• MALB (Memory-Aware Load Balancing): optimize for in-memory execution
• Update filtering: propagate updates only to where they are needed
• Implementation: Tashkent+ middleware
• Experimental evaluation: big performance gains
Background: DB Replication
[Diagram: a load balancer in front of replicas 1–3 presents the cluster as a single DBMS]
[Diagram: a read tx T executes at one replica]
[Diagram: an update tx T executes at one replica; its writeset (ws) is propagated to the other replicas]
Traditional Load Balancing
[Diagram: the load balancer places equal load on replicas 1–3]

MALB (Memory-Aware Load Balancing): optimize for in-memory execution
How Does MALB Work?
[Diagram: the database holds tables 1, 2 and 3, but each replica's memory holds only part of it. Tx type A accesses tables 1 and 2; tx type B accesses tables 2 and 3. The incoming workload is the mixed stream A, B, A, B.]
Read Data From Disk
[Diagram: LeastLoaded sends the mixed stream A, B, A, B to both replicas. Each replica then needs tables 1, 2 and 3 in memory at once; they do not fit, so both replicas read from disk and both run slowly.]
Data Fits in Memory
[Diagram: MALB sends A, A, A, A to replica 1 and B, B, B, B to replica 2. Tables 1 and 2 fit in replica 1's memory; tables 2 and 3 fit in replica 2's memory. Both replicas run fast.]
Open questions: where does the memory info come from, and what happens with many tx types and replicas?
Estimate Tx Memory Needs
• Exploit the tx execution plan: which tables & indices are accessed, and their access pattern (linear scan vs. direct access)
• Metadata from the database: sizes of tables and indices
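The estimate above can be sketched as follows. This is a minimal illustration, not Tashkent+'s actual calculation: the function name, the "scan"/"direct" labels, and the 10% hot fraction assumed for direct access are all hypothetical.

```python
# Sketch: a linear scan touches the whole table/index, while direct (index)
# access is assumed to touch only a small hot fraction of it.

def estimate_tx_memory(plan, sizes):
    """plan: list of (object_name, access_pattern) from the execution plan.
    sizes: dict object_name -> size in bytes (from database metadata)."""
    needed = 0
    for obj, pattern in plan:
        if pattern == "scan":        # linear scan: full object is hot
            needed += sizes[obj]
        elif pattern == "direct":    # direct access: assume a 10% hot fraction
            needed += int(0.1 * sizes[obj])
    return needed

# Hypothetical tx type that scans table t1 and probes index i2 directly.
sizes = {"t1": 100_000_000, "i2": 20_000_000}
plan_a = [("t1", "scan"), ("i2", "direct")]
print(estimate_tx_memory(plan_a, sizes))  # 102000000
```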
Grouping Transactions
• Objective: construct tx groups that fit together in memory
• Bin packing: item = tx memory needs; bin = replica memory; heuristic = Best Fit Decreasing
• Allocate replicas to tx groups, adjusting for group loads
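The bin-packing step can be sketched with the Best Fit Decreasing heuristic named above. The memory needs below are illustrative numbers chosen so the grouping matches the A / B C / D E F example on the "MALB in Action" slide; they are not values from the paper.

```python
# Best Fit Decreasing: items = per-tx-type memory needs, bins = replica memory.
# Tx types placed in the same bin form a group whose working set fits together.

def best_fit_decreasing(needs, bin_capacity):
    """needs: dict tx_type -> memory need; returns a list of groups."""
    bins = []  # each entry: [remaining_capacity, [tx types]]
    for tx, need in sorted(needs.items(), key=lambda kv: -kv[1]):
        # best fit: pick the open bin that leaves the least slack
        fitting = [b for b in bins if b[0] >= need]
        if fitting:
            best = min(fitting, key=lambda b: b[0] - need)
            best[0] -= need
            best[1].append(tx)
        else:
            bins.append([bin_capacity - need, [tx]])  # open a new group
    return [group for _, group in bins]

needs = {"A": 60, "B": 30, "C": 25, "D": 20, "E": 15, "F": 10}
print(best_fit_decreasing(needs, 64))  # [['A'], ['B', 'C'], ['D', 'E', 'F']]
```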
MALB in Action
[Diagram: given the memory needs of tx types A–F, MALB forms Group A, Group B C, and Group D E F, and assigns each group its own replica so that each group's working set fits in that replica's memory.]
MALB Summary
• Objective: optimize for in-memory execution
• Method: estimate tx memory needs, construct tx groups, allocate replicas to tx groups
Update Propagation
• Traditional: propagate updates everywhere
• Update Filtering: propagate updates only to where they are needed
Update Filtering Example
[Diagram: MALB + UF. Replica 1 serves Group A with tables 1 and 2 in memory; replica 2 serves Group B with tables 2 and 3 in memory. Both receive the workload A, B, A, B split by group.]
[Diagram: an update to table 1 is propagated only to replica 1; an update to table 3 only to replica 2. Filtered updates never touch tables outside a replica's working set.]
Update Filtering in Action
[Diagram: UF routes each update only to the replicas that serve the updated table — the update to the red table and the update to the green table go to different replicas.]
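The routing decision can be sketched as a simple set intersection. The function and the group-to-table map below are illustrative assumptions, not the Tashkent+ implementation.

```python
# Update-filtering sketch: a writeset is propagated only to replicas whose
# assigned tx group accesses at least one of the updated tables.

def replicas_for_update(updated_tables, replica_tables):
    """replica_tables: dict replica -> set of tables its tx group accesses."""
    return sorted(r for r, tables in replica_tables.items()
                  if tables & set(updated_tables))

# Matches the example: replica 1 serves tables 1-2, replica 2 serves tables 2-3.
replica_tables = {"replica1": {"t1", "t2"}, "replica2": {"t2", "t3"}}
print(replicas_for_update(["t1"], replica_tables))  # ['replica1']
print(replicas_for_update(["t2"], replica_tables))  # ['replica1', 'replica2']
```

An update to the shared table 2 still goes to both replicas, so filtering saves work only on tables that are not shared across groups.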
Tashkent+ Implementation
• Middleware: based on Tashkent [EuroSys 2006]
• Consistency: no change — replicas run GSI (Generalized Snapshot Isolation), and our workloads run serializably
Experimental Evaluation
• Compare: LeastLoaded (optimize for equal load), MALB (optimize for in-memory execution), MALB+UF (propagate updates only to where needed)
• Environment: Linux cluster running PostgreSQL
• Workload: TPC-W Ordering (50% update txs)
MALB Doubles Throughput
[Chart: TPC-W Ordering, 16 replicas — MALB reaches 25× the single-DBMS throughput vs 12× for LeastLoaded, a 105% gain; MALB+UF reaches 37×]
Big Gains with MALB
[Table (grid layout lost in transcript): MALB's gain over LeastLoaded for combinations of DB size and memory size (Small → Big on both axes): 4%, 0%, 29%, 48%, 105%, 45%, 182%, 75%, 12%. Gains are near zero where the data fits in memory at every replica, modest where everything runs from disk, and largest (105–182%) where MALB turns disk-bound execution into in-memory execution.]
MALB + UF Triples Throughput
[Chart: TPC-W Ordering, 16 replicas — MALB+UF reaches 37× the single-DBMS throughput vs 25× for MALB, a 49% gain, and 12× for LeastLoaded]
Filtering Opportunities
[Chart: MALB+UF throughput normalized to MALB — 1.49× for the Ordering mix (50% updates), 1.02× for the Browsing mix (5% updates). Filtering pays off when updates dominate the workload.]
Check Out the Paper
• MALB is better than LARD (we explain why)
• Methods for grouping transactions: it is better to be conservative when estimating memory
• Dynamic reallocation for MALB: adjust to workload changes
• More workloads: TPC-W browsing & shopping mixes, RUBiS bidding
Conclusions
• MALB (Memory-Aware Load Balancing): optimize for in-memory execution
• Update filtering: propagate updates only to where they are needed
• Implementation: Tashkent+ middleware
• Experimental evaluation: big performance gains, achievable in many cases

Questions?
Thank You
• MALB (Memory-Aware Load Balancing): optimize for in-memory execution
• Update filtering: propagate updates only to where they are needed
Example
• Transaction type:
  SELECT item-name FROM shopping-cart WHERE user = ?
• Transaction type A has instances A1, A2, A3
Transaction Types
A: SELECT * FROM address WHERE id=$1
B: INSERT INTO course VALUES ($1, $2)
C: UPDATE emp SET sal=$1 WHERE id=$2
D: DELETE FROM bid WHERE id=$1

Instances of type A:
A1: SELECT * FROM address WHERE id=10
A2: SELECT * FROM address WHERE id=20
A3: SELECT * FROM address WHERE id=76
A4: SELECT * FROM address WHERE id=78
A5: SELECT * FROM address WHERE id=97
Tx Grouping Alternatives
• MALB-S: size only
• MALB-SC: size and contents; no double counting of shared tables & indices
• MALB-SCAP: size, contents, and access pattern
• Overflow tx types: each such tx type goes in its own group
MALB Allocation – Load
• Collect CPU and disk utilizations; maintain smoothed averages
• Group load: average across the group's replicas
  Example: 3 replicas with (CPU, disk) = (45, 10), (53, 9), (40, 8) → group load = (46, 9)
• Compare loads via max(CPU, disk)
  Example: max(46, 9) = 46%
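The load computation above reduces to two one-liners; this sketch just reproduces the slide's worked example (function names are illustrative).

```python
# Group load = component-wise average of (CPU, disk) utilization over the
# group's replicas; groups are then compared on max(CPU, disk).

def group_load(samples):
    """samples: list of (cpu_util, disk_util) tuples, one per replica."""
    cpus, disks = zip(*samples)
    return (sum(cpus) / len(cpus), sum(disks) / len(disks))

def load_metric(load):
    return max(load)  # the bottleneck resource dominates

samples = [(45, 10), (53, 9), (40, 8)]
load = group_load(samples)
print(load)               # (46.0, 9.0)
print(load_metric(load))  # 46.0
```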
MALB Allocation – Basic
• Move one replica at a time, from the least loaded group to the most loaded group
• Estimate future loads: 2 replicas at 20% → 1 replica at 40%; 6 replicas at 25% → 5 replicas at 30%
• Hysteresis to prevent oscillations: trigger re-allocation only if the load difference exceeds 1.25×
MALB Allocation – Fast
• Quickly snap to a good configuration by solving a set of simultaneous equations
• Example, with the current configuration Group M on 3 replicas at 70% load and Group N on 7 replicas at 10% load:
  Mrep + Nrep = 10
  (70% × 3) / Mrep = (10% × 7) / Nrep
• Fine tune with additional adjustments
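The two equations have a closed form: each group's total work is utilization × replica count, and replicas are handed out in proportion to that work. A sketch under that reading (the function name is illustrative):

```python
# "Fast" allocation sketch: solve Mrep + Nrep = total and
# (util_m * reps_m) / Mrep = (util_n * reps_n) / Nrep for Mrep, Nrep.

def fast_allocation(total, util_m, reps_m, util_n, reps_n):
    work_m = util_m * reps_m  # e.g. 70% on 3 replicas -> 210
    work_n = util_n * reps_n  # e.g. 10% on 7 replicas -> 70
    m = total * work_m / (work_m + work_n)
    return m, total - m

# The slide's example: the answer is fractional, so it is rounded to whole
# replicas and then fine-tuned with additional adjustments.
print(fast_allocation(10, 70, 3, 10, 7))  # (7.5, 2.5)
```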
Configuration of MALB-SC
[Chart]

Average Disk IO per Transaction
[Chart: disk IO]

MALB & Update Filtering
[Charts: TPC-W Ordering with LeastCon, MALB, and MALB+UF — MidDB (1.8 GB) with different memory sizes; SmallDB (0.7 GB) and BigDB (2.9 GB)]

Dynamic Reallocation
[Chart]
Related Work
• Load balancing: LARD, HACC, FLEX — target workloads of small static files and do not estimate the working set
• Replicated databases: front-end load balancer
• Conflict-aware scheduling [Pedone06]: reduce conflicts and therefore the abort rate
• Conflict-aware scheduling [Amza03]: correctness; partial replication
Related Work – References
• Replicated database systems:
  – J. Gray et al., The Dangers of Replication and a Solution, SIGMOD'96.
  – C. Amza et al., Consistent Replication for Scaling Back-end Databases for Dynamic Content Servers, Middleware'03.
  – C. Plattner et al., Ganymed: Scalable Replication for Transactional Web Applications, Middleware'04.
  – B. Kemme et al., Postgres-R(SI): Middleware-based Data Replication Providing Snapshot Isolation, SIGMOD'05.
  – K. Daudjee and K. Salem, Lazy Database Replication with Snapshot Isolation, VLDB'06.
• Networked server systems:
  – V. Pai et al., Locality-aware Request Distribution in Cluster-based Network Servers, ASPLOS'98.