Dynamic Resource Allocation for Database Servers Running on Virtual Storage
Gokul Soundararajan, Daniel Lupei, Saeed Ghanbari, Adrian Daniel Popescu, Jin Chen, Cristiana Amza
University of Toronto
1
Multi-tier Resource Allocation
Consolidated Environment
Web Server
Application Server
Database Server
2
Composed of several tiers
Application-BApplication-A
Share resources in each tier
Can lead to interference
Storage
Database
Our Focus: Storage Hierarchy
Application-A Application-B
3
Buffer Pool
StorageCache
DiskBandwidth
Want to use all resources efficiently
Disk is a bottleneck for Database Apps
Network
State of the Art
‣ Previous work studied resources in isolation
- Memory Partitioning: MRC [ASPLOS’04]
- Disk Bandwidth: Facade [FAST’03], Argon [FAST’07], etc.
- ... and many more
‣ Want to use the storage hierarchy efficiently
‣However, performance depends on all layers
- Interdependency between resources
- E.g., Increasing buffer pool reduces number of storage accesses
4
Motivating Scenario
Small Large
5
Buffer Pool
StorageCache
DiskBandwidth
Cache Friendly1 Outstanding I/O
Cache Un-Friendly10 Outstanding I/O
Using Oracle ORION I/O tool
Motivating Scenario
0
2.5
5.0
7.5
Shared Cache Disk Cache & Disk
Small Large6
Nor
mal
ized
Lat
ency
Benefits cache-friendly workload
Avoids disk interference
Has best performance
Contributions
‣ Build performance models dynamically
- Account for interdependencies between resources
- Lightweight but still accurate
‣Multi-level Resource Allocator
- Uses performance models to guide resource allocation
- Corrects model errors through runtime sampling
- Uses global utility (SLOs) to partition resources
- Minimize sum of application latencies
7
Approach
8
‣ Build performance models
- One per application
- Derive function to predict application latency given configuration
‣ Find resource partitioning setting
- Minimize sum of application latencies
- Find best setting using hill climbing
Lavg = f(ρc, ρs, ρd)
Outline
‣ Online Performance Models
- What are they?
- Why are they hard to build?
‣Multi-level Resource Allocator
‣ Prototype Implementation
‣ Experimental Results
‣Conclusions
9
One-Level Cache Model
10
Rd(A)
Cache size
Avg.
Lat
ency
Allocate in 32MB chunks
32 64 512
m=CacheSize/ChunkSize=1GB/32MB=32 choices
1G
32 choices
Choose 512MB
... ...
MRC Cache Model
11
Rd(A)
Cache size
Mis
s-Ra
tio
32 64 512 1G... ...
Computes miss-ratio
given an I/O trace
Multiply by I/O latency gets Avg. Latency
Two-Level Cache Model
‣ Performance affected by
- DB Buffer Pool Size (m choices)
- Storage Cache (n choices)
‣ Performance model
- Needs to consider all parameters (m*n choices)
- 1GB caches allocated in 32MB chunks
- m = 1GB/32MB = 32 settings
- m*n = 1024 distinct settings
12
Changes the I/O trace at
storage
Two-Level Cache Model
13
0.5
1
1.5
2
2.5
3
3.5
4
256
512
768
1024
256
512
768
1024
0
2
4
Latency Surface
Buffer Pool Size (MB)
Storage Cache Size (MB)
2 caches create a 3D surface
Buffer Pool Size
Storage
Cache Size
Avg
. La
tenc
y
32x32=1024 data points!
32 data points
HighLatency
LowLatency
Overall Performance Model
14
Application
ρc
ρs
ρd
Buffer Pool
StorageCache
DiskBandwidth
Needs 32x32x10=10240
samples
15 mins/sample takes 3 months!
Outline
‣ Online Performance Models
‣Multi-level Resource Allocator
- Building performance models
- Allocating resources using models
‣ Prototype Implementation
‣ Experimental Results
‣Conclusions
15
Key Observations
‣ Known cache replacement policies
- Most cache replacement algorithms are LRU
- Only as effective as the largest cache (cache inclusiveness)
‣Disk is a closed loop system
- Rate of responses is same as rate of requests
- Performance proportional to the disk bandwidth fraction
16
Cache Inclusiveness
17
LRU
LRU
8 8 1 6 8 4 5 6 3 48
8
I/Os: 0I/Os: 1
8
Cache Inclusiveness
18
LRU
LRU
8 8 1 6 8 4 5 6 3 48 8 1 6
I/Os: 6
8 4 5 6 3 4
34
34 56 8
Storage cache includes data in the buffer pool
Cache Inclusiveness
19
LRU
LRU
8 8 1 6 8 4 5 6 3 48 8 1 6
I/Os: 6
8 4 5 6 3 4
3 5
34 56 8Buffer pool
includes data in the storage cache
Approximate Single Cache Model (LRU)
20
8 8 1 6 8 4 5 6 3 48 8 1 6 8 4 5 6 3 4
I/Os: 6
Same Number of I/Os
LRU
LRU34
34 56 8
Mc(max[ρc, ρs])
34 56 8 LRU
Cache Model (DEMOTE)
21
‣ Maintain cache exclusiveness
- E.g., using DEMOTEs [USENIX’02]
- Every block brought into buffer pool is not cached below
- Only evictions from buffer pool cached in storage cache
‣ Approximate performance using single cache
- Mc(ρc + ρs)
Find Best Partitioning Setting
22
Find Best Resource Allocation Setting
Buffer Pool Size
Storage
Cache Size
Late
ncy
App-1
App-2
Buffer Pool Size
Storage
Cache Size
Late
ncy Minimize sum of
application latencies
‣ Observation: Closed loop system
- Rate of responses same as rate of requests
- Use interactive response time law
‣ Performance proportional to disk bandwidth fraction
- Measure base disk latency:
- Predict latency for smaller bandwidth fractions
Disk Model
23
Ld(ρd) =Ld(1)
ρd
Ld(1)
Putting it All Together
24
Application
Mc(ρc)Ms(ρc, ρs)N
Storage Cache
Buffer Pool
ApproximateSingle-Level
CacheCan now be
solved using MRC
=Mc(max[ρc, ρs])N
Putting it all Together
Application
Hc(ρc)Lc
Mc(ρc)Hs(ρc, ρs)Lnet
Mc(ρc)Ms(ρc, ρs)Ld(ρd)
25
Inaccuracies in the Model
‣ Cache Model
- Approximations to LRU, i.e., CLOCK
- Large fraction of writes in the workload
‣Disk Model
- Using Quanta-based scheduler [Wachs et. al, FAST’07]
- Interference due to disk seeks at small quanta
‣ Inaccuracies localized in known regions
- E.g., Small disk quanta
26
Iterative Refinement
‣ Build model
- Use trace collected at the database buffer pool
‣ Refine the model
- Use cross-validation to measure quality
- Selectively sample where error is high
- Interpolate computed and measured samples
- Using regression (SVM)
27
Virtual Storage Prototype
28
StorageMySQL
Linux
NBD
CLIENT
Block Layer
SCSI
Buffer Pool
Disk
SERVER
NBD
Linux
Block Layer
SCSI
Disk
Ne
two
rk
DiskDisk
Cache Quanta
Experimental Setup
‣ Benchmarks
- UNIFORM (microbenchmark), TPC-W and TPC-C
‣ LAMP Architecture
- Linux, Apache 1.3, MySQL/InnoDB 5.0, and PHP 5.0
‣ Cache Configuration
- MySQL buffer pool = 1GB
- Storage cache = 1GB
- Using InnoDB cache replacement in MySQL, CLOCK in storage cache
29
Our Algorithms
30
‣ GLOBAL
- Gather trace at the buffer pool
- Measure base disk latency
- Compute performance using performance model
‣ GLOBAL+
- Run GLOBAL
- Evaluate model accuracy
- Refine model using runtime samples
Algorithms for Comparison
31
‣ MRC
- Partition cache (independently) using miss-ratio curves
‣DISK
- Partition caches equally, determine best disk quanta
‣MRC+DISK
- Run MRC then DISK
‣ IDEAL*
- Build model with SVM using 16*16*5=1280 sampled configurations
Roadmap of Results
32
‣ Multi-level cache allocator
- Using LRU and DEMOTE cache replacement policies
‣ Multi-level cache and disk
‣ Accuracy of computed models
Miss-Ratio Curves
33
0
25
50
75
100
0 128 256 384 512 640 768 896 1024
Miss
Rat
io (%
)
Buffer Pool Size (MB)
TPC-WTPC-C
UNIFORM
Multi-Level Caching (LRU)
34
Optimal
HeatMapLighter: BetterDarker: WorseOptimal
Storage Cache Size(A)
Buf
fer
Po
ol S
ize
(A)
1G0
1G2 TPC-W Instances
Multi-Level Caching (DEMOTE)
35
Optimal
Storage Cache Size(A)
Buf
fer
Po
ol S
ize
(A)
1G0
1G2 TPC-W Instances
Roadmap of Results
36
‣ Multi-level cache allocator
‣ Multi-level cache and disk
- Using two identical applications
- Using different applications
‣ Accuracy of computed models
UNIFORM/UNIFORM
0
1.682
3.364
5.046
6.728
8.410
GLOBAL GLOBAL+ MRC DISK MRC+DISK IDEAL*
UNIFORM UNIFORM
37
Ave
rag
e La
tenc
y (m
s)
Allocate caches to 50/50
Matches GLOBAL
TPC-W/UNIFORM
0
1.5
3.0
4.5
6.0
7.5
GLOBAL GLOBAL+ MRC DISK MRC+DISK IDEAL*
TPC-W UNIFORM
38
Ave
rag
e La
tenc
y (m
s)
Allocate caches to 50/50
Not enough buffer pool to
UNIFORM Compensate for MRC settings
TPC-W/TPC-C
0
0.16
0.32
0.48
0.64
0.80
GLOBAL GLOBAL+ MRC DISK MRC+DISK IDEAL*
TPC-W TPC-C
39
Ave
rag
e La
tenc
y (m
s)
Corrects model at runtime
Corrects imbalance in
MRC
TPC-C allocated more in both
Roadmap of Results
40
‣ Multi-level cache allocator
‣ Multi-level cache and disk
‣ Accuracy of computed models
- Cache model
- Disk model
Cache Model Accuracy (TPC-W)
41
0 128 256 384 512 640 768 896
1024
0 128 256 384 512 640 768 896 1024
Stor
age
Cach
e Si
ze (M
B)
Buffer Pool Size (MB)
0
10
20
30
40
50
Erro
r (%
)
Localized in the middle
Disk Model Accuracy (TPC-W)
42
0
10
20
30
40
50
0 0.2 0.4 0.6 0.8 1
Late
ncy
(ms)
Disk Quota
MeasuredComputed
Conclusions
‣ Problem
- Need to consider resources on multiple tiers
- Independent cache/disk allocators are not sufficient
‣ Dynamic allocation of cache hierarchy and disk
- Build performance models dynamically
- Iteratively refine (if necessary)
- Use models for global resource partitioning
‣ Performance up to 2.9 better than single resource allocators
43
Thank you.
44