Storage: Isolation and Fairness
Lecture 24, Aditya Akella
Performance Isolation and Fairness in Multi-Tenant Cloud Storage, D. Shue, M. Freedman and A. Shaikh, OSDI 2012.
Setting: Shared Storage in the Cloud
[Figure: multiple tenants (Y, F, Z, T) issue requests to a shared key-value storage service, alongside other shared cloud services such as S3, EBS, and SQS]
Predictable Performance is Hard
Multiple co-located tenants ⇒ resource contention (fair queuing @ "big iron" would suffice if the store were a single box)
Distributed system ⇒ distributed resource allocation
Skewed object popularity ⇒ variable per-node demand
Disparate workloads ⇒ different bottleneck resources: 1 kB GETs (large reads), 10 B GETs (small reads), 1 kB SETs (large writes), 10 B SETs (small writes)
[Figure: tenant keyspaces (Z, Y, T, F) partitioned across storage servers (SS), with skewed object popularity across the data partitions]
Tenants Want System-wide Resource Guarantees
[Figure: Zynga, Yelp, Foursquare, and TP each want an aggregate throughput guarantee across the shared key-value store: 80, 120, 160, and 40 kreq/s respectively]
Pisces Provides Weighted Fair-shares
[Figure: Pisces assigns each tenant a system-wide weight: wz = 20%, wy = 30%, wf = 40%, wt = 10%]
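As a quick sanity check (a hypothetical calculation: the 400 kreq/s aggregate capacity is simply the sum of the demands on the previous slide), these weights reproduce the per-tenant guarantees:

```python
# Hypothetical: weights applied to an assumed 400 kreq/s aggregate capacity
capacity_kreqs = 80 + 120 + 160 + 40   # = 400, summed from the previous slide
weights = {"Zynga": 0.20, "Yelp": 0.30, "Foursquare": 0.40, "TP": 0.10}
shares = {tenant: w * capacity_kreqs for tenant, w in weights.items()}
print(shares)  # {'Zynga': 80.0, 'Yelp': 120.0, 'Foursquare': 160.0, 'TP': 40.0}
```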
11
Pisces: Predictable Shared Cloud Storage• Pisces– Per-tenant max-min fair shares of system-
wide resources ~ min guarantees, high utilization
– Arbitrary object popularity– Different resource bottlenecks
• Amazon DynamoDB– Per-tenant provisioned rates
~ rate limited, non-work conserving– Uniform object popularity– Single resource (1kB requests)
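To make "per-tenant max-min fair shares" concrete, here is a minimal water-filling sketch. It is illustrative only: Pisces enforces shares through the four distributed mechanisms described next, not through a centralized computation, and the function name, tenants, and numbers below are hypothetical.

```python
def weighted_max_min(capacity, demands, weights):
    """Water-filling: each tenant gets at most its demand, and any unused
    share is redistributed to the others in proportion to their weights."""
    alloc = {t: 0.0 for t in demands}
    active = set(demands)
    remaining = capacity
    while active and remaining > 1e-9:
        total_w = sum(weights[t] for t in active)
        satisfied = set()
        for t in active:
            share = remaining * weights[t] / total_w
            if demands[t] - alloc[t] <= share:
                alloc[t] = demands[t]          # demand met; frees capacity
                satisfied.add(t)
        if not satisfied:
            # No one can be fully satisfied: split what's left by weight.
            for t in active:
                alloc[t] += remaining * weights[t] / total_w
            break
        remaining = capacity - sum(alloc.values())
        active -= satisfied
    return alloc

# Hypothetical example: one tenant under-uses its share, the others absorb the slack.
print(weighted_max_min(400, {"A": 50, "B": 300, "C": 300},
                            {"A": 0.2, "B": 0.4, "C": 0.4}))
# {'A': 50, 'B': 175.0, 'C': 175.0}
```

Under DynamoDB's provisioned-rate model, by contrast, each tenant is capped at its provisioned rate and unused capacity is not redistributed (non-work-conserving).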
Predictable Multi-Tenant Key-Value Storage
[Figure: Tenant A and Tenant B VMs issue requests (e.g., GET 1101100) through request routers (RR); a controller performs partition placement (PP) and weight allocation (WA), splitting each tenant's global weight (WeightA, WeightB) into per-node local weights (WA1, WB1 and WA2, WB2); request routers perform replica selection (RS); storage nodes enforce the local weights with fair queuing (FQ)]
Strawman: Place Partitions Randomly
[Figure: assigning tenant partitions to storage nodes at random ignores per-partition demand, leaving some nodes overloaded]
Pisces: Place Partitions By Fairness Constraints
• Controller collects per-partition tenant demand
• Bin-packs partitions onto nodes subject to fairness and capacity constraints
• Results in a feasible partition placement
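A minimal sketch of demand-aware placement under a capacity constraint, assuming a simple greedy heuristic (heaviest partitions first, each onto the currently least-loaded node); the paper's actual constraint-based formulation is not reproduced here, and all names and numbers are illustrative.

```python
def place_partitions(partition_demand, node_capacity, num_nodes):
    """Greedy bin-packing of partitions onto nodes so that no node's
    summed partition demand exceeds its capacity."""
    load = [0.0] * num_nodes
    placement = {}
    # Heaviest partitions first, so hot partitions spread across nodes.
    for part, demand in sorted(partition_demand.items(),
                               key=lambda kv: kv[1], reverse=True):
        node = min(range(num_nodes), key=lambda n: load[n])  # least-loaded node
        if load[node] + demand > node_capacity:
            raise ValueError("no feasible placement with this heuristic")
        placement[part] = node
        load[node] += demand
    return placement

# Hypothetical per-partition demands (kreq/s) collected by the controller
demands = {"A.p1": 60, "A.p2": 20, "B.p1": 50, "B.p2": 30}
print(place_partitions(demands, node_capacity=100, num_nodes=2))
# {'A.p1': 0, 'B.p1': 1, 'B.p2': 1, 'A.p2': 0}
```

Since placement sits at the slowest timescale of the four mechanisms (see the timescale slide later in the deck), a relatively expensive computation like this only needs to run occasionally, as collected demand estimates change.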
Strawman: Allocate Local Weights Evenly
[Figure: giving Tenant A and Tenant B equal local weights on every node (WA1 = WB1, WA2 = WB2) ignores per-node demand skew, so a node where one tenant is hot becomes overloaded]
Pisces: Allocate Local Weights By Tenant Demand
• Compute each tenant's per-node +/- mismatch between demand and local weight
• Pick the maximum mismatch and perform a reciprocal weight swap (A←B on one node, A→B on the other), moving from the even split to WA1 > WB1 and WA2 < WB2
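A minimal sketch of the swap idea for two tenants and two nodes, assuming a simple mismatch metric (a tenant's share of local demand minus its local weight); the names, step size, and metric are assumptions, not the paper's algorithm.

```python
def reciprocal_weight_swap(weights, demands, step=0.05):
    """Two-tenant sketch. weights[node][tenant] are local weights summing to 1
    per node; demands[node][tenant] are observed request rates.  Find the node
    where tenant 'A' is most under-weighted relative to its demand and the node
    where 'B' is, then swap `step` weight between them so that each node's
    total weight stays constant."""
    def deficit(node, tenant):
        d = demands[node]
        share_of_demand = d[tenant] / (sum(d.values()) or 1.0)
        return share_of_demand - weights[node][tenant]   # > 0 means under-weighted

    node_a = max(weights, key=lambda n: deficit(n, "A"))  # A most starved here
    node_b = max(weights, key=lambda n: deficit(n, "B"))  # B most starved here
    if node_a == node_b:
        return weights                                    # nothing useful to swap
    weights[node_a]["A"] += step; weights[node_a]["B"] -= step
    weights[node_b]["B"] += step; weights[node_b]["A"] -= step
    return weights

# Hypothetical: A is hot on node 1, B is hot on node 2; start from an even split.
w = {1: {"A": 0.5, "B": 0.5}, 2: {"A": 0.5, "B": 0.5}}
d = {1: {"A": 80, "B": 20}, 2: {"A": 30, "B": 70}}
print(reciprocal_weight_swap(w, d))  # A gains weight on node 1, B on node 2
```

The deck later names the real mechanism, a maximum bottleneck flow weight exchange, which generalizes this pairwise swap beyond two tenants and two nodes.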
Strawman: Select Replicas Evenly
[Figure: request routers split a tenant's GETs 50% / 50% across replicas even though the local weights now differ (WA1 > WB1, WA2 < WB2), so demand no longer matches the local weight allocation]
Pisces: Select Replicas By Local Weight
• Request routers skew each tenant's replica split (e.g., 60% / 40%) to match the local weights
• Weight mismatch is detected via request latency and the split is adjusted accordingly
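A minimal sketch of latency-driven split adjustment, loosely in the spirit of the FAST-TCP based replica selection named on the mechanisms slide; the target latency, gain, and update rule here are assumptions for illustration.

```python
import random

class ReplicaSelector:
    """Route a tenant's GETs across two replicas with a tunable split, and
    nudge the split away from the replica whose observed latency rises
    (a sign that the tenant's local weight there is too small)."""
    def __init__(self, split=0.5, gain=0.01):
        self.split = split   # fraction of requests sent to replica 0
        self.gain = gain

    def pick_replica(self):
        return 0 if random.random() < self.split else 1

    def observe(self, replica, latency_ms, target_ms=5.0):
        # Latency above target on a replica => shift load toward the other one.
        error = (latency_ms - target_ms) / target_ms
        self.split += -self.gain * error if replica == 0 else self.gain * error
        self.split = min(0.9, max(0.1, self.split))

rs = ReplicaSelector()
for _ in range(200):                      # replica 0 is congested in this toy run
    r = rs.pick_replica()
    rs.observe(r, latency_ms=12.0 if r == 0 else 4.0)
print(round(rs.split, 2))                 # split drifts down toward the 0.1 floor
```

The 60% / 40% split on this slide can be read as the steady state of this kind of adjustment once WA1 > WB1 and WA2 < WB2.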
Strawman: Queue Tenants By Single Resource
[Figure: one tenant is bandwidth-limited (large GETs), the other request-limited (small GETs); fair-sharing a single fixed bottleneck resource (out bytes) is fair for bandwidth but not for request processing]
Pisces: Queue Tenants By Dominant Resource
• Track a per-tenant resource vector (e.g., out bytes, requests)
• Give each tenant a fair share of its dominant resource
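A minimal sketch of the dominant-resource idea (the selection rule only, not the DRR token-based DRFQ scheduler the deck names later): serve the tenant whose weighted dominant share is currently smallest. All names and numbers are illustrative.

```python
def next_tenant_to_serve(usage, capacity, weights):
    """usage[tenant][resource]: work served so far; capacity[resource]: per-node
    capacity.  A tenant's dominant share is its largest normalized usage across
    resources; serve the tenant with the smallest weighted dominant share."""
    def dominant_share(t):
        return max(usage[t][r] / capacity[r] for r in capacity)
    return min(usage, key=lambda t: dominant_share(t) / weights[t])

# Hypothetical: A sends large GETs (bandwidth-heavy), B sends tiny GETs (request-heavy).
capacity = {"out_bytes": 1e9, "requests": 100_000}
usage = {"A": {"out_bytes": 4e8, "requests": 400},     # dominant: 40% of bandwidth
         "B": {"out_bytes": 1e6, "requests": 30_000}}  # dominant: 30% of requests
print(next_tenant_to_serve(usage, capacity, {"A": 1.0, "B": 1.0}))  # 'B'
```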
Pisces Mechanisms Solve For Global Fairness
[Figure: the four mechanisms operate at different timescales (minutes → seconds → microseconds) and levels of system visibility (global → local):
– Controller (global view): Partition Placement and Weight Allocation, driven by fairness and capacity constraints and a maximum bottleneck flow weight exchange
– Request routers (RR, local view): FAST-TCP based replica selection policies
– Storage servers (SS, local view): DRR token-based DRFQ scheduler for fair queuing]
Issues
• RS <--> consistency
  – Turn off RS. Why does it not matter?
• What happens to the previous picture?
• Translating SLAs to weights
  – How are SLAs specified?
• Overly complex mechanisms: weight swap and FAST-based replica selection?
• Why make it distributed?
  – Paper argues configuration is not scalable, but requires server configuration for FQ anyway
• What happens in a congested network?
• Assumes stable tenant workloads; reasonable?
• Is DynamoDB bad?
Pisces Summary
• Pisces Contributions
  – Per-tenant weighted max-min fair shares of system-wide resources w/ high utilization
  – Arbitrary object distributions
  – Different resource bottlenecks
  – Novel decomposition into 4 complementary mechanisms: Partition Placement (PP), Weight Allocation (WA), Replica Selection (RS), Fair Queuing (FQ)
Evaluation
• Does Pisces achieve (even) system-wide fairness?
  - Is each Pisces mechanism necessary for fairness?
  - What is the overhead of using Pisces?
• Does Pisces handle mixed workloads?
• Does Pisces provide weighted system-wide fairness?
• Does Pisces provide local dominant resource fairness?
• Does Pisces handle dynamic demand?
• Does Pisces adapt to changes in object popularity?
Pisces Achieves System-wide Per-tenant Fairness
[Figure: per-tenant throughput, unmodified Membase vs Pisces; ideal fair share: 110 kreq/s per tenant (1 kB requests)]
• Unmodified Membase: 0.57 MMR; Pisces: 0.98 MMR
• Min-Max Ratio (MMR): min rate / max rate, in (0, 1]; higher is fairer
• Setup: 8 tenants, 8 clients, 8 storage nodes; Zipfian object popularity distribution
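For reference, the Min-Max Ratio metric defined above, as a one-liner (the tenant rates are hypothetical):

```python
def min_max_ratio(rates):
    """Fairness metric from the talk: min tenant rate / max tenant rate, in (0, 1]."""
    return min(rates) / max(rates)

print(round(min_max_ratio([98, 102, 110, 112, 105, 108, 100, 107]), 2))  # 0.88
```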
Each Pisces Mechanism Contributes to System-wide Fairness and Isolation
[Figure: Min-Max Ratio for unmodified Membase and for Pisces with mechanisms enabled incrementally (FQ; PP+WA+FQ; PP+WA+RS+FQ), under 1x and 2x tenant demand; MMR improves from roughly 0.36-0.64 with few or no mechanisms to 0.89-0.98 with all four enabled]
Pisces Imposes Low Overhead
[Figure: throughput overhead comparison against unmodified Membase; values noted on the slide: < 5% and > 19%]
Pisces Achieves System-wide Weighted Fairness
[Figure: weighted fairness with a skewed tenant mix (4 heavy hitters, 20 moderate-demand, 40 low-demand tenants); MMR values noted on the slide range from 0.56 up to 0.89-0.98]
Pisces Achieves Dominant Resource Fairness
[Figure: bandwidth (Mb/s) and GET request rate (kreq/s) over time for two workloads: the 1 kB workload is bandwidth-limited and receives 76% of bandwidth (24% of request rate), while the 10 B workload is request-limited and receives 76% of the request rate]
Pisces Adapts to Dynamic Demand
[Figure: tenant demand patterns over time (constant, bursty, diurnal with 2x weight); the 2x-weight tenant receives ~2x throughput while the others share evenly]