Storage: Isolation and Fairness
Lecture 24, Aditya Akella


Page 1: Storage: Isolation and Fairness

Storage: Isolation and Fairness

Lecture 24, Aditya Akella

Page 2: Storage: Isolation and Fairness

Performance Isolation and Fairness in Multi-Tenant Cloud Storage, D. Shue, M. Freedman and A. Shaikh, OSDI 2012.

Page 3: Storage: Isolation and Fairness

Setting: Shared Storage in the Cloud

[Figure: tenants Z, Y, T, and F issue requests to shared cloud storage services: S3, EBS, SQS, and a shared key-value store]

Page 4: Storage: Isolation and Fairness

Predictable Performance is Hard

Multiple co-located tenants ⇒ resource contention

[Figure: requests from tenants Z, Y, T, and F contend at the nodes of the shared key-value store]

Page 5: Storage: Isolation and Fairness

Predictable Performance is Hard

Multiple co-located tenants ⇒ resource contention

[Figure: per-tenant fair queuing at a single "big iron" node of the shared key-value store]

Page 6: Storage: Isolation and Fairness

Predictable Performance is Hard

Multiple co-located tenants ⇒ resource contention
Distributed system ⇒ distributed resource allocation

[Figure: tenant requests (Z, Y, T, F) spread across multiple storage servers (SS)]

Page 7: Storage: Isolation and Fairness

Predictable Performance is Hard

Multiple co-located tenants ⇒ resource contention
Distributed system ⇒ distributed resource allocation

[Figure: each tenant's keyspace (Z, Y, T, F) is split into data partitions across the storage servers, with skewed popularity across partitions]

Page 8: Storage: Isolation and Fairness

Predictable Performance is Hard

Multiple co-located tenants ⇒ resource contention
Distributed system ⇒ distributed resource allocation
Skewed object popularity ⇒ variable per-node demand
Disparate workloads ⇒ different bottleneck resources

[Figure: mixed request types across the storage servers: 1 kB GETs (large reads), 10 B GETs (small reads), 1 kB SETs (large writes), 10 B SETs (small writes)]

Page 9: Storage: Isolation and Fairness

Tenants Want System-wide Resource Guarantees

Multiple co-located tenants ⇒ resource contention
Distributed system ⇒ distributed resource allocation
Skewed object popularity ⇒ variable per-node demand
Disparate workloads ⇒ different bottleneck resources

[Figure: tenants Zynga, Yelp, TP, and Foursquare share the key-value store with system-wide throughput guarantees of 80, 120, 40, and 160 kreq/s respectively]

Page 10: Storage: Isolation and Fairness

Pisces Provides Weighted Fair Shares

Multiple co-located tenants ⇒ resource contention
Distributed system ⇒ distributed resource allocation
Skewed object popularity ⇒ variable per-node demand
Disparate workloads ⇒ different bottleneck resources

[Figure: Zynga, Yelp, TP, and Foursquare share the key-value store with weights wz = 20%, wy = 30%, wt = 10%, wf = 40%]

Page 11: Storage: Isolation and Fairness

Pisces: Predictable Shared Cloud Storage

• Pisces
  – Per-tenant max-min fair shares of system-wide resources ~ min guarantees, high utilization
  – Arbitrary object popularity
  – Different resource bottlenecks
• Amazon DynamoDB
  – Per-tenant provisioned rates ~ rate limited, non-work conserving
  – Uniform object popularity
  – Single resource (1 kB requests)
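
To make the "per-tenant max-min fair shares" idea concrete, here is a minimal weighted water-filling sketch. It is not Pisces's algorithm, only the share model it targets: a tenant that demands less than its weighted share keeps its demand, and the leftover capacity is redistributed among the rest in proportion to their weights. The function name, demands, and capacity figures are illustrative.

```python
# Hypothetical sketch of weighted max-min fair allocation (water-filling).
def weighted_max_min(capacity, demands, weights):
    """Allocate `capacity` across tenants given their demands and weights."""
    alloc = {t: 0.0 for t in demands}
    active = set(demands)
    remaining = capacity
    while active and remaining > 1e-9:
        total_w = sum(weights[t] for t in active)
        satisfied = set()
        for t in active:
            share = remaining * weights[t] / total_w
            if demands[t] - alloc[t] <= share:
                alloc[t] = demands[t]          # demand met: tenant drops out
                satisfied.add(t)
        if not satisfied:
            # Every remaining tenant can use its full share: split by weight.
            for t in active:
                alloc[t] += remaining * weights[t] / total_w
            break
        remaining = capacity - sum(alloc.values())
        active -= satisfied
    return alloc

# Example: 400 kreq/s split across the four tenants from the slides;
# tenant T only wants 30, so its slack flows to the others by weight.
print(weighted_max_min(400, {"Z": 200, "Y": 200, "T": 30, "F": 200},
                       {"Z": 0.2, "Y": 0.3, "T": 0.1, "F": 0.4}))
```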

Page 12: Storage: Isolation and Fairness

Predictable Multi-Tenant Key-Value Storage

[Figure: Tenant A and Tenant B VMs send GETs through request routers (RR) performing replica selection (RS) to storage nodes running fair queuing (FQ); a controller handles partition placement (PP) and weight allocation (WA)]

Page 13: Storage: Isolation and Fairness

Predictable Multi-Tenant Key-Value Storage

[Figure: the controller takes the global tenant weights (WeightA, WeightB) and weight allocation (WA) splits them into per-node local weights (WA1, WB1, WA2, WB2) that drive replica selection (RS) and fair queuing (FQ)]

Page 14: Storage: Isolation and Fairness

Strawman: Place Partitions Randomly

[Figure: the controller places tenant partitions (PP) across storage nodes at random, ignoring per-partition demand]

Page 15: Storage: Isolation and Fairness

Strawman: Place Partitions Randomly

[Figure: random placement lands too much popular data on one storage node, which becomes overloaded]

Page 16: Storage: Isolation and Fairness

Pisces: Place Partitions By Fairness Constraints

• Collect per-partition tenant demand
• Bin-pack partitions onto storage nodes

[Figure: the controller gathers demand statistics and repacks partitions across the storage nodes]

Page 17: Storage: Isolation and Fairness

Pisces: Place Partitions By Fairness Constraints

• Results in a feasible partition placement

[Figure: after bin-packing, each storage node's assigned demand fits within its capacity]
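
As a rough illustration of the bin-packing step, the sketch below greedily places partitions onto nodes by measured demand (first-fit decreasing). It is a stand-in for the controller's placement optimization, not the paper's algorithm; `place_partitions`, `partition_demand`, and `node_capacity` are names made up for the example.

```python
# Hypothetical first-fit-decreasing sketch of demand-aware partition placement.
def place_partitions(partition_demand, node_capacity, num_nodes):
    """partition_demand: partition -> measured demand (e.g. kreq/s).
    Returns node index -> list of partitions, or None if infeasible."""
    load = [0.0] * num_nodes
    placement = {n: [] for n in range(num_nodes)}
    # Largest partitions first makes the greedy packing much tighter.
    for part, demand in sorted(partition_demand.items(),
                               key=lambda kv: kv[1], reverse=True):
        # First node with enough headroom gets the partition.
        for n in range(num_nodes):
            if load[n] + demand <= node_capacity:
                load[n] += demand
                placement[n].append(part)
                break
        else:
            return None  # no node can absorb this partition
    return placement
```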

Page 18: Storage: Isolation and Fairness

Strawman: Allocate Local Weights Evenly

• Even local weight split on every node: WA1 = WB1 and WA2 = WB2

[Figure: with evenly allocated local weights, the node carrying the heavier share of one tenant's demand becomes overloaded]

Page 19: Storage: Isolation and Fairness

Pisces: Allocate Local Weights By Tenant Demand

• Compute each tenant's per-node +/- weight mismatch
• Reciprocal weight swap between the maximally mismatched tenants: A←B on one node, A→B on the other

[Figure: starting from even local weights (WA1 = WB1, WA2 = WB2), the swap yields WA1 > WB1 and WA2 < WB2 so local weights track local demand]
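
A minimal sketch of the reciprocal-swap idea, assuming per-node demand and weight fractions are known: weight moves from the over-weighted tenant to the under-weighted one on each of the two nodes, keeping both the per-node totals and each tenant's overall weight unchanged. The step size and mismatch metric are illustrative, not the paper's exact algorithm.

```python
# Hypothetical sketch of a reciprocal weight swap between two tenants on two nodes.
def swap_weights(local_weight, demand, node_a, node_b, tenant_x, tenant_y,
                 step=0.05):
    """local_weight[node][tenant] and demand[node][tenant] are fractions.
    tenant_x is under-weighted on node_a, tenant_y on node_b."""
    def mismatch(node, tenant):
        # Positive: the tenant needs more weight on this node than it has.
        return demand[node][tenant] - local_weight[node][tenant]

    # Swap only if the mismatches have the expected opposite signs.
    if mismatch(node_a, tenant_x) > 0 and mismatch(node_b, tenant_y) > 0:
        delta = min(step,
                    local_weight[node_a][tenant_y],
                    local_weight[node_b][tenant_x])
        # Node A: x gains what y gives up; Node B: the reverse.
        local_weight[node_a][tenant_x] += delta
        local_weight[node_a][tenant_y] -= delta
        local_weight[node_b][tenant_x] -= delta
        local_weight[node_b][tenant_y] += delta
    return local_weight

# Example: A is hot on n1, B is hot on n2; one swap step shifts 0.05 of weight.
weights = {"n1": {"A": 0.5, "B": 0.5}, "n2": {"A": 0.5, "B": 0.5}}
demand  = {"n1": {"A": 0.7, "B": 0.3}, "n2": {"A": 0.3, "B": 0.7}}
swap_weights(weights, demand, "n1", "n2", "A", "B")
# -> n1: A 0.55 / B 0.45, n2: A 0.45 / B 0.55
```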

Page 20: Storage: Isolation and Fairness

Strawman: Select Replicas Evenly

[Figure: the request router splits Tenant A's GETs 50%/50% across replicas even though the local weights differ (WA1 > WB1, WA2 < WB2)]

Page 21: Storage: Isolation and Fairness

Pisces: Select Replicas By Local Weight

• Detect local weight mismatch from request latency
• Skew the replica-selection split toward the replica with more local weight (here from 50%/50% to 60%/40%)

[Figure: the request router sends 60% of Tenant A's GETs to the node where WA1 > WB1 and 40% to the node where WA2 < WB2]
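
The sketch below shows one way a request router could skew its replica split from observed latency, loosely modeled on the FAST-TCP update the slide names (a replica's weight grows while its latency sits near the baseline and shrinks when queuing inflates it). The class, constants, and update details are assumptions for illustration, not Pisces's implementation.

```python
# Hypothetical latency-driven replica selection, FAST-TCP-style weight update.
import random

class ReplicaSelector:
    def __init__(self, replicas, gamma=0.5, alpha=0.01):
        self.weights = {r: 1.0 for r in replicas}
        self.base_latency = {r: None for r in replicas}  # lowest latency seen
        self.gamma = gamma      # smoothing factor, as in FAST TCP
        self.alpha = alpha      # small additive increase term

    def record_latency(self, replica, latency):
        """Update the replica's weight from one observed request latency."""
        base = self.base_latency[replica]
        base = latency if base is None else min(base, latency)
        self.base_latency[replica] = base
        w = self.weights[replica]
        # FAST-style update: scale the weight by base/latency (< 1 when the
        # replica is congested) plus a small additive term, then smooth.
        self.weights[replica] = (1 - self.gamma) * w + \
                                self.gamma * ((base / latency) * w + self.alpha)

    def pick(self):
        """Choose a replica with probability proportional to its weight."""
        total = sum(self.weights.values())
        x = random.uniform(0, total)
        acc = 0.0
        for replica, w in self.weights.items():
            acc += w
            if x <= acc:
                return replica
        return replica  # guard against floating-point rounding
```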

Page 22: Storage: Isolation and Fairness

Strawman: Queue Tenants By Single Resource

• Fair queuing over a single bottleneck resource (out bytes), even though one tenant is bandwidth limited and the other is request limited

[Figure: per-tenant queues at the storage nodes, scheduled only by the out-bytes fair share; both out bytes and requests (req) are consumed]

Page 23: Storage: Isolation and Fairness

Pisces: Queue Tenants By Dominant Resource

• Track a per-tenant resource vector (out bytes, requests)
• Schedule by dominant resource fair share

[Figure: the bandwidth-limited tenant and the request-limited tenant each receive a fair share of their dominant resource]
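
The scheduling decision can be sketched with plain dominant-resource-fairness bookkeeping: charge each tenant in units of whichever resource it uses the largest fraction of, and serve the tenant with the smallest weighted dominant share. This only illustrates the DRF idea; the paper's FQ mechanism is a DRR token-based DRFQ scheduler, and the numbers below are made up.

```python
# Hypothetical dominant-resource-fairness bookkeeping for the FQ decision.
def pick_next_tenant(consumed, capacity, weights):
    """consumed[t][r]: resource r used by tenant t so far (bytes, requests).
    capacity[r]: node capacity for resource r.
    Returns the tenant whose weighted dominant share is currently smallest."""
    def dominant_share(t):
        shares = (consumed[t][r] / capacity[r] for r in capacity)
        return max(shares) / weights[t]
    return min(consumed, key=dominant_share)

# Example: tenant A is bandwidth-heavy (1 kB GETs), tenant B request-heavy.
consumed = {"A": {"bytes": 9.0e6, "reqs": 9_000},
            "B": {"bytes": 2.0e5, "reqs": 20_000}}
capacity = {"bytes": 12.0e6, "reqs": 80_000}
weights = {"A": 1.0, "B": 1.0}
print(pick_next_tenant(consumed, capacity, weights))  # -> "B"
```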

Page 24: Storage: Isolation and Fairness

Pisces Mechanisms Solve For Global Fairness

[Figure: the four mechanisms arranged by timescale (minutes → seconds → microseconds) and system visibility (global → local): the controller performs partition placement (PP) under fairness and capacity constraints and weight allocation (WA) via maximum bottleneck flow weight exchange; the request routers (RR) run FAST-TCP based replica selection (RS) policies; the storage servers (SS) run a DRR token-based DRFQ fair-queuing (FQ) scheduler]

Page 25: Storage: Isolation and Fairness

Issues

• RS <--> consistency
  – Turn off RS. Why does it not matter?
• What happens to the previous picture?
• Translating SLAs to weights
  – How are SLAs specified?
• Overly complex mechanisms: weight swap and FAST-based replica selection?
• Why make it distributed?
  – The paper argues central configuration is not scalable, but it requires server configuration for FQ anyway
• What happens in a congested network?
• Assumes stable tenant workloads: is that reasonable?
• Is DynamoDB bad?

Page 26: Storage: Isolation and Fairness

Pisces Summary

• Pisces contributions
  – Per-tenant weighted max-min fair shares of system-wide resources with high utilization
  – Arbitrary object distributions
  – Different resource bottlenecks
  – Novel decomposition into 4 complementary mechanisms: Partition Placement (PP), Weight Allocation (WA), Replica Selection (RS), and Fair Queuing (FQ)

Page 27: Storage: Isolation and Fairness

Evaluation

Page 28: Storage: Isolation and Fairness

Evaluation

• Does Pisces achieve (even) system-wide fairness?
  - Is each Pisces mechanism necessary for fairness?
  - What is the overhead of using Pisces?
• Does Pisces handle mixed workloads?
• Does Pisces provide weighted system-wide fairness?
• Does Pisces provide local dominant resource fairness?
• Does Pisces handle dynamic demand?
• Does Pisces adapt to changes in object popularity?


Page 30: Storage: Isolation and Fairness

Pisces Achieves System-wide Per-tenant Fairness

Setup: 8 tenants, 8 clients, 8 storage nodes; Zipfian object popularity distribution; ideal fair share 110 kreq/s (1 kB requests).
Min-Max Ratio (MMR): min rate / max rate, in (0, 1].

[Figure: per-tenant throughput over time: unmodified Membase achieves 0.57 MMR versus 0.98 MMR for Pisces]
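
The Min-Max Ratio used throughout the evaluation is simply the lowest per-tenant rate divided by the highest, so 1.0 means perfectly even shares. A one-line sketch with made-up example rates:

```python
def min_max_ratio(rates):
    """Min-Max Ratio (MMR): min tenant rate / max tenant rate, in (0, 1]."""
    return min(rates) / max(rates)

print(min_max_ratio([98_000, 105_000, 110_000, 112_000]))  # -> 0.875
```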

Page 31: Storage: Isolation and Fairness

Each Pisces Mechanism Contributes to System-wide Fairness and Isolation

[Figure: MMR under 1x and 2x demand as mechanisms are added (FQ alone, then PP+WA+FQ, then PP+WA+RS+FQ), compared with unmodified Membase; the reported MMR values range from 0.36 to 0.98 and improve as mechanisms are added]

Page 32: Storage: Isolation and Fairness

Pisces Imposes Low Overhead

[Figure: throughput overhead comparison: under 5% in one configuration versus over 19% in the other]

Page 33: Storage: Isolation and Fairness

Pisces Achieves System-wide Weighted Fairness

[Figure: weighted shares for a mix of 4 heavy-hitter, 20 moderate-demand, and 40 low-demand tenants; the MMR values shown range from 0.56 up to 0.98]

Page 34: Storage: Isolation and Fairness

Pisces Achieves Dominant Resource Fairness

[Figure: bandwidth (Mb/s) and GET request rate (kreq/s) over time: the 1 kB, bandwidth-limited workload receives 76% of bandwidth and 24% of the request rate, while the 10 B, request-limited workload receives 76% of the request rate]

Page 35: Storage: Isolation and Fairness

Pisces Adapts to Dynamic Demand

[Figure: tenants with constant, diurnal (2x weight), and bursty demand; the 2x-weighted tenant receives roughly 2x throughput while the others receive even shares]