Towards Multi-Tenant Performance SLOs

Willis Lang*, Srinath Shankar+, Jignesh M. Patel*, Ajay Kalhan^ (*University of Wisconsin-Madison, +Microsoft Gray Systems Lab, ^Microsoft Corp.). To appear in ICDE 2012.

Page 1: Towards Multi-Tenant Performance SLOs

Towards Multi-Tenant Performance SLOs

Willis Lang*, Srinath Shankar+, Jignesh M. Patel*, Ajay Kalhan^

*University of Wisconsin-Madison
+Microsoft Gray Systems Lab
^Microsoft Corp.

To appear in ICDE 2012

Page 2: Towards Multi-Tenant Performance SLOs

Overall Operating Costs of Providing Cloud Services are High

Dominating costs are server and power costs: 57% and 31% respectively

Monthly Cost of a 46,000-Server Data Center [Hamilton, 2011]

Servers          $1,852,778   (57%)
Power            $1,007,651   (31%)
Networking       $260,039     (8%)
Infrastructure   $130,019     (4%)

Server & Power: 88%
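The percentages on this slide follow directly from the four monthly figures. A minimal Python check of that arithmetic, using only the dollar amounts listed above (the variable names are illustrative):

```python
# Monthly cost figures as listed on the slide [Hamilton, 2011].
monthly_costs = {
    "Servers": 1_852_778,
    "Networking": 260_039,
    "Power": 1_007_651,
    "Infrastructure": 130_019,
}

total = sum(monthly_costs.values())

# Share of the total for each category, rounded to a whole percent.
shares = {name: round(100 * cost / total) for name, cost in monthly_costs.items()}

server_and_power = shares["Servers"] + shares["Power"]

for name, pct in shares.items():
    print(f"{name}: {pct}%")
print(f"Servers + Power: {server_and_power}%")
```

Servers and power together account for nearly nine tenths of the monthly bill, which is why the rest of the talk focuses on reducing the number of servers.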

Page 3: Towards Multi-Tenant Performance SLOs

Performance Service Level Objectives and Managing Cloud Costs

Tenants can get their own server and high performance

Tenants have performance objectives

Consolidate tenants onto the fewest number of servers (maximize the degree of multi-tenancy) while maintaining perf objectives

[Figure: Performance per Tenant vs. Data Center Costs]

Page 4: Towards Multi-Tenant Performance SLOs

An Optimization Problem

Find: (1) Tenant Scheduling Policies and (2) Hardware Provisioning Policies

Such that costs are minimized and performance is delivered

Given: Groups of tenants with different performance objectives and a number of server configurations

[Figure: tenant groups labeled High Perf and Low Perf]

Page 5: Towards Multi-Tenant Performance SLOs

Multi-Tenant Scheduling

Perf Objective: TPC-C throughput
  H tenants: 100tps
  L tenants: 10tps

Want to maximize degree of multi-tenancy without breaking SLO

#H   #L   Avg H Perf (tps ea.)   Avg L Perf (tps ea.)
1    1    2000                   2000
5    20   900                    110
20   40   130                    30

What if we also have different server types available?
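The check behind this slide can be sketched in a few lines of Python. The sketch assumes the slide's numbers read as three configurations of (#H, #L, average H tps, average L tps); the names `SLO_TPS`, `observations`, and `meets_slo` are illustrative and not from the talk.

```python
# Throughput objectives from the slide: 100 tps for H tenants, 10 tps for L tenants.
SLO_TPS = {"H": 100, "L": 10}

# Assumption: the slide's numbers read as three configurations of
# (#H tenants, #L tenants, avg H tps each, avg L tps each).
observations = [
    (1, 1, 2000, 2000),
    (5, 20, 900, 110),
    (20, 40, 130, 30),
]

def meets_slo(avg_h_tps, avg_l_tps):
    """True when both tenant classes reach their throughput objective."""
    return avg_h_tps >= SLO_TPS["H"] and avg_l_tps >= SLO_TPS["L"]

# Keep only configurations that satisfy both objectives, then pick the one
# with the most tenants on the server (the highest degree of multi-tenancy).
feasible = [obs for obs in observations if meets_slo(obs[2], obs[3])]
best = max(feasible, key=lambda obs: obs[0] + obs[1])
print(best[0] + best[1])
```

Under that reading, all three configurations meet both objectives, and the densest one packs 60 tenants onto the server while leaving little headroom above the 100tps and 10tps targets.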

Page 6: Towards Multi-Tenant Performance SLOs

Hardware Setup

Base server:
  2 x Intel Nehalem L5630
  32GB DDR3
  RAID battery-backed cache
  1 x 10k RPM SAS for OS/software

plus one of two storage configurations:

“diskC” - $4000 ($111 per month)
  Data: 2 x 10k RPM SAS 300GB
  Log: 1 x 10k RPM SAS 300GB

“ssdC” - $4500 ($125 per month)
  Data: 2 x Crucial C300 256GB
  Log: 1 x Crucial C300 256GB
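The per-month figures are consistent with spreading each purchase price evenly over 36 months. The 36-month period is an inference from the numbers, not something the slide states; a minimal Python check under that assumption:

```python
# Assumption: 36-month straight-line amortization, inferred from the slide's
# figures ($4000 -> $111/month, $4500 -> $125/month); not stated in the talk.
AMORTIZATION_MONTHS = 36

# Purchase prices as listed on the slide.
purchase_price = {"diskC": 4000, "ssdC": 4500}

# Monthly cost per server, rounded to whole dollars.
monthly_cost = {
    sku: round(price / AMORTIZATION_MONTHS) for sku, price in purchase_price.items()
}
print(monthly_cost)
```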

Page 7: Towards Multi-Tenant Performance SLOs

Software Setup

SQL Server 2012

All tenants of the ‘H’ performance class get an individual database within a SQL Server instance

Databases in SQL Server have their own physical files for data and log

All tenants of the ‘L’ performance class get an individual database within a different SQL Server instance

Performance is controlled through SQL Server instance memory provisioning (not through VMs)

Page 8: Towards Multi-Tenant Performance SLOs

Heterogeneous SLO Characterization

Benchmark server to find max degree of multi-tenancy for perf objectives

Systematically reduce ‘H’ tenants, steadily increase ‘L’ tenant scheduling until a perf objective fails

Server characterizing function:

[Figure: characterization plots for diskC and ssdC. Horizontal axis: Number of L (10tps) Tenants, 0 to 100. Vertical axis: Num of H (100tps) Tenants, 0 to 25. Legend: both perf objectives met; some perf objective fails.]
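The benchmarking sweep described on this slide can be sketched as follows. This is a minimal illustration, not the authors' harness: `toy_benchmark` is a hypothetical stand-in for a real TPC-C run, its 2500 tps capacity is made up, and the sweep assumes that once an objective fails, adding more L tenants will not make it pass again.

```python
H_TPS, L_TPS = 100, 10  # throughput objectives for the two tenant classes

def toy_benchmark(num_h, num_l, capacity_tps=2500):
    """Hypothetical stand-in for a benchmark run: a fixed server throughput is
    split among tenants in proportion to their objectives."""
    demand = H_TPS * num_h + L_TPS * num_l
    if demand == 0:
        return float("inf"), float("inf")
    scale = capacity_tps / demand
    return H_TPS * scale, L_TPS * scale

def characterize(run_benchmark, max_h=25, max_l=100, step_h=5, step_l=5):
    """For each H-tenant count (stepping down from max_h), add L tenants until
    an objective fails; record the largest passing L count."""
    frontier = {}
    for num_h in range(max_h, -1, -step_h):
        best_l = None
        for num_l in range(0, max_l + 1, step_l):
            avg_h, avg_l = run_benchmark(num_h, num_l)
            h_ok = num_h == 0 or avg_h >= H_TPS
            l_ok = num_l == 0 or avg_l >= L_TPS
            if not (h_ok and l_ok):
                break  # an objective failed; stop adding L tenants
            best_l = num_l
        if best_l is not None:
            frontier[num_h] = best_l
    return frontier

frontier = characterize(toy_benchmark)
print(frontier)
```

The resulting mapping from H-tenant count to the largest passing L-tenant count is one way to represent the boundary between the "both perf objectives met" and "some perf objective fails" regions for a server type.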

Page 9: Towards Multi-Tenant Performance SLOs

Applying Our Optimization Framework

[Figure: characterizing functions for ssdC and diskC. Horizontal axis: Number of L (10tps) Tenants, 0 to 140. Vertical axis: Number H (100tps) Tenants, 0 to 40.]

Scenario: 10,000 tenants, 2,000 x 100tps & 8,000 x 10tps

Optimal Solution:
  94 ssdC servers, each with 38 10tps tenants and 20 100tps tenants
  + 5 diskC servers, each with 25 10tps tenants and 20 100tps tenants
  + 43 ssdC servers, each with 100 10tps tenants
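The monthly server cost of this plan can be worked out from the server counts on this slide and the per-server monthly prices on the hardware slide ($125 for ssdC, $111 for diskC). A short Python sketch of that arithmetic, with illustrative names:

```python
# Per-server monthly costs as listed on the hardware slide.
MONTHLY_COST = {"ssdC": 125, "diskC": 111}

# (server type, number of servers) for each group in the stated solution.
plan = [
    ("ssdC", 94),
    ("diskC", 5),
    ("ssdC", 43),
]

servers_used = sum(count for _, count in plan)
monthly_total = sum(MONTHLY_COST[sku] * count for sku, count in plan)
print(servers_used, monthly_total)
```

This covers server cost only, matching the quantity the talk's cost comparison is plotted in.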

Page 10: Towards Multi-Tenant Performance SLOs

Applying Our Optimization Framework

[Figure: bar chart of Monthly Server Costs ($0 to $30,000) for three methods: Optimal, Only diskC, and Tenant Segregated (ssdC for 100tps tenants, diskC for 10tps tenants)]

Page 11: Towards Multi-Tenant Performance SLOs

Summary

We have presented an optimization framework that tells a Database-as-a-Service provider how to provide performance Service Level Objectives while minimizing cluster infrastructure costs

Page 12: Towards Multi-Tenant Performance SLOs

Thesis Research

Characterizing Performance vs Energy and Server Costs

Cluster Design, Performance in the Cloud (ICDE 12)
  An optimization framework to determine the optimal tenant scheduling and server provisioning in light of tenant performance goals [ICDE 2012]

Low-Power Server Hardware (DaMoN 10, Under Submission)
  Complex parallel analytic workloads cause non-linear speedup and force low-power server clusters to be much larger and more expensive than traditional clusters [DaMoN 2010 Best Paper]
  Parallel data processing bottlenecks such as network bandwidth and algorithmic choices are a cause of energy inefficiency [Under Submission]

Cluster-level Performance and Energy Consumption (VLDB 10, SIGMOD Rec 09)
  Computational complexity of MR jobs affects the ability to save energy by using smaller clusters [VLDB 2010]
  By exploiting existing replication schemes, an elegant relationship between load balancing and energy efficiency can be exploited [SIGMOD Record 2009]

Node-local Performance and Energy Consumption (CIDR 09, IEEE DEB 11)
  Demonstrated that it is possible to decrease energy and performance in a controlled way using hardware mechanisms (e.g., CPU frequency/voltage and memory parking) and algorithmic choices [CIDR 2009, IEEE DEB 2011]

[Figure: Performance vs. Data Center Costs]

Page 13: Towards Multi-Tenant Performance SLOs

Acknowledgements

Special thanks to David DeWitt, Jeff Naughton, Alan Halverson, Eric Robinson, Rimma Nehme, Dimitris Tsirogiannis, Nikhil Teletia, Chris Ré

Funded by a grant from Microsoft Gray Systems Lab



Page 16: Towards Multi-Tenant Performance SLOs

Memory-based resource governor

E.g., 2 performance goals, 100tps and 10tps; 20 tenants pay for 100tps and 30 tenants pay for 10tps

The aggregate memory for all 100tps tenants:

Similarly, for 10tps tenants:

Page 17: Towards Multi-Tenant Performance SLOs

Simplicity vs Cost

Methods      ssdC SKU     diskC SKU
Optimal      Hetero SLO   Hetero SLO
ssdC-only    Hetero SLO   NA
diskC-only   NA           Hetero SLO
ssdC-H       High-perf    Low-perf
ssdC-L       Low-perf     High-perf

[Figure: two bar charts of relative cost (Rel. Cost, 0.0 to 2.0) for tenant mixes of 20% 100tps / 80% 10tps, 50% 100tps / 50% 10tps, and 80% 100tps / 20% 10tps. Panels: diskC cost -10% vs ssdC; diskC cost -30% vs ssdC.]

None of these heuristic methods consistently provides solutions near to the optimal method.

Page 18: Towards Multi-Tenant Performance SLOs

Log Disk Bottlenecks

[Figure: Average Log Write Wait Time (ms) and TPS Achieved by One 100tps Tenant, plotted against <# 1tps tnt>/<# 100tps tnt> for 200/0, 175/1, 150/1, 125/1, 100/1, 75/1. Axis ranges: 0 to 14 and 0 to 180.]