capacity planning and performance management for vcl clouds · capacity planning and performance...

24
Budapest University of Technology and Economics Department of Measurement and Information Systems Budapest University of Technology and Economics Fault Tolerant Systems Research Group Capacity Planning and Performance Management for VCL Clouds Ágnes Salánki, Gergő Kincses, László Gönczy, Imre Kocsis

Upload: others

Post on 25-Sep-2020

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Capacity Planning and Performance Management for VCL Clouds · Capacity Planning and Performance Management for VCL Clouds Ágnes Salánki, Gergő Kincses, László Gönczy, Imre

Budapest University of Technology and Economics Department of Measurement and Information Systems

Budapest University of Technology and Economics Fault Tolerant Systems Research Group

Capacity Planning and Performance Management for VCL Clouds

Ágnes Salánki, Gergő Kincses, László Gönczy, Imre Kocsis

Page 2: Capacity Planning and Performance Management for VCL Clouds · Capacity Planning and Performance Management for VCL Clouds Ágnes Salánki, Gergő Kincses, László Gönczy, Imre

Motivation

Lab Private university cloud

Enterprise cloud Purchased CPU time

Page 3: Capacity Planning and Performance Management for VCL Clouds · Capacity Planning and Performance Management for VCL Clouds Ágnes Salánki, Gergő Kincses, László Gönczy, Imre

Our VCL cloud

Maintained by our research group

5 semesters

o 2 courses/semester

9 hosts

~20 000 reservations

o Only 22 rejected

Page 4: Capacity Planning and Performance Management for VCL Clouds · Capacity Planning and Performance Management for VCL Clouds Ágnes Salánki, Gergő Kincses, László Gönczy, Imre

Reservation Workflow in VCL

Request

o VM type

o Length

o Immediately or later

Hard reservation limit

Load time

Page 5: Capacity Planning and Performance Management for VCL Clouds · Capacity Planning and Performance Management for VCL Clouds Ágnes Salánki, Gergő Kincses, László Gönczy, Imre

Capacity Planning in VCL Capacity Planning

Can I start solving my homework

now?

Do we have spare capacity

for my research next week?

I am responsible for a course with 250 students next September. Can we handle this workload?

Aron Imre

Agnes

Gergo

Laszlo

Page 6: Capacity Planning and Performance Management for VCL Clouds · Capacity Planning and Performance Management for VCL Clouds Ágnes Salánki, Gergő Kincses, László Gönczy, Imre

Capacity Planning in VCL Capacity Planning

Support for hard limit estimation

Spare capacity prediction

Long-term capacity planning/scheduling

Page 7: Capacity Planning and Performance Management for VCL Clouds · Capacity Planning and Performance Management for VCL Clouds Ágnes Salánki, Gergő Kincses, László Gönczy, Imre

The Available Dataset

Host1

Host2

VM1 VM2

VM1 VM2

reservation type, time to load, etc.

cpu usage, memory usage, etc.

cpu usage, memory usage, etc.

deadlines, #students

Page 8: Capacity Planning and Performance Management for VCL Clouds · Capacity Planning and Performance Management for VCL Clouds Ágnes Salánki, Gergő Kincses, László Gönczy, Imre

Data Analysis Steps

Host1

Host2

VM1 VM2

VM1 VM2

Workload prediction

Resource util. pred.

Page 9: Capacity Planning and Performance Management for VCL Clouds · Capacity Planning and Performance Management for VCL Clouds Ágnes Salánki, Gergő Kincses, László Gönczy, Imre

Workflow

Workload prediction

Resource util. pred.

Capacity planning Host1

Host2

VM1 VM2

VM1 VM2

Use case 1

Use case 2

Use case 3

Host

VM VM ?

?

? VM VM VM

Page 10: Capacity Planning and Performance Management for VCL Clouds · Capacity Planning and Performance Management for VCL Clouds Ágnes Salánki, Gergő Kincses, László Gönczy, Imre

Workload prediction

Deadline

Page 11: Capacity Planning and Performance Management for VCL Clouds · Capacity Planning and Performance Management for VCL Clouds Ágnes Salánki, Gergő Kincses, László Gönczy, Imre

Workload prediction

Daily workload follows a Gaussian-like distribution

Page 12: Capacity Planning and Performance Management for VCL Clouds · Capacity Planning and Performance Management for VCL Clouds Ágnes Salánki, Gergő Kincses, László Gönczy, Imre

Model fitting

Page 13: Capacity Planning and Performance Management for VCL Clouds · Capacity Planning and Performance Management for VCL Clouds Ágnes Salánki, Gergő Kincses, László Gönczy, Imre

Workload prediction

Daily workload follows a Gaussian-like distribution

Exponential increase in peak numbers

maximum location between 7 PM and 11 PM

~4 hours as standard deviation

Page 14: Capacity Planning and Performance Management for VCL Clouds · Capacity Planning and Performance Management for VCL Clouds Ágnes Salánki, Gergő Kincses, László Gönczy, Imre

Workload prediction

Students work even in the night

They have lunch and dinner

They skip their lectures

Page 15: Capacity Planning and Performance Management for VCL Clouds · Capacity Planning and Performance Management for VCL Clouds Ágnes Salánki, Gergő Kincses, László Gönczy, Imre

Workload prediction

Changes in students’ behavior?

Page 16: Capacity Planning and Performance Management for VCL Clouds · Capacity Planning and Performance Management for VCL Clouds Ágnes Salánki, Gergő Kincses, László Gönczy, Imre

Resource Utilization Prediction

𝑓(𝑤𝑜𝑟𝑘𝑙𝑜𝑎𝑑)??

Page 17: Capacity Planning and Performance Management for VCL Clouds · Capacity Planning and Performance Management for VCL Clouds Ágnes Salánki, Gergő Kincses, László Gönczy, Imre

Challenges

It is a cloud

o Statistical multiplexing

o Workload is not uniformly distributed

Meleg tartalék

Más válasz ugyanarra a terhelésre, pl. a memóriánál

2012/2013/2 2013/2014/1 2013/2014/2 2014/2015/1

Page 18: Capacity Planning and Performance Management for VCL Clouds · Capacity Planning and Performance Management for VCL Clouds Ágnes Salánki, Gergő Kincses, László Gönczy, Imre

Host2

Host1

Challenges

It is a cloud

Hosts show different behavior

o Warm spare

o Different user behavior

o ???

Page 19: Capacity Planning and Performance Management for VCL Clouds · Capacity Planning and Performance Management for VCL Clouds Ágnes Salánki, Gergő Kincses, László Gönczy, Imre

Resource utilization analysis: memory

Linear model

o 𝑀𝑒𝑚(𝑉𝑀1) + 𝑀𝑒𝑚(𝑉𝑀2) + … + 𝑀𝑒𝑚(𝑚𝑔𝑚𝑡)

o Weighted by the workload

Very good at following drastic changes

Within 5% by the 97% of time

Page 20: Capacity Planning and Performance Management for VCL Clouds · Capacity Planning and Performance Management for VCL Clouds Ágnes Salánki, Gergő Kincses, László Gönczy, Imre

Resource utilization analysis: memory

Linear model

o 𝑀𝑒𝑚(𝑉𝑀1) + 𝑀𝑒𝑚(𝑉𝑀2) + … + 𝑀𝑒𝑚(𝑚𝑔𝑚𝑡)

o Weighted by the workload

Page 21: Capacity Planning and Performance Management for VCL Clouds · Capacity Planning and Performance Management for VCL Clouds Ágnes Salánki, Gergő Kincses, László Gönczy, Imre

Resource utilization analysis: CPU

Linear model

o 𝐶𝑃𝑈(𝑉𝑀1) + 𝐶𝑃𝑈(𝑉𝑀2) + … + 𝐶𝑃𝑈(𝑚𝑔𝑚𝑡)

o Weighted by the workload CPU is much more sensitive than memory

Page 22: Capacity Planning and Performance Management for VCL Clouds · Capacity Planning and Performance Management for VCL Clouds Ágnes Salánki, Gergő Kincses, László Gönczy, Imre

Resource utilization analysis: CPU

Linear model

o 𝐶𝑃𝑈(𝑉𝑀1) + 𝐶𝑃𝑈(𝑉𝑀2) + … + 𝐶𝑃𝑈(𝑚𝑔𝑚𝑡)

o Weighted by the workload

The students use the CPU more

intensively before the deadline

Page 23: Capacity Planning and Performance Management for VCL Clouds · Capacity Planning and Performance Management for VCL Clouds Ágnes Salánki, Gergő Kincses, László Gönczy, Imre

Resource utilization analysis: CPU

Linear model

o 𝐶𝑃𝑈(𝑉𝑀1, 𝒘𝒍) + 𝐶𝑃𝑈(𝑉𝑀2, 𝒘𝒍) + …

o Weighted by the workload

Page 24: Capacity Planning and Performance Management for VCL Clouds · Capacity Planning and Performance Management for VCL Clouds Ágnes Salánki, Gergő Kincses, László Gönczy, Imre

Summary

Data-driven static capacity planning

o „user behavior” analysis

o resource fingerprint estimation

Conclusions:

o student behavior can be modelled easily

o we were sometimes (too) strict

Dynamic capacity planning?

o Long loading time failed reservations soon

o When to burst out to a public cloud?