XRM: An Event-based Resource Management Framework for XCP. Pradeep Padala, in collaboration with Ken Igarashi, Akshay I. Mehta, and Ulas C. Kozat

Post on 19-Oct-2014

TRANSCRIPT

Page 1: Xrm xensummit

XRM: An Event-based Resource Management Framework for XCP

Pradeep Padala

in collaboration with Ken Igarashi, Akshay I. Mehta, and Ulas C. Kozat

Page 2: Xrm xensummit

Typical scenario in shared infrastructures

Data center: a shared infrastructure (cloud) hosting web search and data analytics

Xen Summit AMD 2010

Page 3: Xrm xensummit

Application requirements

Web search: fast searches, low response time

Data analytics: analyze large data, high throughput

QoS differentiation: 3:1

Page 4: Xrm xensummit

How to host these applications?

Physical partitioning: app1 web, app1 db, app2, and app3 on Node I to Node IV
×  Wasteful
×  Difficult to manage

Virtualized data center: app1 web, app1 db, app2, and app3 on Node I and Node II, each node running a virtualization layer
✓  Improved utilization
✓  Reduced costs
✓  High flexibility (elastic!)

Virtualized shared data center = a new paradigm!

Challenge: How to allocate resources to meet goals?

Page 5: Xrm xensummit

Challenge #1: Developers don’t want to manage resources

ProvisionVMs()
RunApplications()
While (true) {
    MonitorApplications()
    If (AppPerformance != GOAL) {
        FindReason()
        If (ScaleUp) {
            FindAvailableResources()
            MigrateVM()
        }
        If (ScaleOut) {
            ProvisionVMs()
            RunApplication()
        }
    }
    If (Consolidation == True) {
        FindSuitableVMs()
        Consolidate()
    }
}

How to determine what to do? Scale up? Scale out? Migrate? Clone?

Where to provision VMs?

How to consolidate VMs?

Cloud providers want to consolidate multiple services too!

Holy Grail: DeployService(); AutoScale();

Page 6: Xrm xensummit

Challenge #2: Resource Management Spans Multiple Layers

Layers, top to bottom: Services, PaaS, IaaS, Hardware. Resource management spans all of them.

How to pass information between the layers so that they don’t make conflicting decisions?

Page 7: Xrm xensummit

Challenge #3: Complexity of Scaling Primitives

Slicing
✓  Little overhead
✓  Efficient
×  Limited to single machine

Live Migration
✓  Handles overload
✓  Small downtime
×  Overhead

Cloning
✓  Stateful clone
×  Overhead
×  Side-effects

Live Replication
✓  Maintain connections
×  Overhead

How to combine primitives to achieve goals?

Page 8: Xrm xensummit

What is a perfect Resource Manager?

✓  Automation
✓  Resource allocation
✓  High utilization
✓  High application performance

An RM that automatically rearranges resources among multiple applications/VMs on multiple physical machines and provides optimal resource utilization and application performance

We are building the (ultimate) RM system. XRM = first incarnation on XCP!

Page 9: Xrm xensummit

Outline
•  Motivation
•  Challenges in RM
•  XRM Feedback Control based Design
•  XRM Implementation and Preliminary Results
•  Summary and Feedback

Page 10: Xrm xensummit

How to achieve the automation?

“Almost any system that is considered automatic has some element of feedback control” - Hellerstein et al.

XRM = A Feedback Control System

Page 11: Xrm xensummit

RM in multiple layers

Stack, top to bottom: Services, PaaS RM, IaaS RM, Hardware

PaaS RM: does app modeling and may request changes

IaaS RM (XRM = IaaS RM): knows only about VMs and hardware resources

Interactions shown between the layers: high-level service request, slice request, automated control loop, slice changes

Page 12: Xrm xensummit

XRM’s feedback control loop

Loop stages: Monitor, Control, Action, operating on XCP

Inputs: network stats, performance goals, control parameters

Actions: change resource shares, migrate, power off machines

Model: can model applications, VMs, and underlying resources
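
As a rough illustration of the Monitor, Control, Action cycle, the following Python sketch adjusts one VM's CPU share until a measured response time meets a goal. The toy `measure_response_time` model, the `gain` parameter, and every name here are assumptions made for this example; none of them are XRM or XCP APIs, and the slides do not say which control law XRM uses.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    cpu_share: float  # fraction of one host's CPU, kept between 0.05 and 1.0
    load: float       # hypothetical work units arriving per second


def measure_response_time(vm: VM) -> float:
    """Monitor step: a toy model where response time grows as share shrinks."""
    return vm.load / vm.cpu_share


def control(vm: VM, measured: float, goal: float, gain: float = 0.5) -> float:
    """Control step: move the share in proportion to the relative error."""
    error = (measured - goal) / goal          # > 0 means the VM is too slow
    new_share = vm.cpu_share * (1.0 + gain * error)
    return min(1.0, max(0.05, new_share))     # keep the share in a valid range


def act(vm: VM, new_share: float) -> None:
    """Action step: a real system would call into the hypervisor here."""
    vm.cpu_share = new_share


def run_loop(vm: VM, goal: float, intervals: int = 20) -> list[float]:
    """Run the monitor/control/act cycle and return the measured history."""
    history = []
    for _ in range(intervals):
        measured = measure_response_time(vm)
        history.append(measured)
        act(vm, control(vm, measured, goal))
    return history


if __name__ == "__main__":
    vm = VM(name="web-search", cpu_share=0.2, load=10.0)
    history = run_loop(vm, goal=20.0)
    print(f"first={history[0]:.1f} last={history[-1]:.1f} share={vm.cpu_share:.2f}")
```

With this toy model the share settles near 0.5, the point at which the modeled response time equals the goal. A real controller would also have to cope with measurement noise and with other VMs competing for the same host.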

Page 13: Xrm xensummit

Current incarnation

XCP monitoring module: receives out-of-band stat updates from XCP nodes; backed by an RRD database

Stats analysis module: processes the stats using (1) thresholds and (2) rules

Core algorithm module: receives filtered stats and stats analysis data; draws on an algorithm bank

Wrapper: takes action on the XCP master node through low-level commands/XAPI commands and Openflow
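
The stats analysis step can be pictured as a set of threshold rules that turn raw samples into named events for the core algorithm module. The sketch below is a minimal illustration of that idea; the stat names, the threshold values, and the `Rule` type are assumptions for this example and do not come from XRM.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    event: str                          # event name emitted when the rule fires
    stat: str                           # key of the stat this rule inspects
    predicate: Callable[[float], bool]  # threshold test applied to the stat


# Hypothetical rules; a real deployment would tune these values.
RULES = [
    Rule("HighCPUUtilization", "cpu_util", lambda v: v > 0.90),
    Rule("PacketDrops", "dropped_packets", lambda v: v > 0),
]


def analyze(host: str, stats: dict[str, float]) -> list[tuple[str, str, float]]:
    """Return (event, host, value) for every rule whose threshold is crossed."""
    events = []
    for rule in RULES:
        value = stats.get(rule.stat)
        if value is not None and rule.predicate(value):
            events.append((rule.event, host, value))
    return events


if __name__ == "__main__":
    sample = {"cpu_util": 0.97, "dropped_packets": 0}
    print(analyze("host-1", sample))
```

Keeping the rules as data rather than hard-coded branches means thresholds can change without touching the algorithms that consume the events.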

Page 14: Xrm xensummit

XRM is an event-based framework
•  Many algorithms can be developed and plugged in
•  The algorithms register for specific events
– High CPU utilization
– Packet drops
– PowerOff
– PowerOn
– …
•  Different algorithms may take different actions

A Common Abstraction for ALL Algorithms
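
One way to picture algorithms registering for events is a small publish/subscribe registry, sketched below in Python. The `EventBus` class, the handler signature, and the two example algorithms are hypothetical; the slides do not show XRM's actual plug-in interface.

```python
from collections import defaultdict
from typing import Callable

Handler = Callable[[str, dict], str]


class EventBus:
    """Minimal registry mapping event names to algorithm callbacks."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Handler]] = defaultdict(list)

    def register(self, event: str, handler: Handler) -> None:
        """An algorithm calls this to subscribe to one event type."""
        self._handlers[event].append(handler)

    def publish(self, event: str, payload: dict) -> list[str]:
        """Deliver the event to every subscriber and collect their actions."""
        return [handler(event, payload) for handler in self._handlers.get(event, [])]


def migrate_on_high_cpu(event: str, payload: dict) -> str:
    # Hypothetical algorithm: react to load by proposing a migration.
    return f"migrate a VM away from {payload['host']}"


def log_power_change(event: str, payload: dict) -> str:
    # Hypothetical algorithm: only records power state changes.
    return f"record {event} for {payload['host']}"


if __name__ == "__main__":
    bus = EventBus()
    bus.register("HighCPUUtilization", migrate_on_high_cpu)
    bus.register("PowerOff", log_power_change)
    bus.register("PowerOn", log_power_change)
    print(bus.publish("HighCPUUtilization", {"host": "host-1"}))
    print(bus.publish("PacketDrops", {"host": "host-2"}))  # no subscriber
```

Because every algorithm sees the same `(event, payload)` shape, different algorithms can be swapped in or combined without changing the code that detects events.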

Page 15: Xrm xensummit

What algorithms can you implement?
•  AutoControl – automated control of multiple virtualized resources [PadalaEurosys09]
•  Models application and sets VM shares based on application goals

[Diagram: three App Controllers and two Node Controllers; goals go in, resource shares come out]

[PadalaEurosys09] Pradeep Padala, Xiaoyun Zhu, Mustafa Uysal et al. Automated Control of Multiple Virtualized Resources. In Proceedings of EuroSys 2009.
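
The two-level split between app controllers and node controllers can be sketched as follows. This is a deliberately simplified illustration and not the algorithm from [PadalaEurosys09]: the proportional request rule, the proportional scale-down on an oversubscribed node, and all names are assumptions made for this example.

```python
def app_controller(current_share: float, measured: float, goal: float) -> float:
    """Request more share when the app misses its goal, less when it beats it.

    `measured` and `goal` are response times, so larger means slower.
    """
    return current_share * (measured / goal)


def node_controller(requests: dict[str, float], capacity: float = 1.0) -> dict[str, float]:
    """Grant requests as-is if they fit; otherwise scale them down proportionally."""
    total = sum(requests.values())
    if total <= capacity:
        return dict(requests)
    scale = capacity / total
    return {vm: share * scale for vm, share in requests.items()}


if __name__ == "__main__":
    requests = {
        "web": app_controller(current_share=0.4, measured=30.0, goal=20.0),
        "analytics": app_controller(current_share=0.5, measured=12.0, goal=10.0),
    }
    grants = node_controller(requests)
    print(requests)
    print(grants)
```

In this example both applications ask for more than the node can supply in total, so the node controller shrinks both requests by the same factor. A priority-aware arbiter (for instance one honoring a QoS differentiation ratio) would replace the uniform `scale` with per-application weights.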

Page 16: Xrm xensummit

Outline
•  Motivation
•  Challenges in RM
•  XRM Feedback Control based Design
•  XRM Implementation and Preliminary Results
•  Summary and Feedback

Page 17: Xrm xensummit

XRM features
•  Interface to upper layers
•  Auto-* features
•  External control
•  Pluggable algorithms
•  Extensibility

Page 18: Xrm xensummit

XRM Implementation
•  Implemented on XCP 0.1.1
•  Written in Python
•  Pluggable algorithms have to be written in Python
•  Currently implements four algorithms
–  Bin packing
–  Bin packing + Live migration
–  Random host
–  Round-robin
•  We have also implemented a simulator (run 1 million VMs on 100,000 nodes!)
–  Can capture data during a “real” run
–  Run multiple algorithms on exact same trace
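
To make the placement algorithms and the "same trace" idea concrete, here is a small Python sketch that replays one list of slice requests through three placement policies and counts the hosts each one touches. The first-fit variant of bin packing, the fallback behavior of the round-robin and random policies when a host is full, and all function names are assumptions for this example rather than XRM's implementation. Live migration is not modeled.

```python
import random
from typing import Iterable


def _place(size: int, candidates: Iterable[int], free: list[int]) -> int:
    """Put the request on the first candidate host with enough free cores."""
    for host in candidates:
        if free[host] >= size:
            free[host] -= size
            return host
    raise ValueError(f"no host has room for a request of size {size}")


def bin_packing(requests: list[int], n_hosts: int, capacity: int) -> list[int]:
    """First-fit: always try hosts in the same fixed order."""
    free = [capacity] * n_hosts
    placement = []
    for size in requests:
        placement.append(_place(size, range(n_hosts), free))
    return placement


def round_robin(requests: list[int], n_hosts: int, capacity: int) -> list[int]:
    """Start each search one host further along than the previous request."""
    free = [capacity] * n_hosts
    placement = []
    for i, size in enumerate(requests):
        order = [(i + k) % n_hosts for k in range(n_hosts)]
        placement.append(_place(size, order, free))
    return placement


def random_host(requests: list[int], n_hosts: int, capacity: int, seed: int = 0) -> list[int]:
    """Try hosts in a fresh random order for every request."""
    rng = random.Random(seed)
    free = [capacity] * n_hosts
    placement = []
    for size in requests:
        order = rng.sample(range(n_hosts), n_hosts)
        placement.append(_place(size, order, free))
    return placement


def hosts_used(placement: list[int]) -> int:
    return len(set(placement))


if __name__ == "__main__":
    trace = [1, 2, 1, 1, 2, 1, 1, 2, 1]  # cores requested per slice (made up)
    for policy in (bin_packing, round_robin, random_host):
        placement = policy(trace, 5, 4)
        print(policy.__name__, placement, "hosts used:", hosts_used(placement))
```

On this made-up trace first-fit fills three hosts completely while round-robin spreads the same requests across all five, which is the kind of difference a simulator run on an identical trace is meant to expose.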

Page 19: Xrm xensummit

XRM Evaluation
•  5 hosts, 4 cores
•  Random utilizations
•  Random slice requests
•  Three algorithms
– Bin-packing
– Round-robin
– Random-host
•  Slicing algorithms evaluated in previous work - AutoControl [PadalaEurosys09]

Page 20: Xrm xensummit

Comparing three algorithms

[Figure: host utilization (axis 0 to 1000) over time intervals 1 to 9, one panel per algorithm]

Bin Packing: uses <= three hosts!

Random Host: uses <= five hosts, wasting energy

Round-Robin: uses all five hosts, wasting energy

Page 21: Xrm xensummit

AutoControl experiments
•  Experiments on Emulab
•  20 server nodes – 80 VMs
•  20 client nodes
•  Mix of applications
•  Load increased on ½ of the VMs chosen randomly

[Figure: VM1 to VM4 on underloaded and overloaded nodes. Annotations: "No control needed"; "AutoControl can readjust"]

Page 22: Xrm xensummit

SLO (performance goal) violations

[Figure: two panels, Default Xen and AutoControl, showing applications over time. Legend: Good, Bad, Target]

Page 23: Xrm xensummit

Summary
•  Resource management in cloud infrastructures is complex
–  Multiple layers of RM
–  Complex primitives
–  Complex decisions
•  We are developing feedback control theory based RM
•  XRM is event-based, pluggable and extensible
•  Complex algorithms like AutoControl can be developed
•  Research in advanced algorithms in progress

Page 24: Xrm xensummit

Summary of our experiences with XCP 0.1.1
•  We are trying to build a research cloud based on XCP
•  Other than XRM, we are adding Fault Tolerance and a Web-based GUI to XCP
•  Having to install a special distribution is difficult
–  Why not have XCP as a set of packages in RHEL or other distributions?
–  You are breaking toolstacks developed at various companies
•  XCP docs are the same as the Citrix XenServer docs
–  Some of the features don’t work or are not supported
–  Better documentation of the API is needed
•  XCP GUI needs to improve
–  Bugs in OpenXenCenter

Page 25: Xrm xensummit

[Reference] Overview of provided functions

Page 26: Xrm xensummit

We want feedback from the Xen community
•  Comments on XRM architecture
•  Should we incorporate XRM into XCP?
–  Ocaml
•  Are you interested in open source XRM?
–  Does the community want to be involved?
•  Questions?

ppadala@docomolabs-usa.com