xrm xensummit
Post on 19-Oct-2014
XRM: An Event-based Resource Management Framework for XCP
Pradeep Padala
in collaboration with Ken Igarashi, Akshay I. Mehta, and Ulas C. Kozat
Typical scenario in shared infrastructures
Data Center
Shared infrastructure
(cloud)
Web search Data analytics
Xen Summit AMD 2010
Application requirements
Fast searches Analyze large data
Low response time High throughput
QoS differentiation 3:1
Web search Data analytics
How to host these applications?
Physical partitioning: Node I, Node II, Node III, Node IV
app1 web, app1 db, app2, app3
× Wasteful × Difficult to manage
Virtualized data center: Node I, Node II (virtualization)
app1 web, app1 db, app2, app3
Improved utilization, reduced costs, high flexibility (elastic!)
Virtualized shared data center = a new paradigm!
Challenge: How to allocate resources to meet goals?
ProvisionVMs()
RunApplications()
While (true) {
    MonitorApplications()
    If (AppPerformance != GOAL) {
        FindReason()
        If (ScaleUp) {
            FindAvailableResources()
            MigrateVM()
        }
        If (ScaleOut) {
            ProvisionVMs()
            RunApplication()
        }
    }
    If (Consolidation == True) {
        FindSuitableVMs()
        Consolidate()
    }
}
Challenge #1: Developers don’t want to manage resources
How to determine what to do? Scale UP? Scale Out? Migrate? Clone?
Where to provision VMs?
How to consolidate VMs?
Cloud Providers Want to Consolidate Multiple Services too!
Holy Grail: DeployService(); AutoScale();
Challenge #2: Resource Management Spans Multiple Layers
Services
PaaS
IaaS
Hardware
Resource Management
How to pass information between the layers so that they don’t make conflicting decisions?
Challenge #3: Complexity of Scaling Primitives
• Slicing: Little overhead, Efficient; X Limited to single machine
• Live Migration: Handles overload, Small downtime; X Overhead
• Cloning: Stateful clone; X Overhead, X Side-effects
• Live Replication: Maintain connections; X Overhead
How to combine primitives to achieve goals?
What is a perfect Resource Manager?
Automation Resource Allocation High Utilization High Application Performance
We are building the (ultimate) RM system. XRM = first incarnation on XCP!
An RM that can automatically rearrange resources among multiple applications/VMs on multiple physical machines and provide optimal resource utilization and application performance
Outline
• Motivation
• Challenges in RM
• XRM Feedback Control based Design
• XRM Implementation and Preliminary Results
• Summary and Feedback
How to achieve the automation?
“Almost any system that is considered automatic has some
element of feedback control” -Hellerstein et al.
XRM = A Feedback Control System
RM in multiple layers
XRM = IaaS RM
Does app modeling and may request changes
Knows only about VMs and hardware resources
High level service request
Slice request
Automated control loop
Slice changes
PaaS RM
IaaS RM
Services
Hardware
XRM’s feedback control loop
Monitor
Control
Action
XCP
Network stats
Performance goals
Control parameters
Change resource shares
Migrate
Power off machines
Model: can model applications, VMs, and underlying resources
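The Monitor, Control, Action loop above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not XRM's code: the proportional update rule, the `gain` value, the toy response-time model in `measure`, and all function names are hypothetical.

```python
def control_step(share, measured, goal, gain=0.2, lo=0.05, hi=1.0):
    """One Control step: nudge a VM's CPU share toward the performance goal.

    share    -- current CPU share (fraction of one host)
    measured -- measured response time (output of the Monitor step)
    goal     -- target response time (the performance goal)
    """
    # A positive error means the application is too slow and needs more CPU.
    error = (measured - goal) / goal
    new_share = share * (1.0 + gain * error)
    # Clamp to the range the host can actually grant.
    return max(lo, min(hi, new_share))


def measure(share, work=0.1):
    """Toy application model (assumed): response time falls as the share grows."""
    return work / share


def run_loop(goal, share=0.1, steps=50):
    """Monitor -> Control -> Action, repeated for a fixed number of intervals."""
    for _ in range(steps):
        measured = measure(share)                    # Monitor
        share = control_step(share, measured, goal)  # Control
        # Action: a real system would push the new share to the hypervisor here.
    return share


final_share = run_loop(goal=0.5)
print(round(final_share, 3), round(measure(final_share), 3))
```

With this toy model the share settles where the measured response time meets the goal. A real controller would replace `measure` with live statistics and would need a model of how the application responds to resource changes.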
Current incarnation
XCP monitoring module
Stats analysis module
RRD database
Out of band stat updates from XCP nodes
Stats 1. Thresholds 2. Rules
Core algorithm module
Algorithm bank
Filtered Stats and stats analysis data
Wrapper
Take action
XCP master node
OpenFlow
Low-level commands/XAPI commands
XRM is an event-based framework
• Many algorithms can be developed and plugged in
• The algorithms register for specific events
  – High CPU utilization
  – Packet drops
  – PowerOff
  – PowerOn
  – …
• Different algorithms may take different actions
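The registration idea can be sketched as follows. The slides do not show XRM's actual plugin interface, so `EventBus`, the event names, and the two handler functions are all hypothetical names chosen for this example.

```python
from collections import defaultdict


class EventBus:
    """Hypothetical stand-in for the framework's event dispatcher."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def register(self, event_name, handler):
        """An algorithm subscribes to one named event."""
        self._handlers[event_name].append(handler)

    def emit(self, event_name, **details):
        """Call every handler registered for the event; collect proposed actions."""
        return [handler(**details) for handler in self._handlers[event_name]]


def rebalance_on_high_cpu(host, utilization):
    # Assumed policy: propose moving a VM away from the hot host.
    return ("migrate_vm_from", host)


def check_network_on_drops(host, dropped):
    # A different algorithm may take a different action for a different event.
    return ("investigate_network", host)


bus = EventBus()
bus.register("high_cpu_utilization", rebalance_on_high_cpu)
bus.register("packet_drops", check_network_on_drops)

actions = bus.emit("high_cpu_utilization", host="node-1", utilization=0.95)
print(actions)
```

Handlers here return a proposed action instead of executing it, which keeps the algorithms separate from the layer that issues low-level commands. That separation is a design choice of this sketch, not something the slides specify.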
A Common Abstraction for ALL Algorithms
What algorithms can you implement?
• AutoControl – automated control of multiple virtualized resources [PadalaEurosys09]
• Models application and sets VM shares based on application goals
Goals Resource Shares
App Controller
App Controller
App Controller
Node Controller Node Controller
[PadalaEurosys09] Pradeep Padala, Xiaoyun Zhu, Mustafa Uysal et al. Automated Control of Multiple Virtualized Resources. In Proceedings of EuroSys 2009.
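As a rough illustration of how per-application share requests might be reconciled on one node, the sketch below scales requests down proportionally when they exceed capacity. This is a deliberately simplified assumption for the example and is not the arbitration method of AutoControl. The function name `node_controller` and the VM names are hypothetical.

```python
def node_controller(requests, capacity=1.0):
    """Grant requested shares, scaling them down proportionally if overbooked.

    requests -- dict mapping VM name to the share its app controller asked for
    capacity -- total share available on the node
    """
    total = sum(requests.values())
    if total <= capacity:
        # Everything fits: grant each request unchanged.
        return dict(requests)
    # Overbooked: shrink every request by the same factor (assumed policy).
    scale = capacity / total
    return {vm: share * scale for vm, share in requests.items()}


granted = node_controller({"vm1": 0.6, "vm2": 0.6, "vm3": 0.3})
print(granted)
```

Proportional scaling treats all applications equally. A policy with QoS differentiation would weight the requests before scaling.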
Outline
• Motivation
• Challenges in RM
• XRM Feedback Control based Design
• XRM Implementation and Preliminary Results
• Summary and Feedback
XRM features
• Interface to upper layers
• Auto-* features
• External control
• Pluggable algorithms
• Extensibility
XRM Implementation
• Implemented on XCP 0.1.1
• Written in Python
• Pluggable algorithms have to be written in Python
• Currently implements four algorithms
  – Bin packing
  – Bin packing + Live migration
  – Random host
  – Round-robin
• We have also implemented a simulator (run 1 million VMs on 100,000 nodes!)
  – Can capture data during a “real” run
  – Run multiple algorithms on exact same trace
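To make the difference between the placement algorithms concrete, here is a small sketch that runs three of them on the same list of requests, in the spirit of replaying one trace through several algorithms. The slides do not say which bin-packing heuristic XRM uses, so first-fit is an assumption here. The request sizes and all function names are made up for the example.

```python
import random


def bin_packing(requests, num_hosts, capacity):
    """First-fit (assumed heuristic): use the first host with enough room."""
    load = [0] * num_hosts
    placement = []
    for size in requests:
        for host in range(num_hosts):
            if load[host] + size <= capacity:
                load[host] += size
                placement.append(host)
                break
        else:
            placement.append(None)  # no host can take this request
    return placement


def round_robin(requests, num_hosts, capacity):
    """Cycle through hosts in order; this simple version ignores capacity."""
    return [i % num_hosts for i in range(len(requests))]


def random_host(requests, num_hosts, capacity, seed=0):
    """Pick a host uniformly at random; this simple version ignores capacity."""
    rng = random.Random(seed)
    return [rng.randrange(num_hosts) for _ in requests]


def hosts_used(placement):
    """Count the distinct hosts that received at least one request."""
    return len({h for h in placement if h is not None})


requests = [1, 2, 1, 1, 2, 1, 1, 1, 2]  # made-up slice sizes, in cores
for algo in (bin_packing, round_robin, random_host):
    print(algo.__name__, hosts_used(algo(requests, num_hosts=5, capacity=4)))
```

Because every algorithm sees the identical request list, differences in the number of hosts used come from the placement policy alone.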
XRM Evaluation
• 5 hosts, 4 cores
• Random utilizations
• Random slice requests
• Three algorithms
  – Bin-packing
  – Round-robin
  – Random-host
• Slicing algorithms evaluated in previous work: AutoControl [PadalaEurosys09]
Comparing three algorithms
[Figure: host utilization (0 to 1000) per time interval (1 to 9), one panel per algorithm]
• Bin Packing: uses <= three hosts!
• Random Host: uses <= five hosts, wasting energy
• Round-Robin: uses all five hosts, wasting energy
• Experiments on Emulab
• 20 server nodes – 80 VMs
• 20 client nodes
• Mix of applications
• Load increased on ½ of the VMs chosen randomly
AutoControl experiments
[Figure: nodes labeled underloaded or overloaded, hosting VM1 to VM4; annotations: "No control needed", "AutoControl can readjust"]
[Figure: SLO (performance goal) violations per application over time, Default Xen vs. AutoControl; legend: Good, Bad, Target]
Summary
• Resource management in cloud infrastructures is complex
  – Multiple layers of RM
  – Complex primitives
  – Complex decisions
• We are developing feedback control theory based RM
• XRM is event-based, pluggable and extensible
• Complex algorithms like AutoControl can be developed
• Research in advanced algorithms in progress
Summary of our experiences with XCP 0.1.1
• We are trying to build a research cloud based on XCP
• Other than XRM, adding Fault Tolerance and a Web-based GUI to XCP
• Having to install a special distribution is difficult
  – Why not have XCP as a set of packages in RHEL or other distributions?
  – You are breaking toolstacks developed at various companies
• XCP docs are the same as Citrix XenServer docs
  – Some of the features don’t work or are not supported
  – Better documentation of API
• XCP GUI needs to improve
  – Bugs in OpenXenCenter
[Reference] Overview of provided features
We want feedback from Xen community
• Comments on XRM architecture
• Should we incorporate XRM into XCP?
  – OCaml
• Are you interested in open source XRM?
  – Does the community want to be involved?
• Questions?
ppadala@docomolabs-usa.com