1
Xen and Co.: Communication-aware CPU Scheduling for Consolidated Xen-based Hosting Platforms
Sriram Govindan, Arjun R Nath, Amitayu Das, Bhuvan Urgaonkar, Anand Sivasubramaniam
Computer Systems Laboratory, The Pennsylvania State University
2
Data centers
Rent server resources
Provide resource and performance guarantees
Problem: Server sprawl
Solution: Consolidation
Reduce resource wastage
Reduced floor space
Better power management
How?
3
Server virtualization
[Figure: layered diagram of hardware, VMM, operating systems (Linux, Windows) and applications (a 2-tiered e-commerce application and a single-tier streaming server).]
Ability to create multiple virtual servers from a single physical server
Allows consolidation by hosting heterogeneous OS instances over the same hardware
Why now?
Emergence of highly efficient virtual machine monitors: Xen, VMware, etc.
Hardware support: Intel, AMD, IBM, etc.
Real world example: Amazon EC2
5
Consolidation: Example
Consider a representative e-commerce benchmark, TPC-W, an online book store application
Measure application resource needs and record performance: run the TPC-W tiers on dedicated servers
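[Figure: clients send requests to the JBoss tier and receive responses; JBoss sends queries to the MySQL tier and receives responses. Each tier runs on its own VMM and hardware. Response times and resource usage are recorded.]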
6
Consolidation: Example
[Figure: clients, with the JBoss and MySQL tiers each on its own VMM and hardware.]
CPU utilization (95th percentile): JBoss ~10%, MySQL ~20%
[Figure: CDF of response time in seconds.]
7
Consolidation: Example
[Figure: clients and the JBoss and MySQL servers, whose resources are underutilized, alongside CPU-intensive VMs.]
Consolidate the TPC-W tiers onto a single server
Use the hypervisor to ensure resource guarantees
Reserve for the peak requirement
Pack more applications to utilize the remaining server capacity
[Figure labels: 10%, 20%, almost 100% server utilization.]
Other resource requirements are also met
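The packing argument on this slide can be illustrated with a small sketch. It assumes CPU needs are expressed as percentages of one server and, as an assumption not stated on the slide, uses the 95th percentile figures from the dedicated-server measurement (JBoss ~10%, MySQL ~20%) as the reserved amounts. The function names are illustrative only.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile of a list of utilization samples."""
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]

def remaining_capacity(reservations, capacity=100.0):
    """CPU share (in percent) left for other VMs after the reservations."""
    used = sum(reservations.values())
    if used > capacity:
        raise ValueError("reservations exceed server capacity")
    return capacity - used

# Assumed reservations, taken from the ~10% and ~20% figures above.
reservations = {"jboss": 10.0, "mysql": 20.0}
print(remaining_capacity(reservations))  # share available for CPU-intensive VMs
```

The remaining share is what the CPU-intensive VMs can be packed into to bring server utilization close to 100%.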
8
Consolidation: Example
[Figure: clients send requests to a single server whose VMM hosts the JBoss and MySQL tiers together with CPU-intensive VMs.]
[Figure: CDF of response time in seconds, with consolidation and without consolidation.]
Why did this happen?
9
Scheduler induced delays
[Figure: timeline of query1/reply1 and query2/reply2 exchanged between JBoss and the DB, with network latency between them. TPC-W tiers running on dedicated servers.]
10
Scheduler induced delays
[Figure: two timelines of query1/reply1 and query2/reply2 exchanged between JBoss and the DB. With the TPC-W tiers running on dedicated servers, only network latency separates the messages. With consolidated TPC-W tiers, scheduler-induced delays are added.]
11
Does this look familiar?
Parallel systems: gang scheduling/co-scheduling (Feitelson et al., Ousterhout et al., Andrea et al.)
Schedulers: low-latency dispatch, e.g. BVT (Duda et al.)
Our contribution:
Fairness guarantees: applications pay for resources
Self-tuning: reduced administrator intervention
Adapts to an application's varying I/O behaviour
Network I/O is virtualized, which further increases the delays
12
Xen Virtual Machine Monitor
[Figure: Xen architecture. The Xen hypervisor (VM scheduler, I/O virtualization) runs on the physical hardware (CPU, disk, NIC, memory, etc.) and exposes virtual hardware (vCPU, vDisk, vNIC, vMemory, etc.) to Domain 0 (the driver domain) and to the virtual machines, each running a modified guest OS and its applications.]
13
Network Virtualization in Xen - Reception
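[Figure: packet reception path through the NIC, the hypervisor, the hardware drivers and netback driver in domain0, and the netfront driver and application in the guest VM. Labelled steps: interrupt, notify, virtual interrupt, packet delivery.]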
14
Network Virtualization in Xen - Transmission
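[Figure: packet transmission path from the application in the guest VM through the netfront driver, the netback driver and hardware drivers in domain0, to the NIC. Labelled steps: packet send, send over virtual NIC, send over NIC.]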
15
Scheduler induced Delays
Delay associated with scheduling of Domain0
When a guest domain transmits a packet
When a packet is received at the physical NIC
[Figure: JBoss issues a query to the DB; dom0 sits on the path between them.]
16
Scheduler induced Delays
Delay associated with scheduling of Domain0
Delay at the recipient
When Domain0 sends a packet to a guest domain
[Figure: JBoss issues a query to the DB; dom0 sits on the path between them.]
17
Scheduler induced Delays
Delay associated with scheduling of Domain0
Delay at the recipient
Delay at the sender
Before a domain sends a network packet (on its virtual NIC).
Unlike reception, sending a packet can only be anticipated.
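Because a transmission cannot be observed before it happens, the scheduler has to work from an estimate. The sketch below shows one hypothetical way to form such an estimate. It assumes, and this is not stated in the slides, that the number of packets a domain sent during its most recent run is a usable predictor of what it will send next. The class and method names are illustrative only.

```python
class TransmissionPredictor:
    """Hypothetical estimator of a domain's anticipated transmissions."""

    def __init__(self):
        self.sent_this_run = {}  # packets sent since the domain was last scheduled
        self.anticipated = {}    # estimate consulted at the next scheduling decision

    def on_packet_sent(self, domain):
        self.sent_this_run[domain] = self.sent_this_run.get(domain, 0) + 1

    def on_descheduled(self, domain):
        # Assumption: the last run's send count predicts the next run's.
        self.anticipated[domain] = self.sent_this_run.get(domain, 0)
        self.sent_this_run[domain] = 0
```

A domain that sent many packets in its last run would then look like a likely sender, and could be scheduled earlier to shorten the delay at the sender.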
18
Scheduler induced Delays
Delay associated with scheduling of Domain0
Delay at the recipient
Delay at the sender
[Figure: timeline of a query and reply between JBoss and the DB for consolidated TPC-W tiers in a virtualized environment. In addition to network latency, it shows scheduler-induced delays with virtualization overhead, as dom0 must be scheduled between the two tiers.]
19
Scheduler design
Recall: reservations must be provided
Build on top of a reservation-based scheduler: SEDF
(slice, period) pair: a domain needs 'slice' ms of CPU every 'period' ms
Communication-aware SEDF scheduler:
Enhance the CPU scheduler to reduce scheduler-induced delays
Change the scheduling order to preferentially schedule communicating domains
Introduces short-term unfairness
Still preserves reservation guarantees over a coarser time scale: PERIOD
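The (slice, period) reservation model can be illustrated with a simplified simulation. This is a sketch of earliest-deadline-first scheduling under stated assumptions, not Xen's SEDF code: time advances in 1 ms ticks, every domain is always runnable, all periods start at time 0, and the `Domain` class and `simulate_sedf` function are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class Domain:
    name: str
    slice_ms: int       # CPU time guaranteed in each period
    period_ms: int      # length of the reservation period
    remaining: int = 0  # slice left in the current period
    deadline: int = 0   # end of the current period

def simulate_sedf(domains, total_ms):
    """Run a 1 ms-tick EDF simulation; return CPU ms received per domain."""
    received = {d.name: 0 for d in domains}
    for now in range(total_ms):
        for d in domains:
            if now % d.period_ms == 0:  # new period: refill the slice
                d.remaining = d.slice_ms
                d.deadline = now + d.period_ms
        runnable = [d for d in domains if d.remaining > 0]
        if not runnable:
            continue                     # idle tick
        chosen = min(runnable, key=lambda d: d.deadline)
        chosen.remaining -= 1
        received[chosen.name] += 1
    return received

# Illustrative reservations: 10 ms and 20 ms out of every 100 ms.
doms = [Domain("jboss", 10, 100), Domain("mysql", 20, 100)]
print(simulate_sedf(doms, 1000))
```

Because a domain is only eligible while it has slice left in its current period, reordering which eligible domain runs first changes short-term behaviour without changing the CPU time each domain receives per period.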
20
Scheduler Implementation
Key idea:
Associate impending network activity with each virtual machine
Incorporate communication activity into decision making
Greedy heuristic: prefer the VM that is likely to benefit the most, i.e. the VM with the most pending packets
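The greedy heuristic can be sketched as a selection function. This is a minimal illustration, not the patch itself: the dictionary fields, the fallback to earliest deadline when no domain has pending packets, and the rule that only domains with slice remaining are eligible are all assumptions made for the example.

```python
def pick_next(domains):
    """Return the name of the domain to run next, or None if none is eligible.

    Each domain is a dict with 'name', 'remaining_ms', 'pending', 'deadline'.
    """
    # Only domains with slice left in this period may run (keeps reservations).
    eligible = [d for d in domains if d["remaining_ms"] > 0]
    if not eligible:
        return None
    communicating = [d for d in eligible if d["pending"] > 0]
    if communicating:
        # Greedy step: the domain with the most pending packets goes first.
        return max(communicating, key=lambda d: d["pending"])["name"]
    # Assumed fallback: plain earliest-deadline-first order.
    return min(eligible, key=lambda d: d["deadline"])["name"]

doms = [
    {"name": "cpu", "remaining_ms": 10, "pending": 0, "deadline": 50},
    {"name": "jboss", "remaining_ms": 5, "pending": 2, "deadline": 100},
    {"name": "mysql", "remaining_ms": 5, "pending": 7, "deadline": 100},
]
print(pick_next(doms))
```

Here the domain with seven pending packets is chosen ahead of a CPU-bound domain with an earlier deadline, which is the short-term unfairness the design accepts.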
21
Communication aware scheduler - Reception
[Figure: NIC, hypervisor, Domain0 and guest domains (Domain 1, Domain 2, ... Domain n).]
Packet arrives at the NIC, raising an interrupt
Domain0.pending++
Domain1.pending++
Now, schedule Domain0
Domain0.pending--
Schedule Domain 1
Domain1.pending--
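The counter updates in the reception walk-through can be sketched as follows. This is an illustration under assumptions: the class and method names are hypothetical, and the moment at which a counter is decremented (once the domain has handled the packet) is inferred from the sequence on the slide.

```python
from collections import Counter

class PendingTracker:
    """Hypothetical per-domain count of packets waiting to be handled."""

    def __init__(self):
        self.pending = Counter()

    def on_nic_interrupt(self, dest):
        # A received packet needs Domain0 first, then the destination guest.
        self.pending["Domain0"] += 1
        self.pending[dest] += 1

    def on_packet_handled(self, domain):
        # Called once the domain has run and processed one packet.
        if self.pending[domain] > 0:
            self.pending[domain] -= 1

tracker = PendingTracker()
tracker.on_nic_interrupt("Domain1")    # Domain0.pending++, Domain1.pending++
tracker.on_packet_handled("Domain0")   # Domain0 scheduled: Domain0.pending--
tracker.on_packet_handled("Domain1")   # Domain 1 scheduled: Domain1.pending--
```

The scheduler can read these counters at each decision point to find the domains with impending network activity.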
22
Evaluation Environment
Applications:
TPC-W benchmark: JBoss and MySQL tiers
Multi-threaded UDP streaming server:
Simultaneously streams data at 3 Mbps to a specified number of clients
Every client is provided with an 8 MB buffer
Clients start consuming data only when the buffer is full
CPU-intensive workloads: used for illustrative purposes
23
Streaming media experiments - performance improvement
Streaming to 45 clients at 3 Mbps for 20 minutes
Default scheduler suffered a playback discontinuity every 1.5 minutes
24
Streaming media experiments - performance improvement
Streaming to 45 clients at 3 Mbps for 20 minutes
Default scheduler suffered a playback discontinuity every 1.5 minutes
Communication-aware scheduler suffered a discontinuity only after the 18th minute
25
Streaming media experiments - improved consolidation
A single buffer underrun at the client is fixed as the Service Level Objective (SLO)
The communication-aware scheduler is able to sustain 30 more clients than the default scheduler
[Figure: number of buffer underruns at the client (lower is better) versus number of clients supported at the server, with the “SLO” level marked.]
26
TPC-W performance
TPC-W benchmark ran for 20 minutes
Around 35 percent improvement in response time compared to the default scheduler
Scheduler               Average (secs)  95th percentile (secs)  Maximum (secs)
Default SEDF            1.3             7.1                     26.1
Modified SEDF           0.8             5.7                     12.8
Percentage improvement  34.11 %         19.98 %                 51.15 %
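The improvement row presumably uses the usual relative reduction, (default - modified) / default. The sketch below applies that formula to the rounded values shown in the table. The results differ slightly from the reported percentages, most likely because those were computed from unrounded measurements. That explanation is an assumption, as is the formula itself.

```python
def improvement(default, modified):
    """Relative reduction in response time, in percent."""
    return (default - modified) / default * 100

# Rounded response times (seconds) from the table above.
rows = {
    "Average": (1.3, 0.8),
    "95th percentile": (7.1, 5.7),
    "Maximum": (26.1, 12.8),
}
for name, (default, modified) in rows.items():
    print(f"{name}: {improvement(default, modified):.2f} %")
```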
27
Scheduler Fairness Evaluation
CPU intensive Virtual Machine
The CPU-intensive VM received less than 1% less CPU than under the default scheduler, and still stayed above its reservation of 10%
Merely changing the scheduling order resulted in a large response time improvement for the streaming server
[Figure: CPU utilization versus time in minutes for the CPU-intensive VM, showing the reservation, default SEDF and modified SEDF.]
28
Conclusion
A communication-aware CPU scheduler developed for a consolidated environment
Low overhead run-time monitoring of network events by the hypervisor scheduler
Addressed additional problems due to network I/O virtualization in Xen
Source code (~300 lines) and Xen 3.0.2 patch available via the software link at http://csl.cse.psu.edu/
29
Questions
30
Streaming media experiments - performance improvement
Streaming to 45 clients at 3 Mbps for 20 minutes
Default scheduler suffered glitches every 1.5 minutes
Communication-aware scheduler suffered a glitch only after the 18th minute
With only the domain0 optimization ON, a glitch occurred at the 15th minute
31
Communication aware scheduler
[Figure: the hypervisor, Domain0's book-keeping page, and the guest domains' book-keeping pages (Domain 1, Domain 2, ... Domain n).]
32
Communication aware scheduler - Reception
[Figure: NIC, hypervisor, Domain0's book-keeping page, and guest domains (Domain 1, Domain 2, ... Domain n).]
Packet arrives at the NIC, raising an interrupt
Domain0: network_reception_intensity++
Domain 1: network_reception_intensity++
Now, schedule Domain0
Domain 0 is descheduled; now we are in the hypervisor
Schedule Domain 1
Receive packets
Domain 1 is descheduled; now we are in the hypervisor
Update packet reception
Update pending activity
33
Communication aware scheduler - Transmission
[Figure: hypervisor, Domain0's book-keeping page, and guest domains (Domain 1, Domain 2, ... Domain n).]
Domain1: network_transmission_intensity++
Domain0: network_transmission_intensity++
Domain1: anticipated_network_transmission_intensity++
Now Domain 1 is descheduled; we are in the hypervisor