VMworld 2013: DRS: New Features, Best Practices and Future Directions
DESCRIPTION
VMworld 2013. Aashish Parikh, VMware. Learn more about VMworld and register at http://www.vmworld.com/index.jspa?src=socmed-vmworld-slideshare
TRANSCRIPT
![Page 1: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/1.jpg)
DRS: New Features, Best Practices
and Future Directions
Aashish Parikh, VMware
VSVC5280
#VSVC5280
![Page 2: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/2.jpg)
Disclaimer
This session may contain product features that are
currently under development.
This session/overview of the new technology represents
no commitment from VMware to deliver these features in
any generally available product.
Features are subject to change, and must not be included in
contracts, purchase orders, or sales agreements of any kind.
Technical feasibility and market demand will affect final delivery.
Pricing and packaging for any new technologies or features
discussed or presented have not been determined.
![Page 3: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/3.jpg)
Talk Outline
Advanced Resource Management concepts
• VM Happiness and Load-balancing
New features in vSphere 5.5
• Going after harder problems, tackling specific pain-points
Integrating with new Storage and Networking features
• Playing nicely with vFlash, vSAN and autoscaling proxy switch ports
Future Directions
• What’s cooking in DRS labs!
![Page 4: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/4.jpg)
Related Material
Other DRS talks at VMworld 2013
• VSVC5821 - Performance and Capacity management of DRS clusters
(3:30 pm, Monday)
• STO5636 - Storage DRS: Deep Dive and Best Practices to Suit Your Storage
Environments (4 pm, Monday & 12:30 pm Tuesday)
• VSVC5364 - Storage IO Control: Concepts, Configuration and Best Practices to Tame
Different Storage Architectures (8:30 am, Wed. & 11 am, Thursday)
From VMworld 2012
• VSP2825 - DRS: Advanced Concepts, Best Practices and Future Directions
• Video: http://www.youtube.com/watch?v=lwoVGB68B9Y
• Slides: http://download3.vmware.com/vmworld/2012/top10/vsp2825.pdf
VMware Technical Journal publications
• VMware Distributed Resource Management: Design, Implementation, and
Lessons Learned
• Storage DRS: Automated Management of Storage Devices In a Virtualized Datacenter
![Page 5: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/5.jpg)
Advanced Resource Management Concepts
Contention, happiness, constraint-handling and load-balancing
![Page 6: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/6.jpg)
The Giant Host Abstraction
DRS treats the cluster as one giant host
• Capacity of this giant “host” = capacity of the cluster
6 hosts, each with CPU = 10 GHz and Memory = 64 GB
1 “giant host” with CPU = 60 GHz and Memory = 384 GB
![Page 7: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/7.jpg)
Why is That Hard?
Main issue: fragmentation of resources across hosts
Primary goal: Keep VMs happy by meeting their resource demands!
1 “giant host”: CPU = 60 GHz, Memory = 384 GB
VM demand < 60 GHz: all VMs running happily
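A minimal Python sketch of the giant host abstraction and of why fragmentation makes it hard. The 6-host capacities come from the slide; the free-capacity figures, the 3 GHz VM and the helper names are hypothetical, and this is an illustration rather than how DRS is implemented.

```python
# Cluster from the slide: 6 hosts, each with 10 GHz CPU and 64 GB memory.
hosts = [{"cpu_ghz": 10.0, "mem_gb": 64.0} for _ in range(6)]


def giant_host_capacity(hosts):
    """Sum per-host capacities into the single 'giant host' view."""
    total_cpu = sum(h["cpu_ghz"] for h in hosts)
    total_mem = sum(h["mem_gb"] for h in hosts)
    return total_cpu, total_mem


def fits_on_some_host(vm_cpu_ghz, free_cpu_per_host):
    """A VM runs on ONE host, so free capacity cannot be pooled across hosts."""
    return any(free >= vm_cpu_ghz for free in free_cpu_per_host)


total_cpu, total_mem = giant_host_capacity(hosts)
print(total_cpu, total_mem)  # 60.0 384.0

# Assumed free CPU: 12 GHz free in total, but only 2 GHz on any single host.
free_cpu = [2.0] * 6
print(sum(free_cpu) >= 3.0)              # True: the giant host has room
print(fits_on_some_host(3.0, free_cpu))  # False: fragmentation blocks placement
```

In this example the pooled view has 12 GHz free, yet no single host can take a 3 GHz VM unless other VMs are moved first.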
![Page 8: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/8.jpg)
Why Meet VM Demand as the Primary Goal?
VM demand satisfied ⇒ VM or application happiness
Why is this not met by default?
• Host level overload: VM demands may not be met on the current host
Three ways to find more capacity in your cluster
• Reclaim resources from VMs on current host
• Migrate VMs to a powered-on host with free capacity
• Power on a host (exit DPM standby mode) and then migrate VMs to it
Note: demand ≠ current utilization
![Page 9: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/9.jpg)
Why Not Load-balance the DRS Cluster as the Primary Goal?
Load-balancing is not free
• When VM demand is low, load-balancing may have some cost but no benefit
Load-balancing is a mechanism used to meet VM demands
Consider these 4 hosts with almost-idle VMs
Note: all VMs are getting what they need
Q: Should we balance VMs across hosts?
![Page 10: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/10.jpg)
Important Metrics and UI Controls
CPU and memory entitlement
• VM deserved value based on demand, resource settings and capacity
Load is not balanced but all VMs are happy!
![Page 11: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/11.jpg)
DRS Load-balancing: The Balls and Bins Problem
Problem: assign n balls to m bins
• Balls could have different sizes
• Bins could have different sizes
Key Challenges
• Dynamic numbers and sizes of balls/bins
• Constraints on co-location, placement and others
Now, what is a fair distribution of balls among bins?
DRS load-balancing
• VM resource entitlements are the ‘balls’
• Host resource capacities are the ‘bins’
• Dynamic load, dynamic capacity
• Various constraints: DRS-Disabled-On-VM, VM-to-Host affinity, …
![Page 12: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/12.jpg)
Goals of DRS Load-balancing
Fairly distribute VM demand among hosts in a DRS cluster
Enforce constraints
• Recommend mandatory moves
Recommend moves that significantly improve imbalance
Recommend moves with long-term benefit
![Page 13: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/13.jpg)
How Is Imbalance Measured?
Imbalance metric is a cluster-level metric
For each host, we have a normalized entitlement = (sum of VM entitlements on the host) / (host capacity)
Imbalance metric is the standard deviation of these normalized
entitlements
Imbalance metric = 0 ⇒ perfect balance
How much imbalance can you tolerate?
[Figure: tolerance scale labeled AWESOME!, OOPS!, OUCH!, YIKES!]
Example: normalized entitlements of 0.65, 0.1 and 0.25 give Imbalance metric = 0.2843
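As a rough check of the example values 0.65, 0.1 and 0.25, the sketch below computes the standard deviation of the normalized entitlements. Using the sample standard deviation (n - 1 in the denominator) reproduces 0.2843; that choice is inferred from the numbers, and the function name is hypothetical.

```python
import statistics


def imbalance_metric(normalized_entitlements):
    """Sample standard deviation of per-host normalized entitlements."""
    return statistics.stdev(normalized_entitlements)


example = [0.65, 0.1, 0.25]
print(round(imbalance_metric(example), 4))  # 0.2843

# A perfectly balanced cluster has an imbalance metric of 0.
print(imbalance_metric([0.3, 0.3, 0.3]))
```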
![Page 14: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/14.jpg)
The Myth of Target Balance
UI slider tells us what star threshold is acceptable
Implicit target number for cluster imbalance metric
• n-star threshold ⇒ tolerance of imbalance up to n * 10% in a 2-node cluster
Constraints can make it hard to meet target balance
Meeting aggressive target balance may require many migrations
• DRS will recommend them only if they also make VMs happier
If all VMs are happy, a little imbalance is not really a bad thing!
![Page 15: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/15.jpg)
Key Takeaway Points
The Goals of DRS
• Provide the giant host abstraction
• Meet VM demand and keep applications at high performance
• Use load balancing as a secondary metric while making VMs happy
DRS recommends moves that reduce cluster imbalance
Constraints can impact achievable imbalance metric
If all VMs are happy, a little imbalance is not really a bad thing!
![Page 16: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/16.jpg)
New Features in vSphere 5.5
Going after harder problems, tackling specific pain-points
![Page 17: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/17.jpg)
Restrict #VMs per Host
Consider a perfectly balanced cluster w.r.t. CPU, Memory
Bigger host is ideal for new incoming VMs
However, spreading VMs around is important to some of you!
• In vSphere 5.1, we introduced a new option
LimitVMsPerESXHost = 6
• DRS will not admit or migrate more than 6 VMs to any Host
This number must be managed manually – not really the DRS way!
![Page 18: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/18.jpg)
AutoTune #VMs per Host
*NEW* in vSphere 5.5: LimitVMsPerESXHostPercent
Currently: 20 VMs; New: 12 VMs; Total: 32 VMs
VMs per Host limit automatically set to: Mean + (Buffer% * Mean)
• Mean (VMs per host, including new VMs): 8
• Buffer% (value of new option): 50
• Therefore, new limit automatically set to: 8 + (50% * 8) = 12
Use for new operations, not to correct current state
[Figure: VMs spread across Host1, Host2, Host3 and Host4]
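The limit arithmetic on this slide can be sketched in a few lines of Python. The function and parameter names are hypothetical; the 4-host count is taken from the figure labels, and how DRS rounds a non-integer result is not stated here, so the sketch returns the raw value.

```python
def vms_per_host_limit(current_vms, new_vms, num_hosts, buffer_percent):
    """Limit = Mean + (Buffer% * Mean), where Mean includes the new VMs."""
    mean = (current_vms + new_vms) / num_hosts
    return mean + (buffer_percent / 100.0) * mean


# Slide example: 20 current VMs + 12 new VMs on 4 hosts, buffer option = 50.
print(vms_per_host_limit(20, 12, 4, 50))  # 12.0
```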
![Page 19: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/19.jpg)
Latency-sensitive VMs and DRS
Several efforts to help latency-sensitive apps
• VSVC5596 - Extreme Performance Series: Network Speed Ahead
• VSVC5187 - Silent Killer: How Latency Destroys Performance...And What to
Do About It
DRS recognizes VMs marked as latency-sensitive
DRS initial placement “just works”
DRS load-balancing treats VMs as if soft-affine to current hosts
• Latency-sensitive workloads may be sensitive to vMotions
• Soft affinity is internal, no visible new rule in the UI
VMs will not be migrated unless absolutely necessary
• To fix otherwise unfixable host overutilization
• To fix otherwise unfixable soft/hard rule-violation
![Page 20: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/20.jpg)
CPU Ready Time: Overview
Ready time: The time a virtual CPU (vCPU) waits in a ready-to-run
state before it can be scheduled on a physical CPU (pCPU)
Many causes for high ready time; many fixes; many red-herrings
Active area of investigation and research, chime in! We need help!
• Give us more problem details, VC and ESX support files
• www.yellow-bricks.com/2013/05/09/drs-not-taking-cpu-ready-time-in-to-account-need-your-help
[Figure: process state diagram with states NEW, READY, RUNNING, WAITING and TERMINATED; transitions: admitted, dispatch, interrupt, I/O or event wait, I/O or event complete, exit]
![Page 21: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/21.jpg)
CPU Ready Time: Common “Problems” and Solutions
“%RDY is 80, that cannot be good”
• Cumulative number – divide by #vCPUs to get an accurate measure
• Rule-of-thumb: up to 5% per vCPU is usually alright
“Host utilization is very low, %RDY is very high”
• Host power-management settings reduce effective pCPU capacity!
• Set BIOS option to “OS control” and let ESX decide
• Low VM or RP CPU limit values restrict cycles delivered to VMs
• Increase limits (or set them back to “unlimited”, if possible)
“Host and Cluster utilization is very high, %RDY is very high”
• Well, physics..
• Add more capacity to the cluster
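The first rule of thumb above (divide the cumulative %RDY by the number of vCPUs, then compare against roughly 5% per vCPU) can be sketched as follows. The function names and the vCPU counts are hypothetical; the 80% figure and the 5% threshold come from the slide.

```python
READY_RULE_OF_THUMB_PCT = 5.0  # "up to 5% per vCPU is usually alright"


def ready_per_vcpu(cumulative_rdy_pct, num_vcpus):
    """Normalize a VM's cumulative %RDY by its vCPU count."""
    if num_vcpus <= 0:
        raise ValueError("num_vcpus must be positive")
    return cumulative_rdy_pct / num_vcpus


def ready_time_looks_ok(cumulative_rdy_pct, num_vcpus):
    """True when per-vCPU ready time is within the rule-of-thumb threshold."""
    return ready_per_vcpu(cumulative_rdy_pct, num_vcpus) <= READY_RULE_OF_THUMB_PCT


print(ready_per_vcpu(80.0, 16))       # 5.0
print(ready_time_looks_ok(80.0, 16))  # True
print(ready_time_looks_ok(80.0, 4))   # False (20.0 per vCPU)
```

The same %RDY of 80 is unremarkable for a 16-vCPU VM and worth investigating for a 4-vCPU VM.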
![Page 22: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/22.jpg)
CPU Ready Time: NUMA Scheduling Effects
NUMA-scheduler can increase %RDY
vCPUs scheduled for data locality
in caches and local memory
So what’s the solution?
• There’s no real solution – there’s no real problem here!
Application performance is often better!
• Benefit of NUMA-scheduling is usually more than the hit to %RDY
For very CPU-heavy apps, disabling NUMA-rebalancer might help
• http://blogs.vmware.com/vsphere/2012/02/vspherenuma-loadbalancing.html
[Figure: two NUMA nodes, node0 and node1, each with its own cores and local memory]
![Page 23: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/23.jpg)
CPU Ready Time: Better Handling in DRS
Severe load imbalance in the cluster - high #vCPU / #pCPU ratio
• DRS automatically detects and addresses severe imbalance aggressively
• Functionality added in vSphere 4.1 U3, 5.0 U2 and 5.1
CPU demand estimate too low – DRS averaging out the bursts
• 5-minute CPU average may capture CPU active too conservatively
• *NEW* in vSphere 5.5: AggressiveCPUActive = 1
• DRS Labs!: Better CPU demand estimation techniques
Other reasons for performance hit due to high %RDY values
• Heterogeneous workloads on vCPUs of a VM, correlated workloads
• DRS Labs!: NUMA-aware DRS placement and load-balancing
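To see why a 5-minute average can understate bursty CPU demand, consider the small Python illustration below. The sample values are hypothetical, and the sketch only contrasts an average with a peak; it does not reproduce how DRS or the AggressiveCPUActive option actually computes CPU active.

```python
from statistics import mean

# Hypothetical 1-minute CPU-active samples (percent) containing one burst.
one_minute_samples = [20.0, 20.0, 95.0, 20.0, 20.0]

five_minute_average = mean(one_minute_samples)
peak = max(one_minute_samples)

print(five_minute_average)  # 35.0
print(peak)                 # 95.0
```

An estimate based only on the 35% average would miss that the VM briefly needed 95%.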
![Page 24: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/24.jpg)
Memory Demand Estimation
DRS’ view of VM memory hierarchy
Active memory determined by statistical sampling by ESX, often
underestimates demand!
*NEW* in vSphere 5.5 – burst protection!
• PercentIdleMBInMemDemand = 25 (default)
DRS / DPM memory demand = Active + (25% * IdleConsumed)
• For aggressive demand computation, PercentIdleMBInMemDemand = 100
[Figure: VM memory hierarchy, with Active and Idle Consumed making up Consumed, inside Configured]
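The memory demand formula above can be sketched in Python. The sketch assumes that IdleConsumed equals Consumed minus Active, which is how the memory hierarchy figure reads but is not stated explicitly; the function name and the sample sizes are hypothetical.

```python
def memory_demand_mb(active_mb, consumed_mb, percent_idle_in_demand=25):
    """Demand = Active + (PercentIdleMBInMemDemand% * IdleConsumed)."""
    # Assumption: idle consumed memory is whatever is consumed but not active.
    idle_consumed_mb = max(consumed_mb - active_mb, 0)
    return active_mb + (percent_idle_in_demand / 100.0) * idle_consumed_mb


# Hypothetical VM: 1024 MB active, 4096 MB consumed.
print(memory_demand_mb(1024, 4096))       # 1792.0 with the default of 25
print(memory_demand_mb(1024, 4096, 100))  # 4096.0 with the aggressive setting
```

With the option set to 100, demand equals consumed memory under this assumption.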
![Page 25: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/25.jpg)
Key Takeaway Points
Number of VMs per host can now be managed with ease
• Use this wisely, VM happiness and load-balance can be adversely affected
DRS handles latency-sensitive VMs effectively
Better handling of CPU ready time
• *NEW* in vSphere 5.5: AggressiveCPUActive
• We need your help – precise problem descriptions and dump files
Better handling of memory demand – burst protection
• *NEW* in vSphere 5.5: PercentIdleMBInMemDemand
![Page 26: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/26.jpg)
Integrating with New Storage and Networking Features
Playing nicely with vFlash, vSAN, autoscaling proxy switch ports
![Page 27: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/27.jpg)
VMware vFlash: Overview
VMware vFlash: new host-side caching solution in vSphere 5.5
vFlash reservations statically defined per VMDK
Cache is write-through (i.e. used as read cache only)
Cache blocks copied to destination host during vMotion by default
[Figure: ESX host with SSD vFlash, vmdk1 and vmdk2, and shared storage]
![Page 28: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/28.jpg)
VMware vFlash: DRS Support and Best Practices
DRS initial placement “just works”
• vFlash reservations are enforced during admission control
DRS load-balancing treats VMs as if soft-affine to current hosts
VMs will not be migrated unless absolutely necessary
• To fix otherwise unfixable host overutilization or soft/hard rule-violation
Best practices:
• Use vFlash reservations judiciously
• Mix up VMs that need and do not need vFlash in a single cluster
![Page 29: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/29.jpg)
VMware vFlash: Things to Know
Migration time is proportional to cache allocation
No need to re-warm on destination
vMotions may take longer
Host-maintenance mode operations may take longer
vFlash space can get fragmented across the cluster – no defrag. mechanism
Technical deep-dive
• STO5588 - vSphere Flash Read Cache Technical Overview
![Page 30: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/30.jpg)
VMware vSAN: Interop with DRS
vSAN creates a distributed datastore using local SSDs and disks
A vSAN cluster can also have hosts without local storage
VMDK blocks are duplicated across several hosts for reliability
DRS is compatible with vSAN
vMotion may cause remote accesses until local caches are warmed
![Page 31: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/31.jpg)
Autoscaling Proxy Switch Ports and DRS
DRS admission control – proxy switch ports test
• Makes sure that a host has enough ports on the proxy switch to map
• vNIC ports
• uplink ports
• vmkernel ports
In vSphere 5.1, ports per host = 4096
• Hosts will power-on no more than ~400 VMs
*NEW* in vSphere 5.5, autoscaling switch ports
• Ports per host discovered at host boot time
• Proxy switch ports automatically scaled
DRS and HA will admit more VMs!
[Figure: VMware DRS and the virtual switch]
![Page 32: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/32.jpg)
Key Takeaway Points
DRS and vFlash play nice!
• Initial placement “just works”
• vFlash reservations are enforced during admission control
• DRS load-balancing treats VMs as if soft-affine to current hosts
DRS and vSAN are compatible with each other
Proxy switch ports on ESX hosts now autoscale upon boot
• Large hosts can now admit many more VMs
![Page 33: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/33.jpg)
Future Directions
What’s cooking in DRS labs!
![Page 34: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/34.jpg)
Network DRS: Bandwidth Reservation
Per-vNIC shares and limits have been available since vSphere 4.1
We are experimenting with per vNIC bandwidth reservations
In presence of network reservations
• DRS will respect those during placement and future vMotion recommendations
• pNIC capacity at host will be pooled together for this
What else would you like to see in this area?
Is pNIC contention an issue in your environment?
Network RPs?
![Page 35: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/35.jpg)
Static VM Overhead Memory: Overview
Static overhead memory is the amount of additional memory
required to power-on or resume a VM
Various factors affect the computation
• VM config. parameters
• VMware features (FT, CBRC, etc.)
• Host config. parameters
• ESXi Build Number
This value is used for admission control!
• (Reservation + static overhead) must be admitted for a successful power-on
• Used by DRS initial placement, DRS load-balancing, manual migrations
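The admission rule above, (Reservation + static overhead) must be admitted, can be illustrated with a simplified Python check. Real admission control also involves resource pools and other constraints; the function name, the single "unreserved memory" figure and the sample values are hypothetical.

```python
def can_power_on(reservation_mb, static_overhead_mb, host_unreserved_mb):
    """Simplified check: reservation plus static overhead must fit on the host."""
    return reservation_mb + static_overhead_mb <= host_unreserved_mb


# Hypothetical VM: 2048 MB reservation, 100 MB static overhead.
print(can_power_on(2048, 100, 2100))  # False: 2148 MB needed, 2100 MB available
print(can_power_on(2048, 100, 2200))  # True
```

A lower static overhead estimate lets the same host admit more VMs, which is why the estimate matters for consolidation.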
![Page 36: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/36.jpg)
Static VM Overhead Memory: Better Estimation
Better estimation of static overhead memory leads to more
consolidation during power-on!
• www.yellow-bricks.com/2013/05/06/dynamic-versus-static-overhead-memory
Some very encouraging results so far
Here’s an example of a 10 vCPU VM with memsize = 120 GB
• The current approach computes static overhead memory: 12.26 GB
<drumroll>
• The new approach computes static overhead memory: 988.06 MB
![Page 37: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/37.jpg)
Proactive DRS: Overview
DRS life-cycle: Monitor, Compute, Evaluate, Remediate, Verify
• Monitor: inventory; rules and constraints; available capacity; usage statistics; alarms, notifications
• Compute: RP tree integrity; rule and constraint parameters; free capacity; growth rates; derived metrics
• Evaluate: RP violations; rule and constraint violations; capacity sufficient? VMs happy? Cluster balanced?
• Remediate: migrate to fix violations; increase VM entitlement on same host (re-divvy), on another host (migrate), or on another host (power on + migrate)
Proactive DRS life-cycle: Monitor, Predict, Compute, Evaluate, Remediate
Motivation
• VM demand can be clipped
• Remediation can be expensive
Goals
• Predict and avoid spikes
• Make remediation cheaper
• Proactive load-balancing
• Proactive DPM
• Use predicted data for placement, evacuation
![Page 38: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/38.jpg)
Proactive DRS: Predicting Demand with vC Ops
VMware vCenter Operations Manager – a.k.a. vC Ops
• Monitors VCs and their inventories
• Collects stats, does analytics and capacity planning
Monitors VC performance stats and provides dynamic thresholds
• Gray area (literally)
• Range of predicted values
Proactive DRS uses upper bound as predicted demand
• Various CPU and Memory metrics
• Not perfect: capacity planning vs demand estimation
• Work in progress, encouraging results so far
![Page 39: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/39.jpg)
Proactive DRS: Fling
We wrote you a fling!
• http://www.labs.vmware.com/flings/proactive-drs
• http://flingcontest.vmware.com
Fun tool to try out basic Proactive DRS
• Connects to vC Ops, downloads dynamic thresholds, creates stats file
• Puts per-cluster “predicted.stats” files into your VCs, runs once a day
To enable Proactive DRS, simply set ProactiveDRS = 1
More details, assumptions, caveats on the fling page
![Page 40: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/40.jpg)
Proactive DRS: Demo
![Page 41: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/41.jpg)
In Summary
VM Happiness is the key goal of DRS
Load balancing keeps VMs happy and resource utilization fair
New features in vSphere 5.5
• Automatic management of #VMs per ESX host, Latency-sensitive VM support
• Better ready-time handling, Better memory demand estimation
Interoperability with vFlash, vSAN, autoscaling proxy switch ports
Tons of cool DRS stuff in the pipeline, feedback welcome!
• Beginnings of network DRS, Better estimation of VM static overhead memory
• Proactive DRS!
![Page 42: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/42.jpg)
THANK YOU
![Page 43: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/43.jpg)
![Page 44: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/44.jpg)
![Page 45: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/45.jpg)
Backup slides
![Page 46: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/46.jpg)
DRS — Overview
DRS
Cluster
vMotion
Ease of Management
Initial Placement
Runtime CPU/Memory Load-balancing
VM-to-VM and VM-to-host
Affinity and Anti-Affinity
Host Maintenance Mode
Add Host
•••
Always respect resource controls!
![Page 47: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/47.jpg)
How Are Demands Computed?
Demand indicates what a VM could consume given more resources
Demand can be higher than utilization
• Think ready time
CPU demand is a function of many CPU stats
Memory demand is computed by tracking pages in guest address
space and the percentage touched in a given interval
![Page 48: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/48.jpg)
Evaluating Candidate Moves to Generate Recommendations
Consider migrations from over-loaded to under-loaded hosts
• Δ = Imbalance Metric (before move) - Imbalance Metric (after move)
• ‘Δ’ is also called goodness
Constraints and resource specifications are respected
• VMs from Host2 are anti-affine with Host1
Prerequisite/dependent migrations also considered
List of good moves is filtered further
Only the best of these moves are recommended
[Figure: candidate moves among Host1, Host2 and Host3]
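The goodness Δ of a candidate move can be sketched in Python as the imbalance before the move minus the imbalance after it. This is a simplified illustration: it assumes equal host capacities (so a VM's normalized load transfers one-to-one), uses the sample standard deviation as the imbalance metric, and ignores constraints, migration cost and the further filtering described above. All names and values are hypothetical.

```python
import statistics


def imbalance(host_loads):
    """Imbalance metric: sample standard deviation of per-host normalized load."""
    return statistics.stdev(host_loads)


def goodness(host_loads, src, dst, vm_load):
    """Delta = imbalance before the move - imbalance after the move."""
    after = list(host_loads)
    after[src] -= vm_load
    after[dst] += vm_load
    return imbalance(host_loads) - imbalance(after)


loads = [0.65, 0.1, 0.25]  # Host1, Host2, Host3

# Moving load off the busiest host improves balance (positive goodness).
print(goodness(loads, src=0, dst=1, vm_load=0.2) > 0)   # True
# Moving load onto the busiest host makes balance worse (negative goodness).
print(goodness(loads, src=1, dst=0, vm_load=0.05) > 0)  # False
```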
![Page 49: VMworld 2013: DRS: New Features, Best Practices and Future Directions](https://reader031.vdocuments.mx/reader031/viewer/2022020306/54b767de4a795957768b4619/html5/thumbnails/49.jpg)
CPU Ready Time: Final Remarks
“Host utilization is very high, %RDY is very high”
• Well, physics..
Other reasons for performance hit due to high %RDY values
• Heterogeneous workloads on vCPUs of a VM, correlated workloads
Playing with more ideas in DRS Labs!
• Better CPU demand estimation
• NUMA-aware DRS placement and load-balancing
Active area of investigation and research, chime in! We need help!
• Give us more problem details, VC and ESX support files
• www.yellow-bricks.com/2013/05/09/drs-not-taking-cpu-ready-time-in-to-account-need-your-help