This presentation may contain VMware confidential information.
Copyright © 2005 VMware, Inc. All rights reserved. All other marks and names mentioned herein may be trademarks of their respective
companies.
PAC177: Distributed Availability Service (DAS) Architecture
Sridhar Rajagopal, VMware
What Is It?
Provides high availability to virtual machines through automatic failover on a cluster of ESX Server hosts
Easy configuration, management, and monitoring of the high availability service
Customizable behavior for individual virtual machines
Outline
Overview
Traditional failover
Failover of virtual machines
Distributed Availability Service (DAS)
Example
Conclusion
Availability
High availability
  Immediate response to failure, at the application or machine level
Proactive availability
  Responding to events such as system maintenance, load fluctuations, workflow cycles, errors
Availability continuum
  Improved, high, continuous availability
Active-Passive
[Diagram: active-passive pair, both nodes labeled 192.168.1.1, with a failure marked X]
Active-Active
[Diagram: active-active pair with addresses 192.168.1.1 and 192.168.1.2, with a failure marked X]
Advanced Configurations
N-to-one
N+1
N-to-N
[Diagram: a failure marked X]
Traditional Failover
Need to cluster important applications
Need to configure each node and application, with additional setup on each node
Non-clustered applications have no guarantees
Application compatibility
Planning spare capacity is not easy
Failover in Virtual Infrastructure
Cluster Server running inside the virtual machine
Cluster Server running in the console OS
Advantages
Increased application availability at no extra cost
No extra application setup needed on each node
Clustered continuously available applications get faster resurrection of resources
Issues
Each virtual machine will have to be set up for clustering
Not fully integrated with VirtualCenter (VC), VMotion
  VC does not know about failover
  Cluster server does not know about planned VC operations like power-offs, VMotions
Solution
For a transparent solution
  VC needs to understand cluster server
    Virtual machine might change hosts
  Cluster Server needs to know about VC
    Power off, VMotion
Hide the complexity of the interaction from the user
  VC as a management framework
Distributed Availability Service
A fully integrated, scalable virtual machine failover solution from VMware
All virtual machines get transparent failover support
Integrated with VMotion, Distributed Resource Scheduler (DRS), VC
Architecture Diagram
[Diagram: high-level architecture with black-box view of Agent, showing VC Server, three Agents, and three Virtual Machines]
VC Planned Operations
Power off
  Power off will not be construed as failure
  Powered-off and specifically marked virtual machines will not be failed over
VMotion
  There is a brief window of vulnerability
  DAS has enough information to restart the virtual machine on the origin, target, or another suitable host (if both go down)
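The rules on this slide can be sketched as a small decision function. This is only an illustrative model, not how DAS is implemented: the `VMState` fields, the `failover_enabled` flag, and the origin-before-target preference order are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class VMState:
    """Hypothetical record of what a failover service might track per VM."""
    name: str
    powered_off_by_user: bool = False   # planned power off issued through VC
    failover_enabled: bool = True       # False models a VM specifically marked not to fail over
    origin_host: Optional[str] = None   # host the VM was running on
    target_host: Optional[str] = None   # set only while a VMotion is in flight


def should_fail_over(vm: VMState) -> bool:
    """A planned power off is not a failure; marked VMs are never failed over."""
    return vm.failover_enabled and not vm.powered_off_by_user


def pick_restart_host(vm: VMState, live_hosts: Sequence[str]) -> Optional[str]:
    """Try the VMotion origin, then the target, then any other live host.

    The ordering is an assumption; the slide only says that all three are possible.
    """
    if not should_fail_over(vm):
        return None
    for candidate in (vm.origin_host, vm.target_host):
        if candidate is not None and candidate in live_hosts:
            return candidate
    return live_hosts[0] if live_hosts else None
```

For a VM caught mid-VMotion from `esx1` to `esx2`, losing `esx1` would restart it on `esx2`; losing both would fall back to whichever other host is still alive.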
Distributed Resource Scheduler
React to dynamic load changes
Balance load across the cluster by automatic virtual machine placement and VMotion
Specify complex resource policies across your cluster with hierarchical resource pools
Easily manage and view resource policies and cluster balance recommendations
DRS and DAS
The first priority is restarting failed virtual machines
DRS will kick in and rectify suboptimal placements
DRS has affinity/anti-affinity rules
DRS + DAS = proactive + reactive solution
DAS
[Diagram: VC and a cluster running eight virtual machines, with a failure marked X]
Planning and Configuring
Planning your cluster (admission control)
  Admission control ensures that enough spare capacity is maintained across the cluster for failover
  Each host has some available headroom in terms of memory and CPU
  Each virtual machine has some minimum memory and CPU requirements
Capacity Planning
Specify the number of host failures
  If the N biggest hosts fail, the virtual machines should still be relocated
Determine the worst-case failure scenario with the max of the host and virtual machine sizes
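One way to read this worst-case check is as slot counting. The sketch below is an interpretation, not the DAS algorithm: it assumes a slot is sized by the largest CPU minimum and the largest memory minimum of any virtual machine, that all minimums are positive, and that resources are given as `(cpu_mhz, memory_mb)` pairs. All function names are hypothetical.

```python
from typing import List, Tuple

Resources = Tuple[int, int]  # (cpu_mhz, memory_mb); the units are an assumption


def slot_size(vm_minimums: List[Resources]) -> Resources:
    """Worst-case slot: the largest CPU and the largest memory minimum of any VM."""
    return (max(cpu for cpu, _ in vm_minimums), max(mem for _, mem in vm_minimums))


def slots_per_host(host: Resources, slot: Resources) -> int:
    """Whole slots that fit on a host, limited by the scarcer resource."""
    return min(host[0] // slot[0], host[1] // slot[1])


def admits(hosts: List[Resources], vm_minimums: List[Resources], host_failures: int) -> bool:
    """True if every VM still has a slot after the N biggest hosts fail."""
    if not vm_minimums:
        return True  # nothing to protect
    slot = slot_size(vm_minimums)
    capacities = sorted((slots_per_host(h, slot) for h in hosts), reverse=True)
    surviving = capacities[host_failures:]  # drop the N hosts with the most slots
    return sum(surviving) >= len(vm_minimums)
```

With three hosts of `(4000, 8192)` and a slot of `(1000, 2048)`, each host holds four slots, so four virtual machines survive two host failures but a fifth would be refused under that setting.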
Capacity Planning Example
[Diagram: virtual machines 1 through 6 occupying slots of a common slot size on the cluster's hosts, shown for a failover capacity of 1 host failure and of 2 host failures]
Capacity Planning, cont.
Assumes a relatively homogeneous cluster
This is a conservative scheme
Advanced users who want to do their own planning can turn it off
Virtual machines have priorities
  If sufficient capacity is not available, more important virtual machines get failed over first
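Priority-ordered failover can be sketched as a sort followed by a cutoff. This is a minimal illustration under stated assumptions: priorities are integers where a larger number means more important (the slides do not define a scale), free capacity is counted in whole slots, and `plan_failover` is a hypothetical name.

```python
from typing import Dict, List, Tuple


def plan_failover(failed_vms: List[Tuple[str, int]], free_slots: int) -> Dict[str, object]:
    """Restart the highest-priority VMs first and report the ones left behind.

    failed_vms holds (name, priority) pairs. Ties keep their input order
    because Python's sort is stable.
    """
    ordered = sorted(failed_vms, key=lambda vm: vm[1], reverse=True)
    cutoff = max(free_slots, 0)
    restarted = [name for name, _ in ordered[:cutoff]]
    skipped = [name for name, _ in ordered[cutoff:]]
    # Flag the shortfall so a caller could raise an alarm.
    return {"restarted": restarted, "skipped": skipped, "alarm": bool(skipped)}
```

Calling `plan_failover([("web", 1), ("db", 3), ("batch", 0)], free_slots=2)` restarts `db` and `web`, leaves `batch` behind, and sets the alarm flag.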
Failover and error scenarios
Number of host failures exceeds configured spare capacity
  Virtual machines with higher priority get failed over first
  An alarm is generated
Clustering service is monitored and appropriate events/alarms are raised
  Integrated with standard VC framework
Example
Create a cluster
Plan your cluster: set cluster properties
Add hosts to the cluster
Create virtual machines
Power on virtual machines, set virtual machine specific policies, if needed
Sleep well!
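The order of these steps can be walked through with a toy in-memory model. Nothing here is a VirtualCenter API: the `Cluster` class, its fields, and the host and virtual machine names are hypothetical stand-ins used only to show the sequence.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Cluster:
    """Hypothetical in-memory stand-in for a DAS cluster; not a VirtualCenter API."""
    name: str
    host_failures: int = 1  # planned failover capacity (a cluster property)
    hosts: List[str] = field(default_factory=list)
    vm_overrides: Dict[str, str] = field(default_factory=dict)  # VM name -> priority label
    powered_on: List[str] = field(default_factory=list)

    def add_host(self, host: str) -> None:
        self.hosts.append(host)

    def power_on(self, vm: str, priority: str = "") -> None:
        if not self.hosts:
            raise RuntimeError("add at least one host before powering on virtual machines")
        if priority:  # VM-specific policy, set only if needed
            self.vm_overrides[vm] = priority
        self.powered_on.append(vm)


# Steps in the order given on the slide.
cluster = Cluster("prod", host_failures=1)   # create the cluster and set its properties
for host in ("esx1", "esx2", "esx3"):        # add hosts to the cluster
    cluster.add_host(host)
cluster.power_on("web01")                    # default policy
cluster.power_on("db01", priority="high")    # per-VM override
```

Only `db01` carries an override; every other virtual machine inherits the cluster-wide settings, which matches the "if needed" wording on the slide.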
Example
[Screenshots, in order: Create Cluster; Configure Cluster; Add Hosts to Cluster; Add Host; Move Host into Cluster; Set Virtual Machine Specific Overrides; Power On Virtual Machine; Failover]
Example 2: VirtualCenter
VirtualCenter is an application
  Needs to be highly available to ensure management and monitoring
Can VirtualCenter provide failover for itself?
  Solution: Run VirtualCenter in a virtual machine in a DAS cluster
Clustered VirtualCenter, 2
Running VirtualCenter inside a virtual machine, in a cluster that it manages, provides high availability to itself!
Conclusion
DAS
  Automatic failover for all virtual machines
  Fully integrated with VC, VMotion, DRS
  Applications running in such virtual machines get increased availability
  Configuration and management are simplified
  Scalable
  Works with traditional application-level failover and enhances it
Questions?
Details about future releases of our products are available in select sessions at VMworld, including:
PAC879: The Next Phase of Virtual Infrastructure: Introducing ESX Server 3.0 and VirtualCenter 2.0
PAC177: Distributed Availability Services Architecture
PAC484: Consolidated Backup with ESX Server: In-Depth Review
PAC485: Managing Data Center Resources Using the VirtualCenter Distributed Resource Scheduler
PAC532: iSCSI and NAS in ESX Server 3