this presentation may contain vmwaredownload3.vmware.com/vmworld/2005/pac177.pdf · machine2...

Post on 06-Jul-2020


This presentation may contain VMware confidential information.

Copyright © 2005 VMware, Inc. All rights reserved. All other marks and names mentioned herein may be trademarks of their respective companies.

PAC177: Distributed Availability Service (DAS) Architecture

Sridhar Rajagopal, VMware

What Is It?

- Provides high availability to virtual machines through automatic failover, on a cluster of ESX Server hosts
- Easy configuration, management and monitoring of the high availability service
- Customizable behavior for individual virtual machines

Outline

- Overview
- Traditional failover
- Failover of virtual machines
- Distributed Availability Service (DAS)
- Example
- Conclusion

Availability

- High availability
  - Immediate response to failure, at the application or machine level
- Proactive availability
  - Responding to events such as system maintenance, load fluctuations, workflow cycles, errors
- Availability continuum
  - Improved, high, continuous availability

Active-Passive

[Figure: active-passive configuration; one node is marked X as failed, and the address 192.168.1.1 appears twice]

Active-Active

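[Figure: active-active configuration; one node is marked X as failed, the address 192.168.1.1 appears twice, and 192.168.1.2 appears once]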

Advanced Configurations

- N-to-one
- N+1
- N-to-N

[Figure: advanced configurations; one node is marked X as failed]

Traditional Failover

- Need to cluster important applications
- Need to configure each node and application, with additional setup on each node
- Non-clustered applications have no guarantees
- Application compatibility
- Planning spare capacity is not easy

Failover in Virtual Infrastructure

- Cluster Server running inside the virtual machine
- Cluster Server running in the console OS

[Figure: failover in virtual infrastructure; one element is marked X as failed]

Advantages

- Increased application availability at no extra cost
- No extra application setup needed on each node
- Clustered, continuously available applications get faster resurrection of resources

Issues

- Each virtual machine will have to be set up for clustering
- Not fully integrated with VirtualCenter (VC), VMotion
- VC does not know about failover
- Cluster server does not know about planned VC operations like power-offs, VMotions


Solution

- For a transparent solution
  - VC needs to understand cluster server
    - Virtual machine might change hosts
  - Cluster Server needs to know about VC
    - Power off, VMotion
- Hide the complexity of the interaction from the user
  - VC as a management framework

Distributed Availability Service

- A fully integrated, scalable virtual machine failover solution from VMware
- All virtual machines get transparent failover support
- Integrated with VMotion, Distributed Resource Scheduler (DRS), VC

Architecture Diagram

[Figure: high-level architecture with black-box view of Agent, showing the VC Server, three Agents, and three virtual machines]

VC Planned Operations

- Power off
  - Power off will not be construed as failure
  - Powered off, and specifically marked virtual machines will not be failed over
- VMotion: There is a brief window of vulnerability
  - DAS has enough information to restart virtual machine on origin, target, or another suitable host (if both go down)
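The rules on this slide can be sketched as a small decision function. This is only an illustration under stated assumptions, not DAS code: the `VmState` record, its field names, and `restart_candidates` are hypothetical, and the preference order (origin, then target, then any other live host) is one reading of the last bullet.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VmState:
    """Hypothetical record of what a failover service knows about a VM."""
    name: str
    powered_off_by_user: bool = False    # planned power off, not a failure
    failover_disabled: bool = False      # VM specifically marked "do not fail over"
    host: Optional[str] = None           # host the VM last ran on
    vmotion_source: Optional[str] = None  # set only while a VMotion is in flight
    vmotion_target: Optional[str] = None


def restart_candidates(vm: VmState, live_hosts: List[str]) -> List[str]:
    """Return live hosts, in preference order, on which the VM may be restarted.

    An empty list means the VM must not be failed over (or no host is alive).
    """
    if vm.powered_off_by_user or vm.failover_disabled:
        return []
    if vm.vmotion_source and vm.vmotion_target:
        # Mid-VMotion: prefer the origin, then the target (assumed order).
        preferred = [vm.vmotion_source, vm.vmotion_target]
    else:
        preferred = [vm.host] if vm.host else []
    ordered = [h for h in preferred if h in live_hosts]
    # Fall back to any other suitable host if the preferred ones are down.
    ordered += [h for h in live_hosts if h not in ordered]
    return ordered
```

For a VM caught mid-VMotion from `h1` to `h2` with only `h2` and `h3` alive, the function prefers `h2` and then `h3`; a VM that was powered off on purpose yields no candidates.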

Distributed Resource Scheduler

- React to dynamic load changes
  - Balance load across the cluster by automatic virtual machine placement and VMotion
- Specify complex resource policies across your cluster with hierarchical resource pools
  - Easily manage and view resource policies and cluster balance recommendations

DRS and DAS

- The first priority is restarting of failed virtual machines
- DRS will kick in and rectify sub-optimal placements
- DRS has affinity/anti-affinity rules
- DRS + DAS = proactive + reactive solution

DAS

[Figure: DAS cluster diagram showing eight virtual machines and VC, with one element marked X as failed]

Planning and Configuring

- Planning your cluster (admission control)
  - Admission control ensures that enough spare capacity is maintained across the cluster for failover
  - Each host has some available headroom in terms of memory and CPU
  - Each virtual machine has some minimum memory and CPU requirements

Capacity Planning

- Specify number of host failures
  - If N biggest hosts fail, the virtual machines should still be relocated
- Determine worst case failure scenario with max of the host and virtual machine sizes
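The worst-case check described above can be illustrated with a short sketch. It is a simplification under stated assumptions, not the DAS algorithm: the `Host` and `Vm` records and all function names are hypothetical, and the "max of the virtual machine sizes" is treated as a single fixed slot size that every virtual machine is assumed to occupy.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Host:
    name: str
    cpu_mhz: int   # CPU available to virtual machines (assumed unit)
    mem_mb: int    # memory available to virtual machines (assumed unit)


@dataclass(frozen=True)
class Vm:
    name: str
    min_cpu_mhz: int  # minimum CPU requirement
    min_mem_mb: int   # minimum memory requirement


def slot_size(vms: List[Vm]) -> Tuple[int, int]:
    """Worst-case slot: the largest CPU and memory minimum over all VMs."""
    cpu = max(1, max(v.min_cpu_mhz for v in vms))
    mem = max(1, max(v.min_mem_mb for v in vms))
    return cpu, mem


def slots_on(host: Host, slot: Tuple[int, int]) -> int:
    """Number of whole slots that fit on a host."""
    cpu, mem = slot
    return min(host.cpu_mhz // cpu, host.mem_mb // mem)


def can_tolerate(hosts: List[Host], vms: List[Vm], host_failures: int) -> bool:
    """True if all VMs still fit after the N biggest hosts are lost."""
    if not vms:
        return True
    slot = slot_size(vms)
    per_host = sorted((slots_on(h, slot) for h in hosts), reverse=True)
    surviving = per_host[host_failures:]  # drop the N largest hosts
    return sum(surviving) >= len(vms)
```

Sizing every virtual machine by the largest minimum is deliberately pessimistic, and it only gives sensible answers when hosts and virtual machines are of similar size.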

Capacity Planning Example

[Figure: two panels, "Failover capacity: 1 host failure" and "Failover capacity: 2 host failures", showing virtual machines 1 through 6 placed on hosts, with the slot size marked]

Capacity Planning, cont.

- Assumes a relatively homogeneous cluster
  - This is a conservative scheme
  - Advanced users that want to do their own planning can turn it off
- Virtual machines have priorities
  - If sufficient capacity is not available, more important virtual machines get failed over first

Failover and error scenarios

- Number of host failures exceeds configured spare capacity
  - Virtual machines with higher priority get failed over first
  - An alarm is generated
- Clustering service is monitored and appropriate events/alarms are raised
  - Integrated with standard VC framework
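The behavior when failures exceed the configured spare capacity can be sketched as follows. This is a minimal illustration, not the DAS implementation: `FailedVm`, `plan_failover`, the integer priority scale (higher means more important), and the idea of counting free capacity in whole slots are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class FailedVm:
    name: str
    priority: int  # assumed scale: a higher value means a more important VM


def plan_failover(
    failed: List[FailedVm], free_slots: int
) -> Tuple[List[FailedVm], List[FailedVm], bool]:
    """Pick which failed VMs to restart when capacity may be short.

    Returns (to_restart, not_restarted, raise_alarm). VMs are taken in
    descending priority order; ties keep their input order because
    sorted() is stable.
    """
    free_slots = max(0, free_slots)
    ordered = sorted(failed, key=lambda v: v.priority, reverse=True)
    to_restart = ordered[:free_slots]
    not_restarted = ordered[free_slots:]
    raise_alarm = len(not_restarted) > 0  # capacity was exceeded
    return to_restart, not_restarted, raise_alarm
```

With three failed virtual machines and room for two, the two with the highest priority are restarted, the third is left down, and the alarm flag is set.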

Example

- Create a cluster
- Plan your cluster: set cluster properties
- Add hosts to the cluster
- Create virtual machines
- Power on virtual machines, set virtual machine specific policies, if needed
- Sleep well!

Example

VC

Create Cluster

Configure Cluster

Configure Cluster

Add Hosts to Cluster

Add Host

Move Host into Cluster

Set Virtual Machine Specific Overrides

Power On Virtual Machine

Failover

Example 2: VirtualCenter

- VirtualCenter is an application
  - Needs to be highly available to ensure management and monitoring
- Can VirtualCenter provide failover for itself?
- Solution: Run VirtualCenter in a virtual machine in a DAS cluster

Clustered VirtualCenter, 2

Running VirtualCenter inside a virtual machine, in a cluster that it manages, provides high availability to itself!


Conclusion

- DAS
  - Automatic failover for all virtual machines
  - Fully integrated with VC, VMotion, DRS
  - Applications running in such virtual machines get increased availability
  - Configuration and management are simplified
  - Scalable
  - Works with traditional application level failover and enhances it

Questions?

Details about future releases of our products are available in select sessions at VMworld, including:

- PAC879: The Next Phase of Virtual Infrastructure: Introducing ESX Server 3.0 and VirtualCenter 2.0
- PAC177: Distributed Availability Services Architecture
- PAC484: Consolidated Backup with ESX Server: In-Depth Review
- PAC485: Managing Data Center Resources Using the VirtualCenter Distributed Resource Scheduler
- PAC532: iSCSI and NAS in ESX Server 3
