![Page 1: Site A But what if there is a catastrophic event? Fire, flood, earthquake … Same Physical Location](https://reader036.vdocuments.mx/reader036/viewer/2022062404/551b0ef65503462e578b59dd/html5/thumbnails/1.jpg)
Joint Business Launch
DISASTER RECOVERY AND MULTI-SITE CLUSTERING WITH WINDOWS SERVER 2008 R2
VIJAY TEWARI, PRINCIPAL PROGRAM MANAGER, WINDOWS SERVER
NOV 17, 2009
![Page 2: Site A But what if there is a catastrophic event? Fire, flood, earthquake … Same Physical Location](https://reader036.vdocuments.mx/reader036/viewer/2022062404/551b0ef65503462e578b59dd/html5/thumbnails/2.jpg)
Session Objectives And Takeaways
Session Objective(s):
Understanding the need and benefit of multi-site clusters
What to consider as you plan, design, and deploy your first multi-site cluster
Windows Server Failover Clustering is a great solution for not only high availability, but also disaster recovery
![Page 3: Site A But what if there is a catastrophic event? Fire, flood, earthquake … Same Physical Location](https://reader036.vdocuments.mx/reader036/viewer/2022062404/551b0ef65503462e578b59dd/html5/thumbnails/3.jpg)
Multi-Site Clustering
Introduction Networking Storage Quorum Workloads
![Page 4: Site A But what if there is a catastrophic event? Fire, flood, earthquake … Same Physical Location](https://reader036.vdocuments.mx/reader036/viewer/2022062404/551b0ef65503462e578b59dd/html5/thumbnails/4.jpg)
Is my Cluster Resilient to Site Failures?
But what if there is a catastrophic event?
Fire, flood, earthquake …
[Diagram: Site A, with the cluster nodes and SAN in the Same Physical Location]
![Page 5: Site A But what if there is a catastrophic event? Fire, flood, earthquake … Same Physical Location](https://reader036.vdocuments.mx/reader036/viewer/2022062404/551b0ef65503462e578b59dd/html5/thumbnails/5.jpg)
Multi-Site Clusters for DR
Extends a cluster from being a High Availability solution, to also being a Disaster Recovery solution
Node is moved to a physically separate site
Applications are failed over to a separate physical location
[Diagram: Site A and Site B, each with its own SAN]
![Page 6: Site A But what if there is a catastrophic event? Fire, flood, earthquake … Same Physical Location](https://reader036.vdocuments.mx/reader036/viewer/2022062404/551b0ef65503462e578b59dd/html5/thumbnails/6.jpg)
Benefits of a Multi-Site Cluster
Protects against loss of an entire datacenter
Automates failover
Reduced downtime
Lower complexity disaster recovery plan
Reduces administrative overhead
Automatically synchronize application and cluster changes
Easier to keep consistent than standalone servers
The primary reason DR solutions fail is dependence on people
![Page 7: Site A But what if there is a catastrophic event? Fire, flood, earthquake … Same Physical Location](https://reader036.vdocuments.mx/reader036/viewer/2022062404/551b0ef65503462e578b59dd/html5/thumbnails/7.jpg)
Multi-Site Clustering
Introduction Networking Storage Quorum Workloads
![Page 8: Site A But what if there is a catastrophic event? Fire, flood, earthquake … Same Physical Location](https://reader036.vdocuments.mx/reader036/viewer/2022062404/551b0ef65503462e578b59dd/html5/thumbnails/8.jpg)
Network Considerations
Network Options:
1. Stretch VLANs across sites
2. Cluster nodes can reside in different subnets
[Diagram: Site A and Site B. Public Network: 10.10.10.1, 20.20.20.1. Separate Network: 30.30.30.1, 40.40.40.1]
![Page 9: Site A But what if there is a catastrophic event? Fire, flood, earthquake … Same Physical Location](https://reader036.vdocuments.mx/reader036/viewer/2022062404/551b0ef65503462e578b59dd/html5/thumbnails/9.jpg)
Stretching the Network
Longer distance traditionally means greater network latency
Too many missed health checks can cause false failover
Heartbeating is fully configurable
SameSubnetDelay (default = 1 second): Frequency heartbeats are sent
SameSubnetThreshold (default = 5 heartbeats): Missed heartbeats before an interface is considered down
CrossSubnetDelay (default = 1 second): Frequency heartbeats are sent to nodes on dissimilar subnets
CrossSubnetThreshold (default = 5 heartbeats): Missed heartbeats before an interface is considered down to nodes on dissimilar subnets
Command Line: Cluster.exe /prop
PowerShell (R2): Get-Cluster | fl *
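As a rough illustration of how the delay and threshold settings interact, the sketch below multiplies the two to estimate how long heartbeats must be missed before an interface is considered down. This is a simplified model, not the cluster service's actual algorithm; the `HeartbeatSettings` class and the "relaxed" values are hypothetical, and only the defaults come from the slide.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatSettings:
    """Illustrative container; the defaults mirror the values listed on the slide."""
    delay_seconds: float = 1.0  # SameSubnetDelay / CrossSubnetDelay
    threshold: int = 5          # SameSubnetThreshold / CrossSubnetThreshold


def seconds_until_interface_down(settings: HeartbeatSettings) -> float:
    """Approximate time of continuous heartbeat loss before an interface is considered down."""
    return settings.delay_seconds * settings.threshold


defaults = HeartbeatSettings()
# Hypothetical tuning for a higher-latency link between sites (assumed values).
relaxed = HeartbeatSettings(delay_seconds=2.0, threshold=10)

print(seconds_until_interface_down(defaults))
print(seconds_until_interface_down(relaxed))
```

Raising either value makes the cluster more tolerant of WAN latency, at the cost of slower detection of a real failure.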
![Page 10: Site A But what if there is a catastrophic event? Fire, flood, earthquake … Same Physical Location](https://reader036.vdocuments.mx/reader036/viewer/2022062404/551b0ef65503462e578b59dd/html5/thumbnails/10.jpg)
Security over the WAN
Encrypt intra-node traffic
0 = clear text
1 = signed (default)
2 = encrypted
[Diagram: Site A and Site B, with addresses 10.10.10.1, 20.20.20.1, 30.30.30.1, 40.40.40.1]
![Page 11: Site A But what if there is a catastrophic event? Fire, flood, earthquake … Same Physical Location](https://reader036.vdocuments.mx/reader036/viewer/2022062404/551b0ef65503462e578b59dd/html5/thumbnails/11.jpg)
Enhanced Dependencies – OR
Network Name resource stays up if either IP Address Resource A OR IP Address Resource B is up
[Diagram: Network Name resource depends on IP Address Resource A OR IP Address Resource B]
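The OR dependency can be pictured as a simple boolean rule. The sketch below is only a model of that rule, with hypothetical function names; it contrasts OR with an AND dependency, where the loss of any one IP address would take the Network Name down.

```python
from typing import Dict


def stays_up_with_or(ip_resources_online: Dict[str, bool]) -> bool:
    """OR dependency: the Network Name stays up if at least one IP Address resource is up."""
    return any(ip_resources_online.values())


def stays_up_with_and(ip_resources_online: Dict[str, bool]) -> bool:
    """AND dependency, for contrast: every IP Address resource must be up."""
    return all(ip_resources_online.values())


# After a cross-site failover only the IP address for the surviving subnet is online.
after_failover = {"IP Address Resource A": False, "IP Address Resource B": True}

print(stays_up_with_or(after_failover))
print(stays_up_with_and(after_failover))
```

With nodes in different subnets, only one of the two IP addresses can be online at a time, which is why the OR form is needed.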
![Page 12: Site A But what if there is a catastrophic event? Fire, flood, earthquake … Same Physical Location](https://reader036.vdocuments.mx/reader036/viewer/2022062404/551b0ef65503462e578b59dd/html5/thumbnails/12.jpg)
Client Reconnect Considerations
Nodes in dissimilar subnets
Failover changes resource’s IP Address
Clients need that new IP Address from DNS to reconnect
[Diagram: Site A (10.10.10.111, DNS Server 1) and Site B (20.20.20.222, DNS Server 2), with DNS Replication between the DNS servers. Labels: Record Created (FS = 10.10.10.111), Record Updated (FS = 20.20.20.222), Record Updated, Record Obtained]
![Page 13: Site A But what if there is a catastrophic event? Fire, flood, earthquake … Same Physical Location](https://reader036.vdocuments.mx/reader036/viewer/2022062404/551b0ef65503462e578b59dd/html5/thumbnails/13.jpg)
Solution #1: Configure Network Name (NN) Setting
RegisterAllProvidersIP (default = 0 for FALSE)
Determines if all IP Addresses for a Network Name will be registered in DNS
TRUE (1): IP Addresses can be online or offline and will still be registered
Ensure application is set to try all IP Addresses, so clients can connect quicker
HostRecordTTL (default = 1200 seconds)
Controls time the DNS record lives on client for a cluster network name
Shorter TTL: DNS records for clients updated sooner
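When all IP Addresses are registered, the client application has to be willing to try each one. A minimal sketch of that client-side behavior, using Python's standard `socket` module, might look like the following; the function name and the timeout value are assumptions for illustration, not part of any cluster API.

```python
import socket
from typing import Iterable, Optional, Tuple


def connect_to_first_reachable(
    addresses: Iterable[Tuple[str, int]],
    timeout_seconds: float = 2.0,
) -> Optional[socket.socket]:
    """Try each (host, port) in order and return the first socket that connects.

    Returns None if no address accepts the connection.
    """
    for host, port in addresses:
        try:
            return socket.create_connection((host, port), timeout=timeout_seconds)
        except OSError:
            # This address is offline or unreachable; try the next registered one.
            continue
    return None


# Example call (addresses from the slide, port 445 assumed for a file server):
# conn = connect_to_first_reachable([("10.10.10.111", 445), ("20.20.20.222", 445)])
```

A short per-address timeout matters here: the offline address is tried first about half the time, so a long timeout delays every reconnect after a cross-site failover.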
![Page 14: Site A But what if there is a catastrophic event? Fire, flood, earthquake … Same Physical Location](https://reader036.vdocuments.mx/reader036/viewer/2022062404/551b0ef65503462e578b59dd/html5/thumbnails/14.jpg)
Solution #2: Prefer Local Failover
Local failover for higher availability
No change in IP Address
Cross-site failover for disaster recovery
[Diagram: Site A (10.10.10.111, DNS Server 1) and Site B (20.20.20.222, DNS Server 2); DNS record FS = 10.10.10.111]
![Page 15: Site A But what if there is a catastrophic event? Fire, flood, earthquake … Same Physical Location](https://reader036.vdocuments.mx/reader036/viewer/2022062404/551b0ef65503462e578b59dd/html5/thumbnails/15.jpg)
Solution #3: Stretch VLANs
Deploying a VLAN minimizes client reconnection times
[Diagram: a VLAN spans Site A and Site B; the address 10.10.10.111 is used in both sites; DNS Server 1 and DNS Server 2 hold FS = 10.10.10.111]
![Page 16: Site A But what if there is a catastrophic event? Fire, flood, earthquake … Same Physical Location](https://reader036.vdocuments.mx/reader036/viewer/2022062404/551b0ef65503462e578b59dd/html5/thumbnails/16.jpg)
Solution #4: Abstraction in Device
Network device uses 3rd IP
3rd IP is the one registered in DNS & used by client
Example: http://www.cisco.com/en/US/docs/solutions/Enterprise/Data_Center/App_Networking/extmsftw2k8vistacisco.pdf
[Diagram: Site A (10.10.10.111) and Site B (20.20.20.222) behind a network device at 30.30.30.30; DNS Server 1 and DNS Server 2 hold FS = 30.30.30.30]
![Page 17: Site A But what if there is a catastrophic event? Fire, flood, earthquake … Same Physical Location](https://reader036.vdocuments.mx/reader036/viewer/2022062404/551b0ef65503462e578b59dd/html5/thumbnails/17.jpg)
This is generic guidance…
If you have other creative ideas, that’s ok!
![Page 18: Site A But what if there is a catastrophic event? Fire, flood, earthquake … Same Physical Location](https://reader036.vdocuments.mx/reader036/viewer/2022062404/551b0ef65503462e578b59dd/html5/thumbnails/18.jpg)
Multi-Site Clustering
Introduction Networking Storage Quorum Workloads
![Page 19: Site A But what if there is a catastrophic event? Fire, flood, earthquake … Same Physical Location](https://reader036.vdocuments.mx/reader036/viewer/2022062404/551b0ef65503462e578b59dd/html5/thumbnails/19.jpg)
Storage in Multi-Site Clusters
Different than local clusters:
Multiple storage arrays – independent per site
Nodes commonly access own site storage
No “true” shared disk visible to all nodes
Site A Site B
![Page 20: Site A But what if there is a catastrophic event? Fire, flood, earthquake … Same Physical Location](https://reader036.vdocuments.mx/reader036/viewer/2022062404/551b0ef65503462e578b59dd/html5/thumbnails/20.jpg)
Storage Considerations
Need a data replication mechanism between sites
Changes are made on Site A and replicated to Site B
[Diagram: Site A storage and its Replica at Site B]
![Page 21: Site A But what if there is a catastrophic event? Fire, flood, earthquake … Same Physical Location](https://reader036.vdocuments.mx/reader036/viewer/2022062404/551b0ef65503462e578b59dd/html5/thumbnails/21.jpg)
Replication Alternatives
Replication levels:
Hardware storage-based replication
Software host-based replication
Application-based replication
![Page 22: Site A But what if there is a catastrophic event? Fire, flood, earthquake … Same Physical Location](https://reader036.vdocuments.mx/reader036/viewer/2022062404/551b0ef65503462e578b59dd/html5/thumbnails/22.jpg)
Synchronous vs. Asynchronous
| Synchronous | Asynchronous |
| --- | --- |
| No data loss | Potential data loss on hard failures |
| Requires high bandwidth/low latency connection | Enough bandwidth to keep up with data replication |
| Stretches over shorter distances | Stretches over longer distances |
| Write latencies impact application performance | No significant impact on application performance |
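The latency trade-off can be sketched with a deliberately simplified model: a synchronous write is acknowledged only after the remote site has also written the data, while an asynchronous write is acknowledged locally and replicated later. The function name and the millisecond figures below are assumptions for illustration; real replication products behave in more nuanced ways.

```python
def acknowledged_write_latency_ms(
    local_write_ms: float,
    round_trip_ms: float,
    remote_write_ms: float,
    synchronous: bool,
) -> float:
    """Estimate how long the application waits for one write to be acknowledged."""
    if synchronous:
        # The write must reach the remote array and be confirmed before the ack.
        return local_write_ms + round_trip_ms + remote_write_ms
    # Asynchronous: acknowledged after the local write; replication happens afterwards,
    # so writes still in flight can be lost on a hard failure.
    return local_write_ms


# Assumed example figures: 1 ms array writes, 10 ms round trip between sites.
print(acknowledged_write_latency_ms(1.0, 10.0, 1.0, synchronous=True))
print(acknowledged_write_latency_ms(1.0, 10.0, 1.0, synchronous=False))
```

Because the round trip grows with distance, this model also shows why synchronous replication is usually limited to shorter distances.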
![Page 23: Site A But what if there is a catastrophic event? Fire, flood, earthquake … Same Physical Location](https://reader036.vdocuments.mx/reader036/viewer/2022062404/551b0ef65503462e578b59dd/html5/thumbnails/23.jpg)
Cluster Validation and Replication
Multi-Site clusters are not required to pass the Storage tests to be supported
Validation Guide and Policy
http://go.microsoft.com/fwlink/?LinkID=119949
![Page 24: Site A But what if there is a catastrophic event? Fire, flood, earthquake … Same Physical Location](https://reader036.vdocuments.mx/reader036/viewer/2022062404/551b0ef65503462e578b59dd/html5/thumbnails/24.jpg)
Multi-Site Clustering
Introduction Networking Storage Quorum Workloads
![Page 25: Site A But what if there is a catastrophic event? Fire, flood, earthquake … Same Physical Location](https://reader036.vdocuments.mx/reader036/viewer/2022062404/551b0ef65503462e578b59dd/html5/thumbnails/25.jpg)
Quorum Overview
Majority is greater than 50%
Possible Voters: Nodes (1 each) + 1 Witness (Disk or File Share)
4 Quorum Types:
Disk only (not recommended)
Node and Disk majority
Node majority
Node and File Share majority
[Diagram: five voters, each with one Vote]
![Page 26: Site A But what if there is a catastrophic event? Fire, flood, earthquake … Same Physical Location](https://reader036.vdocuments.mx/reader036/viewer/2022062404/551b0ef65503462e578b59dd/html5/thumbnails/26.jpg)
Replicated Disk Witness
A witness is a decision maker when nodes lose network connectivity
When a witness is not a single decision maker, problems occur
Do not use in multi-site clusters unless directed by vendor
[Diagram: three voters (Vote, Vote, Vote) and a disk witness on Replicated Storage from vendor, marked “?”]
![Page 27: Site A But what if there is a catastrophic event? Fire, flood, earthquake … Same Physical Location](https://reader036.vdocuments.mx/reader036/viewer/2022062404/551b0ef65503462e578b59dd/html5/thumbnails/27.jpg)
Node Majority
5 Node Cluster: Majority = 3
Majority in Primary Site
Cross site network connectivity broken!
Nodes in the site with the majority: Can I communicate with majority of the nodes in the cluster? Yes, then Stay Up
Nodes in the other site: Can I communicate with majority of the nodes in the cluster? No, drop out of Cluster Membership
[Diagram: Site A and Site B, each with its own SAN]
![Page 28: Site A But what if there is a catastrophic event? Fire, flood, earthquake … Same Physical Location](https://reader036.vdocuments.mx/reader036/viewer/2022062404/551b0ef65503462e578b59dd/html5/thumbnails/28.jpg)
Node Majority
5 Node Cluster: Majority = 3
Majority in Primary Site
Disaster at Site 1
Nodes in the failed site: We are down!
Nodes in the surviving site: Can I communicate with majority of the nodes in the cluster? No, drop out of Cluster Membership
Need to force quorum manually
[Diagram: Site A and Site B, each with its own SAN]
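The two Node Majority scenarios above reduce to the same check: a group of nodes stays up only if it can reach more than 50% of all votes. The sketch below models that check for the 5-node example; the 3/2 split between sites and all names are assumptions for illustration.

```python
def partition_has_quorum(reachable_votes: int, total_votes: int) -> bool:
    """Node Majority: a partition stays up only with more than 50% of all votes."""
    return reachable_votes > total_votes / 2


TOTAL_NODES = 5
PRIMARY_SITE_NODES = 3    # assumption: the majority is placed in the primary site
SECONDARY_SITE_NODES = 2

# Scenario 1: cross-site connectivity is broken, both sites are still running.
primary_stays_up = partition_has_quorum(PRIMARY_SITE_NODES, TOTAL_NODES)
secondary_stays_up = partition_has_quorum(SECONDARY_SITE_NODES, TOTAL_NODES)

# Scenario 2: the primary site is lost; only the secondary nodes remain.
survivors_have_quorum = partition_has_quorum(SECONDARY_SITE_NODES, TOTAL_NODES)
needs_forced_quorum = not survivors_have_quorum

print(primary_stays_up, secondary_stays_up, needs_forced_quorum)
```

In the second scenario the surviving two nodes cannot reach three votes, which is why quorum has to be forced manually.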
![Page 29: Site A But what if there is a catastrophic event? Fire, flood, earthquake … Same Physical Location](https://reader036.vdocuments.mx/reader036/viewer/2022062404/551b0ef65503462e578b59dd/html5/thumbnails/29.jpg)
Forcing Quorum
Always understand why quorum was lost
Used to bring cluster online without quorum
Cluster starts in a special “forced” state
Once majority achieved, no more “forced” state
Command Line: net start clussvc /fixquorum (or /fq)
PowerShell (R2): Start-ClusterNode -FixQuorum (or -fq)
![Page 30: Site A But what if there is a catastrophic event? Fire, flood, earthquake … Same Physical Location](https://reader036.vdocuments.mx/reader036/viewer/2022062404/551b0ef65503462e578b59dd/html5/thumbnails/30.jpg)
Multi-Site With File Share Witness
Complete resiliency and automatic recovery from the loss of any 1 site
[Diagram: Site A and Site B, each with its own SAN and Replicated Storage between them, connected over the WAN to Site C, which hosts the File Share Witness \\Foo\Cluster1]
![Page 31: Site A But what if there is a catastrophic event? Fire, flood, earthquake … Same Physical Location](https://reader036.vdocuments.mx/reader036/viewer/2022062404/551b0ef65503462e578b59dd/html5/thumbnails/31.jpg)
Multi-Site With File Share Witness
Complete resiliency and automatic recovery from the loss of connection between sites
Nodes in the site that reaches the witness: Can I communicate with majority of the nodes (+FSW) in the cluster? Yes, then Stay Up
Nodes in the other site: Can I communicate with majority of the nodes in the cluster? No (lock failed), drop out of Cluster Membership
[Diagram: Site A and Site B, each with its own SAN and Replicated Storage between them, connected over the WAN to Site C, which hosts the File Share Witness \\Foo\Cluster1]
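With a file share witness, the witness adds one vote that only one side can hold at a time. The following sketch models that under stated assumptions: two nodes per site, a witness in a third site, and a hypothetical `holds_witness_lock` flag standing in for whichever site obtains the lock on the share.

```python
def site_stays_up(site_nodes: int, total_nodes: int, holds_witness_lock: bool) -> bool:
    """Node and File Share Majority: nodes plus one witness vote, majority is over 50%."""
    total_votes = total_nodes + 1  # every node plus the file share witness
    reachable_votes = site_nodes + (1 if holds_witness_lock else 0)
    return reachable_votes > total_votes / 2


TOTAL_NODES = 4  # assumption: 2 nodes in Site A, 2 nodes in Site B, witness in Site C

# Connection between Site A and Site B is lost; Site A obtains the witness lock.
site_a_up = site_stays_up(2, TOTAL_NODES, holds_witness_lock=True)
site_b_up = site_stays_up(2, TOTAL_NODES, holds_witness_lock=False)

# Site A is lost entirely; Site B can still reach the witness in Site C.
site_b_up_after_site_a_loss = site_stays_up(2, TOTAL_NODES, holds_witness_lock=True)

print(site_a_up, site_b_up, site_b_up_after_site_a_loss)
```

Unlike the Node Majority case, the surviving site here keeps three of five votes after a full site loss, so recovery needs no manual step.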
![Page 32: Site A But what if there is a catastrophic event? Fire, flood, earthquake … Same Physical Location](https://reader036.vdocuments.mx/reader036/viewer/2022062404/551b0ef65503462e578b59dd/html5/thumbnails/32.jpg)
Quorum Model Summary
No Majority: Disk Only
- Not Recommended
- Use as directed by vendor
Node and Disk Majority
- Use as directed by vendor
Node Majority
- Odd number of nodes
- More nodes in primary site
Node and File Share Majority
- Even number of nodes
- Best availability solution – FSW in 3rd site
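The summary's odd/even guidance can be written as a small helper. This is only a sketch of the rule of thumb on the slide, with a hypothetical function name; the disk-based models are left out because the slide says to use them only as directed by the storage vendor.

```python
def suggested_quorum_model(node_count: int) -> str:
    """Map a node count to the quorum model the summary suggests for multi-site clusters."""
    if node_count < 1:
        raise ValueError("a cluster needs at least one node")
    if node_count % 2 == 1:
        # Odd number of nodes: place more nodes in the primary site.
        return "Node Majority"
    # Even number of nodes: add a file share witness, ideally in a 3rd site.
    return "Node and File Share Majority"


for nodes in (4, 5):
    print(nodes, suggested_quorum_model(nodes))
```

Either way the total number of votes ends up odd, so a clean split between two sites can never produce a tie.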
![Page 33: Site A But what if there is a catastrophic event? Fire, flood, earthquake … Same Physical Location](https://reader036.vdocuments.mx/reader036/viewer/2022062404/551b0ef65503462e578b59dd/html5/thumbnails/33.jpg)
Multi-Site Clustering
Introduction Networking Storage Quorum Workloads
![Page 34: Site A But what if there is a catastrophic event? Fire, flood, earthquake … Same Physical Location](https://reader036.vdocuments.mx/reader036/viewer/2022062404/551b0ef65503462e578b59dd/html5/thumbnails/34.jpg)
Hyper-V in a Multi-Site Cluster
| Area | Considerations |
| --- | --- |
| Network | On cross-subnet failover, if guest is DHCP, then IP updated automatically; if guest is statically configured IP, then admin needs to configure new IP. Use of VLAN preferred with live migration between sites |
| Storage | 3rd party replication solution required; Configuration with CSV (explained next) |
| Quorum | No special considerations |

Links: http://technet.microsoft.com/en-us/library/dd197488.aspx
![Page 35: Site A But what if there is a catastrophic event? Fire, flood, earthquake … Same Physical Location](https://reader036.vdocuments.mx/reader036/viewer/2022062404/551b0ef65503462e578b59dd/html5/thumbnails/35.jpg)
CSV in a Multi-Site Cluster
Architectural assumptions collide…
Replication solutions assume only 1 array accessed at a time
CSV assumes all nodes can concurrently access the LUN
CSV is not required for Live Migration
Talk to your storage vendor for their support story
[Diagram: Nodes in Primary Site access the VHD Read/Write; Replication copies it to a Read/Only replica for Nodes in Disaster Recovery Site; VM attempts to access replica]
![Page 36: Site A But what if there is a catastrophic event? Fire, flood, earthquake … Same Physical Location](https://reader036.vdocuments.mx/reader036/viewer/2022062404/551b0ef65503462e578b59dd/html5/thumbnails/36.jpg)
SQL in a Multi-Site Cluster
| Area | Considerations |
| --- | --- |
| Network | SQL does not support OR dependency; Need to stretch VLAN between sites |
| Storage | No special considerations; 3rd party replication solution required |
| Quorum | No special considerations |

Links:
http://technet.microsoft.com/en-us/library/ms189134.aspx
http://technet.microsoft.com/en-us/library/ms178128.aspx
![Page 37: Site A But what if there is a catastrophic event? Fire, flood, earthquake … Same Physical Location](https://reader036.vdocuments.mx/reader036/viewer/2022062404/551b0ef65503462e578b59dd/html5/thumbnails/37.jpg)
Exchange in a Multi-Site Cluster
| Area | Considerations |
| --- | --- |
| Network | No VLAN needed; Change HostRecordTTL from 20 minutes to 5 minutes; CCR supports 2 nodes, one per site |
| Storage | Exchange CCR provides application-based replication |
| Quorum | File share witness on the Hub Transport server on primary site |

Links:
http://technet.microsoft.com/en-us/library/bb124721.aspx
http://technet.microsoft.com/en-us/library/aa998848.aspx
![Page 38: Site A But what if there is a catastrophic event? Fire, flood, earthquake … Same Physical Location](https://reader036.vdocuments.mx/reader036/viewer/2022062404/551b0ef65503462e578b59dd/html5/thumbnails/38.jpg)
demo
Setting up a cluster and Live Migration
![Page 39: Site A But what if there is a catastrophic event? Fire, flood, earthquake … Same Physical Location](https://reader036.vdocuments.mx/reader036/viewer/2022062404/551b0ef65503462e578b59dd/html5/thumbnails/39.jpg)
Demo Environment Overview
HVNODE1 (Microsoft Hyper-V Server 2008 R2)
HVNODE2 (Windows Server 2008 R2 deployed as Server Core)
Gigabit Switch
CONTOSO: Domain Controller and iSCSI storage
![Page 40: Site A But what if there is a catastrophic event? Fire, flood, earthquake … Same Physical Location](https://reader036.vdocuments.mx/reader036/viewer/2022062404/551b0ef65503462e578b59dd/html5/thumbnails/40.jpg)
Session Summary
Multi-Site Failover Clustering has many benefits
Redundancy is needed everywhere
Understand your replication needs
Compare VLANs with multiple subnets
Plan quorum model & nodes before deployment
Follow the checklist and best practices
![Page 41: Site A But what if there is a catastrophic event? Fire, flood, earthquake … Same Physical Location](https://reader036.vdocuments.mx/reader036/viewer/2022062404/551b0ef65503462e578b59dd/html5/thumbnails/41.jpg)
Resources
Cluster Team Blog: http://blogs.msdn.com/clustering/
Cluster Information Portal: http://www.microsoft.com/windowsserver2008/en/us/clustering-home.aspx
Clustering Technical Resources: http://www.microsoft.com/windowsserver2008/en/us/clustering-resources.aspx
Clustering Forum (2008): http://forums.technet.microsoft.com/en-US/winserverClustering/threads/
Clustering Forum (2008 R2): http://social.technet.microsoft.com/Forums/en-US/windowsserver2008r2highavailability/threads/
Clustering Newsgroup: http://www.microsoft.com/communities/newsgroups/list/en-us/default.aspx?dg=microsoft.public.windows.server.clustering
Failover Clustering Deployment Guide: http://technet.microsoft.com/en-us/library/dd197477.aspx
TechNet: Configure a Service or Application for High Availability: http://technet.microsoft.com/en-us/library/cc732478.aspx
TechNet: Installing a Failover Cluster: http://technet.microsoft.com/en-us/library/cc772178.aspx
TechNet: Creating a Failover Cluster: http://technet.microsoft.com/en-us/library/cc755009.aspx
Webcast (2008 R2): Introduction to Failover Clustering: http://msevents.microsoft.com/CUI/EventDetail.aspx?EventID=1032407190&Culture=en-US
Webcast (2008 R2): HA Basics with Hyper-V: http://msevents.microsoft.com/CUI/EventDetail.aspx?EventID=1032407222&Culture=en-US
Webcast (2008 R2): Cluster Shared Volumes (CSV): http://msevents.microsoft.com/CUI/EventDetail.aspx?EventID=1032407238&Culture=en-US
![Page 42: Site A But what if there is a catastrophic event? Fire, flood, earthquake … Same Physical Location](https://reader036.vdocuments.mx/reader036/viewer/2022062404/551b0ef65503462e578b59dd/html5/thumbnails/42.jpg)
© 2009 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.