김병기, Senior Technology Specialist, Microsoft Corporation
SQL Server 2000 High Availability
Agenda
• What Is High Availability?
• HA Technology
• Designing a Solution for HA
High Availability
• Is: the combination of design, people, process, and technology
• Misconceptions:
  • It is not a technology solution
  • It is not a concept similar to scalability or manageability
Breaking Down The Nines
Percentage    Downtime (per year)
100%          None
99.999%       < 5.26 minutes
99.99%        5.26 – 52 minutes
99.9%         52 m – 8 h, 45 m
99%           8 h, 45 m – 87 h, 36 m
90%           788 h, 24 m – 875 h, 54 m
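The downtime figures in this table follow from simple arithmetic on the minutes in a year. The sketch below shows that calculation; it assumes a 365-day year, and the function names are illustrative only.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; assumes a non-leap year


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for an availability target."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100")
    return MINUTES_PER_YEAR * (100.0 - availability_percent) / 100.0


def format_duration(minutes: float) -> str:
    """Render a number of minutes as hours and minutes."""
    hours, rem = divmod(minutes, 60)
    return f"{int(hours)} h, {rem:.2f} m"


for target in (99.999, 99.99, 99.9, 99.0):
    budget = downtime_minutes_per_year(target)
    print(f"{target}% -> {format_duration(budget)}")
```

Each added nine divides the downtime budget by ten, which is why the rows of the table shrink so quickly.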
In Summary
• 99%: Unmanaged
• 99.9%: Hardware is well managed
  • Can tolerate some hardware failures
• 99.99%: Good management and planning
  • Can tolerate most hardware failures
  • Can tolerate normal ops tasks (e.g., software upgrades)
  • Can tolerate some software failures
• 99.999%: Operational, planning, design excellence
  • Can handle most planned and unplanned downtime
  • Can tolerate some operations failures
The Cost Of Availability
[Chart: cost (vertical axis) plotted against availability (horizontal axis, 99 to 100 in steps of 0.1)]
Agenda
• What Is High Availability?
• HA Technology
• Designing a Solution for HA
Two Levels …
• SQL Technology
  • Failover Clustering
  • Log Shipping
  • Replication
• Windows Technology
  • Windows Clustering
  • NLB/WLBS
How Failover Clustering Works
[Diagram labels: Client PCs; Node A and Node B, each labeled SQL Server; Heartbeat between the nodes; Shared Disk Array]
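The heartbeat between Node A and Node B is what lets the cluster detect a failure automatically. The sketch below illustrates that idea only: the class name and the timeout value are hypothetical, and it makes no claim about how Windows Clustering implements its heartbeat.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatMonitor:
    """Tracks the last heartbeat seen from the node that owns SQL Server."""
    timeout_seconds: float      # hypothetical threshold, not a product default
    active: str = "Node A"
    standby: str = "Node B"
    last_beat: float = 0.0

    def beat(self, now: float) -> None:
        """Record a heartbeat from the active node at time `now`."""
        self.last_beat = now

    def check(self, now: float) -> str:
        """Swap roles if the active node has been silent too long; return the owner."""
        if now - self.last_beat > self.timeout_seconds:
            self.active, self.standby = self.standby, self.active
            self.last_beat = now  # the new owner starts with a fresh heartbeat
        return self.active


monitor = HeartbeatMonitor(timeout_seconds=5.0)
monitor.beat(now=1.0)
print(monitor.check(now=3.0))  # heartbeat is recent, ownership is unchanged
print(monitor.check(now=7.0))  # silent for 6 seconds, ownership moves
```

Time is passed in explicitly so the example runs instantly. Because both nodes attach to the same shared disk array, only ownership moves; no data is copied.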
Log Shipping
[Diagram labels: Primary; Secondary; Transaction Logs shipped from Primary to Secondary; Log Shipping Monitor]
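Log shipping repeats a simple cycle: back up the transaction log on the primary, copy the backup, and restore it on the secondary in order. The following is a minimal in-memory simulation of that cycle. All class and method names are illustrative and do not correspond to SQL Server commands or stored procedures.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LogBackup:
    sequence: int            # position in the backup chain
    transactions: List[str]  # log records captured by this backup


@dataclass
class Primary:
    log: List[str] = field(default_factory=list)
    backed_up: int = 0       # count of log records already backed up
    next_sequence: int = 1

    def commit(self, transaction: str) -> None:
        self.log.append(transaction)

    def backup_log(self) -> LogBackup:
        """Capture every record committed since the previous log backup."""
        backup = LogBackup(self.next_sequence, self.log[self.backed_up:])
        self.backed_up = len(self.log)
        self.next_sequence += 1
        return backup


@dataclass
class Secondary:
    applied: List[str] = field(default_factory=list)
    last_sequence: int = 0

    def restore_log(self, backup: LogBackup) -> None:
        """Apply a backup; refuse gaps so the chain stays consistent."""
        if backup.sequence != self.last_sequence + 1:
            raise ValueError("log backups must be restored in order")
        self.applied.extend(backup.transactions)
        self.last_sequence = backup.sequence


primary, secondary = Primary(), Secondary()
primary.commit("t1")
primary.commit("t2")
secondary.restore_log(primary.backup_log())
primary.commit("t3")  # committed, but not yet backed up or shipped
print(secondary.applied)
```

After this run the secondary holds t1 and t2 but not t3. A log-shipped standby is consistent, but current only as of the last log backup that was restored.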
Log Shipping Uses
HA uses:
• Facilitate 7.0 to 2000 upgrade
• Remote server
• An alternative to failover clustering for resolving problems
• Performing maintenance on Primary
• Check DB health
SQL HA Technology Comparison
Feature: Failure detection
  Failover Clustering: Automatic
  Log Shipping: Not automatic
  Transactional Replication: Not automatic
Feature: Automatic switch to secondary
  Failover Clustering: Yes
  Log Shipping: Manual
  Transactional Replication: Manual
Feature: Protects against failed server process
  Failover Clustering: Yes
  Log Shipping: Yes, but …
  Transactional Replication: Yes, but …
Feature: Protects against failed disk
  Failover Clustering: No, shared-disk clustering
  Log Shipping: Yes, but …
  Transactional Replication: Yes, but …
Feature: Metadata support
  Failover Clustering: All system and user schema and data for all databases
  Log Shipping: Some system, all user schema and data for select databases
  Transactional Replication: Some user schema and data
Feature: Transactionally consistent
  Failover Clustering: Yes
  Log Shipping: Yes
  Transactional Replication: Yes
Feature: Transactionally current
  Failover Clustering: Yes
  Log Shipping: No, since last transaction log backup
  Transactional Replication: No, since last replicated transaction
SQL HA Technology Comparison (continued)
Feature: Performance impact
  Failover Clustering: None
  Log Shipping: Minimal (file copying on primary)
  Transactional Replication: Log reader continually running
Feature: Time to switch
  Failover Clustering: Seconds to minutes, depends on db recovery time
  Log Shipping: Seconds, more to recover more thoroughly
  Transactional Replication: Seconds, more to recover more thoroughly
Feature: Locations
  Failover Clustering: Close (unless using distance clusters on HCL)
  Log Shipping: Not location bound
  Transactional Replication: Not location bound
Feature: Additional complexity
  Failover Clustering: Some
  Log Shipping: Some
  Transactional Replication: More
Feature: Maximum number of servers
  Failover Clustering: 4
  Log Shipping: 32 with NLB, otherwise no limit
  Transactional Replication: No limit
Feature: Standby available for reporting, etc.
  Failover Clustering: N/A (not a warm standby solution)
  Log Shipping: Yes. Possible read-only access when logs are not being loaded
  Transactional Replication: Yes
Feature: Partitioning of data to standby
  Failover Clustering: No
  Log Shipping: No
  Transactional Replication: Yes
Agenda
• What Is High Availability?
• HA Technology
• Designing a Solution for HA
Picking The Right Technology
• No complete guideline applies to every case
• Make sure the technology is supportable in your environment
• Just because something is “cool”, it may not be right
  • Even though failover clustering is the best option in most cases, it is not always the right choice
How Should I Deploy Replication?
• Replication can be used as part of a high availability solution. For example:
  • A web site whose catalog data must be replicated regularly
  • A DSS database where updates arrive incrementally
• It works very well for read-only data
Load Balancing & Replication
• Read-only data
• Windows NT® Load Balancing Service (WLBS) is used to distribute the load
• Data is replicated to multiple servers
[Diagram labels: Public Network; NLB Cluster of three IIS Servers; LAN (Ethernet/FDDI); NLB Cluster of two Read-Only SQL Servers; Read/Write SQL Server (Publisher/Distributor); snapshot or transactional replication to each read-only server]
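In this arrangement, writes go to the read/write publisher and read-only queries are spread across the replicated copies. The sketch below illustrates that split. The server names are hypothetical, and round-robin is only a stand-in: WLBS/NLB uses its own distribution algorithm and works at the network level, not inside the application.

```python
from itertools import cycle
from typing import Sequence


class ReadWriteRouter:
    """Sends SELECT statements to read-only copies and everything else to the publisher."""

    def __init__(self, publisher: str, read_only_servers: Sequence[str]) -> None:
        if not read_only_servers:
            raise ValueError("at least one read-only server is required")
        self.publisher = publisher
        self._readers = cycle(read_only_servers)  # simple round-robin stand-in

    def route(self, statement: str) -> str:
        """Return the name of the server that should run this statement."""
        words = statement.split()
        verb = words[0].upper() if words else ""
        if verb == "SELECT":
            return next(self._readers)
        return self.publisher


# Hypothetical server names, for illustration only.
router = ReadWriteRouter("rw-sql", ["ro-sql-1", "ro-sql-2"])
for sql in ("SELECT * FROM catalog", "SELECT * FROM catalog", "UPDATE catalog SET price = 1"):
    print(sql, "->", router.route(sql))
```

The design choice to route on the statement verb is deliberately crude. It is enough to show why this pattern suits read-only data: only the publisher ever accepts changes.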
Using Replication…
• Running replication in “continuous” mode can lower latency
• Watch for the exceptions (e.g., 10,000,000 row deletes)
• Cautions:
  • Only synchronizes what you request
  • Additional change management is required
  • Never a substitute for backup
How Should I Deploy Log Shipping?
• Planned failover scenario
• Secondary or remote failover
• Can protect against both logical and physical failures
• A delay can be set for the logs waiting to be applied
  • 1 to 15 minutes is what we typically see
  • 1 minute for availability
  • 15 minutes or more for logical data protection
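The delay setting trades currency for protection: a backup that has not been applied yet cannot carry a bad change to the standby. As a rough sketch of that selection step, assuming each log backup is tagged with the minute it was taken (the function and file names are made up for illustration):

```python
from typing import List, Tuple


def backups_ready_to_restore(
    backups: List[Tuple[float, str]],
    now_minutes: float,
    delay_minutes: float,
) -> List[str]:
    """Return the names of backups that are at least `delay_minutes` old, oldest first."""
    ready = [
        (taken_at, name)
        for taken_at, name in backups
        if now_minutes - taken_at >= delay_minutes
    ]
    return [name for _, name in sorted(ready)]


# (minute taken, file name); hypothetical values
shipped = [(0, "log_001"), (15, "log_002"), (30, "log_003")]
print(backups_ready_to_restore(shipped, now_minutes=35, delay_minutes=15))
print(backups_ready_to_restore(shipped, now_minutes=35, delay_minutes=1))
```

With a 15-minute delay the most recent backup is held back, leaving a window in which an accidental delete on the primary can be caught before it reaches the standby. With a 1-minute delay all three are applied, which favors availability.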
NLB and Log Shipping
• One way to handle switching is via Windows NT® Load Balancing Service (WLBS)
• Setup:
  • A private network between the two log shipping SQL Servers
  • IP address resolution through LMHosts or a WINS server
  • Regular synchronization of the master and msdb databases
NLB And Log Shipping
• Switch NLB when primary goes down
• Clients don’t have to know that they’re connecting to a new SQL Server
• But the clients can’t depend on the server name either
• One solution: Use IIS server to point to SQL Server
  • Problem: requires an intermediary
• Better for planned failover
• Consider using Data Dependent Routing
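The common thread in these points is indirection: clients ask for a logical name, and something in the middle decides which SQL Server currently answers. The sketch below shows only that idea. The class and the server names are hypothetical, and it is not a description of how NLB or IIS performs the switch.

```python
class ConnectionDirectory:
    """Maps one logical name to whichever server currently holds the primary role."""

    def __init__(self, primary: str, secondary: str) -> None:
        self._primary = primary
        self._secondary = secondary

    def resolve(self) -> str:
        """Return the server that clients should connect to right now."""
        return self._primary

    def switch_roles(self) -> None:
        """Promote the secondary, for example during a planned failover."""
        self._primary, self._secondary = self._secondary, self._primary


# Hypothetical server names, for illustration only.
directory = ConnectionDirectory(primary="SQLPRIMARY", secondary="SQLSECONDARY")
print(directory.resolve())
directory.switch_roles()
print(directory.resolve())
```

Clients that always call `resolve()` never hard-code a server name, which is the property the slide asks for. The cost is that the directory itself becomes a component that must stay available.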
NLB, IIS And Log Shipping
[Diagram labels: Clients; VIP (virtual IP address); IIS Server; NLB Cluster; Primary SQL Server; Secondary SQL Server; DIP1 and DIP2 (dedicated IP addresses, each shown twice); Log Shipping]
Log Shipping With Snapshot
• Hardware-assisted solution
• Create database on standby with snapshot backup/restore
• Set up log shipping
• Conventional log backups are copied over network
[Diagram labels: Primary; Standby; Disaster Recovery Site; Remote Mirroring; Split Mirror]
Deploying Failover Clustering
• Decide the service level required of the secondary server
  • Do you need to be at 100% capacity after a failover?
  • Helps with software upgrades and hardware upgrades, and helps mask hardware faults…
• This will drive the decisions surrounding Single Instance or Multi Instance
• N+1 clustering with Windows 2000 Datacenter Server (4 nodes)
A Failover Cluster
[Diagram labels: Virtual SQL Server(s); Node 1; Node 2; Private Network; Public Network(s)]
Cluster Deployment: Wins
• Windows 2000 Datacenter Server and SQL Server 2000 can provide N+1 failover
• Note that the 4th node could also support an active instance
  Node1: Accounting
  Node2: Research
  Node3: HR
  Node4: Spare
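In an N+1 layout, three nodes carry active instances and the fourth stands by to take over from whichever one fails. A minimal sketch of that reassignment, using the node and instance names from the slide; the function itself is hypothetical and, for simplicity, assumes one instance per node.

```python
from typing import Dict, Optional

Assignments = Dict[str, Optional[str]]


def fail_over(assignments: Assignments, failed_node: str, spare_node: str) -> Assignments:
    """Return new assignments with the failed node's instance moved to the spare."""
    if assignments.get(spare_node) is not None:
        raise RuntimeError("spare node is already hosting an instance")
    updated = dict(assignments)          # leave the caller's mapping untouched
    updated[spare_node] = updated[failed_node]
    updated[failed_node] = None
    return updated


cluster: Assignments = {
    "Node1": "Accounting",
    "Node2": "Research",
    "Node3": "HR",
    "Node4": None,  # the spare
}
after_failure = fail_over(cluster, failed_node="Node2", spare_node="Node4")
print(after_failure)
```

Calling `fail_over` a second time against the same spare raises an error, which mirrors the limit of N+1: it absorbs one failure at full capacity. A real cluster could instead place a second instance on a surviving node, at reduced capacity.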
A “Distance” Cluster
[Diagram labels: Node 1; Node 2; Node 3; Node 4; Primary Site; Disaster Recovery Site; New York City; Chicago; Remote Mirroring]
Log Shipping With Replication
• Replication can fail over
• Typically you’d protect the publisher
[Diagram labels: Primary, Standby, Distributor, and Subscribers, shown before and after Failover with the Primary and Standby positions swapped]
Log Shipping With Replication: Choices
• Synchronous model
  + Fault tolerant
  – Slower change propagation to subscribers
• Semi-synchronous model
  + Subscribers are updated as changes occur
  – Subscribers see duplicate changes in some failure situations, but replication is now designed to handle this
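One general way for a subscriber to tolerate duplicate changes is to make applying them idempotent, for example by remembering the highest change number already applied. The sketch below shows that general technique only. It is an assumption for illustration, not a description of how SQL Server replication handles duplicates, and the names are made up.

```python
from typing import Dict


class Subscriber:
    """Applies numbered changes at most once, ignoring replays."""

    def __init__(self) -> None:
        self.rows: Dict[str, int] = {}
        self.last_applied = 0  # highest change number applied so far

    def apply(self, sequence: int, key: str, value: int) -> bool:
        """Apply a change unless it was already seen; return True if applied."""
        if sequence <= self.last_applied:
            return False  # duplicate delivered after a failover, safely skipped
        self.rows[key] = value
        self.last_applied = sequence
        return True


subscriber = Subscriber()
subscriber.apply(1, "item-a", 10)
subscriber.apply(2, "item-a", 20)
replayed = subscriber.apply(2, "item-a", 20)  # same change delivered again
print(replayed, subscriber.rows)
```

The replayed change is rejected and the row keeps its value, so a publisher failover that resends recent changes leaves the subscriber in the same state.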
Combining Log Shipping And Failover Clustering
• Protection against both logical and physical failures
[Diagram labels: log-shipped standby server for each Virtual SQL Server; secondary standby, for the truly paranoid]
Log Shipping Versus Failover Clustering
• Failover clustering is a good enterprise solution
• Clients are unaware of failover since the server name/IP address doesn’t change
• Single copy of data
  • Fewer synchronization issues
  • No data integrity concerns
• Requires specific, expensive hardware
Log Shipping Versus Failover Clustering (continued)
• Hot Standby has a separate copy of the data
  • Provides for logical data protection
  • Can be kept hours behind for more protection
• Hot Standby need not be geographically close
• Can have multiple standby servers with log shipping
• Clients may need to know name/location of log shipping server
  • Unless you use NLB/WLBS
  • Unless you handle this through Data Dependent Routing (DDR)
Which One To Use?
• The most popular choice is the failover cluster, with a log-shipped remote standby server.
• Log Shipping is also often used for reasons other than availability
• Replication can be used for availability of read-only data
• You can use all three together…
Summary
• Microsoft SQL Server 2000 has several solutions to improve availability
  • Failover clusters
  • Log shipping
  • Replication
• High availability is based on process
  • Pay attention to best practices
  • Practice disaster recovery
  • Investigate Microsoft Operations Framework
Availability Resources
• Best Practices in Change, Configuration and Problem Management
  http://www.microsoft.com/mof
• High Availability Operations Guide
  http://www.microsoft.com/NTServer/nts/deployment/planguide/HighAvail.asp
• Monitoring Reliability and Availability of Windows NT-based Servers
  http://www.microsoft.com/windows2000/library/operations/cluster/monitorrel.asp
• Failover Clustering And Load Balancing
  http://www.microsoft.com/windows2000/library/technologies/cluster/default.asp
• Network Load Balancing Technical Overview
  http://www.microsoft.com/windows2000/library/howitworks/cluster/nlb.asp
Availability Resources (continued)
• Capacity Planning for High Availability
  http://www.microsoft.com/ISN/whitepapers/capacity_planning_for_ha_921.asp
• Best Practices for End-to-End High Availability
  http://www.microsoft.com/technet/avail/bestprac/bestprac.htm
• Planning, Deploying, and Managing Highly Available Solutions
  http://www.microsoft.com/technet/avail/overview/default.htm
• Monitoring Reliability and Availability of Windows 2000 Servers
  http://www.microsoft.com/windows2000/library/opeartions/cluster/monitorrel.asp
Availability Resources (continued)
• Windows 2000 Reliability and Availability Improvements
  http://www.microsoft.com/windows2000/guide/platform/strategic/relavail.asp
• Cluster Strategy: High Availability and Scalability with Industry-Standard Hardware
  http://www.microsoft.com/ntserver/ntserverenterprise/exec/prodstrat/cluster2.asp
• Windows Clustering Technologies: Cluster Service Architecture
  http://www.microsoft.com/windows/server/technical/management/ClusterArch.asp
• Don’t forget Technet and MSDN…
  http://www.microsoft.com/technet
  http://msdn.microsoft.com