© 2007 EMC Corporation. All rights reserved.
Business Continuity Overview
Module 4.1
Business Continuity Overview
After completing this module, you will be able to:
Define and differentiate between Business Continuity and Disaster Recovery
Differentiate between Disaster Recovery and Disaster Restart
Define terminology such as Recovery Point Objective and Recovery Time Objective
Describe (at a high level) Business Continuity Planning
Identify Single Points of Failure and describe solutions to eliminate them
What is Business Continuity?
Business Continuity is the preparation for, response to, and recovery from an application outage that adversely affects business operations
Business Continuity Solutions address systems unavailability, degraded application performance, or unacceptable recovery strategies
Why Business Continuity
Lost Productivity
• Number of employees impacted (x hours out * hourly rate)
Lost Revenue
• Direct loss
• Compensatory payments
• Lost future revenue
• Billing losses
• Investment losses
Damaged Reputation
• Customers
• Suppliers
• Financial markets
• Banks
• Business partners
Financial Performance
• Revenue recognition
• Cash flow
• Lost discounts (A/P)
• Payment guarantees
• Credit rating
• Stock price
Other Expenses
Temporary employees, equipment rental, overtime costs, extra shipping costs, travel expenses...
Know the downtime costs (per hour, day, two days...)
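The lost-productivity item on this slide (employees impacted x hours out x hourly rate) can be worked through as a small calculation. The sketch below is only an illustration: the function name and all input figures are hypothetical, not values from the module.

```python
def lost_productivity_cost(employees_impacted: int,
                           hours_out: float,
                           hourly_rate: float) -> float:
    """Cost of idle staff: employees impacted x hours out x hourly rate."""
    return employees_impacted * hours_out * hourly_rate


# Hypothetical figures, for illustration only.
cost = lost_productivity_cost(employees_impacted=200, hours_out=4, hourly_rate=35.0)
print(f"Lost productivity: ${cost:,.2f}")
```

The same function can be called with the outage length in hours for a day or two days to build the "per hour, day, two days" view the slide asks for.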
Information Availability
% Uptime    % Downtime   Downtime per Year   Downtime per Week
98%         2%           7.3 days            3 hrs 22 min
99%         1%           3.65 days           1 hr 41 min
99.8%       0.2%         17 hrs 31 min       20 min 10 sec
99.9%       0.1%         8 hrs 45 min        10 min 5 sec
99.99%      0.01%        52.5 min            1 min
99.999%     0.001%       5.25 min            6 sec
99.9999%    0.0001%      31.5 sec            0.6 sec
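The figures in this table follow from multiplying the downtime fraction by the length of the period. A minimal sketch of that arithmetic, assuming a 365-day year and a 7-day week (the function and constant names are illustrative):

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years
HOURS_PER_WEEK = 7 * 24    # 168 hours


def downtime_hours(uptime_percent: float, period_hours: float) -> float:
    """Hours of downtime in a period at the given uptime percentage."""
    return period_hours * (100.0 - uptime_percent) / 100.0


for pct in (98.0, 99.0, 99.9, 99.999):
    per_year = downtime_hours(pct, HOURS_PER_YEAR)
    per_week = downtime_hours(pct, HOURS_PER_WEEK)
    print(f"{pct}% uptime: {per_year:.2f} h/year, {per_week * 60:.2f} min/week")
```

For example, 99.9% uptime leaves 0.1% of 8760 hours, or about 8.76 hours per year, which the table rounds to 8 hrs 45 min.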
Importance of Business Continuity and Planning
Millions of US Dollars per Hour in Lost Revenue
Retail brokerage                   6.5
Point of sale                      3.6
Energy                             2.8
Credit card sales authorization    2.6
Telecommunications                 2.0
Call location                      1.6
Manufacturing                      1.6
Financial institutions             1.5
Information technology             1.3
Insurance                          1.2
Retail                             1.1
Source: Meta Group, 2005
Recovery Point Objective (RPO)
[Figure: Recovery Point axis (Wks, Days, Hrs, Mins, Secs) and Recovery Time axis (Secs, Mins, Hrs, Days, Wks). Technologies placed along the Recovery Point axis: Tape Backup, Periodic Replication, Asynchronous Replication, Synchronous Replication.]
Recovery Time Objective (RTO)
Recovery Time includes:
Fault detection
Recovering data
Bringing apps back online
[Figure: Recovery Point axis (Wks, Days, Hrs, Mins, Secs) and Recovery Time axis (Secs, Mins, Hrs, Days, Wks). Technologies placed along the Recovery Time axis: Global Cluster, Manual Migration, Tape Restore.]
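To make the two objectives concrete, the sketch below adds up the three recovery-time components listed on this slide and measures the recovery point as the age of the newest usable copy at the moment of failure. All durations, timestamps, and targets are hypothetical values chosen for illustration.

```python
from datetime import datetime, timedelta


def recovery_time(fault_detection: timedelta,
                  data_recovery: timedelta,
                  app_restart: timedelta) -> timedelta:
    """Total recovery time: detect the fault, recover data, bring apps online."""
    return fault_detection + data_recovery + app_restart


def recovery_point(last_good_copy: datetime, failure: datetime) -> timedelta:
    """Age of the newest usable copy when the failure occurs (data at risk)."""
    return failure - last_good_copy


# Hypothetical targets and outage, for illustration only.
rto = timedelta(hours=4)
rpo = timedelta(hours=24)

actual_rt = recovery_time(timedelta(minutes=10),
                          timedelta(hours=3),
                          timedelta(minutes=30))
actual_rp = recovery_point(datetime(2007, 3, 1, 2, 0),
                           datetime(2007, 3, 1, 11, 30))

print("Recovery time:", actual_rt, "meets RTO:", actual_rt <= rto)
print("Recovery point:", actual_rp, "meets RPO:", actual_rp <= rpo)
```

RPO is a statement about how much data may be lost, while RTO is a statement about how long the outage may last. The two are measured independently, so a solution can meet one and miss the other.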
Disaster Recovery versus Disaster Restart
Most business critical applications have some level of data interdependencies
Disaster recovery
– Restoring a previous copy of data and applying logs to that copy to bring it to a known point of consistency
– Generally implies the use of backup technology
– Data copied to tape and then shipped off-site
– Requires manual intervention during the restore and recovery processes
Disaster restart
– Process of restarting mirrored consistent copies of data and applications
– Allows restart of all participating DBMS to a common point of consistency utilizing automated application of recovery logs during DBMS initialization
– The restart time is comparable to the length of time required for the application to restart after a power failure
Disruptors of Data Availability
Disaster (<1% of Occurrences)
– Natural or man-made: flood, fire, earthquake; contaminated building
Unplanned Occurrences (13% of Occurrences)
– Failure: database corruption, component failure, human error
Planned Occurrences (87% of Occurrences)
– Competing workloads: backup, reporting, data warehouse extracts, application and data restore
Source: Gartner, Inc.
Causes of Downtime
Human Error
System Failure
Infrastructure Failure
Disaster
Business Continuity vs. Disaster Recovery
Business Continuity has a broad focus on prevention:
– Predictive techniques to identify risks
– Procedures to maintain business functions
Disaster Recovery focuses on the activities that occur after an adverse event to return the entity to ‘normal’ functioning
Business Continuity Planning (BCP)
Includes the following activities:
Identifying the mission or critical business functions
Collecting data on current business processes
Assessing, prioritizing, mitigating, and managing risk
– Risk Analysis
– Business Impact Analysis (BIA)
Designing and developing contingency plans and a disaster recovery plan (DR Plan)
Training, testing, and maintenance
Business Continuity Planning Lifecycle
[Figure: lifecycle diagram with the stages Objectives; Analysis; Design, Develop; Train, Test, and Document; Implement, Maintain, and Assess.]
Business Impact Analysis (BIA)
#  Business Area Affected  Impact (1-5)  Probability (1-5)  Single Loss Expectancy  # Events p/y  Loss p/y  Est. Cost of Mitigation  High Risk SPOF Item
1  Entire Company          5             1                  $279,056                0.25          $69,517   $5,800                   No redundant UPS for networking/phone equip
2  Entire Company          5             1                  $279,066                0.2           $55,768   $66,456                  Cisco net backbone switch not redundant
3  Entire Company          5             1                  $279,098                0.2           $55,619   $10,000                  Relocate net equip to a separate physical rack
4  IT-All                  4             3                  $16,000                 1.0           $18,000   $80,000                  Primary dev platforms don’t have failover
5  Entire Company          4             3                  $16,000                 0.5           $8,000    $122,000                 Computer room does not have sufficient UPS capacity to run on single unit
6  IT-Intranet/B2B         2             1                  $400                    1.0           $1,800    $5,000                   No failover for development webserver
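A common way to relate the columns of a table like this is to multiply the single loss expectancy by the expected number of events per year, then compare the resulting annual loss with the cost of mitigation. The sketch below uses that conventional definition; it is an assumption about how such a table is usually built, and the Loss p/y column above does not always equal this product exactly. The inputs are taken from item 5, where it does.

```python
def annual_loss(single_loss_expectancy: float, events_per_year: float) -> float:
    """Expected loss per year: cost of one event x expected events per year."""
    return single_loss_expectancy * events_per_year


def payback_years(mitigation_cost: float, loss_per_year: float) -> float:
    """Years of avoided loss needed to cover the cost of mitigation."""
    return mitigation_cost / loss_per_year


# Item 5 from the table: SLE $16,000, 0.5 events per year, mitigation $122,000.
loss = annual_loss(16_000, 0.5)
print(f"Loss per year: ${loss:,.0f}")
print(f"Payback period: {payback_years(122_000, loss):.2f} years")
```

A long payback period does not by itself rule out a mitigation. The impact and probability scores in the table carry business judgment that a single ratio does not capture.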
Identifying Single Points of Failure
[Figure labels: User & Application Clients, IP, Primary Node]
HBA Failures
[Figure labels: Host, HBA, Switch, Port, Storage]
Configure multiple HBAs, and use multi-pathing software
Protects against HBA failure
Can provide improved performance (vendor dependent)
Switch/Storage Array Port Failures
[Figure labels: Host, HBA, Switch, Port, Storage]
Configure multiple switches
Make the devices available via multiple storage array ports
Disk Failures
[Figure labels: Host, HBA, Switch, Port, Storage]
Use some level of RAID
Host Failures
[Figure labels: Host, HBA, Switch, Port, Storage]
Host clustering protects against production host failures
Site/Storage Array Failures
[Figure labels: Host, HBA, Switch, Port, Storage]
Remote replication helps protect against either entire site or storage array failures
Resolving Single Points of Failure
[Figure labels: User & Application Clients, IP, Redundant Network, Primary Node, Failover Node, Clustering Software, Keep Alive, Redundant Paths, Switches, Redundant Disks (RAID 1/RAID 5), Storage Array, Redundant Site]
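One simple way to reason about single points of failure is to list every path from the application to its data and look for components that appear in all of them. The sketch below does this with set intersection; the component names and topologies are hypothetical, and a real environment would also include power, network, and site dependencies.

```python
def single_points_of_failure(paths):
    """Return the components that every path depends on."""
    if not paths:
        return set()
    common = set(paths[0])
    for path in paths[1:]:
        common &= set(path)
    return common


# Hypothetical topology with one HBA, one switch, and one array port.
single_path = [["host", "hba0", "switch0", "port0", "array"]]

# The same host after adding a second HBA, switch, and array port.
redundant_paths = [
    ["host", "hba0", "switch0", "port0", "array"],
    ["host", "hba1", "switch1", "port1", "array"],
]

print(sorted(single_points_of_failure(single_path)))
print(sorted(single_points_of_failure(redundant_paths)))
```

With redundant paths, only the host and the array remain common to every path. Those are the components that host clustering and remote replication address in the preceding slides.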
Business Continuity Technology Solutions
Local Replication
Remote Replication
Backup/Restore
Local Replication
Data from the production devices is copied over to a set of target (replica) devices within the same array
After some time, the replica devices will contain data identical to that on the production devices
Copying of data can then be halted. From this point in time, the replica devices can be used independently of the production devices
The replicas can then be used for restore operations in the event of data corruption or other events
Alternatively, the data from the replica devices can be copied to tape. This off-loads the burden of backup from the production devices
Remote Replication
Data from the production devices is copied over to a set of target (replica) devices on a different array at some distance away
Target devices can be kept continuously synchronized with the production devices
In the event of a failure of the production devices, applications can continue to run from the target devices
Backup/Restore
Backup to tape has been the predominant method for ensuring data availability and business continuity
Low-cost, high-capacity disk drives are now being used for backup to disk. This considerably speeds up the backup and restore processes
Frequency of backup will be dictated by defined RPO/RTO requirements as well as the rate of change of data
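The link between backup frequency, RPO, and rate of change can be sketched numerically. The example below assumes that a failure just before the next backup loses roughly one full interval of changes, and it ignores the time the backup itself takes. The function names and all numbers are hypothetical.

```python
def meets_rpo(backup_interval_hours: float, rpo_hours: float) -> bool:
    """True if the worst-case loss (about one backup interval) fits within the RPO."""
    return backup_interval_hours <= rpo_hours


def data_at_risk_gb(change_rate_gb_per_hour: float,
                    backup_interval_hours: float) -> float:
    """Approximate volume of changes made since the last backup, at worst."""
    return change_rate_gb_per_hour * backup_interval_hours


# Hypothetical requirement: RPO of 4 hours, data changing at 1.5 GB per hour.
print(meets_rpo(backup_interval_hours=24, rpo_hours=4))
print(meets_rpo(backup_interval_hours=2, rpo_hours=4))
print(data_at_risk_gb(change_rate_gb_per_hour=1.5, backup_interval_hours=24))
```

A higher rate of change increases the data at risk for the same interval, which is why both inputs drive how often backups need to run.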
Module Summary
Key points covered in this module:
Importance of Business Continuity
Types of outages and their impact on businesses
Business Continuity Planning and Disaster Recovery
Definitions of RPO and RTO
Difference between Disaster Recovery and Disaster Restart
Identifying and eliminating Single Points of Failure
Check Your Knowledge
Which concerns do business continuity solutions address?
What is the difference between RPO and RTO?
What is the difference between Disaster Recovery and Disaster Restart?
What are some of the Single Points of Failure in a typical data center environment?
How can the loss of a storage array port be mitigated?
Apply Your Knowledge
After completing this topic, you will be able to:
Describe EMC PowerPath
Discuss the features and benefits of PowerPath in storage environments
Explain how PowerPath achieves transparent recovery
What is EMC PowerPath?
[Figure: open systems host software stack. Labels: Applications, DBMS, Management Utils, File System, Logical Volume Manager, PowerPath, SCSI Driver (x6), SCSI Controller (x6), Server, Interconnect Topology, Storage]
Host Based Software
Resides between application and SCSI device driver
Provides Intelligent I/O path management
Transparent to the application
Automatic detection and recovery from host-to-array path failures
PowerPath Features
PowerPath Delivers:
Multiple paths, for higher availability and performance
Dynamic multipath load balancing
Proactive path testing and automatic path recovery
Automatic path failover
Online path configuration and management
High-availability cluster support
PowerPath Configuration
All volumes are accessible through all paths
Maximum 32 paths to a logical volume
Interconnect support for
– SAN
– SCSI
– iSCSI
[Figure labels: Host Application(s), PowerPath, SD (SCSI Driver), HBA (Host Bus Adapter), Server, Interconnect Topology, Storage]
The PowerPath Filter Driver
Platform independent base driver
Applications direct I/O to PowerPath
PowerPath directs I/O to optimal path based on current workload and path availability
When a path fails PowerPath chooses another path in the set
[Figure labels: Host Application(s), PowerPath Filter Driver, SD (SCSI Driver), HBA (Host Bus Adapter), Server, Interconnect Topology, Storage]
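The idea of directing each I/O to the best available path can be illustrated with a generic selection rule. The sketch below is not PowerPath's algorithm or API. It assumes a hypothetical `Path` record and picks the usable path with the fewest outstanding I/Os, which is one simple way to account for both current workload and path availability.

```python
from dataclasses import dataclass


@dataclass
class Path:
    """Hypothetical record for one host-to-storage path."""
    name: str
    alive: bool = True
    pending_ios: int = 0


def choose_path(paths):
    """Pick the usable path with the fewest outstanding I/Os."""
    candidates = [p for p in paths if p.alive]
    if not candidates:
        raise RuntimeError("no path available to the volume")
    return min(candidates, key=lambda p: p.pending_ios)


paths = [Path("hba0", pending_ios=5),
         Path("hba1", pending_ios=2),
         Path("hba2", pending_ios=7)]

print(choose_path(paths).name)  # least-loaded path

paths[1].alive = False          # simulate a failure of that path
print(choose_path(paths).name)  # selection moves to another path in the set
```

Because the application only ever calls the selection layer, it does not need to know which physical path carried a given request.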
Path Fault without PowerPath
In most environments, a host will have multiple paths to the Storage System
Volumes are spread across all available paths
Each volume has a single path
Host adapter and cable connections are single points of failure
Workload not balanced among all paths
[Figure labels: Host Application(s), SD (SCSI Driver) x4, HBA (Host Bus Adapter) x4, Server, Interconnect Topology, Storage]
Path Fault with PowerPath
If a host adapter, cable, or channel director/Storage Processor fails, the device driver returns a timeout to PowerPath
PowerPath responds by taking the path offline and re-driving I/O through an alternate path
Subsequent I/Os use surviving path(s)
Application is unaware of failure
[Figure labels: Host Application(s), PowerPath, SD (SCSI Driver), HBA (Host Bus Adapter), Server, Interconnect Topology, Storage]
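The failover sequence on this slide (timeout, path taken offline, I/O re-driven, later I/Os on surviving paths) can be sketched generically. This is an illustration of the pattern only, not PowerPath code. `PathTimeout`, `submit_io`, and `flaky_send` are hypothetical names, and the sketch does not model path testing or automatic restoration.

```python
class PathTimeout(Exception):
    """Raised by the (hypothetical) device driver when a path does not respond."""


def submit_io(request, path_online, send):
    """Send a request down the first online path.

    On a timeout, mark that path offline and re-drive the same request
    down the next online path, so the caller never sees the failure.
    """
    for name, online in list(path_online.items()):
        if not online:
            continue
        try:
            return send(name, request)
        except PathTimeout:
            path_online[name] = False  # take the failed path offline
    raise RuntimeError("no surviving path to the storage system")


def flaky_send(path_name, request):
    """Stand-in driver: the path through hba0 always times out."""
    if path_name == "hba0":
        raise PathTimeout(path_name)
    return f"{request} completed via {path_name}"


paths = {"hba0": True, "hba1": True}
print(submit_io("read block 42", paths, flaky_send))
print(paths)  # hba0 is now offline; later requests skip it
```

The caller receives a normal completion even though the first attempt failed, which is the sense in which the application is unaware of the fault.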
Summary
Key points covered in this topic:
PowerPath is server-based software that provides multiple paths between the host bus adapter and the Storage Subsystem
– Redundant paths eliminate host adapter, cable connection, and channel adapters as single points of failure and increase availability
– Improves performance by dynamically balancing the workload across all available paths
– Application transparent
Enhances data availability and accessibility