© 2007 EMC Corporation. All rights reserved. Business Continuity Overview Module 4.1

Upload: phyllis-lindsey

Post on 14-Dec-2015

Business Continuity Overview

Module 4.1

Business Continuity Overview

After completing this module, you will be able to:

Define and differentiate between Business Continuity and Disaster Recovery

Differentiate between Disaster Recovery and Disaster Restart

Define terminology such as Recovery Point Objective and Recovery Time Objective

Describe Business Continuity Planning at a high level

Identify Single Points of Failure and describe solutions to eliminate them

What is Business Continuity?

Business Continuity is the preparation for, response to, and recovery from an application outage that adversely affects business operations

Business Continuity Solutions address systems unavailability, degraded application performance, or unacceptable recovery strategies

Why Business Continuity

Know the downtime costs (per hour, day, two days...)

Lost Productivity
• Number of employees impacted (x hours out * hourly rate)

Lost Revenue
• Direct loss
• Compensatory payments
• Lost future revenue
• Billing losses
• Investment losses

Damaged Reputation
• Customers
• Suppliers
• Financial markets
• Banks
• Business partners

Financial Performance
• Revenue recognition
• Cash flow
• Lost discounts (A/P)
• Payment guarantees
• Credit rating
• Stock price

Other Expenses
• Temporary employees, equipment rental, overtime costs, extra shipping costs, travel expenses...
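
The lost-productivity figure on this slide (employees impacted x hours out * hourly rate) is simple arithmetic. A minimal sketch follows; the function name and the sample numbers are illustrative assumptions, not values from the course material.

```python
def lost_productivity_cost(employees_impacted: int,
                           hours_out: float,
                           hourly_rate: float) -> float:
    """Cost of idle staff during an outage:
    employees impacted * hours out * hourly rate."""
    return employees_impacted * hours_out * hourly_rate


# Hypothetical outage: 200 employees idle for 4 hours at $35 per hour.
cost = lost_productivity_cost(employees_impacted=200, hours_out=4, hourly_rate=35.0)
print(f"Estimated lost productivity: ${cost:,.2f}")
```

This covers only the productivity term; lost revenue, reputation damage, and other expenses would need their own estimates.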

Information Availability

% Uptime    % Downtime   Downtime per Year   Downtime per Week
98%         2%           7.3 days            3 hrs 22 min
99%         1%           3.65 days           1 hr 41 min
99.8%       0.2%         17 hrs 31 min       20 min 10 sec
99.9%       0.1%         8 hrs 45 min        10 min 5 sec
99.99%      0.01%        52.5 min            1 min
99.999%     0.001%       5.25 min            6 sec
99.9999%    0.0001%      31.5 sec            0.6 sec
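
The downtime columns follow directly from the uptime percentage. The sketch below reproduces that arithmetic, assuming a 365-day year and a 7-day week (which is what the table's figures imply); the function and constant names are illustrative.

```python
HOURS_PER_YEAR = 365 * 24   # assumes a non-leap year
HOURS_PER_WEEK = 7 * 24


def downtime_hours(uptime_percent: float, period_hours: float) -> float:
    """Hours of downtime allowed in a period at a given uptime percentage."""
    return (100.0 - uptime_percent) / 100.0 * period_hours


for uptime in (98.0, 99.0, 99.9, 99.999):
    per_year = downtime_hours(uptime, HOURS_PER_YEAR)
    per_week = downtime_hours(uptime, HOURS_PER_WEEK)
    print(f"{uptime}% uptime: {per_year:.2f} h/year, {per_week * 60:.2f} min/week")
```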

Importance of Business Continuity and Planning

Millions of US Dollars per Hour in Lost Revenue

Retail brokerage                   6.5
Point of sale                      3.6
Energy                             2.8
Credit card sales authorization    2.6
Telecommunications                 2.0
Call location                      1.6
Manufacturing                      1.6
Financial institutions             1.5
Information technology             1.3
Insurance                          1.2
Retail                             1.1

Source: Meta Group, 2005

Recovery Point Objective (RPO)

[Figure: recovery-point axis running from weeks to seconds (Wks, Days, Hrs, Mins, Secs), with Tape Backup, Periodic Replication, Asynchronous Replication, and Synchronous Replication placed along it in that order.]

Recovery Time Objective (RTO)

Recovery Time includes:

Fault detection

Recovering data

Bringing apps back online

[Figure: recovery-time axis running from seconds to weeks (Secs, Mins, Hrs, Days, Wks), with Global Cluster, Manual Migration, and Tape Restore placed along it in that order.]
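
Because recovery time is the sum of fault detection, data recovery, and bringing applications back online, a plan can be checked against its RTO by adding the three durations. The sketch below assumes each step can be estimated on its own; the function names and the sample durations are hypothetical.

```python
from datetime import timedelta


def total_recovery_time(fault_detection: timedelta,
                        data_recovery: timedelta,
                        app_restart: timedelta) -> timedelta:
    """Recovery time = fault detection + recovering data + bringing apps online."""
    return fault_detection + data_recovery + app_restart


def meets_rto(recovery_time: timedelta, rto: timedelta) -> bool:
    """True when the estimated recovery time fits within the objective."""
    return recovery_time <= rto


# Hypothetical estimates for a tape-restore scenario.
estimate = total_recovery_time(
    fault_detection=timedelta(minutes=10),
    data_recovery=timedelta(hours=2),
    app_restart=timedelta(minutes=45),
)
print(estimate, meets_rto(estimate, rto=timedelta(hours=4)))
```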

Disaster Recovery versus Disaster Restart

Most business critical applications have some level of data interdependencies

Disaster recovery
– Restoring previous copy of data and applying logs to that copy to bring it to a known point of consistency
– Generally implies the use of backup technology
– Data copied to tape and then shipped off-site
– Requires manual intervention during the restore and recovery processes

Disaster restart
– Process of restarting mirrored consistent copies of data and applications
– Allows restart of all participating DBMS to a common point of consistency utilizing automated application of recovery logs during DBMS initialization
– The restart time is comparable to the length of time required for the application to restart after a power failure

Disruptors of Data Availability

Disaster (<1% of Occurrences)
• Natural or man made
– Flood, fire, earthquake
– Contaminated building

Unplanned Occurrences (13% of Occurrences)
• Failure
– Database corruption
– Component failure
– Human error

Planned Occurrences (87% of Occurrences)
• Competing workloads
– Backup, reporting
– Data warehouse extracts
– Application and data restore

Source: Gartner, Inc.

Causes of Downtime

Human Error

System Failure

Infrastructure Failure

Disaster

Business Continuity vs. Disaster Recovery

Business Continuity has a broad focus on prevention:
– Predictive techniques to identify risks
– Procedures to maintain business functions

Disaster Recovery focuses on the activities that occur after an adverse event to return the entity to ‘normal’ functioning

Business Continuity Planning (BCP)

Includes the following activities:

Identifying the mission or critical business functions

Collecting data on current business processes

Assessing, prioritizing, mitigating, and managing risk
– Risk Analysis
– Business Impact Analysis (BIA)

Designing and developing contingency plans and a disaster recovery plan (DR Plan)

Training, testing, and maintenance

Business Continuity Planning Lifecycle

[Figure: lifecycle diagram with the stages Objectives; Analysis; Design, Develop; Implement, Maintain, and Assess; Train, Test, and Document.]

Business Impact Analysis (BIA)

#  Business Area Affected  Impact (1-5)  Probability (1-5)  Single Loss Expectancy  # Events p/y  Loss p/y  Est. Cost of Mitigation  High Risk SPOF Item
1  Entire Company          5             1                  $279,056                0.25          $69,517   $5,800                   No redundant UPS for Networking/phone equip
2  Entire Company          5             1                  $279,066                0.2           $55,768   $66,456                  Cisco net backbone switch not redundant
3  Entire Company          5             1                  $279,098                0.2           $55,619   $10,000                  Relocate net equip to a separate physical rack
4  IT-All                  4             3                  $16,000                 1.0           $18,000   $80,000                  Primary dev platforms don’t have failover
5  Entire Company          4             3                  $16,000                 0.5           $8,000    $122,000                 Computer room does not have sufficient UPS capacity to run on single unit
6  IT-Intranet/B2B         2             1                  $400                    1.0           $1,800    $5,000                   No failover for development webserver
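
A common way to arrive at a yearly loss figure is to multiply the single loss expectancy by the expected number of events per year. The sketch below assumes the Loss p/y column is meant to follow that relationship, as row 5 does ($16,000 * 0.5 = $8,000); several other rows do not match the product exactly, so treat the formula as an assumption about how the column was derived. The class name and the payback helper are illustrative and not part of the course material.

```python
from dataclasses import dataclass


@dataclass
class SpofItem:
    name: str
    single_loss_expectancy: float  # cost of one occurrence, in dollars
    events_per_year: float         # expected occurrences per year
    mitigation_cost: float         # estimated one-time cost to remove the SPOF

    def annual_loss(self) -> float:
        """Assumed relationship: loss per year = SLE * events per year."""
        return self.single_loss_expectancy * self.events_per_year

    def payback_years(self) -> float:
        """Hypothetical helper: years of avoided loss needed to cover mitigation."""
        return self.mitigation_cost / self.annual_loss()


# Row 5 of the table: insufficient UPS capacity in the computer room.
ups = SpofItem("Insufficient UPS capacity", 16_000, 0.5, 122_000)
print(ups.annual_loss(), ups.payback_years())
```

Comparing annual loss with mitigation cost in this way is one simple basis for ranking which single points of failure to address first.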

Identifying Single Points of Failure

[Figure: user and application clients connected over IP to a primary node.]

HBA Failures

[Figure: host with HBAs, a switch, and storage with ports.]

Configure multiple HBAs, and use multi-pathing software

Protects against HBA failure

Can provide improved performance (vendor dependent)

Switch/Storage Array Port Failures

[Figure: host with HBAs, a switch with ports, and storage with ports.]

Configure multiple switches

Make the devices available via multiple storage array ports

Disk Failures

[Figure: host with HBAs, a switch with ports, and storage with ports.]

Use some level of RAID

Host Failures

[Figure: two hosts with HBAs, a switch with ports, and storage with ports.]

Host clustering protects against production host failures

Site/Storage Array Failures

[Figure: host with HBAs, a switch with ports, and two storage arrays.]

Remote replication helps protect against either entire site or storage array failures

Resolving Single Points of Failure

[Figure: resolved configuration showing user and application clients, a redundant IP network, a primary node and a failover node linked by clustering software with a keep-alive, redundant paths through switches, redundant disks (RAID 1/RAID 5), and storage arrays at the primary site and a redundant site.]

Business Continuity Technology Solutions

Local Replication

Remote Replication

Backup/Restore

Local Replication

Data from the production devices is copied over to a set of target (replica) devices within the same array

After some time, the replica devices contain data identical to that on the production devices

Copying of data can then be halted. From that point in time, the replica devices can be used independently of the production devices

The replicas can then be used for restore operations in the event of data corruption or other events

Alternatively, the data from the replica devices can be copied to tape. This off-loads the burden of backup from the production devices
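
The sequence on this slide (copy, reach an identical state, halt copying, then use the replica independently) can be sketched with in-memory lists standing in for devices. This is a toy model under that assumption, not how any array implements replication; the class and method names are hypothetical.

```python
class LocalReplica:
    """Toy model of a production device and its replica in the same array."""

    def __init__(self, production: list):
        self.production = production
        self.replica = [None] * len(production)
        self.synchronizing = True

    def sync(self) -> None:
        """Copy production data to the replica while copying is active."""
        if self.synchronizing:
            self.replica[:] = self.production

    def split(self) -> None:
        """Halt copying; the replica becomes a point-in-time copy."""
        self.sync()
        self.synchronizing = False

    def restore(self) -> None:
        """Copy the point-in-time replica back over production."""
        self.production[:] = self.replica


production = ["block-a", "block-b", "block-c"]
pair = LocalReplica(production)
pair.split()                      # replica now frozen at this point in time
production[1] = "CORRUPTED"       # later corruption does not reach the replica
pair.restore()
print(production)
```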

Remote Replication

Data from the production devices is copied over to a set of target (replica) devices on a different array at some distance away

Target devices can be kept continuously synchronized with the production devices

In the event of a failure of the production devices, applications can continue to run from the target devices

Backup/Restore

Backup to tape has been the predominant method for ensuring data availability and business continuity

Low-cost, high-capacity disk drives are now being used for backup to disk. This considerably speeds up both the backup and the restore process

Frequency of backup will be dictated by defined RPO/RTO requirements as well as the rate of change of data
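
One way to see how RPO and rate of change drive backup frequency: if the worst-case data loss equals the time since the last backup, the interval between backups cannot exceed the RPO. The sketch below rests on that simplifying assumption; the function names and figures are illustrative.

```python
import math


def min_backups_per_day(rpo_hours: float) -> int:
    """Fewest evenly spaced backups per day that keep the interval within the RPO."""
    return math.ceil(24 / rpo_hours)


def data_at_risk_gb(change_rate_gb_per_hour: float, backup_interval_hours: float) -> float:
    """Data changed since the last backup, i.e. what could be lost in the worst case."""
    return change_rate_gb_per_hour * backup_interval_hours


# Hypothetical requirement: RPO of 6 hours, data changing at 2 GB per hour.
print(min_backups_per_day(6), data_at_risk_gb(2.0, 6))
```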

Module Summary

Key points covered in this module:

Importance of Business Continuity

Types of outages and their impact on businesses

Business Continuity Planning and Disaster Recovery

Definitions of RPO and RTO

Difference between Disaster Recovery and Disaster Restart

Identifying and eliminating Single Points of Failure

Check Your Knowledge

Which concerns do business continuity solutions address?

What is the difference between RPO and RTO?

What is the difference between Disaster Recovery and Disaster Restart?

What are some of the Single Points of Failure in a typical data center environment?

How can the loss of a storage array port be mitigated?

Apply Your Knowledge

After completing this topic, you will be able to:

Describe EMC PowerPath

Discuss the features and benefits of PowerPath in storage environments

Explain how PowerPath achieves transparent recovery

What is EMC PowerPath?

[Figure: open systems host stack, with Applications, DBMS, Management Utils, File System, and Logical Volume Manager above PowerPath, and multiple SCSI drivers and SCSI controllers below it, connected from server to storage through the interconnect topology.]

Host Based Software

Resides between application and SCSI device driver

Provides Intelligent I/O path management

Transparent to the application

Automatic detection and recovery from host-to-array path failures

PowerPath Features

PowerPath Delivers:

Multiple paths, for higher availability and performance

Dynamic multipath load balancing

Proactive path testing and automatic path recovery

Automatic path failover

Online path configuration and management

High-availability cluster support

PowerPath Configuration

All volumes are accessible through all paths

Maximum 32 paths to a logical volume

Interconnect support for
– SAN
– SCSI
– iSCSI

[Figure: host application(s) above PowerPath, which sits above several SCSI drivers (SD) and host bus adapters (HBA), connected from server to storage through the interconnect topology.]

The PowerPath Filter Driver

Platform independent base driver

Applications direct I/O to PowerPath

PowerPath directs I/O to optimal path based on current workload and path availability

When a path fails, PowerPath chooses another path in the set
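
The idea of directing I/O by workload and path availability can be illustrated with a simple selection rule: among the paths that are still alive, pick the one with the fewest outstanding I/Os. This is a generic sketch and not PowerPath's actual algorithm, which the slide does not describe; `Path`, `choose_path`, and `outstanding_ios` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Path:
    name: str
    alive: bool = True
    outstanding_ios: int = 0   # stand-in for "current workload"


def choose_path(paths: list) -> Path:
    """Pick the live path with the least outstanding I/O."""
    candidates = [p for p in paths if p.alive]
    if not candidates:
        raise RuntimeError("no live path to the volume")
    return min(candidates, key=lambda p: p.outstanding_ios)


paths = [
    Path("hba0", outstanding_ios=5),
    Path("hba1", outstanding_ios=2),
    Path("hba2", alive=False),
]
print(choose_path(paths).name)   # least-loaded live path
paths[1].alive = False           # simulate a path failure
print(choose_path(paths).name)   # selection moves to a surviving path
```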

[Figure: host application(s) above the PowerPath filter driver, which sits above several SCSI drivers (SD) and host bus adapters (HBA), connected from server to storage through the interconnect topology.]

Path Fault without PowerPath

In most environments, a host will have multiple paths to the Storage System

Volumes are spread across all available paths

Each volume has a single path

Host adapter and cable connections are single points of failure

Workload is not balanced among all paths

[Figure: host application(s) above separate SCSI drivers (SD), each with its own host bus adapter (HBA), connected from server to storage through the interconnect topology.]

Path Fault with PowerPath

If a host adapter, cable, or channel director/Storage Processor fails, the device driver returns a timeout to PowerPath

PowerPath responds by taking the path offline and re-driving I/O through an alternate path

Subsequent I/Os use surviving path(s)

Application is unaware of failure
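
The failover behavior on this slide (a timeout from the device driver, the path taken offline, and the I/O re-driven on an alternate path) can be sketched as a retry loop. The code below is a simplified illustration under assumed names (`PathTimeout`, `MultipathDevice`, and the callables standing in for device drivers); it does not represent PowerPath internals.

```python
class PathTimeout(Exception):
    """Raised by a (simulated) device driver when a path does not respond."""


class MultipathDevice:
    def __init__(self, paths: dict):
        # Maps a path name to a callable that sends one I/O request down that path.
        self.paths = dict(paths)
        self.offline = set()

    def submit(self, request: str) -> str:
        """Send the request on the first usable path, failing over on timeout."""
        for name, send in self.paths.items():
            if name in self.offline:
                continue
            try:
                return send(request)
            except PathTimeout:
                self.offline.add(name)   # take the failed path offline
        raise RuntimeError("all paths to the device have failed")


def failed_path(request: str) -> str:
    raise PathTimeout()


def healthy_path(request: str) -> str:
    return f"ok:{request}"


device = MultipathDevice({"hba0": failed_path, "hba1": healthy_path})
print(device.submit("read block 7"))   # caller sees a normal completion
print(device.offline)                  # the failed path is now offline
```

The caller of `submit` only sees the successful result, which mirrors the point that the application is unaware of the failure.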

[Figure: host application(s) above PowerPath, which sits above several SCSI drivers (SD) and host bus adapters (HBA), connected from server to storage through the interconnect topology.]

Summary

Key points covered in this topic:

PowerPath is server-based software that provides multiple paths between the host bus adapter and the Storage Subsystem

– Redundant paths eliminate host adapter, cable connection, and channel adapters as single points of failure and increase availability

– Improves performance by dynamically balancing the workload across all available paths

– Application transparent

Enhances data availability and accessibility