© 2008 Verizon. All Rights Reserved. PTEXXXXX XX/08
GLOBAL CAPABILITY.PERSONAL ACCOUNTABILITY.
Technical Consideration in Disaster Recovery
Daniel J. Morris
Business Solutions Consultant
November 2008
"And to think... those wimps at the power company use straps and cleats to get up this high!"

"Jack stands? Hah! Who needs 'em?"

I'm sure this guy still wonders why he got fired over this!

Step 1: Remove shoes
Step 2: Place metal ladder in water
Step 3: Begin using power tools while standing barefoot on metal ladder in water

Topics
• The Business Drivers of BCDR
• Protecting the Information: Remote Mirroring
• Network Impact: Topology & Protocol Considerations
• The Verizon Storage Practice: Putting the Pieces Together
The Business Drivers
Protecting Information
Network Impact
VZ Storage Practice
Business Continuance & Disaster Recovery are Different
Business Continuance | Disaster Recovery
Focus is on avoiding disruption | Focus is on recovering from disruption
Deploy local and remote technology | Deploy primarily remote technology
Zero downtime can be achieved | Close to zero downtime can be achieved
Develop contingency plans | Develop recovery plans
Procedures to maintain business functions | Procedures to recover business functions
Train, Test, and Maintain | Train, Test, and Maintain
Business Considerations: Starts with Leadership!!
POLICY
RISK MANAGEMENT
Technologies Dictated By RPO/RTO
[Chart: Hours of Lost Transactions (RPO) and Hours Required to Resume Business (RTO) on a time axis from -24 to 84 hours, against Cost Per Month (20K, 30K, 40K, 60K, 90K, 150K, 250K).]
Technologies shown:
• Full Volume Tape Backup Nightly
• Tape Vaulting
• Database Journaling
• Consistent Recovery Restart
• Asynchronous "Point in Time" Copy
• Continuous Asynchronous
• Synchronous Mirror
Recovery phases shown: Transactions Not Captured, Declaration, Transaction Recreation, Data Retrieval, Transit, System Restore, IPL & Network, Database Restore
The Business Drivers
Protecting Information
Network Impacts
VZ Storage Practice
Tiered Storage: The First Step in Protecting Data
Tier 1: High-end remote replication
– Availability (unplanned downtime): Seconds to minutes
– Recovery point: Seconds
– Performance (workload): Dynamic workload; highest transaction volume
Tier 2: High-end/midrange Fibre Channel disk, local replication
– Availability (unplanned downtime): Minutes to hours
– Recovery point: Seconds to minutes
– Performance (workload): High performance for constant workloads
Tier 3: ATA, low-cost Fibre Channel
– Availability (unplanned downtime): Hours
– Recovery point: Minutes to hours
– Performance (workload): Moderate performance; primarily read access
Tier 4: CAS
– Availability (unplanned downtime): Hours
– Recovery point: Up to 24 hours
– Performance (workload): Internet performance; primarily read access
Tier 5: Tape
– Availability (unplanned downtime): Hours to days
– Recovery point: Up to 72 hours
– Performance (workload): N/A
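A minimal sketch of how the recovery-point figures on this slide could drive tier selection. The numeric bounds for "Seconds", "Seconds to minutes", and "Minutes to hours" are assumptions made for illustration (the slide gives only the 24-hour and 72-hour figures), and the names `Tier`, `TIERS`, and `pick_tier` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    worst_case_rpo_s: int  # upper bound on data loss, in seconds


HOUR = 3600

# Tier 4 and Tier 5 bounds come from the slide; Tiers 1 to 3 are ASSUMED
# numeric readings of "Seconds", "Seconds to minutes", "Minutes to hours".
TIERS = [
    Tier("Tier 1", 60),         # assumed: under one minute
    Tier("Tier 2", 1 * HOUR),   # assumed: under one hour
    Tier("Tier 3", 12 * HOUR),  # assumed: under twelve hours
    Tier("Tier 4", 24 * HOUR),  # slide: up to 24 hours
    Tier("Tier 5", 72 * HOUR),  # slide: up to 72 hours
]


def pick_tier(required_rpo_s: int) -> str:
    """Return the least protective tier that still meets the required RPO."""
    for tier in reversed(TIERS):  # try Tier 5 first, Tier 1 last
        if tier.worst_case_rpo_s <= required_rpo_s:
            return tier.name
    return TIERS[0].name  # tighter than every bound: fall back to Tier 1


if __name__ == "__main__":
    for hours in (72, 24, 1):
        print(hours, "h ->", pick_tier(hours * HOUR))
```

Walking from Tier 5 upward means data is only promoted to a more protective tier when its RPO demands it.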
Protection Level Distinctions
PARAMETER | BACK-UP | ARCHIVE | MIRRORING
DATA TYPE | Secondary Copy | Primary Copy | Secondary Copy
RETENTION DURATION | Long Term Overwritten Data | Long Term Retention | Short Term Retention
DATA ACCESS LAYER | File Level Access | Block and/or File* | Block Level
RPO/RTO CHARACTERISTIC | Long RPO/RTO* | UNPROTECTED | Typically Short RPO/RTO
CONTROLLER MECHANISM | Appliance, VTL, Library | Host, Appliance, Various Platforms | Host, Appliance, Array
MEDIA | Tape/Disk/CD | Tape/Disk/CD | Disk
RTO/RPO Drivers for Remote Protection Schemas
• RTOs/RPOs coupled w/ Geographic Diversity dictate:
– Level Of Protection
– Level Of Network Requirements
– Level Of Application Requirements
TIER 1 SYNCH MIRROR
TIER 2 ASYNCH MIRROR
Tape
TIER 3 BACKUP & OFF
Mirroring Considerations: Storage Systems
• Controller Type:
– HOST BASED
– APPLIANCE BASED
– ARRAY BASED & TYPE
• Mirroring Methodology:
– ASYNCHRONOUS
– SYNCHRONOUS
– BLENDED
– BI-DIRECTIONAL
– MULTI-DIRECTIONAL
• Storage Topology:
– NAS
– CAS
– SAN
• ROUTING PROTOCOL:
– FCIP
– FCOE
– iSCSI
– TCP/IP
– GFP/SONET
• DISTANCE
• LATENCY THRESHOLDS (Application)
• NETWORK TOPOLOGY:
– DETERMINISTIC
– NON-DETERMINISTIC
• BANDWIDTH CONSIDERATIONS
• DATA CHANGE RATES
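Distance, latency thresholds, bandwidth, and data change rates interact, and rough numbers help when sizing a mirror. The sketch below assumes about 5 microseconds of one-way propagation delay per kilometre of fibre (a common rule of thumb, not a figure from these slides), ignores equipment and protocol overhead, and uses hypothetical function names.

```python
FIBER_US_PER_KM = 5.0  # ASSUMED rule of thumb: one-way delay in optical fibre


def sync_round_trip_ms(distance_km: float, round_trips: int = 1) -> float:
    """Propagation delay added to each synchronous write, in milliseconds.

    A synchronous mirror waits for the remote acknowledgement, so every
    write pays at least one full round trip over the link.
    """
    one_way_us = distance_km * FIBER_US_PER_KM
    return 2 * one_way_us * round_trips / 1000.0


def average_mbps(changed_gb_per_day: float) -> float:
    """Average link rate (Mb/s) needed to ship one day's changed data.

    Uses decimal units (1 GB = 8000 Mb) and a flat 24-hour profile; real
    workloads are bursty, so peak demand will be higher.
    """
    megabits = changed_gb_per_day * 8000.0
    return megabits / 86400.0


if __name__ == "__main__":
    print("100 km sync penalty (ms):", sync_round_trip_ms(100))
    print("1080 GB/day average (Mb/s):", average_mbps(1080))
```

The first function speaks to the application latency threshold for synchronous mirroring; the second gives a floor for asynchronous link sizing from the data change rate.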
Putting Tiered Info Into Context
Protection Schemas by Service Level Policy
Information Protection Services
PLAN: Service Levels and Business Requirements
Acceptable data loss | 0 seconds | Seconds to minutes | Hours | >24 hours
Business application availability | Minutes | Minutes | Hours | >24 hours
Business disruption | Very low | Low | Medium | High
CORE PLATFORMS: Alternatives, Design, and Technology Portfolio
Tiered availability
Replication
• Symmetrix: SRDF/S, SRDF/A, SRDF/AR, SRDF/DM
• CLARiiON: MirrorView/S, MirrorView/A, SAN Copy
• RecoverPoint: RecoverPoint CDP and CRR
• EMC Centera: EMC Centera Archive Replicator
• Celerra: Celerra Replicator, OnCourse
Server clustering and replication: AutoStart, VMware; AutoStart, RepliStor; RepliStor
Backup and recovery: Backup to disk (CLARiiON Disk Library, CLARiiON CX and CX3 UltraScale series (SAN), NS Series (LAN), NetWorker)
NETWORK CONNECTIVITY
RECOVERY PROVIDERS/FACILITIES
BUILD: Integration and Plan Development Services
MANAGE: Residency Services
Complete Tiered Disaster Recovery Protection
[Diagram: Production Site and Standby Site]
SCENARIO 1: Tier One Replication Requirements
SRDF Family Multi-Site Protection Options

Concurrent SRDF
• Multi-site protection leveraging a single source and concurrently replicating to two remote sites
[Diagram labels: Source, Near Site, Far Site, SRDF/S, SRDF/A]

SRDF/Star
• Multi-site protection
• Includes SRDF/A link between two remote sites to continue protection if a site fails
[Diagram labels: Source, Near Site, Far Site, SRDF/S, SRDF/A, SRDF/A]

Cascaded SRDF (New)
• Multi-site protection
• SRDF/S between Source and Near Site; SRDF/A between Near Site and Far Site
• Eliminates need for BCV cycling at Near Site; improves recovery-point objectives at Far Site
[Diagram labels: Source, Near Site, Far Site, SRDF/S, SRDF/A]
SCENARIO 2: Mid Tier Array Based Replication
CLARiiON Remote-Replication Family

MirrorView/Synchronous
RPO: Zero seconds
• Both images identical
• Limited distance
• High network bandwidth
• One primary to one or two secondaries
[Diagram: Primary to Secondary over limited distance, numbered steps 1 to 4]

MirrorView/Asynchronous
RPO: 30 minutes to hours
• Target updated periodically
• Unlimited distance
• Restartable copy on secondary if session fails
• Optimized for low network bandwidth (consumes 100 Mb/s maximum)
• One primary to one secondary
[Diagram: Primary to Secondary over unlimited distance, numbered steps 1 to 5]

SAN Copy
RPO: Hours to days
• Data mobility between tiers, CLARiiON, and qualified third-party arrays
• Disaster recovery with application coordination for restartable copy on secondary site
• Available as incremental or full modes
• Unlimited distance
• One source to multiple destinations (up to 100)
[Diagram: Primary to Secondary over unlimited distance, numbered steps 1 to 3]
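The MirrorView/Asynchronous figures on this slide (periodic updates, 100 Mb/s maximum) allow a quick feasibility check: how long one update takes to ship, and whether the link can keep pace with the change rate. This is a rough sketch in decimal units that ignores compression and protocol overhead; the function names are hypothetical.

```python
MAX_LINK_MBPS = 100.0  # slide: MirrorView/A consumes 100 Mb/s maximum


def transfer_minutes(delta_gb: float, link_mbps: float = MAX_LINK_MBPS) -> float:
    """Minutes needed to send one update of delta_gb gigabytes."""
    megabits = delta_gb * 8000.0  # 1 GB = 8000 Mb (decimal)
    return megabits / link_mbps / 60.0


def keeps_up(change_gb_per_hour: float, link_mbps: float = MAX_LINK_MBPS) -> bool:
    """True if an hour of changes can be shipped in an hour or less."""
    return transfer_minutes(change_gb_per_hour, link_mbps) <= 60.0


if __name__ == "__main__":
    print("22.5 GB update (minutes):", transfer_minutes(22.5))
    print("45 GB/hour sustainable:", keeps_up(45.0))
    print("50 GB/hour sustainable:", keeps_up(50.0))
```

When `keeps_up` returns False, the secondary falls further behind with each update cycle, so the achievable RPO grows over time.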
SCENARIO 3: Bandwidth Constrained Replication
RecoverPoint for Heterogeneous Environments
• Application-consistent recovery
• Corruption protection
• Application response time
• Existing infrastructure
• Disaster-recovery testing
• Communications cost
• Heterogeneous storage
[Diagram: Local site and Remote site, each with a SAN; applications Oracle, Exchange, SQL; storage from STK, IBM, HP, HDS, EMC]
HA Clustering: Marrying Remote Servers to Remote Storage
• Born out of Grid Computing
• Create Loose or Tightly coupled systems
• High Availability Clustering provides a heartbeat between Server "nodes".
• Typically exacting specs/configs between servers
• Geo-Clustering, coupled with resilient high speed networks and disk mirroring, is the ultimate Business Continuity solution
• AKA: Stretch, Metro, Dispersed or Extended Clustering
SYSTEM FAILURE... ATTEMPTING TO REBOOT... NTFS ERROR 14CXX011322
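The heartbeat between server nodes can be illustrated with a generic timeout check. This is a sketch of the general technique only, not how EMC AutoStart or any particular cluster product implements it; the class name, node names, and timeout value are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class HeartbeatMonitor:
    """Tracks the last heartbeat seen from each node."""

    timeout_s: float
    last_seen: Dict[str, float] = field(default_factory=dict)

    def beat(self, node: str, now: float) -> None:
        """Record a heartbeat from node at time now (seconds)."""
        self.last_seen[node] = now

    def failed_nodes(self, now: float) -> List[str]:
        """Nodes whose last heartbeat is older than the timeout."""
        return sorted(
            node
            for node, seen in self.last_seen.items()
            if now - seen > self.timeout_s
        )


if __name__ == "__main__":
    monitor = HeartbeatMonitor(timeout_s=3.0)
    monitor.beat("prod-node", now=0.0)
    monitor.beat("dr-node", now=0.0)
    monitor.beat("dr-node", now=4.0)  # prod-node has gone quiet
    print(monitor.failed_nodes(now=5.0))
```

Passing the clock in as `now` keeps the check deterministic and easy to test; a real cluster also has to guard against split-brain when only the heartbeat link fails.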
SCENARIO 1: Clustering w/ SRDF
EMC AutoStart for SRDF/S and SRDF/A
• Zero data-loss business-restart solution with SRDF/S
– Controlled-data-loss business-restart solution with SRDF/A
• AutoStart failover can be automatic or automated (requires operator)
– Business restart automates application restart on top of a disaster-restartable copy of data
• AutoStart ensures R2s are in a consistent state prior to restart
– Verifies that no invalid tracks are owed to R2 prior to bringing applications online with SRDF/S
• Requires DMX with Enginuity 5670 or higher
– AutoStart performs dynamic swap with Enginuity 5x71 and EMC Solutions Enabler V6.2 or higher
[Diagram: Production site (N-1 nodes, R1 devices) and Secondary site (N-1 nodes, R2 devices), with Heartbeat over IP]
SCENARIO 2: EMC RecoverPoint with VMware Site Recovery Manager
• Replicate VMware VMFS (Virtual Machine File System) across heterogeneous storage
• Compress data, optimize bandwidth (up to 10 times)
• Protect and recover a single virtual machine or the entire VMware ESX Server
• Protect virtual environments with local and/or remote point-in-time recovery
[Diagram: PRODUCTION and DISASTER RECOVERY sites. Each side shows VMware Infrastructure with Virtual Machines (APP/OS), Servers, and Heterogeneous Storage, plus Virtual Center, SRM, and SRA. Labeled components: VMware Site Recovery Manager, EMC RecoverPoint, EMC RecoverPoint Adapter for VMware Site Recovery Manager.]
The Business Drivers
Protecting Information
Network Impact
VZ Storage Practice
Mirroring Considerations: Storage Systems
• Controller Type:
– HOST BASED
– APPLIANCE BASED
– ARRAY BASED & TYPE
• Mirroring Methodology:
– ASYNCHRONOUS
– SYNCHRONOUS
– BLENDED
– BI-DIRECTIONAL
– MULTI-DIRECTIONAL
• Storage Topology:
– NAS
– CAS
– SAN
• ROUTING PROTOCOL:
– FCIP
– FCOE
– iSCSI
– TCP/IP
– GFP/SONET
• DISTANCE
• LATENCY THRESHOLDS (Application)
• NETWORK TOPOLOGY:
– DETERMINISTIC
– NON-DETERMINISTIC
• BANDWIDTH CONSIDERATIONS
• DATA CHANGE RATES
Technology Considerations: Network
Protocol | Typical Usage & Native Distance | Maximum Throughput
ESCON | Mainframe; droop after 9km | 200Mbps
FICON | Mainframe (FC + ESCON); droop after 120km* (Apps Issue) | 400Mbps
FCP (FCoS) | Open Systems; Chan Xtenders; 10km w/ single mode LC; VARIES | Up to 10Gbps
FCIP | Single dedicated tunnel; over Ethernet (noisy) | GigE Speeds (Shared up to 10Gbps)
iSCSI | Direct map of SCSI to IP; subordinate to network* | GigE Speeds (Shared up to 10Gbps)
iFCP | Encapsulated & routable; FCP map to SCSI to IP | GigE Speeds (Shared up to 10Gbps)
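The protocol figures on this slide can be turned into a first-pass filter on distance and throughput. The numbers are taken from the slide as presented; treating "GigE Speeds" as 1000 Mb/s and treating the IP-based protocols as having no native distance limit are assumptions, as are all names in the code.

```python
from typing import Dict, List, Optional, Tuple

# protocol -> (native distance in km, or None if the slide lists none;
#              maximum throughput in Mb/s)
PROTOCOLS: Dict[str, Tuple[Optional[float], float]] = {
    "ESCON": (9, 200),        # slide: droop after 9 km, 200 Mbps
    "FICON": (120, 400),      # slide: droop after 120 km, 400 Mbps
    "FCP": (10, 10_000),      # slide: 10 km single mode (varies), up to 10 Gbps
    "FCIP": (None, 1_000),    # ASSUMED: "GigE Speeds" read as 1000 Mb/s
    "iSCSI": (None, 1_000),   # ASSUMED: same reading
    "iFCP": (None, 1_000),    # ASSUMED: same reading
}


def candidates(distance_km: float, required_mbps: float) -> List[str]:
    """Protocols whose listed reach and throughput cover the requirement."""
    result = []
    for name, (reach_km, max_mbps) in PROTOCOLS.items():
        within_reach = reach_km is None or distance_km <= reach_km
        if within_reach and required_mbps <= max_mbps:
            result.append(name)
    return result


if __name__ == "__main__":
    print(candidates(distance_km=50, required_mbps=300))
```

A filter like this only narrows the field; latency thresholds and the shared nature of IP links still need to be checked per application.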
Case 1 – Use VPN to connect to DCs
[Diagram: sites connected through Ethernet Access to a Metro (National) VPN (Layer 2 or Layer 3)]
Pros:
• Bandwidth flexibility (1M to 1G)
• Easier upgrades
• Ethernet everywhere
• Ease of adding/removing nodes
• Any-to-any communication or specific point-to-point links
Challenges:
• No support for storage standards
– FC, ESCON, FICON
• No SLAs for latency and jitter
• Data rate typically limited to 1G
Case 2 – Inter-DC Link – Ring Options
[Diagram label: Metro (National) VPN (Layer 2 or Layer 3)]
Pros:
• Greater bandwidth (up to 440G)
• Supports all sorts of interfaces:
– Ethernet, SONET, FC, FICON, etc.
• Protected service
• Fixed latency and no jitter
Challenges:
• Cost
• Only metro deployment
The Business Drivers
Protecting Information
Protecting the Network
VZ Storage Practice
Look at the full picture
• There is a tendency to address business needs for BCDR one item at a time.
• The majority of requests are intended to address an immediate need and are not part of an overall design.
• The result is a less than optimal design.
Building to the Full Picture
Drivers: POLICY, BUDGET, RPO & RTOs
[Diagram labels: Remote Clients; LAN; MPLS/PIP NETWORK; WAE/WAAS Infrastructure (WAE DEV, WAE MGR); Collapse Remote Apps/Servers; NS-NAS FILERS; NAS Gateway to DMX4 plus Centera; SAN; DMX-4 ARRAY; STANDBY ARRAY (Target); Centera Active Archive for Filesystems; Centera Active Archive; Archive Files; Disk Xtender]
Positioning the Right SLA Tiers between Information, Application & Networks
[Diagram labels: BCDR, BACK UP, NAS, ARCHIVAL; Objectives Optimization; ILM FW; Network Topologies: Service Edge Router, Class 5 Switch, SES Switch, Optical Backbone (SONET, DWDM), Fast Packet ATM Cloud, Fast Packet Frame Relay Cloud, OCn UNI, DS3 UNI, DS1 UNI, CPA/IP-MPLS Backbone]
The California Strategic Sourcing Initiative (CSSI)
DGS Contract #: 1S-05-70-10 (Open Systems Hardware, Software & Services)
DGS Contract #: 1S-05-70-11 (Mainframe Systems Hardware, Software & Services)
Features
• Competitively bid Contracts
• Pre-negotiated rates for EMC solutions
• Guaranteed Small Business Participation
• EMC-accredited pre-sales engineering support
• No requirement to use traditional RFP, RFQ or FSR process
Benefits
• Reliable storage solutions
• In alignment with Integrated IT Governance Approach
• No cap on contract/order value
• Allows use of design/build approach
• Reduced time and cost in procurement
CSSI WEBSITE: http://verizon.ca.ssicatalog.com/DesktopDefault.aspx
Contact Info
Daniel Morris
• 11080 White Rock Road, Rancho Cordova, CA
• Email: [email protected]
• Office: 916-779-5695
• Cell: 916-803-0478
Thank you for your time!