Data Protection, Availability and Replication Andreas Tsangaris CTO, Performance Technologies

Post on 15-Dec-2015

Page 1

Data Protection, Availability and Replication

Andreas Tsangaris, CTO, Performance Technologies

Page 2

Overview of DR Technology

[Figure: three DR technologies (Backup & Vaulting, Mirroring & Replication, Global Clustering) placed along Recovery Point and Recovery Time axes, each scaled in Secs, Mins, Hrs, Days and Wks.]

Page 3

Backup for DR: Why?

Normal steps in Disaster Recovery

1. Manual procedures - plan

2. Automated backup – restore

3. Data Replication / mirroring

4. Clustering

Page 4

Backup for DR: Why?

- Backup is the first approach to data continuity
- Redundant systems replicate user errors, viruses and data corruption
- System or component redundancy (RAID, mirroring) cannot replace backup
- Backup tapes may represent the last valid copy of data
- Backups are your last line of defense against total data loss
- It is the cheapest to deploy
- It is the fastest to deploy
- It is always the fallback for a complete DR

Page 5

Backup for DR: How far (fast) can you go?

Recovery Point with Backup (RPO):
- Assume daily backups
- Data lost is at least a day
- Very often it will be two days

Recovery Time with Backup (RTO):
- DR site preparation for restore
- System restore (administrator oriented)
- Data restore

Page 6

Site Recovery from Backup

Tape Backup with Offsite Vaulting

- RPO = Time of last good backup stored offsite
- RTO = Time required to recover from tape
  - Transport tapes to recovery site
  - Set up systems to receive data / Restore OS
  - Restore from tape
- Typically 1-3 days between RPO and RTO

[Figure: RPO/RTO timeline scaled in Secs, Mins, Hrs, Days and Wks. RPO side: Tape Backup, Offsite Vaulting (Days to Wks). RTO side: Retrieve from Vault, Set Up Systems, Restore from Tape (Days to Wks).]
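The RPO and RTO definitions on this slide can be sketched in a few lines of Python. This is only an illustration: the backup records, the disaster time and the phase durations are made-up values, and the function names are hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical backup records: (backup finished, verified good, already stored offsite).
backups = [
    (datetime(2024, 3, 4, 2, 0), True, True),
    (datetime(2024, 3, 5, 2, 0), True, True),
    (datetime(2024, 3, 6, 2, 0), True, False),  # good, but not yet moved to the vault
]

def recovery_point(records):
    """Return the time of the last good backup that is stored offsite."""
    usable = [finished for finished, good, offsite in records if good and offsite]
    return max(usable)

def recovery_time(transport, setup, restore):
    """Sum the three tape-recovery phases, which run one after another."""
    return transport + setup + restore

disaster = datetime(2024, 3, 6, 9, 0)           # assumed time of the disaster
rpo_point = recovery_point(backups)
data_loss = disaster - rpo_point
rto = recovery_time(timedelta(hours=6), timedelta(hours=12), timedelta(hours=18))

print(f"Recovery point: {rpo_point}, data lost: {data_loss}")
print(f"Recovery time: {rto}")
```

In this example the most recent backup does not count because its tape has not reached the vault, which is one way the data loss grows beyond a single day.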

Page 7

Backup for DR: The tools

- VERITAS NetBackup
- Storage Hardware
- Design
- Procedures

Page 8

NetBackup Architecture

[Figure: a Master Server, two Media Servers and Clients, together forming a Storage Domain.]

Page 9

NetBackup: Multi-hosted Drives (SAN)

[Figure: two NetBackup Servers and one NetBackup Server & Robotic Controller, each with its own Data Path. The two NetBackup Servers send Robot Requests; the Robot Control Path runs from the Robotic Controller.]

Page 10

NetBackup: Tape Multiplexing - Multistreaming

[Figure: four clients or filesystems, labeled A, B, C and D.]

e.g. 32 streams per tape drive
Benchmarked at over 10 Tb/hour
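As a rough illustration of multiplexing, the sketch below interleaves blocks from several client streams into one sequence, as if written to a single tape drive, and then separates one client's data again. The round-robin order, the block tags and the function names are assumptions made for the example; the slides do not describe NetBackup's actual on-tape format.

```python
from itertools import zip_longest

def multiplex(streams):
    """Interleave blocks from several client streams into one tape-order sequence.

    Each tape block is tagged with its client so a restore can demultiplex it.
    """
    tape = []
    for round_of_blocks in zip_longest(*streams.values()):
        for client, block in zip(streams.keys(), round_of_blocks):
            if block is not None:  # skip clients whose stream has already ended
                tape.append((client, block))
    return tape

def demultiplex(tape, client):
    """Recover one client's blocks, in order, from the interleaved tape."""
    return [block for owner, block in tape if owner == client]

# Four hypothetical clients, matching the A-D labels on the slide.
streams = {"A": ["a1", "a2", "a3"], "B": ["b1"], "C": ["c1", "c2"], "D": ["d1", "d2"]}
tape = multiplex(streams)
print(tape)
print(demultiplex(tape, "A"))
```

Interleaving keeps the drive busy even when individual clients deliver data slowly, at the cost of a restore having to skip over other clients' blocks.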

Page 11

NetBackup: Multiple streams, multiple systems

Run multiple client backup sessions simultaneously and stream the data to the same device

[Figure: backup streams from clients and remotely attached disks pass through the NetBackup server to high-speed tape devices.]

Page 12

NetBackup: Flat File Database

[Figure: sample catalog entries, shown three times on the slide:]

John/doe/file tom
fish/head/usr carol
usr/foo/bar mary
one/fine/day paul

Flat ASCII files
- simple
- compressible
- browsable
- very tolerant
- easy to repair
- easy to sort, filter and report

Proprietary DB
- complex
- prone to corruption
- difficult to rebuild
- needs defragmenting
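The "easy to sort, filter and report" point can be shown with ordinary text processing. The sketch below assumes each sample line is a path followed by a second field, treated here as an owner name; that reading of the slide's sample entries is a guess, not the real NetBackup catalog layout.

```python
# Sample entries copied from the slide; the "path owner" interpretation is an assumption.
catalog_text = """\
John/doe/file tom
fish/head/usr carol
usr/foo/bar mary
one/fine/day paul
"""

def parse_catalog(text):
    """Split each non-empty line into (path, owner) on the last run of whitespace."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        path, owner = line.rsplit(maxsplit=1)
        entries.append((path, owner))
    return entries

entries = parse_catalog(catalog_text)
by_path = sorted(entries)                                   # sort
marys = [path for path, owner in entries if owner == "mary"]  # filter

for path, owner in by_path:                                 # report
    print(f"{owner:<8}{path}")
```

No database engine is needed for any of these steps, which is the argument the slide makes for flat files.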

Page 13

NetBackup: Server Independent Restores

[Figure: an NBU client, NetBackup Server 1 and NetBackup Server 4.]

Page 14

NetBackup: True Image Recovery

Ensures only your current files are restored

Sat: Full Backup, 100 files
Mon: Incremental Backup, 30 files modified, 20 deleted, 15 new
Tue: Incremental Backup, 60 files modified, 32 deleted, 48 new
Wed: Disk Damaged

Restore: valid files only
- Restore 141 files
- Ignoring 52 deleted
- Latest versions of 90 modified

Without True Image Recovery we would get 193 files restored
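The difference between a plain restore and a true image restore can be sketched with a tiny made-up file set. The data below is hypothetical and does not reproduce the slide's file counts, and recording deletions as an explicit list per incremental is a simplification chosen for the example rather than a description of how NetBackup stores this information.

```python
def restore_all(full, incrementals):
    """Naive restore: replay every backup, so deleted files come back too."""
    restored = dict(full)
    for inc in incrementals:
        restored.update(inc["changed"])
    return restored

def true_image_restore(full, incrementals):
    """Restore only the files that existed at the time of the last backup."""
    restored = dict(full)
    for inc in incrementals:
        restored.update(inc["changed"])   # new and modified files
        for path in inc["deleted"]:
            restored.pop(path, None)      # drop files deleted since the previous backup
    return restored

# Hypothetical full backup followed by two incrementals.
full = {"a.txt": "v1", "b.txt": "v1", "c.txt": "v1"}
incrementals = [
    {"changed": {"a.txt": "v2", "d.txt": "v1"}, "deleted": ["b.txt"]},
    {"changed": {"a.txt": "v3"}, "deleted": ["d.txt"]},
]

naive = restore_all(full, incrementals)
true_image = true_image_restore(full, incrementals)
print(sorted(naive))       # includes the deleted b.txt and d.txt
print(sorted(true_image))  # only files present at the last backup
```

Both restores return the latest version of a modified file; only the true image restore leaves out files that users had already deleted.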

Page 15

NetBackup: Network Bandwidth Throttling

Keeps backup workload from saturating your network

- restrict by user
- restrict by group
- dynamically balanced

[Figure: NetBackup Clients and NetBackup Servers sharing the network with database applications and other network-based applications.]
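The slides do not say how NetBackup limits bandwidth. A token bucket is one common, general-purpose way to cap throughput, and the sketch below uses it purely to illustrate the idea; the class, its parameters and the numbers are all assumptions.

```python
class TokenBucket:
    """Allow at most `rate` bytes per second, with bursts up to `capacity` bytes."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity   # start with a full bucket
        self.last = 0.0          # time of the previous call, in seconds

    def try_send(self, nbytes, now):
        """Return True and spend tokens if `nbytes` may be sent at time `now`."""
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False

# Hypothetical limit: 1000 bytes/s with a 2000-byte burst allowance.
bucket = TokenBucket(rate=1000, capacity=2000)
results = [
    bucket.try_send(1500, now=0.0),  # fits in the initial burst
    bucket.try_send(1000, now=0.1),  # too soon, not enough tokens refilled
    bucket.try_send(1000, now=1.0),  # enough time has passed
]
print(results)
```

Passing the clock in as `now` keeps the example deterministic; a real sender would read a monotonic clock and sleep when a send is refused.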

Page 16

NetBackup Vaulting Process

#1 – Duplicate / in line copy backups

#2 – Eject duplicate tapes from tape library and prepare for transport

#3 – Transport tapes to the offsite vault

#4 – Record where each tape is located within the vault and when each tape will expire

#5 – When tapes expire, notify the offsite facility

#6 – Transport expired tapes back onsite

#7 – Load tapes back into the tape library
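Steps 4 to 7 amount to simple record keeping: where each tape is, when it expires, and which tapes should come back. The sketch below models that with hypothetical names (`VaultedTape`, `tapes_to_recall`, `return_onsite`) and made-up barcodes and dates; it is not the NetBackup Vault interface.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class VaultedTape:
    barcode: str
    slot: str              # step 4: where the tape sits within the offsite vault
    expires: date          # step 4: when the backups on the tape expire
    location: str = "vault"

def tapes_to_recall(tapes, today):
    """Step 5: list vaulted tapes whose contents have expired."""
    return [t for t in tapes if t.location == "vault" and t.expires <= today]

def return_onsite(tape):
    """Steps 6 and 7: the tape is transported back and loaded into the library."""
    tape.location = "library"
    tape.slot = ""

tapes = [
    VaultedTape("T0001", "A-12", date(2024, 3, 1)),
    VaultedTape("T0002", "A-13", date(2024, 6, 1)),
]
expired = tapes_to_recall(tapes, today=date(2024, 3, 15))
for tape in expired:
    return_onsite(tape)

print([t.barcode for t in expired])
```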

Page 17

Backup for DR: The tools

- VERITAS NetBackup
- Storage Hardware
- Design
- Procedures

Page 18

NetBackup: Second-host Backup

Offload backup processing to the backup server, freeing up cycles on the application server.

[Figure: an Application Server and a Backup Server on the LAN (Ethernet, FDDI, ATM), with a Disk Array Subsystem and a Tape Library.]

Page 19

NetBackup: Third Party Copy

Third Party copy totally offloads the application server and the backup server from the actual data transfer

[Figure: two Application Servers and a NetBackup Server with its Catalog on the LAN (Ethernet, FDDI, ATM), connected through a Fibre Switch/Hub to a Disk Array Subsystem and a Tape Library.]

Page 20

Backup for DR: The tools

- VERITAS NetBackup
- Storage Hardware
- Design
- Procedures

Page 21

Automated Vaulting & Pre-restores

[Figure: two SUN Servers labeled Primary and DR, each marked with an X, and a Standby Server.]

Page 22

Design #1: Single Large Site

- Multiple servers (more than 40)
- Large data (more than 2 TB)
- Central Enterprise storage box
- Some servers boot from the SAN
- DR site has equivalent servers

Page 23

Design #1: Single Large Site

The tools:
- VERITAS NetBackup
- VERITAS NetBackup Vault
- SAN attached Tape library
- Enterprise central storage
- Procedures

Page 24

Design #1: Single Large Site

[Figure: Application Servers and Backup Servers connected by LAN and SAN to a Disk Array Subsystem and a Tape Library.]

Backup Stage:

SAN attached Servers:
- Snapshot Full backup (daily)

LAN attached Servers:
- LAN Backup (Weekly Full & Daily incremental)

Multiple Media Servers

Multiplexing & Multistreaming

VAULT:
- Dual in line copy
- Off site set of tapes
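The two backup policies in this design can be written as a small scheduling rule. In the sketch below, the function name and the choice of Sunday for the weekly full are assumptions; the slide only states the frequencies.

```python
from datetime import date

FULL_DAY = 6  # assumption: weekly fulls run on Sunday (Monday is 0 in date.weekday())

def backup_type(attachment, day):
    """Return the backup a server gets on `day`, given how it is attached."""
    if attachment == "SAN":
        return "snapshot full"   # SAN attached servers: full backup every day
    if attachment == "LAN":
        # LAN attached servers: weekly full, daily incremental otherwise
        return "full" if day.weekday() == FULL_DAY else "incremental"
    raise ValueError(f"unknown attachment: {attachment}")

sunday = date(2024, 3, 3)
monday = date(2024, 3, 4)
for day in (sunday, monday):
    print(day, backup_type("SAN", day), backup_type("LAN", day))
```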

Page 25

Design #1: Single Large Site

Tape Vault stage:

Export NetBackup catalog:
- Write to a tape

Transfer Tapes to DR:
- Directions for operators
- Insert tapes to DR’s library

Initiate Site recovery:
- Import catalog
- Start automatic site restore

[Figure: a Backup Server and Tape Library at the PRIMARY SITE, and a Backup Server and Tape Library at the DR SITE.]

Page 26

Example #1: Single Site

[Figure: an Application Server and a Backup Server connected by LAN and SAN to a Disk Array Subsystem and a Tape Library.]

Site Restore Stage:

SAN attached Servers:
- Restore to locally mounted boot & data volumes

LAN attached Servers:
- Use a small pre-installed OS & restore over the LAN

Page 27

Example #1: Single Site

Recovery Point:
- Average: 12 hours
- Worst case: 1 Day

Recovery Time:
- Normally minimal (Site is pre-restored)
- Worst case: 8 hours (Backup time) plus 4 hours for tape transfer

[Figure: an Application Server and a Backup Server connected by LAN and SAN to a Disk Array Subsystem and a Tape Library.]
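The figures on this slide follow from simple arithmetic, sketched below. The averaging step assumes a failure is equally likely at any moment between two daily backups; that assumption and the function names are not from the slides.

```python
from datetime import timedelta

def recovery_point(backup_interval):
    """Return (average, worst case) data loss for evenly spaced backups.

    A failure just after a backup loses almost nothing, a failure just before
    the next one loses a whole interval, so the average is half the interval.
    """
    return backup_interval / 2, backup_interval

def worst_case_recovery_time(backup_time, tape_transfer):
    """Worst case from the slide: backup time plus tape transfer time."""
    return backup_time + tape_transfer

average, worst = recovery_point(timedelta(days=1))
rto_worst = worst_case_recovery_time(timedelta(hours=8), timedelta(hours=4))

print(f"RPO average {average}, worst case {worst}")
print(f"RTO worst case {rto_worst}")
```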

Page 28

Design #2: Multiple Sites

- Multiple geographic sites (more than 6)
- Small data footprint (less than 100 GB per site)
- Large available bandwidth
- Multiple database applications
- Need for cross site recovery

Page 29

Design #2: Multiple Sites

The tools:
- VERITAS NetBackup
- VERITAS NetBackup Vault
- Direct attached Tape library
- Multiple direct attached storage boxes
- DR site has spare servers and available attached storage

Page 30

Design #2: Multiple Sites

[Figure: three sites, each with Application Servers and a Backup Server; Direct Attached Disk Repositories; and a Master Backup Server with a Tape Library.]

Page 31

Design #2: Multiple Sites

Backup Stage:

Full backups (daily):
1. Backup to disk (free up application faster)
2. Duplicate over WAN to central backup server

Database log & incremental backup (hourly):
1. Backup to local disk AND
2. to central backup server (INLINE Copy) over the WAN

Restore Stage:

- Auto restore all (remote) full backups to local spare server
- Use different disk volumes on the spare server
- Auto restore all incremental & database logs
- Spare server is ready to recover!

RPO: maximum 2 hours
RTO: less than 1 hour (plus log replay)
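The restore stage amounts to picking the latest full backup for a site and then applying every later incremental or log backup in time order. The sketch below shows that ordering with a made-up list of backup images; the record layout and the function name are hypothetical.

```python
from datetime import datetime

# Hypothetical catalog of backup images received from one remote site.
images = [
    {"kind": "log",  "taken": datetime(2024, 3, 5, 10, 0)},
    {"kind": "full", "taken": datetime(2024, 3, 4, 1, 0)},
    {"kind": "full", "taken": datetime(2024, 3, 5, 1, 0)},
    {"kind": "log",  "taken": datetime(2024, 3, 5, 9, 0)},
    {"kind": "log",  "taken": datetime(2024, 3, 4, 23, 0)},
]

def restore_plan(images):
    """Latest full backup first, then every later log or incremental in time order."""
    fulls = [i for i in images if i["kind"] == "full"]
    base = max(fulls, key=lambda i: i["taken"])
    later = [i for i in images if i["kind"] != "full" and i["taken"] > base["taken"]]
    return [base] + sorted(later, key=lambda i: i["taken"])

plan = restore_plan(images)
recovery_point = plan[-1]["taken"]   # the spare server is current up to this time

for image in plan:
    print(image["kind"], image["taken"])
```

Because this plan is applied automatically as images arrive, the spare server only needs the final log replay when a site actually fails.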

Page 32

Backup for DR: How far (fast) can you go?

Recovery Point with Backup (RPO):
- Normally 1 day
- Can be brought down to 1 hour (WAN is essential)

Recovery Time with Backup (RTO):
- Pre-restored images
- Can be brought down to 1 hour

Page 33

Thank you!