Backup and Restore CPTE 433 John Beckett


Post on 11-Jan-2016


Page 1: Backup and Restore CPTE 433 John Beckett. Why Back Up? So you can restore later! SLA Restore Policy Backup Policy Backup Schedule

Backup and Restore

CPTE 433 John Beckett


Why Back Up?

• So you can restore later!

SLA

Restore Policy

Backup Policy

Backup Schedule


Toward a Backup Strategy

• Corporate guidelines define terminology and specs for data recovery

• SLA defines specs for specific site or app

• Policy documents implementation of the SLA

• Procedure says how policy will be implemented

• Schedule anchors the policy in real time


Why are we planning a Restore?

• Accidental file deletion

• Data corruption
– Hardware or system software
– Application error
– Procedural error

• Disk failure

• Archive
– Snapshot of system: legal or fiduciary reasons


Self-Service Restores?

• History: VAX, File Motel, NetApp Filer

• Great for document-oriented systems

• Lousy for database restores

• Uh…what if they overwrite the new one?

JB’s take:

• Best to have two people involved in a restore, unless one is technical

• Store current state before restoring.

• Use non-rewriteable media for backups if you can
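
One way to "store current state before restoring" is to copy the live file aside before the backup copy overwrites it. This is a minimal sketch, not part of the original slides; the function name and the ".pre-restore-" suffix are made up for illustration.

```python
import shutil
import time
from pathlib import Path
from typing import Optional


def restore_with_safety_copy(backup_copy: Path, live_file: Path) -> Optional[Path]:
    """Restore live_file from backup_copy, preserving the current version first.

    Returns the path of the preserved copy, or None if there was no
    existing file to preserve.
    """
    preserved = None
    if live_file.exists():
        # Hypothetical naming scheme: original name plus a timestamp suffix.
        stamp = time.strftime("%Y%m%d-%H%M%S")
        preserved = live_file.with_name(f"{live_file.name}.pre-restore-{stamp}")
        shutil.copy2(live_file, preserved)  # keep what is there now
    shutil.copy2(backup_copy, live_file)    # then overwrite with the backup
    return preserved
```

If the user turns out to have wanted the newer version after all, the preserved copy is still there to put back.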


Restore Requests

• Should be logged

• Use a Web page to collect these, and log them in a database
– Identify clearly who did the restore, when, and exactly what was restored
– Your backup software should facilitate this – it should not be a manual process
– User should receive clear notification when the process is complete
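
A restore log of this kind could be as small as one database table. The sketch below uses SQLite from the Python standard library; the table and column names are assumptions for illustration, not something the slides specify, and the Web page that collects requests is left out.

```python
import sqlite3
from datetime import datetime, timezone


def open_restore_log(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the restore log. The schema is a hypothetical example."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS restore_log (
               id            INTEGER PRIMARY KEY,
               requested_by  TEXT NOT NULL,
               restored_by   TEXT NOT NULL,   -- who did the restore
               restored_at   TEXT NOT NULL,   -- when (UTC, ISO 8601)
               what          TEXT NOT NULL,   -- exactly what was restored
               user_notified INTEGER NOT NULL DEFAULT 0
           )"""
    )
    return conn


def log_restore(conn: sqlite3.Connection, requested_by: str,
                restored_by: str, what: str) -> int:
    """Record one completed restore and return its log id."""
    cur = conn.execute(
        "INSERT INTO restore_log (requested_by, restored_by, restored_at, what) "
        "VALUES (?, ?, ?, ?)",
        (requested_by, restored_by, datetime.now(timezone.utc).isoformat(), what),
    )
    conn.commit()
    return cur.lastrowid


def mark_notified(conn: sqlite3.Connection, log_id: int) -> None:
    """Note that the user has been told the restore is complete."""
    conn.execute("UPDATE restore_log SET user_notified = 1 WHERE id = ?", (log_id,))
    conn.commit()
```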


What Order For Full Restore?

Right

1. Most recent incremental – Including complete directory

2. Get updates from incremental

3. Get older files from full backup

Wrong

1. Most recent full backup

2. Most recent incremental

• What about files that were purged since the full backup?

– Use the directory on the Incremental

This is a good time to have your terminology standardized!
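
The "Right" order can be sketched in a few lines, under the assumption that the most recent incremental carries a complete directory (a list of every file that existed when it was taken) plus the files that changed. The data layout here, dictionaries mapping paths to contents, is a simplification for illustration only.

```python
def full_restore(full_files: dict, incr_directory: set, incr_files: dict) -> dict:
    """Rebuild the file set, driven by the incremental's directory.

    full_files:     path -> content from the last full backup
    incr_directory: every path that existed at the time of the incremental
    incr_files:     path -> content for files changed since the full backup
    """
    restored = {}
    for path in sorted(incr_directory):
        if path in incr_files:
            restored[path] = incr_files[path]   # updated since the full backup
        elif path in full_files:
            restored[path] = full_files[path]   # unchanged, take the older copy
        else:
            raise KeyError(f"{path} is in the directory but on neither backup")
    # Files on the full backup that are absent from the directory were
    # purged after the full backup, so they are deliberately not restored.
    return restored
```

Restoring the full backup first and laying the incremental on top would bring the purged files back, which is the problem the "Wrong" column points at.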


Where’s the Backup?

• On site
– Vulnerable to employee error and misdeeds

• Off site or safe/vault
– Not available when you need it
– How do you get it in there in the first place?
– Are you putting your most valuable backups in the safest place, or doing it “later?”


When Do You Back Up?

• Backups can take a lot of CPU time
– Compression can sometimes be tuned to mitigate this, at the expense of more media
– Consider the split-RAID and Zap-copy methods

• Schedule them at off-hours
– Do you have an employee to load media?
– Maybe you need to buffer the backup for later transfer (buffer drive is cheap)


The Storage Dilemma

• Disk space is increasing much faster than backup media space

• Use disk space to buffer backups so that you can do them during off-times

• Perhaps: Copy to a disk drive, then do your compression on a separate computer


Dangerous Ideas - 1

• Differential Backups
– Increases the number of tapes that must be good for you to survive.

• Leaning on RAID
– RAID is not a backup solution, it is an uptime improver.
– If the RAID hardware fails, you’re still dead.
  • So buy good RAID hardware and drives!
  • Multiple drives on a single chain is not a good idea.
  • Back up anyhow


Dangerous Ideas - 2

• Closed Loop
– Remember to cycle tapes out of your backup group regularly.
  • It’s a way to avoid wearing them out.
  • It’s a way to give you a backup in case all your rotated tapes are bad (it happens).
  • Do it fairly often.


Dangerous Ideas - 3

• Capitalizing tapes
– Tapes are expense items, not capital investments, because they wear out.
– The diabolical truth about many IT items is that their longevity falls between the accounting definitions of “expense” and “capital”.


Dangerous Ideas - 4

• “The backup went OK”
– Did you read it back to see if it has internal integrity?
– Did you restore a file to see if it came back?
– Idea: Use a cron job (or translate into Gates-ese) to grab (wget) the home page for cnn.com daily. Make sure it goes in a directory near the end of the tape.
  • That way you can check any backup easily to see if you can restore from it.
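
The daily "canary" idea might look like the sketch below: fetch a page, write it under a dated name, and store a checksum beside it so a restored copy can be checked. The function names, the file naming scheme, and the choice of SHA-256 are assumptions for illustration; the scheduling (cron or otherwise) and the placement of the directory near the end of the tape are left to the backup setup.

```python
import hashlib
from datetime import date
from pathlib import Path
from typing import Callable
from urllib.request import urlopen


def fetch_page(url: str) -> bytes:
    """Download a page, in the spirit of the wget call on the slide."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def write_canary(canary_dir: Path, url: str,
                 fetch: Callable[[str], bytes] = fetch_page) -> Path:
    """Save today's copy of the page plus its SHA-256 checksum."""
    canary_dir.mkdir(parents=True, exist_ok=True)
    body = fetch(url)
    target = canary_dir / f"canary-{date.today().isoformat()}.html"
    target.write_bytes(body)
    digest = hashlib.sha256(body).hexdigest()
    target.with_name(target.name + ".sha256").write_text(digest)
    return target


def canary_restored_ok(restored_file: Path) -> bool:
    """Check a restored canary against the checksum restored next to it."""
    expected = restored_file.with_name(restored_file.name + ".sha256").read_text().strip()
    actual = hashlib.sha256(restored_file.read_bytes()).hexdigest()
    return actual == expected
```

Because the file changes every day, a successful check also shows that the tape holds that day's data rather than a stale copy. Keeping a second copy of the checksum somewhere off the tape would make the check stronger.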


Dangerous Ideas - 5

• You don’t need to consider network bandwidth in designing your backup scheme.
– At some point, tape drives and computers will be fast enough to move the bottleneck to your network – at which point making things better may get a lot more expensive
– One answer: A separate high-speed network between disk farm and backup device
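
A rough back-of-the-envelope check shows where the bottleneck sits. The sketch below simply takes the slower of the two rates; the function name is made up, and real throughput also depends on compression, file sizes, and protocol overhead, which are ignored here.

```python
def backup_hours(data_gb: float, tape_mb_s: float, network_mb_s: float):
    """Estimate backup duration and name the slower link.

    Returns (hours, bottleneck). Uses 1 GB = 1000 MB and assumes the
    slower of the two rates limits the whole transfer.
    """
    effective_mb_s = min(tape_mb_s, network_mb_s)
    bottleneck = "network" if network_mb_s < tape_mb_s else "tape drive"
    seconds = data_gb * 1000 / effective_mb_s
    return seconds / 3600, bottleneck
```

With made-up figures of 3600 GB of data, a tape drive that sustains 200 MB/s, and a network that delivers 100 MB/s, the estimate is 10 hours with the network as the bottleneck; a faster tape drive would not help at all.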


Dangerous Ideas - 6

• All backups should be kept off-site.
– Will the most important (recent) backups be kept there?
– Can you get them when you need them?
– On the other hand, a professional off-site company will provide clear records of tape transactions which will help prevent fraud.


Dangerous Ideas - 7

• The drive needs to be replaced
– This happened with DDS drives. The system manufacturer bought them from another company (Sony, in this case) and had no idea how to make them work or control quality. All failures were serviced by replacing drives. Failures occurred often as a result of this lack of understanding.
– For a year, this was the single greatest contributor to downtime in our system.


Does It Have To Be Tape?

• Disk drives are faster, probably cheaper, and more reliable.

• The only disadvantage of disk drives is that the media are not removable.

• Perhaps back up to a disk, then copy that disk to tape.

• Also: If your data fits on a DVD or CD, that is probably a better solution anyhow.
– But allow room for data expansion.


Transaction Backups

• Many database systems log transactions to a file, which can be located in a safe place (like a separate drive – perhaps located in a different building).

• These transaction logs are useful for:
– Rebuilding the database to the moment of failure.
– Analyzing transaction patterns when tuning for performance or function.
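
Rebuilding to the moment of failure means starting from the last backup and replaying every logged transaction in order. The toy sketch below uses a dictionary as the "database" and JSON lines as the log; real database systems have their own log formats and recovery tools, so the record layout here is purely an assumption.

```python
import json


def log_set(log_lines: list, key: str, value) -> None:
    """Append a 'set' transaction to the log (hypothetical record format)."""
    log_lines.append(json.dumps({"op": "set", "key": key, "value": value}))


def log_delete(log_lines: list, key: str) -> None:
    """Append a 'delete' transaction to the log."""
    log_lines.append(json.dumps({"op": "delete", "key": key}))


def rebuild(snapshot: dict, log_lines: list) -> dict:
    """Replay the log, oldest entry first, on top of the last backup."""
    state = dict(snapshot)  # leave the backup copy itself untouched
    for line in log_lines:
        entry = json.loads(line)
        if entry["op"] == "set":
            state[entry["key"]] = entry["value"]
        elif entry["op"] == "delete":
            state.pop(entry["key"], None)
    return state
```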


Expiry Dates

• Automatic tape libraries should label tapes according to their retention cycle, and automatically handle re-use.

• The same function should “kick out” tapes that are considered worn out.
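
The decision a library has to make for each tape can be sketched as a small rule. The field names and the wear limit of 50 uses are illustrative assumptions, not vendor figures; a real library would track wear in its own way.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Tape:
    label: str
    written_on: date
    retention_days: int  # how long this tape's contents must be kept
    times_used: int


def classify(tape: Tape, today: date, max_uses: int = 50) -> str:
    """Return 'keep', 'retire', or 'reuse' for one tape.

    max_uses is a hypothetical wear limit.
    """
    expires_on = tape.written_on + timedelta(days=tape.retention_days)
    if today < expires_on:
        return "keep"      # still inside its retention cycle
    if tape.times_used >= max_uses:
        return "retire"    # expired and worn out: kick it out
    return "reuse"         # expired and still healthy
```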


The Backup Process

• Quiesce

• Stop transactions?
– Put transactions into a buffer?
– Do transactions matter?

• Separate
– May be combined with next step

• Save

• Un-quiesce

• Verify
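
The ordering of these steps can be sketched as one function that takes each step as a callable. The names are placeholders, and the Separate step is assumed to be folded into Save, as the slide allows. The one design point worth showing is that un-quiesce runs even if the save fails, so the system is not left frozen.

```python
from typing import Callable


def run_backup(quiesce: Callable[[], None],
               save: Callable[[], None],
               unquiesce: Callable[[], None],
               verify: Callable[[], bool]) -> bool:
    """Run quiesce, save, un-quiesce, verify in that order."""
    quiesce()
    try:
        save()          # Separate is assumed to happen inside this step
    finally:
        unquiesce()     # always release the system, even after a failed save
    return verify()     # verification runs after normal work has resumed
```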


What Are You Saving?

• Simply the files

• A disk image


How Do You Restore?

• Can it be a disk image?
– Faster

• Can it be a single file?
– If not, do you have to take the entire system offline just to pull back a single file?

• Ideally: Either