
Backup architectures in the modern data center
Author: Edmond van [email protected]

Competa IT b.v.


Existing backup methods

Most companies see explosive growth in the amount of data they have to process and store. Given this fact, it is usually necessary to take a closer look at how backups are done and how the methods of backing up data can or must be expanded or changed. To come to the right decision on what backup method(s) to use, there must first be a good understanding of what methods there are and what benefits and disadvantages these methods have.

An overview.

1. Backups to disk.
2. Removable media, like CD-Rs (CD recordable), MO (Magneto Optical) drives, ZIP drives, etc.
3. Local (per server) tape drive(s).
4. Shared tape drive(s) (one or more backup servers).
5. Tape stackers (shared or not).
6. Robotic tape libraries (shared or not).

There are more methods you can think of; I chose these because, in my opinion, they are the most common.

1. The first option is probably the most expensive. You are in for a shock when you compare the cost per gigabyte of disk drives to the cost per gigabyte of tapes. However, it seems that more and more companies are using, or going to use, this option to increase the speed of important backups and dramatically decrease restore times. A common application of backing up to disk is snapshotting databases to third-mirror disks or volumes. Using this mechanism you have an instant copy of your data without downtime. In case of data loss, you can synchronize the 'production' disk with the backup disk you made earlier. It will still be necessary to stream this data to tape for long-term storage. Another advantage of using disks as a backup medium is that you can actually use the data on those disks, for instance for testing new applications before putting them on your production environment.

2. It seems obvious that the second option in the list is merely useful for relatively small amounts of data. It is far too time consuming to back up big chunks of data to a medium with a storage capacity of less than a gigabyte. However, there are situations where these kinds of media can come in handy. Think of data that has to be kept for an extensive period of time, like archives of tax data, or backups of important correspondence that will be deleted from main storage but has to be kept for possible future use. The technologies keep improving every few months: two years ago you could write 650 MB to an MO disk, today you can write 9 GB to it. This makes it a perfect solution for backing up your desktop computers.

3. A widespread method of backing up data is giving every server its own tape drive(s). The advantage is that you do not have to worry about network bandwidth or the availability of a backup server. You might get in trouble, though, when the amount of data you have to back up outgrows the capacity of the locally attached hardware and the media used. In that case, you will have to invest in a mechanism that can replace full media with empty media in order to continue the backup from where it left off. This mechanism can be a person changing the tapes by hand, which makes it very labour intensive for large sites and prone to human error, or you can increase the capacity by using extra hardware, like an extra drive or a tape stacker.


4. Another common method is using a backup server. This machine has the task of gathering the data of other machines spread over a network. It can be a dedicated machine, but you can also use a machine that is idle at the time you want to perform your backups. Of course, this machine has to be well equipped for the task; I will come to system requirements later. In some situations it is sensible to have more than one backup server. For instance, suppose you have 30 machines to back up and do not have the money to invest in a big dedicated backup server. In this case, you could designate three machines that are idle overnight, each of which backs up the data of 10 machines. As in the former example, you must carefully choose the type of hardware: backing up several machines without making a careful inventory can easily cause your machine or chosen media to run out of capacity. Other considerations are: are there any future plans for these machines, what kind of support contract do you have on them, and are they fast enough to write all the data to tape in the designated backup window?

5. A tape stacker is an obvious choice when you run out of your single drive's capacity. A stacker needs a special kind of attention, especially when you use your own scripts or other solutions to get your backups to tape. A stacker works according to a 'sequential' system: the first tape is written until it is completely full, after which the next tape has its turn. This routine repeats itself until the last tape is used up. With a simple schedule, this will force you to change tapes as soon as the stacker is full. When you use your own smart schedule with some kind of retention period, or a third-party solution, it may be possible to round-robin back to the first tape in the sequence if the capacity of the stacker allows it (a minimal sketch of such a scheme follows this list).

6. The robotic tape library uses a totally different mechanism than the tape stacker. Unless you are a superb programmer, you need vendor or third-party software solutions to make use of your purchase. The tape library is 'inventory aware': the library 'knows' how many tapes there are in it, often using barcodes attached to the tapes as a reference. These barcodes are read with laser or camera techniques. The software that controls the library keeps track of the content of the tapes. This offers you the possibility to mix several sets of different clients, schedules and retention periods.
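As a rough illustration of the round-robin scheme from point 5, here is a minimal Python sketch; the 30-day retention period, the tape count and the field names are assumptions for illustration, not taken from any particular product:

```python
from datetime import datetime, timedelta

# Assumed retention period: a tape may only be overwritten once every
# backup on it is older than this.
RETENTION = timedelta(days=30)

def next_tape(tapes, now):
    """Return the first tape in sequence that is empty or whose last
    backup has passed its retention period, or None if the stacker is
    exhausted and an operator must swap media."""
    for tape in tapes:
        written = tape["last_written"]
        if written is None or now - written > RETENTION:
            return tape
    return None

tapes = [{"label": f"TAPE{i}", "last_written": None} for i in range(8)]
tape = next_tape(tapes, datetime.now())
if tape is None:
    print("No reusable tape: change the magazine")
else:
    tape["last_written"] = datetime.now()
    print("Writing to", tape["label"])
```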

Do you need changes?

It is of course possible that you run your backups very smoothly with your existing hardware and software. Only make changes if you feel or know that something is wrong. However, changes may be necessary if the policies of your users change. For instance, when you normally have six hours of downtime for a database backup, it could happen that your users want a higher availability of that database. This could force you to do online backups of that database. Another example is hardware being added to the environment. When a department is in the process of implementing a 1 terabyte file server, you are likely to get into trouble if you do not investigate what the impact will be on your existing backup architecture. It is a commonly made error that project teams start looking at how they are going to get the enormous amounts of data to tape only after the machine is installed. This is too late. Another reason for change is that data has to be more and more available: if availability increases, the backup window size decreases. No more database downtime.


If changing, what do you need?

This is a tough question to answer. First you must know what kind of data you have to back up (e.g. databases, file servers, web servers, etc.) and the different combinations that (have to) coexist in the environment.

Use the following list of basics to begin planning your backup environment:

- Dataset size and average number of files.
- Network bandwidth.
- Memory.
- Tape drive technologies.
- Software packages.
- Off-site storage.
- Backup frequency.

Dataset size and average number of files.

To determine the total tape capacity you need, there are a number of things to consider. First of all: what is the total amount of data that you have to back up when doing a full backup? (At this point it is not yet important to know whether you are going to back up locally or to a backup server.) If you are going to use a third-party or OEM'ed (Original Equipment Manufacturer) product, you must also know the number of files that you have to back up. Products like NetBackup from Veritas or NetWorker from Legato use about 150 bytes to store the location, retention period and backup time of each backed-up file in their index database. So, backing up one million files leaves you with a theoretical index database size of around 150 MB. To be able to tune your backup system at a later stage, the total number of files that reside on a file system can come in handy too. With these two values (total amount of data and number of files), you can calculate the average file size. On large file servers, you can improve backup and restore performance by tuning the block size written to tape to the average file size. On relatively small machines you will hardly notice any difference in performance.
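A back-of-the-envelope sketch of these two calculations: the 150 bytes per index entry is from the text above, while the sample numbers are made up.

```python
INDEX_BYTES_PER_FILE = 150  # approximate per-entry cost in NetBackup/NetWorker indexes

def index_db_size_mb(n_files: int) -> float:
    """Theoretical index database size in MB for n_files backed-up files."""
    return n_files * INDEX_BYTES_PER_FILE / (1024 * 1024)

def average_file_size_kb(total_data_gb: float, n_files: int) -> float:
    """Average file size in KB, a starting point for tuning the tape block size."""
    return total_data_gb * 1024 * 1024 / n_files

# Example: one million files totalling 200 GB (made-up numbers).
print(f"index DB : {index_db_size_mb(1_000_000):.0f} MB")  # ~143 MB, i.e. ~150 MB in round figures
print(f"avg file : {average_file_size_kb(200, 1_000_000):.0f} KB")  # ~210 KB
```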

If you really want to be sure that you are not going to buy too many tapes, you can add compression as a parameter too. Also very important: on-drive hardware compression helps to improve performance, because the amount of data written to the drive's cache is in fact more than what is actually written to tape.

The following table shows typical compression ratios and the performance gained when using compression.

Data type     Gained performance   Compression
Text          1.46:1               1.44:1
File server   1.60:1               1.63:1
Web server    1.57:1               1.82:1
Database      1.60:1               1.57:1
(Ref. 1)
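As a hedged example of putting the compression column to work, the snippet below estimates how many tapes a full backup needs; the 35 GB native DLT 7000 capacity is taken from the drive table later in this article, and the 500 GB dataset is made up:

```python
import math

def tapes_needed(data_gb: float, native_capacity_gb: float, compression: float) -> int:
    """Number of tapes for a full backup, assuming the tabulated compression ratio."""
    effective_capacity = native_capacity_gb * compression
    return math.ceil(data_gb / effective_capacity)

# Example: 500 GB of file-server data (1.63:1) on DLT 7000 media (35 GB native).
print(tapes_needed(500, 35, 1.63))  # -> 9 tapes
```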


The native write capacity of a DLT drive is 5 MB/s, meaning that you can write 5 MB/s of uncompressed data to the drive. According to the table, the writing speed for a file server improves by 1.60:1, which produces a theoretical data stream of 8 MB/s.
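The same arithmetic as a small sketch, extended to estimate how long a full backup keeps a single drive busy; the 5 MB/s and 1.60:1 figures are from the text, the dataset size is assumed:

```python
def backup_hours(data_gb: float, native_mb_s: float, gain: float) -> float:
    """Hours one drive needs to stream data_gb, given its native speed and
    the performance gain from on-drive compression."""
    effective_mb_s = native_mb_s * gain  # e.g. 5 MB/s * 1.60 = 8 MB/s
    return data_gb * 1024 / effective_mb_s / 3600

# Example: 200 GB of file-server data on one DLT 7000 drive.
print(f"{backup_hours(200, 5, 1.60):.1f} hours")  # -> ~7.1 hours
```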

Network Bandwidth.

In the case of a network-related or centralized backup architecture, it is essential to know how your network will be utilized during backups in the peak hours. The following table shows a number of commonly used network technologies with their theoretical speeds and their more realistic speeds, based on vendor specifications and personal experience. This table is only a guideline, and transfer rates depend on the actual network load at any one time.

Network adapter      Speed in theory   Speed in 'real life'
10BaseT Ethernet     10 Mbps           0.65-0.85 MB/sec
100BaseT Ethernet    100 Mbps          6.5-8.5 MB/sec
Gigabit Ethernet     1000 Mbps         25-53 MB/sec
(Ref. 1)

Please note that the transfer rate of a network interface card is normally noted in bits, where 1 Mbps is actually 1000x1000 bits. We are far more interested in how many bytes, or megabytes for that matter, we can pump through the network.
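A short, purely illustrative sketch of that conversion:

```python
def mbps_to_mb_per_sec(mbps: float) -> float:
    """Convert a NIC rating in Mbps (1 Mbps = 1,000,000 bits/s) to MB/s (1 MB = 1024*1024 bytes)."""
    return mbps * 1_000_000 / 8 / (1024 * 1024)

print(f"{mbps_to_mb_per_sec(100):.1f} MB/s")  # ~11.9 MB/s theoretical, vs 6.5-8.5 MB/s in real life
```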

A good way of testing your network transfer rate is to create a reasonably large file on a client and set up an FTP connection to the backup server (if you have one already). If you copy the file to /dev/null on the receiving machine, you get a pretty good picture of your network transfer rate in combination with the read capabilities of the hard drive(s) on the client. When ftp finishes the file transfer, it reports the number of Kbytes/sec; dividing this number by 1024 gives you Mbytes/sec. If you divide the total amount of data that has to be backed up from the client by the result of the ftp transfer, you get the estimated time the backup would take. The actual results of a live backup depend on how many clients you back up at the same time, the hardware capabilities of the backup server and the type of backup hardware/media you use. If you merely want to test actual TCP traffic from network card to network card, you can use tools like ttcp. Note that the /dev/null method can produce unreliable results on some Unixes, though.
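The same estimate in a few lines of Python; the ftp rate and client size below stand in for whatever your own measurement produces:

```python
def estimated_backup_hours(client_data_gb: float, ftp_kb_per_sec: float) -> float:
    """Estimate backup duration from an ftp throughput measurement."""
    mb_per_sec = ftp_kb_per_sec / 1024  # ftp reports Kbytes/sec
    return client_data_gb * 1024 / mb_per_sec / 3600

# Example: ftp reported 7200 Kbytes/sec; the client holds 50 GB.
print(f"{estimated_backup_hours(50, 7200):.1f} hours")  # -> ~2.0 hours
```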

Let's assume your backup server has one 100 Mbps link to the network. This should be enough to get two DLT (Digital Linear Tape) drives to stream. If you experience slow backups, adding tape drives would probably slow your backups down even further, because the drives must share the available bandwidth, causing them to drop out of streaming mode. So you should keep the minimum drive throughput in combination with the network speed on the receiving side in mind. The client machines are less important in this case: if you have slow clients, you can start more than one backup at the same time to increase the data stream to your backup server. For this reason it is important to know what link speeds there are on your client machines. Create an inventory list of client hardware to plan the schedules for your different server groups.
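A rough sanity check on how many drives one link can keep streaming; both numbers below are assumptions to replace with your own link rate and drive specifications:

```python
def max_streaming_drives(link_mb_per_sec: float, min_stream_mb_per_sec: float) -> int:
    """How many tape drives a single network link can keep in streaming mode."""
    return int(link_mb_per_sec // min_stream_mb_per_sec)

# Example: a 100BaseT link at ~8.5 MB/s real-life throughput, and drives
# assumed to stay streaming from ~4 MB/s upward.
print(max_streaming_drives(8.5, 4.0))  # -> 2 drives
```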


Memory.

On the client side, the available memory is usually enough, because the system load during a backup is comparable to normal operation (more or less, depending on the server's task). If your system behaves acceptably during operating hours, it should not degrade during backup hours. However, there are some memory considerations on the backup server, like the following. Some third-party backup solutions use chunks of shared memory to buffer data before it is transferred to the tape drives used during a backup or restore. The calculation of the size of shared memory depends on four parameters: 1. the size of the buffers, 2. the number of buffers, 3. the number of drives and 4. the number of multiplexes (Veritas) or parallelisms (Legato). Multiplexes or parallelisms are the number of backups that you stream to a single drive simultaneously. This can be multiple file systems from one client, multiple clients at the same time, or a mix of those two. Note that you can multiplex on a per-client basis or on a per-drive basis. So, if you have 2 drives and 4 multiplexes per drive, you can back up 4 clients with 2 multiplexes each at the same time.

Veritas suggests the following calculation to determine the size of shared memory:

Shared_memory = (buffer_size * nBuffers) * nDrives * nMPX
(Ref. 2)

So, if you have 4 drives you can use simultaneously, 4 multiplexes per drive, a buffer size of 64 KB and 16 buffers, you get the following:

(65536 * 16) * 4 * 4 = 16777216 bytes, which is 16 megabytes of shared memory.

Note that the calculation is done using the full notation of the number of bytes. Shared memory is a limited resource: setting the buffer size and the number of buffers too high for the available shared memory can cause problems, because the software expects there to be enough shared memory to store its buffers in.

A buffer size of 65536 bytes seems to give best performance on DLT drives.
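The same calculation as a tiny function, handy for checking a configuration against your system's shared memory limit; the parameter names mirror the formula above, and the 32 MB limit is an assumption to replace with your platform's actual setting (e.g. shmmax):

```python
def shared_memory_bytes(buffer_size: int, n_buffers: int, n_drives: int, n_mpx: int) -> int:
    """Shared memory needed per the Veritas formula (Ref. 2)."""
    return (buffer_size * n_buffers) * n_drives * n_mpx

need = shared_memory_bytes(65536, 16, 4, 4)
print(need, "bytes =", need // (1024 * 1024), "MB")  # -> 16777216 bytes = 16 MB

SHMMAX = 32 * 1024 * 1024  # assumed kernel shared-memory limit
assert need <= SHMMAX, "buffers will not fit in shared memory"
```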

Tape drive technologies.

The following table shows a number of different tape drive technologies:

Technology           Native capacity in GB   Speed in MB/second
DDS-3                12                      1
DDS-4                20                      3
Mammoth (Exabyte)    20                      3
DLT 7000             35                      5
DLT 8000             40                      6
(Ref. 1)


Choose the technology that best fits your needs and budget. Even if you choose to use a robotic tape library, you can still decide, to a certain level, what kind of tape technology you use, because some libraries support different kinds of tape drives. Let's say you have to back up four machines with a total of 60 GB of data. It will probably be cheaper to buy one DLT drive, connect it to one machine and perform remote backups with ufsdump. Not only is it cheaper than buying a drive for every machine (extra SCSI cards, perhaps?), it is also easier to manage, because you only have to change tapes in one drive instead of four. Next to that, administration concentrates on one location instead of four.
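A quick check that a single shared drive can actually handle such a workload within a nightly window; the drive speed is from the table above, and the eight-hour window is an assumption:

```python
def fits_in_window(data_gb: float, drive_mb_per_sec: float, window_hours: float) -> bool:
    """Can one drive stream data_gb within the backup window at its native speed?"""
    hours_needed = data_gb * 1024 / drive_mb_per_sec / 3600
    print(f"needs {hours_needed:.1f} h of a {window_hours:.0f} h window")
    return hours_needed <= window_hours

# Example: the 60 GB from the text on one DLT 7000 (5 MB/s native).
fits_in_window(60, 5, 8)  # needs ~3.4 h -> True
```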

Software packages.

There is a wide variety of backup/restore software packages available on the market today. A lot of these packages support a broad range of software and hardware platforms. There are already a number of packages designed for the Linux community too.

A small selection of backup/restore software packages:

- Amanda
- Veritas NetBackup
- Legato NetWorker
- BRU from EST (Linux)
- Arkeia from Knox (Linux)
- HP OmniBack

The decision on what software to choose should be based on the needs of your environment, personal preferences (evaluate that stuff!) and, as always, budgetary limitations (or absolute freedom).

Off-site storage.

Many companies require business-critical data to be stored outside the company walls. You can imagine that, in case of a fire, you do not want all your data (disks, tapes) in one location. The problem here is that you still need your data within reach to ensure quick restores if needed. To solve this problem you can 'clone' or 'duplicate' the data that has been written to tape during backup hours, usually at night. The ideal way is to have a dedicated backup server that can perform these duplications while it is idle during non-backup hours. This mechanism can and will influence the way you set up multiplexes or parallelisms. For instance, when you want to duplicate a tape with 8 multiplexes, the duplication process will run over the tape 8 times to construct 8 files on the duplication tape. The reason for this is to decrease restore times dramatically when you are forced to use the duplicate to restore data. Duplication can be a motivation to buy extra tape drives, even if the network bandwidth of the backup server would otherwise withhold you from using them.


Backup frequency.

Basically you can designate three kinds of data: 1. static data, 2. dynamic data, 3. machine-generated data.

1. Making regular backups of static data can be a waste of time, tape space and therefore money. With static data, think of the files the OS consists of, or application files. It should be sufficient to make full backups, say, once a month, with incremental backups in between. It all depends on how dynamic your configuration files are.

2. Dynamic data is data that applications use, or data that resides in databases such as Oracle or Sybase. This data can change very quickly. It can change so quickly that you might even consider not making incremental backups at all, because in the end they will just be full backups.

3. In some cases, machines produce data from, for instance, mathematical calculations or 'end of month' financial calculations. The decision whether to back this data up should be based on how long it would take the machine to regenerate it versus the time you need to get it back from tape.

How to motivate management.

Implementing a brand new backup system can be very expensive, and this investment is usually a painful one for management. Most IT systems are used to run the business more efficiently, and they pay for themselves in the long run. Backup systems cost money and they just sit there doing backups. Of course, this is a wrong assumption: the reason you make backups is to protect your precious data from hardware failures, software failures, human error and foul play. Losing data is far more expensive than maintaining a backup system to prevent data loss in the first place.

The question you must ask yourself is: what is the company going to lose when data is completely lost, or when it takes an extensive period of time, one day for example, to get it back? A good backup mechanism should be considered an insurance policy. Compared with the value of the data it protects, it is a relatively cheap insurance policy too.

Conclusion.

Implementing a data-center-wide backup solution is not an easy task. Talk to the different departments and with the end users about their requirements and data types. Compare the wide range of products on the market and evaluate software packages. And when it all runs? Test your restore procedures on a regular basis.


Resources

Ref. 1. Planning your backup architecture. White paper. Sun Microsystems.
Ref. 2. Size/number of data buffers: how it works and how to determine what to use. TechNote. Veritas Software.