Sean Traber CS-147 Fall 2009


Page 1: Sean Traber, CS-147 Fall 2009

Page 2: 7.9 RAID (Outline)

7.9 RAID
7.9.1 RAID Level 0
7.9.2 RAID Level 1
7.9.3 RAID Level 2
7.9.4 RAID Level 3
7.9.5 RAID Level 4
7.9.6 RAID Level 5
7.9.7 RAID Level 6
7.9.8 RAID DP (Double Parity)
7.9.9 Hybrid RAID Systems

Page 3: 7.9 RAID

Originally stood for Redundant Arrays of Inexpensive Disks.

The term “inexpensive” proved ambiguous: it was widely assumed to refer to the cost of the drives, which was not the intended meaning.

The term “inexpensive” actually referred to the slight performance hit required to make the data storage reliable.

The accepted acronym now uses “independent” instead of “inexpensive”.

Page 4: 7.9 RAID

Based on the 1988 paper “A Case for Redundant Arrays of Inexpensive Disks,” by David Patterson, Garth Gibson, and Randy Katz of U.C. Berkeley.

They coined the term RAID and defined five types of RAID, called levels, numbered 1 through 5.

Definitions for levels 0, 6, and DP came later. Many vendors also offer enterprise-class storage systems that are not protected by RAID; these systems are often referred to as JBOD, or Just a Bunch Of Disks.

Page 5: 7.9.1 RAID Level 0

Not a true RAID implementation; it lacks the “R” (there is no redundancy). Very quick read/write performance.

Not a good way to store sensitive data: if one disk fails, all the data is lost, making it less reliable than a single-disk system.

An array of 5 disks, each with a design life of ~50,000 hours (about 6 years), gives the entire system an expected design life of 50,000/5 = 10,000 hours (about 14 months).

As the number of disks increases, the probability of failure increases until it reaches near certainty.
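To make the arithmetic concrete, here is a minimal Python sketch (illustrative, not from the slides) of this simple reliability model, where the array dies with its first disk:

```python
def array_design_life(disk_life_hours: float, num_disks: int) -> float:
    """With N disks and no redundancy, the array fails when the first
    disk fails, so expected design life scales as 1/N."""
    return disk_life_hours / num_disks

def p_array_failure(p_disk: float, num_disks: int) -> float:
    """Probability that at least one of N independent disks fails."""
    return 1 - (1 - p_disk) ** num_disks

# Example from the slide: 5 disks rated for ~50,000 hours each.
life = array_design_life(50_000, 5)
print(f"{life:.0f} hours (~{life / 730:.0f} months)")  # 10000 hours, ~14 months

# Failure probability approaches certainty as the array grows.
for n in (1, 5, 20, 100):
    print(n, "disks:", round(p_array_failure(0.05, n), 3))
```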

Page 6: 7.9.1 RAID Level 0

Best used for data that is non-critical, or data that rarely changes and is backed up frequently.

Page 7: 7.9.2 RAID Level 1

Also known as disk mirroring. Writes all data twice: once to the main drives, and once to a set of drives called a mirror set or shadow set.

Inefficient use of space: it requires twice as much space as the amount of data to be stored.

Highest fault tolerance of any RAID system.

Read performance can as much as double on multi-threaded systems that support “split seeks”, which read from the disk arm that is closest to the target sector.
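A minimal sketch of the split-seek idea (the class names and the simplified head-position model are assumptions, not from the slides): each read is sent to whichever mirror copy needs the shorter seek.

```python
class MirroredDisk:
    """One copy in a two-way mirror; tracks where its arm is parked."""

    def __init__(self) -> None:
        self.head_pos = 0  # current cylinder of the disk arm

    def read(self, cylinder: int) -> None:
        self.head_pos = cylinder  # seek to the target, then read (simplified)

def split_seek_read(mirror: list[MirroredDisk], cylinder: int) -> MirroredDisk:
    # Pick the copy whose arm is closest to the target cylinder.
    best = min(mirror, key=lambda d: abs(d.head_pos - cylinder))
    best.read(cylinder)
    return best

pair = [MirroredDisk(), MirroredDisk()]
pair[1].head_pos = 900
split_seek_read(pair, 1000)  # served by the copy parked at cylinder 900
split_seek_read(pair, 50)    # served by the copy parked at cylinder 0
```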

Page 8: 7.9.2 RAID Level 1

Best suited for transaction-oriented, high-availability environments and applications requiring a high fault tolerance, such as accounting and payroll.

Page 9: 7.9.3 RAID Level 2

Cuts down on the number of extra drives needed to protect the data.

Takes data striping to the extreme, writing only 1 bit per strip instead of arbitrarily sized blocks.

This requires a minimum of 8 surfaces to write to.

Generates Hamming code for data protection and stores it on the extra drives.

Cuts the number of extra drives needed from n to roughly log2(n).
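To illustrate the logarithmic growth (a sketch, not from the slides): a Hamming code protecting n data bits needs the smallest k check bits satisfying 2^k >= n + k + 1.

```python
def hamming_check_bits(data_bits: int) -> int:
    """Smallest k such that 2**k >= data_bits + k + 1 (Hamming bound)."""
    k = 0
    while 2 ** k < data_bits + k + 1:
        k += 1
    return k

# With one drive per bit position, n data drives need only ~log2(n)
# check drives, versus n extra drives for full mirroring.
for n in (4, 8, 16, 64):
    print(f"{n} data drives -> {hamming_check_bits(n)} check drives")
```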

Page 10: 7.9.3 RAID Level 2

Hamming code generation is time consuming, and therefore RAID-2 is too slow for most commercial implementations.

RAID-2 is best known for serving as the theoretical bridge between RAID-1 and RAID-3.

Page 11: 7.9.4 RAID Level 3

Page 12: 7.9.5 RAID Level 4

Another “theoretical” RAID level; like RAID-2, it would offer poor performance if implemented.

Uses one extra parity disk, like RAID-3, but writes data in blocks of uniform size instead of one bit at a time.

Let’s take a stripe spanning 5 drives (4 data, 1 parity). To write to strip 3, we have to do the following (a code sketch follows the list):

1. Read the data currently occupying strip 3, and the parity strip.

2. XOR the old parity with both the old data and the new data to get the new parity.

3. Write the new data to strip 3.

4. Update the parity strip.

If there were write requests pending for strips 1 and 4, each would have to wait until the previous write completed, since each needs the updated state of the parity.

Previous levels could do multiple writes concurrently because no strip’s data was dependent upon the data on any other strip.
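A minimal Python sketch of the read-modify-write parity update described above (strip contents modeled as one-byte values; the function names are illustrative):

```python
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def write_strip(strips: list[bytes], parity: bytes,
                index: int, new_data: bytes) -> bytes:
    """Read-modify-write: new_parity = old_parity XOR old_data XOR new_data."""
    old_data = strips[index]                 # 1. read the old data (and parity)
    new_parity = xor_bytes(xor_bytes(parity, old_data), new_data)  # 2. compute
    strips[index] = new_data                 # 3. write the new data
    return new_parity                        # 4. update the parity strip

# 4 one-byte data strips plus a parity strip that XORs them all.
strips = [b"\x0f", b"\xf0", b"\x55", b"\xaa"]
parity = reduce(xor_bytes, strips)
parity = write_strip(strips, parity, 2, b"\x33")
assert parity == reduce(xor_bytes, strips)   # parity is still consistent
```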

Page 13: 7.9.5 RAID Level 4

This essentially turns the parity drive into a large bottleneck, which negates the performance gains of multiple disk systems.

Due to its expected poor performance, there are almost no commercial implementations of RAID-4.

Page 14: 7.9.6 RAID Level 5

Similar to RAID-4, but the parity strips are spread throughout the array. This allows for concurrent writes as long as the requests involve different sets of disk arms for both parity and data.

This doesn’t guarantee that writes can be performed concurrently, but RAID-5’s write performance is considered to be “acceptable”.

This design offers the best read performance of all the parity models, but also requires the most complex disk controllers of all the levels.
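A minimal sketch of the rotating parity placement (the rotation convention here is an assumption; real controllers differ in layout details):

```python
NUM_DISKS = 5  # 4 data strips + 1 parity strip per stripe

def raid5_layout(stripe: int) -> tuple[int, list[int]]:
    """Map a stripe number to (parity disk, data disks).

    Rotating the parity location stripe by stripe spreads parity
    updates across all drives, avoiding RAID-4's dedicated-parity
    bottleneck."""
    parity_disk = (NUM_DISKS - 1) - (stripe % NUM_DISKS)
    data_disks = [d for d in range(NUM_DISKS) if d != parity_disk]
    return parity_disk, data_disks

for s in range(5):
    parity, data = raid5_layout(s)
    print(f"stripe {s}: parity on disk {parity}, data on disks {data}")
```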

Page 15: 7.9.6 RAID Level 5

RAID-5 has been a commercial success because it offers the best protection for the least cost, and is more commonly used than any of the other RAID systems.

Page 16: 7.9.7 RAID Level 6

The previous levels of RAID could tolerate at most 1 disk failure at a time.

However, disk failures tend to come in clusters for two reasons:

1. If the drives were manufactured at approximately the same time, they will reach the end of their expected useful life at around the same time.

2. Drive failures are often caused by events such as power surges, which hit all drives at the same instant, causing the weakest one to fail first, then the next weakest, and so on.

The larger the drives are, the longer it takes to rebuild one from the parity, which leaves the system vulnerable to a second drive crashing and rendering the entire array useless.

Page 17: 7.9.7 RAID Level 6

RAID-6 attempts to protect against multiple-disk failure by using two sets of error-correction strips for each row of drives.

One strip holds the parity, and the other stores Reed-Solomon error-correcting codes.

Although RAID-6 offers good protection against the entire array becoming useless while a single failed disk is rebuilt, it is not often used commercially because it has two performance drawbacks (a sketch of the parity computation follows the list):

1. There is sizeable overhead involved in generating the Reed-Solomon codes.

2. It requires twice as many read/write operations to update the error-correcting codes.
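As a sketch of why the second strip is costlier than plain parity, here is the common P/Q formulation of RAID-6 over GF(2^8) (an assumption; the slides do not specify the exact code): P is a plain XOR, while Q weights each data byte by a power of the generator g = 2.

```python
def gf_mul2(x: int) -> int:
    """Multiply by the generator g = 2 in GF(2^8), using the common
    RAID-6 reduction polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11D)."""
    x <<= 1
    return (x ^ 0x1D) & 0xFF if x & 0x100 else x

def raid6_syndromes(data: list[int]) -> tuple[int, int]:
    """P and Q for one byte position across the data disks:
    P = XOR of all D_i, and Q = sum over i of g^i * D_i in GF(2^8)."""
    p = q = 0
    for d in reversed(data):        # Horner's rule: q = g*q XOR d
        p ^= d
        q = gf_mul2(q) ^ d
    return p, q

p, q = raid6_syndromes([0x0F, 0xF0, 0x55, 0xAA])
print(f"P = {p:#04x}, Q = {q:#04x}")
```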

Page 18: 7.9.8 RAID DP (Double Parity)

Most commonly, the DP stands for Double Parity; it is sometimes also read as Diagonal Parity.

Protects against two-disk failures by making sure that each data block is protected by two linearly independent parity functions.

Page 19: 7.9.8 RAID DP (Double Parity)

RAID DP can offer much more reliable protection for arrays containing many disks than the simple parity protection of RAID-5 can.

The second parity gives much better performance than the Reed-Solomon correction of RAID-6.

The write performance is still poorer than that of RAID-5 due to the need for dual reads and writes, so there is a tradeoff of performance for reliability.
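A toy sketch of the two linearly independent parity functions (heavily simplified from real RAID DP layouts, which also fold the row-parity disk into the diagonals): row parity XORs across each row, and diagonal parity XORs strips lying on the same wrapped diagonal.

```python
# Toy model: an n x n block of data strips, block[row][disk].
# Row parity XORs across each row; diagonal parity XORs strips whose
# (row + disk) mod n is constant. Two independent equations per block
# let two failed disks be rebuilt one missing strip at a time.

def xor_all(values: list[int]) -> int:
    out = 0
    for v in values:
        out ^= v
    return out

def dp_parities(block: list[list[int]]) -> tuple[list[int], list[int]]:
    n = len(block)
    row_parity = [xor_all(row) for row in block]
    diag_parity = [
        xor_all([block[r][d] for r in range(n) for d in range(n)
                 if (r + d) % n == k])
        for k in range(n)
    ]
    return row_parity, diag_parity

rows, diags = dp_parities([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print("row parity:", rows, "diagonal parity:", diags)
```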

Page 20: 7.9.9 Hybrid RAID Systems

Page 21: 7.9.9 Hybrid RAID Systems (continued)

Most systems have different needs for different operations, and therefore benefit from using a variety of RAID levels.

For example, operating system files are very important, so you may be willing to spend the extra space to keep a full mirror copy of those files using RAID-1.

RAID-5 is sufficient for most data files, and temp files can benefit from the speed of RAID-0.

RAID schemes can be combined to form a “new” RAID level. RAID-10 combines the striping of RAID-0 with the mirroring of RAID-1 to give the best possible read performance while providing the best possible availability, but is extremely expensive.
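A small sketch of how a RAID-10 layout might map one logical block to physical copies (NUM_PAIRS and raid10_targets are illustrative names, not from the slides): striping picks the mirror pair, mirroring duplicates the block within it.

```python
NUM_PAIRS = 4  # stripe across 4 mirrored pairs (8 disks total)

def raid10_targets(logical_block: int) -> tuple[int, list[tuple[int, int]]]:
    """Return (offset on disk, [(pair, copy), ...]) for one block.

    RAID-0 striping picks which mirror pair holds the block; RAID-1
    mirroring means the block is written to both disks of that pair."""
    pair = logical_block % NUM_PAIRS        # striping (RAID-0)
    offset = logical_block // NUM_PAIRS     # position within each disk
    return offset, [(pair, 0), (pair, 1)]  # both mirror copies (RAID-1)

for blk in range(6):
    offset, targets = raid10_targets(blk)
    print(f"block {blk}: offset {offset}, copies on {targets}")
```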