RAID ARRAYS
Redundant Array of Inexpensive Discs
WHAT IS A RAID ARRAY?
RAID is an acronym for Redundant Array of Independent Drives (or Disks), also known as Redundant Array of Inexpensive Drives (or Disks).
The various types of RAID are data storage schemes that divide and/or replicate data among multiple hard drives.
WHY USE RAID?
Improved Reliability
Improved Performance
Fault Tolerance
Improved Availability
Higher Data Security
KEY TERMS
Mirroring - the copying of data to more than one disk
Striping - the splitting of data across more than one disk
Parity - a redundancy check that protects the data without requiring a full set of duplicate drives
Duplexing - an extension of mirroring based on the same principle, except that it goes one step further and also duplicates the hardware that controls the two hard drives (or sets of hard drives)
RAID - REDUNDANT ARRAY OF INDEPENDENT DISKS
[Figure labels: Host, RAID Controller, RAID Array]
RAID COMPONENTS
[Figure labels: Host, RAID Controller, RAID Array, Physical Array, Logical Array]
DATA ORGANIZATION: STRIPS AND STRIPES
[Figure labels: Stripe 1, Stripe 2, Stripe 3; Strips]
RAID LEVELS
0 - Striped array with no fault tolerance
1 - Disk mirroring
3 - Parallel access array with dedicated parity disk
4 - Striped array with independent disks and a dedicated parity disk
5 - Striped array with independent disks and distributed parity
6 - Striped array with independent disks and dual distributed parity
Combinations of levels (e.g., 1 + 0, 0 + 1)
RAID 0
A striped set of at least two disks without parity.
The data is broken down into blocks and each block is written to a separate disk drive.
Best performance is achieved when data is striped across multiple controllers with only one drive per controller.
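The block placement described above can be sketched as a round-robin mapping. This is only an illustration, assuming one block per strip and equal-sized disks; `locate_block` and its parameters are hypothetical names, not taken from any RAID product.

```python
def locate_block(block: int, num_disks: int) -> tuple[int, int]:
    """Map a logical block number to (disk index, stripe index) in a RAID 0 set.

    Assumes one block per strip and round-robin placement across the disks.
    """
    if num_disks < 2:
        raise ValueError("RAID 0 needs at least two disks")
    if block < 0:
        raise ValueError("block number must be non-negative")
    # Consecutive blocks go to consecutive disks; a new stripe starts
    # each time every disk has received one block.
    return block % num_disks, block // num_disks


if __name__ == "__main__":
    # Five blocks on a hypothetical three-disk set.
    for b in range(5):
        disk, stripe = locate_block(b, 3)
        print(f"Block {b} -> disk {disk}, stripe {stripe}")
```

Because consecutive blocks land on different disks, a large sequential transfer can keep all drives busy at once, which is where the RAID 0 performance gain comes from.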
RAID 0 – STRIPED ARRAY WITH NO FAULT TOLERANCE
[Figure: host, RAID controller, and blocks 0 to 4 striped across the disks]
ADVANTAGES OF RAID 0
I/O performance is greatly improved by spreading the I/O load across many channels and drives
No parity calculation overhead is involved
Very simple design
Easy to implement
DISADVANTAGES OF RAID 0
Not a "true" RAID because it is NOT fault-tolerant
The failure of just one drive will result in all data in an array being lost
Should never be used in mission-critical environments
RAID 1 – DISK MIRRORING
[Figure: host, RAID controller, and blocks 0 and 1 each written to every disk in the mirrored set]
RAID 1 ADVANTAGES
High data availability and high I/O rate (small block size).
Improves read performance: twice the read transaction rate of single disks, same write transaction rate as single disks.
100% redundancy of data means no rebuild is necessary in case of a disk failure, just a copy to the replacement disk.
Simplest RAID storage subsystem design; easy to maintain.
RAID 1 DISADVANTAGES
Expensive due to the extra capacity required to duplicate data. Overhead cost equals 100%, while usable storage capacity is 50%.
May not support hot swap of a failed disk when implemented in software. Use a hardware implementation.
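The 100% overhead and 50% usable capacity figures follow from simple arithmetic. The sketch below works them out for a two-way mirror; the function name, the dictionary keys, and the 500 GB disk size are hypothetical choices made for illustration.

```python
def mirror_capacity(disk_size_gb: float, copies: int = 2) -> dict:
    """Return raw, usable, and overhead figures for a mirrored set of equal disks."""
    if copies < 2:
        raise ValueError("mirroring needs at least two copies")
    raw = disk_size_gb * copies
    usable = disk_size_gb  # only one copy's worth of space holds unique data
    return {
        "raw_gb": raw,
        "usable_gb": usable,
        # Share of the raw capacity that is available for data.
        "usable_pct": 100 * usable / raw,
        # Extra capacity bought for every unit of usable capacity.
        "overhead_pct": 100 * (raw - usable) / usable,
    }


if __name__ == "__main__":
    print(mirror_capacity(500))
```

For two 500 GB disks this gives 1000 GB raw, 500 GB usable, 50% usable capacity, and 100% overhead.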
RAID 0+1 – STRIPING AND MIRRORING
[Figure: host, RAID controller, and blocks 0 to 3 written to a striped set that is then mirrored]
RAID 1+0 – MIRRORING AND STRIPING
[Figure: host, RAID controller, and blocks 0 to 3 written to mirrored pairs that are then striped]
RAID 0+1 VS. RAID 1+0
Benefits are identical under normal operations
Rebuild operations are very different
  RAID 1+0 uses a mirrored pair: only 1 disk is rebuilt if a disk fails
  RAID 0+1: if a single drive fails, the entire stripe is faulted
RAID 0+1 is a poorer solution and is less common
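One way to see the difference is to count which disk failures each layout survives. The sketch below assumes a hypothetical four-disk array split into two groups of two disks; all names are illustrative and it models only data survival, not rebuild time.

```python
from itertools import combinations

DISKS = [0, 1, 2, 3]
GROUPS = [{0, 1}, {2, 3}]  # hypothetical layout: two groups of two disks


def raid10_survives(failed: set) -> bool:
    """RAID 1+0: each group is a mirrored pair; data is lost only if a whole pair fails."""
    return all(not pair <= failed for pair in GROUPS)


def raid01_survives(failed: set) -> bool:
    """RAID 0+1: each group is a stripe set; one failed disk faults its whole stripe set,
    so at least one stripe set must be completely healthy."""
    return any(not (stripe & failed) for stripe in GROUPS)


def count_survivable(survives, num_failures: int) -> int:
    """Count the failure combinations of the given size that the layout survives."""
    return sum(survives(set(c)) for c in combinations(DISKS, num_failures))


if __name__ == "__main__":
    total = len(list(combinations(DISKS, 2)))
    print("RAID 1+0:", count_survivable(raid10_survives, 2), "of", total)
    print("RAID 0+1:", count_survivable(raid01_survives, 2), "of", total)
```

Under these assumptions both layouts survive any single failure, but RAID 1+0 survives 4 of the 6 possible two-disk failures while RAID 0+1 survives only 2.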
RAID REDUNDANCY: PARITY
[Figure: host, RAID controller, and four data disks holding blocks 0, 4, 8 / 1, 5, 9 / 2, 6, 10 / 3, 7, 11, plus a parity disk holding the parity of blocks 0 to 3, 4 to 7, and 8 to 11]
PARITY CALCULATION
[Figure: four data disks holding 5, 3, 4, and 2, and a parity disk holding 14]
5 + 3 + 4 + 2 = 14
The middle drive fails:
5 + 3 + ? + 2 = 14
? = 14 – 5 – 3 – 2
? = 4
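The slide uses addition to keep the idea simple. Implementations commonly use bitwise XOR instead, because XOR is its own inverse and never overflows. The sketch below repeats the example with XOR, using the same data values; the function names are hypothetical.

```python
from functools import reduce
from operator import xor


def parity(values):
    """XOR all data values together to form the parity value."""
    return reduce(xor, values, 0)


def rebuild(surviving, parity_value):
    """Recover the single missing value from the surviving values and the parity."""
    return reduce(xor, surviving, parity_value)


if __name__ == "__main__":
    data = [5, 3, 4, 2]
    p = parity(data)              # 5 ^ 3 ^ 4 ^ 2
    # The drive holding 4 fails; XOR the survivors with the parity.
    lost = rebuild([5, 3, 2], p)
    print("parity:", p, "rebuilt value:", lost)
```

With XOR, the same operation both generates the parity and rebuilds the lost value, so no subtraction step is needed.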
RAID 3 – PARALLEL TRANSFER WITH DEDICATED PARITY DISK
[Figure: host, RAID controller, data blocks 0 to 3 written across the data disks, and the generated parity P 0 1 2 3 written to the dedicated parity disk]
RAID 4 – STRIPING WITH DEDICATED PARITY DISK
[Figure: host, RAID controller, data blocks 0 to 7 striped across the data disks, and the generated parity P 0 1 2 3 and P 4 5 6 7 written to the dedicated parity disk]
RAID 5 – INDEPENDENT DISKS WITH DISTRIBUTED PARITY
[Figure: host, RAID controller, data blocks 0 to 7 striped across the disks, with the generated parity P 0 1 2 3 and P 4 5 6 7 stored on different disks]
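Distributed parity means the parity strip moves to a different disk from stripe to stripe. The sketch below builds such a layout. The rotation rule it uses is an assumption chosen for illustration, since controllers differ in how they rotate parity, and `raid5_layout` is a hypothetical name.

```python
def raid5_layout(num_disks: int, num_stripes: int) -> list[list[str]]:
    """Build a stripe-by-stripe layout with the parity strip rotating across disks.

    Assumption: parity for stripe s sits on disk (num_disks - 1 - s) % num_disks.
    Real controllers may rotate parity differently.
    """
    if num_disks < 3:
        raise ValueError("RAID 5 needs at least three disks")
    layout = []
    block = 0
    for stripe in range(num_stripes):
        parity_disk = (num_disks - 1 - stripe) % num_disks
        row = []
        for disk in range(num_disks):
            if disk == parity_disk:
                row.append("P")           # parity strip for this stripe
            else:
                row.append(f"B{block}")   # next data block
                block += 1
        layout.append(row)
    return layout


if __name__ == "__main__":
    for row in raid5_layout(5, 2):
        print(" ".join(f"{cell:>3}" for cell in row))
```

With five disks, the first stripe holds blocks 0 to 3 plus their parity and the second holds blocks 4 to 7 plus theirs, with the two parity strips on different disks. Spreading parity this way avoids the single parity disk that every write must touch in RAID 4.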
RAID 6 – DUAL PARITY RAID
Two disk failures in a RAID set lead to data unavailability and data loss in single-parity schemes, such as RAID 3, 4, and 5
Increasing the number of drives in an array and increasing drive capacity lead to a higher probability of two disks failing in a RAID set
RAID 6 protects against two disk failures by maintaining two parities
  Horizontal parity, which is the same as RAID 5 parity
  Diagonal parity, which is calculated by taking diagonal sets of data blocks from the RAID set members
Even-Odd and Reed-Solomon are two commonly used algorithms for calculating parity in RAID 6
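The effect of drive count on the risk of a double failure can be estimated with a simple binomial model. This is a rough sketch: it assumes drives fail independently with the same probability over some period, the 2% figure is a made-up example value, and it does not model the effect of drive capacity.

```python
def prob_two_or_more_failures(num_disks: int, p_fail: float) -> float:
    """Probability that at least two of num_disks fail, assuming independent
    failures with the same probability p_fail over the period of interest."""
    if not 0.0 <= p_fail <= 1.0:
        raise ValueError("p_fail must be between 0 and 1")
    none = (1 - p_fail) ** num_disks
    exactly_one = num_disks * p_fail * (1 - p_fail) ** (num_disks - 1)
    return 1 - none - exactly_one


if __name__ == "__main__":
    # Hypothetical 2% per-drive failure probability.
    for n in (4, 8, 16):
        print(f"{n} drives: {prob_two_or_more_failures(n, 0.02):.4%}")
```

Under this model the probability of losing two or more drives rises quickly as drives are added, which is the case a single-parity scheme cannot survive.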
RAID IMPLEMENTATIONS
Hardware (usually a specialized disk controller card)
  Controls all drives attached to it
  Performs all RAID-related functions, including volume management
  Array(s) appear to the host operating system as a regular disk drive
  Dedicated cache to improve performance
  Generally provides some type of administrative software
Software
  Generally runs as part of the operating system
  Volume management performed by the server
  Provides more flexibility for hardware, which can reduce the cost
  Performance is dependent on CPU load
  Has limited functionality
HOT SPARES
HOT SWAP
CHECK YOUR KNOWLEDGE
What is a RAID array?
What benefits do RAID arrays provide?
What methods can be used to provide higher data availability in a RAID array?
What is the primary difference between RAID 3 and RAID 5?
What is a hot spare?