RIMAC: Redundancy-based hierarchical I/O cache architecture for energy-efficient, high-performance storage systems
Xiaoyu Yao and Jun Wang
Computer Architecture and Storage System Laboratory (CASS)
University of Nebraska - Lincoln
2006-04-20 University of Nebraska-Lincoln 2
Big Picture
- Current energy-efficient storage solutions are promising:
  - Saving energy at the cost of performance
  - Saving energy by using DRPM disks
- New: RIMAC, a redundancy-based hierarchical I/O cache architecture
  - Making the storage cache aware of redundancy
  - Solving the performance problem with power-aware request transformation
Outline
- Background & Motivation
  - Why? How does RIMAC differ?
- RIMAC: Redundancy-based Hierarchical I/O Cache Architecture
- Evaluation
- Conclusion
Energy Issues of Internet Data Center
[Figure: Internet data center tiers: Internet, router, switch, web servers, application servers, database servers, SAN, storage system. Labels on the figure: "27% of Total Energy*" and "70%/Year".]
* From WP'02, http://www.max-t.com
Backend Storage System
- High-performance SCSI disks
- Small disk array as building block
  - RAID-1, mirrored disk array
  - RAID-5, parity disk array
- Multi-level I/O cache
  - Large storage cache
  - Moderate RAID controller cache
Related Work
Name                   Conventional Disk   Disk Array        Storage Cache   Performance Penalty
MAID (SC'02)           Yes                 RAID-0            Yes             Yes
PDC (ICS'04)           Yes                 No                No              Yes
FS2 (SOSP'05)          Yes                 No                No              No
DRPM (ISCA'03)         No                  RAID-1, RAID-5    No
PA/PB (HPCA'04)        No                  No                Yes
Hibernator (SOSP'05)   No                  RAID-5            No
Motivations
- Server workload characteristics
  - Dispersed idle periods
  - High performance vs. energy conservation
- Long "passive spin-up" delay in conventional disks (10-15 seconds)
- Exploiting existing infrastructure to consolidate the short idle periods
  - Internal redundancy in disk array
  - Multi-level I/O cache
RIMAC - Redundancy
- Identifying sources of "passive spin-up"
  - Non-blocking read
  - Derivative read due to parity update
  - Dirty block flushing [Zhu et al., HPCA'04]
- Exploiting inherent redundancy to address untouched sources of "passive spin-up"
  - 1/N redundancy in RAID-5
  - Requests on standby disks are transformed to active disk accesses
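The 1/N redundancy of RAID-5 can be illustrated with a small sketch. It assumes a stripe of three data blocks plus one XOR parity block; the helper name `xor_blocks` and the byte values are made up for illustration and are not taken from the RIMAC implementation.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Bytewise XOR of equally sized blocks (hypothetical helper)."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("blocks must have equal length")
    # zip(*blocks) walks the blocks column by column, one byte from each.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# One stripe on a four-disk RAID-5: three data blocks and their parity.
d4, d5, d6 = b"\x0f\x10", b"\xa5\x01", b"\x3c\xff"
p2 = xor_blocks(d4, d5, d6)

# If the disk holding block 6 is in standby, the block can be rebuilt
# from the other three stripe members instead of spinning that disk up.
rebuilt_d6 = xor_blocks(d4, d5, p2)
```

Any single member of a stripe can be recovered this way, which is what allows a request aimed at a standby disk to be turned into accesses to the remaining members.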
RIMAC - Cooperative Cache
- Deploying parity exclusive cache
  - Storage cache: user data
  - RAID controller cache: parity
- Leveraging redundancy exploitation in cache
  - High-performance power-aware request transformation in multi-level I/O cache
  - Larger effective storage cache size with new placement/replacement algorithm
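Sample Scenario – Transformable Read in Cache (TRC)
[Figure: RIMAC storage system with a front-end, a storage cache (upper half), a parity cache (bottom half), and a four-disk RAID-5 array. Disk2 is in standby; Disk1, Disk3, and Disk4 are idle/active.]
Block layout (one stripe per row):
  Disk1  Disk2  Disk3  Disk4
  1      2      3      P1
  5      6      P2     4
  9      P3     7      8
  P4     10     11     12
- Storage cache: 4, 5, …
- Parity cache: P2, P3
- Read (addr=6, len=1): block 6 is returned as the XOR of cached blocks 4 and 5 with cached parity P2, without accessing Disk2.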
Sample Scenario – Transformable Read on Disk (TRD)
[Figure: same RIMAC storage system and four-disk RAID-5 layout as in the TRC scenario, with Disk2 in standby.]
- Storage cache: 4, 8, …
- Parity cache: P2, P3
- Read (addr=6, len=1): block 5 is not cached, so block 6 is returned as the XOR of cached block 4, cached parity P2, and block 5 read from an active disk.
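The two read scenarios can be summarized as a decision procedure. The sketch below is an assumption about how such a decision could be ordered, not the authors' code: `ReadPath`, `classify_read`, and the dictionary-based disk map are hypothetical names, and the example reuses the stripe (5, 6, P2, 4) with Disk2 in standby.

```python
from enum import Enum


class ReadPath(Enum):
    CACHE_HIT = "read hit"
    DIRECT = "normal disk read"
    TRC = "transformable read in cache"
    TRD = "transformable read on disk"
    SPIN_UP = "passive spin-up"


def classify_read(block, stripe, disk_of, cached, standby):
    """Pick a service path for a single-block read (illustrative only).

    stripe  : all members of the block's stripe (data blocks and parity)
    disk_of : mapping from block name to disk id
    cached  : names held in the storage cache or the parity cache
    standby : ids of disks that are currently spun down
    """
    if block in cached:
        return ReadPath.CACHE_HIT
    if disk_of[block] not in standby:
        return ReadPath.DIRECT
    peers = [b for b in stripe if b != block]
    missing = [b for b in peers if b not in cached]
    if not missing:
        return ReadPath.TRC      # XOR of cached stripe members only
    if all(disk_of[b] not in standby for b in missing):
        return ReadPath.TRD      # fetch the missing members from active disks
    return ReadPath.SPIN_UP      # no transformation avoids the standby disk


stripe2 = ["5", "6", "P2", "4"]
disk_of = {"5": 1, "6": 2, "P2": 3, "4": 4}
standby = {2}

trc = classify_read("6", stripe2, disk_of, {"4", "5", "P2", "P3"}, standby)
trd = classify_read("6", stripe2, disk_of, {"4", "8", "P2", "P3"}, standby)
```

With blocks 4, 5, and P2 cached the read is classified as TRC; with block 5 absent from the cache but stored on an active disk it becomes TRD.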
Power-aware Request Transformation
[Figure: request paths across the storage cache, the parity cache, and the disks.]
- Read: TRC, TRD
- Write (PUPA): PU-DA-C, PU-CA-C, PU-DA-D, PU-CA-D
PUPA - Parity Update with Power-Aware
- Direct Access: P2' = 5' XOR 5 XOR P2
- Complementary Access: P2' = 5' XOR 6 XOR 4
[Figure: four-disk RAID-5 layout (blocks 1-12, parity P1-P4) with Write (addr=5, len=1); block 5 shares a stripe with blocks 6 and 4 and parity P2.]
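The two parity-update formulas yield the same new parity, which a short sketch can check. The byte values are arbitrary, and `xor` and `standby_reads` are hypothetical helpers; counting uncached inputs on standby disks is only one plausible way to compare the two variants, not necessarily the criterion RIMAC uses.

```python
def xor(*blocks: bytes) -> bytes:
    """Bytewise XOR of equally sized blocks (hypothetical helper)."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def standby_reads(inputs, cached, disk_of, standby):
    """Count inputs that are uncached and stored on a standby disk."""
    return sum(1 for b in inputs if b not in cached and disk_of[b] in standby)


old4, old5, old6 = b"\x11\x22", b"\x33\x44", b"\x55\x66"
old_p2 = xor(old4, old5, old6)
new5 = b"\xde\xad"                      # the block written at addr=5

direct = xor(new5, old5, old_p2)        # needs old block 5 and old parity P2
complementary = xor(new5, old6, old4)   # needs the other data blocks 6 and 4

# Assumed example state: nothing cached, the disk holding P2 is in standby.
disk_of = {"5": 1, "6": 2, "P2": 3, "4": 4}
cost_direct = standby_reads(["5", "P2"], set(), disk_of, {3})
cost_complementary = standby_reads(["6", "4"], set(), disk_of, {3})
```

In this assumed state the direct variant would need one read from a standby disk and the complementary variant none, so a power-aware policy would prefer the complementary access.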
Cache Placement/Replacement Algorithms
- Storage cache
  - LRU with N-1 constraints
  - Compatible with MQ, LIRS, ARC algorithms
- Parity cache
  - Parity stripe only
  - Second-chance replacement algorithm
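For readers unfamiliar with second-chance replacement, here is a generic textbook-style sketch. It is not the RIMAC parity cache itself: the class name `SecondChanceCache` and the choice to insert new entries with a cleared reference bit are assumptions made for illustration.

```python
from collections import OrderedDict


class SecondChanceCache:
    """FIFO queue with one reference bit per entry (generic sketch)."""

    def __init__(self, capacity: int):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._entries = OrderedDict()   # key -> [value, referenced], oldest first

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        entry[1] = True                 # mark as recently used
        return entry[0]

    def put(self, key, value):
        if key in self._entries:
            self._entries[key][0] = value
            self._entries[key][1] = True
            return
        while len(self._entries) >= self.capacity:
            old_key, entry = self._entries.popitem(last=False)
            if entry[1]:
                entry[1] = False        # second chance: clear bit, requeue
                self._entries[old_key] = entry
            else:
                break                   # unreferenced oldest entry is evicted
        self._entries[key] = [value, False]

    def __contains__(self, key):
        return key in self._entries


cache = SecondChanceCache(capacity=2)
cache.put("P1", b"parity-1")
cache.put("P2", b"parity-2")
cache.get("P1")                         # P1 is referenced, P2 is not
cache.put("P3", b"parity-3")            # P1 gets a second chance, P2 is evicted
```

After the last call the cache holds P1 and P3: the referenced entry survived one eviction pass while the unreferenced one was dropped.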
Evaluation
- Trace-driven simulation
  - Disksim 2.0
  - 3-state disk power models (IBM 36Z15)
  - RIMAC front-end, bottom-half and upper-half implementation with 5000 lines of C code
- Workloads
  - Cello99 from HP: file server
  - TPC-D from HP: decision support
  - SPC-SE from SPC: search engine
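A 3-state disk power model charges energy per state (active, idle, standby) plus a cost for each transition. The sketch below shows that accounting under stated assumptions: `DiskPowerModel` and `energy_joules` are hypothetical names, and every wattage and joule value is a round placeholder, not the IBM 36Z15 specification used in the evaluation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiskPowerModel:
    """Per-state power in watts and per-transition energy in joules."""
    active_w: float
    idle_w: float
    standby_w: float
    spin_up_j: float
    spin_down_j: float


def energy_joules(model, active_s, idle_s, standby_s, spin_ups=0, spin_downs=0):
    """Total energy: time spent in each state plus transition costs."""
    return (model.active_w * active_s
            + model.idle_w * idle_s
            + model.standby_w * standby_s
            + model.spin_up_j * spin_ups
            + model.spin_down_j * spin_downs)


# Placeholder parameters chosen for easy arithmetic (NOT real disk data).
model = DiskPowerModel(active_w=10.0, idle_w=8.0, standby_w=2.0,
                       spin_up_j=100.0, spin_down_j=10.0)

# The same 1000-second window, with and without spinning the disk down.
always_on = energy_joules(model, active_s=100, idle_s=900, standby_s=0)
power_managed = energy_joules(model, active_s=100, idle_s=200, standby_s=700,
                              spin_ups=2, spin_downs=2)
```

With these placeholder numbers the power-managed schedule uses 4220 J against 8200 J for the always-on disk; the saving shrinks as the number of spin-ups grows, which is why avoiding passive spin-ups matters.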
System Performance
[Figure: normalized average response time vs. cache size (MB), LRU vs. RIMAC. Panels: (a) Cello99 at INF, 64, and 320 MB; (b) TPC-D at INF, 128, and 640 MB; (c) SPC-SE at INF, 256, and 1280 MB.]
- Figure labels: 30%; Cello99 20-30%, TPC-D 2-6%, SPC-SE 5-14%
- Larger cache does improve performance
Energy Consumption
[Figure: energy consumption vs. cache size (MB), LRU vs. RIMAC. Panels: (a) Cello99 at INF, 64, and 320 MB; (b) TPC-D at INF, 128, and 640 MB; (c) SPC-SE at INF, 256, and 1280 MB.]
- Figure labels: Cello99 14-15%, TPC-D 33-34%, SPC-SE 15-16%
- Larger cache may not save more energy
Effects of Read Policies
[Figure: ratio (%) of read hit, TRC, and TRD requests under LRU and RIMAC. Panels: (a) Cello99, 64 MB; (b) TPC-D, 128 MB; (c) SPC-SE, 256 MB.]
- Value labels on the figure: 33.8%, 12.8%, 49.5%, 10.1%, 4.1%, 6.9%
Effects of Power Aware Parity Update Policies
[Figure: ratio (%) of write hit, PU-DA-C, PU-CA-C, and PU-DA-D+PU-CA-D requests under BASE and RIMAC. Panels: (a) Cello99, 64 MB; (b) TPC-D, 128 MB; (c) parity hit ratio for Cello99 and TPC-D, BASE vs. RIMAC.]
- Value labels on the figure: 83.7%, 13.8%
Anatomy of Energy Consumption
[Figure: energy consumption (J, scale ×10^4) of Disk1 and Disk4 under BASE and RIMAC, broken down into active, idle, seek, and standby. Workload: Cello99, 64 MB.]
Conclusions
- RIMAC: redundancy-based hierarchical I/O cache architecture with minimum overhead
- Addresses an open problem, "passive spin-up" in energy-efficient server storage systems, by power-aware request transformation both in caches and on disks
- Reduces energy cost by up to 33% and improves performance by up to 30%