indirection systems for shingled-recording disk...

44
Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. Indirection Systems for Shingled-Recording Disk Drives Yuval Cassuto with co-authors: M. Sanvido, C. Guyot, D. Hall and Z. Bandic MSST 2010, May 7

Upload: phamthuan

Post on 23-Mar-2018

226 views

Category:

Documents


0 download

TRANSCRIPT

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved.

Indirection Systems for Shingled-Recording

Disk Drives

Yuval Cassuto

with co-authors:

M. Sanvido, C. Guyot, D. Hall and Z. Bandic

MSST 2010, May 7

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 2

Shingled Recording

Track 1

Track 2

Track 3

Track 4

Track 5

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 3

Shingled Recording – No Random Write

Track 1

Track 2

Track 3

Track 4

Track 5

Write Erase

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 4

Shingled Recording Magnetics

head motion

cornerhead

progressive

scans

downtrack

cross-track

123

4

head field contours

track layout for shingled-recording

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 5

Shingled Recording Tradeoff

• More: Capacity

• Less: Functionality → Performance

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 6

The Optimization Envelope

Observed

Performance

Excess Capacity

Non-shingled

Heads /

media

shingled drive

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 7

From Magnetics to Systems

write

Ib

invalid

PBA 1

PBA M

Track Interference Block Interference

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 8

Drive Based Shingled Recording Indirection

Host

LBA R/W

(unrestricted)

Indirection

PBA R/W

(w/ shingled-

recording

constraints)

HDD

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 9

Drive vs. Host Indirection

• Transparent to Host

• Complete knowledge of physical layout

• “Shingle aware” access and allocation

• System specific performance optimization

Why on the drive?

Why on the host?

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 11

Core Concepts

• Append/Read-Modify-Write with Shingled Regions

Region 1

Region 2

Region 3

Region K

independent

: guard

pure sequential

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 12

General Method for LBA↔PBA Mapping

Temporary

Storage

Permanent

Storage

fixed

variable

(recent writes)

LBA

PBA

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 14

Evaluation Platform

R/W Traces

Synthetic R/W

loads

(random/seq.)Simulated

Performance

IOPS, MB/s

Internal

Statistics

LBA PBA

Algorithms

Data

Structures

Indirection Module

Drive

Model

Write

Overload

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 15

Set-Associative Disk Cache Architecture

• Disk is partitioned to LBA shingled

regions and cache shingled regions

• Each LBA region (shingle) is

associated with one cache region

(shinglet)

• Multiple LBA regions are associated

with each cache region (set-

associative cache)

Shinglet 1

Shinglet 0

Shingle 7

Shingle 6

Shingle 5

Shingle 4

Shingle 3

Shingle 2

Shingle 1

Shingle 0

Associa

ted c

ache

Associa

ted c

ache

HDD PBA SpacePBA 0:

PBA M:

Perm

an

en

tC

ach

e

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 16

Set-Associative Disk Cache: Write Path

Cache Region

LBA

regionsAssociated to

cache

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 17

Write to Disk Cache

Cache Region

LBA

regions FULL!

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 18

Garbage Collection

• Read cache

• For each LBA

region present in

cache: RMW* full

region

• Full cache available

again

Cache Region

LBA

regions Present in cache

* RMW = Read-Modify-Write

(Read+Write) x #Present

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 19

Results from PC Traces

0

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

1 2 3 4 5 6 7 8 9 10

cache size [%]

Wri

te O

ve

rlo

ad

0

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

Slo

wd

ow

n

Write Overload

Slowdown

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 20

4K Random IOPS

0

100

200

300

400

500

600

700

800

900

1000

0 1000 2000 3000 4000 5000 6000 7000 8000 9000

Time [sec]

IOP

S

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 21

Effect of Region Size

0

20

0 1000 2000 3000 4000

Time [sec]

IOP

S

0

20

0 1000 2000 3000 4000

Time [sec]

IOP

S

Size 50000 Size 10000

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 22

Set-Associative Disk Cache: Score Sheet

• Significant improvement over plain Read-Modify-Write

• Simple to implement

• Large dips in performance due to long garbage collections

• Region updated in place → consistency issues

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 23

The S-Blocks Architecture

• Temporary (red) and

permanent (blue)

storage managed as

ring buffers

• S-Block: intermediate

unit between sector and

region

S-Blocks

E-cache

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 24

Ring Buffer

V

X

Used, valid

UnusedUsed, invalid

V

X

V

V

V

V

V

V VV

V

V

V

V

X

X X

X

X

HeadTail

Head-Tail guard band

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 25

Initial Conditions

PI

S-Blocks

E-cache

1

2

3

LI

Over-provisioning

Unit = sector

Unit = S-block

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 26

Write Operation

• If full S-Block: write

directly in S-Block

buffer

• If smaller than full

S-Block: write in E-

cache

PI

S-Blocks

E-cache

LI

HeadTail

Head

Tail

S-Block

Buffer

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 27

E-cache Full

• If no room for

incoming write:

reclaim invalid

exceptions in E-cache

(defrag)PI

S-Blocks

E-cache

LI

Head

HeadTail

S-Block

Buffer

xx

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 28

Defrag(E-cache)

• If not enough invalid

exceptions:

– Choose S-Block

– Destage(S-Block)PI

S-Blocks

E-cache

LI

HeadTail

HeadTail

S-Block

Buffer

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 29

Destage(S-Block)

19

17

1

18

16

15

14

12 98

7

5

3

2

13

11 10

6

4

Head Tail

20

S-Block

Buffer

2217

4

12

58

86

4294

9573577

629

Tail

Head

S-Block 1 → sectors 1-10

E-cache

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 30

Destage(S-Block)

19

17

1

18

16

15

14

12 98

7

5

3

2

13

11 10

6

4

Head

Tail20

2217

12

58

86

4294

953577

29

Tail

Head

S-block writes in E-cachce are not

needed after destage

E-cache

S-Block

Buffer

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 31

4K Random IOPS

0

50

100

150

200

250

300

350

400

450

0 200 400 600 800 1000 1200

Time [sec]

IOP

S

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 32

Choosing S-Block to Destage

Choose S-Block with highest

exception count

Best amortization of S-Block

rewrite

Good for biased workloads (e.g.

hotspots)

Optimal Destage Optimal Defrag

Choose S-Block closest to the

tail

Least amount of copying in S-

Block defrag

Good for random workloads

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 33

Comparison for Random Workload

• S-Block destage average # invalidated exceptions 55.26

• S-Block defrag average copy count 1744.64

• average IOPS = 73.44

• S-Block destage average # invalidated exceptions 12.52

• S-Block defrag average copy count 0.00

• average IOPS = 121.54

Optimal Destage

Optimal Defrag

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 34

Effect of S-Block Choice on Performance

0

200

400

600

800

1000

1200

0.00

0

0.00

2

0.00

4

0.00

6

0.00

8

0.01

0

0.03

0

0.05

0

0.07

0

0.09

0

0.20

0

0.40

0

0.60

0

0.80

0

1.00

0

3.00

0

5.00

0

7.00

0

9.00

0

Time [sec]

# C

om

ma

nd

s

milli

hundredths

tenths

seconds

0

200

400

600

800

1000

1200

0.00

0

0.00

2

0.00

4

0.00

6

0.00

8

0.01

0

0.03

0

0.05

0

0.07

0

0.09

0

0.20

0

0.40

0

0.60

0

0.80

0

1.00

0

3.00

0

5.00

0

7.00

0

9.00

0

Time [sec]

# C

om

ma

nd

s

milli

hundredths

tenths

secondsOptimal Destage

Optimal Defrag

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 35

Lazy Garbage Collection

0

50

100

150

200

250

300

350

400

450

0 200 400 600 800 1000 1200

Time [sec]

IOP

S

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 36

Constant Garbage Collection

0

50

100

150

200

250

300

350

400

0 100 200 300 400 500 600

Time [sec]

IOP

S

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 37

S-Blocks: Score Sheet

• Good sustained random write performance

• Append update → no consistency issues

• Flexible to handle different workload types

• Good sequential performance with direct S-block writes

• Non-trivial implementation

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 38

Summary

• Shingled magnetic recording opens a rich area of systems

research

• Good understanding of main issues and tradeoffs

• Proposed architectures likely basis for real implementations

• Significant research needed in performance optimization

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 39

Thank YOU !

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 40

Backup Slides

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 41

Add to Ring Buffer

V

X

Used, valid

UnusedUsed, invalid

V

X

V

V

V

V

V

V VV

V

V

V

V

X

X X

X

X

Head Tail

V

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 42

Invalidate in Ring Buffer

V

X

Used, valid

UnusedUsed, invalid

V

X

V

V

V

V

V

V VV

V

V

X

V

X

X X

X

X

Head Tail

V

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 43

Defrag Ring Buffer

#

X

Used, valid

UnusedUsed, invalid

19

X

1

18

16

15

14

12 98

7

5

X

2

X

X X

X

X

Head Tail

20

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 44

Defrag Ring Buffer

#

X

Used, valid

UnusedUsed, invalid

19

X

1

18

16

15

14

12 98

7

5

X

2

X

X X

X

X

Head

Tail20

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 45

Defrag Ring Buffer

#

X

Used, valid

UnusedUsed, invalid

19

X

1

18

16

15

14

12 98

7

5

X

2

X

X X

X

X

Head

Tail

20

Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved. 46

Defrag Ring Buffer

#

X

Used, valid

UnusedUsed, invalid

19

X

1

18

16

15

14

12 98

7

5

2

X

X X

X

Head

Tail

20

Free