Adapting Swift for Tape Storage or other high-latency media

October 27, 2015
Harald Seipp (IBM Systems – Presenter), Slavisa Sarafijanovic (IBM Research)


Page 1: Adapting Swift for Tape Storage or other high-latency media

Adapting Swift for Tape Storage or other high-latency media

October 27, 2015

Harald Seipp (IBM Systems – Presenter), Slavisa Sarafijanovic (IBM Research)

Page 2: Adapting Swift for Tape Storage or other high-latency media

Goal

Augment cloud object storage with a low-cost, cold storage tier for archive/backup use cases

Reduced cost
● significantly lower than disk

Reduced availability
● on the order of minutes

Diagram: a client application uses the standard REST API of an OpenStack Swift cluster that combines highly available primary storage (HDD) with low-cost archival storage (high-latency media); objects move between the two tiers through archive and restore.

Page 3: Adapting Swift for Tape Storage or other high-latency media

Main Idea

Single object storage namespace for objects on
● Tape or
● Optical Disc or
● SMR or MAID Disk

integrated with a standard disk-based OpenStack Swift installation


Page 4: Adapting Swift for Tape Storage or other high-latency media

Facts about Tape

Tape is 5x-10x cheaper than disk

Tape density scaling and cost are projected to be advantageous over disk for the next 10 years (see 220 TB cartridge demo)

Tape is a mature technology

Tape is already used in today’s cloud offerings

LTFS is a widely adopted standard

Diagram: a client application uses the standard REST API of an OpenStack Swift cluster with highly available primary storage on HDD and low-cost archival storage on LTFS tape; objects move between the two tiers through archive and restore.

Page 5: Adapting Swift for Tape Storage or other high-latency media

Shortcomings to be solved

Time-to-data
● Up to (single-digit) minutes
→ Does not play well with Swift infrastructure (application/load balancer) time-out assumptions

Resource availability
● Few drives per hundreds of cartridges
→ Random access (mounts/seeks) can lead to resource congestion

Page 6: Adapting Swift for Tape Storage or other high-latency media

Addressing shortcomings

Swift API for archiving operations
● Support explicit bulk operations (to minimize tape mounts and seeks)
● Store/provide object state (“offline bit”) in a standardized way
● Provide additional error code (“in transit”) upon access of migrated object

Improved timeout management

Configurable Data Ring Auditing
● Support asynchronous tape data verification

Policy based global cluster object distribution
● Assumption: related data (e.g. container) is likely to be accessed together
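The benefit of explicit bulk operations can be illustrated with a small sketch. It assumes a hypothetical catalog that knows, for each object, which cartridge holds it and at which position; the tuple layout, the cartridge IDs, and the function name `plan_bulk_recall` are illustrative assumptions, not part of Swift or of the proposal. Grouping by cartridge gives one mount per cartridge, and sorting by position keeps seeks sequential.

```python
from collections import defaultdict


def plan_bulk_recall(objects):
    """Group requested objects by cartridge and order them by tape position.

    `objects` is an iterable of (name, cartridge_id, position) tuples. This
    layout is a hypothetical stand-in for whatever a backend catalog exposes.
    Returns a list of (cartridge_id, [names...]), one entry per tape mount.
    """
    by_cartridge = defaultdict(list)
    for name, cartridge_id, position in objects:
        by_cartridge[cartridge_id].append((position, name))

    plan = []
    for cartridge_id in sorted(by_cartridge):
        # Sorting by position lets the drive read the tape in one pass.
        ordered = [name for _, name in sorted(by_cartridge[cartridge_id])]
        plan.append((cartridge_id, ordered))
    return plan


# Four requested objects spread over two (made-up) cartridges.
requests = [
    ("cont/a", "T2", 40),
    ("cont/b", "T1", 10),
    ("cont/c", "T2", 5),
    ("cont/d", "T1", 90),
]
plan = plan_bulk_recall(requests)
for cartridge_id, names in plan:
    print(cartridge_id, names)
```

Served one by one in arrival order, these four requests could need up to four mounts; the grouped plan needs two.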

Page 7: Adapting Swift for Tape Storage or other high-latency media

Discussed at Vancouver Summit

Addressing shortcomings

Reference: https://etherpad.openstack.org/p/liberty-swift-tape-storage


Page 8: Adapting Swift for Tape Storage or other high-latency media

Swift archiving API through SwiftILM

Swift API ILM* extensions:
• Migrate (Disk → high-latency media)
• Recall (High-latency media → Disk)
• Query status

Implementation proposal:
• SwiftILM middleware
• Control path to ILM capable backend:
  • (1) Swift EA ←→ file attribute (async)
  • (2) Backend executable (sync/async)

*Information Lifecycle Management

Diagram: Swift exposes the Swift API plus the Swift API ILM extensions, which are provided by the SwiftILM middleware. Below it sits an ILM capable backend that presents a POSIX file system, with a disk cache in front of Tape, Optical Disc, or MAID/SMR. The middleware controls the backend either (1) through file attributes or (2) by calling an executable.
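As a rough illustration of the middleware idea, the following sketch shows a plain WSGI filter that intercepts the ILM query parameters and forwards everything else unchanged. It is not the SwiftILM implementation: the class name `SwiftILMSketch`, the `backend` callable (standing in for control path (2), the backend executable), and the `202 Accepted` response are assumptions made for the example.

```python
from urllib.parse import parse_qs

ILM_OPERATIONS = {"MIGRATE", "RECALL", "STATUS"}


class SwiftILMSketch:
    """Hypothetical WSGI filter that intercepts ILM query parameters."""

    def __init__(self, app, backend):
        self.app = app          # next WSGI application in the pipeline
        self.backend = backend  # assumed interface: callable(operation, path) -> str

    def __call__(self, environ, start_response):
        # keep_blank_values=True keeps bare keys such as "?MIGRATE".
        query = parse_qs(environ.get("QUERY_STRING", ""), keep_blank_values=True)
        requested = ILM_OPERATIONS.intersection(key.upper() for key in query)
        if not requested:
            # Not an ILM request: pass through to regular Swift handling.
            return self.app(environ, start_response)

        operation = sorted(requested)[0]
        result = self.backend(operation, environ.get("PATH_INFO", ""))
        body = result.encode("utf-8")
        # The status code is an assumption; the proposal does not fix one.
        start_response("202 Accepted", [
            ("Content-Type", "text/plain"),
            ("Content-Length", str(len(body))),
        ])
        return [body]


def passthrough_app(environ, start_response):
    """Stand-in for the rest of the Swift proxy pipeline."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"regular Swift request"]


def fake_backend(operation, path):
    """Stand-in for calling an ILM capable backend."""
    return f"{operation} queued for {path}"


def call(app, path, query):
    """Invoke a WSGI app with a minimal environ and capture status and body."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app({"PATH_INFO": path, "QUERY_STRING": query}, start_response))
    return captured["status"], body


ilm = SwiftILMSketch(passthrough_app, fake_backend)
migrate_result = call(ilm, "/v1/ACCT/CONT/OBJ", "MIGRATE")
plain_result = call(ilm, "/v1/ACCT/CONT/OBJ", "")
print(migrate_result)
print(plain_result)
```

Keeping the backend behind a simple callable mirrors the proposal's separation between the Swift-facing API and the backend-specific control path.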

Page 9: Adapting Swift for Tape Storage or other high-latency media

SwiftILM API proposal

To migrate a single object, issue the following HTTP POST:
http://SWIFT-URL/ACCT/CONT/OBJ?MIGRATE
● Similar GET/HEAD requests for RECALL and STATUS

Bulk operations on container level:
http://SWIFT-URL/ACCT/CONT?MIGRATE

...or through regular expressions on Swift namespace
● Get back a request ID for efficient status tracking
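A client-side view of the proposed URLs might look like the sketch below. It only builds the requests and does not send them, since the API is a proposal and `SWIFT-URL`, `ACCT`, `CONT`, and `OBJ` are the slide's placeholders. The helper name `ilm_request` is hypothetical, and using POST for the container-level bulk call is an assumption (the slide shows only the URL).

```python
from urllib.parse import quote
from urllib.request import Request


def ilm_request(method, operation, account, container, obj=None,
                base="http://SWIFT-URL"):
    """Build (but do not send) a request for a proposed SwiftILM operation.

    `operation` is appended as a bare query key, e.g. "?MIGRATE", following
    the URL shape shown on the slide.
    """
    parts = [account, container]
    if obj is not None:
        parts.append(obj)
    # quote() keeps "/" unescaped, so object names with slashes stay readable.
    path = "/".join(quote(part) for part in parts)
    return Request(f"{base}/{path}?{operation}", method=method)


single = ilm_request("POST", "MIGRATE", "ACCT", "CONT", "OBJ")
bulk = ilm_request("POST", "MIGRATE", "ACCT", "CONT")  # POST is assumed here

for req in (single, bulk):
    print(req.get_method(), req.full_url)
```

The HTTP method is left to the caller because the slide only says that RECALL and STATUS use "similar GET/HEAD requests" without fixing which method goes with which operation.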

Page 10: Adapting Swift for Tape Storage or other high-latency media

SwiftILM API proposal – advanced

(Optional) Setting ILM operations through SwiftILM API
● Migration/recall based on object age/size/type etc.

(Optional) Backend-specific additions
● e.g. to control placement to specific library/medium/pool

(Optional) Co-existence with Swift3
● enabling ILM for S3 protocol as well
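Migration based on object age, size, and type could be expressed as a simple rule, sketched below. The `ObjectInfo` and `MigrationPolicy` classes, their field names, and the sample thresholds are all hypothetical; the proposal does not define a policy format.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ObjectInfo:
    """Minimal object metadata; field names are illustrative."""
    name: str
    size_bytes: int
    content_type: str
    last_modified: datetime


@dataclass
class MigrationPolicy:
    """Hypothetical rule: migrate objects that are old, large, and of a given type."""
    min_age: timedelta
    min_size_bytes: int
    content_type_prefixes: tuple

    def selects(self, obj, now):
        old_enough = now - obj.last_modified >= self.min_age
        big_enough = obj.size_bytes >= self.min_size_bytes
        # str.startswith accepts a tuple of prefixes.
        type_matches = obj.content_type.startswith(self.content_type_prefixes)
        return old_enough and big_enough and type_matches


now = datetime(2015, 10, 27)
objects = [
    ObjectInfo("backup.tar", 5 * 1024**3, "application/x-tar", datetime(2015, 1, 1)),
    ObjectInfo("thumb.png", 20 * 1024, "image/png", datetime(2015, 10, 20)),
]
policy = MigrationPolicy(
    min_age=timedelta(days=90),
    min_size_bytes=1024**2,
    content_type_prefixes=("application/",),
)
to_migrate = [obj.name for obj in objects if policy.selects(obj, now)]
print(to_migrate)
```

Only the old, large archive file is selected; the small, recent image stays on disk.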

Page 11: Adapting Swift for Tape Storage or other high-latency media

Add ILM to your existing Swift cluster

● Take unmodified Swift
● Configure ILM-based Data Ring
● Add SwiftILM middleware
● Add ILM-capable backend

Diagram: a client application talks to OpenStack Swift through the standard Swift API with SwiftILM extensions (REST). Behind the SwiftILM middleware, Swift serves a standard disk data ring (replication or erasure code, scale-out) and an ILM-based data ring (replication across nodes, scale-out). Each storage node in the ILM-based ring runs an ILM capable backend with a disk cache in front of Tape, Optical Disc, or MAID/SMR.

Page 12: Adapting Swift for Tape Storage or other high-latency media

Join us at the Design Summit or IBM booth for further discussions!

[email protected]: hseipp

Twitter: @HaraldSeipp

http://www.research.ibm.com/labs/zurich/sto/tier_icetier.html