Building an Online Archive Using OpenStack Swift, IBM GPFS and Tape Storage
DESCRIPTION
Despite continued predictions about the death of tape, and the desire of almost everyone from administrators to application developers to see it gone, ultra-low-cost storage continues to be possible only through the use of tape in the storage hierarchy. This talk describes our case study of how we combined disk and tape in the Swift storage hierarchy without modifying the Swift codebase, while keeping all of the data “online”. We will describe how this can be done with any system that supports policy-based migration of data through interfaces such as DMAPI, along with the additional configuration required to keep Swift processes from causing unnecessary migration of data back and forth between disk and tape. Finally, through use of the Swift DiskFile module and Storage Policies, we will show how to provide per-container policies for data migration based upon object characteristics.
TRANSCRIPT
© 2014 IBM Corporation
Building an Online Archive using Swift combined with Tape Storage
Joseph Dain, Dean Hildebrand, Greg Kishi, Bill Owen
Elastic Storage Object: Enterprise Storage and OpenStack Swift
[Diagram: Swift on top of Elastic Storage, with SSD, fast disk, slow disk, and tape tiers]
Why Tape?
1. Archive data
• Write once, read rarely
2. Hot and cold data
• Recently created data is hot, but access quickly fades
• E.g., Project, Weather
What is Elastic Storage Object?
– Software Defined Storage
– Leverage unmodified OpenStack Swift
– File and Object API support
• Support SwiftonFile StackForge project
– Extreme density
• 160:1 disks to servers
– Data protection
• Mature erasure code implementation
• Disk hospital
• No replication network
• Snapshots, Backup, Disaster Recovery
– Encryption
– No practical object size limits
– Seamless capacity growth
Elastic Storage Manages the Full Data Lifecycle Cost Effectively
Data ingestion or creation, data processing, access, archival
High Performance Disk Tier
SSD, SAS
Parallel Access
Provide highest performance for most demanding applications
High volume storage
Single Global Name Space across all tiers
Lower costs by allocating the right tier of storage to the right need
Archival storage with low cost disk or tape
Integration with Tivoli Storage Manager/LTFS
Policy based Archival and remote Disaster Recovery
OpenStack Swift Policy-based Data Lifecycle Management
– Create policies to migrate objects to archive tier
– Example Policies
1. Migrate entire containers
2. Migrate “cold” objects across all containers
– External storage integration via DMAPI
• Example low-cost storage pool: TSM, LTFS, ProtecTIER
– Avoid recalls for commonly accessed data
• All object attributes and xattrs stored on disk for fast access for searching, listing, etc.
• Container and Account information stored on disk to allow searching, listing, etc.
– Increase timeouts
[Diagram: storage pools: System pool (SSD), Gold pool (SAS), Silver pool (SATA), TSM / LTFS / ProtecTIER]
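The two example policies on this slide can be sketched as a selection step over object records. This is a minimal illustration in Python, not the policy language of Elastic Storage or the presenters' configuration; the `ObjectRecord` type, its field names, and the 30-day idle threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set

SECONDS_PER_DAY = 86_400


@dataclass(frozen=True)
class ObjectRecord:
    """Hypothetical record describing one stored object."""
    container: str
    name: str
    last_access: float  # seconds since the epoch
    on_tape: bool = False


def migrate_containers(records: Iterable[ObjectRecord],
                       containers: Set[str]) -> List[ObjectRecord]:
    """Policy 1: select every disk-resident object in the named containers."""
    return [r for r in records
            if r.container in containers and not r.on_tape]


def migrate_cold(records: Iterable[ObjectRecord], now: float,
                 max_idle_days: float = 30.0) -> List[ObjectRecord]:
    """Policy 2: select disk-resident objects, in any container, that have
    not been accessed for more than max_idle_days (assumed threshold)."""
    cutoff = now - max_idle_days * SECONDS_PER_DAY
    return [r for r in records
            if r.last_access < cutoff and not r.on_tape]


if __name__ == "__main__":
    now = 100 * SECONDS_PER_DAY
    records = [
        ObjectRecord("weather", "2013-run.dat", last_access=10 * SECONDS_PER_DAY),
        ObjectRecord("weather", "today.dat", last_access=99 * SECONDS_PER_DAY),
        ObjectRecord("project", "old-plan.doc", last_access=5 * SECONDS_PER_DAY),
    ]
    for r in migrate_cold(records, now):
        print("cold:", r.container, r.name)
    for r in migrate_containers(records, {"project"}):
        print("container policy:", r.container, r.name)
```

In the setup the slide describes, the selection would be written as file system policies and the data movement would go through DMAPI; the sketch covers only the selection logic.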
Active Archive with LTFS
• Lower TCO by 6x
• Up to 29.3PB per 4 frame TS4500 library @2.5:1 compression
[Chart: $/TB/Month for HDD vs. Tape; tape has 6x lower TCO than HDD]
[Diagram: HTTP Swift Requests, Swift, RedHat 6.5, Elastic Storage, thin layer of high capacity disk or Flash, Policy + DMAPI, LTFS EE, 8Gb FC, TS4500 HD Tape Library]
Active Archive with ProtecTIER Deduplication
[Diagram: HTTP Swift Requests, Swift, Elastic Storage (Flash or Disk), Policy + DMAPI, seamless policy based tiering, 10Gb NFS, TS7650G ProtecTIER Deduplication Gateway V3.3, 8Gb FC, Deduplication Pool with 8TB physical capacity and 200TB (25x) capacity represented]
– Keep a single deduplicated, compressed copy of data
– Efficient storage with low TCO nearly equal to physical tape
• Reduce storage footprint by up to 25:1
– Disk access speeds to deduplicated object data
• Up to 700 MB/s tiering to / from dedupe pool
– Up to 25PB data in a single deduplication pool
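The capacity figures on this slide are related by simple arithmetic, sketched below in Python. The function names are illustrative, and the tiering-time estimate assumes decimal units (1 TB = 1,000,000 MB) and a sustained 700 MB/s, which is the slide's stated upper bound rather than a measured rate.

```python
def represented_capacity_tb(physical_tb: float, dedup_ratio: float) -> float:
    """Logical capacity represented by a deduplicated pool."""
    return physical_tb * dedup_ratio


def tiering_time_hours(data_tb: float, rate_mb_per_s: float) -> float:
    """Hours needed to move data_tb at a sustained rate (decimal units)."""
    seconds = data_tb * 1_000_000 / rate_mb_per_s
    return seconds / 3600


if __name__ == "__main__":
    # 8 TB physical at 25:1 matches the 200 TB shown in the diagram.
    print(represented_capacity_tb(8, 25), "TB represented")
    # Best-case time to tier 1 TB to or from the dedupe pool.
    print(round(tiering_time_hours(1, 700) * 60, 1), "minutes per TB")
```

At the stated peak rate, moving 1 TB takes a little under 24 minutes, so the rate matters mainly for large recalls.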
Benefits of Tape and OpenStack Swift
– Drive down storage costs
• Tape reduces TCO of 3x replication and erasure coding
• Reserve expensive storage for valued data
– Performance
• Match data performance requirements to storage tier
– Compliant archives
• HIPAA, NSF, financial regulations, warranties, etc.
– Easy to manage
– Policy-based tiering
– Scalability
[Diagram: storage pools: System pool (SSD), Gold pool (SAS), Silver pool (SATA), TSM / LTFS / ProtecTIER]
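As a rough illustration of why the slide contrasts tape with 3x replication and erasure coding, the sketch below computes the raw capacity each disk-based scheme needs per unit of user data. The 8+3 erasure-code layout is an assumed example, not a figure from the talk, and the calculation covers raw capacity only, not full TCO.

```python
def replication_overhead(copies: int) -> float:
    """Raw bytes stored per user byte with full-copy replication."""
    return float(copies)


def erasure_overhead(data_fragments: int, parity_fragments: int) -> float:
    """Raw bytes stored per user byte with a k+m erasure code."""
    return (data_fragments + parity_fragments) / data_fragments


def raw_capacity_pb(user_pb: float, overhead: float) -> float:
    """Raw capacity needed to hold user_pb of data at a given overhead."""
    return user_pb * overhead


if __name__ == "__main__":
    user_pb = 1.0
    print("3x replication:", raw_capacity_pb(user_pb, replication_overhead(3)), "PB raw")
    # 8 data + 3 parity fragments is a hypothetical layout for illustration.
    print("8+3 erasure code:", raw_capacity_pb(user_pb, erasure_overhead(8, 3)), "PB raw")
```

Either way, every byte kept on the disk tier carries this multiplier, which is the cost that moving cold data to a tape pool is meant to reduce.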
IBM Elastic Storage and OpenStack Ecosystem
[Diagram: OpenStack Management; Cinder, Glance, Nova, Manila; NFS, SMB, POSIX, Swift, HDFS; Elastic Storage; SSD, fast disk, slow disk, tape]