L-Store Distributed storage system
Alan Tackett, Vanderbilt University
Joint project:
  Vanderbilt - ACCRE (namespace and glue)
  LoCI - UTK - Micah Beck, Terry Moore (storage protocol)
  Nevoa Networks - Hunter Hagewood (end user tools)
L-Store Goals
  Scalable in both quantity and rate
    Metadata (number of files and transactions/sec)
    Data (amount of data and throughput)
  Reliable
  Secure
  Accessible
What makes L-Store different?
  Based on a highly generic abstract block for storage (IBP)
IBP: Internet Backplane Protocol
  Middleware for managing and using remote storage
  Allows advanced space and TIME reservation
  Supports multiple threads/depot
  User configurable block size
  Designed to support large scale, distributed systems
  Provides global “malloc()” and “free()”
  End-to-end guarantees
  AES encryption with each allocation having a separate key
  Capabilities
    » Each allocation has separate Read/Write/Manage keys
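The allocation model on this slide (time-limited space reservation, a global "malloc()" and "free()", and separate Read/Write/Manage keys per allocation) can be illustrated with a small in-memory toy. This is only a sketch: `ToyDepot`, `Allocation`, and every method name are hypothetical and are not the real IBP client API or wire protocol, and the per-allocation AES encryption is not modeled.

```python
import secrets
import time
from dataclasses import dataclass, field


@dataclass
class Allocation:
    """One byte-array allocation with its own capability keys (toy model)."""
    size: int
    expires_at: float
    caps: dict  # {"read": key, "write": key, "manage": key}
    data: bytearray = field(default_factory=bytearray)


class ToyDepot:
    """Hypothetical in-memory stand-in for a depot; not the real IBP API."""

    def __init__(self):
        self._allocs = {}

    def allocate(self, size, duration_s, now=None):
        """'malloc': reserve `size` bytes for `duration_s` seconds."""
        now = time.time() if now is None else now
        caps = {kind: secrets.token_hex(16)
                for kind in ("read", "write", "manage")}
        alloc_id = secrets.token_hex(8)
        self._allocs[alloc_id] = Allocation(size, now + duration_s, caps)
        return alloc_id, dict(caps)

    def _check(self, alloc_id, kind, key, now):
        """Validate existence, lifetime, and the capability for `kind`."""
        alloc = self._allocs.get(alloc_id)
        if alloc is None:
            raise KeyError("no such allocation")
        now = time.time() if now is None else now
        if now >= alloc.expires_at:
            del self._allocs[alloc_id]  # the time reservation has lapsed
            raise KeyError("allocation expired")
        if not secrets.compare_digest(alloc.caps[kind], key):
            raise PermissionError(f"bad {kind} capability")
        return alloc

    def write(self, alloc_id, write_key, payload, now=None):
        alloc = self._check(alloc_id, "write", write_key, now)
        if len(alloc.data) + len(payload) > alloc.size:
            raise ValueError("allocation full")
        alloc.data.extend(payload)

    def read(self, alloc_id, read_key, now=None):
        alloc = self._check(alloc_id, "read", read_key, now)
        return bytes(alloc.data)

    def free(self, alloc_id, manage_key, now=None):
        """'free': release the allocation before it expires."""
        self._check(alloc_id, "manage", manage_key, now)
        del self._allocs[alloc_id]
```

Because each key grants exactly one kind of access, a client can hand out the read key for an allocation without also giving away the right to overwrite or release it.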
LoCI Tools
  Logistical Computing and Internetworking Lab
  http://loci.cs.utk.edu
  IBP is at the “waist of the hourglass for storage”
exNode: XML file containing metadata
  Analogous to a disk I-node and contains
    Allocations
    How to assemble file
    Fault tolerance encoding scheme
    Encryption keys
[Figure: exNode layouts mapping file offsets (0, 100, 200, 300) onto IBP depots A, B, C across the network. Panels: normal file; replicated at different sites; replicated and striped]
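To make the exNode idea concrete, the sketch below serializes a list of byte-range mappings as XML and then looks up which depots hold a given offset. The element and attribute names (`exnode`, `mapping`, `offset`, `length`, `depot`, `allocation`) are assumptions chosen for illustration and do not reproduce the actual exNode schema. The sample layout mimics the "replicated and striped" panel, with 100-byte stripes spread over depots A, B, and C.

```python
import xml.etree.ElementTree as ET


def build_exnode(filename, mappings):
    """Serialize (offset, length, depot, allocation) tuples as XML."""
    root = ET.Element("exnode", name=filename)
    for offset, length, depot, alloc in mappings:
        ET.SubElement(root, "mapping", offset=str(offset),
                      length=str(length), depot=depot, allocation=alloc)
    return ET.tostring(root, encoding="unicode")


def replicas_for(xml_text, offset):
    """Return the sorted list of depots holding the byte at `offset`."""
    root = ET.fromstring(xml_text)
    hits = []
    for m in root.iter("mapping"):
        start = int(m.get("offset"))
        if start <= offset < start + int(m.get("length")):
            hits.append(m.get("depot"))
    return sorted(hits)


# Hypothetical layout: every 100-byte stripe lives on two of three depots.
layout = [
    (0, 100, "A", "a0"), (0, 100, "B", "b0"),
    (100, 100, "B", "b1"), (100, 100, "C", "c1"),
    (200, 100, "C", "c2"), (200, 100, "A", "a2"),
]
xml_text = build_exnode("FileA", layout)
```

With this layout, losing any single depot still leaves one copy of every stripe, which is the property the replicated panels in the figure are meant to convey.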
L-Store Performance
[Figure: throughput plot; labels: 3 GB/s, 30 Mins]
  Multiple simultaneous writes to 24 depots.
  Each depot is a 3 TB disk server in a 1U case.
  30 clients on separate systems uploading files.
  Rate has scaled linearly as depots were added.
  Planned REDDnet deployment of 167 depots will be able to sustain 25 GBytes/s.
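The linear-scaling claim can be checked with a little arithmetic. The sketch below assumes that the 3 GB/s plot label is the aggregate rate for the 24 depots and that scaling stays strictly linear; both are readings of the slide, not measurements. Under those assumptions the per-depot rate is 125 MB/s, and the slide's 25 GBytes/s target for 167 depots corresponds to roughly 150 MB/s per depot.

```python
def per_depot_rate(total_gb_s, depots):
    """Average rate per depot, in GB/s."""
    return total_gb_s / depots


def projected_rate(rate_per_depot_gb_s, depots):
    """Aggregate rate under strictly linear scaling, in GB/s."""
    return rate_per_depot_gb_s * depots


measured = per_depot_rate(3.0, 24)            # 0.125 GB/s per depot
linear_167 = projected_rate(measured, 167)    # 20.875 GB/s
required = per_depot_rate(25.0, 167)          # about 0.150 GB/s per depot

print(f"per depot today:        {measured * 1000:.0f} MB/s")
print(f"167 depots, linear:     {linear_167:.1f} GB/s")
print(f"needed for 25 GBytes/s: {required * 1000:.0f} MB/s per depot")
```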
Data Sharing
[Figure: Research Laboratory Building, Analysis Laboratory, and Computing Center connected by a network]
1. Sample created in lab
2. Taken for analysis
3. Store data in L-Store
3. Update metadata
4. Update DB
   FileA -> 1
5. Make copy in data center with full fault tolerance
   FileA -> 1 becomes FileA -> 1,2
6. Which is used on the cluster
5. Researcher analyzes data in lab
6. Triggers local cached copy
   FileA -> 1,2 becomes FileA -> 1,2,3
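The `FileA -> ...` entries in this walkthrough behave like a catalog that maps a file name to the list of its copies. A minimal sketch of that bookkeeping follows; `ReplicaCatalog` and its methods are hypothetical names for illustration and say nothing about how L-Store actually stores these records.

```python
class ReplicaCatalog:
    """Toy name -> replica-id catalog (illustrative only)."""

    def __init__(self):
        self._replicas = {}

    def register(self, name, replica_id):
        """Record a new copy of `name`; duplicate ids are ignored."""
        ids = self._replicas.setdefault(name, [])
        if replica_id not in ids:
            ids.append(replica_id)
        return list(ids)

    def lookup(self, name):
        """Return the known replica ids for `name` (empty if unknown)."""
        return list(self._replicas.get(name, []))


catalog = ReplicaCatalog()
catalog.register("FileA", 1)  # data first stored in L-Store
catalog.register("FileA", 2)  # fault-tolerant copy in the data center
catalog.register("FileA", 3)  # cached copy triggered by analysis in the lab
print(catalog.lookup("FileA"))
```

A reader of the catalog can then choose whichever replica is closest, which is what lets the cluster use copy 2 while the lab uses copy 3.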
What is L-Store?
  Provides a file system interface to (globally) distributed storage devices (“depots”)
    Parallelism for high performance and reliability
    Data and metadata scale independently
    Infrastructure added as needed
  Uses IBP (from UTK) for data transfer and storage service
    Write: break file into blocks, upload blocks simultaneously to multiple depots (reverse for reads)
    Generic, high performance, wide area capable, storage virtualization service
  L-Store utilizes a DHT implementation to provide metadata scalability and reliability
    Multiple metadata servers increase performance and fault tolerance
    Real time addition/deletion of metadata server nodes allowed
  Nevoa Networks for user interface and “LUNs”
    Nevoa Explorer - WebDAV, CIFS
    StorCore - Resource Management (LUNs)
  L-Store supports Weaver erasure encoding of stored files (similar to RAID) for reliability and fault tolerance (support for up to 10 depot failures)
    Can recover files even if multiple depots fail
  Computation on storage element
  Support for 3rd party pluggable modules or services
  File system interface
  Auth/AuthZ
  Flexible role based AuthZ
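The write path described above (break the file into blocks, upload them simultaneously to multiple depots, reverse for reads) can be sketched as follows. Plain dictionaries stand in for remote depots, and the round-robin placement and the `blk-N` key names are assumptions made for the example rather than L-Store's actual placement policy.

```python
from concurrent.futures import ThreadPoolExecutor


def split_blocks(data, block_size):
    """Cut `data` into consecutive blocks of at most `block_size` bytes."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def striped_write(data, depots, block_size):
    """Upload blocks in parallel, round-robin across `depots`.

    Returns the block map: a list of (depot_index, key) in file order.
    """
    blocks = split_blocks(data, block_size)
    placement = [(i % len(depots), f"blk-{i}") for i in range(len(blocks))]

    def upload(item):
        (depot_idx, key), block = item
        depots[depot_idx][key] = block

    with ThreadPoolExecutor(max_workers=len(depots)) as pool:
        list(pool.map(upload, zip(placement, blocks)))  # wait for all uploads
    return placement


def striped_read(placement, depots):
    """Fetch blocks in parallel and reassemble them in file order."""
    def download(location):
        depot_idx, key = location
        return depots[depot_idx][key]

    with ThreadPoolExecutor(max_workers=len(depots)) as pool:
        return b"".join(pool.map(download, placement))


depots = [{}, {}, {}]              # three in-memory stand-in depots
payload = bytes(range(256)) * 4    # 1024 bytes of sample data
block_map = striped_write(payload, depots, block_size=100)
restored = striped_read(block_map, depots)
```

The block map returned by `striped_write` plays the role that the exNode plays in the real system: it is the only thing a reader needs in order to reassemble the file.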
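The slide says only that metadata is handled by "a DHT implementation" whose server nodes can be added or removed in real time. One common way to get that behavior is consistent hashing, sketched below; it is an assumption used for illustration, not a description of L-Store's actual DHT. `MetadataRing`, the virtual-node count, and the server names are all hypothetical.

```python
import bisect
import hashlib


def _hash(key):
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha1(key.encode()).digest()[:8], "big")


class MetadataRing:
    """Consistent-hash ring assigning file paths to metadata servers."""

    def __init__(self, vnodes=64):
        self._vnodes = vnodes
        self._points = []  # sorted list of (ring position, server)

    def add_server(self, server):
        """Place `vnodes` virtual points for `server` on the ring."""
        for i in range(self._vnodes):
            bisect.insort(self._points, (_hash(f"{server}#{i}"), server))

    def remove_server(self, server):
        """Drop every point owned by `server`."""
        self._points = [p for p in self._points if p[1] != server]

    def server_for(self, path):
        """Return the server owning the first point at or after the path."""
        if not self._points:
            raise LookupError("no metadata servers")
        idx = bisect.bisect_left(self._points, (_hash(path),))
        return self._points[idx % len(self._points)][1]


ring = MetadataRing()
for name in ("md1", "md2", "md3"):
    ring.add_server(name)
owner = ring.server_for("/data/FileA")
```

When a server leaves the ring, only the paths it owned are reassigned; every other path keeps its server, which is what makes live addition and removal cheap.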