Network Support for Network-Attached Storage (2000.05.19)
Network Support for Network-Attached Storage
David F. Nagle, Gregory R. Ganger, Jeff Butler, Garth Goodson, Chris Sabol
Carnegie Mellon University
Introduction
• Storage market is growing rapidly
  – Densities increase at 60%/year
  – Cost/byte decreases 35%-50% per year
  – Expected annual growth is at least 60%
• Increasing demands on storage performance
  – Demands for effective sharing, better administrative control, less redundancy
  – Storage performance must cost-effectively scale with customer investment
Limitations on Scaling
• Problem with the current distributed file system architecture
  – The file server machine is the bottleneck point
    • Bytes are copied through peripheral buses ⇔ file server ⇔ client's LAN
  – The file server machine acts as an application-level inter-network router
Limitations on Scaling
• Problem with the traditional client-server protocol stack
  – Data is copied too many times on its way to applications
  – This significantly reduces network-attached storage's sustained bandwidth
Contention
• Cost-effective, scalable storage performance depends on
  – Eliminating the file server's role
    • Drives inject packets directly into the network
  – Efficient networking
    • The traditional protocol stack is the bottleneck point
    • Use user-level networking
Goal of Research
• Scalable storage systems with cost-effective performance
• To that end, this work examines
  – Networking requirements for scalable storage
  – Integration of user-level networking with Network-Attached Storage (NAS)
NAS Architecture
• Scalable storage system requirements
  – Minimize the file manager bottleneck
  – Provide an appropriate degree of integrity & security
• Network-Attached Secure Disk (NASD)
  – Command interface
    • Avoids the file manager bottleneck by reducing client-storage interactions that must be relayed through the file manager
NAS Architecture
• Data-intensive operations (data reads & writes)
  – Go straight to the disk
• Policy-making operations (namespace and access control manipulations)
  – Go to the file manager
• The drive has metadata to map and authorize a request
• Path name resolution is split between the file manager & client
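The data-path/control-path split above can be sketched as a simple request router. This is an illustrative sketch, not the authors' code; the operation names and the `route` function are hypothetical:

```python
# Illustrative sketch: routing NASD-style requests. Data-intensive operations
# go straight to the drive; policy operations (namespace and access-control
# manipulations) are relayed through the file manager.

DATA_OPS = {"read", "write"}                 # served by the drive directly
POLICY_OPS = {"create", "remove", "rename",  # namespace manipulation
              "set_acl"}                     # access-control manipulation

def route(op: str) -> str:
    """Return which component services the given operation."""
    if op in DATA_OPS:
        return "drive"         # direct client <-> drive transfer
    if op in POLICY_OPS:
        return "file_manager"  # policy decision made by the file manager
    raise ValueError(f"unknown operation: {op}")
```

Because reads and writes never touch the file manager, it sees only the (much rarer) policy traffic, which is what removes it as a bottleneck.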
Network Attached Secure Disk
NASD Implementation
• Prototype NASD storage interfaces
  – AFS, NFS, and a striped version of NFS
  – Striping control is encapsulated in a striping manager
    • Striping is transparent to the NASD/NFS file manager & NASD drives
    • The striping manager exports the NASD interface to the file manager
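The striping manager's core job is to present one logical object while translating byte offsets onto the underlying drives. A minimal sketch, assuming a simple round-robin (RAID-0-style) layout; the stripe-unit size is chosen for illustration, not taken from the talk:

```python
# Hypothetical sketch of the striping manager's offset mapping: a logical
# NASD object is spread round-robin across several drives in fixed-size
# stripe units, transparently to the file manager and the drives.

STRIPE_UNIT = 64 * 1024  # bytes per stripe unit (assumed value)

def map_offset(logical_offset: int, num_drives: int):
    """Map a logical byte offset to (drive index, offset within that drive)."""
    unit = logical_offset // STRIPE_UNIT       # which stripe unit overall
    drive = unit % num_drives                  # round-robin drive choice
    local_unit = unit // num_drives            # unit index on that drive
    local_offset = local_unit * STRIPE_UNIT + logical_offset % STRIPE_UNIT
    return drive, local_offset
```

For example, with four drives the second 64 KB unit of the logical object lands at the start of drive 1, and the fifth unit wraps back to drive 0.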
NASD Implementation
• Benchmark
  – Up to 10 clients issuing read requests
    • Reading a single striped NASD/NFS file
    • Reading a single striped SAD/NFS file
    • Each client reading a separate SAD/NFS file
• Testbed
  – NASD drives & clients
    • DEC Alpha 3000/400 (133 MHz, 64 MB, Digital UNIX 3.2g-3)
  – Link
    • 155 Mbps OC-3 ATM
NASD Implementation
Network Support for NAS
• Several issues to consider
  – File system traffic patterns
    • Network file access entails small-message traffic (metadata, commands, attribute manipulation, etc.)
    • Current protocols impose significant connection overhead and long code paths
  – Drive resources
    • Network trends are increasing the resource requirements and complexity of drives
    • Only a much smaller subset of the service classes is needed
Network Support for NAS
  – Cluster SAN, LAN, and WAN
    • High performance, but significant protocol overhead
  – Reliability
    • Complex hardware-based error handling is unnecessary
    • Supporting only essential error handling is desirable for efficiency and flexibility
  – Multicast
    • Efficient multicasting is needed for storage replication
NASD with User Level Networks
• Provide applications with high bandwidth through direct user-level access to the network
  – Effective for high-bandwidth applications
• Virtual Interface Architecture (VIA)
  – User-level NIC access with a protection mechanism
  – Provides a simple application/NIC interface
    • Basic send and receive primitives
    • Remote DMA for reads and writes
NASD over VIA
• Drive resource problem
  – The drive must support the VIA interface
    • Needs a VI connection for each client application
    • Each connection requires
      – 2 KB for state
      – 4 × 8 KB buffers for read & write flow control
    • With 100 clients -> need at most ~3 MB of RAM
• Write burst problem
  – Disk write rate: 25 MB/s
  – Fibre Channel transfer rate: 100 MB/s
  – What if clients write data at 100 MB/s?
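The RAM estimate above follows directly from the per-connection figures given in the talk (2 KB of VI state plus four 8 KB flow-control buffers), which the slide rounds to roughly 3 MB:

```python
# Working through the arithmetic behind the slide's drive-RAM estimate.
KB = 1024

state_per_vi = 2 * KB          # VI connection state
buffers_per_vi = 4 * 8 * KB    # read/write flow-control buffers

per_connection = state_per_vi + buffers_per_vi   # 34 KB per client
clients = 100
total = clients * per_connection                 # ~3.3 MB for 100 clients
```

A few megabytes of extra RAM per connection budget is significant on a commodity disk drive, which is what motivates the RDMA-based approach that follows.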
Using Remote DMA
• Read process
  – The client sends a read command with a pointer to the memory where the data should be stored
  – The drive uses a VIA RDMA write to place the data directly in client RAM
• Write process
  – The client sends a write command with a pointer to the data
  – The drive uses a VIA RDMA read to pull the data out of the client's memory (without interrupting the client CPU)
  – No large drive-side buffers are needed
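The exchange above can be simulated in a few lines. This is a hypothetical sketch for illustration only; the class and function names are invented, and real RDMA works on registered memory regions, not Python objects:

```python
# Minimal simulation of the RDMA-based exchange: the client passes a buffer
# descriptor with its command, and the drive moves data with one-sided
# RDMA-style operations, so it needs no large staging buffers of its own.

class ClientMemory:
    """Stands in for client RAM that the drive can access via RDMA."""
    def __init__(self, size: int):
        self.mem = bytearray(size)

    def rdma_write(self, offset: int, data: bytes):   # drive -> client
        self.mem[offset:offset + len(data)] = data

    def rdma_read(self, offset: int, length: int) -> bytes:  # client -> drive
        return bytes(self.mem[offset:offset + length])

def drive_handle_read(client_mem, buf_offset, disk_data):
    # NASD read: drive pushes data straight into the client's buffer.
    client_mem.rdma_write(buf_offset, disk_data)

def drive_handle_write(client_mem, buf_offset, length, disk):
    # NASD write: drive pulls data from client memory at its own pace
    # (e.g. the 25 MB/s media rate), instead of absorbing a 100 MB/s burst.
    disk.extend(client_mem.rdma_read(buf_offset, length))
```

Because the drive initiates the RDMA read during a write, the client's own RAM effectively becomes the drive's staging buffer.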
Using Remote DMA
• Benefits of RDMA
  – The drive can use the client's RAM as an extended buffer
    • Solves the drive resource problem and the write burst problem
  – The drive can optimize disk scheduling
    • RDMA uses a memory model rather than a stream model
Network Striping & Incast
• File striping is done by the client
  – Clients use a middleware layer to map NASD objects into a single application object
• Problem with reading a striped file
  – The client must receive equal bandwidth from each source
    • Otherwise frequent buffer overruns will reduce performance
  – Use VIA's application-level flow control
    • Link-level flow control is not sufficient
    • Network-level flow control can't understand the higher-level notion of striping
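One simple form of the application-level flow control described above is credit-based: the client splits its receive buffers evenly across the striped sources so no single drive can overrun it. A sketch under assumed details (the function and its policy are not from the talk):

```python
# Hypothetical credit-based flow control for striped (incast) reads: divide
# the client's receive-buffer credits evenly across the source drives, so
# each source is granted roughly equal bandwidth into the client.

def grant_credits(total_buffers: int, num_sources: int) -> list[int]:
    """Split receive-buffer credits evenly across striped sources."""
    base = total_buffers // num_sources
    extra = total_buffers % num_sources
    # The first `extra` sources get one additional credit each.
    return [base + (1 if i < extra else 0) for i in range(num_sources)]
```

Only the application knows how many sources a striped file has, which is why this policy cannot live at the link or network layer.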
Incast Traffic Pattern
NASD over VIA
Conclusion
• For cost-effective, scalable NAS, high-performance & low-latency networking is essential
• User-level networking, such as VIA, is a potential solution
  – VIA's RDMA efficiently reduces the drive's resource requirements
  – VIA's application-level flow control enables striped-file flow control