
Page 1: Network File Systems II

Network File Systems II

Frangipani: A Scalable Distributed File System

A Low-bandwidth Network File System

Page 2: Network File Systems II

Why Network File Systems?

• Scalability
  – support more users and data
  – handle server failure gracefully

• Improved accessibility
  – allow more users access
  – extend the conditions under which access is feasible

Page 3: Network File Systems II

File System Requirements

• Coherence: consistent, predictable file state

• Efficiency: timely reads and writes

• Security: provide access control

• Recoverability: allow backup of file system

Page 4: Network File Systems II

Frangipani and LBFS

• Frangipani file system: transparent scalability
  – easy administration at any scale
  – takes advantage of parallelism for good performance

• Low-Bandwidth File System (LBFS): reduces bandwidth use to increase performance
  – takes advantage of duplicate file information
  – uses caching and compression to limit data volume

Page 5: Network File Systems II

Features of Frangipani

• Petal: shared virtual disk

• Frangipani: provides naming and structure for Petal

• Lock system: distributed across servers

• Leases: manage connections with lower state requirements

• Backups: generated from Petal snapshots using the recovery process

Page 6: Network File Systems II

An Example Configuration

Page 7: Network File Systems II

The Petal Virtual Disk

• Storage read/written in blocks

• Sparse address space: 2^64 bytes

• Physical storage allotted only on write (see the sketch below)

• Allows replication for high availability

• Read-only snapshot feature
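
As a rough illustration of the bullets above, here is a minimal sketch of a sparse virtual disk whose physical storage is allocated only on write. The block size and the dictionary-based block map are assumptions for the sketch, not Petal's actual data structures.

```python
# Minimal sketch of a sparse virtual disk in the spirit of Petal: a huge
# address space where physical storage is allocated only when a block is
# first written. Illustration only; block size is an assumption.

BLOCK_SIZE = 64 * 1024          # assumed block size for the sketch

class SparseVirtualDisk:
    def __init__(self):
        self.blocks = {}        # virtual block number -> bytes

    def write(self, offset: int, data: bytes):
        assert offset % BLOCK_SIZE == 0 and len(data) == BLOCK_SIZE
        self.blocks[offset // BLOCK_SIZE] = data   # storage allotted on write

    def read(self, offset: int) -> bytes:
        # Unwritten regions read back as zeros without consuming storage.
        return self.blocks.get(offset // BLOCK_SIZE, bytes(BLOCK_SIZE))
```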

Page 8: Network File Systems II

Frangipani Disk Layout

• Region 1: Disk configuration info (1 TB)

• Region 2: Log space (1 TB), divided into 256 individual server logs

• Region 3: Allocation bitmaps (3 TB), chunks owned by individual servers

Page 9: Network File Systems II

More Frangipani Disk Layout

• Region 4: Inodes (1 TB), 2^31 inodes of 512 bytes each

• Region 5: Small data blocks (128 TB), 2^35 blocks of 4 KB each

• Region 6: Large data blocks, 2^24 blocks of 1 TB each (region offsets are sketched below)
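
The region sizes above imply where each region starts; the sketch below simply sums them. The resulting offsets are an assumption for illustration, derived only from the sizes listed on these two slides.

```python
# Minimal sketch of the Frangipani disk-layout arithmetic. Region start
# offsets are derived purely by summing the sizes listed above; the actual
# on-disk boundaries are an assumption for illustration.

TB = 2 ** 40

regions = [
    ("config info",        1 * TB),    # Region 1
    ("server logs",        1 * TB),    # Region 2: 256 per-server logs
    ("allocation bitmaps", 3 * TB),    # Region 3
    ("inodes",             1 * TB),    # Region 4: 2**31 inodes * 512 B
    ("small data blocks",  128 * TB),  # Region 5: 2**35 blocks * 4 KB
]

offset = 0
for name, size in regions:
    print(f"{name:20s} starts at {offset:#018x}, size {size // TB} TB")
    offset += size

# Region 6 (large 1 TB blocks) occupies the rest of the 2**64-byte Petal
# address space; physical storage is allocated only when blocks are written.
print(f"{'large data blocks':20s} starts at {offset:#018x}")
```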

Page 10: Network File Systems II

Frangipani Server Logs

• Bounded: 128 KB, split across physical disks

• Circular buffer scheme: 25% reclaimed when full (see the sketch below)

• Uses sequence numbers to mark wrap point

• 1000 to 1600 operations can be held in the log (entry size 80 to 128 bytes)
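
A minimal sketch of the bounded, circular per-server log described above. The entry layout and the details of the reclaim step are assumptions; only the 128 KB bound, the 25% reclaim policy, and the monotonically increasing sequence numbers come from this slide.

```python
# Minimal sketch of a bounded circular server log (illustration only).

LOG_SIZE = 128 * 1024          # 128 KB per-server log
RECLAIM_FRACTION = 0.25        # reclaim a quarter of the log when it fills

class ServerLog:
    def __init__(self):
        self.entries = []      # list of (sequence_number, record) pairs
        self.used = 0          # bytes currently occupied
        self.next_seq = 0      # sequence numbers mark the wrap point

    def append(self, record: bytes):
        if self.used + len(record) > LOG_SIZE:
            self._reclaim()
        self.entries.append((self.next_seq, record))
        self.next_seq += 1
        self.used += len(record)

    def _reclaim(self):
        # Drop the oldest quarter of the log; in the real system these
        # entries must already be reflected in the on-disk metadata.
        target = LOG_SIZE * (1 - RECLAIM_FRACTION)
        while self.entries and self.used > target:
            _, old = self.entries.pop(0)
            self.used -= len(old)
```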

Page 11: Network File Systems II

Server Logging

• Write-ahead redo policy (see the sketch below)

• File metadata and physical file data are updated on disk after the log write

• A Unix daemon handles disk writes every 30 seconds
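
A minimal sketch of the write-ahead ordering only: the redo record reaches the log before the in-place update. The `log` and `disk` helpers are hypothetical placeholders, not the Frangipani implementation.

```python
# Sketch of write-ahead redo ordering. `log` and `disk` are hypothetical
# objects with append/sync and write_metadata methods.

def logged_metadata_update(log, disk, block_id, new_metadata):
    # 1. Append a redo record describing the change and force it to the log.
    log.append({"block": block_id, "metadata": new_metadata})
    log.sync()
    # 2. Only after the log write may the metadata block be written in
    #    place; a periodic daemon does this within roughly 30 seconds.
    disk.write_metadata(block_id, new_metadata)
```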

Page 12: Network File Systems II

Lock Service

• Many reader/single writer “sticky” locks

• Asynchronous communication

• Lamport’s Paxos algorithm replicates infrequently-changed data

• Heartbeat messages determine liveness

Page 13: Network File Systems II

Locking: Avoiding Contention

• At most one lockable data structure per disk sector, which eliminates false sharing

• Each file, directory, or symlink and its inode are treated as a single lockable segment

• A lock-ordering algorithm is used when acquiring multiple locks, to avoid deadlock (see the sketch below)
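
A minimal sketch of deadlock avoidance by acquiring locks in one global order. The ordering key (a lock ID) and the acquire callback are assumptions for illustration, not Frangipani's actual locking code.

```python
# Acquire a set of locks in a single global order so that two servers can
# never each hold a lock the other is waiting for. Lock IDs and the
# acquire() callback are hypothetical.

def acquire_all(lock_ids, acquire):
    for lock_id in sorted(lock_ids):
        acquire(lock_id)

# Example: an operation touching a directory and two files sorts the lock
# IDs before taking them.
held = []
acquire_all({42, 7, 19}, held.append)
print(held)   # [7, 19, 42]
```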

Page 14: Network File Systems II

Crash Recovery

• Detection of a server crash is based on lapsed leases and lack of network response
  – a recovery daemon takes ownership of the failed server's log and locks
  – metadata sequence numbers prevent replay of updates that already reached the disk (see the sketch below)

No high-level semantic guarantee is made to users!

A Petal snapshot can be used for whole-system recovery
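
A minimal sketch of the replay check during recovery, assuming a hypothetical per-block sequence number stored with the metadata; the record format and helpers are illustrative, not the actual recovery daemon.

```python
# Replay a failed server's log without redoing updates that already took
# effect. `read_seq` and `apply_update` are hypothetical helpers; each log
# record is assumed to carry the sequence number of the update it describes.

def recover(log_records, read_seq, apply_update):
    for rec in log_records:
        # Apply the logged update only if the on-disk metadata is older
        # than the record; otherwise it was written before the crash.
        if read_seq(rec["block"]) < rec["seq"]:
            apply_update(rec)
```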

Page 15: Network File Systems II

Performance Benchmarks

[Chart: Connectathon Benchmark — elapsed time in seconds (0 to 3.5) for create, rm, getwd/stat, chmod/stat, write, read, readdir, link/rename, symlink/readlink, and statfs operations, comparing AdvFS Raw, AdvFS NVR, Frangipani Raw, and Frangipani NVR.]

Page 16: Network File Systems II

Frangipani: Conclusions

• Frangipani meets the goals set for it:
  – coherent access
  – easy administration
  – scalable performance (the limit is the network itself)
  – good failure recovery

Deployment at a larger scale will be the true test of Frangipani

Page 17: Network File Systems II

Introduction to LBFS

• Designed for efficient remote file access over low-bandwidth networks
  – exploits similarities between files and file versions
  – the client maintains a large cache of working files
  – compression further reduces data volume
  – uses the NFS protocol for access control and for access to existing file systems

Page 18: Network File Systems II

Why Do We Need LBFS?

• Typical network file systems designed for 10 Mbit/sec or better bandwidth

• Problems when using a network FS over a WAN:
  – interactive programs freeze
  – batch commands run several times slower
  – less aggressive applications are starved
  – some applications may not run at all!

Page 19: Network File Systems II

Why Do We Need LBFS? (continued)

• Downloading and editing files locally can lead to version conflicts

• Upstream bandwidth is still limited with broadband

LBFS eliminates these problems while still preserving consistency

Page 20: Network File Systems II

LBFS File Chunk Scheme

To exploit commonality, files must be broken into chunks

• Server and client keep an index of hashed chunks
  – the server index has chunk hashes for the entire FS
  – the client index has chunk hashes for the working files

Page 21: Network File Systems II

Chunk Creation Algorithm

• Need to handle shifting offsets while keeping the chunk index manageable
  – examine every overlapping 48-byte region of the file
  – with probability 2^-13, consider a region to be a breakpoint, i.e. a chunk end marker

Page 22: Network File Systems II

Rabin Fingerprints

• Rabin fingerprints help find breakpoints (see the sketch below)
  – the data is treated as a polynomial and reduced modulo an irreducible polynomial
  – when the low 13 bits of a region's fingerprint equal a chosen value, that region is selected as a breakpoint
  – given random data, the expected chunk size is 2^13 bytes = 8192 bytes = 8 KB, plus the 48-byte breakpoint
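
A minimal sketch of breakpoint selection under these rules. A real implementation computes a Rabin fingerprint of each 48-byte window; here a simple polynomial hash, recomputed per window, stands in for it, so the structure of the algorithm is shown but not LBFS's actual fingerprint function. The MAGIC constant is an arbitrary choice.

```python
WINDOW = 48       # bytes per overlapping region
LOW_BITS = 13     # a region is a breakpoint when the low 13 bits match MAGIC
MAGIC = 0x0F37    # arbitrary 13-bit target value (assumption)

def fingerprint(window: bytes) -> int:
    # Stand-in for a Rabin fingerprint: a plain polynomial hash of the window.
    h = 0
    for b in window:
        h = (h * 257 + b) & 0xFFFFFFFFFFFF
    return h

def is_breakpoint(window: bytes) -> bool:
    return (fingerprint(window) & ((1 << LOW_BITS) - 1)) == MAGIC

def breakpoints(data: bytes):
    """Yield offsets (just past each selected window) to use as chunk ends."""
    for i in range(len(data) - WINDOW + 1):
        if is_breakpoint(data[i:i + WINDOW]):
            yield i + WINDOW
```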

Page 23: Network File Systems II

File Revisions With Breakpoints

[Figure: a. original file; b. text insertion; c. an insertion that includes a breakpoint; d. elimination of a breakpoint]

Page 24: Network File Systems II

Breakpoint Pathological Cases

Data is usually not random! Worst-case scenarios:
  – every 48-byte region is a breakpoint: the chunk index becomes as large as the file itself
  – no 48-byte region is a breakpoint: huge chunks cost extra time and memory per RPC

Solution: define bounds (see the sketch below):
  – minimum chunk size = 2 KB
  – maximum chunk size = 64 KB
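
A minimal sketch of the chunking loop with these bounds, reusing the hypothetical WINDOW and is_breakpoint() from the earlier sketch; the exact boundary handling in LBFS may differ.

```python
MIN_CHUNK = 2 * 1024     # ignore breakpoints that would make a chunk < 2 KB
MAX_CHUNK = 64 * 1024    # force a boundary once a chunk reaches 64 KB

def chunk(data: bytes):
    """Yield (offset, length) pairs for content-defined chunks."""
    start = 0
    pos = WINDOW
    while pos <= len(data):
        size = pos - start
        hit = size >= MIN_CHUNK and is_breakpoint(data[pos - WINDOW:pos])
        if hit or size >= MAX_CHUNK:
            yield (start, size)
            start = pos
        pos += 1
    if start < len(data):
        yield (start, len(data) - start)    # trailing partial chunk
```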

Page 25: Network File Systems II

The Chunk Database

• Each chunk indexed by the first 64 bits of its SHA-1 hash

• Keys index <file, offset, count> tuples: must update when chunk changes

• LBFS always recomputes the hash value before use (see the sketch below)
  – hash collisions are detected
  – the penalty for bad database data is only a performance hit
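
A minimal sketch of the chunk database lookup: keys are the first 64 bits of each chunk's SHA-1 hash, values are <file, offset, count> tuples, and the full hash is recomputed before the data is trusted. The read_chunk() helper and the in-memory dict are assumptions for illustration.

```python
import hashlib

chunk_db = {}   # key: first 64 bits of SHA-1, value: (file, offset, count)

def add_chunk(sha1_digest: bytes, file, offset, count):
    chunk_db[int.from_bytes(sha1_digest[:8], "big")] = (file, offset, count)

def lookup(sha1_digest: bytes, read_chunk):
    """Return the chunk's data, or None if it is absent or stale."""
    entry = chunk_db.get(int.from_bytes(sha1_digest[:8], "big"))
    if entry is None:
        return None
    data = read_chunk(*entry)   # read (file, offset, count) from the FS
    # Recompute the full SHA-1 before use: a stale entry or a collision on
    # the 64-bit key costs only an extra read, never incorrect data.
    return data if hashlib.sha1(data).digest() == sha1_digest else None
```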

Page 26: Network File Systems II

Benefits Provided by NFS 3

• NFS 3 IDs files by opaque handles that persist through file renaming

• Handles access control for LBFS

• Allows LBFS to use NFS protocol to access existing file systems

• Disadvantage: i-number not changed when file is overwritten, so extra copy required

Page 27: Network File Systems II

LBFS Protocol Enhancements

• Leases save permissions checks and data validation for recently-accessed files

• Uses RPC, but with aggressive pipelining

• Gzip compression

Page 28: Network File Systems II

Maintaining File Consistency

• Close-to-open consistency

• Client needs whole-file cache

• Multiple processes on a single client may write the same file simultaneously (see the sketch below)
  – LBFS writes the file back to the server on each close
  – the last close overwrites previous changes
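
A minimal sketch of close-to-open consistency from the client side; fetch_if_changed() and write_back() are hypothetical helpers, not the LBFS client code.

```python
# Close-to-open consistency: validate the whole-file cache on open, write
# the whole file back on close. Helper callables are hypothetical.

def open_file(path, cache, fetch_if_changed):
    # On open, ensure the cache holds the latest version seen by the server.
    cache[path] = fetch_if_changed(path, cache.get(path))
    return cache[path]

def close_file(path, cache, write_back):
    # On close, push the (possibly modified) whole file back to the server;
    # if several local processes wrote it, the last close wins.
    write_back(path, cache[path])
```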

Page 29: Network File Systems II

Profile of a Read Request

Page 30: Network File Systems II

Profile of a Write Request

Page 31: Network File Systems II

Security: One Concern

• Through systematic use of the CONDWRITE RPC call, it is possible to determine whether a particular hashed chunk already exists in the file system: the answer is given away by variations in response time

Page 32: Network File Systems II

LBFS Server Implementation

• LBFS can run on top of another FS
  – the server pretends to be an NFS client

• The server creates a .lbfs.trash directory at the root of every exported file system
  – temporary files are stored there indefinitely; a random file is garbage-collected when the directory is full

Page 33: Network File Systems II

LBFS Client Implementation

• The client uses the xfs device driver
  – messages pass through a device node in /dev
  – xfs tells LBFS when to transfer file contents to or from the server
  – LBFS fetches files into the client cache and notifies the xfs driver of the bindings between cache contents and open files

Page 34: Network File Systems II

LBFS Performance Testing

LBFS consumed far less bandwidth and allowed better application performance under the test conditions
  – the workloads were typical uses of MS Word, gcc, and ed
  – CIFS, NFS, and AFS were tested (depending on the workload) for comparison
  – a "Leases and Gzip"-only version was also tested

Page 35: Network File Systems II

LBFS: Conclusions

• On low-bandwidth networks, LBFS outperforms the traditional file systems tested
  – similar consistency guarantees
  – implemented as a transparent layer on top of an existing file system
  – public-key cryptography provides security
  – client caching distributes load and reduces network dependency

Page 36: Network File Systems II

Last Word: Frangipani & LBFS

• Both Frangipani and LBFS meet file system and distributed system requirements, but they target different problems:

  – Frangipani achieved transparent scalability without a loss of performance

  – LBFS achieved feasible performance over WANs as a transparent add-on to a traditional FS, using improved protocols and load sharing