AFS
Named After Andrew Carnegie & Andrew Mellon
Carnegie Mellon University
Presented By Christopher Tran & Binh Nguyen
AFS: ANDREW FILE SYSTEM
▪ Abstraction of DFS from users
▪ Accessing a file is similar to using a local file
▪ Scalability with region distribution
▪ Permissions Control with Access Control Lists
▪ University Environment (Large number of users)
▪ Weak Consistency by Design
AFS: PRIMARY FEATURES
▪ Implemented in UNIX at the system call level
▪ Work Unit is the entire file
▪ Applications and users are unaware of distributed system
▪ Kerberos Authentication for use over insecure networks
▪ Access Control Lists (ACLs) control permissions
▪ File Consistency through stateful servers
AFS: IMPLEMENTATION OVERVIEW
1. Application opens a file stored on an AFS server
2. The system call is intercepted by a hook in the workstation kernel and passed to Venus
3. The Andrew cache manager (Venus) checks for a local copy of the file
4. The cache manager checks the callback status
5. If needed, the cache manager forwards the request to the file server
6. If needed, the cache manager receives the file and stores it on the local machine
7. A file descriptor is returned to the application
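The numbered open path above can be sketched in a few lines. This is a hypothetical illustration, not real AFS code; all names (CacheManager, FakeServer, fetch) are made up for the example.

```python
# Hypothetical sketch of the Venus open() path described above.  All names
# (CacheManager, FakeServer, fetch, ...) are illustrative, not real AFS code.

class FakeServer:
    """Stand-in for the Vice file server; counts whole-file fetches."""
    def __init__(self):
        self.fetches = 0

    def fetch(self, fid):
        self.fetches += 1
        return b"file contents"

class CacheManager:
    """Steps 3-7: serve from the local cache while the callback is unbroken."""
    def __init__(self, server):
        self.server = server
        self.cache = {}            # fid -> cached whole file
        self.callback_valid = {}   # fid -> do we still trust the cached copy?

    def open(self, fid):
        # Steps 3-4: local copy present and callback still valid?
        if fid in self.cache and self.callback_valid.get(fid):
            return self.cache[fid]
        # Steps 5-6: fetch the entire file and cache it with a callback promise.
        data = self.server.fetch(fid)
        self.cache[fid] = data
        self.callback_valid[fid] = True
        return data                # step 7: file handed back to the application

cm = CacheManager(FakeServer())
cm.open(412)
cm.open(412)                       # second open is served from the local cache
```

Note how the server is contacted only once for the two opens; this is the property that lets AFS scale to many clients per server.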
AFS: SYSTEM DIAGRAM
AFS: CALL INTERCEPT
AFS: VERSION 1
▪ Clients would constantly check with the server for consistency
   ▪ Checks sent at message intervals
▪ Every message included authentication information
   ▪ Server had to authenticate the source
▪ Messages included the full path to the file
   ▪ Server had to traverse directories
▪ Approximately 20 clients per server (in 1988)
Check file?
It’s Good!
AFS: VERSION 1 PROBLEMS
▪ Servers spent too much time communicating with clients
▪ Clients constantly checking whether a file was consistent increased network traffic
▪ Constantly authenticating messages consumed server CPU time
▪ Traversing directories on every read, write, and file check consumed server CPU time
AFS: VERSION 2
▪ Callback Mechanism – Server promises to inform clients of file changes
   ▪ Stateful Server
▪ 50 Clients per server (in 1988)
▪ Clients request file based on FID
▪ Volumes can exist on any server
“I’ll let you know if something changes”
AFS: CALLBACK
▪ Server keeps track of clients using threads
   ▪ Each client is managed by a separate thread
▪ Client and server communicate via RPC to their respective daemons
   ▪ Server runs the Vice daemon
   ▪ Client runs the Venus daemon
▪ Each file a client opens also gets an AFSCallback structure
   ▪ AFSCallback contains an expiration for how long the callback is valid and how the server will communicate with the client
▪ Clients assume the file is consistent until a server callback is received or the expiration time lapses
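The per-file callback record described above can be sketched as follows. The field names and the TTL value are assumptions for illustration; they are not the real AFSCallback layout.

```python
import time

# Illustrative sketch of the per-file callback record described above; the
# field names are assumptions, not the real AFSCallback structure layout.
class AFSCallback:
    def __init__(self, ttl_seconds, contact):
        self.expires_at = time.time() + ttl_seconds  # how long the promise holds
        self.contact = contact                        # how the server reaches the client

    def still_valid(self, now=None):
        # The client trusts its cached copy until the server breaks the
        # callback or this expiration time lapses.
        now = time.time() if now is None else now
        return now < self.expires_at

cb = AFSCallback(ttl_seconds=300, contact="rpc://client1")
```

The expiration bounds how stale a client can get if a break-callback message is lost.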
AFS: CALLBACK INVALIDATION
(Diagram: an AFS server running the Vice daemon, with one thread per client recording which FID each client holds a callback on: Client1 → 412, Client2 → 412, Client3 → 412, Client4 → 492)
1. Client 3 sends Store(412)
2. Server writes FID 412
3. Server sends invalidate(412) to the other clients holding callbacks on 412
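The invalidation sequence above can be sketched as a minimal server that tracks callback holders per FID and breaks every other holder's callback on a store. Class and method names are illustrative, not real AFS interfaces.

```python
# Minimal sketch of the sequence above: the server tracks which clients hold
# callbacks on which FID and, on a store, breaks every other holder's
# callback.  Class and method names are illustrative.
class Client:
    def __init__(self, name):
        self.name = name
        self.invalidated = []          # FIDs whose cached copies we must drop

    def invalidate(self, fid):
        self.invalidated.append(fid)

class ViceServer:
    def __init__(self):
        self.files = {}                # fid -> contents
        self.callbacks = {}            # fid -> set of clients holding a callback

    def register(self, client, fid):
        self.callbacks.setdefault(fid, set()).add(client)

    def store(self, writer, fid, data):
        self.files[fid] = data                         # 2. write(412)
        for client in self.callbacks.get(fid, set()):
            if client is not writer:
                client.invalidate(fid)                 # 3. invalidate(412)

c1, c2, c3, c4 = (Client(f"client{i}") for i in range(1, 5))
server = ViceServer()
for c in (c1, c2, c3):
    server.register(c, 412)
server.register(c4, 492)
server.store(c3, 412, b"new contents")                 # 1. store(412) from client 3
```

Only clients 1 and 2 receive invalidations: client 3 is the writer, and client 4 holds a callback on a different FID.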
AFS: CALLBACK ISSUES
▪ No description of why the callback was initiated
   ▪ Modified portions
   ▪ Appended data
   ▪ Saved but no data changed
   ▪ File moved
   ▪ Etc.
▪ Client has to re-download the entire file when reading
   ▪ No support for differential updates
▪ If an application reads more data, the file is re-downloaded but updates may not be reflected in the application
   ▪ If a user is reading past the changes in a file, the application is unaware of those changes
AFS: VOLUMES
▪ Collection of files
▪ Does not follow directory path
▪ Mounted to a directory
▪ Venus on client maps the pathname to a FID
▪ Vice on server gets the file based on FID
   ▪ Less directory traversal
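The division of labor above can be sketched as follows: Venus on the client maps the pathname to a FID once, so Vice on the server never walks directories. The example path and FID are made up for illustration.

```python
# Hypothetical sketch of the split described above: Venus on the client maps
# the pathname to a FID, so Vice on the server never walks the directory tree.
# The example path and FID are illustrative.
class Vice:
    def __init__(self, volume):
        self.volume = volume            # fid -> file contents

    def fetch(self, fid):
        return self.volume[fid]         # direct lookup, no path traversal

class Venus:
    def __init__(self, vice, name_map):
        self.vice = vice
        self.name_map = name_map        # pathname -> fid

    def open(self, path):
        fid = self.name_map[path]       # client-side name resolution
        return self.vice.fetch(fid)

vice = Vice({412: b"lecture notes"})
venus = Venus(vice, {"/afs/home/user/notes.txt": 412})
```

Because requests arrive as FIDs, a volume can move between servers without clients having to re-resolve paths.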
AFS: SERVER SCALABILITY
▪ Server Replication: Multiple Servers act as a single logical server
▪ Server keeps track of clients in System Memory using threads
▪ Clients have a heartbeat to the server to make sure server is alive
▪ Volumes can be located on any server and moved to any other server
▪ Volume Read-Only clones used to distribute across physical space
▪ All servers share the same common name space
   ▪ /afs/……..
▪ Local server name space can be unique where volumes are mounted
   ▪ /afs/server2
   ▪ /afs/home/server3
▪ AFS servers have links to other AFS servers for volume locations
   ▪ Servers know which server has the volume containing specific files
AFS: FAULT HANDLING
▪ Client Crash – Worst Case Scenario
   ▪ Upon boot to OS: check local cache against server for consistency
▪ Server Crash – Start Fresh
   ▪ Clients detect the server crashed from missed heartbeats
   ▪ Upon connection: clients re-establish communication
   ▪ Server rebuilds its client list
   ▪ Clients check file consistency
Client: “I crashed or the server crashed, so I’m probably wrong”
Server: “Uptime: 0 seconds. Let’s GO!”
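The missed-heartbeat detection mentioned above can be sketched in a few lines. The interval and missed-beat threshold are illustrative assumptions, not AFS constants.

```python
# Sketch of the heartbeat-based crash detection described above; the interval
# and missed-beat threshold are illustrative assumptions, not AFS constants.
HEARTBEAT_INTERVAL = 10    # seconds between client heartbeats (assumed)
MISSED_LIMIT = 3           # silent intervals before declaring the server dead

def server_is_down(last_ack, now):
    # After MISSED_LIMIT missed heartbeats the client assumes a server crash,
    # re-establishes communication, and re-checks its cached files.
    return now - last_ack > MISSED_LIMIT * HEARTBEAT_INTERVAL
```

Once this predicate trips, the client falls back to the recovery steps above: reconnect, let the server rebuild its client list, and re-validate cached files.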
AFS: WEAK CONSISTENCY
▪ Condition
   ▪ Two or more clients have the file open
   ▪ Two or more clients modify the file
   ▪ Two or more clients close the file, writing it back
▪ Result
   ▪ The client whose store() is received by the server LAST provides the current file
“I got here FIRST!”
“I got here LAST, I WIN!”
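The last-writer-wins outcome above can be sketched directly: each close() sends a whole-file store(), and whichever store the server receives last simply replaces the file. Names are illustrative.

```python
# Sketch of the last-writer-wins outcome above: each close() sends a
# whole-file store(), and whichever store the server receives last simply
# replaces the file.  Names are illustrative.
def apply_stores(stores):
    """stores: list of (client, contents) pairs in server arrival order."""
    winner = None
    for client, contents in stores:
        winner = (client, contents)     # every store overwrites the whole file
    return winner                        # the LAST arrival is the current file

result = apply_stores([("client1", b"first version"),
                       ("client2", b"last version")])
```

No merging is attempted: client1's changes are silently lost, which is exactly the weak-consistency trade-off the next slide justifies.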
AFS: WHY WEAK CONSISTENCY
▪ Majority of all DFS access is reading files
▪ In a university, users rarely modify files simultaneously
   ▪ Users work out of home directories
▪ Simplicity in implementation
   ▪ Allows multiplatform implementation
▪ Does not add complexity to crash recovery
   ▪ No need to resume from a crash point
AFS: ACCESS CONTROL LISTS
▪ Standard Unix/Linux permissions are based on Owner/Group/Other
▪ ACLs allow refined control per user/group
Example: you want to share a directory with only one other person so they can read files.
Linux/Unix: make a group, give the group read access, add the user to the group
ACLs: add the user with read permissions
Months later: you want to give someone else read/write access
Linux/Unix: no clean way to do it without widening “other” permissions, so everyone gains read access
ACLs: add that user with read/write permissions
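The per-user grants in the example above can be sketched as a small ACL table. The "r"/"w" rights here are simplified for illustration; real AFS directory ACLs use the rights rlidwka (read, lookup, insert, delete, write, lock, administer).

```python
# Sketch of per-principal ACL entries as in the example above.  The "r"/"w"
# rights are simplified; real AFS directory ACLs use rlidwka (read, lookup,
# insert, delete, write, lock, administer).
class DirectoryACL:
    def __init__(self):
        self.entries = {}                       # principal -> set of rights

    def grant(self, who, rights):
        self.entries.setdefault(who, set()).update(rights)

    def allows(self, who, right):
        return right in self.entries.get(who, set())

acl = DirectoryACL()
acl.grant("alice", {"r"})           # share the directory read-only with alice
acl.grant("alice", {"r", "w"})      # months later: widen access for alice only
```

Widening alice's rights touches only her entry; no one else's access changes, which is the point of the example.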
AFS: ADVANTAGES
▪ First-read performance is similar to other DFSs
▪ Second-read performance is improved in almost all cases, since read requests far outnumber write requests
▪ Creating new files is similar in performance to other DFSs
▪ Use of ACLs over default file system permissions
▪ For read-heavy scenarios, supports a larger client-to-server ratio
▪ Volumes can be migrated to other AFS servers without interruption
▪ Kerberos authentication allows access over insecure networks
▪ Built into the kernel, so user login handles authentication and UNIX/Linux applications can use AFS without modification
AFS: DISADVANTAGES
▪ Entire file must be downloaded before it can be used
   ▪ Causes noticeable latency when accessing files the first time
▪ Modifications require the entire file to be uploaded to the server
▪ Short reads in large files are much slower than in other DFSs
▪ No simultaneous editing of files
AFS: CONTRIBUTIONS
▪ AFS heavily influenced NFSv4
▪ Basis of the Open Software Foundation’s Distributed Computing Environment
   ▪ Framework for distributed computing in the early 1990s
Current Implementations
▪ OpenAFS
▪ Arla
▪ Transarc (IBM)
▪ Linux Kernel v2.6.10
AFS: SUGGESTED IDEAS
▪ Automatic download of the file when the server sends a consistency invalidation
▪ Smart invalidation by determining whether a user needs to re-download
   ▪ If a user is beyond the changes in a file, there is no need to re-download the entire file
▪ Supporting differential updates
   ▪ Only sending information on what changed
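The differential-update suggestion above can be sketched as a block-by-block comparison that ships only the changed ranges. The block size and patch format are assumptions for illustration, not a real protocol.

```python
# Illustrative sketch of the differential-update idea above: compare
# fixed-size blocks and ship only the ranges that changed, instead of the
# whole file.  The block size and patch format are assumptions.
BLOCK = 4096

def diff_ranges(old, new, block=BLOCK):
    """Return (offset, data) patches where the files' blocks differ."""
    patches = []
    for off in range(0, max(len(old), len(new)), block):
        if old[off:off + block] != new[off:off + block]:
            patches.append((off, new[off:off + block]))
    return patches

def apply_patches(old, patches, new_len):
    """Rebuild the new file from the old cached copy plus the patches."""
    buf = bytearray(old[:new_len].ljust(new_len, b"\0"))
    for off, data in patches:
        buf[off:off + len(data)] = data
    return bytes(buf)
```

A client holding a stale cached copy would then transfer only the patches rather than re-downloading the entire file.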
AFS PERFORMANCE
▪ Andrew Benchmark (still sometimes used today)
   ▪ Simulation of a typical user
▪ Multi-stage benchmark
   ▪ File access
   ▪ File write
   ▪ Compiling a program
▪ Response time in creating various sized files in and out of AFS servers
   ▪ How long until the file is available to be used?
▪ AFS performance was around half that of a file stored locally on a hard drive
AFS PERFORMANCE
File Count   File Size (bytes)   /tmp Seconds   /tmp Avg/File   AFS Seconds    AFS Avg/File
100          8,192               0              0.00            2              0.02
1,000        8,192               2              0.00            13             0.01
10,000       8,192               21             0.00            154            0.02
100,000      8,192               212            0.00            > 20 minutes   n/a
Varying the count of small files
AFS PERFORMANCE
Varying the size of one file
File Count   File Size (bytes)   /tmp Seconds   /tmp Avg/File   AFS Seconds   AFS Avg/File
5            102,400             0              0.00            1             0.20
5            512,000             0              0.00            3             0.60
5            1,024,000           1              0.20            6             1.20
5            2,048,000           1              0.20            13            2.60
5            3,072,000           1              0.20            19            3.80
5            4,096,000           1              0.20            26            5.20
5            5,120,000           3              0.60            32            6.40
5            10,240,000          3              0.60            64            12.80
5            20,480,000          6              1.20            126           25.20
5            40,960,000          13             2.60            270           54.00
AFS PERFORMANCE
▪ The largest impact comes when creating very many small files or very large files
▪ The extra overhead is directly proportional to the total number of bytes in the file(s)
▪ Each individual file has its own additional overhead, but until the number of files gets very large, it is not easy to detect
AFS PERFORMANCE
▪ AFS: server-initiated invalidation
▪ NFS: client-initiated invalidation
▪ Server-initiated invalidation performs better than client-initiated invalidation
AFS PERFORMANCE
                                 Andrew (AFS)   NFS
Total Packets                    3,824          10,225
Packets from Server to Client    2,003          6,490
Packets from Client to Server    1,818          3,735
Network Traffic Comparison
AFS PERFORMANCE
AFS                                                        NFS
Callback mechanism (server-initiated invalidation)         Client-initiated invalidation
Network traffic reduced by callbacks, large buffers        Network traffic increased by limited caching
Stateful servers                                           Stateless servers
Excellent performance in wide-area configurations          Inefficient in wide-area configurations
Scalable; maintains performance in any size installation   Best in small- to medium-size installations
AFS: QUESTIONS
BIBLIOGRAPHY
"AFS and Performance." University of Michigan. Web. Accessed 16 May 2014. <http://csg.sph.umich.edu/docs/unix/afs/>
"Andrew File System." Wikipedia. Wikimedia Foundation. Web. Accessed 16 May 2014. <http://en.wikipedia.org/wiki/Andrew_File_System>
"The Andrew File System." University of Wisconsin. Web. Accessed 16 May 2014. <http://pages.cs.wisc.edu/~remzi/OSTEP/dist-afs.pdf>
Coulouris, George F. Distributed Systems: Concepts and Design. 5th ed. Boston: Addison-Wesley, 2012. Print.
Howard, John H. "An Overview of the Andrew File System." Winter 1988 USENIX Conference Proceedings, 1988.
Kazar, M. L. "Synchronization and Caching Issues in the Andrew File System." Proceedings of the USENIX Winter Technical Conference, 1988.