file systems - new mexico institute of mining and...
TRANSCRIPT
CSE325 Principles of Operating Systems
File Systems
David P. Duggan [email protected]
March 22, 2011
3/22/12 2
External View of File Manager
Hardware
Application Program
File
Mgr
Dev
ice
Mgr
Mem
ory
Mgr
Pr
oces
s Mgr
UNIX Fi
le M
gr
Dev
ice
Mgr
Mem
ory
Mgr
Pr
oces
s Mgr
Windows
open() read()
close()
write()
lseek()
CreateFile() ReadFile() CloseHandle()
SetFilePointer()
WriteFile() mount()
Why do we need files?
3/22/12 CSE325 - File Systems 3
File Management • File is a named, ordered collection of information • File manager’s job
– Storing the information on a device – Mapping the block storage to a logical view – Allocating/deallocating storage – Organizing files efficiently
• What abstraction should be presented to programmer?
3/22/12 CSE325 - File Systems 4
Outline • File Management Overview • File and Directory Structures • File Sharing & Protection • Allocation Methods
– Contiguous allocation – Linked allocation – Indexed allocation
3/22/12 5
File Structure • None - sequence of words, bytes • Simple record structure
– Lines – Fixed length – Variable length
• Complex Structures – Formatted document – Relocatable load file
• Who decides: – Operating system – Program
3/22/12 6
Directories • Set of files • Used to organize files • Translates external file names to internal file names (file
labels, i-nodes, file control blocks) • Treated as a file but with special operations
• Why do we need directories?
• Operations – Search for a file – Create a file – Delete a file – List a directory – Rename a file – Traverse the file
system
3/22/12 CSE325 - File Systems 7
Organize the Directory (Logically) • Grouping – logical grouping of files by properties,
(e.g., all Java programs, all games, …) • Efficiency – locating a file quickly (performance) • Naming – convenient to users
– Two users can have same name for different files – The same file can have several different names
• Directory structures?
3/22/12 CSE325 - File Systems 8
Single-Level Directory
• A single directory for all users
Grouping problem Naming problem
Good News? Problems?
9
Two-Level Directory
• Separate directory for each user
Path name" Can have the same file name for different user" Efficient searching" No grouping capability"
Advantages? Problems?
3/22/12 CSE325 - File Systems 10
Tree-Structured Directories
3/22/12 CSE325 - File Systems 11
Tree-Structured Directories (Cont.) • Efficient searching
• Grouping Capability
• Current directory (working directory)
– cd /spell/mail/prog – type list
CSE325 - File Systems 12
Tree-Structured Directories (Cont.) • Absolute or relative path name • Creating a new file is done in current directory • Delete a file
rm <file-name> • Creating a new subdirectory is done in current
directory mkdir <dir-name>
Example: if in current directory /mail mkdir count
mail"
prog" copy" prt" exp" count"
Deleting “mail” ⇒ deleting the entire subtree rooted by “mail”"
3/22/12 CSE325 - File Systems 13
Acyclic-Graph Directories • Have shared subdirectories and files
3/22/12 CSE325 - File Systems 14
Acyclic-Graph Directories (Cont.) • Two different names (aliasing)
• New directory entry type – Link – another name (pointer) to an existing file – Resolve the link – follow pointer to locate the file
• If dict deletes count ⇒ dangling pointer Solutions:
– Backpointers, so we can delete all pointers Variable size records a problem
– Entry-hold-count solution
3/22/12 CSE325 - File Systems 15
General Graph Directory
3/22/12 CSE325 - File Systems 16
General Graph Directory (Cont.) • How do we guarantee no cycles?
– Allow only links to file not subdirectories – Every time a new link is added use a cycle detection
algorithm to determine whether it is OK
3/22/12 CSE325 - File Systems 17
Directory Implementation • Linear list of file names with pointer to the data
blocks. – simple to program – time-consuming to execute
• Hash Table – linear list with hash data structure. – decreases directory search time – collisions – situations where two file names hash to the
same location – fixed size
3/22/12 CSE325 - File Systems 18
File Sharing • Sharing of files on multi-user systems is desirable
• On distributed systems, files may be shared across a network
• Network File System (NFS) is a common distributed file-sharing method
• Sharing may be done through a protection scheme
3/22/12 19
File Sharing – Remote File Systems • Uses networking to allow file system access between systems
– Manually via programs like FTP – Automatically, seamlessly using distributed file systems – Semi-automatically via the world wide web
• Client-server model allows clients to mount remote file systems from servers – Server can serve multiple clients – Client and user-on-client identification is insecure or
complicated – NFS is standard UNIX client-server file sharing protocol – CIFS is standard Windows protocol – Standard operating system file calls are translated into remote
calls
3/22/12 CSE325 - File Systems 20
File Sharing – Consistency Semantics • Consistency semantics specify how multiple
users are to access a shared file simultaneously – Similar to process synchronization algorithms – Unix file system (UFS) implements:
• Writes to an open file visible immediately to other users of the same open file
• Sharing file pointer to allow multiple users to share the pointer of current location into the file
– Andrew File System (AFS) implemented complex remote file sharing semantics
– AFS has session semantics • Writes only visible to sessions starting after the file is
closed
3/22/12 CSE325 - File Systems 21
Access Lists and Groups • Mode of access: read, write, execute • Three classes of access
RWX a) owner access 7 ⇒ 1 1 1
RWX b) group access 6 ⇒ 1 1 0 RWX c) public access 1 ⇒ 0 0 1
• Ask manager to create a group (unique name), say G, and add some users to the group.
• For a particular file (say game) or subdirectory, define an appropriate access.
owner" group" public"
chmod" 761" game"Attach a group to a file
chgrp G game
22
Protection Mechanisms • Files are OS objects: unique names and a finite set of
operations that processes can perform on them
• Protection domain is a set of {object,rights}, where right is the permission to perform one of the operations
• Each process runs in some protection domain.
• How do we store all the protection domains ?
Domain"
3/22/12 CSE325 - File Systems 23
Access Control Lists (ACL)
Example: Owner’s ID stored in PCB File owner’s ID and access permissions stored in inode
3/22/12 CSE325 - File Systems 24
Capabilities List (C-List)
• Each process has a capability for every resource it can access – Kept with other process meta data – Checked by the kernel on every access
3/22/12 CSE325 - File Systems 25
Allocation Methods • An allocation method refers to how disk blocks
are allocated for files:
– Contiguous allocation
– Linked allocation
– Indexed allocation
26
Contiguous Allocation • Each file occupies a set of contiguous blocks on the
disk • Maps the N file blocks into N contiguous blocks on the
secondary storage device
Head position 237 … First block 785 Number of blocks 25
File descriptor
• Simple – only starting location (block #) and length (number of blocks) are required • Random access
Good news?
3/22/12 CSE325 - File Systems 27
Contiguous Allocation (cont.) • Mapping from logical address to physical
LA/512"
Q"
R"
Block to be accessed = Q + starting address Displacement into block = R
3/22/12 CSE325 - File Systems 28
Contiguous Allocation of Disk Space
• Wasteful of space (dynamic storage-allocation problem)
• Difficult to support dynamic file sizes (files cannot grow)
Drawbacks?
How do we solve this problem?
3/22/12 CSE325 - File Systems 29
Linked Allocation • Each file is a linked list of disk blocks: blocks may
be scattered anywhere on the disk. • Each block contains a header with
– Number of bytes in the block – Pointer to next block
pointer"block ="
3/22/12 CSE325 - File Systems 30
Linked Allocation
• Simple – need only starting address
• Free-space management system – no waste of space
Good news?
• No random access • Seeks can be slow • Mapping of free space
Bad news?
3/22/12 CSE325 - File Systems 31
File-Allocation Table
3/22/12 CSE325 - File Systems 32
Indexed Allocation • Brings all pointers together into the index block. • Logical view.
index table"
3/22/12 CSE325 - File Systems 33
Example of Indexed Allocation
• Simplify seeks • Random access • Dynamic access without external fragmentation • May link indices together (for large files)
Good news?
• Overhead of index block.
Bad news?
3/22/12 CSE325 - File Systems 34
Indexed Allocation – Mapping (Cont.)
"
outer-index"
index table" file"
3/22/12 CSE325 - File Systems 35
Combined Scheme: UNIX (4K bytes per block)
3/22/12 CSE325 - File Systems 36
Unallocated Blocks • How should unallocated blocks be managed? • Need a data structure to keep track of them
– Linked list • Very large • Hard to manage spatial locality
– Block status map (“bit map” or “bit vector”) • Bit per block • Easy to identify nearby free blocks • Useful for disk recovery
3/22/12 CSE325 - File Systems 37
Efficiency and Performance • Efficiency dependent on
– disk allocation and directory algorithms – types of data kept in file’s directory entry
• Performance – disk cache – separate section of main memory for
frequently used disk blocks – free-behind and read-ahead – techniques to optimize
sequential access – improve PC performance by dedicating section of
memory as virtual disk, or RAM disk