cs431-cotter 1
File Systems
Tanenbaum Chapter 4
Silberschatz Chapters 10, 11, 12
cs431-cotter 2
Essential requirements for long-term information storage:
• It must be possible to store a very large amount of information.
• The information must survive the termination of the process using it.
• Multiple processes must be able to access the information concurrently.
File Systems
Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall, Inc. All rights reserved. 0-13-6006639
cs431-cotter 3
File Structure
• None:
  – File can be a sequence of words or bytes
• Simple record structure:
  – Lines
  – Fixed length
  – Variable length
• Complex structure:
  – Formatted documents
  – Relocatable load files
• Who decides?
cs431-cotter 4
Think of a disk as a linear sequence of fixed-size blocks and supporting reading and writing of blocks. Questions that quickly arise:
• How do you find information?
• How do you keep one user from reading another’s data?
• How do you know which blocks are free?
File Systems
cs431-cotter 5
Figure 4-1. Some typical file extensions.
File Naming
cs431-cotter 6
File Access Methods
• Sequential Access
  – Based on a magnetic tape model
  – read next, write next
  – reset
• Direct Access
  – Based on fixed-length logical records
  – read n, write n
  – position to n
  – relative or absolute block numbers
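The two access methods can be contrasted with a short sketch. It assumes POSIX `lseek` and `read`; the helper names `read_record` and `read_next` are hypothetical and only illustrate "position to n, read n" versus "read next".

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Direct access: read fixed-length record number n (0-based) from fd.
   Returns the number of bytes read, or -1 on error. */
ssize_t read_record(int fd, long n, size_t reclen, void *buf)
{
    off_t pos = (off_t)n * (off_t)reclen;      /* "position to n" */
    if (lseek(fd, pos, SEEK_SET) == (off_t)-1)
        return -1;
    return read(fd, buf, reclen);              /* "read n" */
}

/* Sequential access: "read next" is simply a read at the current offset. */
ssize_t read_next(int fd, size_t reclen, void *buf)
{
    return read(fd, buf, reclen);
}
```

After a direct read of record n, a sequential `read_next` returns record n + 1, because both calls advance the same file offset.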
cs431-cotter 7
Figure 4-2. Three kinds of files. (a) Byte sequence. (b) Record sequence. (c) Tree.
File Structure
cs431-cotter 8
Figure 4-3. (a) An executable file. (b) An archive.
File Types
cs431-cotter 9
Figure 4-4a. Some possible file attributes.
File Attributes
cs431-cotter 10
File Operations
The most common system calls relating to files:
• Create
• Delete
• Open
• Close
• Read
• Write
• Append
• Seek
• Get Attributes
• Set Attributes
• Rename
cs431-cotter 11
Figure 4-5. A simple program to copy a file.
Example Program Using File System Calls (1)
cs431-cotter 12
Figure 4-5. A simple program to copy a file.
Example Program Using File System Calls (2)
cs431-cotter 13
Directory Structure
• Collection of nodes containing information on all files
F1F2
F3
F4 F5
cs431-cotter 14
Information in a Device Directory
• File name
• File type
• Address
• Current length
• Maximum length
• Date last accessed (for archiving)
• Date last updated (for dumping)
• Owner ID
• Protection information
cs431-cotter 15
Directory Operations
• Search for a file
• Create a file
• Delete a file
• List a directory
• Rename a file
• Traverse the file system
cs431-cotter 16
Objectives for a Directory System
• Make it efficient
  – It should be easy to locate a file quickly
• Make file (and directory) naming convenient
  – Allow two users to use the same name for different files
  – Allow the same file to have more than one name
• Allow logical grouping of files
  – All word processing files together
  – All C++ files together
  – etc.
cs431-cotter 17
Alternative Directory Structures
• Single-Level Directory
• Issues:– Naming– Grouping
cat bo a test data mail cont hex word calc
cs431-cotter 18
Alternative Directory Structures
• Two-Level Directory
User1 User2 User3
cs431-cotter 19
Tree-Structured Directory
cs431-cotter 20
Figure 4-8. A UNIX directory tree.
Path Names
cs431-cotter 21
File System Structure
• A data structure on a disk that holds files
• Disk storage attributes:
  – Allows direct access to any given block of information
  – Supports rewriting in place (copy block, modify, rewrite)
• Organized in layers:
  – Application programs
  – Logical file system
  – File-organization module
  – Basic file system
  – I/O control
  – Devices
cs431-cotter 22
File System Organization
• Devices
  – Peripheral equipment: disk drives, tapes, etc.
• I/O Control
  – Device drivers and interrupt handlers
  – Communicates directly with peripheral devices
  – Starts and closes out I/O requests
• Basic File System
  – Requires knowledge of the physical device (track, cylinder, head, etc.)
  – Manages data transfer at the block level (buffering, placement)
cs431-cotter 23
File System Organization
• File-organization Module
  – Maps the logical block structure of a file to the physical block structure of the disk
  – Includes the free-space manager
• Logical file system
  – Manages file directory information
  – Provides security and protection
• Applications
  – Maintain basic data about the file
  – Allow users to access file information
cs431-cotter 24
File System Structure
• Access to File
  – file control block (generic)
  – file descriptor (UNIX)
  – file handle (Windows)
• File System Requirements
  – boot mechanism
  – file system information
  – file information
  – file data
cs431-cotter 25
Allocation Methods
Contiguous Allocation
• Each file occupies a set of contiguous blocks on the disk.
• The number of blocks needed is specified at file creation
  – May be increased later by adding extents (additional contiguous chunks)
• Advantages:
  – Simple to implement (only a start block and a length are needed)
  – Good for random access to data
• Disadvantages:
  – Files cannot easily grow
  – Wastes space (external fragmentation)
cs431-cotter 26
Contiguous Allocation
Disk blocks (numbered 0-34):
 0  1  2  3  4
 5  6  7  8  9
10 11 12 13 14
15 16 17 18 19
20 21 22 23 24
25 26 27 28 29
30 31 32 33 34

File Allocation Table
File Name   Start Block   Length
FileA        2            3
FileB        9            5
FileC       18            8
FileD       30            2
FileE       26            3
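With contiguous allocation, mapping a logical block to a physical block is one addition. The sketch below uses the start/length pairs from the slide's table (FileA at 2 for 3 blocks, FileB at 9 for 5, and so on); the struct and function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

struct dir_entry {
    const char *name;
    int start;    /* first physical block */
    int length;   /* number of blocks */
};

/* Values taken from the File Allocation Table on the slide. */
static const struct dir_entry directory[] = {
    {"FileA", 2, 3}, {"FileB", 9, 5}, {"FileC", 18, 8},
    {"FileD", 30, 2}, {"FileE", 26, 3},
};

/* Physical block holding logical block n of the named file, or -1. */
int contiguous_block(const char *name, int n)
{
    for (size_t i = 0; i < sizeof directory / sizeof directory[0]; i++) {
        if (strcmp(directory[i].name, name) == 0) {
            if (n < 0 || n >= directory[i].length)
                return -1;                  /* past the end of the file */
            return directory[i].start + n;  /* random access in O(1) */
        }
    }
    return -1;                              /* no such file */
}
```

The bounds check also shows why growth is hard: FileE ends at block 28, so it has only block 29 free before FileD begins at block 30.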
cs431-cotter 27
Allocation Methods
Linked Allocation
• Each file consists of a linked list of disk blocks.
• Advantages:
  – Simple to use (only a starting address is needed)
  – Good use of free space
• Disadvantages:
  – Random access is difficult
[Figure: a chain of blocks, each holding a pointer and data; the pointer in the last block is null]
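Why random access is difficult becomes clear in code: reaching logical block n means following n pointers. This is a minimal sketch in which the per-block pointers are modeled as an in-memory array; the array, the `END_OF_CHAIN` marker, and the function name are assumptions for illustration, not part of the slides.

```c
#define END_OF_CHAIN (-1)

/* next_block[b] models the pointer stored inside disk block b.
   Returns the physical block holding logical block n of the file that
   starts at 'start', or END_OF_CHAIN if the file is shorter than that. */
int linked_block(const int next_block[], int start, int n)
{
    int b = start;
    if (n < 0)
        return END_OF_CHAIN;
    while (n-- > 0 && b != END_OF_CHAIN)
        b = next_block[b];      /* on a real disk: one block read per hop */
    return b;
}
```

Each hop would cost a disk read on a real device, so the cost of reaching block n grows linearly with n.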
cs431-cotter 28
Linked Allocation
0 1 2 3 4
5 6 7 8 9
10 11 12 13 14
15 16 17 18 19
20 21 22 23 24
25 26 27 28 29
30 31 32 33 34
FileB File Allocation Table
File Name Start Block End
... ... ...
......FileB 28
...1
cs431-cotter 30
Allocation Methods
Indexed Allocation
• Collect all block pointers into an index block.
• Advantages:
  – Random access is easy
  – No external fragmentation
• Disadvantages:
  – Overhead of the index block
Index Table
cs431-cotter 31
Indexed Allocation
Index block (block 24) contents: 1, 8, 3, 14, 28

Disk blocks (numbered 0-34):
 0  1  2  3  4
 5  6  7  8  9
10 11 12 13 14
15 16 17 18 19
20 21 22 23 24
25 26 27 28 29
30 31 32 33 34

File Allocation Table
File Name   Index Block
Jeep        24
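With an index block, finding logical block n is a single array lookup. The sketch assumes the index block for "Jeep" lists blocks 1, 8, 3, 14, 28 (read off the slide's figure); the struct layout and names are hypothetical.

```c
#define MAX_INDEX_ENTRIES 8

struct index_block {
    int count;                       /* number of valid pointers */
    int block[MAX_INDEX_ENTRIES];    /* physical block numbers, in file order */
};

/* Physical block holding logical block n, or -1 if n is out of range. */
int indexed_block(const struct index_block *ib, int n)
{
    if (n < 0 || n >= ib->count)
        return -1;
    return ib->block[n];             /* no chain to follow */
}
```

The price is the index block itself: even a one-block file needs a second block just to hold its pointers.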
cs431-cotter 33
UNIX File System
File Types
• Regular File
• Directory
• Block-oriented Device
• Character-oriented Device
• Symbolic Link
cs431-cotter 34
UNIX File Ownership
[Figure: permission bits, one row each for owner, group, and others, with columns read, write, and execute]
(for directories: read, write, search)
-rwx------ 1 rsmith is 785 Feb 11 1994 send_exe
drwxr-xr-x 2 rsmith is 512 Apr 16 13:49 sysmon_server
cs431-cotter 35
File Access Model
• open -- read -- write -- close
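fdesc = open(filename, O_RDWR);      /* or O_RDONLY, O_WRONLY; optionally | O_APPEND, O_CREAT */
n = read(fdesc, buff, 24);
n = write(fdesc, buff, 24);
lseek(fdesc, 100L, SEEK_SET);        /* whence: SEEK_SET, SEEK_CUR, SEEK_END (older BSD: L_SET, L_INCR, L_XTND) */
close(fdesc);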
cs431-cotter 36
Concurrent File Access
• From separate processes
  – open() generates a new file descriptor.
  – Allows independent access to the same file.
  – No inherent blocking in access
• From parent / child processes
  – fork() duplicates all active file descriptors.
  – All processes access a common “current” pointer (file offset)
• File locking
  – flock, lockf
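The difference between an independent open and an inherited descriptor can be observed from a single process. This sketch uses `dup()` as a stand-in for what `fork()` does to descriptors (both leave two descriptors sharing one file table entry); the function name and the temporary path are hypothetical.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Returns 1 if a dup'ed descriptor shares the file offset while a
   separately opened one does not, 0 if not, -1 if setup failed. */
int offset_sharing_demo(void)
{
    char path[] = "/tmp/offset_demo_XXXXXX";
    char buf[4];
    int result = -1;

    int fd = mkstemp(path);              /* first open, read/write */
    if (fd < 0)
        return -1;
    int same = dup(fd);                  /* shares fd's file table entry */
    int other = open(path, O_RDONLY);    /* gets its own file table entry */

    if (same >= 0 && other >= 0 &&
        write(fd, "0123456789", 10) == 10 &&
        lseek(fd, 0, SEEK_SET) == 0 &&
        read(fd, buf, sizeof buf) == (ssize_t)sizeof buf) {
        off_t shared = lseek(same, 0, SEEK_CUR);   /* moved along with fd */
        off_t indep = lseek(other, 0, SEEK_CUR);   /* still at the start */
        result = (shared == 4 && indep == 0);
    }
    if (same >= 0) close(same);
    if (other >= 0) close(other);
    close(fd);
    unlink(path);
    return result;
}
```

Because related processes share the offset, concurrent writers usually need `flock` or `lockf`, or an append-only discipline, to avoid interleaving surprises.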
cs431-cotter 37
UNIX File Sharing - unrelated processes -
[Figure: each process table entry holds its own descriptor (fd 0: fd flags, ptr) pointing to a separate file table entry (file status flags, current file offset, v-node ptr); both file table entries point to the same v-node table entry (v-node info, i-node info, file size)]
cs431-cotter 38
UNIX File Sharing - related processes -
[Figure: both process table entries (fd 0: fd flags, ptr) point to the same file table entry (file status flags, current file offset, v-node ptr), which points to one v-node table entry (v-node info, i-node info, file size)]
cs431-cotter 39
UNIX i-node
[Figure: i-node fields: mode, owners (2), timestamps (3), size, block count, direct block pointers, single indirect, double indirect, triple indirect; the direct pointers reference data blocks]
cs431-cotter 41
UNIX i-node
[Figure: the same i-node with all levels shown: direct pointers reference data blocks; the single, double, and triple indirect pointers reference blocks of pointers that lead, through one, two, or three levels, to data blocks]
cs431-cotter 42
I-node File Sizes
Level             1 KB block   2 KB block   4 KB block
Direct            12 KB        24 KB        48 KB
Single indirect   256 KB       1 MB         4 MB
Double indirect   64 MB        512 MB       2 GB
Triple indirect   16 GB        256 GB       2 TB
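The per-level figures follow from pointer arithmetic. A minimal sketch, assuming 12 direct pointers and 4-byte block pointers (both inferred from the 1 KB column, not stated on the slide); `level_capacity` is a hypothetical helper.

```c
#include <stdint.h>

/* Bytes addressable through one level of the i-node.
   level 0 = direct pointers, 1 = single indirect, 2 = double, 3 = triple. */
uint64_t level_capacity(uint64_t block_size, int level)
{
    const uint64_t ndirect = 12;     /* assumption: 12 direct pointers */
    const uint64_t ptr_size = 4;     /* assumption: 4-byte block pointers */

    if (level == 0)
        return ndirect * block_size;

    uint64_t ptrs_per_block = block_size / ptr_size;
    uint64_t blocks = 1;
    for (int i = 0; i < level; i++)
        blocks *= ptrs_per_block;    /* pointers multiply at each level */
    return blocks * block_size;
}
```

Under these assumptions the 1 KB and 2 KB columns match the table exactly. For 4 KB blocks the arithmetic gives 4 GB (double indirect) and 4 TB (triple indirect); the 2 G and 2 T entries in the table may reflect other limits, such as the width of file-offset or block-count fields, that this sketch does not model.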
cs431-cotter 43
Figure 4-20. Percentage of files smaller than a given size (in bytes).
Disk Space Management Block Size (1)
cs431-cotter 44
Disk Drive Layout (simplified)
[Figure: a disk drive is divided into partitions; a partition is divided into block groups; a block group holds an i-list (a sequence of i-nodes) followed by directory blocks and data blocks]
cs431-cotter 45
Sample filesystem: Creating the directory “testdir”
[Figure: i-list containing i-node 1267 and i-node 2549, plus a directory block]
cs431-cotter 46
[Figure: i-list containing i-node 1267 and i-node 2549, plus a directory block]
Directory block entries:
1267 .
Sample filesystem: Creating the directory “testdir”
cs431-cotter 47
[Figure: i-list containing i-node 1267 and i-node 2549, plus a directory block]
Directory block entries:
1267 .
2549 testdir
Sample filesystem: Creating the directory “testdir”
cs431-cotter 50
Sample filesystem after creating the directory “testdir”
[Figure: i-list containing i-node 1267 and i-node 2549, plus two directory blocks]
Parent directory block (i-node 1267):
1267 .
2549 testdir
New directory block (i-node 2549):
2549 .
1267 ..
cs431-cotter 51
Viewing directory structure
• ls -ai /
  – 2 .
  – :
  – 1730 home
• ls -ai /home
  – 1730 .
  – 2 ..
  – :
  – 199 rcotter
• ls -ai /home/rcotter
  – 199 .
  – 1730 ..
  – :
  – 241224 cs431
• ls -ai ~/cs431
  – 241224 .
  – 199 ..
  – :
  – 245436 exec.cpp
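The listings above trace exactly what path resolution does: start at the root i-node (2) and look up one name per directory. A minimal sketch using the i-node numbers from the listings; the in-memory table and the function names are hypothetical, and the "." and ".." entries are left out for brevity.

```c
#include <stddef.h>
#include <string.h>

struct dir_rec {
    long dir_ino;        /* i-node of the directory holding the entry */
    const char *name;
    long ino;            /* i-node the name refers to */
};

/* Entries taken from the ls -ai output above. */
static const struct dir_rec entries[] = {
    {2, "home", 1730},
    {1730, "rcotter", 199},
    {199, "cs431", 241224},
    {241224, "exec.cpp", 245436},
};

static long lookup(long dir, const char *name, size_t len)
{
    for (size_t i = 0; i < sizeof entries / sizeof entries[0]; i++)
        if (entries[i].dir_ino == dir &&
            strlen(entries[i].name) == len &&
            memcmp(entries[i].name, name, len) == 0)
            return entries[i].ino;
    return -1;
}

/* Resolve an absolute path to an i-node number, or -1 if not found. */
long resolve(const char *path)
{
    long ino = 2;                          /* root directory i-node */
    const char *p = path;
    while (*p) {
        while (*p == '/')
            p++;                           /* skip separators */
        if (!*p)
            break;
        const char *end = strchr(p, '/');
        size_t len = end ? (size_t)(end - p) : strlen(p);
        ino = lookup(ino, p, len);         /* one directory search per component */
        if (ino < 0)
            return -1;
        p += len;
    }
    return ino;
}
```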
cs431-cotter 52
Shared Files (1)
Figure 4-16. File system containing a shared file.
ln -s shared_file new_link
cs431-cotter 53
Figure 4-17. (a) Situation prior to linking. (b) After the link is created. (c) After the original owner removes the file.
Shared Files (2)
cs431-cotter 54
Linux File System Structure
• Linux uses a Virtual File System (VFS)
  – Defines a file object
  – Provides an interface to manipulate that object
• Designed around OO principles
  – File system object
  – File object
  – Inode object (index node)
• Primary file system: ext2fs
  – Supports (or maps) several other systems (MS-DOS, NFS (network drives), VFAT (Windows 95), HPFS (OS/2), etc.)
cs431-cotter 55
Virtual Filesystem
• A kernel software layer that handles all system calls related to a standard UNIX filesystem.
• Supports:
  – Disk-based filesystems
    • IDE hard drives (UNIX, Linux, SMB, etc.)
    • SCSI hard drives
    • Floppy drives
  – Network filesystems
    • Remotely connected filesystems
  – Special filesystems
    • /proc
cs431-cotter 56
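VFS Example
inf = open("/floppy/test", O_RDONLY, 0);
outf = open("/tmp/test", O_WRONLY | O_CREAT | O_TRUNC, 0600);
/* Copy until read() reports end of file (0) or an error (-1). */
while ((cnt = read(inf, buf, 4096)) > 0)
    write(outf, buf, cnt);
close(outf);
close(inf);
[Figure: cp issues system calls to the VFS, which sits above the ext2 and MS-DOS filesystems]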
cs431-cotter 57
Virtual file system Model
• Superblock object– metadata about a mounted filesystem
• inode object– general information about a specific file
• file object– information about process and open file interaction
• dentry object– directory entry information
cs431-cotter 58
Virtual Filesystem and Processes
disk file
Process 1
Process 2
Process 3
file object
dentryobject
dentryobject
file object
file object
superblock inode object
cs431-cotter 59
Ext2fs History
• Minix
• extfs
• ext2fs - 1994
cs431-cotter 60
Ext2fs Blocks
• 1k default
• 2k, 4k possible
• choice based on file sizes
cs431-cotter 61
Ext2fs Disk Data Structures
• Each block group contains:
  – A copy of the superblock
  – A copy of the block group descriptors
  – A data block bitmap (for this group)
  – An inode bitmap (for this group)
  – An inode table (for this group)
  – Data blocks
[Figure: disk layout: boot block, then block group 0 through block group N; each group holds superblock, group descriptors, data block bitmap, inode bitmap, inode table, data blocks]
cs431-cotter 62
Disk Data Structures
• Superblock
  – Contains metadata about the file system as a whole (a copy is kept in each block group)
  – # free blocks
  – # free inodes
  – block size
  – times (last mount, last write)
  – check status
  – blocks to pre-allocate
  – alignments, etc. (~38 items)
cs431-cotter 63
Disk Data Structures
• Block Groups
  – A block group is limited to 8 × block size blocks, because the data block bitmap occupies one block with one bit per block.
  – 1 KB blocks: 8,192 blocks, or 8 MB per block group. 4 KB blocks: 32,768 blocks, or 128 MB per block group.
• Group descriptor: 24 bytes per group
  – Block numbers for bitmaps, start of tables, etc.
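The block group limits are a direct consequence of the one-block bitmap. A small sketch of the arithmetic, with hypothetical helper names:

```c
/* One bitmap block holds 8 bits per byte, one bit per data block. */
unsigned long blocks_per_group(unsigned long block_size)
{
    return 8UL * block_size;
}

/* Maximum bytes covered by one block group. */
unsigned long long group_bytes(unsigned long block_size)
{
    return (unsigned long long)blocks_per_group(block_size) * block_size;
}
```

Note that the group size grows with the square of the block size: quadrupling the block from 1 KB to 4 KB multiplies the group size by 16 (8 MB to 128 MB).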
cs431-cotter 64
Disk Data Structures
• Inode Table
  – Inodes: 128 bytes each
  – File type and access rights
  – Owner
  – Length
  – Timestamps (last access, last change, etc.)
  – Hard link counter
  – Pointers to data blocks
  – etc.
cs431-cotter 65
Disk Data Structures
• Items cached in memory on filesystem mount
  – Superblock: always cached
  – Group descriptor: always cached
  – Block bitmap: fixed limit, most recent only
  – Inode bitmap: fixed limit, most recent only
  – Inode: dynamic
  – Data block: dynamic
cs431-cotter 66
File Blocks Grouped
• Physically and logically
• Inodes distributed throughout disk
• Data blocks allocated close to inode
• Pre-allocation of free blocks to minimize fragmentation
cs431-cotter 67
Summary Features
• Fast Symbolic Links
• pre-allocation of data blocks
• file-updating strategy
• immutable files or append only files
cs431-cotter 68
Journal File Systems
• Designed to speed file system recovery from system crashes.
• Traditional systems use a file system recovery tool (e.g. fsck) to verify system integrity.
• As the file system grows, recovery time grows.
• Journal file systems track changes in metadata to allow rapid recovery from crashes.
cs431-cotter 69
Journal File System
• Uses a version of a transaction log to track changes to file system directory records.
• If a system crash occurs, the transaction log can be reviewed back to a check point to verify any pending work (either undo or redo).
• For large systems, recovery time goes from hours or days to seconds.
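The recovery idea can be sketched as a redo pass over the log: only updates belonging to a transaction with a commit record are reapplied. This is a simplified, redo-only illustration with a hypothetical record layout; real journals also handle checkpoints, undo, and on-disk ordering.

```c
#include <stddef.h>

enum rec_type { REC_UPDATE, REC_COMMIT };

struct log_rec {
    enum rec_type type;
    int txn;      /* transaction id */
    int block;    /* metadata slot being changed (REC_UPDATE only) */
    int value;    /* new contents (REC_UPDATE only) */
};

/* A transaction counts only if its commit record reached the log. */
static int is_committed(const struct log_rec *log, size_t n, int txn)
{
    for (size_t i = 0; i < n; i++)
        if (log[i].type == REC_COMMIT && log[i].txn == txn)
            return 1;
    return 0;
}

/* Redo pass: reapply, in log order, every update of a committed
   transaction; updates of unfinished transactions are ignored. */
void replay_journal(const struct log_rec *log, size_t n, int *metadata)
{
    for (size_t i = 0; i < n; i++)
        if (log[i].type == REC_UPDATE && is_committed(log, n, log[i].txn))
            metadata[log[i].block] = log[i].value;
}
```

Recovery time depends on the length of the log since the last checkpoint rather than on the size of the file system, which is where the hours-to-seconds improvement comes from.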
cs431-cotter 70
Journal File System - Examples
• XFS
  – Commercial port by SGI (IRIX OS)
  – Based on a 64-bit system (ported to 32-bit)
  – Supports large systems: 2 TB -> 9 million TB
  – Uses B+trees for improved performance
  – Journals file system metadata
  – Supports quotas and ACLs
  – Supports filesystem extents (contiguous blocks)
cs431-cotter 71
Journal File System - Examples
• JFS
  – Commercial port by IBM; used in OS/2
  – 64-bit system ported to 32 bits
  – Journal support of metadata
  – System size: 2 TB -> 32 PB
  – Built to scale on SMP architectures
  – Uses B+trees for improved performance
  – Supports use of extents
cs431-cotter 72
Journal File System - Examples
• ReiserFS
  – Designed originally for Linux (32-bit system)
  – Supports file systems of 2 TB -> 16 TB
  – Journal support of metadata
  – B-tree structure supports large file counts
  – Supports block packing for small files
  – No fixed inode allocation, so more flexible
cs431-cotter 73
Journal File System - Examples
• Ext3fs
  – Simple extension of ext2fs that adds journaling
  – All virtual file system operations are journaled.
  – Other limitations of ext2fs remain.
cs431-cotter 74
Journal File System - Examples
• ext4fs
  – Next generation of file system development, incorporating improvements and changes to ext3fs.
  – Journaling filesystem
    • Improvement: journal checksumming
  – Larger file systems: 1 EB (1 million TB)
    • Larger files: 16 TB
  – Delayed block allocation, multi-block allocator
    • Allows files to be written as contiguous sequences of blocks, which improves read performance for streaming data files.
    • Poses the risk of lost data if the system crashes before the data is written.
  – Use of extents
    • Large (<= 128 MB) contiguous runs of physical blocks.
cs431-cotter 75
Free Space Management
• Bit vector management
• One bit for each block
  – 0 = free; 1 = occupied
• Use bit manipulation instructions to find a free block
• Bit vector requires space
  – block size = 4096 bytes = 2^12
  – disk size = 1 gigabyte = 2^30 bytes
  – bits = 2^(30-12) = 2^18 bits = 32 KB
0 1.....11 0
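A bit vector is easy to size and to search. The sketch below follows the slide's convention (0 = free, 1 = occupied) and assumes bit i of the vector lives in byte i / 8 at bit position i % 8; the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes needed for a bitmap covering the whole disk, one bit per block. */
size_t bitmap_bytes(uint64_t disk_bytes, uint64_t block_size)
{
    uint64_t blocks = disk_bytes / block_size;
    return (size_t)((blocks + 7) / 8);
}

/* Index of the first free block (bit == 0), or -1 if none is free. */
long first_free(const uint8_t *map, size_t nblocks)
{
    for (size_t b = 0; b < nblocks; b++) {
        if (b % 8 == 0 && map[b / 8] == 0xFF) {
            b += 7;                  /* whole byte occupied: skip it */
            continue;
        }
        if (!((map[b / 8] >> (b % 8)) & 1u))
            return (long)b;
    }
    return -1;
}

/* Mark block b as occupied. */
void mark_used(uint8_t *map, size_t b)
{
    map[b / 8] |= (uint8_t)(1u << (b % 8));
}
```

For the slide's example, `bitmap_bytes(1 << 30, 4096)` gives 32768 bytes, matching the 32 KB figure.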
cs431-cotter 76
Free Space Management
• Bit vector (advantages):
  – Easy to find contiguous blocks
• Bit vector (disadvantages):
  – Wastes space (bits are allocated even for unavailable blocks)
• Issues:
  – Must keep the bit vector on disk (reliability)
  – Must keep the bit vector in memory (speed)
  – The two copies must never disagree with memory = 1 (occupied) and disk = 0 (free); after a crash the block could be handed out a second time
cs431-cotter 77
Free Space Management
• Linked list management
  – Use a linked list to identify free space
• Advantages:
  – No wasted space
• Disadvantages:
  – Harder to identify contiguous space
• Issues:
  – Must protect the pointer to the free list
cs431-cotter 78
Free Space Management
• Grouping of blocks
Bootblock
Filesystem
descriptor
Filedescriptors
cs431-cotter 79
Figure 4-29. (a) I-nodes placed at the start of the disk. (b) Disk divided into cylinder groups, each with its own blocks and i-nodes.
Reducing Disk Arm Motion
cs431-cotter 80
Figure 4-22. (a) Storing the free list on a linked list. (b) A bitmap.
Keeping Track of Free Blocks (1)
cs431-cotter 81
Figure 4-24. Quotas are kept track of on a per-user basis in a quota table.
Disk Quotas
cs431-cotter 82
Backups to tape are generally made to handle one of two potential problems:
• Recover from disaster.
• Recover from stupidity.
File System Backups (1)
cs431-cotter 83
Figure 4-25. A file system to be dumped. Squares are directories, circles are files. Shaded items have been modified since last dump. Each directory and file is labeled by its i-node number.
File System Backups (2)
cs431-cotter 84
Figure 4-26. Bitmaps used by the logical dumping algorithm.
File System Backups (3)
cs431-cotter 85
Figure 4-27. File system states. (a) Consistent. (b) Missing block. (c) Duplicate block in free list. (d) Duplicate data block.
File System Consistency
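The four states in Figure 4-27 come from keeping two counters per block: how many times the block appears in some file, and how many times it appears in the free list. A minimal sketch of the classification; the enum and function names are hypothetical, and the fifth state (a block that is both in use and free) is added here for completeness and is not one of the figure's panels.

```c
enum block_state {
    BLK_CONSISTENT,      /* exactly one of the two counters is 1 */
    BLK_MISSING,         /* in no file and not in the free list */
    BLK_DUP_FREE,        /* listed more than once in the free list */
    BLK_DUP_DATA,        /* present in more than one file */
    BLK_USED_AND_FREE    /* in a file and also in the free list */
};

/* in_use: occurrences in files; in_free: occurrences in the free list. */
enum block_state classify_block(int in_use, int in_free)
{
    if (in_use > 1)
        return BLK_DUP_DATA;
    if (in_free > 1)
        return BLK_DUP_FREE;
    if (in_use == 0 && in_free == 0)
        return BLK_MISSING;
    if (in_use == 1 && in_free == 1)
        return BLK_USED_AND_FREE;
    return BLK_CONSISTENT;
}
```

A checker such as fsck builds both counters by walking every i-node and the free list, then repairs each block according to its state, for example by adding a missing block back to the free list.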
cs431-cotter 86
Figure 4-28. The buffer cache data structures.
Caching (1)
cs431-cotter 87
• Some blocks, such as i-node blocks, are rarely referenced two times within a short interval.
• Consider a modified LRU scheme, taking two factors into account:
  • Is the block likely to be needed again soon?
  • Is the block essential to the consistency of the file system?
Caching (2)
cs431-cotter 88
Example File Systems
• CD-ROM
  – ISO 9660
  – Rock Ridge
  – Joliet
• MS-DOS
cs431-cotter 89
Figure 4-30. The ISO 9660 directory entry.
The ISO 9660 File System
cs431-cotter 90
Rock Ridge extension fields:
• PX - POSIX attributes.
• PN - Major and minor device numbers.
• SL - Symbolic link.
• NM - Alternative name.
• CL - Child location.
• PL - Parent location.
• RE - Relocation.
• TF - Time stamps.
Rock Ridge Extensions
cs431-cotter 91
Joliet extension fields:
• Long file names.
• Unicode character set.
• Directory nesting deeper than eight levels.
• Directory names with extensions.
Joliet Extensions
cs431-cotter 92
Figure 4-31. The MS-DOS directory entry.
The MS-DOS File System (1)
cs431-cotter 93
Figure 4-32. Maximum partition size for different block sizes. The empty boxes represent forbidden combinations.
The MS-DOS File System (2)
cs431-cotter 94
Summary
• File Structures
• Directory Structures
• File System Implementation
• File System Management
• Example File Systems
cs431-cotter 95
Questions
• Two primary access methods are used to manage files. Discuss these two methods, including which types of physical devices each is normally associated with.
• What is the difference between a tree structured file directory and an acyclic graph directory? Please give an example Operating System used today that uses each.
• Discuss 3 ways in which files might be allocated to file stores (disks). What are the advantages or disadvantages of each?
• How might a file system manage its free space so that it can easily assign space for a new file when needed? Give at least 2 approaches.
• What can an OS designer do to improve the efficiency of a file system? What factors should be considered? How do those affect the efficiency of the system?
• Discuss at least 2 mechanisms that might be used to ensure the recovery of information stored in a file system.