Files – Chapter 2
January 13, 2000
Basic File Processing Operations
Outline
• Physical versus Logical Files
• Opening and Closing Files
• Reading, Writing and Seeking
• Special Characters in Files
• The Unix Directory Structure
• Physical Devices and Logical Files
• Unix File System Commands
Physical versus Logical Files
• Physical File: A collection of bytes stored on a disk or tape.
• Logical File: A “channel” (like a telephone line) that hides the details of the file’s location and physical format from the program.
• When a program wants to use a particular file, “data”, the operating system must find the physical file called “data” and make the hookup by assigning a logical file to it. This logical file has a logical name which is what is used inside the program.
Opening Files
• Once we have a logical file identifier hooked up to a physical file or device, we need to declare what we intend to do with the file:
– Open an existing file
– Create a new file
• That makes the file ready for use by the program. We are positioned at the beginning of the file and are ready to read or write.
Opening Files in UNIX/C
• The UNIX system function open( ) is used to open an existing file or create a new file.
fd = open(filename, flags, [pmode]);
– fd: the file descriptor, which serves as the logical file name. The fd is an integer. If the attempt to open the file fails, fd is negative (-1).
– filename: the physical file name. The filename argument can be a pathname.
– flags: an integer argument that controls the operation of the open function. The value of flags is set by performing a bitwise OR of the following values (exactly one of O_RDONLY, O_WRONLY, or O_RDWR must be included):
• O_APPEND: Append every write operation to the end of the file.
• O_CREAT: Create the file if it does not already exist.
• O_EXCL: Return an error if O_CREAT is specified and the file already exists.
• O_RDONLY: Open a file for reading only.
• O_RDWR: Open a file for reading and writing.
• O_TRUNC: Truncate an existing file to a length of 0, destroying its contents.
• O_WRONLY: Open a file for writing only.
• and others, many of them for synchronization.
Opening Files in UNIX/C (cont’d)
– pmode: an integer argument that specifies the protection mode.
• If O_CREAT is specified, pmode is required.
• In UNIX, the pmode is a three-digit octal number that indicates how the file can be used by the owner (1st digit), by members of the owner’s group (2nd digit), and by everyone else (3rd digit). r: read permission, w: write permission, e: execute permission.
pmode = 0751 (octal):
  r w e   r w e   r w e
  1 1 1   1 0 1   0 0 1
  owner   group   world
• File protection is tied more to the operating system than to a specific language.
– Examples:
fd = open(filename, O_RDWR | O_CREAT, 0751);
fd = open(filename, O_RDWR | O_CREAT | O_TRUNC, 0751);
fd = open(filename, O_RDWR | O_CREAT | O_EXCL, 0751);
Closing Files
• Makes the logical file name available for another physical file (it’s like hanging up the telephone after a call).
• Ensures that everything has been written to the file (data is written to a buffer before it reaches the file).
• Files are usually closed automatically by the operating system when the program terminates normally. If the program is abnormally interrupted, data still held in a buffer may never reach the file.
Reading
• Read(Source_file, Destination_addr, Size)
• Source_file = location the program reads from, i.e., its logical file name
• Destination_addr = first address of the memory block where we want to store the data.
• Size = how much information is being brought in from the file (byte count).
Writing
• Write(Destination_file, Source_addr, Size)
• Destination_file = the logical file name where the data will be written.
• Source_addr = first address of the memory block where the data to be written is stored.
• Size = the number of bytes to be written.
Seeking
• A program does not necessarily have to read through a file sequentially: it can jump to specific locations in the file, or to the end of the file so as to append to it.
• The action of moving directly to a certain position in a file is often called seeking.
• Seek(Source_file, Offset)
– Source_file = the logical file name in which the seek will occur
– Offset = the number of positions in the file the pointer is to be moved from the start of the file.
• The seek function in UNIX/C: lseek( )
pos = lseek(fd, byte_offset, origin)
– pos: a long integer value (of type off_t) returned by lseek( ), equal to the number of bytes from the beginning of the file to the file pointer after it has been moved.
– fd: the file descriptor.
– byte_offset: the number of bytes to move from some origin in the file. The byte_offset is a long integer and can be a negative value.
– origin: a value that specifies the starting position from which the byte_offset is to be taken. The values of origin:
• SEEK_SET: lseek( ) from the beginning of the file;
• SEEK_CUR: lseek( ) from the current position;
• SEEK_END: lseek( ) from the end of the file.
C/C++ streams
• In C/C++, a file (and other devices like the keyboard) is a stream of data.
• There are two sets of I/O operations.
– C streams in stdio.h
– C++ stream classes in iostream.h and fstream.h (<iostream> and <fstream> in standard C++)
• Comparison between UNIX/C operations and C/C++ streams
– both support a complete set of file operations
UNIX/C
• Available mostly on UNIX (also in Microsoft Visual C++)
• Fast
• Low level
C/C++ Streams
• Standard C/C++ features, available on almost all operating systems
• Provide structured I/O
C Streams
• Three standard streams: stdin, stdout, and stderr.
• Opening file
fopen(const char *filename, const char *mode)
• Closing file
fclose(FILE *fp)
• Reading file
fread(void *buf, size_t size, size_t num, FILE *fp) // read num items of size bytes into buf from fp
fgetc(FILE *fp) // return the next character from fp
fgets(char *buf, int size, FILE *fp) // read a line, or up to size - 1 characters, into buf from fp
fscanf(FILE *fp, const char *format, …) // read and format data from fp
C Streams (Cont.)
• Writing file
fwrite(const void *buf, size_t size, size_t num, FILE *fp) // write num items of size bytes from buf to fp
fputc(int ch, FILE *fp) // write the character ch to fp
fputs(const char *buf, FILE *fp) // write the string in buf to fp
fprintf(FILE *fp, const char *format, …) // write formatted data to fp
• Seeking file
fseek(FILE *fp, long offset, int origin)
• C++ handles file I/O by creating objects of the stream classes.
• Standard stream objects: cin, cout, cerr, clog
• Stream classes:
in file iostream.h: ios, istream, ostream, iostream
in file fstream.h: ifstream, ofstream, fstream
                ios
        istream     ostream
ifstream      iostream      ofstream
              fstream
• Opening file
– constructor
– member function open
• Closing file
– destructor
– member function close
• Reading file
– overloaded extraction operator >>
– many others: read, get, getline
• Writing file
– overloaded insertion operator <<
– many others: write, put
• Seeking file
– seekg: set the read/get pointer
– seekp: set the write/put pointer
The LIST Program
• A simple file processing program: LIST
– Display a prompt for the name of the input file.
– Read the user’s response from the keyboard into a variable called filename.
– Open the file for input.
– While there are still characters to be read from the input file,
• read a character from the file and,
• write the character to the terminal screen.
– Close the input file.
/* read characters from a file and write them to the terminal screen */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>   /* read(), close() */

int main(void)
{
    char c;
    int fd;                                       /* file descriptor */
    char filename[20];

    printf("Enter the name of the file: ");       /* step 1 */
    if (scanf("%19s", filename) != 1)             /* step 2 (bounded, unlike gets) */
        return 1;
    fd = open(filename, O_RDONLY);                /* step 3 */
    if (fd < 0)
        return 1;
    while (read(fd, &c, 1) > 0)                   /* step 4a */
        putchar(c);                               /* step 4b */
    /* write(stdout, &c, 1); does not work: stdout is a FILE *, not a descriptor */
    close(fd);                                    /* step 5 */
    return 0;
}
// listc.cpp
// program using C streams to read characters from a file
// and write them to the terminal screen
#include <stdio.h>

int main(void)
{
    char ch;
    FILE *file;                                   // stream pointer
    char filename[20];

    printf("Enter the name of the file: ");       // Step 1
    if (scanf("%19s", filename) != 1)             // Step 2 (bounded, unlike gets)
        return 1;
    file = fopen(filename, "r");                  // Step 3
    if (file == NULL)
        return 1;
    while (fread(&ch, 1, 1, file) != 0)           // Step 4a
        fwrite(&ch, 1, 1, stdout);                // Step 4b
    fclose(file);                                 // Step 5
    return 0;
}
// listcpp.cpp DO THIS ONE...
// list contents of file using C++ stream classes
#include <fstream>
#include <iomanip>
#include <iostream>
using namespace std;

int main()
{
    char ch;
    fstream file;                                 // declare fstream unattached
    char filename[20];

    cout << "Enter the name of the file: "        // Step 1
         << flush;                                // force output
    cin >> setw(20) >> filename;                  // Step 2 (setw bounds the read)
    file.open(filename, ios::in);                 // Step 3
    file.unsetf(ios::skipws);                     // include white space in read
    while (1) {
        file >> ch;                               // Step 4a
        if (file.fail()) break;
        cout << ch;                               // Step 4b
    }
    file.close();                                 // Step 5
    return 0;
}
Detecting End-of-File
• In UNIX/C
– read returns 0
• Using C streams
– fread returns fewer items than requested (0 when nothing is left to read)
– feof returns true
• Using C++ stream classes
– fail returns true
– eof returns true
Special Characters in Files I
• Sometimes the operating system attempts to make a “regular” user’s life easier by automatically adding or deleting characters for them.
• These modifications, however, make the life of programmers building sophisticated file structures (YOU) more complicated!
Special Characters in Files II: Examples
• Control-Z is added at the end of all files (MS-DOS). This is to signal an end-of-file.
• <Carriage-Return> + <Line-Feed> are added to the end of each line (again, MS-DOS).
• <Carriage-Return> is removed and replaced by a character count on each line of text (VMS)
The Unix Directory Structure I
• Any computer system holds many files (hundreds or thousands). These files need to be organized using some method. In Unix, this organization is called the File System.
• The Unix File System is a tree-structured organization of directories, with the root of the tree represented by the character “/”.
• Each directory can contain regular files or other directories.
• The file name stored in a Unix directory corresponds to its physical name.
The Unix Directory Structure II
• Any file can be uniquely identified by giving its absolute pathname. E.g., /usr6/mydir/addr. (see the next slide)
• The directory you are in is called your current directory.
• You can refer to a file by the path relative to the current directory.
• “.” stands for the current directory and “..” stands for the parent directory.
Physical Devices and Logical Files
• Unix has a very general view of what a file is: it corresponds to a sequence of bytes with no worries about where the bytes are stored or where they come from.
• Magnetic disks or tapes can be thought of as files and so can the keyboard and the console.
• No matter what the physical form of a Unix file (real file or device), it is represented in the same way in Unix: by an integer.
Stdout, Stdin, Stderr
• Stdout --> Console
fwrite(&ch, 1, 1, stdout);
• Stdin --> Keyboard
fread(&ch, 1, 1, stdin);
• Stderr --> Standard Error (again, Console)
[When the compiler detects an error, the error message is written to this file]
I/O Redirection and Pipes
• < filename [redirect stdin to “filename”]
• > filename [redirect stdout to “filename”]
E.g., a.out < my-input > my-output
• program1 | program2 [take any stdout output from program1 and use it in place of any stdin input to program2]
E.g., list | sort
Unix System Commands
• cat filenames --> Print the content of the named text files.
• tail filename --> Print the last 10 lines of the text file.
• cp file1 file2 --> Copy file1 to file2.
• mv file1 file2 --> Move (rename) file1 to file2.
• rm filenames --> Remove (delete) the named files.
• chmod mode filename --> Change the protection mode on the named file.
• ls --> List the contents of the directory.
• mkdir name --> Create a directory with the given name.
• rmdir name --> Remove the named directory.