“UNIX”
© CRANES VARSITY ALL RIGHTS RESERVED 1
Chapter-1 INTRODUCTION
History of Unix
• UNIX was "created" 1968-1970. UNIX time is kept from Jan 1, 1970.
• Ken Thompson, Dennis Ritchie (AT&T Bell Labs), Multics, PDP7
• Thompson wrote "B"; Ritchie then developed 'C' from it in 1972
• UNIX rewritten in 'C' in 1973 - greatly increased portability
• 6th edition of UNIX distributed in 1975.
• 7th edition distributed in 1979, ported to Interdata 8/32 (IBM370 like), PDP 11.
• University of California, Berkeley UCB -- BSD UNIX.
Today there are many "flavors" of UNIX with slight differences.
The name "UNIX" is a trademark, originally of AT&T, but since sold several times. Most
variants have names other than "UNIX", and many derive originally from AT&T code.
Why learn Unix?
• UNIX has been called the "Internet Operating System" because much of the
Internet was built around and on UNIX machines (Berkeley's BSD UNIX shipped a
widely used TCP/IP implementation). UNIX had the communications technology that
allowed the Internet to explode.
• Remote login and file transfer (FTP)
• Software development: UNIX has many powerful tools that aid programmers;
the "data streams" philosophy allows tools to be connected together.
• Personal UNIX: Linux for your Mac (or PC).
• High end word processing and desktop publishing
• High end image processing and analysis
Main Features of UNIX
• Multi-user
more than one user can use the machine at a time,
supported via terminals (serial or network connection)
• Multi-tasking
more than one program can be run at a time
• Hierarchical directory structure
to support the organisation and maintenance of files
• Portability
only a small part of the kernel (<10%) is written in assembler, so the operating
system could be easily converted to run on different hardware
• Tools for program development
a wide range of support tools (debuggers, compilers)
Why is Unix so widely used?
• Scalable and portable
Runs in many environments: TV set-top boxes, bank ATMs, phone switches, real-time
systems, PCs, workstations, servers, 12,000 CPU parallel processors.
• Multiuser
Even on a one-user machine it provides a protected environment. Multiple logins allow
shared resources.
• Preemptive multitasking
Each program gets a "chunk" of system resources. Many users and programs
can run at the same time (in other words, it has efficient context switching and good
virtual memory, and idle programs are paged out, so you can do more with fewer
resources).
• Robust
Protected execution space; the system keeps on running. This makes it good for OLTP
(On-Line Transaction Processing) and high-availability applications. Failing user programs
do not crash the whole system (usually).
• Networks and the Internet
UNIX has many "built in" communication tools: news, mail, ftp, rlogin, etc.
Unix OS Structure
Unix is a layered operating system. The innermost layer is the hardware that provides
the services for the OS. The operating system, referred to in Unix as the kernel, interacts
directly with the hardware and provides the services to the user programs.
These user programs do not need to know anything about the hardware. They just need
to know how to interact with the kernel and it is up to the kernel to provide the desired
service. One of the big appeals of Unix to programmers has been that most well written
user programs are independent of the underlying hardware, making them readily
portable to new systems.
User programs interact with the kernel through a set of standard system calls. These
system calls request services to be provided by the kernel. Such services would include
accessing a file (open, close, read, write, link, or execute); starting or updating
accounting records; changing ownership of a file or directory; changing to a new
directory; creating, suspending, or killing a process; enabling access to hardware
devices; and setting limits on system resources.
Unix is a multi-user, multi-tasking operating system. Many users can log in to a
system simultaneously. It is the kernel's job to keep each process and user separate
and to regulate access to system hardware, including CPU, memory, disk and other I/O
devices.
The System Call Interface
In UNIX, all user programs and application software use the system call interface to
access system resources like disks, printers, memory etc. The system call interface in
UNIX provides a set of system calls (C functions).
The purpose of the system call interface is to provide system integrity. As all low-level
hardware access is under control of the operating system, this prevents a user's
program from corrupting the system.
The operating system, upon receiving a system call, validates its authenticity or
permission, then executes it on behalf of the user's program, after which it returns the
results. If the request is invalid or not authenticated, then the operating system does not
perform the request and simply returns an error code to the user's program.
[Figure: Architecture of UNIX. Layers from the inside out: hardware, kernel, and user programs (sh, who, a.out, date, wc, grep, ed, vi, ld, as, comp, cpp, nroff).]
Header Files
Header files declare how a system call is used. A header file contains a prototype of the
system call, giving the parameters (variables) required by the call and the type of the
value returned by it.
When a programmer develops programs, the header file for the particular system call is
incorporated (included) into the program. This allows the compiler to check the number
of parameters and their data type.
The UNIX Operating System
The basic structure of the UNIX operating system can be divided into three parts, as
shown below.
• kernel
schedules programs
manages data/file access and storage
enforces security mechanisms
performs all hardware access
• shell
presents each user with a prompt
interprets commands typed by a user
executes user commands
supports a custom environment for each user
• utilities
file management (rm, cat, ls, rmdir, mkdir)
user management (passwd, chmod, chgrp)
process management (kill, ps)
printing (lp, troff, pr)
program development tools
UNIX Shells
The shell sits between the user and the operating system, acting as a command
interpreter. It reads your terminal input and translates the commands into actions taken
by the system. The shell is analogous to command.com in DOS. When you log into the
system you are given a default shell. When the shell starts up it reads its startup files
and may set environment variables, command search paths, and command aliases, and
executes any commands specified in these files.
The original shell was the Bourne shell, sh. Every Unix platform will either have the
Bourne shell, or a Bourne compatible shell available. It has very good features for
controlling input and output, but is not well suited for the interactive user. To meet the
latter need the C shell, csh, was written and is now found on most, but not all, Unix
systems. It uses C type syntax, the language Unix is written in, but has a more
awkward input/output implementation. It has job control, so that you can reattach a job
running in the background to the foreground. It also provides a history feature, which
allows you to modify and repeat previously executed commands.
The default prompt for the Bourne shell is $ (or #, for the root user). The default prompt
for the C shell is %.
Numerous other shells are available from the network. Almost all of them are based on
either sh or csh with extensions to provide job control to sh, allow in-line editing of
commands, page through previously executed commands, provide command name
completion and custom prompt, etc. Some of the better known of these may be on
your favorite Unix system: the Korn shell, ksh, by David Korn and the Bourne Again
SHell, bash, from the Free Software Foundation's GNU project, both based on sh, the
T-C shell, tcsh, and the extended C shell, cshe, both based on csh. Below we will
describe some of the features of sh and csh.
The shells have a number of built-in, or native commands. These commands are
executed directly in the shell and don't have to call another program to be run. These
built-in commands are different for the different shells.
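One way to check whether a command is built in is the type command found in Bourne-compatible shells. The sketch below assumes such a shell; the exact wording of the output differs from shell to shell.

```shell
# Ask the shell how it would run each name.
# cd has to be a built-in: a separate program could not change
# the working directory of the shell that started it.
cd_kind=$(type cd)
ls_kind=$(type ls)
echo "$cd_kind"   # typically reports a shell built-in
echo "$ls_kind"   # typically reports a path such as /bin/ls
```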
Sh
For the Bourne shell some of the more commonly used built-in commands are:
:       null command
.       source (read and execute) commands from a file
case    multi-way conditional branch
cd      change the working directory (default is $HOME)
echo    write a string to standard output
eval    evaluate the given arguments and feed the result back to the shell
exec    execute the given command, replacing the current shell
exit    exit the current shell
export  share the specified environment variable with subsequent shells
for     loop over a list of words
if      conditional execution
pwd     print the current working directory
read    read a line of input from stdin
set     set variables for the shell
test    evaluate an expression as true or false
trap    trap a signal and execute commands when it arrives
umask   set a default file permission mask for new files
unset   unset shell variables
wait    wait for a specified process to terminate
while   loop while a condition holds
Csh
For the C shell the more commonly used built-in functions are:
alias          assign a name to a function
bg             put a job into the background
cd             change the current working directory
echo           write a string to stdout
eval           evaluate the given arguments and feed the result back to the shell
exec           execute the given command, replacing the current shell
exit           exit the current shell
fg             bring a job to the foreground
foreach        loop over a list of words
glob           do filename expansion on the list, but no "\" escapes are honored
history        print the command history of the shell
if             conditional execution
jobs           list or control active jobs
kill           kill the specified process
limit          set limits on system resources
logout         terminate the login shell
nice command   lower the scheduling priority of command
nohup command  do not terminate command when the shell exits
popd           pop the directory stack and return to that directory
pushd          change to the new directory specified and add the current one to the directory stack
rehash         recreate the hash table of paths to executable files
repeat         repeat a command the specified number of times
set            set a shell variable
setenv         set an environment variable for this and subsequent shells
source         source (read and execute) commands from a file
stop           stop the specified background job
switch         multi-way conditional branch
umask          set a default file permission mask for new files
unalias        remove the specified alias name
unset          unset shell variables
unsetenv       unset shell environment variables
wait           wait for all background processes to terminate
while          loop while a condition holds
Environment Variables
Environment variables are used to provide information to the programs you use. You
can have both global environment and local shell variables. Global environment
variables are set by your login shell and new programs and shells inherit the
environment of their parent shell. Local shell variables are used only by that shell and
are not passed on to other processes. A child process cannot pass a variable back to its
parent process.
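The inheritance rules above can be tried out in a Bourne-compatible shell. This is a minimal sketch; the variable names demo_local and DEMO_GLOBAL are made up for the illustration, and each sh -c starts a child process.

```shell
# A local shell variable: not exported, so child processes never see it.
demo_local="only in this shell"
# A global environment variable: exported, so children inherit a copy.
DEMO_GLOBAL="inherited by children"
export DEMO_GLOBAL

# Each sh -c below runs as a child process of the current shell.
seen_local=$(sh -c 'echo "${demo_local:-unset}"')
seen_global=$(sh -c 'echo "${DEMO_GLOBAL:-unset}"')

# The child changes only its own copy; the parent's value is untouched.
sh -c 'DEMO_GLOBAL="changed in child"'

echo "child saw local:  $seen_local"
echo "child saw global: $seen_global"
echo "parent still has: $DEMO_GLOBAL"
```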
The current environment variables are displayed with the “env” or “printenv”
commands. Some common ones are:
• DISPLAY The graphical display to use
• EDITOR The path to your default editor, e.g. /usr/bin/vi
• GROUP Your login group, e.g. staff
• HOME Path to your home directory, e.g. /home/frank
• HOST The hostname of your system
• IFS Internal field separators, usually any white space (defaults to tab, space and
<newline>)
• LOGNAME The name you login with, e.g. frank
• PATH Paths to be searched for commands, e.g. /usr/bin:/usr/ucb:/usr/local/bin
• PS1 The primary prompt string, Bourne shell only (defaults to $)
• PS2 The secondary prompt string, Bourne shell only (defaults to >)
• SHELL The login shell you’re using, e.g. /usr/bin/bash
• TERM Your terminal type, e.g. xterm
• USER Your username, e.g. frank
Many environment variables will be set automatically when you log in. You can modify
them or define others with entries in your startup files or at any time within the shell.
Some variables you might want to change are PATH and DISPLAY. The PATH variable
specifies the directories to be automatically searched for the command you specify.
Examples of this are in the shell startup scripts below.
We set a global environment variable with a command similar to the following for the
C shell:
% setenv NAME value
and for Bourne shell:
$ NAME=value; export NAME
You can list your global environmental variables with the env or printenv commands.
You unset them with the unsetenv (C shell) or unset (Bourne shell) commands.
To set a local shell variable use the set command with the syntax below for C shell.
Without options set displays all the local variables.
% set name=value
For the Bourne shell set the variable with the syntax:
$ name=value
The current value of the variable is accessed via the “$name”, or “${name}”, notation.
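The braces matter when the variable name is followed directly by other characters. A small Bourne shell sketch, with an invented variable called name, shows the difference:

```shell
name=report
# Without braces the shell looks for a variable called name_old, which is unset here.
without_braces="$name_old"
# Braces mark where the variable name ends.
with_braces="${name}_old"
echo "without braces: '$without_braces'"
echo "with braces:    '$with_braces'"
```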
The Bourne Shell, sh
Sh uses the startup file .profile in your home directory. There may also be a
system-wide startup file, e.g. /etc/profile. If so, the system-wide one will be sourced
(executed) before your local one.
A simple .profile could be the following:
PATH=/usr/bin:/usr/ucb:/usr/local/bin:. # set the PATH
export PATH # so that PATH is available to subshells
# Set a prompt
PS1="{`hostname` `whoami`} " # set the prompt, default is "$"
# functions
ls() { /bin/ls -sbF "$@";}
ll() { ls -al "$@";}
# Set the terminal type
stty erase ^H # set Control-H to be the erase key
eval `tset -Q -s -m ':?xterm'` # prompt for the terminal type, assume xterm
# Restrict permissions on newly created files
umask 077
Whenever a # symbol is encountered the remainder of that line is treated as a
comment. In the PATH variable each directory is separated by a colon (:) and the dot (.)
specifies that the current directory is in your path. If the latter is not set it’s a simple
matter to execute a program in the current directory by typing:
./program_name
It’s actually a good idea not to have dot (.) in your path, as you may inadvertently
execute a program you didn’t intend to when you cd to different directories.
A variable set in .profile is set only in the login shell unless you “export” it or source
.profile from another shell. In the above example PATH is exported to any subshells.
You can source a file with the built-in “.” command of sh, i.e.:
. ./.profile
You can make your own functions. In the above example the function ll results in an “ls -al” being done on the specified files or directories.
With stty the erase character is set to Control-H (^H), which is usually the Backspace
key.
The tset command prompts for the terminal type, and assumes “xterm” if we just hit
<CR>. This command is run with the shell built-in, eval, which takes the result from the
tset command and uses it as an argument for the shell. In this case the “-s” option to
tset sets the TERM and TERMCAP variables and exports them.
The last line in the example runs the umask command with the option such that any
files or directories you create will not have read/write/execute permission for group and
other. For further information about sh type “man sh” at the shell prompt.
Job Control
With the C shell, csh, and many newer shells including some newer Bourne
shells, you can put jobs into the background at anytime by appending “&” to the
command, as with sh. After submitting a command you can also do this by typing
^Z (Control-Z) to suspend the job and then “bg” to put it into the background. To
bring it back to the foreground type “fg”.
You can have many jobs running in the background. When they are in the background
they are no longer connected to the keyboard for input, but they may still display output
to the terminal, interspersing with whatever else is typed or displayed by your current
job. You may want to redirect I/O to or from files for the job you intend to background.
Your keyboard is connected only to the current, foreground, job.
The built-in jobs command allows you to list your background jobs. You can use the kill command to kill a background job. With the %n notation you can reference the nth
background job with either of these commands, replacing n with the job number from
the output of jobs. So kill the second background job with “kill %2” and bring the third
job to the foreground with “fg %3”.
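The jobs, fg, bg and %n forms are mainly useful at an interactive prompt. The sketch below is a non-interactive variant for a Bourne-compatible shell: it starts a background job with &, redirects its output to a file as suggested above, and uses $! and wait in place of job numbers. The log file name comes from mktemp and is arbitrary.

```shell
# Temporary file to collect the background job's output.
job_log=$(mktemp)

# Run a job in the background with &, redirecting its output to a file
# so it does not interleave with foreground output on the terminal.
( sleep 1; echo "background job finished" ) > "$job_log" 2>&1 &
job_pid=$!            # process ID of the most recent background job

echo "foreground work continues while job $job_pid runs"

wait "$job_pid"       # block until the background job terminates
job_status=$?
cat "$job_log"
```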
History
The C shell, the Korn shell and some other more advanced shells, retain information
about the former commands you’ve executed in the shell. How history is done will
depend on the shell used. Here we’ll describe the C shell history features.
You can use the history and savehist variables to set the number of previously
executed commands to keep track of in this shell and how many to retain between
logins, respectively. You could put a line such as the following in .cshrc to save the last
100 commands in this shell and the last 50 through the next login.
set history=100 savehist=50
The shell keeps track of the history list and saves it in ~/.history between logins.
You can use the built-in history command to recall previous commands, e.g. to print the
last 10:
% history 10
52 cd workshop
53 ls
54 cd unix_intro
55 ls
56 pwd
57 date
58 w
59 alias
60 history
61 history 10
You can repeat the last command by typing !!:
% !!
53 ls
54 cd unix_intro
55 ls
56 pwd
57 date
58 w
59 alias
60 history
61 history 10
62 history 10
You can repeat any numbered command by prefacing the number with a !, e.g.:
% !57
date
Tue Apr 9 09:55:31 EDT 1996
Or repeat a command starting with any string by prefacing the starting unique part of the
string with a !, e.g.:
% !da
date
Tue Apr 9 09:55:31 EDT 1996
When the shell evaluates the command line it first checks for history substitution before
it interprets anything else. Should you want to use one of these special characters in a
shell command you will need to escape, or quote it first, with a \ before the character,
i.e. \!. The history substitution characters are summarized in the following table.
C Shell History Substitution
Command          Substitution Function
!!               repeat last command
!n               repeat command number n
!-n              repeat command n from last
!str             repeat command that started with string str
!?str?           repeat command with str anywhere on the line
!?str?%          select the first argument that had str in it
!:               repeat the last command, generally used with a modifier
!:n              select the nth argument from the last command (n=0 is the command name)
!:n-m            select the nth through mth arguments from the last command
!^               select the first argument from the last command (same as !:1)
!$               select the last argument from the last command
!*               select all arguments to the previous command
!:n*             select the nth through last arguments from the previous command
!:n-             select the nth through next to last arguments from the previous command
^str1^str2^      replace str1 with str2 in its first occurrence in the previous command
!n:s/str1/str2/  substitute str1 with str2 in its first occurrence in the nth command; use gs in place of s to substitute globally
Additional editing modifiers are described in the man page.
Special Unix Features
One of the most important contributions Unix has made to Operating Systems is the
provision of many utilities for doing common tasks or obtaining desired information.
Another is the standard way in which data is stored and transmitted in Unix systems.
This allows data to be transmitted to a file, the terminal screen, or a program, or from a
file, the keyboard, or a program; always in a uniform manner. The standardized
handling of data supports two important features of Unix utilities: I/O redirection and
piping.
With output redirection, the output of a command is redirected to a file rather than to
the terminal screen. With input redirection, the input to a command is given via a file
rather than the keyboard. Other tricks are possible with input and output redirection as
well, as you will see. With piping, the output of a command can be used as input
(piped) to a subsequent command. In this chapter we discuss many of the features and
utilities available to Unix users.
File Descriptors
There are 3 standard file descriptors:
• stdin 0 Standard input to the program
• stdout 1 Standard output from the program
• stderr 2 Standard error output from the program
Normally input is from the keyboard or a file. Output, both stdout and stderr, normally goes
to the terminal, but you can redirect one or both of these to one or more files.
You can also specify additional file descriptors, designating them by a number 3 through
9, and redirect I/O through them.
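As a rough illustration of an additional file descriptor in the Bourne shell, the sketch below opens descriptor 3 on a temporary file with the exec built-in, writes two lines through it, and closes it again. The file name comes from mktemp and is arbitrary.

```shell
log_file=$(mktemp)

exec 3> "$log_file"       # open descriptor 3 for writing to the file
echo "first line" >&3     # write through descriptor 3
echo "second line" >&3
exec 3>&-                 # close descriptor 3 when done

line_count=$(wc -l < "$log_file" | tr -d ' ')
echo "lines written through descriptor 3: $line_count"
```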
File Redirection
Output redirection takes the output of a command and places it into a named file. Input
redirection reads the file as input to the command. The following table summarizes the
redirection options.
File Redirection
Symbol      Redirection
>           output redirect
>!          same as above, but overrides noclobber option of csh
>>          append output
>>!         same as above, but overrides noclobber option on csh and creates the file if it doesn't already exist
|           pipe output to another command
<           input redirection
<<String    read from standard input until "String" is encountered as the only thing on the line. Also known as a "here document" (see Chapter 8).
<<\String   same as above, but don't allow shell substitutions
An example of output redirection is:
cat file1 file2 > file3
The above command concatenates file1 then file2 and redirects (sends) the output to
file3. If file3 doesn't already exist it is created. If it does exist it will either be truncated to
zero length before the new contents are inserted, or the command will be rejected, if the
noclobber option of the csh is set. The original files, file1 and file2, remain intact as
separate entities.
Output is appended to a file in the form:
cat file1 >> file2
This command appends the contents of file1 to the end of what already exists in file2.
(Does not overwrite file2).
Input is redirected from a file in the form:
program < file
This command takes the input for program from file.
To pipe output to another command use the form:
command | command
This command makes the output of the first command the input of the second command.
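The four forms above can be tried together in a scratch directory. This is a minimal sketch for a Bourne-compatible shell; the directory comes from mktemp and the file contents are invented.

```shell
work=$(mktemp -d)                       # scratch directory so no real files are touched

printf 'one\ntwo\n' > "$work/file1"     # > creates (or truncates) the file
printf 'three\n'    > "$work/file2"

cat "$work/file1" "$work/file2" > "$work/file3"   # concatenate into file3
cat "$work/file1" >> "$work/file2"                # append file1 to the end of file2

# < supplies wc's input from a file; | passes wc's output on to tr.
file3_lines=$(wc -l < "$work/file3" | tr -d ' ')
file2_last=$(tail -n 1 "$work/file2")

echo "file3 has $file3_lines lines; file2 now ends with '$file2_last'"
```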
Sh
2> file direct stderr to file
> file 2>&1 direct both stdout and stderr to file
>> file 2>&1 append both stdout and stderr to file
2>&1 | command pipe stdout and stderr to command
To redirect stdout and stderr to two separate files you can do:
$ command 1> out_file 2> err_file
or, since the redirection defaults to stdout:
$ command > out_file 2> err_file
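To see the two streams land in different places, the sketch below defines a small shell function, talk (a made-up stand-in for any command), that writes one line to each stream.

```shell
# Hypothetical command that writes one line to stdout and one to stderr.
talk() {
    echo "normal message"
    echo "error message" >&2
}

out_file=$(mktemp)
err_file=$(mktemp)
both_file=$(mktemp)

talk > "$out_file" 2> "$err_file"   # stdout and stderr to separate files
talk > "$both_file" 2>&1            # both streams to the same file

out_text=$(cat "$out_file")
err_text=$(cat "$err_file")
both_count=$(wc -l < "$both_file" | tr -d ' ')
echo "out: $out_text / err: $err_text / combined lines: $both_count"
```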
With the Bourne shell you can specify other file descriptors (3 through 9) and redirect
output through them. This is done with the form:
n>&m redirect file descriptor n to file descriptor m
We used the above to send stderr (2) to the same place as stdout (1), 2>&1, when we
wanted to have error messages and normal messages go to a file instead of the
terminal. If we wanted only the error messages to go to the file we could do this by
using a placeholder file descriptor, 3. We'll first redirect 3 to 2, then redirect 2 to 1, and
finally, we'll redirect 1 to 3:
$ (command 3>&2 2>&1 1>&3) > file
This makes 3 a copy of the original stderr, sends stderr to where stdout points, and then
sends stdout to 3, the original stderr. So, in effect, we've reversed file descriptors 1 and 2
from their normal meaning. We might use this in the following example:
$ (cat file 3>&2 2>&1 1>&3) > errfile
So if file is read its contents go to the terminal rather than to errfile, but if file can't
be read the error message is put in errfile for your later use.
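The descriptor swap can be checked with a command that writes to both streams. In this sketch talk is a made-up shell function, and the extra 2>/dev/null on the outside is an addition that simply discards the swapped-out normal message instead of showing it on the terminal.

```shell
# Hypothetical command that writes one line to stdout and one to stderr.
talk() {
    echo "normal message"
    echo "error message" >&2
}

err_file=$(mktemp)

# Inside the parentheses stdout and stderr are swapped, so the outer >
# captures only what talk wrote to stderr.
( talk 3>&2 2>&1 1>&3 ) > "$err_file" 2>/dev/null

captured=$(cat "$err_file")
echo "captured: $captured"
```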
You can close file descriptors when you're done with them:
m<&- closes an input file descriptor
<&- closes stdin
m>&- closes an output file descriptor
>&- closes stdout
Other Special Command Symbols
In addition to file redirection symbols there are a number of other special symbols you
can use on a command line. These include:
; command separator
& run the command in the background
&& run the command following this only if the previous command completes
successfully, e.g.:
grep string file && cat file
|| run the command following only if the previous command did not complete
successfully, e.g.:
grep string file || echo "String not found."
( ) the commands within the parentheses are executed in a subshell. The output of the
subshell can be manipulated as above.
' ' literal quotation marks. Don't allow any special meaning to any characters within
these quotations.
\ escape the following character (take it literally)
" " regular quotation marks. Allow variable and command substitution within these
quotations (does not disable $ and \ within the string).
`command` take the output of this command and substitute it as an argument(s) on the
command line
# everything following until <newline> is a comment
The \ character can also be used to escape the <newline> character so that you can
continue a long command on more than one physical line of text.
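Several of these symbols can be seen at work in a few lines of Bourne shell. The sketch below is illustrative only; the file contents and variable names are invented.

```shell
notes=$(mktemp)
echo "needle in a haystack" > "$notes"

# && runs the second command only if the first succeeds.
grep -q needle "$notes" && found="yes"

# || runs the second command only if the first fails.
grep -q thread "$notes" || missing="String not found."

# Single quotes keep $HOME literal; double quotes let it expand.
literal='$HOME'
expanded="$HOME"

# Backquotes substitute the output of a command into the command line.
word_count=`wc -w < "$notes" | tr -d ' '`

# Parentheses run commands in a subshell, so the cd does not affect this shell.
before=`pwd`
( cd / ; pwd > /dev/null )
after=`pwd`
```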
Wild Cards
The shell and some text processing programs will allow meta-characters, or wild cards, and replace them with pattern matches. For filenames these meta-characters
and their uses are:
? match any single character at the indicated position
* match any string of zero or more characters
[abc...] match any of the enclosed characters
[a-e] match any character in the range a,b,c,d,e
[!def] match any character that is not one of the enclosed characters, sh only
{abc,bcd,cde} match any set of characters separated by comma (,) (no spaces), csh
only
~ home directory of the current user, csh only
~user home directory of the specified user, csh only.
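The filename patterns that sh supports can be tried against a known set of files. The sketch below creates five empty files with invented names in a scratch directory and lets the shell expand each pattern.

```shell
work=$(mktemp -d)         # scratch directory with a known set of files
touch "$work/a1" "$work/a2" "$work/b1" "$work/ab12" "$work/d1"

# The patterns are expanded by the shell before echo runs;
# the command substitution keeps each cd from affecting the current shell.
any_single=$(cd "$work" && echo ?1)        # any one character, then 1
any_string=$(cd "$work" && echo a*)        # a followed by anything
in_range=$(cd "$work" && echo [a-b]1)      # a or b, then 1
not_listed=$(cd "$work" && echo [!ab]1)    # anything but a or b, then 1

echo "$any_single | $any_string | $in_range | $not_listed"
```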
Some Useful UNIX Utility Programs
In addition to the various tools built into whichever shell you use, UNIX normally has a
variety of programs to help you get your work done. These programs are often called
"tools", since you may not be able to accomplish your entire task with one, but a
collection of them will often help you achieve your goal.
Here, we'll introduce some of the more commonly used tools. Remember, these tools
are often best used when combined together using the pipe mechanism described
earlier.
For more detail, consult the man page for these commands.
tee
Tee forms a "T" fitting in pipes ( | ). It will take whatever is fed to it, copy it to a file, and
also feed the same data to its standard output. Thus you can keep a record of whatever
is flowing through some section of your pipes.
Use it as: "tee filename"
For example: ls | sort | tee sorted.list | less
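A self-contained variant of this pipeline, using invented input and wc in place of a pager so that it can run unattended, might look like this:

```shell
work=$(mktemp -d)

# tee saves a copy of the sorted list in a file and also passes it
# down the pipe, where wc counts the lines it receives.
passed_on=$(printf 'pear\napple\nfig\n' | sort | tee "$work/sorted.list" | wc -l | tr -d ' ')

saved_first=$(head -n 1 "$work/sorted.list")
echo "$passed_on lines passed on; saved copy starts with $saved_first"
```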
script
Script is used to make a log file of your session. When you issue the command:
script filename
a new session will be started for you, and every character that's displayed on your
terminal (including your typing that's echoed to the screen) will go into the file. Type
"exit" or [Control-D] to end the log file.
grep
Grep is one of the classic UNIX tools. It will search through its input, and write to its
standard output any lines which contain text which matches a string you give it. This
allows you to quickly search a file, or a group of files for something.
The key to using grep is regular expressions, which are similar to the wildcards
described above. A regular expression is a "formula" which describes what a text string
must contain in order for a "match" to occur. Here are some of the operators which
make up such a "formula":
char - just match a single character
string - match an occurrence of string
. - match (almost) ANY character (once)
[string] - match any character in string (once)
[char1-char2] - match any character in ASCII collating sequence
between character char1 and character char2
* - match anything which has zero or more occurrences of the preceding character
^ - make the "formula" match only if it's at the start of a line
$ - make the "formula" match only if it's at the end of a line
\ - use a \ if you want to use special characters, such as "[" or "*"
To use grep, type:
grep expression filename
where expression is a regular expression as described above, and filename is a
filename, or a shell wildcarded filename.
For example:
grep '#include' *.c : list all the include lines in *.c (quoted so that the shell does not treat # as the start of a comment).
ntp.c:#include
ntp.c:#include
ntp.c:#include
test.c:/*#include "ntp.h"*/
grep '^#include' *.c : more precise way to do above.
ntp.c:#include
ntp.c:#include
ntp.c:#include
grep '^#include' *.c | less : pipe to a pager
ps -axu | grep "r.*t" | less : find anything with an "r" followed eventually by a "t"
root 112 0.0 0.0 28 0 ? I Nov 1 0:00 (nfsd)
root 53 0.0 0.0 68 0 ? IW Nov 1 0:04 portmap
root 2 0.0 0.0 0 0 ? D Nov 1 10:47 pagedaemon
dela 2213 0.0 0.0 40 0 co IW Nov 1 0:00 /usr/openwin/bin/xinit -
dela 2321 0.0 0.0 36 0 co IW Nov 1 0:00 rsh augustus.me.rocheste
Finally, grep -v will list every line except those that match the regular expression. The
following example is just like the one above, but skips entries which include "root".
ps -axu | grep "r.*t" | grep -v root | less
dela 2213 0.0 0.0 40 0 co IW Nov 1 0:00 /usr/openwin/bin/xinit -
dela 2321 0.0 0.0 36 0 co IW Nov 1 0:00 rsh augustus.me.rocheste
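The same patterns can be tried on a small file of known content. In this sketch the four-line file is invented for the purpose (including the header name old.h), and grep -c is used to count matching lines instead of printing them.

```shell
src=$(mktemp)
cat > "$src" <<'EOF'
#include <stdio.h>
/* #include "old.h" */
int root = 1;
int rt = 2;
EOF

anywhere=$(grep -c '#include' "$src")     # matches anywhere on the line
at_start=$(grep -c '^#include' "$src")    # ^ anchors the match to the line start
r_then_t=$(grep -c 'r.*t' "$src")         # r, any characters, then t
without_root=$(grep 'r.*t' "$src" | grep -vc root)   # -v drops lines containing root

echo "$anywhere $at_start $r_then_t $without_root"
```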
diff
Diff will list the lines that are different between two files. Typically you'll use this to look
at two versions of the same file to see how it's changed. The output is somewhat
cryptic; it shows you the "ed" commands to change the first file into the second file,
followed by the affected lines from the two files.
Invoke diff via:
diff file1 file2
For example:
diff ntp_proto.c.original ntp_proto.c
176c176
< pkt->status = sys.leap | NTPVERSION_1 | peer->hmode;
---
> pkt->status = sys.leap.year | NTPVERSION_1 | peer->hmode;
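A smaller case shows the same output shape. The sketch below builds two three-line files with invented contents that differ only on line 2, and also records diff's exit status, which is 1 when the files differ.

```shell
work=$(mktemp -d)
printf 'alpha\nbeta\ngamma\n' > "$work/old.txt"
printf 'alpha\nBETA\ngamma\n' > "$work/new.txt"

# diff exits with status 1 when the files differ, 0 when they match.
changes=$(diff "$work/old.txt" "$work/new.txt")
diff_status=$?

echo "$changes"
```

The first line of the output names the change in ed style (line 2 changed to line 2), followed by the old line marked with < and the new line marked with >.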
sort
Sort will sort the contents of a file. By default it will use the ASCII collating sequence
(which is wrong for numbers, 101 will sort before 12). The two options of interest are
"-n", which will sort numerically, and "+#" where # is the number of the word in each line
(start at zero) to sort on. Newer versions of sort replace "+#" with "-k", which counts
words from one, so "+3" becomes "-k 4". For example
ls -l
produces:
-rw-r--r-- 1 dela 5692 Nov 4 16:34 #slides.txt#
-rw-r--r-- 1 dela 849 Oct 30 17:15 outline.txt
-rw-r--r-- 1 dela 4872 Nov 4 16:05 slides.txt
-rw-r--r-- 1 dela 4872 Nov 4 16:12 slides.txt1
ls -l | sort -n +3 | less
produces:
-rw-r--r-- 1 dela 849 Oct 30 17:15 outline.txt
-rw-r--r-- 1 dela 4872 Nov 4 16:05 slides.txt
-rw-r--r-- 1 dela 4872 Nov 4 16:12 slides.txt1
-rw-r--r-- 1 dela 5692 Nov 4 16:34 #slides.txt#
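On systems whose sort no longer accepts the "+#" form, the same result can be had with -k. The sketch below assumes a POSIX sort and uses three made-up lines in the style of the listing above.

```shell
# Three lines in the style of ls -l output; the size is the fourth word.
listing='-rw-r--r-- 1 dela 5692 Nov 4 16:34 notes.txt
-rw-r--r-- 1 dela 849 Oct 30 17:15 outline.txt
-rw-r--r-- 1 dela 4872 Nov 4 16:05 slides.txt'

# -n sorts numerically; -k 4 starts the sort key at the fourth word.
by_size=$(echo "$listing" | sort -n -k 4)
smallest=$(echo "$by_size" | head -n 1)

# Without -n the comparison is character by character, so 101 sorts before 12.
as_text=$(printf '12\n101\n' | sort | head -n 1)

echo "$smallest"
echo "$as_text"
```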
wc
Wc will count the words in a file. It also reports how many lines and characters there are
in a file.
wc slides.txt
204 987 6244 slides.txt
head and tail
Head and tail are two programs that will show the beginning and the end of their input
respectively. They are often used in pipes as well. Both commands will take a numeric
argument that determines how many lines to show (the default is 10).
ls -l | sort -n +3 | head -2
-rw-r--r-- 1 dela 849 Oct 30 17:15 outline.txt
-rw-r--r-- 1 dela 4872 Nov 4 16:05 slides.txt
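Head and tail can also be chained to pick out a single line. The sketch below uses twelve numbered lines as invented input and the -n spelling of the line count, which POSIX specifies in place of the older -2 form.

```shell
# Twelve numbered lines as sample input.
sample=$(printf '%s\n' 1 2 3 4 5 6 7 8 9 10 11 12)

second_line=$(echo "$sample" | head -n 2 | tail -n 1)      # last of the first two
next_to_last=$(echo "$sample" | tail -n 2 | head -n 1)     # first of the last two
default_count=$(echo "$sample" | head | wc -l | tr -d ' ') # no argument: default count

echo "$second_line $next_to_last $default_count"
```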
less
Less is a pager which you use just like more, but it's better. Like more, less will page
through your document when you hit [space], but unlike more you can page backwards
by hitting the "b" key. Also "g" will move you to the start of the file, "G" will move you to
the end of the file. Hit the "h" key when running less for help.
Chapter-2 EDITORS
Introduction
The editor is the basic tool used to create or modify text files under any computer
system. The UNIX system provides several standard editors. One of the most popular of
these is vi.
Vi is useful because it is a screen editor. The file being edited is displayed on screen.
The user can move the cursor (a pointer) around the text. Any changes made to the text
are displayed immediately on the screen.
Most general editing commands are available under vi. The cursor is used to navigate
around a file. Text may be inserted, deleted, or changed, and a range of powerful
pattern matching commands allow large global edits to be performed.
The best way to learn vi is to use it. It isn't necessary to learn all the commands at once.
Indeed this handout describes just a subset of commands available. Users will find that
they develop their own `working set' of commands. These will suit their own type of
work. Occasional glances back at this handout will prove useful, as you may discover
new useful commands and add these to your working set.
Starting vi
The vi editor can be invoked with one of the following command lines, as well as a few
others that are only needed for more advanced users:
Open file under vi:
vi file
Open file at line n:
vi +n file
Open file at first occurrence of pattern:
vi +/pattern file
NOTE: If you start vi with a non-existent filename, vi will open an empty buffer for you; the file itself is created when you first save it.
Modes
The vi editor has the following three modes of operation:
• Command mode, in which keystrokes are interpreted as vi editor subcommands
to be carried out immediately without being displayed.
In command mode numbers act as command modifiers. Entering a number n
before a command means that the command is to be acted upon n times. For
example, the command x deletes 1 character, 5x deletes 5 characters.
• Text input mode, in which a keystroke is interpreted as text to be displayed as it
is added to the file.
• Last line mode, in which all keystrokes, until the enter or return key is pressed,
form a subcommand that appears at the bottom of the screen as you type.
Navigating in vi
Some of the following commands will vary depending on the terminal emulator you are
using. The following commands and control sequences will let you move around in the
editor:
Left, Down, Up, Right (respectively):
h, j, k, l
You can also use the arrow keys.
Move forward one character:
spacebar
Scroll forward or backward one screen (respectively):
^f (Ctrl-f), ^b (Ctrl-b)
You can also use Page-up and Page-down.
Move to the beginning or the end of the file:
1G, G
Move to line number n:
nG
Move to the beginning of the current line:
0
Move to the first non-blank character:
^
Move to the end of the current line:
$
Redraw screen:
^l (Ctrl-l)
Inserting and Editing Text
Insert mode can be started through one of the following commands:
Append after cursor:
a
Append at end of line:
A
Insert before cursor:
i
Insert at beginning of line:
I
Open a line below current line:
o (this is the lower-case letter o)
Open a line above current line:
O (this is the upper-case letter O)
Terminate insert mode:
ESC
Searching
The vi editor can also be used simply for finding data in a file. The use of searches can
simplify this process. The more common search commands are as follows:
Search forward for text:
/text
Repeat previous search:
n
Repeat previous search in opposite direction:
N
Repeat forward search:
/
Deleting and moving text
Eventually, you will need to delete and/or copy and move text. The next few commands
will help you in doing so:
Delete line:
dd
Delete n lines:
ndd
Copy line to a buffer:
yy
Paste current buffer contents back into the text:
p
Marking and copying a section:
1. Move the cursor to the beginning of the section you want to copy.
2. type ma (where a is any letter from a to z)
3. move the cursor to the end of the section you want to copy
4. type y'a (where a is the letter you previously used as a marker)
5. then use p to paste it at your target destination.
Marking and deleting a section
1. move cursor to the beginning of the section you want to delete
2. type ma (where a is any letter from a to z)
3. move the cursor to the end of the section you want to delete
4. type d'a (where a is the letter you previously used as a marker)
Saving and Exiting the vi editor
Now that you have edited and finished whatever file you are working with, you can quit
and save or abort any work you may have done on the file. The next list of commands
will accomplish this:
Quit vi, saving all changes:
ZZ or :x or :wq
Write to file (save):
:w
Write to file (save as):
:w file
Quit file:
:q
Quit file and abort changes since last save:
:q!
Edit file2 without leaving vi:
:e file2
Entering ex Commands in Last Line Mode
The vi editor allows you to enter ex editor commands. You use ex commands from
command mode, by entering ":" followed by the ex command.
Examples:
Global replace of <search_string> with <replacement_string>:
:<line_range>s/<search_string>/<replacement_string>/g
Global deletion of all lines containing <search_string>:
:<line_range>g/<search_string>/d
<line_range>
The range specifies an area of the file on which to perform a specific command.
Following are a few of the common line_range formats:
3,5 (from line 3 to 5)
1,45 (from the beginning to line 45)
45,$ (from line 45 to the end)
% (entire file)
/pattern/ for example: /abc/ (the one line containing the next occurrence of the pattern)
Here are just a few commands that the ex editor provides:
s - substitutes one string for another
g - globally performs a command
d - deletes a line
<search_string>
one or a series of characters that form a pattern, called a regular expression.
Letters (e.g. A-Z or a-z) and numbers, in the search string have to be matched by
the same letter or number in the text. Some special symbols that have special
meaning are:
^ - beginning of the line
$ - end of the line
* - zero or more occurrences of the preceding character
. - any single character
Regular expressions can be used in both a pattern in line_range and as a
search_string.
Example 1: :%s/rate/value/g
Changes all occurrences of rate to value in the file being edited.
% says to apply the command to all lines in the file.
s is the substitute command.
g says to apply the substitution globally to all occurrences on the same line.
Example 2: :%g/^.*0.*/d
Deletes all lines which contain a 0.
% says to apply the command to the entire file.
g says to run the following command on every line that matches the pattern.
^ is the beginning of the line symbol.
. is the any single character symbol.
* means zero or more of the preceding character, so .* matches any run of characters.
^.*0.* matches all lines in the file that contain any number of
characters starting at the beginning of the line, a zero,
and then any number of additional characters.
d says to delete the lines.
Command Summary
starting vi
  vi filename            edit a file named "filename"
  vi newfile             create a new file named "newfile"
entering text
  i                      insert text left of cursor
  a                      append text right of cursor
  I                      insert text at the beginning of the line
  A                      append text at the end of the line
moving the cursor
  h, (left arrow)        left one space
  j, (down arrow), +     down one line
  k, (up arrow), -       up one line
  l, (right arrow)       right one space
  0 (zero)               to beginning of line
  ^                      to first non-blank character
  $                      to end of line
  H                      to top line of screen
  M                      to middle line of screen
  L                      to last line of screen
  w                      forward word by word
  b                      backward word by word
basic editing
  x                      delete character
  nx                     delete n characters
  X                      delete character before cursor
  dw                     delete word
  ndw                    delete n words
  dd                     delete line
  ndd                    delete n lines
  D                      delete characters from cursor to end of line
  r                      replace the character under the cursor
  R                      replace characters until ESC is pressed
  cw                     replace a word
  ncw                    replace n words
  C                      change text from cursor to end of line
  cc                     change the current line
  o                      insert blank line below the line the cursor is on (ready for insertion)
  O                      insert blank line above the line the cursor is on (ready for insertion)
  J                      join succeeding line to current cursor line
  nJ                     join n succeeding lines to current cursor line
  .                      repeat the last command
  u                      undo the last command
  U                      undo the last change on the line
  ns                     replace n characters
moving around in the file
  ^f, Pagedown           scroll forward one screen
  ^b, Pageup             scroll backward one screen
  z+                     scroll forward one screen
  z^                     scroll backward one screen
  ^d                     scroll down one-half screen
  ^u                     scroll up one-half screen
  /string                forward search for string
  ?string                backward search for string
  n                      repeat last search in same direction
  N                      repeat last search in opposite direction
  G                      to last line of file
  1G                     to first line of file
  nG                     to the nth line
start entering an ex command
  :
closing and saving a file
  ZZ                     save file and then quit
  :w                     save file
  :wq                    save file and then quit
  :q!                    discard changes and quit file
  :q                     quit
For more information about the vi editor, see the vi man page
Chapter-3 UNIX file system
A file system is a logical method for organising and storing large amounts of information
in a way that makes it easy to manage. The file is the smallest unit in which information is
stored. The UNIX file system has several important features.
• Different types of file
• Structure of the file system
• Your home directory
• Your current directory
• Pathnames
• Access permissions
Different types of file
To the user, it appears as though there is only one type of file in UNIX - the file which is
used to hold your information. In fact, the UNIX filesystem contains several types of file.
• Ordinary files
• Directories
• Special files
• Pipes
Ordinary files
This type of file is used to store your information, such as some text you have written or
an image you have drawn. This is the type of file that you usually work with.
Files which you create belong to you - you are said to "own" them - and you can set
access permissions to control which other users can have access to them. Any file is
always contained within a directory.
Directories
A directory is a file that holds other files and other directories. You can create directories
in your home directory to hold files and other sub-directories.
Having your own directory structure gives you a definable place to work from and allows
you to structure your information in a way that makes best sense to you.
Directories which you create belong to you - you are said to "own" them - and you can
set access permissions to control which other users can have access to the information
they contain.
Special files
This type of file is used to represent a real physical device such as a printer, tape drive
or terminal.
It may seem unusual to think of a physical device as a file, but it allows you to send the
output of a command to a device in the same way that you send it to a file. For example:
cat scream.au > /dev/audio
This sends the contents of the sound file scream.au to the file /dev/audio which
represents the audio device attached to the system.
The directory /dev contains the special files which are used to represent devices on a
UNIX system.
Pipes
UNIX allows you to link commands together using a pipe.
The pipe acts as a temporary file which only exists to hold data from one command until it
is read by another.
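The same mechanism is available to programs through the pipe system call. The following is a minimal sketch, assuming a message short enough to fit in the pipe's buffer; the helper name pipe_roundtrip is hypothetical. In a real pipeline the two ends belong to different processes, whereas here one process writes and then reads, just to show the data passing through.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: push msg through a pipe and read it back.
 * Returns the number of bytes read into out, or -1 on error. */
ssize_t pipe_roundtrip(const char *msg, char *out, size_t outsize)
{
    int fd[2];                  /* fd[0] is the read end, fd[1] the write end */
    ssize_t n;

    assert(msg != NULL && out != NULL);
    if (pipe(fd) == -1)
        return -1;

    if (write(fd[1], msg, strlen(msg)) == -1) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    close(fd[1]);               /* no writers left, so the reader can see end of data */

    n = read(fd[0], out, outsize);
    close(fd[0]);
    return n;
}
```

Calling pipe_roundtrip("hello", buf, sizeof buf) with a 32-byte buffer should return 5 and leave the five bytes of "hello" in buf (without a terminating zero byte).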
Structure of the file system
The UNIX file system is organised as a hierarchy of directories starting from a single
directory called root which is represented by a / (slash). Imagine it as being similar to
the root system of a plant or as an inverted tree structure.
Immediately below the root directory are several system directories that contain
information required by the operating system. The file holding the UNIX kernel is also
here.
• UNIX system directories
• Home directory
• Pathnames
UNIX system directories
The standard system directories are shown below. Each one contains specific types of
file. The details may vary between different UNIX systems, but these directories should
be common to all.

                             / (root)
                                |
   --------------------------------------------------------------
   |       |       |       |       |       |       |            |
  /bin    /dev    /etc    /home   /lib    /tmp    /usr     kernel file

/bin: This directory contains the commands and utilities that you use day to day. These
are executable binary files - hence the directory name bin.
Often in modern UNIX systems this directory is simply a link to /usr/bin.
/dev: This directory contains special files used to represent real physical devices such
as printers and terminals. One of these files represents a null (non-existent) device.
/etc: This directory contains various commands and files which are used for system
administration. One of these files - motd - contains a 'message of the day' which is
displayed whenever you login to the system.
/home: This directory contains a home directory for each user of the system.
/lib: This directory contains libraries that are used by various programs and languages.
Often in modern UNIX systems this directory is simply a link to /usr/lib.
/tmp: This directory acts as a "scratch" area in which any user can store files on a
temporary basis
/usr: This directory contains system files and directories that you share with other users.
Application programs, on-line manual pages, and language dictionaries typically reside
here.
kernel file: As its name implies, the kernel is at the core of each UNIX system and is
loaded in whenever the system is started up - referred to as a boot of the system.
It manages the entire resources of the system, presenting them to you and every other
user as a coherent system. You do not need to know anything about the kernel in order
to use a UNIX system. This information is provided as background only.
Amongst the functions performed by the kernel are:
• managing the machine's memory and allocating it to each process.
• scheduling the work done by the CPU so that the work of each user is carried out
as efficiently as is possible.
• organising the transfer of data from one part of the machine to another.
• accepting instructions from the shell and carrying them out.
• enforcing the access permissions that are in force on the file system.
Home Directory
Any UNIX system can have many users on it at any one time. As a user you are given a
home directory in which you are placed whenever you log on to the system.
Users' home directories are usually grouped together under a system directory such as
/home. A large UNIX system may have several hundred users, with their home
directories grouped in subdirectories according to some schema such as their
organisational department.
Pathnames
Every file and directory in the file system can be identified by a complete list of the
names of the directories that are on the route from the root directory to that file or
directory. This list is called the absolute pathname.
Each directory name on the route is separated by a / (forward slash). For example:
/usr/local/bin/ue
This gives the full pathname starting at the root directory and going down through the
directories usr, local and bin to the file ue
Relative pathnames
You can define a file or directory by its location in relation to your current directory. The
pathname is given as a / (slash) separated list of the directories on the route to the file (or
directory) from your current directory.
A .. (dot dot) is used to represent the directory immediately above the current directory.
In all shells except the Bourne shell, the ~ (tilde) character can be used as shorthand for
the full pathname to your home directory.
UNIX access permissions
Each UNIX file has a protection mode stored as a short integer (2 bytes).
Symbolically labeled bits are set as follows
Bit Set   Description
b         block special file (d and c bits set)
c         character special file
d         directory
r         read permission granted
w         write permission granted
x         execute permission granted
s         set user id on execution
S         set group id on execution
File System Model
Installations of UNIX have several physical file systems on
• different discs
• different partitions of same disc
Physical file systems never span disc partitions.
File systems are a sequence of fixed sized blocks of bytes, of either 512, 1024, 2048, ... bytes.
Pointers link blocks together into ordered chains.
Block size is a trade off between performance and storage efficiency:
Block Size   Advantage
larger       higher transfer rate between disc and RAM
smaller      higher effective storage capacity
Components of file system
Component     Description
boot block    start of file system, typically the first sector; initialisation code to boot UNIX, possibly empty
super block   state of file system - size, file capacity, free space
inodes        kernel indexes into inode list (includes root inode)
data block    file and administrative data (no shared blocks)
The word inode is short for index node.
Files may have holes in them, which are
• created by moving pointer past file end and writing data
• interpreted as zero valued bytes
Boot Block
The boot block is the beginning of a file system, typically the first sector, and may
contain the bootstrap code that is read into the machine to boot, or initialize, the
operating system. Although only one boot block is needed to boot the system,
every file system has a (possibly empty) boot block.
Super Block
The super block describes the state of a file system. It consists of the following fields:
1. Size of the file system
2. No. of free blocks in the file system
3. A list of free blocks available on the file system
4. Index of the next free block in the free block list
5. Size of the I-node list
6. No. of free I-nodes in the file system
7. Index of the next free I-node in the file system
Inode
Each I-node consists of the following information
1. File ownership
2. File type
3. File access permissions
4. Time of last I-node (status) change
5. Modification time
6. Time of last access
7. Number of links to the file, i.e. the number of names the file has
8. File size
9. Array of 13 pointers to the file's disk blocks
Process file descriptors and disc blocks are linked as above.
A file has one inode but may have several names, or links, each of which is either
• Hard
• Soft
Soft or symbolic link to file is
• implemented as file containing absolute or relative pathname
• interpreted at access time and need not succeed in referring
• interpreted relative to link name's directory if relative name
open() system call applied to link follows the link to its target.
stat() system call applied to link also follows the link and reports the target's status;
lstat() reports the status of the link file itself.
Symbolic links unlike hard links can refer to
• files on other file systems
• directories
• other symbolic links, even in a loop
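The difference between examining a link and examining its target can be sketched with lstat and stat. Both helper names below are hypothetical; the sketch assumes a system that provides the lstat and symlink calls.

```c
#include <assert.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: 1 if path itself is a symbolic link,
 * 0 if it is not, -1 if it cannot be examined. */
int is_symlink(const char *path)
{
    struct stat st;

    assert(path != NULL);
    if (lstat(path, &st) == -1)     /* lstat does not follow the link */
        return -1;
    return S_ISLNK(st.st_mode) ? 1 : 0;
}

/* Hypothetical helper: 1 if path leads to an existing target,
 * 0 if it does not (for example a dangling link). */
int link_resolves(const char *path)
{
    struct stat st;

    assert(path != NULL);
    return stat(path, &st) == 0;    /* stat follows the link to its target */
}
```

A link created with symlink("/tmp/no_such_target", "/tmp/demo_link") is still reported as a link by is_symlink, but link_resolves returns 0 as long as the target does not exist, which is the "need not succeed in referring" case.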
File Updates
Updates of files are converted to updates of disc sectors.
Modification of data in a logical block is done by
• allocating a system buffer
• determining the location of the physical block on disc
• reading the physical block into the system buffer
• altering part of the buffer contents with the user buffer contents
• writing the block back to disc
Data blocks: Pure data is stored in the data blocks, which start at the point where the I-node list ends. An
allocated data block can belong to one and only one file in the file system.
Block addressing scheme
There are 13 entries in the I-node table, containing the addresses of up to 13 disk blocks.
The first 10 entries are simple: they contain the disk addresses of the first 10 blocks
of the file. However, reserving space for 10 addresses in the I-node table doesn't mean
that 10 disk blocks are automatically allocated. If a file is only 3 blocks long, the first 3
entries in the table contain the disk block numbers and the remaining entries are filled
with zeros. When the file grows beyond 10 blocks, the eleventh entry is used. It holds the
address of a disk block which in turn contains the addresses of the next 341 data blocks
(assuming a block size of 1024 bytes and 3-byte addresses: 1024/3 = 341). This block is
called the Single indirect block. With these eleven entries, the file can grow to
10k + 341k. When the file grows beyond this size, the twelfth entry, known as
the Double indirect block, is used. It holds the address of a block which contains the
addresses of 341 single indirect blocks. This enables us to reference 10k +
341k + (341 * 341k) of data. Finally, if the file size exceeds this, which is very
unlikely, the thirteenth entry, known as the Triple indirect block, raises the
maximum possible file size to 10k + 341k + (341 * 341k) + (341 * 341 * 341k). The
organization of the data blocks used by the file is depicted below.
[Figure: I-node with ten direct entries (Direct 0 to Direct 9) and single indirect, double indirect and triple indirect entries, pointing to data blocks.]
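The arithmetic of the block addressing scheme can be checked with a few lines of C. This is a sketch under the same assumptions as the text (10 direct entries, 341 addresses per indirect block); the function name max_file_blocks is hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: the largest number of data blocks one I-node can
 * address, given how many block addresses fit in one indirect block. */
unsigned long max_file_blocks(unsigned long addrs_per_block)
{
    unsigned long n = addrs_per_block;

    assert(n > 0);
    return 10              /* direct entries 0 to 9                  */
         + n               /* reached through the single indirect    */
         + n * n           /* reached through the double indirect    */
         + n * n * n;      /* reached through the triple indirect    */
}
```

With 1024-byte blocks and 3-byte addresses, max_file_blocks(1024 / 3) gives 10 + 341 + 116281 + 39651821 = 39768453 blocks of 1k each, a little under 38 gigabytes.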
File Related System Calls
open
A file is opened by
#include<fcntl.h>
int open ( char * pathname, int oflag, [int mode ]);
Returns a file descriptor if successful, -1 on error.
The pathname is the name of the file to be opened. The oflag argument is formed by
OR-ing together the symbolic constants below:
O_RDONLY    Open for reading only
O_WRONLY    Open for writing only
O_RDWR      Open for reading and writing
O_NDELAY    Do not block on open or read or write
O_APPEND    Append to the end of file on each write
O_CREAT     Create the file if it doesn't exist
O_TRUNC     If the file exists, truncate its length to zero
O_EXCL      Error if O_CREAT and the file already exists
The third argument is used only if a new file is being created.
creat
A new file can be created by
#include<fcntl.h>
int creat ( char * pathname, int mode );
Returns a file descriptor if successful, -1 on error.
This call is equivalent to opening a file with O_CREAT|O_WRONLY|O_TRUNC
mode using the open system call.
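As an example of combining flags, the sketch below creates a file only if it does not already exist. The helper name create_exclusive and the permission value 0644 are assumptions chosen for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: create path only if it does not exist yet.
 * Returns 1 if the file was created, 0 if it already existed,
 * -1 on any other error. */
int create_exclusive(const char *path)
{
    int fd;

    assert(path != NULL);
    /* O_EXCL makes open fail with EEXIST instead of reusing an existing
     * file. The third argument is consulted only because O_CREAT is set. */
    fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd == -1)
        return (errno == EEXIST) ? 0 : -1;
    close(fd);
    return 1;
}
```

Calling create_exclusive twice on the same new pathname should return 1 the first time and 0 the second time.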
close
An open file is closed by
#include<unistd.h>
int close ( int filedes);
Returns 0 if successful, -1 on error.
When a process terminates, the kernel closes all the files automatically.
read
Data is read from an open file using
#include<unistd.h>
int read ( int filedes, char * buf, int nbytes);
Returns the no. of bytes read if successful (this can be less than the nbytes that was requested), 0 if there are no bytes to be read, or -1 on error.
There are several cases in which the no. of bytes actually read is less than the
amount requested:
• When reading from a regular file, if the end of file is reached before the requested
no. of bytes has been read.
• When reading from a terminal device, normally up to one line is read at a time.
The read operation starts at the file's offset. Before a successful return, the offset
is incremented by the number of bytes actually read.
write
Data is written into an open file using
#include<unistd.h>
int write ( int filedes, char * buf, int nbytes);
Returns the no. of bytes written if successful (this can be less than the nbytes that was requested), 0 if there is no space to write, or -1 on error.
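Because both read and write may transfer fewer bytes than requested, programs normally call them in a loop. The following file copy is a minimal sketch; the helper name copy_file, the 4096-byte buffer and the 0644 permissions are assumptions.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: copy src to dst.
 * Returns the number of bytes copied, or -1 on error. */
long copy_file(const char *src, const char *dst)
{
    char buf[4096];
    long total = 0;
    ssize_t n;
    int in, out;

    assert(src != NULL && dst != NULL);
    if ((in = open(src, O_RDONLY)) == -1)
        return -1;
    if ((out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644)) == -1) {
        close(in);
        return -1;
    }

    /* read returns 0 at end of file and may return less than asked for. */
    while ((n = read(in, buf, sizeof buf)) > 0) {
        ssize_t done = 0;
        while (done < n) {          /* write may also be partial */
            ssize_t w = write(out, buf + done, (size_t)(n - done));
            if (w <= 0) {
                close(in);
                close(out);
                return -1;
            }
            done += w;
        }
        total += n;
    }
    close(in);
    close(out);
    return (n == -1) ? -1 : total;
}
```

The return value of the final read tells the two ways of leaving the loop apart: 0 means end of file, -1 means an error.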
lseek
Every open file has a current byte position associated with it. This is measured
as the number of bytes from the start of the file. The creat system call sets the
file's position to the beginning of the file, as does the open system call, unless
O_APPEND is set. The read and write system calls update the file's
position by the number of bytes read or written. Before read or write, an open
file can be positioned using
#include<unistd.h>
long lseek ( int filedes, long offset, int whence);
Returns the new long integer byte offset of the file or -1 on error.
The offset and whence arguments are interpreted as follows:
If whence is 0 (SEEK_SET), the file's position is set to offset bytes from the beginning of the file.
If whence is 1 (SEEK_CUR), the file's position is set to its current position plus the offset. The offset can be positive or negative.
If whence is 2 (SEEK_END), the file's position is set to the size of the file plus offset. The offset can be positive or negative.
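Two common uses of lseek are finding the size of a file and creating a hole in it. The sketch below does both; the helper name make_hole and the file permissions are assumptions.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: create path, skip gap bytes, then write one byte.
 * Returns the resulting file size, or -1 on error. */
long make_hole(const char *path, long gap)
{
    int fd;
    long size;

    assert(path != NULL && gap >= 0);
    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd == -1)
        return -1;

    /* SEEK_SET (whence 0): the position is counted from the start of the file. */
    if (lseek(fd, (off_t)gap, SEEK_SET) == -1 || write(fd, "x", 1) != 1) {
        close(fd);
        return -1;
    }

    /* SEEK_END (whence 2) with offset 0 moves to the end and so yields the size. */
    size = (long)lseek(fd, 0, SEEK_END);
    close(fd);
    return size;
}
```

make_hole(path, 1000) should report a size of 1001 bytes, and the first 1000 bytes, which were never written, read back as zero valued bytes.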
link
The link system call adds a new link to a directory. Every time a new file is created,
you are putting a pointer into a directory. This pointer associates a filename with a
place on the disk. The link call creates an additional pointer to an existing file. It
does not make another copy of the file. Because there is only one file, the file
status information is the same.
#include<unistd.h>
int link ( char * old path, char* new path);
Returns 0 on success or -1 on error.
The first parameter, old path, must be an existing link. The second parameter,
new path, indicates the name of the new link.
unlink
The unlink system call removes a specified link from the directory, reducing the
link count in the I-node by one. If the resulting link count is zero, the file system
will discard the file. All disk space that is used will be made available for reuse.
The I-node will become available for reuse too.
#include<unistd.h>
int unlink ( char * path );
Returns 0 on success or -1 on error.
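The effect of link and unlink on the link count can be observed with stat. The following is a sketch only; both function names are hypothetical, and it assumes both pathnames are on the same file system, since hard links cannot cross file systems.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: the link count of path, or -1 on error. */
long link_count(const char *path)
{
    struct stat st;

    assert(path != NULL);
    if (stat(path, &st) == -1)
        return -1;
    return (long)st.st_nlink;
}

/* Hypothetical demo: records the link count after each step.
 * Returns 0 on success, -1 on error. */
int link_demo(const char *oldpath, const char *newpath, long counts[3])
{
    int fd = open(oldpath, O_WRONLY | O_CREAT | O_TRUNC, 0644);

    if (fd == -1)
        return -1;
    close(fd);
    counts[0] = link_count(oldpath);    /* a new file has one name      */

    if (link(oldpath, newpath) == -1)
        return -1;
    counts[1] = link_count(oldpath);    /* two names, still one I-node  */

    if (unlink(oldpath) == -1)
        return -1;
    counts[2] = link_count(newpath);    /* one name left, file survives */

    return unlink(newpath);             /* count reaches zero, file is discarded */
}
```

After a successful run the three recorded counts should be 1, 2 and 1.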
chmod
This system call allows us to change the access permissions for an existing file.
#include<sys/stat.h>
int chmod ( char * path, int mode );
Returns 0 on success or -1 on error.
chown
This system call allows us to change the ownership for an existing file, i.e., it allows us to change the User Id and Group Id of the file.
#include<unistd.h>
int chown ( char * path, int owner, int group);
Returns 0 on success or -1 on error.
fcntl
The fcntl system call is used to change the properties of a file that is already open.
#include<fcntl.h>
int fcntl ( int filedes, int cmd, int arg );
The return value depends on cmd, as described below; -1 on error.
The cmd argument must be one of the following:
F_DUPFD   Duplicate the file descriptor filedes. The value of arg specifies the lowest number that the new file descriptor may assume. The return value is the new file descriptor.
F_SETFD   Set the close-on-exec flag for the file to the low-order bit of arg. If the low-order bit of arg is set, the file is closed on an exec system call. Otherwise the file remains open across an exec.
F_GETFD   Return the close-on-exec flag for the file as the value of the system call.
F_SETFL   Set the status flags for this file to the value of arg. The only flags that can be changed are O_APPEND and O_NDELAY.
F_GETFL   Return the status flags for this file as the value of the system call.
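A typical use of F_GETFL and F_SETFL is to switch on O_APPEND for a descriptor that is already open. This is a minimal sketch; the helper name set_append is hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: turn on O_APPEND for an open descriptor.
 * Returns 0 on success, -1 on error. */
int set_append(int fd)
{
    int flags;

    assert(fd >= 0);
    flags = fcntl(fd, F_GETFL, 0);      /* read the current status flags */
    if (flags == -1)
        return -1;

    /* Write the flags back with O_APPEND added; the others are kept. */
    return (fcntl(fd, F_SETFL, flags | O_APPEND) == -1) ? -1 : 0;
}
```

Reading the flags first matters: passing only O_APPEND to F_SETFL would clear any other changeable flag that happened to be set.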
stat and fstat
The stat and fstat system calls return the attributes of a specified file to the caller.
#include<sys/types.h>
#include<sys/stat.h>
int stat ( char * pathname, struct stat * buf);
int fstat ( int filedes, struct stat * buf );
Returns 0 on success or -1 on error.
stat and fstat are used to get status information from an I-node. stat takes a
path and finds the I-node by following it. fstat takes an open file descriptor and
finds the I-node from the active I-node table. The information is returned in the
user-supplied stat structure, which is defined in /usr/include/sys/stat.h.
The stat structure used in these system calls is given below:
struct stat {
    ushort st_mode;   /* file type and access permissions */
    ino_t  st_ino;    /* I-node number */
    dev_t  st_dev;    /* Id of the device containing a directory entry for this file */
    short  st_nlink;  /* number of links */
    ushort st_uid;    /* User Id */
    ushort st_gid;    /* Group Id */
    dev_t  st_rdev;   /* Id for the device, for char special or block special files */
    off_t  st_size;   /* file size in bytes */
    time_t st_atime;  /* time of last file access */
    time_t st_mtime;  /* time of last file modification */
    time_t st_ctime;  /* time of last file status change */
};
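The sketch below reads the size of a regular file once by pathname with stat and once through an open descriptor with fstat. Both helper names are hypothetical; modern systems declare the structure fields with types such as mode_t and nlink_t, but the field names used here are the same.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: size of a regular file found by pathname,
 * or -1 if it cannot be examined or is not a regular file. */
long size_by_path(const char *path)
{
    struct stat st;

    assert(path != NULL);
    if (stat(path, &st) == -1)
        return -1;
    if (!S_ISREG(st.st_mode))       /* st_mode holds the file type */
        return -1;
    return (long)st.st_size;
}

/* Hypothetical helper: the same information through an open descriptor. */
long size_by_descriptor(const char *path)
{
    struct stat st;
    int fd, rc;

    assert(path != NULL);
    if ((fd = open(path, O_RDONLY)) == -1)
        return -1;
    rc = fstat(fd, &st);
    close(fd);
    if (rc == -1 || !S_ISREG(st.st_mode))
        return -1;
    return (long)st.st_size;
}
```

Both functions consult the same I-node, so for the same file they should always agree.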
Chapter-3 Process management
UNIX Process
A process is an instance of running a program. If, for example, three people are
running the same program simultaneously, there are three processes there, not just
one. In fact, we might have more than one process running even with only one person
executing the program, because (you will see later) the program can ``split into two,''
making two processes out of one.
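The "split into two" is done with the fork system call. The following is a minimal sketch in which the new process simply exits with a given code and the original process collects it; the helper name run_child is hypothetical.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: create a child process that exits with code and
 * return the exit status seen by the parent, or -1 on error. */
int run_child(int code)
{
    int status;
    pid_t pid;

    assert(code >= 0 && code <= 255);
    pid = fork();
    if (pid == -1)
        return -1;              /* fork failed: there is still only one process */
    if (pid == 0)
        _exit(code);            /* child: a second process running the same program */

    /* parent: wait for the child so that it does not remain a zombie */
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

After fork returns there are two processes executing the same code; they are told apart only by the return value, which is 0 in the child and the child's process id in the parent.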
UNIX Process State Transition Diagram
Process states & transitions
The lifetime of a process can be conceptually divided into a set of states.
1. The process is executing in user mode.
2. The process is executing in kernel mode.
3. The process is not executing, but is ready to run as soon as the kernel schedules it.
4. The process is sleeping and resides in main memory.
5. The process is ready to run, but the swapper process has swapped it out of memory.
6. The process is sleeping, and the swapper has swapped out the process.
7. The process is returning from kernel mode to user mode, but the kernel preempts it.
8. The process is newly created and is in a transition state, i.e. the process is neither
sleeping nor ready to run. This is the start state of a process.
9. The process has executed the exit system call and is in the zombie state.
Process Data structures
[Figure: process data structures, showing the process table, the u-area, the per process region table and main memory.]
Process table entry / u-area
Fields of a process table entry:
• pointer to u area
• state of process
• size of process
• UIDs
• PIDs
• event descriptor
• scheduling params
• enum of signals
• times used usr/sys
• alarm
Fields of the u-area:
• pointer to process table entry
• real UID
• effective UID
• react to signals array
• login terminal
• errors of sys call
• return val of sys call
• I/O parameters
• times used usr/sys
• current directory
• current root
• user file descriptor
• file size limit
• process size limit
• pointer to dynamic stack
Layout of system memory
• A process on a UNIX system consists of three logical sections: text, data and stack.
• The text section contains the machine-executable instructions of a process; addresses in the
text section include text, data and stack addresses.
• The compiler generates addresses for a virtual address space with a given address range,
and the machine's memory management unit translates the virtual addresses generated by
the compiler into address locations in physical memory.
• The subsystems of the kernel and the hardware that cooperate to translate virtual to
physical addresses comprise the memory management subsystem.
Regions
• The kernel divides the virtual address space of a process into logical regions.
• The concept of the regions is independent of the memory management policies
implemented by the operating system.
• A region is a contiguous area of the virtual address space of a process that can be
treated as a distinct object to be shared or protected.
• The region table contains the information to determine where its contents are located in
physical memory.
• Each process has a per process region table (pregion). Each pregion entry has 3 fields:
1. a pointer to a region table entry,
2. the start virtual address of the region, and
3. a permission field that indicates the type of access allowed to a process.
Processes & regions
[Figure: per process region table (virtual addresses) entries for text, data and stack, pointing to regions labelled a to e.]
Memory triplets
Memory is organised in pages of 1K bytes, accessed via page tables. The system
contains a set of MMU register triples, each holding:
• the address of the page table in physical memory
• the first virtual address mapped
• control information such as the no. of pages, page access permissions etc.
Context of a process
The context of a process
• consists of the contents of its (user) address space, the contents of hardware registers and
kernel data structures
• is the union of its user-level context, register context and system-level context
• has a system-level context that consists of a static and a dynamic portion
[Figure: Components of the Context of a Process. User-level context: process text, data,
stack and shared data. Static part of the system-level context: process table entry, u area
and per process region table. Dynamic part of the system-level context: kernel context
layer 0 (user level) followed by layers 1 to 3, where each layer holds the kernel stack for
that layer and the saved register context of the layer below it; a logical pointer
identifies the current context layer.]
How the Kernel Manages Processes in Unix
Address Space: For each new process created, the kernel sets up an address space
in memory. This address space consists of the following logical segments:
• text - contains the program's instructions.
• data - contains initialized program variables.
• bss - contains uninitialized program variables.
• stack - a dynamically growable segment, it contains variables allocated locally
and parameters passed to functions in the program.
Each process has two stacks: a user stack and a kernel stack. These stacks are used
when the process executes in the user or kernel mode (described below).
Mode Switching: At least two different modes of operation are used by the Unix kernel
- a more privileged kernel mode, and a less privileged user mode. This is done to
protect some parts of the address space from user mode access.
User Mode: Processes, created directly by the users, whose instructions are currently
executing in the CPU are considered to be operating in the user-mode. Processes
running in the user mode do not have access to code and data for other users or to
other areas of address space protected by the kernel from user mode access.
Kernel Mode: Processes carrying out kernel instructions are said to be running in the
kernel-mode. A user process can be in the kernel-mode while making a system call,
while generating an exception/fault, or in case of an interrupt. Essentially, a mode
switch occurs and control is transferred to the kernel when a user program makes a
system call. The kernel then executes the instructions on the user's behalf.
While in the kernel-mode, a process has full privileges and may access the code and
data of any process (in other words, the kernel can see the entire address space of any
process).
The Context of a Process and Context Switching: The context of a process is
essentially a snapshot of its current runtime environment, including its address space,
stack space, etc. At any given time, a process can be in user-mode, kernel-mode,
sleeping, waiting on I/O, and so on. The process scheduling subsystem within the
kernel uses a time slice of typically 20ms to rotate among currently running processes.
Each process is given its share of the CPU for 20ms, then left to sleep until its turn
again at the CPU. This process of moving processes in and out of the CPU is called
context switching. The kernel makes the operating system appear to be multi-tasking
(i.e. running processes concurrently) via the use of efficient context-switching.
At each context switch, the context of the process to be swapped out of the CPU is
saved to RAM. It is restored when the process is scheduled its share of the CPU again.
All this happens very fast, in microseconds.
To be more precise, context switching may occur for a user process when
• a system call is made, thus causing a switch to the kernel-mode,
• a hardware interrupt, bus error, segmentation fault, floating point exception, etc.
occurs,
• a process voluntarily goes to sleep waiting for a resource or for some other
reason, and
• the kernel preempts the currently running process (i.e. a normal process
scheduler event).
Context switching for a user process may occur also between threads of the same
process.
Extensive context switching is an indication of a CPU bottleneck.
Context switch can occur under the following situations:
• When the Process puts itself to sleep
• When a process exits
• When it returns from a system call to user mode, but is not the most eligible process
to run
• When it returns to user mode after the kernel completes handling an interrupt, but is
not the most eligible process to run.
Talking to Running Processes
Unix provides a way for a user to communicate with a running process. This is
accomplished via signals, a facility which enables a running process to be notified
about the occurrence of a) an error event generated by the executing process, or b) an
asynchronous event generated by a process outside the executing process.
Signals are sent to the process ultimately by the kernel. The receiving process has to be
programmed such that it can catch a signal and take a certain action depending on
which signal was sent.
Here is a list of common signals and their numerical values:
SIGHUP 1 Hangup
SIGINT 2 Interrupt
SIGKILL 9 Kill (cannot be caught or ignored)
SIGTERM 15 Terminate (the default signal sent by the kill command; it can be caught)
(Many more signals exist; these are the most commonly used ones.)
You send a running process a signal using the Unix kill command. The basic usage is
kill -<VALUE> <PID_OF_THE_PROCESS>
Processes vs. Jobs
During the Unix shell discussion, we spoke of job control in csh and other, newer shells.
Job control is basically an explicit exercise in using signals. When you fire up a long
running command or program from your shell prompt, you are starting a process. If you
hit CTRL-Z, assuming it's bound to your tty's suspend function (i.e. "stty susp ^Z"), you
are sending the process a terminal stop signal (SIGTSTP) and asking it to be stopped. It
may be brought back to a running state with a shell command like fg, which sends a
continue signal (SIGCONT) to the sleeping (i.e. waiting) process.
Process Related System Calls
fork
The only way a new process is created by the kernel is when an existing process calls the
fork system call. The new process created by fork is called the child process. This system
call is called once but returns twice. The only difference in the returns is that the
return value in the child is 0, while the return value in the parent is the process ID of
the new child. The child's process ID is returned to the parent because a process can have
more than one child, and there is no function that allows a process to obtain the process
IDs of its children. fork returns zero to the child because a process has only a single
parent, so the child can always call getppid to obtain the process ID of its parent.
Both the child and the parent continue executing with the instruction that follows the
call to fork. The child is a copy of the parent. For example, the child gets a copy of the
parent's data space, heap and stack. Note that this is a copy for the child; the
parent and the child do not share these portions of memory.
Many current implementations do not perform a complete copy of the parent's
data, heap and stack, since a fork is often followed by an exec. Instead they
use copy-on-write. These regions are shared by the parent and the child and
have their protection changed by the kernel to read-only. If either process tries to
modify one of these regions, the kernel makes a copy of that piece of memory only,
typically a “page” in a virtual memory system.
An important feature of the fork operation is that the child process shares the files
that were open in the parent process before the fork. This feature provides an
easy way for the parent process to open specific files or devices and pass those
open files to the child process. After the fork, the parent closes the files that it
opened for the child, so that the processes are not sharing the same file.
int fork ( );
Returns 0 to the child and the process ID of the child to the parent, or -1 on error.
The values of the following in child process are copied from the parent process
• The real user ID
• Real group ID
• Effective user ID
• Effective group ID
• Process group ID
• Terminal group ID
• Root directory
• Current working directory
• Signal handling settings
• File mode creation mask
The child process differs from the parent process in the following ways:
• The child process has a new, unique process ID
• The child process has a different parent process ID
• The return value from fork is different
• It has its own copies of the parent's file descriptors
• The time left until an alarm clock signal is set to zero in the child
• File locks set by the parent are not inherited by the child
Wait and waitpid
The process can wait for one of the child processes to finish by executing the wait system call.
int wait ( int * status );
int waitpid ( int pid, int * status, int options );
Returns process ID of the child that terminated if successful or -1 on error.
If the process that calls wait does not have any child processes, wait returns a value
of -1 immediately. If the process that calls wait has one or more child processes that
have not yet terminated, then the calling process is suspended by the kernel until one of
its child processes terminates. When a child process terminates and wait returns, if the
status argument is not NULL, the value passed to exit by the terminating child process is
stored in the status variable. Some additional information is also returned by wait.
There are three conditions for which wait returns a PID as its return value.
1. A child process called exit
2. A child process was terminated by a signal
3. A child process was being traced and the process stopped. This occurs when a
process is tracing the execution of another process, such as when a debugger is
being used to step through a process.
What happens to the parent process ID of a child process when the parent process
terminates before the child process?
There are the following possible scenarios to consider
1. The child process terminates before the parent process:
This is the “normal” condition when we are entering commands to an
interactive shell
a. If the parent process has already executed a wait, then the wait returns to
the parent process with the process ID of the child that terminated.
b. If the parent process has not executed a wait, then the child process
becomes a “Zombie” process. (If the parent process of the exiting
process is not executing a wait, the terminating process is marked as a
“Zombie” process.)
2. The parent process terminates before the child process:
The child process becomes an “Orphan” process. For child processes that are about
to be orphaned, UNIX sets their parent process ID to 1, the PID of the init process.
The differences between these two system calls are:
• wait blocks the caller until a child process terminates, while waitpid has
an option that prevents it from blocking.
• waitpid does not have to wait for the first child to terminate. It has a number of
options that control which process it waits for.
For both system calls, the status parameter is a pointer to an integer.
If this argument is not NULL, the termination or exit status of the terminated
process is stored in the location pointed to by the argument.
The interpretation of the pid argument for waitpid depends on its value:
pid == -1 waits for any child process (equivalent to wait )
pid > 0 waits for the child whose process ID equals pid
pid == 0 waits for any child whose process group ID equals that of the calling process
pid < -1 waits for any child whose process group ID equals the absolute value of pid
waitpid returns the process ID of the child that terminated, and its termination
status is returned through status. With wait, -1 is returned if the calling
process has no children. With waitpid, however, it is also possible to get an error if
the specified process or process group does not exist or is not a child of the
calling process.
The options argument lets us further control the operation of waitpid. This
argument is either 0 or is constructed from the bitwise OR of the following
constants:
WNOHANG waitpid will not block if a child specified by pid is not
immediately available. In this case the return value is 0.
WUNTRACED if the implementation supports job control, the status of any
child specified by pid that has stopped, and whose status
has not been reported since it has stopped, is returned.
Hence waitpid allows us to wait for a particular process, provides a nonblocking
version of wait and supports job control (with the WUNTRACED option).
exec
The only way to execute a program in UNIX is for an existing process to issue the exec
system call.
The exec system call replaces the current process with the new program; that is, it
reinitializes a process from a designated program. We refer to a process that issues an
exec system call as the calling process and the program that is execed as the new program.
The process ID does not change across an exec.
int execlp( char * filename, char* arg0, char* arg1, …, char* argn, (char*) 0);
int execl( char * pathname, char* arg0, char* arg1, …, char* argn, (char*) 0);
int execle( char* pathname, char* arg0, char* arg1, …, char* argn, (char*) 0, char** envp);
int execvp( char * filename, char** argv);
int execv( char * pathname, char** argv);
int execve( char * pathname, char** argv, char** envp);
Returns to the caller only if an error occurs. Otherwise the control is passed to the
start of the new program.
The relationship between these six functions is shown in the figure below.
The program invoked by the exec system call inherits the following attributes from the process that calls exec.
• PID
• PPID
• GPID
• TPID
• Time left until an alarm clock signal
• Root directory
• Current working directory
• File mode creation mask
• Real user ID
• Real Group ID
• File locks
The two attributes that can change when a new program is execed are:
• Effective user ID, Effective Group ID
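[Figure: Relationship of the six exec functions. execlp (file, arg, ---, 0), execl (path,
arg, ---, 0) and execle (path, arg, ---, 0, envp) take their arguments as a list; execvp
(file, argv), execv (path, argv) and execve (path, argv, envp) take an argv array. The
steps labelled in the figure are “convert file to path” and “add envp”; execve is marked
as the system call.]
getpid, getppid, getuid, geteuid, getgid, getegid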
int getpid ( void ); Returns process ID of calling process
int getppid ( void ); Returns parent process ID of calling process
int getuid ( void ); Returns real user ID of calling process
int geteuid ( void ); Returns effective user ID of calling process
int getgid ( void ); Returns real group ID of calling process
int getegid ( void ); Returns effective group ID of calling process
exit
A process terminates by calling the exit system call. This system call never returns to
the caller. When exit is called, an integer exit status is passed by the process to the
kernel. This exit status is then available to the parent process of the exiting process
through the wait system call. Only the low order 8 bits of the exit status should be used,
allowing a process to terminate with an exit status in the range 0 through 255. By
convention, a process that terminates normally returns an exit status of zero, while
nonzero values are used to indicate an error condition.
void exit (int status);
No Return value.
Chapter-4 Memory Management
Memory Management under Unix
One of the numerous tasks the Unix kernel performs while the machine is up is to
manage memory. In this section, we explore relevant terms (such as physical vs. virtual
memory) as well as some of the basic concepts behind memory management.
• Physical vs. virtual memory
• What is a page of memory?
• Cache memory
• How the kernel organizes memory:
Dividing the RAM
System and user areas
• Paging vs. swapping
Physical vs. Virtual Memory
Unix, like other advanced operating systems, allows you to use all of the physical
memory installed in your system as well as area(s) of the disk (called swap space)
which have been designated for use by the kernel in case the physical memory is
insufficient for the tasks at hand. Virtual memory is simply the sum of the physical
memory (RAM) and the total swap space assigned by the system administrator at the
system installation time. Mathematically,
Virtual Memory (VM) = Physical RAM + Swap space
Dividing Memory into Pages
The Unix kernel divides the memory into manageable chunks called pages. A
single page of memory is usually 4096 or 8192 bytes (4 or 8KB). Memory pages
are laid down contiguously across the physical and virtual memory.
Cache Memory
With increasing clock speeds for modern CPUs, the disparity between the CPU speed
and the access speed for RAM has grown substantially. Consider the following:
Typical CPU speed today: 250-500MHz (which translates into 4-2ns clock tick)
Typical memory access speed (for regular DRAM): 60ns
Typical disk access speed: 13ms
In other words, to get a piece of information from RAM, the CPU has to wait for 15-30
clock cycles, a considerable waste of time.
Fortunately, cache RAM has come to the rescue. The RAM cache is simply a small
amount of very fast (and thus expensive) memory that is placed between the CPU and
the (slower) RAM. When the CPU reads a location in RAM, the cache hardware also
fetches the neighbouring locations (a cache line) and stores them in the cache. Since
programs typically access memory sequentially, the data needed next by the CPU can often
be supplied very rapidly from the cache. Updates of the cache are performed using an
efficient algorithm, which can enable cache hit rates of nearly 100% (with a 100% hit
ratio being the ideal case).
CPUs today typically have hierarchical caches. The on-chip cache (usually called the L1
cache) is small but fast (being on-chip). The secondary cache (usually called the L2
cache) is often not on-chip (thus a bit slower) and can be quite large, sometimes as big
as 16MB for high-end CPUs (obviously, you have to pay a hefty premium for a cache
that size)
How the Kernel Organizes Memory
Dividing the RAM
When the kernel is first loaded into memory (at boot time), it sets aside a certain amount
of RAM for itself as well as for all system and user processes. The main categories into
which RAM is divided are:
• Text: to hold the text segments of running processes.
• Data: to hold the data segments of running processes.
• Stack: to hold the stack segments of running processes.
• Shared Memory: This is an area of memory which is available to running
programs if they need it. Consider a common use of shared memory: let us assume
you have a program which has been compiled using a shared library (libraries that
look like libxxx.so; the C library is a good example - all programs need it). Assume
that five of these programs are running simultaneously. At run-time, the code they
seek is made resident in the shared memory area. This way, only a single copy of the
library needs to be in memory, resulting in increased efficiency and major cost
savings.
• Buffer Cache: All reads and writes to the filesystem are cached here first. You
may have experienced situations where a program that is writing to a file doesn't
seem to work (nothing is written to the file). You wait a while, then a sync occurs,
and the buffer cache is dumped to disk and you see the file size increase.
The System and User Areas
When the kernel loads, it uses RAM to keep itself memory resident. Consequently, it
has to ensure that user programs do not overwrite/corrupt the kernel data structures (or
overwrite/corrupt other users' data structures). It does so by designating part of RAM as
kernel or system pages (which hold kernel text and data segments) and user pages
(which hold user stacks, data, and text segments). Strong memory protection is
implemented in the kernel memory management code to keep the users from corrupting
the system area. For example, only the kernel is allowed to switch from the user to the
system area. During the normal execution of a Unix process, both system and user
areas are used.
A common signal delivered when memory protection is violated is SIGSEGV (you see a
"Segmentation violation" message on the screen when this happens; the culprit
process is killed and its in-memory portions are dumped to a disk file called "core").
Paging vs. Swapping
Paging: When a process starts in Unix, not all its memory pages are read in from the
disk at once. Instead, the kernel loads into RAM only a few pages at a time. After the
CPU digests these, the next page is requested. If it is not found in RAM, a page fault
occurs, signaling the kernel to load the next few pages from disk into RAM. This is
called demand paging and is a perfectly normal system activity in Unix. (Just so you
know, it is possible for you, as a programmer, to read in entire processes if there is
enough memory available to do so.)
The Unix SVR4 daemon which performs the paging out operation is called pageout. It is
a long running daemon and is created at boot time. The pageout process cannot be
killed. There are three kernel variables which control the paging operation (Unix SVR4):
• minfree - the absolute minimum of free RAM needed. If free memory falls below
this limit, the memory management system does its best to get back above it. It
does so by page stealing from other, running processes, if practical.
• desfree - the amount of RAM the kernel wants to have free at all times. If free
memory is less than desfree, the pageout syscall is called every clock cycle.
• lotsfree - the amount of memory necessary before the kernel stops calling
pageout. Between desfree and lotsfree, pageout is called 4 times a second.
Swapping: Let's say you start ten heavyweight processes (for example, five xterms, a
couple netscapes, a sendmail, and a couple pines) on an old 486 box running Linux
with 16MB of RAM. Basically, you do not have enough physical RAM to accommodate
the text, data, and stack segments of all these processes at once. Since the kernel
cannot find enough RAM to fit things in, it makes use of the available virtual memory by
a process known as swapping. It selects the least busy process and moves it in its
entirety (meaning the program's in-RAM text, stack, and data segments) to disk. As
more RAM becomes available, it swaps the process back in from disk into RAM. While
this use of the virtual memory system makes it possible for you to continue to use the
machine, it comes at a very heavy price. Remember, disks are far slower than CPUs (by a
factor of about a million), and you can feel this disparity rather severely when the
machine is swapping. Swapping is not considered a normal system activity. It is
basically a sign that you need to buy more RAM.
In Unix SVR4, the process handling swapping is called sched (in other Unix variants, it
is sometimes called swapper). It always runs as process 0. When the free memory falls
so far below minfree that pageout is not able to recover memory by page stealing,
sched invokes the syscall sched(). Syscall swapout is then called to free all the memory
pages associated with the process chosen to be swapped out. On a later invocation
of sched(), the process may be swapped back in from disk if there is enough memory.
Chapter-5 Locking Techniques in UNIX
There are situations where multiple processes want to share some
resource. Locking is a facility provided so that only one process at a time can
access the resource. Locking is of two types:
• Advisory Locking
• Mandatory Locking
Advisory Locks Vs Mandatory Locks
Advisory locking means that the operating system maintains a correct knowledge
of which files have been locked by which process, but it does not prevent some
process from writing to a file that is locked by another process. A process can
ignore an advisory lock and write to a file that is locked, if the process has
adequate permissions. Advisory locks are fine for what are known as cooperating
processes (programs that access a shared resource and agree to check the locks).
The other type of file locking, mandatory locking, is provided by some
systems. Mandatory locks mean that the operating system checks every read and
write request to verify that the operation does not interfere with a lock held by a
process.
File locking Vs Record Locking
File locking locks an entire file, while record locking allows a process to lock a
specified portion of a file. The definition of a record for UNIX record locking is given
by specifying a starting byte offset in the file and the number of bytes from that
position.
Lockf file Locking
#include<unistd.h>
int lockf(int fd, int function, long size);
Returns 0 on success, or -1 on error.
The fd is the file descriptor of the file to be locked. The function has one of the
following values:
F_ULOCK Unlock a previously locked region
F_LOCK Lock a region (blocking)
F_TLOCK Test and lock a region (nonblocking)
F_TEST Test a region to see if it is locked
The size is the number of bytes to be locked. The lockf function uses the current file
offset (which the process can set using the lseek system call) and the size argument to
define the “record”. The record starts at the current offset and extends forward for a
positive size, or extends backwards for a negative size. If the size is 0, the record
affected extends from the current offset through the largest file offset (the end of file).
Doing an lseek to the beginning of the file followed by a lockf with a size of zero
locks the entire file.
The lockf function provides both the ability to set a lock and to test if a lock is set.
When the function is F_LOCK and the region is already locked by another
process, the calling process is put to sleep until the region is available. This is
termed blocking. The F_TLOCK operation, however, is termed a nonblocking call:
if the region is not available, lockf returns immediately with a value of -1. Also,
the F_TEST operation allows a process to test if a lock is set, without setting a
lock.
fcntl record locking
For record locking, cmd is F_GETLK, F_SETLK or F_SETLKW.
#include<fcntl.h>
int fcntl(int fd, int cmd, struct flock * arg);
The third argument is a pointer to a flock structure:
struct flock {
    short l_type;   /* F_RDLCK, F_WRLCK or F_UNLCK */
    off_t l_start;  /* offset in bytes, relative to l_whence */
    short l_whence; /* SEEK_SET, SEEK_CUR or SEEK_END */
    off_t l_len;    /* length, in bytes; 0 means lock till EOF */
    pid_t l_pid;    /* returned with F_GETLK */
};
This structure describes
• The type of lock desired: F_RDLCK (a shared read lock), F_WRLCK (an
exclusive write lock), or F_UNLCK ( unlocking a region)
• The starting byte offset of the region being locked or unlocked (l_start and
l_whence)
• The size of the region(l_len)
There are numerous rules about the specification of the region to be locked or unlocked.
• The two elements that specify the starting offset of the region are similar to
the last two arguments of the lseek function. Indeed, the l_whence
member is specified as SEEK_SET, SEEK_CUR or SEEK_END.
• Locks can start and extend beyond the current end of file, but cannot start
or extend before the beginning of the file.
• If the l_len is 0, it means that the lock extends to the largest possible
offset of the file (till the end of file). This allows us to lock a region starting
anywhere in the file, up through and including any data that is appended
to the file.
• To lock the entire file, we set l_start and l_whence to point to the
beginning of the file, and specify a length (l_len) of 0.
The basic rule is that any number of processes can have a shared read lock on a
byte, but only one process can have an exclusive write lock on a given byte.
Furthermore, if there are one or more read locks on a byte, there cannot be any
write lock on that byte, and if there is an exclusive write lock on a byte, there
cannot be any read locks on that byte.
To obtain a read lock, the descriptor must be open for reading, and to obtain a
write lock the descriptor must be open for writing.
The three different cmd values for the fcntl function are described below; flockptr
refers to the third argument (arg), the pointer to the flock structure.
F_GETLK Determine if the lock described by flockptr is blocked by
some other lock. If a lock exists that would prevent ours from being
created, the information on that existing lock overwrites the information
pointed to by flockptr. If no lock exists that would prevent ours from
being created, the structure pointed to by flockptr is left unchanged
except for the l_type member, which is set to F_UNLCK.
F_SETLK Set the lock described by flockptr. If we are trying to obtain a read lock
(l_type of F_RDLCK) or a write lock (l_type of F_WRLCK) and the
compatibility rule prevents the system from giving us the lock, fcntl
returns immediately with an error. This command is also used to clear the
lock described by flockptr (l_type of F_UNLCK).
F_SETLKW This command is a blocking version of F_SETLK. If the requested read
lock or write lock cannot be granted because another process
currently has some part of the requested region locked, the calling
process is put to sleep. This sleep is interrupted if a signal is caught.
Chapter-6 Inter Process Communication
Communication between processes plays a very important role in UNIX. Processes in
computer memory are said to be communicating when one process passes data to another or
vice-versa. The only requirement is that the communicating processes must
mutually agree on the means of communication.
The following are some methods of Inter Process Communication under UNIX:
• Signals
• Pipes
• FIFO
• Message Queues
• Semaphores
• Shared Memory
• Sockets
Interrupts and Signals:
In this section we will look at ways in which two processes can communicate using signals.
When a process terminates abnormally it usually tries to send a signal indicating what
went wrong. User specified communication could take place in this way.
Signals are software generated interrupts that are sent to a process when an event
occurs. Signals can be synchronously generated by an error in an application, such as
SIGFPE and SIGSEGV, but most signals are asynchronous. Signals can be posted to a
process when the system detects a software event, such as a user entering an interrupt
or stop or a kill request from another process. Signals can also come directly from
the OS kernel when a hardware event such as a bus error or an illegal instruction is
encountered. The system defines a set of signals that can be posted to a process.
Signal delivery is analogous to hardware interrupts in that a signal can be blocked from
being delivered in the future. Most signals cause termination of the receiving process if
no action is taken by the process in response to the signal. Some signals stop the
receiving process and other signals can be ignored. Each signal has a default action
which is one of the following:
• The signal is discarded after being received
• The process is terminated after the signal is received
• A core file is written, then the process is terminated
• Stop the process after the signal is received
Each signal defined by the system falls into one of five classes:
• Hardware conditions
• Software conditions
• Input/output notification
• Process control
• Resource control
Macros are defined in the <signal.h> header file for common signals. These include:
SIGHUP 1 /* hangup */
SIGINT 2 /* interrupt */
SIGQUIT 3 /* quit */
SIGILL 4 /* illegal instruction */
SIGABRT 6 /* used by abort */
SIGKILL 9 /* hard kill */
SIGALRM 14 /* alarm clock */
SIGCONT 19 /* continue a stopped process */
SIGCHLD 20 /* to parent on child stop or exit */
The numeric values of some signals, such as SIGCONT and SIGCHLD, differ between UNIX
variants, so programs should use the macro names rather than the numbers.
Signals can be numbered from 0 to 31.
Sending Signals -- kill
The common function used to send signals is kill( ):
int kill(pid_t pid, int sig);
Returns 0 on success, or -1 on error.
The first parameter, pid, selects the process or processes to which the signal is sent. It
can have the following values:
pid > 0    the signal is sent to the process whose process ID is equal to pid.
pid == 0   the signal is sent to all processes in the sender's process group.
pid == -1  the kernel sends the signal to all processes whose real user ID equals the
           effective user ID of the sender. If the sender has the effective user ID of the
           superuser, the kernel sends the signal to all processes except processes 0 and 1.
pid < -1   the kernel sends the signal to all processes in the process group whose ID is
           the absolute value of pid.
The second parameter is the signal number.
There is also a UNIX command called kill that can be used to send signals from the
command line - see the man pages for further details.
NOTE: unless it is caught or ignored, a signal sent with kill( ) usually terminates the
receiving process. Protection is therefore built into the system:
only processes with certain access privileges can be killed off.
Basic rule: a process can send signals only to processes that belong to the same user
(the superuser can signal any process).
The SIGKILL signal cannot be caught or ignored and will always terminate a process.
For example, kill(getpid( ), SIGINT); would send the interrupt signal to the calling
process itself.
If SIGINT is not caught or ignored, this has an effect similar to calling exit( ). Typing ctrl-c
at the terminal likewise sends a SIGINT to the process currently running in the foreground.
Signal Handling -- signal()
An application program can specify a function called a signal handler to be invoked
when a specific signal is received. When a signal handler is invoked on receipt of a
signal, it is said to catch the signal. A process can deal with a signal in one of the
following ways:
• The process can let the default action happen
• The process can ignore or block the signal (some signals, such as SIGKILL, cannot be ignored or blocked)
• The process can catch the signal with a handler.
Signal handlers usually execute on the current stack of the process. This lets the signal
handler return to the point that execution was interrupted in the process. This can be
changed on a per-signal basis so that a signal handler executes on a special stack. If a
process must resume in a different context than the interrupted one, it must restore the
previous context itself.
Receiving signals is straightforward with the function:
void (*signal(int sig, void (*func)(int)))(int);
That is to say, signal( ) arranges for the function func to be called if the process receives
the signal sig. On success signal( ) returns the previous handler for sig; on failure it
returns SIG_ERR and sets errno.
func can have three values:
SIG_DFL -- requests the system default action for sig, which for most signals is to terminate the
process upon receipt of sig.
SIG_IGN -- requests that sig be ignored (this is not allowed for SIGKILL or SIGSTOP).
A function address -- a user specified function.
SIG_DFL and SIG_IGN are defined in the <signal.h> (standard library) header file.
Thus, to ignore a ctrl-c typed at the command line, we could do:
signal(SIGINT, SIG_IGN);
To reset the system so that SIGINT causes termination at any place in our program, we
would do:
signal(SIGINT, SIG_DFL);
So let's write a program to trap a ctrl-c but not quit on this signal. We have a function
sigproc ( ) that is executed when we trap a ctrl-c. We will also set another function to
quit the program if it traps the SIGQUIT signal so we can terminate our program:
Chapter-7 The Pipe
Every user of UNIX has almost certainly used the pipe at some stage in interacting with
the operating system. "ls | more" pipes the output of the ls command to the more
command, so producing a paged listing of a directory. In effect the pipe acts as a
temporary file holding the output from the first command until it is read by the second.
Remember here that each of these two commands is run as a separate process on the
system. Assuming that the ls command runs for long enough then the ps command can
be used to examine all the processes on the system. Both the ls and the more
processes will be seen.
This method of establishing a pipe requires the use of the shell and could be used as a
means of communicating between processes if a program writes and then executes a
shell script. This is not perhaps the most efficient method of working, as it requires the
creation of a number of processes with the associated overheads involved. Execution of
"ls | more" from within a program would require generation of a shell process which
would subsequently create ls and more processes.
The pipe( ) system call returns an array of two file descriptors, the first one open for
reading and the second for writing. Data can therefore be passed down the write only
file descriptor and will appear out of the read only file descriptor. Note here that these
are UNIX file descriptors which are integer numbers representing open files, not C FILE
pointers. Reading and writing to and from these descriptors must therefore use the
UNIX read ( ) and write( ) system calls.
#include <unistd.h>
int pipe(int pfd[2]);
Returns 0 on success, or -1 on error.
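Using a pipe between a parent process and its child:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main( )
{
    int pfd[2];
    char buf[10];
    int pid;
    if (pipe(pfd) == -1) {            /* create the pipe before forking */
        printf("Pipe failed\n");
        exit(1);
    }
    pid = fork( );
    switch (pid) {
    case -1:
        printf("Fork failed\n");
        exit(2);
    default:                          /* parent: writes "hello" and its null byte */
        if (write(pfd[1], "hello", 6) == -1)
            printf("Write failed\n");
        break;
    case 0:                           /* child: read( ) blocks until data arrives */
        if (read(pfd[0], buf, sizeof(buf)) == -1)
            printf("Read failed\n");
        else
            printf("Child read: %s\n", buf);
        break;
    }
    return 0;
}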
Pipe does a destructive read, which means that the data once read from the pipe
cannot be retrieved. A pipe has a finite size, always at least 4K.
The main disadvantage of pipes is:
They can be used only between related processes, such as a parent and its child.
Chapter-8 Fifos
A named pipe works much like a regular pipe, but does have some noticeable
differences.
• Named pipes exist as a device special file in the file system.
• Processes of different ancestry can share data through a named pipe.
• When all I/O is done by sharing processes, the named pipe remains in the file
system for later use.
Creating a FIFO
There are several ways of creating a named pipe. The first two can be done directly
from the shell.
mknod MYFIFO p
mkfifo -m a=rw MYFIFO
The above two commands perform identical operations, with one exception. The mkfifo
command provides a hook for altering the permissions on the FIFO file directly after
creation. With mknod, a quick call to the chmod command will be necessary.
FIFO files can be quickly identified in a physical file system by the ``p'' indicator seen
here in a long directory listing:
$ ls -l MYFIFO
prw-r--r-- 1 root root 0 Dec 14 22:15 MYFIFO|
Also notice the vertical bar (``pipe sign'') located directly after the file name. Another
great reason to run Linux!
A FIFO can also be created from a C program with the mknod( ) system call:
#include <sys/types.h>
#include <sys/stat.h>
int mknod(char *pathname, int mode, int dev);
Returns 0 on success, or -1 on error.
pathname is the name of the FIFO to be created.
mode is S_IFIFO|Permissions
dev is 0 (ignored)
I/O operations on a FIFO are essentially the same as for normal pipes, with one major
exception. An ``open'' system call or library function should be used to physically open
up a channel to the pipe. With half-duplex pipes, this is unnecessary, since the pipe
resides in the kernel and not on a physical filesystem.
Blocking Actions on a FIFO
Normally, blocking occurs on a FIFO. In other words, if the FIFO is opened for reading,
the process will "block" until some other process opens it for writing. This action works
vice-versa as well. If this behavior is undesirable, the O_NONBLOCK flag can be used
in an open( ) call to disable the default blocking action.
The alternative would be to jump to another virtual console and run the client end,
switching back and forth to see the resulting action.
The Infamous SIGPIPE Signal
On a last note, pipes must have a reader and a writer. If a process tries to write to a
pipe that has no reader, it will be sent the SIGPIPE signal from the kernel. This is
imperative when more than two processes are involved in a pipeline.
Eg. For usage of FIFOs:
Process 1
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
int main( )
{
    int writefd;
    char *msg1 = "Process #1";
    if (mknod("FIFO1", S_IFIFO | 0600, 0) == -1)
        printf("Could not create a FIFO\n");
    /* O_WRONLY blocks here until Process 2 opens the FIFO for reading */
    if ((writefd = open("FIFO1", O_WRONLY)) == -1)
        printf("Open failed\n");
    else if (write(writefd, msg1, 10) == -1)
        printf("Write failed\n");
    return 0;
}
Process 2
#include <sys/types.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
int main( )
{
    int readfd, n;
    char buf[15];
    if ((readfd = open("FIFO1", O_RDONLY)) == -1)
        printf("Open failed\n");
    else if ((n = read(readfd, buf, sizeof(buf) - 1)) == -1)
        printf("Read failed\n");
    else {
        buf[n] = '\0';      /* read( ) does not add a terminating null */
        printf("The contents of buf is: %s\n", buf);
    }
    return 0;
}
Chapter – 8 Message Queues
Two (or more) processes can exchange information via access to a common system
message queue. The sending process places via some (OS) message-passing module
a message onto a queue which can be read by another process. Each message is
given an identification or type so that processes can select the appropriate message.
Processes must share a common key in order to gain access to the queue in the first
place (subject to other permissions -- see below).
Basic Message Passing: IPC messaging lets processes send and receive messages,
and queue messages for processing in an arbitrary order. Unlike the file byte-stream
data flow of pipes, each IPC message has an explicit length. Messages can be
assigned a specific type. Because of this, a server process can direct message traffic
between clients on its queue by using the client process PID as the message type. For
single-message transactions, multiple server processes can work in parallel on
transactions sent to a shared message queue.
Before a process can send or receive a message, the queue must be initialized.
Operations to send and receive messages are performed by the msgsnd() and
msgrcv() functions, respectively.
When a message is sent, its text is copied to the message queue. The msgsnd() and
msgrcv() functions can be performed as either blocking or non-blocking operations.
Non-blocking operations allow for asynchronous message transfer -- the process is not
suspended as a result of sending or receiving a message. In blocking or synchronous
message passing the sending process cannot continue until the message has been
transferred or has even been acknowledged by a receiver. IPC signal and other
mechanisms can be employed to implement such transfer. A blocked message
operation remains suspended until one of the following three conditions occurs:
• The call succeeds.
• The process receives a signal.
• The queue is removed.
For every message queue in the system, the kernel maintains the following structure of information, defined in <sys/msg.h>:
struct msqid_ds {
    struct ipc_perm msg_perm;  /* permission structure */
    struct msg *msg_first;     /* pointer to first message on queue */
    struct msg *msg_last;      /* pointer to last message on queue */
    ushort msg_cbytes;         /* current number of bytes in queue */
    ushort msg_qbytes;         /* max. no. of bytes on queue */
    ushort msg_qnum;           /* current number of messages on queue */
    ushort msg_lspid;          /* pid of last message send */
    ushort msg_lrpid;          /* pid of last message receive */
    time_t msg_stime;          /* time of last message send */
    time_t msg_rtime;          /* time of last message receive */
    time_t msg_ctime;          /* time of last message control */
};
The ipc_perm structure is common to Message Queues, Semaphores and Shared
Memory. The members of this structure are:
struct ipc_perm {
    ushort uid;    /* Owner's user id */
    ushort gid;    /* Owner's group id */
    ushort cuid;   /* Creator's user id */
    ushort cgid;   /* Creator's group id */
    ushort mode;   /* Access modes */
    key_t key;     /* key */
};
Internally, the kernel maintains the message structures in the form of linked lists.
Initialising the Message Queue
The msgget( ) function initializes a new message queue:
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>
int msgget(key_t key, int msgflg);
Returns the message queue ID (msqid) of the queue corresponding to the key argument on success, or -1 on error.
The value passed as the msgflg argument must be an octal integer with settings for the
queue's permissions and control flags.
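[Figure: msqid identifies a struct msqid_ds, which holds the msg_perm structure together with the msg_first and msg_last pointers to a linked list of messages. Each message carries a link to the next message, a type, a length and the data: type = 100, length = 1; type = 200, length = 2; type = 300, length = 3.]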
IPC Functions, Key Arguments, and Creation Flags:
Processes requesting access to an IPC facility must be able to identify it. To do this,
functions that initialize or provide access to an IPC facility use a key_t key argument.
(key_t is essentially an int type defined in <sys/types.h>.)
The key is an arbitrary value or one that can be derived from a common seed at run
time. One way is with ftok( ) , which converts a filename to a key value that is unique
within the system. Functions that initialize or get access to messages (also semaphores
or shared memory) return an ID number of type int. IPC
functions that perform read, write, and control operations use this ID. If the key
argument is specified as IPC_PRIVATE, the call initializes a new instance of an IPC
facility that is private to the creating process. When the IPC_CREAT flag is supplied in
the flags argument appropriate to the call, the function tries to create the facility if it does
not exist already. When called with both the IPC_CREAT and IPC_EXCL flags, the
function fails if the facility already exists. This can be useful when more than one
process might attempt to initialize the facility. One such case might involve several
server processes having access to the same facility. If they all attempt to create the
facility with IPC_EXCL in effect, only the first attempt succeeds. If neither of these flags
is given and the facility already exists, the functions to get access simply return the ID of
the facility. If IPC_CREAT is omitted and the facility is not already initialized, the calls
fail. These control flags are combined, using logical (bitwise) OR, with the octal
permission modes to form the flags argument. For example, the statement below
initializes a new message queue if the queue does not exist.
msqid = msgget(ftok("/tmp",key), (IPC_CREAT | IPC_EXCL | 0400));
The first argument evaluates to a key based on the string ("/tmp"). The second
argument evaluates to the combined permissions and control flags.
Sending and Receiving Messages
The msgsnd( ) and msgrcv( ) functions send and receive messages, respectively:
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>
int msgsnd(int msqid, struct msgbuf *msgp, int msgsz, int msgflg);
int msgrcv(int msqid, struct msgbuf *msgp, int msgsz, long msgtype, int msgflg);
msgsnd( ) returns 0 on success or -1 on error. msgrcv( ) returns the number of bytes
placed in the message text on success or -1 on error.
The msqid argument must be the ID of an existing message queue. The msgp
argument is a pointer to a structure that contains the type of the message and its text.
The structure below is an example of what this user-defined buffer might look like:
struct msgbuf {
    long mtype;          /* message type */
    char mtext[MSGSZ];   /* message text of length MSGSZ */
};
The msgsz argument specifies the length of the message text in bytes.
The msgtype argument of msgrcv( ) selects which type of message to receive; the mtype
member of the received structure holds the type assigned by the sending process.
The argument msgflg specifies the action to be taken if one or more of the following are
true:
• The number of bytes already on the queue is equal to msg_qbytes (maximum bytes
in the queue)
• The total number of messages on all queues system-wide is equal to the
system-imposed limit.
These actions are as follows:
• If (msgflg & IPC_NOWAIT) is non-zero, the message will not be sent and the calling
process will return immediately.
• If (msgflg & IPC_NOWAIT) is 0, the calling process will suspend execution until one
of the following occurs:
o The condition responsible for the suspension no longer exists, in which case the
message is sent.
o The message queue identifier msqid is removed from the system; when this occurs,
errno is set equal to EIDRM and -1 is returned.
o The calling process receives a signal that is to be caught; in this case the message
is not sent and the calling process resumes execution.
Upon successful completion, the following actions are taken with respect to the data
structure associated with msqid:
o msg_qnum is incremented by 1.
o msg_lspid is set equal to the process ID of the calling process.
o msg_stime is set equal to the current time.
Controlling message queues
The msgctl( ) function alters the permissions and other characteristics of a message
queue. The owner or creator of a queue can change its ownership or permissions using
msgctl( ). Also, any process with permission to do so can use msgctl( ) for control
operations.
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>
int msgctl(int msqid, int cmd, struct msqid_ds *buf);
Returns 0 on success or -1 on error.
The msqid argument must be the ID of an existing message queue. The cmd argument is one of:
IPC_STAT -- Place information about the status of the queue in the data structure pointed to by buf. The
process must have read permission for this call to succeed.
IPC_SET -- Set the owner's user and group ID, the permissions, and the size (in number of bytes) of the
message queue. A process must have the effective user ID of the owner, creator, or superuser
for this call to succeed.
IPC_RMID -- Remove the message queue specified by the msqid argument.
The following code illustrates the msgctl( ) function with its IPC_STAT and IPC_SET commands:
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>
...
struct msqid_ds buf;
...
if (msgctl(msqid, IPC_STAT, &buf) == -1) {
    perror("msgctl: msgctl failed");
    exit(1);
}
...
if (msgctl(msqid, IPC_SET, &buf) == -1) {
    perror("msgctl: msgctl failed");
    exit(1);
}
...
Disadvantages of Message Queues:
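• System call overhead: every send and receive goes through the kernel.
• Lower speed, because each message is copied into and out of the kernel.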
Chapter-9 Semaphores
Semaphores are a programming construct designed by E. W. Dijkstra in the late 1960s.
Dijkstra's model was the operation of railroads: consider a stretch of railroad in which
there is a single track over which only one train at a time is allowed. Guarding this track
is a semaphore. A train must wait before entering the single track until the semaphore is
in a state that permits travel. When the train enters the track, the semaphore changes
state to prevent other trains from entering the track. A train that is leaving this section of
track must again change the state of the semaphore to allow another train to enter. In
the computer version, a semaphore appears to be a simple integer. A process (or a
thread) waits for permission to proceed by waiting for the integer to become 0. If it
proceeds, it signals this by incrementing the integer by 1. When
it is finished, the process changes the semaphore's value by subtracting one from it.
Semaphores let processes query or alter status information. They are often used to
monitor and control the availability of system resources such as shared memory
segments.
Semaphores can be operated on as individual units or as elements in a set. Because
System V IPC semaphores can be in a large array, they are extremely heavy weight.
Much lighter weight semaphores are available in the threads library and POSIX
semaphores (see below for brief). Threads library semaphores must be used with
mapped memory. A semaphore set consists of a control structure and an array of
individual semaphores. A set of semaphores may contain up to 25 elements (the exact
limit is a system configuration parameter).
For every semaphore set in the system, the kernel maintains the following structure of information.
struct semid_ds {
    struct ipc_perm sem_perm;   /* permission structure */
    struct sem *sem_base;       /* pointer to first semaphore in the set */
    ushort sem_nsems;           /* number of semaphores in set */
    time_t sem_otime;           /* time of last semop */
    time_t sem_ctime;           /* time of last change */
};
The sem structure is the internal data structure used by the kernel to maintain the set of values for a given semaphore.
struct sem {
    ushort semval;    /* semaphore value, non-negative */
    short sempid;     /* pid of the last operation */
    ushort semncnt;   /* number of processes awaiting semval > cval */
    ushort semzcnt;   /* number of processes awaiting semval = 0 */
};
We can picture a particular semaphore in the kernel as being a semid_ds structure that points to an array of sem structures. If the semaphore has two members in its set, we would have the picture as shown.
[Figure: in the kernel, semid identifies a struct semid_ds (sem_perm structure, sem_base, sem_nsems, sem_otime, sem_ctime); sem_base points to an array of two sem structures, [0] and [1], each holding semval, sempid, semncnt and semzcnt.]
In a similar fashion to message queues, the semaphore set must be initialized using
semget( ); the semaphore creator can change its ownership or permissions using
semctl( ); and semaphore operations are performed via the semop( ) function. These are
now discussed below:
Initializing a Semaphore Set
The function semget( ) initializes or gains access to a semaphore set.
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
int semget(key_t key, int nsems, int semflg);
Returns the semaphore ID (semid) on success or -1 on error.
The key argument is an access value associated with the semaphore ID.
The nsems argument specifies the number of elements in a semaphore array. The call
fails when nsems is greater than the number of elements in an existing array; when the
correct count is not known, supplying 0 for this argument ensures that it will succeed.
The semflg argument specifies the initial access permissions and creation control flags.
Semaphore Operations
semop( ) performs operations on a semaphore set.
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
int semop(int semid, struct sembuf *sops, int nsops);
Returns 0 on success or -1 on error.
The semid argument is the semaphore ID returned by a previous semget( ) call. The
sops argument is a pointer to an array of structures, each containing the following
information about a semaphore operation:
• The semaphore number
• The operation to be performed
• Control flags, if any.
The sembuf structure specifies a semaphore operation, as defined in <sys/sem.h>.
struct sembuf {
    ushort_t sem_num;   /* semaphore number */
    short sem_op;       /* semaphore operation */
    short sem_flg;      /* operation flags */
};
The nsops argument specifies the length of the array, the maximum size of which is
determined by the SEMOPM configuration option; this is the maximum number of
operations allowed by a single semop( ) call. The operation to be performed is
determined as follows:
• A positive integer increments the semaphore value by that amount.
• A negative integer decrements the semaphore value by that amount. An attempt
to set a semaphore to a value less than zero fails or blocks, depending on whether
IPC_NOWAIT is in effect.
• A value of zero means to wait for the semaphore value to reach zero.
There are two control flags that can be used with semop():
IPC_NOWAIT -- Can be set for any operations in the array. Makes the function return without changing
any semaphore value if any operation for which IPC_NOWAIT is set cannot be
performed. The function fails if it tries to decrement a semaphore more than its current
value, or tests a nonzero semaphore to be equal to zero.
SEM_UNDO -- Allows individual operations in the array to be undone when the process exits.
This function takes a pointer, sops, to an array of semaphore operation structures. Each
structure in the array contains data about an operation to perform on a semaphore. Any
process with read permission can test whether a semaphore has a zero value. To
increment or decrement a semaphore requires write permission. When an operation
fails, none of the semaphores is altered.
The process blocks (unless the IPC_NOWAIT flag is set), and remains blocked until:
• the semaphore operations can all finish, so the call succeeds,
• the process receives a signal, or
• the semaphore set is removed.
Only one process at a time can update a semaphore. Simultaneous requests by
different processes are performed in an arbitrary order. When an array of operations is
given by a semop() call, no updates are done until all operations on the array can finish
successfully.
If a process with exclusive use of a semaphore terminates abnormally and fails to undo
the operation or free the semaphore, the semaphore stays locked in memory in the
state the process left it. To prevent this, the SEM_UNDO control flag makes semop()
allocate an undo structure for each semaphore operation, which contains the operation
that returns the semaphore to its previous state. If the process dies, the system applies
the operations in the undo structures. This prevents an aborted process from leaving a
semaphore set in an inconsistent state. If processes share access to a resource
controlled by a semaphore, operations on the semaphore should not be made with
SEM_UNDO in effect. If the process that currently has control of the resource
terminates abnormally, the resource is presumed to be inconsistent. Another process
must be able to recognize this to restore the resource to a consistent state. When
performing a semaphore operation with SEM_UNDO in effect, you must also have it in
effect for the call that will perform the reversing operation. When the process runs
normally, the reversing operation updates the undo structure with a complementary
value. This ensures that, unless the process is aborted, the values applied to the undo
structure cancel to zero. When the undo structure reaches zero, it is removed.
NOTE: Using SEM_UNDO inconsistently can lead to excessive resource consumption
because allocated undo structures might not be freed until the system is rebooted.
Controlling Semaphores
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
int semctl(int semid, int semnum, int cmd, union semun arg);
semctl() changes permissions and other characteristics of a semaphore set.
It must be called with a valid semaphore ID, semid. The semnum value selects a
semaphore within an array by its index. The cmd argument is one of the following
control flags:
GETVAL -- Return the value of a single semaphore.
SETVAL -- Set the value of a single semaphore. In this case, arg is taken as arg.val, an int.
GETPID -- Return the PID of the process that performed the last operation on the semaphore or array.
GETNCNT -- Return the number of processes waiting for the value of a semaphore to increase.
GETZCNT -- Return the number of processes waiting for the value of a particular semaphore to reach zero.
GETALL -- Return the values for all semaphores in a set. In this case, arg is taken as arg.array, a pointer
to an array of unsigned shorts (see below).
SETALL -- Set values for all semaphores in a set. In this case, arg is taken as arg.array, a pointer to an
array of unsigned shorts.
IPC_STAT -- Return the status information from the control structure for the semaphore set and place it in
the data structure pointed to by arg.buf, a pointer to a buffer of type semid_ds.
IPC_SET -- Set the effective user and group identification and permissions. In this case, arg is taken as
arg.buf.
IPC_RMID -- Remove the specified semaphore set.
A process must have an effective user identification of owner, creator, or superuser to
perform an IPC_SET or IPC_RMID command. Read and write permission is required as
for the other control commands.
The fourth argument union semun arg is optional, depending upon the operation
requested. If required it is of type union semun, which must be explicitly declared by the
application program as:
union semun {
    int val;
    struct semid_ds *buf;
    ushort *array;
} arg;
Chapter-10 Shared Memory
Shared Memory is an efficient means of passing data between programs. One program
will create a memory portion which other processes (if permitted) can access.
When write access is allowed for more than one process, an outside protocol or
mechanism such as a semaphore can be used to prevent inconsistencies and
collisions.
A process creates a shared memory segment using shmget( ). The original owner of a
shared memory segment can assign ownership to another user with shmctl(). It can also
revoke this assignment. Other processes with proper permission can perform various
control functions on the shared memory segment using shmctl(). Once created, a
shared segment can be attached to a process address space using shmat(). It can be
detached using shmdt(). The attaching process must have the appropriate permissions
for shmat(). Once attached, the process can read or write to the segment, as allowed by
the permission requested in the attach operation. A shared segment can be attached
multiple times by the same process. A shared memory segment is described by a
control structure with a unique ID that points to an area of physical memory. The
identifier of the segment is called the shmid.
The structure definition for the shared memory segment control structures and
prototypes can be found in <sys/shm.h>.
For each shared memory segment, the kernel maintains the following structure:
struct shmid_ds {
    struct ipc_perm shm_perm;   /* permission structure */
    int shm_segsz;              /* segment size */
    ushort shm_lpid;            /* pid of last operation */
    ushort shm_cpid;            /* creator pid */
    ushort shm_nattch;          /* current number attached */
    time_t shm_atime;           /* last attach time */
    time_t shm_dtime;           /* last detach time */
    time_t shm_ctime;           /* last change time */
};
Initialising shared memory
shmget( ) is used to obtain access to a shared memory segment.
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
int shmget(key_t key, int size, int shmflg);
Returns the shared memory ID (shmid) on success or -1 on error.
Accessing a Shared Memory Segment
The key argument is an access value associated with the shared memory ID. The size
argument is the size in bytes of the requested shared memory. The shmflg argument
specifies the initial access permissions and creation control flags.
When the call succeeds, it returns the shared memory segment ID. This call is also
used to get the ID of an existing shared segment (from a process requesting sharing of
some existing memory portion).
Attaching a Shared Memory Segment
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
char *shmat(int shmid, char *shmaddr, int shmflg);
Returns the starting address of the shared memory on success or -1 on error.
shmid is the identifier returned by shmget( ).
The valid values for shmaddr are given below:
• If the shmaddr argument is zero, the system selects the address for the caller.
• If shmaddr is non-zero, the address returned depends on whether the
caller specifies the SHM_RND value for the shmflg argument.
o If the SHM_RND value is not specified, the shared memory segment is
attached at the address specified by the shmaddr argument.
o If SHM_RND is specified, the shared memory segment is attached at the
address specified by the shmaddr argument, rounded down to a multiple of the
constant SHMLBA. LBA stands for "lower boundary address".
By default, the shared memory segment is attached for both reading and writing by the
calling process, if the calling process has read-write permissions for the segment. The
SHM_RDONLY value can be specified in the flag argument, specifying read-only
access.
Detaching a Shared Memory Segment
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
int shmdt(char *shmaddr);
Returns 0 on success or –1 on error.
shmdt( ) detaches the shared memory segment located at the address indicated by shmaddr.
This call does not delete the shared memory segment.
Controlling a Shared Memory Segment
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
int shmctl(int shmid, int cmd, struct shmid_ds *buf);
Returns 0 on success or –1 on error.
shmctl( ) is used to alter the permissions and other characteristics of a shared memory
segment. The process must have an effective user ID of owner, creator or superuser to perform
this command. The cmd argument is one of the following control commands:
SHM_LOCK -- Lock the specified shared memory segment in memory. The process must have the
effective ID of superuser to perform this command.
SHM_UNLOCK -- Unlock the shared memory segment. The process must have the effective ID of
superuser to perform this command.
IPC_STAT -- Return the status information contained in the control structure and place it in
the buffer pointed to by buf. The process must have read permission on the segment to perform
this command.
IPC_SET -- Set the effective user and group identification and access permissions. The process
must have an effective ID of owner, creator or superuser to perform this command.
IPC_RMID -- Remove the shared memory segment.
buf is a pointer to a structure of type struct shmid_ds, which is defined in <sys/shm.h>.
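The four calls described above can be put together in one short sketch. The helper name shm_roundtrip, the 4096-byte size and the 0600 permissions are assumptions chosen for illustration; IPC_PRIVATE is used as the key so that no key has to be agreed with another process. The segment is attached twice by the same process, once read-write and once read-only, so that IPC_STAT has something to report.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical helper: create a private segment, write msg through one
 * attachment, read it back through a second (read-only) attachment, and
 * remove the segment. Returns the attach count reported by IPC_STAT
 * while both attachments exist, or -1 on error. */
int shm_roundtrip(const char *msg, char *out, size_t outsz)
{
    struct shmid_ds ds;
    int nattch = -1;

    /* IPC_PRIVATE requests a new segment; 0600 = owner read/write. */
    int shmid = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
    if (shmid == -1)
        return -1;

    /* A zero (NULL) shmaddr lets the system select the address. */
    char *w = shmat(shmid, NULL, 0);
    char *r = shmat(shmid, NULL, SHM_RDONLY);

    if (w != (char *)-1 && r != (char *)-1) {
        snprintf(w, 4096, "%s", msg);      /* store through one mapping */
        snprintf(out, outsz, "%s", r);     /* load through the other    */
        if (shmctl(shmid, IPC_STAT, &ds) == 0)
            nattch = (int)ds.shm_nattch;
    }

    /* Detaching does not delete the segment; IPC_RMID does. */
    if (w != (char *)-1) shmdt(w);
    if (r != (char *)-1) shmdt(r);
    shmctl(shmid, IPC_RMID, NULL);
    return nattch;
}
```

In a real application the two attachments would normally belong to two different processes that obtain the same shmid from a shared key.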
Address Spaces and Mapping
Since backing store files (the process address space) exist only in swap storage, they
are not included in the UNIX named file space. (This makes backing store files
inaccessible to other processes.) However, it is a simple extension to allow the logical
insertion of all, or part, of one, or more, named files in the backing store and to treat the
result as a single address space. This is called mapping. With mapping, any part of any
readable or writable file can be logically included in a process's address space. Like any
other portion of the process's address space, no page of the file is actually loaded
into memory until a page fault forces this action. Pages of memory are written to the file
only if their contents have been modified. So, reading from and writing to files is
completely automatic and very efficient. More than one process can map a single
named file. This provides very efficient memory sharing between processes. All or part
of other files can also be shared between processes.
Not all named file system objects can be mapped. Devices that cannot be treated as
storage, such as terminal and network device files, are examples of objects that cannot
be mapped. A process address space is defined by all of the files (or portions of files)
mapped into the address space. Each mapping is sized and aligned to the page
boundaries of the system on which the process is executing. There is no memory
associated with processes themselves.
A process page maps to only one object at a time, although an object address may be
the subject of many process mappings. The notion of a "page" is not a property of the
mapped object. Mapping an object only provides the potential for a process to read or
write the object's contents. Mapping makes the object's contents directly addressable by
a process. Applications can access the storage resources they use directly rather than
indirectly through read and write. Potential advantages include efficiency (elimination of
unnecessary data copying) and reduced complexity (single-step updates rather than the
read, modify buffer, write cycle). The ability to access an object and have it retain its
identity over the course of the access is unique to this access method, and facilitates
the sharing of common code and data.
Because the file system name space includes any directory trees that are connected
from other systems via NFS, any networked file can also be mapped into a process's
address space.
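On current UNIX systems the mapping described above is normally requested with the mmap( ) call, which this text does not name; the sketch below assumes it, together with a scratch file under /tmp and a hypothetical helper name map_and_patch. It changes a file by storing into memory and then checks the result with an ordinary read of the file.

```c
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: map a scratch file, change its first byte with a
 * plain memory store, and read that byte back from the file itself.
 * Returns the byte read, or -1 on error. */
int map_and_patch(void)
{
    char path[] = "/tmp/mapdemoXXXXXX";    /* assumed scratch location */
    char byte = 0;
    int result = -1;

    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);                          /* file goes away on close */

    if (write(fd, "hello\n", 6) == 6) {
        /* MAP_SHARED: stores reach the file and other mappers of it. */
        char *p = mmap(NULL, 6, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (p != MAP_FAILED) {
            p[0] = 'J';                    /* no write( ) call involved */
            msync(p, 6, MS_SYNC);          /* push the modified page out */
            munmap(p, 6);
            if (pread(fd, &byte, 1, 0) == 1)
                result = (unsigned char)byte;
        }
    }
    close(fd);
    return result;
}
```

Two processes that map the same named file with MAP_SHARED see each other's stores in the same way, which is the memory sharing the section refers to.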
Chapter-11 Sockets
Sockets are a generalized networking capability first introduced in 4.1cBSD and
subsequently refined into their current form with 4.2BSD. The sockets feature is
available with most current UNIX system releases. (Transport Layer Interface (TLI) is
the System V alternative). Sockets allow communication between two different
processes on the same or different machines.
Sockets are an application program interface (they provide the user an interface with the
system). They can be used for local interprocess communication and also across
TCP/IP networks. They provide both connection oriented and connectionless modes of
communication.
Socket
#include <sys/types.h>
#include <sys/socket.h>
int socket(int family, int type, int protocol);
Returns socket ID on success or –1 on error.
The first parameter, family, is the family or the domain the socket is supposed to
function in. This determines the address format to be used.
The two mainly used socket domains are:
• The UNIX domain ( or AF_UNIX, for Address Family UNIX ):- In this
domain, a socket is given a pathname within the system name space.
• The internet domain ( or AF_INET ):- Addresses in the internet domain
consist of a machine network address and an identifying number, called
the port. Internet domain addresses allow communication between machines.
The second parameter, type, is the type of the socket. Communication follows
some particular “style”. Currently, communication is either through a stream or
by a datagram.
The two mainly used socket types are:
• SOCK_STREAM ( for stream sockets )
• SOCK_DGRAM ( for datagram sockets )
Stream communication means:
• Communication takes place across a connection between two sockets
• The communication is bi-directional, reliable, error free and no message
boundaries are kept.
• Reading from a stream may return the data sent by one or
several calls to write( ). The whole data may be read at one go, or only a
part of the data if there is not enough room for the entire message, or if
not all the data from a large message has been transferred across.
• The protocol implementing such a style will retransmit messages received
with errors.
• The protocol will also return error messages if one tries to send a
message after the connection has been broken.
Datagram communication means:
• It is connectionless, and is bi-directional
• Each message is addressed individually
• If the address is correct, it will generally be received, although this is not
guaranteed.
• Often datagrams are used for requests that require a response from the
recipient. If no response arrives in a reasonable amount of time (timeout),
the request is repeated.
• The individual datagrams will be kept separate when they are read, that is
message boundaries are preserved.
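The difference in message boundaries between the two styles can be observed with a small experiment. The sketch assumes socketpair( ), a standard call not covered in this text that returns two already-connected UNIX-domain sockets; the helper name first_read_len is hypothetical.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: send "ab" and "cd" as two separate writes over a
 * connected pair of sockets of the given type, then perform ONE read
 * with a large buffer. Returns the byte count of that single read,
 * or -1 on error. */
int first_read_len(int type)
{
    int sv[2];
    char buf[16];
    int n = -1;

    if (socketpair(AF_UNIX, type, 0, sv) == -1)
        return -1;

    if (write(sv[0], "ab", 2) == 2 && write(sv[0], "cd", 2) == 2)
        n = (int)read(sv[1], buf, sizeof buf);

    close(sv[0]);
    close(sv[1]);
    return n;
}
```

With SOCK_DGRAM the single read returns 2, because each datagram is delivered separately. With SOCK_STREAM it usually returns 4, the two writes having merged into one byte stream, although a stream is allowed to deliver fewer bytes per read.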
A protocol is a set of rules, data formats and conventions that regulates the
transfer of data between participants while communicating. There is one protocol
for each socket type (stream, datagram etc.,) within each domain.
• The program that implements a protocol keeps track of the names that are
bound to sockets, sets up connections and transfers data between
sockets.
• It is possible for several protocols, differing only in low level details, to
implement the same style of communication within a particular domain.
Although it is possible to select which protocol should be used, for nearly
all uses it is sufficient to request the default protocol.
• Usually the protocol argument to socket( ) is set to 0, which selects the default
protocol.
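As a minimal sketch of the socket( ) call with the default protocol, the hypothetical helper below creates a socket with protocol 0 and then asks the system, through the standard getsockopt( ) option SO_TYPE, which type of socket it actually holds.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a socket in the given family with the
 * default protocol and report its type as seen by the system.
 * Returns the SO_TYPE value (e.g. SOCK_STREAM), or -1 on error. */
int socket_type_of(int family, int type)
{
    int got = -1;
    socklen_t len = sizeof got;

    int sock = socket(family, type, 0);    /* 0 = default protocol */
    if (sock == -1)
        return -1;

    if (getsockopt(sock, SOL_SOCKET, SO_TYPE, &got, &len) == -1)
        got = -1;

    close(sock);                           /* a socket is closed like a file */
    return got;
}
```

The same helper works for either domain, for example socket_type_of(AF_INET, SOCK_STREAM) or socket_type_of(AF_UNIX, SOCK_DGRAM).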
Socket address structure
This structure contains information about the family (UNIX, internet etc.),
network and host address, and port (which service to ask for).
• The network, host address and the port are stored in a 14 byte long
string, whose layout is protocol-specific.
• The structure as defined in sys/socket.h is:
struct sockaddr {
    u_short sa_family;    /* address family: AF_UNIX for UNIX, */
                          /* AF_INET for the Internet */
    char    sa_data[14];  /* up to 14 bytes of protocol-specific address */
};
For the INTERNET domain the following information is provided:
• Family name ( AF_INET )
• The IP address:
The network and host address: a dotted decimal number with four parts.
The client must identify the server it wants to communicate with, by
using this dotted decimal number.
• The port number: The service the client wants to use on the server.
Port numbers below 1024 are reserved for specific services; traditional BSD
systems hand out ports 1024 to 5000 automatically, so servers usually choose
ports above 5000.
The server registers the service offered by the socket to a port via
the bind( ) call.
The port number for the client's socket is assigned automatically.
The structures for the internet domain are defined in the netinet/in.h
header file. The structures are:
struct in_addr
{
    u_long s_addr;               /* 32 bit net_id/host_id, network
                                    byte ordered */
};
struct sockaddr_in {
    short          sin_family;   /* AF_INET */
    u_short        sin_port;     /* 16 bit port number, network byte ordered */
    struct in_addr sin_addr;     /* 32 bit netid/hostid,
                                    network byte ordered */
    char           sin_zero[8];  /* unused */
};
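Filling in a sockaddr_in by hand is mostly a matter of getting the byte order right. The sketch below assumes the standard conversion routines htons( ) and inet_pton( ), which this text has not introduced; the helper name make_inet_addr is hypothetical.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: fill addr from a dotted decimal IP string and a
 * port number given in host byte order. Returns 0 on success, or -1 if
 * ip is not a valid dotted decimal address. */
int make_inet_addr(struct sockaddr_in *addr, const char *ip,
                   unsigned short port)
{
    memset(addr, 0, sizeof *addr);         /* also clears sin_zero */
    addr->sin_family = AF_INET;
    addr->sin_port = htons(port);          /* host -> network byte order */

    /* inet_pton stores the address already in network byte order. */
    if (inet_pton(AF_INET, ip, &addr->sin_addr) != 1)
        return -1;
    return 0;
}
```

A structure filled this way is passed to bind( ) or connect( ) after a cast to the generic struct sockaddr *.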
Connection oriented socket calls
Server (connection oriented protocol)          Client
socket( )
bind( )
listen( )
accept( )                                      socket( )
Blocks until connection from client  <-------  connect( )    Connection establishment
read( )                              <-------  write( )      Data (request)
Process request
write( )                             ------->  read( )       Data (reply)
Connectionless socket calls
Server (connectionless protocol)               Client
socket( )                                      socket( )
bind( )                                        bind( )
recvfrom( )
Blocks until data arrives from client <------  sendto( )     Data (request)
Process request
sendto( )                            ------->  recvfrom( )   Data (reply)
bind
#include <sys/types.h>
#include <sys/socket.h>
int bind( int sock, struct sockaddr * address, int addresslength );
Returns 0 on success or –1 on error.
A socket is created without a name. Until a name is bound to a socket, other
processes have no way to reference the socket. This means no messages can
be exchanged.
A server must bind its local address to receive the client's requests. A
connection oriented client does not have to explicitly bind its local address; it
will be bound during the connection.
The bind system call assigns a name to the unnamed socket. Binding an
address allows a process to register its address with the system. This makes it
possible for other processes to find it. Binding uses domain specific address
formats.
The second argument, struct sockaddr * address, is a pointer to a protocol
specific address, passed through the generic structure:
struct sockaddr {
    u_short sa_family;    /* address family */
    char    sa_data[14];  /* up to 14 bytes of protocol specific address */
};
The third argument is the length of the address structure.
Server processes register their well-known addresses with the system.
This tells the system that any messages received at the address pointed
to by address should be forwarded to the server process. Both
connection oriented and connectionless servers have to register their
sockets before accepting any client requests.
A client can register a specific address for itself.
A connectionless client should make sure that the system assigns it a
unique address, so that the server has a valid address to send the data to.
Connect
#include <sys/types.h>
#include <sys/socket.h>
int connect( int sock, struct sockaddr * servaddr, int addrlen );
Returns 0 on success or –1 on error.
A client process connects a socket descriptor following the socket system call to
establish a connection with the server.
listen
#include <sys/types.h>
#include <sys/socket.h>
int listen( int sock, int backlog);
Returns 0 on success or –1 on error.
This system call is used by a connection oriented server to indicate that it is
willing to receive connections.
This call is executed after the socket and bind system calls have been executed,
and before the accept system call.
The backlog argument specifies how many connection requests may be
queued by the system, while it waits for the server to execute the accept system
call.
In the time it takes the server to handle the request of an accept (the time taken
by the server to fork a child process, and then have the parent process execute
another accept( )), it is possible for additional connection requests to arrive from
other clients. What the backlog argument refers to is this queue of pending
requests for connections.
accept
#include <sys/types.h>
#include <sys/socket.h>
int accept( int sock, struct sockaddr * clientname, int *addrlen);
Returns a new socket descriptor on success, with the address of the client stored in
clientname, or –1 on error.
After a connection oriented server executes the listen system call, the server
waits for a connection from some client process by having the server execute the
accept call.
accept performs the following functions:
• Blocks until a connection request is in the queue.
• Creates a new socket with the same properties as sock
The clientname and addrlen arguments are used to return the address of
the connected peer process (the client). addrlen is called a ‘value result
argument’. The caller sets its value before the system call, and the system call
stores a result in the variable. Usually these value result arguments are integers
that the caller sets to the size of the buffer, with the system call changing this
value on the return, to the actual amount of data stored in it.
For this system call the caller sets addrlen to the size of the sockaddr
structure whose address is passed as the clientname argument. On return,
addrlen contains the actual number of bytes that the system call stores in
the clientname argument.
The system call returns up to 3 values:
• An integer return code that is either an error indicator or a new socket
descriptor.
• The address of the client process (clientname), and
• The size of this address.
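The connection oriented call sequence can be exercised in one process over the loopback interface. This works because the client's connect( ) is held in the backlog queue until the server calls accept( ). The helper name tcp_echo_once, the backlog of 5 and the use of port 0 with getsockname( ) (a standard call not covered in this text) to learn the assigned port are all assumptions made for the sketch.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

/* Hypothetical helper: the client sends msg, the server echoes it, and
 * the echo is stored in out. Returns bytes echoed, or -1 on error. */
int tcp_echo_once(const char *msg, char *out, size_t outsz)
{
    struct sockaddr_in srv, peer;
    socklen_t len;
    char buf[64];
    int lsock = -1, csock = -1, conn = -1, result = -1;
    size_t want = strlen(msg);
    ssize_t n;

    if (want == 0 || want > sizeof buf || want > outsz)
        return -1;

    /* Server: socket, bind, listen. Port 0 = let the system choose. */
    memset(&srv, 0, sizeof srv);
    srv.sin_family = AF_INET;
    srv.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    srv.sin_port = htons(0);
    lsock = socket(AF_INET, SOCK_STREAM, 0);
    if (lsock == -1 ||
        bind(lsock, (struct sockaddr *)&srv, sizeof srv) == -1 ||
        listen(lsock, 5) == -1)
        goto done;

    /* Learn the assigned port so that the client can reach it. */
    len = sizeof srv;
    if (getsockname(lsock, (struct sockaddr *)&srv, &len) == -1)
        goto done;

    /* Client: socket, connect. The request waits in the backlog. */
    csock = socket(AF_INET, SOCK_STREAM, 0);
    if (csock == -1 ||
        connect(csock, (struct sockaddr *)&srv, sizeof srv) == -1)
        goto done;

    /* Server: accept. len is value-result: size in, bytes stored out. */
    len = sizeof peer;
    conn = accept(lsock, (struct sockaddr *)&peer, &len);
    if (conn == -1)
        goto done;

    /* Request and reply. A stream may deliver fewer bytes than asked
     * for, so real code loops; msg is short enough for this sketch. */
    if (write(csock, msg, want) != (ssize_t)want)
        goto done;
    n = read(conn, buf, sizeof buf);
    if (n <= 0 || write(conn, buf, (size_t)n) != n)
        goto done;
    n = read(csock, out, outsz);
    if (n > 0)
        result = (int)n;

done:
    if (conn != -1) close(conn);
    if (csock != -1) close(csock);
    if (lsock != -1) close(lsock);
    return result;
}
```

A real server would loop around accept( ) and normally fork a child for each connection, as described under listen.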
Send, sendto, recv, recvfrom
#include <sys/types.h>
#include <sys/socket.h>
int send( int sock, char *buf, int bytes, int flags);
int sendto( int sock, char *buf, int bytes, int flags, struct sockaddr *to, int addr_to_len);
int recv( int sock, char *buf, int bytes, int flags);
int recvfrom( int sock, char *buf, int bytes, int flags, struct sockaddr *from, int *addr_from_len);
Each call returns the length of the data that was transferred or –1 on error.
When a connection is established, the server and client can exchange data. This can be performed using
• The standard read( ) and write( ) system calls.
• In the connection oriented mode, the send( ) and recv( ) system calls.
• In the connectionless mode, the sendto( ) and recvfrom( ) system calls.
The first three arguments are similar to the first three arguments of the read and write system calls.
The flags argument is 0 or formed by OR'ing one or more of the following constants.
MSG_OOB Send/receive out-of-band data.
MSG_PEEK Peek at the incoming message ( recv or recvfrom ).
MSG_DONTROUTE Send without using the routing tables ( send or sendto ).
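A connectionless exchange with sendto( ) and recvfrom( ), including the MSG_PEEK flag, might look like the sketch below. It runs both ends in one process over the loopback interface. The helper name udp_echo_once is hypothetical, and the client relies on the system assigning it an address automatically on its first sendto( ) instead of calling bind( ) itself.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

/* Hypothetical helper: the client sends msg as one datagram, the server
 * peeks at it, receives it, and replies to the sender's address.
 * Returns the number of bytes in the reply, or -1 on error. */
int udp_echo_once(const char *msg, char *out, size_t outsz)
{
    struct sockaddr_in srv, from;
    socklen_t len;
    char buf[64];
    int ssock, csock, result = -1;
    ssize_t n, peeked;

    memset(&srv, 0, sizeof srv);
    srv.sin_family = AF_INET;
    srv.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    srv.sin_port = htons(0);               /* system chooses the port */

    ssock = socket(AF_INET, SOCK_DGRAM, 0);
    csock = socket(AF_INET, SOCK_DGRAM, 0);
    if (ssock == -1 || csock == -1 ||
        bind(ssock, (struct sockaddr *)&srv, sizeof srv) == -1)
        goto done;
    len = sizeof srv;
    if (getsockname(ssock, (struct sockaddr *)&srv, &len) == -1)
        goto done;

    /* Client: no connect( ); the destination travels with each call. */
    if (sendto(csock, msg, strlen(msg), 0,
               (struct sockaddr *)&srv, sizeof srv) == -1)
        goto done;

    /* Server: MSG_PEEK reads the datagram but leaves it queued... */
    peeked = recv(ssock, buf, sizeof buf, MSG_PEEK);
    /* ...so recvfrom( ) gets the same data plus the sender's address. */
    len = sizeof from;
    n = recvfrom(ssock, buf, sizeof buf, 0,
                 (struct sockaddr *)&from, &len);
    if (n == -1 || n != peeked)
        goto done;

    /* Reply to whoever sent the request. */
    if (sendto(ssock, buf, (size_t)n, 0,
               (struct sockaddr *)&from, len) == -1)
        goto done;
    n = recv(csock, out, outsz, 0);
    if (n >= 0)
        result = (int)n;

done:
    if (ssock != -1) close(ssock);
    if (csock != -1) close(csock);
    return result;
}
```

Across a real network a datagram may be lost, so a client would add the timeout and retry described earlier instead of blocking in recv( ) indefinitely.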
An A-Z Index of the Linux BASH command line
alias Create an alias
awk Find and Replace text within file(s)
break Exit from a loop
builtin Run a shell builtin
cal Display a calendar
case Conditionally perform a command
cat Display the contents of a file
cd Change Directory
chgrp Change group ownership
chmod Change access permissions
chown Change file owner and group
chroot Run a command with a different root directory
cksum Print CRC checksum and byte counts
clear Clear terminal screen
cmp Compare two files
comm Compare two sorted files line by line
command Run a command - ignoring shell functions
continue Resume the next iteration of a loop
cp Copy one or more files to another location
cron Daemon to execute scheduled commands
crontab Schedule a command to run at a later time
csplit Split a file into context-determined pieces
cut Divide a file into several parts
date Display or change the date & time
dc Desk Calculator
dd Data Dump - Convert and copy a file
declare Declare variables and give them attributes
df Display free disk space
diff Display the differences between two files
diff3 Show differences among three files
dir Briefly list directory contents
dircolors Colour setup for `ls'
dirname Convert a full pathname to just a path
dirs Display list of remembered directories
du Estimate file space usage
echo Display message on screen
ed A line-oriented text editor (edlin)
egrep Search file(s) for lines that match an extended expression
eject Eject CD-ROM
enable Enable and disable builtin shell commands
env Display, set, or remove environment variables
eval Evaluate several commands/arguments
exec Execute a command
exit Exit the shell
expand Convert tabs to spaces
export Set an environment variable
expr Evaluate expressions
factor Print prime factors
false Do nothing, unsuccessfully
fdformat Low-level format a floppy disk
fdisk Partition table manipulator for Linux
fgrep Search file(s) for lines that match a fixed string
find Search for files that meet a desired criteria
fmt Reformat paragraph text
fold Wrap text to fit a specified width.
for Expand words, and execute commands
format Format disks or tapes
free Display memory usage
fsck Filesystem consistency check and repair.
gawk Find and Replace text within file(s)
getopts Parse positional parameters
grep Search file(s) for lines that match a given pattern
groups Print group names a user is in
gzip Compress or decompress named file(s)
hash Remember the full pathname of a name argument
head Output the first part of file(s)
history Command History
hostname Print or set system name
id Print user and group id's
if Conditionally perform a command
import Capture an X server screen and save the image to file
info Help info
install Copy files and set attributes
join Join lines on a common field
kill Stop a process from running
less Display output one screen at a time
let Perform arithmetic on shell variables
ln Make links between files
local Create variables
locate Find files
logname Print current login name
logout Exit a login shell
lpc Line printer control program
lpr Off line print
lprint Print a file
lprintd Abort a print job
lprintq List the print queue
lprm Remove jobs from the print queue
ls List information about file(s)
m4 Macro processor
man Help manual
mkdir Create new folder(s)
mkfifo Make FIFOs (named pipes)
mknod Make block or character special files
more Display output one screen at a time
mount Mount a file system
mtools Manipulate MS-DOS files
mv Move or rename files or directories
nice Set the priority of a command or job
nl Number lines and write files
nohup Run a command immune to hangups
passwd Modify a user password
paste Merge lines of files
pathchk Check file name portability
popd Restore the previous value of the current directory
pr Convert text files for printing
printcap Printer capability database
printenv Print environment variables
printf Format and print data
ps Process status
pushd Save and then change the current directory
pwd Print Working Directory
quota Display disk usage and limits
quotacheck Scan a file system for disk usage
quotactl Set disk quotas
ram ram disk device
rcp Copy files between two machines.
read read a line from standard input
readonly Mark variables/functions as readonly
remsync Synchronize remote files via email
return Exit a shell function
rm Remove files
rmdir Remove folder(s)
rpm RPM Package Manager
rsync Remote file copy (Synchronize file trees)
screen Terminal window manager
sdiff Merge two files interactively
sed Stream Editor
select Accept keyboard input
seq Print numeric sequences
set Manipulate shell variables and functions
shift Shift positional parameters
shopt Shell Options
shutdown Shutdown or restart Linux
sleep Delay for a specified time
sort Sort text files
source Run commands from a file `.'
split Split a file into fixed-size pieces
su Substitute user identity
sum Print a checksum for a file
symlink Make a new name for a file
sync Synchronize data on disk with memory
tac Concatenate and write files in reverse
tail Output the last part of files
tar Tape ARchiver
tee Redirect output to multiple files
test Evaluate a conditional expression
time Measure Program Resource Use
times User and system times
touch Change file timestamps
top List processes running on the system
traceroute Trace Route to Host
trap Run a command when a signal is set(bourne)
tr Translate, squeeze, and/or delete characters
true Do nothing, successfully
tsort Topological sort
tty Print filename of terminal on stdin
type Describe a command
ulimit Limit user resources
umask Users file creation mask
umount Unmount a device
unalias Remove an alias
uname Print system information
unexpand Convert spaces to tabs
uniq Uniquify files
units Convert units from one scale to another
unset Remove variable or function names
unshar Unpack shell archive scripts
until Execute commands (until error)
useradd Create new user account
usermod Modify user account
users List users currently logged in
uuencode Encode a binary file
uudecode Decode a file created by uuencode
v Verbosely list directory contents (`ls -l -b')
vdir Verbosely list directory contents (`ls -l -b')
watch Execute/display a program periodically
wc Print byte, word, and line counts
whereis Report all known instances of a command
which Locate a program file in the user's path.
while Execute commands
who Print all usernames currently logged in
whoami Print the current user id and name (`id -un')
xargs Execute utility, passing constructed argument list(s)
yes Print a string until interrupted