latest unix mat3

Post on 24-Dec-2015

“UNIX”

© CRANES VARSITY ALL RIGHTS RESERVED 1

Chapter-1 INTRODUCTION

History of Unix

• UNIX was "created" 1968-1970. UNIX time is kept from Jan 1, 1970.

• Ken Thompson, Dennis Ritchie (AT&T Bell Labs), Multics, PDP7

• Thompson wrote "B"; Dennis Ritchie then developed 'C' in 1972

• UNIX rewritten in 'C' in 1973 - greatly increased portability

• 6th edition of UNIX distributed in 1975.

• 7th edition distributed in 1979, ported to Interdata 8/32 (IBM370 like), PDP 11.

• University of California, Berkeley UCB -- BSD UNIX.

Today there are many "flavors" of UNIX with slight differences.

The name "UNIX" is a trademark, originally of AT&T, but since sold several times. Most
variants have names other than "UNIX"; many derive from the original AT&T code, while
others (such as Linux) are independent reimplementations of the same design.

Why learn Unix?

• UNIX has been called the "Internet Operating System" because much of the
Internet was built around and on UNIX machines (the widely used BSD
implementation of TCP/IP was developed at Berkeley). UNIX has/had the
communications technology that allowed the Internet to explode.

• Remote login and file transfer (FTP)

• Software Development, UNIX has many powerful tools that aided programmers;

the "data streams" philosophy allows tools to be connected together.

• Personal UNIX, Linux for your mac (or PC).

• High end word processing and desktop publishing

• High end image processing and analysis

Main Features of UNIX

• Multi-user

more than one user can use the machine at a time,
supported via terminals (serial or network connection)

• Multi-tasking

more than one program can be run at a time

• Hierarchical directory structure

to support the organisation and maintenance of files

• Portability

only a small part of the kernel (<10%) is written in assembler. This meant the operating system
could be easily converted to run on different hardware

• Tools for program development

a wide range of support tools (debuggers, compilers)

Why is Unix so widely used?

• Scalable and portable

Runs in many environments: TV set-top boxes, bank ATMs, phone switches, real-time
systems, PCs, workstations, servers, 12,000-CPU parallel processors.

• Multiuser

Even a one-user machine gets a protected environment. Multiple logins allow
shared resources.

• Preemptive multitasking

Each program gets a "chunk" of system resources. Many users and programs
can run at the same time (in other words, it has efficient context switching and good
virtual memory, and idle programs are paged out, so you can do more with fewer
resources).

• Robust

Protected execution space, keeps on running. This makes it good for OLTP (On-Line
Transaction Processing) and high-availability applications. Failing user programs
do not crash the whole system (usually).

• Networks and the Internet

UNIX has many "built in" communication tools, News, mail, ftp, rlogin etc.

Unix O.S Structure

Unix is a layered operating system. The innermost layer is the hardware that provides

the services for the OS. The operating system, referred to in Unix as the kernel, interacts directly with the hardware and provides the services to the user programs.

These user programs do not need to know anything about the hardware. They just need

to know how to interact with the kernel and it is up to the kernel to provide the desired

service. One of the big appeals of Unix to programmers has been that most well written

user programs are independent of the underlying hardware, making them readily

portable to new systems.

User programs interact with the kernel through a set of standard system calls. These

system calls request services to be provided by the kernel. Such services would include

accessing a file: open, close, read, write, link, or execute a file; starting or updating

accounting records; changing ownership of a file or directory; changing to a new

directory; creating, suspending, or killing a process; enabling access to hardware

devices; and setting limits on system resources.

Unix is a multi-user, multi-tasking operating system. Many users can log in to a

system simultaneously. It is the kernel's job to keep each process and user separate

and to regulate access to system hardware, including cpu, memory, disk and other I/O

devices.

The System Call Interface

In UNIX, all user programs and application software use the system call interface to

access system resources like disks, printers, memory etc. The system call interface in

UNIX provides a set of system calls (C functions).

The purpose of the system call interface is to provide system integrity. As all low-level
hardware access is under control of the operating system, this prevents a user's
program from corrupting the system.

The operating system, upon receiving a system call, validates its authenticity or
permission, then executes it on behalf of the user's program, after which it returns the
results. If the request is invalid or not authenticated, then the operating system does not
perform the request and simply returns an error code to the user's program.

[Figure: Architecture of UNIX. Hardware at the centre, surrounded by the kernel; user programs such as sh, who, date, wc, grep, ed, vi, ld, as, comp, cpp, nroff and a.out form the outer layer.]

Header Files

Header files define how a system call works. A header file contains a prototype of the

system call, and the parameters (variables) required by the call, and the parameters

returned by the system call.

When a programmer develops programs, the header file for the particular system call is

incorporated (included) into the program. This allows the compiler to check the number

of parameters and their data type.

The UNIX Operating System

The basic structure of the UNIX operating system can be divided into three parts:

• kernel
    schedules programs
    manages data/file access and storage
    enforces security mechanisms
    performs all hardware access

• shell
    presents each user with a prompt
    interprets commands typed by a user
    executes user commands
    supports a custom environment for each user

• utilities
    file management (rm, cat, ls, rmdir, mkdir)
    user management (passwd, chmod, chgrp)
    process management (kill, ps)
    printing (lp, troff, pr)
    program development tools

UNIX Shells

The shell sits between the user and the operating system, acting as a command

interpreter. It reads your terminal input and translates the commands into actions taken

by the system. The shell is analogous to command.com in DOS. When you log into the

system you are given a default shell. When the shell starts up it reads its startup files

and may set environment variables, command search paths, and command aliases, and

executes any commands specified in these files.

The original shell was the Bourne shell, sh. Every Unix platform will either have the

Bourne shell, or a Bourne compatible shell available. It has very good features for

controlling input and output, but is not well suited for the interactive user. To meet the

latter need the C shell, csh, was written and is now found on most, but not all, Unix

systems. It uses C type syntax, the language Unix is written in, but has a more

awkward input/output implementation. It has job control, so that you can reattach a job

running in the background to the foreground. It also provides a history feature, which

allows you to modify and repeat previously executed commands.

The default prompt for the Bourne shell is $ (or #, for the root user). The default prompt

for the C shell is %.

Numerous other shells are available from the network. Almost all of them are based on

either sh or csh with extensions to provide job control to sh, allow in-line editing of

commands, page through previously executed commands, provide command name

completion and custom prompt, etc. Some of the more well known of these may be on

your favorite Unix system: the Korn shell, ksh, by David Korn and the Bourne Again

SHell, bash, from the Free Software Foundation's GNU project, both based on sh, the

T-C shell, tcsh, and the extended C shell, cshe, both based on csh. Below we will

describe some of the features of sh and csh .

The shells have a number of built-in, or native commands. These commands are

executed directly in the shell and don't have to call another program to be run. These

built-in commands are different for the different shells.

Sh

For the Bourne shell some of the more commonly used built-in commands are:

:         null command
.         source (read and execute) commands from a file
case      multi-way conditional branch
cd        change the working directory (default is $HOME)
echo      write a string to standard output
eval      evaluate the given arguments and feed the result back to the shell
exec      execute the given command, replacing the current shell
exit      exit the current shell
export    share the specified environment variable with subsequent shells
for       loop over a list of words
if        conditional branch
pwd       print the current working directory
read      read a line of input from stdin
set       set variables for the shell
test      evaluate an expression as true or false
trap      catch a specified signal and execute commands
umask     set a default file permission mask for new files
unset     unset shell variables
wait      wait for a specified process to terminate
while     loop while a condition is true

Csh

For the C shell the more commonly used built-in functions are:

alias            assign a name to a function
bg               put a job into the background
cd               change the current working directory
echo             write a string to stdout
eval             evaluate the given arguments and feed the result back to the shell
exec             execute the given command, replacing the current shell
exit             exit the current shell
fg               bring a job to the foreground
foreach          loop over a list of words
glob             do filename expansion on the list, but no "\" escapes are honored
history          print the command history of the shell
if               conditional branch
jobs             list or control active jobs
kill             kill the specified process
limit            set limits on system resources
logout           terminate the login shell
nice command     lower the scheduling priority of the process, command
nohup command    do not terminate command when the shell exits
popd             pop the directory stack and return to that directory
pushd            change to the new directory specified and add the current one to the directory stack
rehash           recreate the hash table of paths to executable files
repeat           repeat a command the specified number of times
set              set a shell variable
setenv           set an environment variable for this and subsequent shells
source           source (read and execute) commands from a file
stop             stop the specified background job
switch           multi-way conditional branch
umask            set a default file permission mask for new files
unalias          remove the specified alias name
unset            unset shell variables
unsetenv         unset shell environment variables
wait             wait for all background processes to terminate
while            loop while a condition is true

Environment Variables

Environment variables are used to provide information to the programs you use. You

can have both global environment and local shell variables. Global environment

variables are set by your login shell and new programs and shells inherit the

environment of their parent shell. Local shell variables are used only by that shell and

are not passed on to other processes. A child process cannot pass a variable back to its

parent process.
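The difference between exported and local variables can be seen in a short Bourne shell sketch. The variable names GREETING and scratch are made up for this illustration and do not come from the text above.

```shell
#!/bin/sh
# GREETING and scratch are hypothetical names used only for this illustration.
GREETING=hello; export GREETING   # global: inherited by child processes
scratch=local-only                # local: stays in this shell

# A child shell sees the exported variable but not the local one.
child_view=`sh -c 'echo "GREETING=[$GREETING] scratch=[$scratch]"'`
echo "$child_view"

# A change made in a child process does not travel back to the parent.
sh -c 'GREETING=changed'
echo "parent still has GREETING=$GREETING"
```

The child prints an empty value for scratch because that variable was never exported.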

The current environment variables are displayed with the “env” or “printenv”

commands. Some common ones are:

• DISPLAY The graphical display to use

• EDITOR The path to your default editor, e.g. /usr/bin/vi

• GROUP Your login group, e.g. staff

• HOME Path to your home directory, e.g. /home/frank

• HOST The hostname of your system

• IFS Internal field separators, usually any white space (defaults to tab, space and

<newline>)

• LOGNAME The name you login with, e.g. frank

• PATH Paths to be searched for commands, e.g. /usr/bin:/usr/ucb:/usr/local/bin

• PS1 The primary prompt string, Bourne shell only (defaults to $)

• PS2 The secondary prompt string, Bourne shell only (defaults to >)

• SHELL The login shell you’re using, e.g. /usr/bin/bash

• TERM Your terminal type, e.g. xterm

• USER Your username, e.g. frank

Many environment variables will be set automatically when you log in. You can modify

them or define others with entries in your startup files or at anytime within the shell.

Some variables you might want to change are PATH and DISPLAY. The PATH variable

specifies the directories to be automatically searched for the command you specify.

Examples of this are in the shell startup scripts below.

We set a global environment variable with a command similar to the following for the C shell:

% setenv NAME value

and for Bourne shell:

$ NAME=value; export NAME

You can list your global environmental variables with the env or printenv commands.

You unset them with the unsetenv (C shell) or unset (Bourne shell) commands.

To set a local shell variable use the set command with the syntax below for C shell.

Without options set displays all the local variables.

% set name=value

For the Bourne shell set the variable with the syntax:

$ name=value

The current value of the variable is accessed via the “$name”, or “${name}”, notation.
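The braces matter when the variable is followed directly by characters that could be part of a name. A minimal sketch, assuming a variable called name (chosen arbitrarily) and no variable called name_old:

```shell
#!/bin/sh
# "name" is an arbitrary variable chosen for this example.
unset name_old
name=report
plain="$name.txt"          # fine: "." cannot be part of a variable name
braced="${name}_old.txt"   # braces mark where the variable name ends
unbraced="$name_old.txt"   # read as the unset variable "name_old", then ".txt"
echo "$plain"
echo "$braced"
echo "$unbraced"
```

The last line prints only ".txt", which is why the ${name} form is the safer habit in scripts.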

The Bourne Shell, sh

Sh uses the startup file .profile in your home directory. There may also be a system-wide
startup file, e.g. /etc/profile. If so, the system-wide one will be sourced (executed)
before your local one.

A simple .profile could be the following:

PATH=/usr/bin:/usr/ucb:/usr/local/bin:.   # set the PATH
export PATH                               # so that PATH is available to subshells
# Set a prompt
PS1="{`hostname` `whoami`} "              # set the prompt, default is "$"
# functions
ls() { /bin/ls -sbF "$@";}
ll() { ls -al "$@";}
# Set the terminal type
stty erase ^H                             # set Control-H to be the erase key
eval `tset -Q -s -m ':?xterm'`            # prompt for the terminal type, assume xterm
#umask 077

Whenever a # symbol is encountered the remainder of that line is treated as a

comment. In the PATH variable each directory is separated by a colon (:) and the dot (.) specifies that the current directory is in your path. If the latter is not set it’s a simple

matter to execute a program in the current directory by typing:

./program_name

It’s actually a good idea not to have dot (.) in your path, as you may inadvertently

execute a program you didn’t intend to when you cd to different directories.

A variable set in .profile is set only in the login shell unless you “export” it or source

.profile from another shell. In the above example PATH is exported to any subshells.

You can source a file with the built-in “.” command of sh, i.e.:

. ./.profile
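The difference between running a file and sourcing it can be sketched as follows. The file settings.sh and the variable COLOR are hypothetical and exist only for this example; the file is written to a scratch directory.

```shell
#!/bin/sh
# settings.sh and COLOR are hypothetical; the file is created in a scratch directory.
dir=`mktemp -d`
echo 'COLOR=blue' > "$dir/settings.sh"

unset COLOR
sh "$dir/settings.sh"            # runs in a child shell: our COLOR stays unset
after_run=${COLOR-unset}

. "$dir/settings.sh"             # sourced: the assignment happens in this shell
after_source=${COLOR-unset}

echo "after running: $after_run, after sourcing: $after_source"
rm -r "$dir"
```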

You can make your own functions. In the above example the function ll results in an “ls -al” being done on the specified files or directories.

With stty the erase character is set to Control-H (^H), which is usually the Backspace

key.

The tset command prompts for the terminal type, and assumes “xterm” if we just hit

<CR>. This command is run with the shell built-in, eval, which takes the result from the

tset command and uses it as an argument for the shell. In this case the “-s” option to

tset sets the TERM and TERMCAP variables and exports them.

The last line in the example is commented out with #. With the # removed it runs the umask
command with the option such that any files or directories you create will not have
read/write/execute permission for group and other. For further information about sh
type “man sh” at the shell prompt.

Job Control

With the C shell, csh, and many newer shells including some newer Bourne
shells, you can put jobs into the background at any time by appending “&” to the
command, as with sh. After submitting a command you can also do this by typing
^Z (Control-Z) to suspend the job and then “bg” to put it into the background. To
bring it back to the foreground type “fg”.

You can have many jobs running in the background. When they are in the background

they are no longer connected to the keyboard for input, but they may still display output

to the terminal, interspersing with whatever else is typed or displayed by your current

job. You may want to redirect I/O to or from files for the job you intend to background.

Your keyboard is connected only to the current, foreground, job.

The built-in jobs command allows you to list your background jobs. You can use the kill command to kill a background job. With the %n notation you can reference the nth

background job with either of these commands, replacing n with the job number from

the output of jobs. So kill the second background job with “kill %2” and bring the third

job to the foreground with “fg %3”.
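The %n notation is meant for interactive use. In a script the same idea can be sketched with $!, which holds the process ID of the most recent background job. The sleep command below is only an arbitrary stand-in for a long-running job.

```shell
#!/bin/sh
# Script-friendly sketch: $! holds the process ID of the most recent background job.
sleep 30 > /dev/null 2>&1 &      # "&" starts the job in the background
pid=$!
kill "$pid"                      # at an interactive prompt you could write: kill %1

# wait collects the job; its status is non-zero because the job was killed.
if wait "$pid" 2>/dev/null; then
    status=0
else
    status=$?
fi
echo "background job $pid ended with status $status"
```

Redirecting the job's output, as done here with /dev/null, keeps it from interleaving with whatever the foreground job prints.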

History

The C shell, the Korn shell and some other more advanced shells, retain information

about the former commands you’ve executed in the shell. How history is done will

depend on the shell used. Here we’ll describe the C shell history features.

You can use the history and savehist variables to set the number of previously

executed commands to keep track of in this shell and how many to retain between

logins, respectively. You could put a line such as the following in .cshrc to save the last

100 commands in this shell and the last 50 through the next login.

set history=100 savehist=50

The shell keeps track of the history list and saves it in ~/.history between logins.

You can use the built-in history command to recall previous commands, e.g. to print the

last 10:

% history 10
52 cd workshop
53 ls
54 cd unix_intro
55 ls
56 pwd
57 date
58 w
59 alias
60 history
61 history 10

You can repeat the last command by typing !!:

% !!

53 ls

54 cd unix_intro

55 ls

56 pwd

57 date

58 w

59 alias

60 history

61 history 10

62 history 10

You can repeat any numbered command by prefacing the number with a !, e.g.:

% !57

date

Tue Apr 9 09:55:31 EDT 1996

Or repeat a command starting with any string by prefacing the starting unique part of the

string with a !, e.g.:

% !da

date

Tue Apr 9 09:55:31 EDT 1996

When the shell evaluates the command line it first checks for history substitution before

it interprets anything else. Should you want to use one of these special characters in a

shell command you will need to escape, or quote it first, with a \ before the character,

i.e. \!. The history substitution characters are summarized in the following table.

C Shell History Substitution

Command            Substitution Function
!!                 repeat last command
!n                 repeat command number n
!-n                repeat command n from last
!str               repeat command that started with string str
!?str?             repeat command with str anywhere on the line
!?str?%            select the first argument that had str in it
!:                 repeat the last command, generally used with a modifier
!:n                select the nth argument from the last command (n=0 is the command name)
!:n-m              select the nth through mth arguments from the last command
!^                 select the first argument from the last command (same as !:1)
!$                 select the last argument from the last command
!*                 select all arguments to the previous command
!:n*               select the nth through last arguments from the previous command
!:n-               select the nth through next to last arguments from the previous command
^str1^str2^        replace str1 with str2 in its first occurrence in the previous command
!n:s/str1/str2/    substitute str1 with str2 in its first occurrence in the nth command; prefix the modifier with g, as in !n:gs/str1/str2/, to substitute globally

Additional editing modifiers are described in the man page.

Special Unix Features

One of the most important contributions Unix has made to Operating Systems is the

provision of many utilities for doing common tasks or obtaining desired information.

Another is the standard way in which data is stored and transmitted in Unix systems.

This allows data to be transmitted to a file, the terminal screen, or a program, or from a

file, the keyboard, or a program; always in a uniform manner. The standardized

handling of data supports two important features of Unix utilities: I/O redirection and

piping.

With output redirection, the output of a command is redirected to a file rather than to

the terminal screen. With input redirection, the input to a command is given via a file

rather than the keyboard. Other tricks are possible with input and output redirection as

well, as you will see. With piping, the output of a command can be used as input

(piped) to a subsequent command. In this chapter we discuss many of the features and

utilities available to Unix users.

File Descriptors

There are 3 standard file descriptors:

• stdin 0 Standard input to the program

• stdout 1 Standard output from the program

• stderr 2 Standard error output from the program

Normally input is from the keyboard or a file. Output, both stdout and stderr, normally goes

to the terminal, but you can redirect one or both of these to one or more files.

You can also specify additional file descriptors, designating them by a number 3 through

9, and redirect I/O through them.

File Redirection

Output redirection takes the output of a command and places it into a named file. Input

redirection reads the file as input to the command. The following table summarizes the

redirection options.

File Redirection

Symbol       Redirection
>            output redirect
>!           same as above, but overrides noclobber option of csh
>>           append output
>>!          same as above, but overrides noclobber option on csh and creates the file if it doesn't already exist
|            pipe output to another command
<            input redirection
<<String     read from standard input until "String" is encountered as the only thing on the line. Also known as a "here document" (see Chapter 8).
<<\String    same as above, but don't allow shell substitutions

An example of output redirection is:

cat file1 file2 > file3

The above command concatenates file1 then file2 and redirects (sends) the output to

file3. If file3 doesn't already exist it is created. If it does exist it will either be truncated to

zero length before the new contents are inserted, or the command will be rejected, if the

noclobber option of the csh is set. The original files, file1 and file2, remain intact as

separate entities.

Output is appended to a file in the form:

cat file1 >> file2

This command appends the contents of file1 to the end of what already exists in file2.

(Does not overwrite file2).

Input is redirected from a file in the form:

program < file

This command takes the input for program from file.

To pipe output to another command use the form:

command | command

This command makes the output of the first command the input of the second command
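The four basic forms can be tried together in a scratch directory. This is a minimal sketch; the file names simply mirror the ones used above, and the tr call only strips the padding that some versions of wc add to their counts.

```shell
#!/bin/sh
# File names mirror the text above; everything lives in a scratch directory.
dir=`mktemp -d`
cd "$dir" || exit 1

echo "first"  > file1            # > creates (or truncates) file1
echo "second" > file2
cat file1 file2 > file3          # concatenate both files into file3
cat file1 >> file2               # >> appends: file2 now has two lines

lines=`wc -l < file3 | tr -d ' '`      # < feeds file3 to wc on standard input
words=`cat file2 | wc -w | tr -d ' '`  # | feeds the output of cat to wc
echo "file3 has $lines lines; file2 has $words words"

cd / && rm -r "$dir"
```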

Sh

2> file direct stderr to file

> file 2>&1 direct both stdout and stderr to file

>> file 2>&1 append both stdout and stderr to file

2>&1 | command pipe stdout and stderr to command

To redirect stdout and stderr to two separate files you can do:

$ command 1> out_file 2> err_file

or, since the redirection defaults to stdout:

$ command > out_file 2> err_file
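To see the two streams land in different places, here is a small sketch. The function report is made up for the example and simply writes one line to each stream.

```shell
#!/bin/sh
# "report" is a made-up function that writes one line to each stream.
report() {
    echo "normal message"
    echo "error message" >&2
}

dir=`mktemp -d`
report > "$dir/out_file" 2> "$dir/err_file"   # stdout and stderr to separate files
report > "$dir/both" 2>&1                     # both streams to the same file

out=`cat "$dir/out_file"`
err=`cat "$dir/err_file"`
both_lines=`wc -l < "$dir/both" | tr -d ' '`
echo "out_file: $out"
echo "err_file: $err"
echo "both: $both_lines lines"
rm -r "$dir"
```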

With the Bourne shell you can specify other file descriptors (3 through 9) and redirect

output through them. This is done with the form:

n>&m redirect file descriptor n to file descriptor m

We used the above to send stderr (2) to the same place as stdout (1), 2>&1, when we

wanted to have error messages and normal messages to go to file instead of the

terminal. If we wanted only the error messages to go to the file we could do this by

using a placeholder file descriptor, 3. We'll first point 3 at what 2 refers to, then point 2 at
what 1 refers to, and finally point 1 at what 3 refers to:

$ (command 3>&2 2>&1 1>&3) > file

Inside the parentheses this saves the original stderr in 3, sends stderr to where stdout goes
(the file), and sends stdout to the saved stderr (the terminal). So, in effect,
we've reversed file descriptors 1 and 2 from their normal meaning. We might use this in
the following example:

$ (cat file 3>&2 2>&1 1>&3) > errfile

So if file is read its contents go to the terminal rather than to errfile, but if file can't
be read the error message is put in errfile for your later use.
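The swap can be checked without watching the terminal by catching both streams in files. In this sketch noisy is a hypothetical command that writes to both streams, and the file called screen stands in for the terminal.

```shell
#!/bin/sh
# "noisy" is a hypothetical command that writes one line to each stream.
noisy() {
    echo "to stdout"
    echo "to stderr" >&2
}

dir=`mktemp -d`
# Inside the parentheses 1 and 2 are swapped, so "> errfile" catches only
# what noisy wrote to stderr. The extra "2> screen" stands in for the
# terminal so the example can show where stdout ended up.
( noisy 3>&2 2>&1 1>&3 ) > "$dir/errfile" 2> "$dir/screen"

errfile=`cat "$dir/errfile"`
screen=`cat "$dir/screen"`
echo "errfile holds: $errfile"
echo "terminal would show: $screen"
rm -r "$dir"
```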

You can close file descriptors when you're done with them:

m<&- closes an input file descriptor

<&- closes stdin

m>&- closes an output file descriptor

>&- closes stdout

Other Special Command Symbols

In addition to file redirection symbols there are a number of other special symbols you

can use on a command line. These include:

; command separator

& run the command in the background

&& run the command following this only if the previous command completes

successfully, e.g.:

grep string file && cat file

|| run the command following only if the previous command did not complete

successfully, e.g.:

grep string file || echo "String not found."

( ) the commands within the parentheses are executed in a subshell. The output of the

subshell can be manipulated as above.

' ' literal quotation marks. Don't allow any special meaning to any characters within

these quotations.

\ escape the following character (take it literally)

" " regular quotation marks. Allow variable and command substitution within these
quotations (does not disable $ and \ within the string).

`command` take the output of this command and substitute it as an argument(s) on the
command line

# everything following until <newline> is a comment

The \ character can also be used to escape the <newline> character so that you can

continue a long command on more than one physical line of text
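A few of these symbols side by side, as a minimal Bourne shell sketch. The variable who_am_i and its value are invented for the example.

```shell
#!/bin/sh
# "who_am_i" is a made-up variable for this example.
who_am_i=student

single='$who_am_i'          # single quotes: nothing is expanded
double="$who_am_i"          # double quotes: the variable is expanded
escaped=\$who_am_i          # backslash: the $ is taken literally
words=`echo one two three | wc -w | tr -d ' '`   # backquotes: command substitution

# && runs the next command only on success, || only on failure.
true  && and_result="ran after success"
false || or_result="ran after failure"

echo "$single / $double / $escaped / $words"
echo "$and_result; $or_result"
```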

Wild Cards

The shell and some text processing programs will allow meta-characters, or wild cards, and replace them with pattern matches. For filenames these meta-characters

and their uses are:

? match any single character at the indicated position

* match any string of zero or more characters

[abc...] match any of the enclosed characters

[a-e] match any characters in the range a,b,c,d,e

[!def] match any characters not one of the enclosed characters, sh only

{abc,bcd,cde} match any set of characters separated by comma (,) (no spaces), csh

only

~ home directory of the current user, csh only

~user home directory of the specified user, csh only.
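The sh patterns can be tried safely in a scratch directory. The file names below are invented for the example, and LC_ALL=C is set only so that the matches come back in plain ASCII order.

```shell
#!/bin/sh
# The file names are hypothetical and created in a scratch directory.
LC_ALL=C; export LC_ALL       # plain ASCII ordering of the matches
dir=`mktemp -d`
cd "$dir" || exit 1
touch a.txt b.txt d.txt ab.txt notes.log

any_txt=`echo *.txt`          # * matches any string of zero or more characters
one_char=`echo ?.txt`         # ? matches exactly one character
in_range=`echo [a-c].txt`     # any one character in the range a to c
not_d=`echo [!d].txt`         # any one character except d

echo "*.txt     -> $any_txt"
echo "?.txt     -> $one_char"
echo "[a-c].txt -> $in_range"
echo "[!d].txt  -> $not_d"

cd / && rm -r "$dir"
```

Using echo with a pattern is a harmless way to preview what a wildcard will match before handing it to a command such as rm.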

Some Useful UNIX Utility Programs

In addition to the various tools built into whichever shell you use, UNIX normally has a

variety of programs to help you get your work done. These programs are often called

"tools", since you may not be able to accomplish your entire task with one, but a

collection of them will often help you achieve your goal.

Here, we'll introduce some of the more commonly used tools. Remember, these tools

are often best used when combined together using the pipe mechanism described

earlier.

For more detail, consult the man page for these commands.

tee

Tee forms a "T" fitting in pipes ( | ). It will take whatever is fed to it, copy it to a file, and
also feed the same data to its standard output. Thus you can keep a record of whatever
is flowing through some section of your pipes.

Use it as: "tee filename"

For example: ls | sort | tee sorted.list | less
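A variant of that pipeline that can be run non-interactively, as a sketch: printf supplies three made-up lines in place of ls, and wc replaces the pager.

```shell
#!/bin/sh
dir=`mktemp -d`
# sorted.list keeps a copy of what flowed through the pipe; wc still sees the same data.
count=`printf 'pear\napple\nfig\n' | sort | tee "$dir/sorted.list" | wc -l | tr -d ' '`
first=`head -n 1 "$dir/sorted.list"`
echo "wc counted $count lines; the saved copy starts with $first"
rm -r "$dir"
```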

script

Script is used to make a log file of your session. When you issue the command:

script filename

a new session will be started for you, and every character that's displayed on your
terminal (including your typing that's echoed to the screen) will go into the file. Type
"exit" or [Control-D] to end the log file.

grep

Grep is one of the classic UNIX tools. It will search through its input, and write to its

standard output any lines which contain text which matches a string you give it. This

allows you to quickly search a file, or a group of files for something.

The key to using grep is its regular expressions, which are similar to the wildcards

described above. A regular expression is a "formula" which describes what a text string

must contain in order for a "match" to occur. Here are some of the operators which

make up such a "formula":

char            - just match the single character char
string          - match an occurrence of string
.               - match (almost) ANY character (once)
[string]        - match any character in string (once)
[char1-char2]   - match any character in ASCII collating sequence between character char1 and character char2
*               - match zero or more occurrences of the preceding character or expression
^               - make the "formula" match only if it's at the start of a line
$               - make the "formula" match only if it's at the end of a line
\               - use a \ if you want to use special characters, such as "[" or "*"

To use grep, type:

grep expression filename

where expression is a regular expression as described above, and filename is a
filename, or a shell wildcarded filename.

For example:

grep '#include' *.c : list all the include lines in *.c (the quotes keep the shell from treating # as the start of a comment).

ntp.c:#include
ntp.c:#include
ntp.c:#include
test.c:/*#include "ntp.h"*/

grep '^#include' *.c : more precise way to do above.

ntp.c:#include
ntp.c:#include
ntp.c:#include

grep '^#include' *.c | less : pipe to a pager

ps -axu | grep "r.*t" | less : find anything with an "r" followed eventually by a "t"

root 112 0.0 0.0 28 0 ? I Nov 1 0:00 (nfsd)

root 53 0.0 0.0 68 0 ? IW Nov 1 0:04 portmap

root 2 0.0 0.0 0 0 ? D Nov 1 10:47 pagedaemon

dela 2213 0.0 0.0 40 0 co IW Nov 1 0:00 /usr/openwin/bin/xinit -

dela 2321 0.0 0.0 36 0 co IW Nov 1 0:00 rsh augustus.me.rocheste

Finally, grep -v will list every line except those that match the regular expression. The

following example is just like the one above, but skips entries which include "root".

ps -axu | grep "r.*t" | grep -v root | less

dela 2213 0.0 0.0 40 0 co IW Nov 1 0:00 /usr/openwin/bin/xinit -

dela 2321 0.0 0.0 36 0 co IW Nov 1 0:00 rsh augustus.me.rocheste
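The same patterns can be tried on a small file of your own. In this sketch demo.c is a made-up file written to a scratch directory, and grep -c is used to count matching lines instead of printing them.

```shell
#!/bin/sh
# demo.c is a made-up file written to a scratch directory.
dir=`mktemp -d`
cat > "$dir/demo.c" <<'EOF'
#include "ntp.h"
/*#include "old.h"*/
int main(void) { return 0; }
EOF

anywhere=`grep -c '#include' "$dir/demo.c"`    # counts both include lines
at_start=`grep -c '^#include' "$dir/demo.c"`   # only the line that starts with it
r_then_t=`grep -c 'r.*t' "$dir/demo.c"`        # an "r" followed eventually by a "t"
without=`grep -v '#include' "$dir/demo.c"`     # every line that does NOT match

echo "anywhere=$anywhere at_start=$at_start r_then_t=$r_then_t"
echo "left after -v: $without"
rm -r "$dir"
```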

diff

Diff will list the lines that are different between two files. Typically you'll use this to look

at two versions of the same file to see how it's changed. The output is somewhat

cryptic; it shows you the "ed" commands to change the first file into the second file,

followed by the affected lines from the two files.
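A minimal sketch of that output format, using two made-up three-line files that differ only in their second line. diff also reports the result through its exit status, which the if statement picks up.

```shell
#!/bin/sh
# old.txt and new.txt are hypothetical files in a scratch directory.
dir=`mktemp -d`
printf 'alpha\nbeta\ngamma\n' > "$dir/old.txt"
printf 'alpha\nBETA\ngamma\n' > "$dir/new.txt"

# diff exits with a non-zero status when the files differ.
if changes=`diff "$dir/old.txt" "$dir/new.txt"`; then
    same=yes
else
    same=no
fi
echo "$changes"
echo "identical: $same"
rm -r "$dir"
```

Here "2c2" is the ed-style command: line 2 of the first file is changed into line 2 of the second. Lines starting with < come from the first file and lines starting with > from the second.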

Invoke diff via:

diff file1 file2

For example:

diff ntp_proto.c.original ntp_proto.c

176c176

< pkt->status = sys.leap | NTPVERSION_1 | peer->hmode;

---

> pkt->status = sys.leap.year | NTPVERSION_1 | peer->hmode;

sort

Sort will sort the lines of a file. By default it will use the ASCII collating sequence (which is wrong for numbers: 101 will sort before 12). The two options of interest are "-n", which will sort numerically, and "+#", where # is the number of the word in each line (starting at zero) to sort on. Current versions of sort spell the second option "-k", which counts words starting at one, so "+3" becomes "-k 4". For example


ls -l

produces:

-rw-r--r-- 1 dela 5692 Nov 4 16:34 #slides.txt#

-rw-r--r-- 1 dela 849 Oct 30 17:15 outline.txt

-rw-r--r-- 1 dela 4872 Nov 4 16:05 slides.txt

-rw-r--r-- 1 dela 4872 Nov 4 16:12 slides.txt1

ls -l | sort -n +3 | less

produces:

-rw-r--r-- 1 dela 849 Oct 30 17:15 outline.txt

-rw-r--r-- 1 dela 4872 Nov 4 16:05 slides.txt

-rw-r--r-- 1 dela 4872 Nov 4 16:12 slides.txt1

-rw-r--r-- 1 dela 5692 Nov 4 16:34 #slides.txt#

wc

Wc will count the words in a file. It also reports how many lines and characters there are

in a file.

wc slides.txt

204 987 6244 slides.txt

head and tail

Head and tail are two programs that will show the beginning and the end of their input

respectively. They are often used in pipes as well. Both commands will take a numeric

argument that determines how many lines to show (the default is 10).

ls -l | sort -n +3 | head -2

-rw-r--r-- 1 dela 849 Oct 30 17:15 outline.txt

-rw-r--r-- 1 dela 4872 Nov 4 16:05 slides.txt

less

Less is a pager which you use just like more, but it's better. Like more, less will page through your document when you hit [space], but unlike more you can page backwards by hitting the "b" key. Also, "g" will move you to the start of the file and "G" will move you to the end of the file. Hit the "h" key when running less for help.


Chapter-2 EDITORS

Introduction

The editor is the basic tool used to create or modify text files under any computer

system. The UNIX system provides several standard editors. One of the most popular of

these is vi.

Vi is useful because it is a screen editor. The file being edited is displayed on screen.

The user can move the cursor (a pointer) around the text. Any changes made to the text

are displayed immediately on the screen.

Most general editing commands are available under vi. The cursor is used to navigate

around a file. Text may be inserted, deleted, or changed, and a range of powerful

pattern matching commands allow large global edits to be performed.

The best way to learn vi is to use it. It isn't necessary to learn all the commands at once.

Indeed this handout describes just a subset of commands available. Users will find that

they develop their own `working set' of commands. These will suit their own type of

work. Occasional glances back at this handout will prove useful, as you may discover

new useful commands and add these to your working set.

Starting vi

The vi editor can be invoked with one of the following command lines, as well as a few

others that are only needed for more advanced users:

Open file under vi:

vi file

Open file at line n:

vi +n file

Open file at first occurrence of pattern:

vi +/pattern file

NOTE: If you start vi with a non-existent filename, vi will start with an empty buffer and create the file when you save.


Modes

The vi editor has the following three modes of operation:

• Command mode, in which keystrokes are interpreted as vi editor subcommands

to be carried out immediately without being displayed.

In command mode numbers act as command modifiers. Entering a number n

before a command means that the command is to be acted upon n times. For

example, the command x deletes 1 character, 5x deletes 5 characters.

• Text input mode, in which a keystroke is interpreted as text to be displayed as it

is added to the file.

• Last line mode, in which all keystrokes, until the enter or return key is pressed,

form a subcommand that appears at the bottom of the screen as you type.

Navigating in vi

Some of the following commands will vary depending on the terminal emulator you are

using. The following commands and control sequences will let you move around in the

editor:

Left, Down, Up, Right (respectively):

h, j, k, l

You can also use the arrow keys.

Move forward one character:

spacebar

Scroll forward or backward one screen (respectively):

^f (Ctrl-f), ^b (Ctrl-b)

You can also use Page-up and Page-down.

Move to the beginning or the end of the file:

1G, G

Move to line number n:

nG


Move to the beginning of the current line:

0

Move to the first non-blank character:

^

Move to the end of the current line:

$

Redraw screen:

^l (Ctrl-l)

Inserting and Editing Text

Insert mode can be started through one of the following commands:

Append after cursor:

a

Append at end of line:

A

Insert before cursor:

i

Insert at beginning of line:

I

Open a line below current line:

o (this is a lower-case ooh)

Open a line above current line:

O (this is an upper-case ooh)

Terminate insert mode:

ESC

Searching

The vi editor can also be used simply for finding data in a file. The use of searches can simplify this process. The more common search commands are as follows:

Search forward for text:

/text

Repeat previous search:

n

Repeat previous search in opposite direction:

N

Repeat forward search:

/

Deleting and moving text

Eventually, you will need to delete and/or copy and move text. The next few commands

will help you in doing so:

Delete line:

dd

Delete n lines:

ndd

Copy line to a buffer:

yy

Paste current buffer contents back into the text:

p

Marking and copying a section:

1. Move the cursor to the beginning of the section you want to copy.

2. type ma (where a is any letter from a to z)

3. move the cursor to the end of the section you want to copy

4. type y'a (where a is the letter you previously used as a marker)

5. then use p to paste it at your target destination.

Marking and deleting a section

1. move cursor to the beginning of the section you want to delete


2. type ma (where a is any letter from a to z)

3. move the cursor to the end of the section you want to delete

4. type d'a (where a is the letter you previously used as a marker)

Saving and Exiting the vi editor

Now that you have finished editing the file you are working with, you can save and quit, or abort any work you may have done on the file. The next list of commands will accomplish this:

Quit vi, saving all changes:

ZZ or :x or :wq

Write to file (save):

:w

Write to file (save as):

:w file

Quit file:

:q

Quit file and abort changes since last save:

:q!

Edit file2 without leaving vi:

:e file2

Entering ex Commands in Last Line Mode

The vi editor allows you to enter ex editor commands. You use ex commands from

command mode, by entering ":" followed by the ex command.

Examples:

Global replace of <search_string> with <replacement_string>:

:<line_range>s/<search_string>/<replacement_string>/g

Global deletion of all lines containing <search_string>:

:<line_range>g/<search_string>/d

<line_range>


The range specifies an area of the file on which to perform a specific command.

Following are a few of the common line_range formats:

3,5 (from line 3 to 5)

1,45 (from the beginning to line 45)

45,$ (from line 45 to the end)

% (entire file)

/pattern/ for example: /abc/ (the one line containing the next occurrence of the pattern)

Here are just a few commands that the ex editor provides:

s - substitutes one string for another

g - globally performs a command

d - deletes a line

<search_string>

one or a series of characters that form a pattern, called a regular expression.

Letters (e.g. A-Z or a-z) and numbers, in the search string have to be matched by

the same letter or number in the text. Some special symbols that have special

meaning are:

^ - beginning of the line

$ - end of the line

* - any number of the preceding characters

. - any single character

Regular expressions can be used in both a pattern in line_range and as a

search_string.

Example 1: :%s/rate/value/g

Changes all occurrences of rate to value in the file being edited.

% says to apply the command to all lines in the file.

s is the substitute command.

g says to apply the substitution globally to all


occurrences on the same line.

Example 2: :%g/^.*0.*/d

Deletes all lines which contain a 0.

% says to apply the command to the entire file.

g says it's a global command, applied to every line that matches the pattern.

^ is the beginning of the line symbol.

. is the any single character symbol.

* is the any number of the preceding character symbol.

^.*0.* matches all lines in the file that contain any number of

characters starting at the beginning of the line, a zero,

and then any number of additional characters.

d says to delete the lines.

Command Summary

starting vi

vi filename edit a file named "filename"

vi newfile create a new file named "newfile"

entering text

i insert text left of cursor

a append text right of cursor

I insert text at the beginning of the line

A insert text at the end of the line

moving the cursor

h, (left arrow) left one space

j, (down arrow), + down one line

k, (up arrow), - up one line

l, (right arrow) right one space

0 (zero) to beginning of line

^ to first non-blank character


$ to end of line

H to top line of screen

M to middle line of screen

L to last line of screen

w forward word by word

b backward word by word

basic editing

x delete character

nx delete n characters

X delete character before cursor

dw delete word

ndw delete n words

dd delete line

ndd delete n lines

D delete characters from cursor to end of line

r replace the character the cursor is under

R replace characters until ESC is pressed

cw replace a word

ncw replace n words

C change text from cursor to end of line

cc Change the current line

o insert blank line below the line the cursor is on

(ready for insertion)

O insert blank line above the line the cursor is on

(ready for insertion)

J join succeeding line to current cursor line

nJ join n succeeding lines to current cursor line

. repeat the last command


u undo the last command

U undo the last change on the line

ns replace n characters

moving around in the file

^f, Pagedown scroll forward one screen

^b, Pageup scroll backward one screen

z+ scroll forward one screen

z^ scroll backward one screen

^d scroll down one-half screen

^u scroll up one-half screen

/string forward search for string

?string backward search for string

n repeat last search in same direction

N repeat last search in opposite direction

G to last line of file

1G to first line of file

nG to the nth line

: start entering an ex command

closing and saving a file

ZZ save file and then quit

:w save file

:wq save file and then quit

:q! discard changes and quit file

:q quit

For more information about the vi editor, see the vi man page


Chapter-3 UNIX file system

A file system is a logical method for organising and storing large amounts of information in a way that makes it easy to manage. The file is the smallest unit in which information is stored. The UNIX file system has several important features.

• Different types of file

• Structure of the file system

• Your home directory

• Your current directory

• Pathnames

• Access permissions

Different types of file

To the user, it appears as though there is only one type of file in UNIX - the file which is

used to hold your information. In fact, the UNIX filesystem contains several types of file.

• Ordinary files

• Directories

• Special files

• Pipes

Ordinary files

This type of file is used to store your information, such as some text you have written or

an image you have drawn. This is the type of file that you usually work with.

Files which you create belong to you - you are said to "own" them - and you can set

access permissions to control which other users can have access to them. Any file is

always contained within a directory.

Directories

A directory is a file that holds other files and other directories. You can create directories

in your home directory to hold files and other sub-directories.


Having your own directory structure gives you a definable place to work from and allows

you to structure your information in a way that makes best sense to you.

Directories which you create belong to you - you are said to "own" them - and you can

set access permissions to control which other users can have access to the information

they contain.

Special files

This type of file is used to represent a real physical device such as a printer, tape drive

or terminal.

It may seem unusual to think of a physical device as a file, but it allows you to send the

output of a command to a device in the same way that you send it to a file. For example:

cat scream.au > /dev/audio

This sends the contents of the sound file scream.au to the file /dev/audio which

represents the audio device attached to the system.

The directory /dev contains the special files which are used to represent devices on a

UNIX system.

Pipes

UNIX allows you to link commands together using a pipe.

The pipe acts as a temporary file which only exists to hold data from one command until it is read by another.

Structure of the file system

The UNIX file system is organised as a hierarchy of directories starting from a single

directory called root which is represented by a / (slash). Imagine it as being similar to

the root system of a plant or as an inverted tree structure.

Immediately below the root directory are several system directories that contain

information required by the operating system. The file holding the UNIX kernel is also

here.

• UNIX system directories


• Home directory

• Pathnames

UNIX system directories

The standard system directories are shown below. Each one contains specific types of file. The details may vary between different UNIX systems, but these directories should be common to all.

                           / (root)
                              |
   --------------------------------------------------------
   |      |      |      |       |      |      |           |
  /bin   /dev   /etc   /home   /lib   /tmp   /usr    kernel file

/bin: This directory contains the commands and utilities that you use day to day. These are executable binary files - hence the directory name bin.

Often in modern UNIX systems this directory is simply a link to /usr/bin.

/dev: This directory contains special files used to represent real physical devices such

as printers and terminals. One of these files represents a null (non-existent) device.

/etc: This directory contains various commands and files which are used for system

administration. One of these files - motd - contains a 'message of the day' which is

displayed whenever you login to the system.

/home: This directory contains a home directory for each user of the system.

/lib: This directory contains libraries that are used by various programs and languages.

Often in modern UNIX systems this directory is simply a link to /usr/lib.

/tmp: This directory acts as a "scratch" area in which any user can store files on a

temporary basis

/usr: This directory contains system files and directories that you share with other users.

Application programs, on-line manual pages, and language dictionaries typically reside

here.

kernel file: As its name implies, the kernel is at the core of each UNIX system and is loaded in whenever the system is started up - referred to as a boot of the system.

It manages the entire resources of the system, presenting them to you and every other user as a coherent system. You do not need to know anything about the kernel in order to use a UNIX system; the details here are given as background only.

Amongst the functions performed by the kernel are:

• managing the machine's memory and allocating it to each process.

• scheduling the work done by the CPU so that the work of each user is carried out

as efficiently as is possible.

• organising the transfer of data from one part of the machine to another.

• accepting instructions from the shell and carrying them out.

• enforcing the access permissions that are in force on the file system.

Home Directory

Any UNIX system can have many users on it at any one time. As a user you are given a home directory in which you are placed whenever you log on to the system.

Users' home directories are usually grouped together under a system directory such as /home. A large UNIX system may have several hundred users, with their home directories grouped in subdirectories according to some schema such as their organisational department.

Pathnames

Every file and directory in the file system can be identified by a complete list of the names of the directories that are on the route from the root directory to that file or directory. This list is its absolute pathname.

Each directory name on the route is separated by a / (forward slash). For example:

/usr/local/bin/ue

This gives the full pathname starting at the root directory and going down through the

directories usr, local and bin to the file ue


Relative pathnames

You can define a file or directory by its location in relation to your current directory. The pathname is given as a / (slash) separated list of the directories on the route to the file (or directory) from your current directory.

A .. (dot dot) is used to represent the directory immediately above the current directory.

In all shells except the Bourne shell, the ~ (tilde) character can be used as shorthand for

the full pathname to your home directory.

UNIX access permissions

Each UNIX file has a protection mode held in a short integer (2 bytes).

Symbolically labeled bits are set as follows

Bit Set Description

b block special file (d and c bits set)

c character special file

d directory

r read permission granted

w write permission granted

x execute permission granted

s set user id on execution

S set group id on execution

File System Model


Installations of UNIX have several physical file systems on

• different discs

• different partitions of same disc

Physical file systems never span disc partitions.

A file system is a sequence of fixed-size blocks of bytes, of either 512, 1024, 2048, ... bytes. Pointers link blocks together into ordered chains.

Block size is a trade-off between performance and storage efficiency.

Block Size    Advantage

larger        higher transfer rate between disc and RAM

smaller       higher effective storage capacity

Components of file system

The word inode is short for index node.

Files may have holes in them, which are

• created by moving pointer past file end and writing data

• interpreted as zero valued bytes

Component      Description

boot block     start of file system, typically the first sector; initialisation code to boot UNIX, possibly empty

super block    state of file system - size, file capacity, free space

inodes         kernel indexes into inode list (includes root inode)

data block     file and administrative data (no shared blocks)

Boot Block

The boot block is the beginning of a file system, typically the first sector, and may contain the bootstrap code that is read into the machine to boot, or initialize, the operating system. Although only one boot block is needed to boot the system, every file system has a (possibly empty) boot block.

Super Block

The super block describes the state of a file system. It consists of the following fields:

1. Size of the file system

2. No. of free blocks in the file system

3. A list of free blocks available on the file system

4. Index of the next free block in the free block list

5. Size of the I- node list

6. No. of free I-nodes in the file system.

7. Index of the next free I-node in the file system

Inode

Each I-node consists of the following information

1. File ownership

2. File type

3. File access permissions

4. Time of last I-node (status) change

5. Modification time

6. Time of last access

7. Number of links to the file, representing the number of names the file has

8. File size

9. Array of 13 pointers to the file's data blocks

Process file descriptors and disc blocks are linked as above.

A file has one inode but may have several names or links that are either

• Hard

• Soft

A soft or symbolic link to a file is

• implemented as a file containing an absolute or relative pathname

• interpreted at access time, and need not succeed in referring to an existing file

• interpreted relative to the link name's directory if it holds a relative name

The open() system call applied to a link follows the link to its target.

The stat() system call also follows the link and reports the target's status; the lstat() system call reports the status of the link file itself.

Symbolic links, unlike hard links, can refer to

• files on other file systems

• directories

• other symbolic links in a loop
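The difference between examining a link and examining its target can be sketched in C. This is a minimal illustration, assuming the POSIX lstat() and stat() calls; is_symlink and target_exists are hypothetical helper names.

```c
#include <assert.h>
#include <sys/stat.h>
#include <unistd.h>

/* Return 1 if `path` itself is a symbolic link, 0 if it is not,
 * and -1 if it cannot be examined. lstat() does not follow the link. */
int is_symlink(const char *path)
{
    struct stat st;

    assert(path != NULL);
    if (lstat(path, &st) == -1)
        return -1;
    return S_ISLNK(st.st_mode) ? 1 : 0;
}

/* Return 1 if following `path` reaches an existing file or directory,
 * 0 otherwise. stat() follows symbolic links, so a link whose target
 * is missing (a dangling link) gives 0. */
int target_exists(const char *path)
{
    struct stat st;

    assert(path != NULL);
    return stat(path, &st) == 0 ? 1 : 0;
}
```

A link created with symlink() to a pathname that does not exist is still reported as a link by is_symlink, while target_exists reports 0 for it. This is the sense in which a symbolic link need not succeed in referring.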

File Updates

Updates of files are converted to updates of disc sectors.

Modification of data in a logical block is done by

• allocating a system buffer

• determining the location of the physical block on disc

• reading the physical block into the system buffer

• altering part of the buffer contents with the user buffer contents

• writing the block back to disc

Data blocks

Pure data is stored in the data blocks, which commence from the point where the I-node list ends. An allocated data block can belong to one and only one file in the file system.


Block addressing Scheme

There are 13 entries in the I-node table containing the addresses of up to 13 disk blocks. The first 10 addresses are simple. They contain the disk addresses of the first 10 blocks of the file. However, reserving space for 10 addresses in the I-node table doesn't mean that 10 disk blocks are automatically allocated. If a file is only 3 blocks long, the first 3 entries in the table contain the disk block numbers and the remaining entries are filled with zeros.

As the file grows beyond 10 blocks, the eleventh entry is used to point to a disk block which contains the addresses of the next 341 data blocks (assuming the block size is 1024 bytes and each address takes 3 bytes: 1024/3 = 341). This block is called the Single indirect block. With these eleven pointers, the maximum size of the file becomes 10k + 341k.

When the file grows beyond this size, the twelfth entry, known as the Double indirect block, is used. This entry contains the address of a block which contains the addresses of 341 single indirect blocks. This enables us to reference 10k + 341k + (341 * 341k) of data.

Finally, if the file size exceeds this, which is very unlikely, the thirteenth pointer, known as the Triple indirect block, raises the maximum possible file size to 10k + 341k + (341 * 341k) + (341 * 341 * 341k). The organization of the data blocks used by the file is depicted below.

[Figure: the I-node holds the entries Direct 0 to Direct 9, single indirect, double indirect and triple indirect, each leading to the data blocks]
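The arithmetic of the block addressing scheme can be checked with a few lines of C. This is a minimal sketch under the same assumptions as the text (1024-byte blocks, 3-byte addresses, 10 direct entries); the function names are hypothetical.

```c
#include <assert.h>

/* Number of block addresses that fit in one indirect block. */
unsigned long long addrs_per_block(unsigned block_size, unsigned addr_size)
{
    assert(addr_size > 0);
    return block_size / addr_size;
}

/* Largest number of data blocks reachable from one I-node that has
 * `ndirect` direct entries plus one single, one double and one triple
 * indirect entry. */
unsigned long long max_data_blocks(unsigned block_size, unsigned addr_size,
                                   unsigned ndirect)
{
    unsigned long long n = addrs_per_block(block_size, addr_size);

    return ndirect      /* direct blocks                     */
         + n            /* via the single indirect block     */
         + n * n        /* via the double indirect block     */
         + n * n * n;   /* via the triple indirect block     */
}
```

With the figures used above, max_data_blocks(1024, 3, 10) gives 39,768,453 blocks of 1 KB each, which is roughly 38 GB.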


File Related System Calls

open

A file is opened by

#include<fcntl.h>

int open ( char * pathname, int oflag, [int mode ]);

Returns a file descriptor if successful, -1 on error.

The pathname is the name of the file to be opened. The oflag argument is formed by or'ing together symbolic constants from the list below:

O_RDONLY Open for reading only

O_WRONLY Open for writing only

O_RDWR Open for reading and writing

O_NDELAY Do not block on open or read or write

O_APPEND Append to the end of file on each write

O_CREAT Create the file if it doesn't exist

O_TRUNC If the file exists, truncate its length to zero

O_EXCL Error if O_CREAT and the file already exists

The third argument is used only if a new file is being created.

creat

A new file can be created by

#include<fcntl.h>

int creat ( char * pathname, int mode );

Returns a file descriptor if successful, -1 on error.

This call is equivalent to opening a file with O_CREAT|O_WRONLY|O_TRUNC mode using the open system call.
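The equivalence between creat and open, and the effect of O_EXCL, can be sketched as follows. This is a minimal illustration, assuming a POSIX system; create_for_writing and create_exclusive are hypothetical helper names.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create `path`, or truncate it to zero length if it already exists,
 * and open it for writing. This behaves like creat(path, mode).
 * Returns a file descriptor, or -1 on error. */
int create_for_writing(const char *path, mode_t mode)
{
    assert(path != NULL);
    return open(path, O_CREAT | O_WRONLY | O_TRUNC, mode);
}

/* Create `path` only if it does not exist yet.
 * Returns a file descriptor, or -1 if the file is already there
 * (or on any other error). */
int create_exclusive(const char *path, mode_t mode)
{
    assert(path != NULL);
    return open(path, O_CREAT | O_EXCL | O_WRONLY, mode);
}
```

A typical mode argument is 0644, which asks for read and write permission for the owner and read permission for everyone else.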

close

An open file is closed by

#include<unistd.h>

int close ( int filedes);

Returns 0 if successful, -1 on error.

When a process terminates, the kernel closes all its open files automatically.

read

Data is read from an open file using

#include<unistd.h>

int read ( int filedes, char * buf, int nbytes);

Returns the no. of bytes read if successful (this can be less than the nbytes that was requested), 0 if there are no bytes to be read, or -1 on error.

There are several cases in which the no. of bytes actually read is less than the amount requested:

• When reading from a regular file, if fewer bytes remain before the end of the file than were requested.

• When reading from a terminal device, normally up to one line is read at a time.

The read operation starts at the file's offset. Before a successful return, the offset is incremented by the number of bytes actually read.


write

Data is written into an open file using

#include<unistd.h>

int write ( int filedes, char * buf, int nbytes);

Returns the no. of bytes written if successful (this can be less than the nbytes that was requested), 0 if there is no space to write, or -1 on error.
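Because read and write may transfer fewer bytes than requested, programs usually call them in a loop. The following is a minimal sketch of that idea, assuming a POSIX system; write_all and copy_fd are hypothetical helper names, and the 4096-byte buffer size is an arbitrary choice.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Write all `n` bytes of `buf`, retrying after short writes.
 * Returns 0 on success, -1 on error. */
int write_all(int fd, const char *buf, size_t n)
{
    assert(buf != NULL);
    while (n > 0) {
        ssize_t w = write(fd, buf, n);
        if (w == -1) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            return -1;
        }
        buf += w;
        n -= (size_t)w;
    }
    return 0;
}

/* Copy everything that can be read from `in` to `out`.
 * Returns the number of bytes copied, or -1 on error. */
long copy_fd(int in, int out)
{
    char buf[4096];
    long total = 0;

    for (;;) {
        ssize_t r = read(in, buf, sizeof buf);
        if (r == 0)
            break;                  /* 0 means end of file */
        if (r == -1) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (write_all(out, buf, (size_t)r) == -1)
            return -1;
        total += r;
    }
    return total;
}
```

Calling copy_fd(0, 1) copies standard input to standard output, much as cat does when it is given no arguments.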

lseek

Every open file has a current byte position associated with it. This is measured as the number of bytes from the start of the file. The creat system call sets the file's position to the beginning of the file, as does the open system call, unless O_APPEND is set. The read and write system calls update the file's position by the number of bytes read or written. Before read or write, an open file can be positioned using

#include<unistd.h>

long lseek ( int filedes, long offset, int whence);

Returns the new long integer byte offset of the file or -1 on error.

The offset and whence arguments are interpreted as follows:

If whence is 0 (SEEK_SET), the file's position is set to offset bytes from the beginning of the file.

If whence is 1 (SEEK_CUR), the file's position is set to its current position plus the offset. The offset can be positive or negative.

If whence is 2 (SEEK_END), the file's position is set to the size of the file plus offset. The offset can be positive or negative.
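The three whence values can be seen at work in the sketch below, which also shows how writing past the end of a file leaves a hole that reads back as zero bytes. It assumes a POSIX system; file_size and write_past_end are hypothetical helper names.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Return the size of the open file in bytes, or -1 on error,
 * leaving the current position where it was. */
off_t file_size(int fd)
{
    off_t here = lseek(fd, 0, SEEK_CUR);    /* whence 1: where are we? */
    off_t end;

    if (here == (off_t)-1)
        return -1;
    end = lseek(fd, 0, SEEK_END);           /* whence 2: size plus 0   */
    if (end == (off_t)-1)
        return -1;
    if (lseek(fd, here, SEEK_SET) == (off_t)-1)   /* whence 0: go back */
        return -1;
    return end;
}

/* Write one byte `gap` bytes beyond the current end of the file.
 * The skipped range becomes a hole. Returns 0 on success, -1 on error. */
int write_past_end(int fd, off_t gap, char byte)
{
    assert(gap >= 0);
    if (lseek(fd, gap, SEEK_END) == (off_t)-1)
        return -1;
    return write(fd, &byte, 1) == 1 ? 0 : -1;
}
```

If a file holds 2 bytes, write_past_end(fd, 100, 'z') makes it 103 bytes long, and reading any byte in the skipped range returns zero.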


link

The link system call adds a new link to a directory. Every time a new file is created, a pointer is placed in a directory. This pointer associates a filename with a place on the disk. The link system call creates an additional pointer to an existing file. It does not make another copy of the file. Because there is only one file, the file status information is the same whichever name is used.

#include<unistd.h>

int link ( char * oldpath, char * newpath);

Returns 0 on success or -1 on error.

The first parameter, oldpath, must be an existing link. The second parameter, newpath, indicates the name of the new link.

unlink

The unlink system call removes a specified link from the directory, reducing the link count in the I-node by one. If the resulting link count is zero, the file system will discard the file. All disk space that it used will be made available for reuse. The I-node will become available for reuse too.

#include<unistd.h>

int unlink ( char * path );

Returns 0 on success or -1 on error.
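The effect of link and unlink on the link count can be observed with stat. The following is a minimal sketch, assuming a POSIX system and a file system that supports hard links; touch_file, link_count and add_link are hypothetical helper names.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create an empty file (or leave an existing one alone). 0 or -1. */
int touch_file(const char *path)
{
    int fd;

    assert(path != NULL);
    fd = open(path, O_CREAT | O_WRONLY, 0644);
    if (fd == -1)
        return -1;
    return close(fd);
}

/* Return the link count stored in the I-node of `path`,
 * or -1 if the file cannot be examined. */
long link_count(const char *path)
{
    struct stat st;

    assert(path != NULL);
    if (stat(path, &st) == -1)
        return -1;
    return (long)st.st_nlink;
}

/* Give the file `oldpath` a second name `newpath` and report the
 * resulting link count, or -1 on error. */
long add_link(const char *oldpath, const char *newpath)
{
    assert(oldpath != NULL && newpath != NULL);
    if (link(oldpath, newpath) == -1)
        return -1;
    return link_count(newpath);
}
```

After add_link the count is 2 under either name. Calling unlink on one name brings it back to 1, and the data stays reachable through the other name until that is unlinked as well.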


chmod

This system call allows us to change the access permissions for an existing file.

#include<sys/stat.h>

int chmod ( char * path, int mode );

Returns 0 on success or -1 on error.

chown

This system call allows us to change the ownership for an existing file, i.e., it allows us to change the User Id and Group Id of the file.

#include<unistd.h>

int chown ( char * path, int owner, int group);

Returns 0 on success or -1 on error.

fcntl

The fcntl system call is used to change the properties of a file that is already open.

#include<fcntl.h>

int fcntl ( int filedes, int cmd, int arg );

Returns a value that depends on cmd if successful, or -1 on error.

The cmd argument must be one of the following:

F_DUPFD Duplicate the file descriptor filedes. The value of arg specifies the lowest number that the new file descriptor may take. The return value is the new file descriptor.

F_SETFD Set the close-on-exec flag for the file to the low-order bit of arg. If the low-order bit of arg is set, the file is closed on an exec system call. Otherwise the file remains open across an exec.

F_GETFD Return the close-on-exec flag for the file as the value of the system call.

F_SETFL Set the status flags for this file to the value of arg. The only flags that can be changed are O_APPEND and O_NDELAY.

F_GETFL Return the status flags for this file as the value of the system call.
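The fcntl commands are normally used in get-then-set pairs so that flags which are already set are not lost. The following is a minimal sketch, assuming a POSIX system and the FD_CLOEXEC constant from <fcntl.h>; set_cloexec, set_append and dup_at_least are hypothetical helper names.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Turn the close-on-exec flag on (on != 0) or off. Returns 0 or -1. */
int set_cloexec(int fd, int on)
{
    int flags = fcntl(fd, F_GETFD);

    if (flags == -1)
        return -1;
    if (on)
        flags |= FD_CLOEXEC;
    else
        flags &= ~FD_CLOEXEC;
    return fcntl(fd, F_SETFD, flags) == -1 ? -1 : 0;
}

/* Add O_APPEND to the status flags of an open file. Returns 0 or -1. */
int set_append(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_APPEND) == -1 ? -1 : 0;
}

/* Duplicate `fd` onto the lowest free descriptor that is >= min_fd.
 * Returns the new descriptor, or -1 on error. */
int dup_at_least(int fd, int min_fd)
{
    assert(min_fd >= 0);
    return fcntl(fd, F_DUPFD, min_fd);
}
```

For example, dup_at_least(fd, 20) returns a second descriptor for the same open file, numbered 20 or higher.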

stat and fstat

The stat and fstat system calls return the attributes of a specified file to the caller.

#include<sys/types.h>

#include<sys/stat.h>

int stat ( char * pathname, struct stat * buf);

int fstat ( int filedes, struct stat * buf );

Returns 0 on success or -1 on error.

stat and fstat are used to get status information from an I-node. stat takes a path and finds the I-node by following it. fstat takes an open file descriptor and finds the I-node from the active I-node table. The information is returned in the user-supplied stat structure, which is defined in /usr/include/sys/stat.h.

The stat structure used in these system calls is given below:

struct stat {

ushort st_mode; /* file type and access permissions */

ino_t st_ino; /* I-node number */

dev_t st_dev; /* Id of the device containing a directory entry for this file */

short st_nlink; /* number of links */

ushort st_uid; /* User Id */

ushort st_gid; /* Group Id */

dev_t st_rdev; /* Id for the device, for char special or block special files */

off_t st_size; /* file size in bytes */

time_t st_atime; /* time of last file access */

time_t st_mtime; /* time of last file modification */

time_t st_ctime; /* time of last file status change */

};
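As an illustration of how the st_mode field is used, the sketch below reports the type of a file either by pathname (stat) or by open descriptor (fstat). It assumes a POSIX system and the S_IS* macros from <sys/stat.h>; type_letter, path_type and fd_type are hypothetical helper names.

```c
#include <assert.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Map the file type held in st_mode to a one-letter code. */
static char type_letter(mode_t mode)
{
    if (S_ISREG(mode))  return '-';   /* ordinary file          */
    if (S_ISDIR(mode))  return 'd';   /* directory              */
    if (S_ISCHR(mode))  return 'c';   /* character special file */
    if (S_ISBLK(mode))  return 'b';   /* block special file     */
    if (S_ISFIFO(mode)) return 'p';   /* pipe                   */
    return '?';
}

/* Type of the file named by `path`, or 0 if stat fails. */
char path_type(const char *path)
{
    struct stat st;

    assert(path != NULL);
    if (stat(path, &st) == -1)
        return 0;
    return type_letter(st.st_mode);
}

/* Type of the file behind an open descriptor, or 0 if fstat fails. */
char fd_type(int fd)
{
    struct stat st;

    if (fstat(fd, &st) == -1)
        return 0;
    return type_letter(st.st_mode);
}
```

The other fields of the structure, such as st_size and st_nlink, are read from the same struct stat in the same way.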


Chapter-3 Process management

UNIX Process

A process is an instance of running a program. If, for example, three people are running the same program simultaneously, there are three processes there, not just one. In fact, we might have more than one process running even with only one person executing the program, because (you will see later) the program can "split into two", making two processes out of one.
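The "split into two" is done with the fork system call. The following is a minimal sketch, assuming a POSIX system; fork_and_wait is a hypothetical helper name. The child process here does nothing except exit with the given code, and the parent waits for it.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Split the calling process in two. The child exits at once with
 * `code`; the parent waits for it and returns the child's exit
 * status, or -1 if fork() or waitpid() fails. */
int fork_and_wait(int code)
{
    pid_t pid;
    int status;

    assert(code >= 0 && code <= 255);

    pid = fork();
    if (pid == -1)
        return -1;              /* no new process was created    */
    if (pid == 0)
        _exit(code);            /* this branch runs in the child */

    /* This branch runs in the parent. */
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Between the child's exit and the parent's waitpid call the child is in the zombie state.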

UNIX Process State Transition Diagram


Process states & transitions

The lifetime of a process can be conceptually divided into a set of states.

1. The process is executing in user mode.

2. The process is executing in kernel mode.

3. The process is not executing, but is ready to run as soon as the kernel schedules it.

4. The process is sleeping and resides in main memory.

5. The process is ready to run, but the swapper process has swapped it out of memory.

6. The process is sleeping, and the swapper has swapped out the process.

7. The process is returning from the kernel to user mode, but the kernel preempts it.

8. The process is newly created and is in a transition state, i.e. the process is neither sleeping nor ready to run. This is the start state of a process.

9. The process executed the exit system call and is in the zombie state.

Process Data structures

[Figure: process data structures, showing the process table, the u-area, the per process region table and main memory]

Process table entry /u-area

Layout of system memory

• A process on the UNIX system consists of three logical sections: text, data and stack.

• The text section contains the machine-executable instructions of a process; addresses
in the text section include text, data and stack addresses.

• The compiler generates addresses for a virtual address space with a given address range,
and the machine's memory management unit translates the virtual addresses generated by
the compiler into address locations in physical memory.

• The subsystems of the kernel and the hardware that cooperate to translate virtual to
physical addresses comprise the memory management subsystem.

Fields of a process table entry:

• pointer to u-area
• state of process
• size of process
• UIDs
• PIDs
• event descriptor
• scheduling parameters
• enumeration of signals
• times used (user/system)
• alarm

Fields of the u-area:

• pointer to process table entry
• real UID
• effective UID
• array of reactions to signals
• login terminal
• errors of system call
• return value of system call
• I/O parameters
• times used (user/system)
• current directory
• current root
• user file descriptors
• file size limit
• process size limit
• pointer to dynamic stack

Regions

• The kernel divides the virtual address space of a process into logical regions.

• The concept of regions is independent of the memory management policies
implemented by the operating system.

• A region is a contiguous area of the virtual address space of a process that can be
treated as a distinct object to be shared or protected.

• Each region table entry contains the information needed to determine where the contents
of the region are located in physical memory.

• Each process has a per process region table. Each entry in it (a pregion entry) has three fields:

1. a pointer to a region table entry,

2. the starting virtual address of the region, and

3. a permission field that indicates the type of access allowed to the process.

Processes & regions

[Figure: per process region tables (virtual addresses) of two processes, each with text, data and stack entries, pointing to regions a to e; the sizes shown in the figure are 8k, 16k, 32k, 4k, 8k and 32k]

Memory triplets

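Memory is organised in pages of 1K bytes, accessed via page tables. The system
contains a set of memory management register triples. Each triple holds:

1. the address of a page table in physical memory,

2. the first virtual address mapped via the triple, and

3. control information, such as the number of pages in the page table and the page access permissions.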

Context of a process

The context of a process

• consists of the contents of its (user) address space, the contents of hardware registers and
the kernel data structures that relate to the process;

• is the union of its user-level context, register context and system-level context;

• has a system-level context that consists of a static portion and a dynamic portion.

[Figure: Components of the Context of a Process. User-level context: process text, data, stack and shared data. Static part of the system-level context: process table entry, u-area and per process region table. Dynamic part of the system-level context: kernel context layer 0 (user level) and layers 1 to 3, each layer holding the kernel stack for that layer and the saved register context of the layer below it, with a logical pointer to the current context layer.]

How the Kernel Manages Processes in Unix

Address Space: For each new process created, the kernel sets up an address space

in memory. This address space consists of the following logical segments:

• text - contains the program's instructions.

• data - contains initialized program variables.

• bss - contains uninitialized program variables.

• stack - a dynamically growable segment, it contains variables allocated locally

and parameters passed to functions in the program.

Each process has two stacks: a user stack and a kernel stack. These stacks are used

when the process executes in the user or kernel mode (described below).

Mode Switching: At least two different modes of operation are used by the Unix kernel

- a more privileged kernel mode, and a less privileged user mode. This is done to

protect some parts of the address space from user mode access.

User Mode: Processes, created directly by the users, whose instructions are currently

executing in the CPU are considered to be operating in the user-mode. Processes

running in the user mode do not have access to code and data for other users or to

other areas of address space protected by the kernel from user mode access.

Kernel Mode: Processes carrying out kernel instructions are said to be running in
kernel mode. A user process can be in kernel mode while making a system call,
while generating an exception or fault, or when an interrupt occurs. Essentially, a mode
switch occurs and control is transferred to the kernel when a user program makes a
system call. The kernel then executes the instructions on the user's behalf.

While in the kernel-mode, a process has full privileges and may access the code and

data of any process (in other words, the kernel can see the entire address space of any

process).


The Context of a Process and Context Switching: The context of a process is

essentially a snapshot of its current runtime environment, including its address space,

stack space, etc. At any given time, a process can be in user-mode, kernel-mode,

sleeping, waiting on I/O, and so on. The process scheduling subsystem within the

kernel uses a time slice of typically 20ms to rotate among currently running processes.

Each process is given its share of the CPU for 20ms, then left to sleep until its turn

again at the CPU. This process of moving processes in and out of the CPU is called

context switching. The kernel makes the operating system appear to be multi-tasking

(i.e. running processes concurrently) via the use of efficient context-switching.

At each context switch, the context of the process to be swapped out of the CPU is

saved to RAM. It is restored when the process is scheduled its share of the CPU again.

All this happens very fast, in microseconds.

To be more precise, context switching may occur for a user process when

• a system call is made, thus causing a switch to the kernel-mode,

• a hardware interrupt, bus error, segmentation fault, floating point exception, etc.

occurs,

• a process voluntarily goes to sleep waiting for a resource or for some other

reason, and

• the kernel preempts the currently running process (i.e. a normal process

scheduler event).

Context switching for a user process may occur also between threads of the same

process.

Extensive context switching is an indication of a CPU bottleneck.

In kernel terms, a context switch can occur in the following situations:

• When the process puts itself to sleep

• When a process exits

• When it returns from a system call to user mode, but is not the most eligible process to run

• When it returns to user mode after the kernel completes handling an interrupt, but is
not the most eligible process to run

Talking to Running Processes

Unix provides a way for a user to communicate with a running process. This is

accomplished via signals, a facility which enables a running process to be notified

about the occurrence of a) an error event generated by the executing process, or b) an

asynchronous event generated by a process outside the executing process.

Signals are sent to the process ultimately by the kernel. The receiving process has to be

programmed such that it can catch a signal and take a certain action depending on

which signal was sent.

Here is a list of common signals and their numerical values:

SIGHUP    1    Hangup

SIGINT    2    Interrupt

SIGKILL   9    Kill (cannot be caught or ignored)

SIGTERM   15   Terminate (the default signal sent by the kill command; unlike SIGKILL, it can be caught)

(Many more signals exist; these are the most commonly used ones.)

You send a running process a signal using the Unix kill command. The basic usage is

kill -<VALUE> <PID_OF_THE_PROCESS>

Processes vs. Jobs

During the Unix shell discussion, we spoke of job control in csh and other, newer shells.

Job control is basically an explicit exercise in using signals. When you fire up a long

running command or program from your shell prompt, you are starting a process. If you

hit CTRL-Z, assuming it's bound to your tty's suspend function (i.e. "stty susp ^Z"), you

are sending the process a terminal stop signal (SIGTSTP), asking it to stop. It
may be brought back to a running state with a shell command like fg, which sends a
continue signal (SIGCONT) to the stopped process.


Process Related System Calls

fork

The only way a new process is created by the kernel is when an existing process calls the fork system call.

The new process created by fork is called the child process. This system call is
called once but returns twice. The only difference in the returns is that the return
value in the child is 0, while the return value in the parent is the process ID of the
new child. The child's process ID is returned to the parent because
a process can have more than one child, and there is no function that allows a
process to obtain the process IDs of its children. fork returns zero to
the child because a process has only a single parent, so the child can always
call getppid to obtain the process ID of its parent.

Both the child and the parent continue executing with the instruction that follows the
call to fork. The child is a copy of the parent. For example, the child gets a copy of the
parent's data space, heap and stack. Note that this is a copy for the child; the
parent and the child do not share these portions of memory.

Many current implementations do not perform a complete copy of the parent's
data, heap and stack, since a fork is often followed by an exec. Instead they
use copy-on-write: these regions are shared by the parent and the child and
have their protection changed by the kernel to read-only. If either process tries to
modify one of these regions, the kernel makes a copy of that piece of memory only, typically
a “page” in a virtual memory system.

An important feature of the fork operation is that the child process shares the files
that were open in the parent process before the fork. This feature provides an
easy way for the parent process to open specific files or devices and pass those
open files to the child process. After the fork, the parent closes the files that it
opened for the child, so that the two processes are not sharing the same file.

int fork ( );

Returns 0 to the child and the process ID of the child to the parent, or -1 on error.

The values of the following in child process are copied from the parent process

• The real user ID

• Real group ID

• Effective user ID

• Effective group ID

• Process group ID

• Terminal group ID

• Root directory

• Current working directory

• Signal handling settings

• File mode creation mask

The child process differs from the parent process in the following ways:

• The child process has a new, unique process ID

• The child process has a different parent process ID

• The return value from fork is different

• It has its own copies of the parent's file descriptors

• The time left until an alarm clock signal is set to zero in the child

• File locks set by the parent are not inherited by the child

Wait and waitpid

The process can wait for one of the child processes to finish by executing the wait system call.

int wait ( int * status );

int waitpid ( int pid, int * status, int options );

Returns process ID of the child that terminated if successful or -1 on error.


If the process that calls wait does not have any child processes, wait returns a value
of -1 immediately. If the process that calls wait has one or more child processes that
have not yet terminated, then the calling process is suspended by the kernel until one
of its child processes terminates. When a
child process terminates and wait returns, if the status argument is not NULL, the
value passed to exit by the terminating child process is stored in the status variable. Some
additional information is also returned by wait.

There are three conditions for which wait returns a PID as its return value.

1. A child process called exit

2. A child process was terminated by a signal

3. A child process was being traced and the process stopped. This occurs when one
process is tracing the execution of another process, such as when a debugger is
being used to step through a process.

What happens to the parent process ID of a child process when the parent process
terminates before the child process?

There are two possible scenarios to consider:

1. The child process terminates before the parent process.
This is the “normal” condition when we are entering commands to an
interactive shell.

a. If the parent process has already executed a wait, then the wait returns to
the parent process with the process ID of the child that terminated.

b. If the parent process has not executed a wait, then the child process
becomes a “zombie” process. (If the parent process of the exiting
process is not executing a wait, the terminating process is marked as a
“zombie” process.)

2. The parent process terminates before the child process. The child process
becomes an “orphan” process. For child processes that are about to be
orphaned, UNIX sets their parent process ID to 1, the PID of the init process.

The differences between these two system calls are:

• wait blocks the caller until a child process terminates, while waitpid has
an option that prevents it from blocking.

• waitpid does not have to wait for the first child to terminate. It has a number of
options that control which process it waits for.

For both system calls, the status argument is a pointer to an integer.
If this argument is not NULL, the termination or exit status of the terminated
process is stored in the location pointed to by the argument.

The interpretation of the pid argument for waitpid depends on its value:

pid == -1 waits for any child process (equivalent to wait )

pid > 0 waits for the child whose process ID equals pid

pid == 0 waits for any child whose process group ID equals that of the calling process

pid < -1 waits for any child whose process group ID equals the absolute value of pid

waitpid returns the process ID of the child that terminated, and its termination
status is returned through status. With wait, -1 is returned if the calling
process has no children. With waitpid, however, it is also possible to get an error if the
specified process or process group does not exist or is not a child of the
calling process.


The options argument lets us further control the operation of waitpid. This
argument is either 0 or is constructed from the bitwise OR of the following
constants:

WNOHANG     waitpid will not block if a child specified by pid is not
immediately available. In this case the return value is 0.

WUNTRACED   If the implementation supports job control, the status of any
child specified by pid that has stopped, and whose status
has not been reported since it stopped, is returned.

Hence waitpid allows us to wait for a particular process, provides a nonblocking
version of wait, and supports job control (with the WUNTRACED option).

exec

The only way to execute a program in UNIX is for an existing process to issue the exec system call.

int execlp( char * filename, char* arg0, char* arg1, …, char* argn, (char*) 0);
int execl( char * pathname, char* arg0, char* arg1, …, char* argn, (char*) 0);
int execle( char* pathname, char* arg0, char* arg1, …, char* argn, (char*) 0, char** envp);
int execvp( char * filename, char** argv);
int execv( char * pathname, char** argv);
int execve( char * pathname, char** argv, char** envp);

Returns to the caller only if an error occurs. Otherwise control is passed to the start of the new program.

The exec system call replaces the current process with the new program: it
reinitializes the process from a designated program. We refer to the process that
issues an exec system call as the calling process and the program that is execed
as the new program. The process ID does not change across an exec.

The relationship between these six functions is as follows: execlp, execl and execle
take the arguments as a list terminated by a null pointer, while execvp, execv and
execve take them as an argv array. execlp and execvp take a file name, which is
converted to a path; execl and execv then add the current environment as envp. All of
them end in execve, which is the actual system call.


The program invoked by the exec system call inherits the following attributes from the process that calls exec.

• PID

• PPID

• GPID

• TPID

• Time left until an alarm clock signal

• Root directory

• Current working directory

• File mode creation mask

• Real user ID

• Real Group ID

• File locks

The two attributes that can change when a new program is execed are:

• Effective user ID

• Effective group ID

[Figure: relationship of the six exec functions (execlp, execl, execle, execvp, execv, execve), with the steps "convert file to path" and "add envp" leading to the execve system call]

getpid, getppid, getuid, geteuid, getgid, getegid

int getpid ( void ); Returns process ID of calling process

int getppid ( void ); Returns parent process ID of calling process

int getuid ( void ); Returns real user ID of calling process

int geteuid ( void ); Returns effective user ID of calling process

int getgid ( void ); Returns real group ID of calling process

int getegid ( void ); Returns effective group ID of calling process


exit

A process terminates by calling the exit system call. This system call never returns to
the caller. When exit is called, an integer exit status is passed by the process to the kernel.
This exit status is then available to the parent process of the exiting process through the
wait system call. Only the low-order 8 bits of the exit status should be used, allowing a
process to terminate with an exit status in the range 0 through 255. By
convention, a process that terminates normally returns an exit status of zero, while
nonzero values are used to indicate an error condition.

void exit (int status);

No return value.


Chapter-4 Memory Management

Memory Management under Unix

One of the numerous tasks the Unix kernel performs while the machine is up is to

manage memory. In this section, we explore relevant terms (such as physical vs. virtual

memory) as well as some of the basic concepts behind memory management.

• Physical vs. virtual memory

• What is a page of memory?

• Cache memory

• How the kernel organizes memory:

Dividing the RAM

System and user areas

• Paging vs. swapping

Physical vs. Virtual Memory

Unix, like other advanced operating systems, allows you to use all of the physical

memory installed in your system as well as area(s) of the disk (called swap space)

which have been designated for use by the kernel in case the physical memory is

insufficient for the tasks at hand. Virtual memory is simply the sum of the physical

memory (RAM) and the total swap space assigned by the system administrator at the

system installation time. Mathematically,

Virtual Memory (VM) = Physical RAM + Swap space

Dividing Memory into Pages

The Unix kernel divides the memory into manageable chunks called pages. A

single page of memory is usually 4096 or 8192 bytes (4 or 8KB). Memory pages

are laid down contiguously across the physical and virtual memory.

Cache Memory


With increasing clock speeds for modern CPUs, the disparity between the CPU speed

and the access speed for RAM has grown substantially. Consider the following:

Typical CPU speed today: 250-500MHz (which translates into 4-2ns clock tick)

Typical memory access speed (for regular DRAM): 60ns

Typical disk access speed: 13ms

In other words, to get a piece of information from RAM, the CPU has to wait for 15-30

clock cycles, a considerable waste of time.

Fortunately, cache RAM has come to the rescue. The RAM cache is simply a small

amount of very fast (and thus expensive) memory that is placed between the CPU and

the (slower) RAM. When the kernel loads a page from RAM for use by the CPU, it also

prefetches a number of adjacent pages and stores them in the cache. Since programs

typically use sequential memory access, the next page needed by the CPU can now be

supplied very rapidly from the cache. Updates of the cache are performed using an

efficient algorithm, which can enable cache hit rates of nearly 100% (with a 100% hit

ratio being the ideal case).

CPUs today typically have hierarchical caches. The on-chip cache (usually called the L1

cache) is small but fast (being on-chip). The secondary cache (usually called the L2

cache) is often not on-chip (thus a bit slower) and can be quite large, sometimes as big

as 16MB for high-end CPUs (obviously, you have to pay a hefty premium for a cache

that size)

How the Kernel Organizes Memory

Dividing the RAM

When the kernel is first loaded into memory (at boot time), it sets aside a certain amount

of RAM for itself as well as for all system and user processes: Main categories in which

RAM is divided are:

• Text: to hold the text segments of running processes.

• Data: to hold the data segments of running processes.

• Stack: to hold the stack segments of running processes.

• Shared Memory: This is an area of memory which is available to running
programs if they need it. Consider a common use of shared memory: let us assume
you have a program which has been compiled using a shared library (libraries that
look like libxxx.so; the C library is a good example, since all programs need it). Assume
that five of these programs are running simultaneously. At run time, the code they
seek is made resident in the shared memory area. This way, only a single copy of the
library needs to be in memory, resulting in increased efficiency and major cost
savings.

• Buffer Cache: All reads and writes to the filesystem are cached here first. You
may have experienced situations where a program that is writing to a file doesn't
seem to work (nothing is written to the file). You wait a while, then a sync occurs,
the buffer cache is flushed to disk, and you see the file size increase.

The System and User Areas

When the kernel loads, it uses RAM to keep itself memory resident. Consequently, it

has to ensure that user programs do not overwrite/corrupt the kernel data structures (or

overwrite/corrupt other users' data structures). It does so by designating part of RAM as

kernel or system pages (which hold kernel text and data segments) and user pages

(which hold user stacks, data, and text segments). Strong memory protection is

implemented in the kernel memory management code to keep the users from corrupting

the system area. For example, only the kernel is allowed to switch from the user to the

system area. During the normal execution of a Unix process, both system and user

areas are used.

A common signal sent when memory protection is violated is SIGSEGV. (You see a
"Segmentation violation" message on the screen when this happens. The culprit
process is killed and its in-memory portions are dumped to a disk file called "core".)

Paging vs. Swapping

Paging: When a process starts in Unix, not all its memory pages are read in from the

disk at once. Instead, the kernel loads into RAM only a few pages at a time. After the


CPU digests these, the next page is requested. If it is not found in RAM, a page fault occurs, signaling the kernel to load the next few pages from disk into RAM. This is

called demand paging and is a perfectly normal system activity in Unix. (Just so you

know, it is possible for you, as a programmer, to read in entire processes if there is

enough memory available to do so.)

The Unix SVR4 daemon which performs the paging out operation is called pageout. It is

a long running daemon and is created at boot time. The pageout process cannot be

killed. There are three kernel variables which control the paging operation (Unix SVR4):

• minfree - the absolute minimum of free RAM needed. If free memory falls below

this limit, the memory management system does its best to get back above it. It

does so by page stealing from other, running processes, if practical.

• desfree - the amount of RAM the kernel wants to have free at all times. If free

memory is less than desfree, the pageout syscall is called every clock cycle.

• lotsfree - the amount of memory necessary before the kernel stops calling

pageout. Between desfree and lotsfree, pageout is called 4 times a second.

Swapping: Let's say you start ten heavyweight processes (for example, five xterms, a

couple netscapes, a sendmail, and a couple pines) on an old 486 box running Linux

with 16MB of RAM. Basically, you *do not have* enough physical RAM to accommodate

the text, data, and stack segments of all these processes at once. Since the kernel

cannot find enough RAM to fit things in, it makes use of the available virtual memory by

a process known as swapping. It selects the least busy process and moves it in its

entirety (meaning the program's in-RAM text, stack, and data segments) to disk. As

more RAM becomes available, it swaps the process back in from disk into RAM. While

this use of the virtual memory system makes it possible for you to continue to use the

machine, it comes at a very heavy price. Remember, disks are relatively slower (by the

factor of a million) than CPUs and you can feel this disparity rather severely when the

machine is swapping. Swapping is not considered a normal system activity. It is

basically a sign that you need to buy more RAM.


In Unix SVR4, the process handling swapping is called sched (in other Unix variants, it

is sometimes called swapper). It always runs as process 0. When the free memory falls

so far below minfree that pageout is not able to recover memory by page stealing,

sched invokes the syscall sched(). Syscall swapout is then called to free all the memory
pages associated with the process chosen to be swapped out. On a later invocation
of sched(), the process may be swapped back in from disk if there is enough memory.


Chapter-5 Locking Techniques in UNIX

There are situations where multiple processes want to share some
resource. Locking is a facility provided so that only one process at a time can
access the resource. Locking is of two types:

• Advisory Locking

• Mandatory Locking

Advisory Locks Vs Mandatory Locks

Advisory locking means that the operating system maintains correct knowledge
of which files have been locked by which process, but it does not prevent a
process from writing to a file that is locked by another process. A process can
ignore an advisory lock and write to a file that is locked, if the process has
adequate permissions. Advisory locks are fine for what are known as cooperating
processes (programs that access a shared resource and all follow the same locking convention).

The other type of file locking, mandatory locking, is provided by some
systems. Mandatory locks mean that the operating system checks every read and
write request to verify that the operation does not interfere with a lock held by a
process.

File locking Vs Record Locking

File locking locks an entire file, while record locking allows a process to lock a
specified portion of a file. The definition of a record for UNIX record locking is given
by specifying a starting byte offset in the file and the number of bytes from that position.

Lockf file Locking

#include<unistd.h>

int lockf(int fd, int function, long size);

Returns 0 on success, or -1 on error.


The fd is the file descriptor of the file to be locked. The function argument has one of the following values:

F_ULOCK   Unlock a previously locked region

F_LOCK    Lock a region (blocking)

F_TLOCK   Test and lock a region (nonblocking)

F_TEST    Test a region to see if it is locked

The size is the number of bytes to be locked. The lockf function uses the current file offset (which the process can set
using the lseek system call) and the size argument to define the “record”. The
record starts at the current offset and extends forward for a positive size, or
extends backwards for a negative size. If the size is 0, the record affected
extends from the current offset through the largest file offset (the end of file).
Doing an lseek to the beginning of the file followed by a lockf with a size of zero
locks the entire file.

The lockf function provides both the ability to set a lock and to test if a lock is set.
When the function is F_LOCK and the region is already locked by another
process, the calling process is put to sleep until the region is available. This is
termed blocking. The F_TLOCK operation, however, is a nonblocking call:
if the region is not available, lockf returns immediately with a value of -1. The
F_TEST operation allows a process to test if a lock is set, without setting a
lock.

fcntl record locking

#include<fcntl.h>

int fcntl(int fd, int cmd, struct flock * arg);

For record locking, cmd is F_GETLK, F_SETLK or F_SETLKW.


The third argument is a pointer to a flock structure:

struct flock {
    short l_type;    /* F_RDLCK, F_WRLCK or F_UNLCK */
    off_t l_start;   /* offset in bytes, relative to l_whence */
    short l_whence;  /* SEEK_SET, SEEK_CUR or SEEK_END */
    off_t l_len;     /* length, in bytes; 0 means lock till EOF */
    pid_t l_pid;     /* returned with F_GETLK */
};

This structure describes

• The type of lock desired: F_RDLCK (a shared read lock), F_WRLCK (an
exclusive write lock), or F_UNLCK (unlocking a region)

• The starting byte offset of the region being locked or unlocked (l_start and
l_whence)

• The size of the region (l_len)

There are numerous rules about the specification of the region to be locked or unlocked.

• The two elements that specify the starting offset of the region are similar to
the last two arguments of the lseek function. Indeed, the l_whence
member is specified as SEEK_SET, SEEK_CUR or SEEK_END.

• Locks can start and extend beyond the current end of file, but cannot start

or extend before the beginning of the file.

• If the l_len is 0, it means that the lock extends to the largest possible

offset of the file (till the end of file). This allows us to lock a region starting

anywhere in the file, up through and including any data that is appended

to the file.

• To lock the entire file, we set l_start and l_whence to point to the

beginning of the file, and specify a length (l_len) of 0.

The basic rule is that any number of processes can have a shared read lock on a
given byte, but only one process can have an exclusive write lock on a given byte.
Furthermore, if there are one or more read locks on a byte, there cannot be any
write lock on that byte, and if there is an exclusive write lock on a byte, there
cannot be any read locks on that byte.

To obtain a read lock, the descriptor must be open for reading, and to obtain a

write lock the descriptor must be open for writing.

The three cmd values used with the fcntl function for record locking are:

F_GETLK Determine whether the lock described by the structure pointed to by flockptr is blocked by some other lock. If a lock exists that would prevent ours from being created, the information on that existing lock overwrites the information pointed to by flockptr. If no lock exists that would prevent ours from being created, the structure pointed to by flockptr is left unchanged except for the l_type member, which is set to F_UNLCK.

F_SETLK Set the lock described by flockptr. If we are trying to obtain a read lock (l_type of F_RDLCK) or a write lock (l_type of F_WRLCK) and the compatibility rule prevents the system from giving us the lock, fcntl returns immediately with an error (errno is set to EACCES or EAGAIN). This command is also used to clear the lock described by flockptr (l_type of F_UNLCK).

F_SETLKW This command is a blocking version of F_SETLK (the W stands for wait). If the requested read lock or write lock cannot be granted because another process currently has some part of the requested region locked, the calling process is put to sleep. This sleep is interrupted if a signal is caught.
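The locking commands above can be sketched as follows. This is a minimal illustration, not a listing from the course material: the helper names lock_region, lock_holder and demo_record_lock are hypothetical, and the temporary file under /tmp is an assumption. The parent write-locks the whole file (l_start 0, l_whence SEEK_SET, l_len 0) and a child uses F_GETLK to find out which process holds the conflicting lock.

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: apply cmd (F_SETLK or F_SETLKW) to one region. */
int lock_region(int fd, int cmd, short type, off_t start, short whence, off_t len)
{
    struct flock fl;

    fl.l_type = type;      /* F_RDLCK, F_WRLCK or F_UNLCK */
    fl.l_start = start;
    fl.l_whence = whence;
    fl.l_len = len;        /* 0 means "through end of file" */
    fl.l_pid = 0;
    return fcntl(fd, cmd, &fl);
}

/* Hypothetical helper: pid of the process whose lock would block ours,
 * 0 if nothing blocks it, -1 on error. */
pid_t lock_holder(int fd, short type, off_t start, short whence, off_t len)
{
    struct flock fl;

    fl.l_type = type;
    fl.l_start = start;
    fl.l_whence = whence;
    fl.l_len = len;
    fl.l_pid = 0;
    if (fcntl(fd, F_GETLK, &fl) == -1)
        return -1;
    return fl.l_type == F_UNLCK ? 0 : fl.l_pid;
}

/* Returns 0 if a child process sees the parent's write lock on the file. */
int demo_record_lock(void)
{
    char path[] = "/tmp/lockdemoXXXXXX";   /* assumed scratch location */
    int fd = mkstemp(path);                /* opened read/write */
    int status;
    pid_t parent = getpid(), pid;

    if (fd == -1)
        return -1;
    unlink(path);
    if (lock_region(fd, F_SETLK, F_WRLCK, 0, SEEK_SET, 0) == -1)
        return -1;
    pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0)   /* record locks are not inherited across fork() */
        _exit(lock_holder(fd, F_WRLCK, 0, SEEK_SET, 0) == parent ? 0 : 1);
    waitpid(pid, &status, 0);
    lock_region(fd, F_SETLK, F_UNLCK, 0, SEEK_SET, 0);
    close(fd);
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Passing F_SETLKW instead of F_SETLK to lock_region would make the caller sleep until the region becomes free.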


Chapter-6 Inter Process Communication

Communication plays a very important role in UNIX. Processes are said to be communicating when one process passes data to another. The only requirement is that the communicating processes agree on the means of communication.

The following are some methods of Inter Process Communication under UNIX:

• Signals

• Pipes

• FIFO

• Message Queues

• Semaphores

• Shared Memory

• Sockets

Interrupts and Signals:

In this section we will look at ways in which two processes can communicate using signals. When a process terminates abnormally it usually tries to send a signal indicating what went wrong. User-specified communication can take place in the same way.

Signals are software generated interrupts that are sent to a process when an event

occurs. Signals can be synchronously generated by an error in an application, such as

SIGFPE and SIGSEGV, but most signals are asynchronous. Signals can be posted to a

process when the system detects a software event, such as a user entering an interrupt

or stop, or a kill request from another process. Signals can also come directly from

the OS kernel when a hardware event such as a bus error or an illegal instruction is

encountered. The system defines a set of signals that can be posted to a process.

Signal delivery is analogous to hardware interrupts in that a signal can be blocked from

being delivered in the future. Most signals cause termination of the receiving process if

no action is taken by the process in response to the signal. Some signals stop the


receiving process and other signals can be ignored. Each signal has a default action

which is one of the following:

• The signal is discarded after being received

• The process is terminated after the signal is received

• A core file is written, then the process is terminated

• Stop the process after the signal is received

Each signal defined by the system falls into one of five classes:

• Hardware conditions

• Software conditions

• Input/output notification

• Process control

• Resource control

Macros are defined in the <signal.h> header file for common signals. These include:

SIGHUP   1   /* hangup */
SIGINT   2   /* interrupt */
SIGQUIT  3   /* quit */
SIGILL   4   /* illegal instruction */
SIGABRT  6   /* used by abort */
SIGKILL  9   /* hard kill */
SIGALRM  14  /* alarm clock */
SIGCONT  19  /* continue a stopped process */
SIGCHLD  20  /* to parent on child stop or exit */

Signals can be numbered from 0 to 31. The numbers of some signals (SIGCONT and SIGCHLD, for example) differ between UNIX variants, so programs should always use the symbolic names.

Sending Signals -- kill

The common function used to send signals is kill( ):

#include <sys/types.h>
#include <signal.h>

int kill(pid_t pid, int signal);

Returns 0 on success, or -1 on error.

The first parameter is the process ID of the process to which the signal is sent.


The pid can have the following values:

pid > 0 The signal is sent to the process whose process ID is equal to pid.

pid == 0 The signal is sent to all processes in the sender's process group.

pid == -1 The kernel sends the signal to all processes whose real user ID equals the effective user ID of the sender. If the sender has the effective user ID of the superuser, the kernel sends the signal to all processes except processes 0 and 1.

pid < -1 The kernel sends the signal to all processes in the process group whose ID equals the absolute value of pid.

The second parameter is the signal number.

There is also a UNIX command called kill that can be used to send signals from the command line - see the man pages for further details.

NOTE: unless caught or ignored, most signals sent with kill terminate the receiving process. Protection is therefore built into the system: only processes with certain access privileges can be killed off.

Basic rule: only processes that belong to the same user can send signals to each other (the superuser can signal any process).

The SIGKILL signal cannot be caught or ignored and will always terminate a process.

For example, kill(getpid(), SIGINT); would send the interrupt signal to the calling process itself. Unless SIGINT is caught or ignored, this has an effect similar to calling exit( ). Also, ctrl-c typed at the terminal sends a SIGINT to the process currently being run.
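As a rough sketch of kill( ) in use, the hypothetical function kill_child_with below forks a child that simply waits, sends it a signal, and reports which signal ended it. The function name and the decision to restore the default action in the child are assumptions made for this illustration.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child, send it sig with kill(), and return the number of the
 * signal that terminated it (or -1 if it was not killed by a signal). */
int kill_child_with(int sig)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0) {
        /* Child: make sure the default action applies, then wait.
         * (For SIGKILL this call fails harmlessly.) */
        signal(sig, SIG_DFL);
        for (;;)
            pause();
    }
    /* Parent: pid > 0, so the signal goes to exactly that process. */
    if (kill(pid, sig) == -1)
        return -1;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

Calling kill_child_with(SIGKILL) always ends the child, because SIGKILL cannot be caught or ignored.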

Signal Handling -- signal()

An application program can specify a function called a signal handler to be invoked

when a specific signal is received. When a signal handler is invoked on receipt of a

signal, it is said to catch the signal. A process can deal with a signal in one of the

following ways:


• The process can let the default action happen

• The process can block the signal (some signals cannot be ignored)

• The process can catch the signal with a handler.

Signal handlers usually execute on the current stack of the process. This lets the signal

handler return to the point that execution was interrupted in the process. This can be

changed on a per-signal basis so that a signal handler executes on a special stack. If a

process must resume in a different context than the interrupted one, it must restore the

previous context itself.

Receiving signals is straightforward with the function:

void (*signal(int sig, void (*func)(int)))(int);

That is to say, signal( ) arranges for the function func to be called if the process receives the signal sig. signal( ) returns the previous handler for sig if successful; otherwise it returns SIG_ERR and sets errno.

func can have three values:

SIG_DFL -- a special value that selects the system default action, which for most signals terminates the process upon receipt of sig.

SIG_IGN -- a special value that tells the system to disregard sig (SIGKILL and SIGSTOP cannot be ignored).

A function address -- a user-specified function.

SIG_DFL and SIG_IGN are defined in the signal.h (standard library) header file.

Thus, to ignore a ctrl-c typed at the command line, we could do:

signal(SIGINT, SIG_IGN);

To reset the system so that SIGINT causes termination at any place in our program, we would do:

signal(SIGINT, SIG_DFL);


So let's write a program that traps a ctrl-c but does not quit on this signal. A function sigproc( ) is executed when we trap a ctrl-c. We also set another function to quit the program if it traps the SIGQUIT signal, so that we can terminate our program.
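The listing this passage introduces does not appear in the text. The following is only a sketch of what such a program could look like, not the original listing: sigproc is the name used above, while quitproc, install_handlers and sigint_count are names assumed for this illustration.

```c
#include <signal.h>
#include <unistd.h>

/* Number of SIGINTs trapped so far (assumed name, for illustration). */
volatile sig_atomic_t sigint_count = 0;

/* Runs on ctrl-c: note the event but do not quit. */
void sigproc(int sig)
{
    (void)sig;
    /* Some older systems reset the action to SIG_DFL on delivery,
     * so install the handler again. */
    signal(SIGINT, sigproc);
    sigint_count++;
}

/* Runs on SIGQUIT (usually ctrl-\): leave the program. */
void quitproc(int sig)
{
    (void)sig;
    _exit(0);
}

/* Install both handlers; returns 0 on success, -1 on failure. */
int install_handlers(void)
{
    if (signal(SIGINT, sigproc) == SIG_ERR)
        return -1;
    if (signal(SIGQUIT, quitproc) == SIG_ERR)
        return -1;
    return 0;
}
```

A main( ) would call install_handlers( ) and then loop on pause( ); each ctrl-c increments the counter and only SIGQUIT ends the program.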


Chapter-7 The Pipe

Every user of UNIX has almost certainly used the pipe at some stage in interacting with

the operating system. "ls | more" pipes the output of the ls command to the more

command, so producing a paged listing of a directory. In effect the pipe acts as a

temporary file holding the output from the first command until it is read by the second.

Remember here that each of these two commands is run as a separate process on the

system. Assuming that the ls command runs for long enough then the ps command can

be used to examine all the processes on the system. Both the ls and the more

processes will be seen.

This method of establishing a pipe requires the use of the shell and could be used as a

means of communicating between processes if a program writes and then executes a

shell script. This is not perhaps the most efficient method of working, as it requires the

creation of a number of processes with the associated overheads involved. Execution of

"ls | more" from within a program would require generation of a shell process which

would subsequently create ls and more processes.

The pipe( ) system call returns an array of two file descriptors, the first one open for

reading and the second for writing. Data can therefore be passed down the write only

file descriptor and will appear out of the read only file descriptor. Note here that these

are UNIX file descriptors which are integer numbers representing open files, not C FILE

pointers. Reading and writing to and from these descriptors must therefore use the

UNIX read ( ) and write( ) system calls.

#include <unistd.h>

int pipe(int pfd[2]);

Returns 0 on success, or -1 on error.


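Using a pipe between a parent process and its child:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    int pfd[2];
    char buf[10];
    pid_t pid;

    if (pipe(pfd) == -1) {      /* pfd[0] is for reading, pfd[1] for writing */
        printf("Pipe failed\n");
        exit(1);
    }
    pid = fork();
    switch (pid) {
    case -1:
        printf("Fork failed\n");
        exit(2);
    default:                    /* parent: writes "hello" plus its '\0' */
        if (write(pfd[1], "hello", 6) == -1)
            printf("Write failed\n");
        break;
    case 0:                     /* child: reads what the parent wrote */
        sleep(1);               /* not required: read() blocks until data arrives */
        if (read(pfd[0], buf, sizeof(buf)) == -1)
            printf("Read failed\n");
        break;
    }
    return 0;
}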

A pipe does a destructive read, which means that data once read from the pipe cannot be retrieved again. A pipe has a finite size, always at least 4K.

The main disadvantage of pipes is that they can be used only between related processes, such as a parent and its child.


Chapter-8 Fifos

A named pipe works much like a regular pipe, but does have some noticeable

differences.

• Named pipes exist as a device special file in the file system.

• Processes of different ancestry can share data through a named pipe.

• When all I/O is done by sharing processes, the named pipe remains in the file

system for later use.

Creating a FIFO

There are several ways of creating a named pipe. The first two can be done directly

from the shell.

mknod MYFIFO p

mkfifo -m a=rw MYFIFO

The above two commands perform identical operations, with one exception. The mkfifo

command provides a hook for altering the permissions on the FIFO file directly after

creation. With mknod, a quick call to the chmod command will be necessary.

FIFO files can be quickly identified in a physical file system by the ``p'' indicator seen

here in a long directory listing:

$ ls -l MYFIFO

prw-r--r-- 1 root root 0 Dec 14 22:15 MYFIFO|

Also notice the vertical bar (``pipe sign'') located directly after the file name. Another

great reason to run Linux!

The same can be done from a C program with the mknod( ) system call:

#include <sys/types.h>
#include <sys/stat.h>

int mknod(const char *pathname, mode_t mode, dev_t dev);

Returns 0 on success, or -1 on error.

pathname is the name of the FIFO to be created.

mode is S_IFIFO | permissions.

dev is 0 (ignored for a FIFO).

I/O operations on a FIFO are essentially the same as for normal pipes, with one major

exception. An ``open'' system call or library function should be used to physically open

up a channel to the pipe. With half-duplex pipes, this is unnecessary, since the pipe

resides in the kernel and not on a physical filesystem.

Blocking Actions on a FIFO

Normally, blocking occurs on a FIFO. In other words, if the FIFO is opened for reading,

the process will "block" until some other process opens it for writing. This action works

vice-versa as well. If this behavior is undesirable, the O_NONBLOCK flag can be used

in an open( ) call to disable the default blocking action.

The alternative would be to jump to another virtual console and run the client end,

switching back and forth to see the resulting action.
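The effect of O_NONBLOCK on opening a FIFO can be sketched in a single process. The function name fifo_nonblock_demo and the scratch directory under /tmp are assumptions made for this illustration, and mkfifo( ) is used here in place of mknod( ). With no reader present, a nonblocking open for writing fails with ENXIO, while a nonblocking open for reading returns at once.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 0 if the nonblocking opens behave as described, -1 otherwise. */
int fifo_nonblock_demo(void)
{
    char dir[] = "/tmp/fifodemoXXXXXX";   /* assumed scratch location */
    char path[64];
    int rfd = -1, wfd = -1, ok = 0;

    if (mkdtemp(dir) == NULL)
        return -1;
    snprintf(path, sizeof(path), "%s/MYFIFO", dir);
    if (mkfifo(path, 0600) == 0) {
        /* No reader yet: a nonblocking open for writing fails with ENXIO. */
        wfd = open(path, O_WRONLY | O_NONBLOCK);
        ok = (wfd == -1 && errno == ENXIO);
        if (wfd != -1)
            close(wfd);

        /* A nonblocking open for reading returns at once, writer or not. */
        rfd = open(path, O_RDONLY | O_NONBLOCK);
        ok = ok && (rfd != -1);

        /* With a reader present, the writer's open now succeeds. */
        wfd = open(path, O_WRONLY | O_NONBLOCK);
        ok = ok && (wfd != -1);

        if (rfd != -1)
            close(rfd);
        if (wfd != -1)
            close(wfd);
        unlink(path);
    }
    rmdir(dir);
    return ok ? 0 : -1;
}
```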

The Infamous SIGPIPE Signal

On a last note, pipes must have a reader and a writer. If a process tries to write to a

pipe that has no reader, it will be sent the SIGPIPE signal from the kernel. This is

imperative when more than two processes are involved in a pipeline.
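A small sketch of the no-reader case: if SIGPIPE is ignored instead of being left to terminate the process, the failed write( ) returns -1 with errno set to EPIPE. The function name write_without_reader is hypothetical.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Write to a pipe whose read end is closed, with SIGPIPE ignored.
 * Returns the errno of the failed write (EPIPE is expected),
 * 0 if the write unexpectedly succeeded, or -1 on a setup error. */
int write_without_reader(void)
{
    int pfd[2];
    int err = 0;
    void (*old)(int);

    if (pipe(pfd) == -1)
        return -1;
    old = signal(SIGPIPE, SIG_IGN);   /* otherwise SIGPIPE would kill us */
    if (old == SIG_ERR) {
        close(pfd[0]);
        close(pfd[1]);
        return -1;
    }
    close(pfd[0]);                    /* the pipe now has no reader */
    if (write(pfd[1], "x", 1) == -1)
        err = errno;
    close(pfd[1]);
    signal(SIGPIPE, old);             /* restore the previous action */
    return err;
}
```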

Example usage of FIFOs:

Process 1

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    int writefd;
    char *msg1 = "Process #1";

    if (mknod("FIFO1", S_IFIFO | 0600, 0) == -1)
        printf("Could not create a FIFO\n");
    /* the inner parentheses matter: == binds tighter than = */
    if ((writefd = open("FIFO1", O_RDWR)) == -1)
        printf("Open failed\n");
    /* send the terminating '\0' too, so the reader can print the buffer */
    if (write(writefd, msg1, strlen(msg1) + 1) == -1)
        printf("write failed\n");
    return 0;
}

Process 2

#include <sys/types.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int readfd;
    char buf[15];

    if ((readfd = open("FIFO1", O_RDONLY)) == -1)
        printf("Open failed\n");
    if (read(readfd, buf, sizeof(buf)) == -1)
        printf("Read failed\n");
    printf("The contents of buf is: %s\n", buf);
    return 0;
}


Chapter – 8 Message Queues

Two (or more) processes can exchange information via access to a common system

message queue. The sending process places via some (OS) message-passing module

a message onto a queue which can be read by another process. Each message is

given an identification or type so that processes can select the appropriate message.

Processes must share a common key in order to gain access to the queue in the first place (subject to other permissions -- see below).

Basic Message Passing: IPC messaging lets processes send and receive messages,

and queue messages for processing in an arbitrary order. Unlike the file byte-stream

data flow of pipes, each IPC message has an explicit length. Messages can be

assigned a specific type. Because of this, a server process can direct message traffic

between clients on its queue by using the client process PID as the message type. For

single-message transactions, multiple server processes can work in parallel on

transactions sent to a shared message queue.


Before a process can send or receive a message, the queue must be initialized. Operations to send and receive messages are performed by the msgsnd() and msgrcv() functions, respectively.

When a message is sent, its text is copied to the message queue. The msgsnd() and

msgrcv() functions can be performed as either blocking or non-blocking operations.

Non-blocking operations allow for asynchronous message transfer -- the process is not

suspended as a result of sending or receiving a message. In blocking or synchronous

message passing the sending process cannot continue until the message has been

transferred or has even been acknowledged by a receiver. IPC signal and other

mechanisms can be employed to implement such transfer. A blocked message

operation remains suspended until one of the following three conditions occurs:

• The call succeeds.

• The process receives a signal.

• The queue is removed.

For every message queue in the system, the kernel maintains the following structure of information, defined in <sys/msg.h>:

struct msqid_ds {
    struct ipc_perm msg_perm;   /* operation permission structure */
    struct msg *msg_first;      /* pointer to first message on queue */
    struct msg *msg_last;       /* pointer to last message on queue */
    ushort msg_cbytes;          /* current number of bytes on queue */
    ushort msg_qbytes;          /* maximum number of bytes allowed on queue */
    ushort msg_qnum;            /* current number of messages on queue */
    ushort msg_lspid;           /* pid of last message send */
    ushort msg_lrpid;           /* pid of last message receive */
    time_t msg_stime;           /* time of last message send */
    time_t msg_rtime;           /* time of last message receive */
    time_t msg_ctime;           /* time of last message control */
};

The ipc_perm structure is common to Message Queues, Semaphores and Shared Memory. The members of this structure are:

struct ipc_perm {
    ushort uid;    /* Owner's user id */
    ushort gid;    /* Owner's group id */
    ushort cuid;   /* Creator's user id */
    ushort cgid;   /* Creator's group id */
    ushort mode;   /* Access modes */
    key_t key;     /* key */
};

Internally, the kernel maintains the message structures in the form of linked lists.


Initialising the Message Queue

The msgget( ) function initializes a new message queue:

#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

int msgget(key_t key, int msgflg);

Returns the message queue ID (msqid) of the queue corresponding to the key argument on success, or -1 on error.

The value passed as the msgflg argument must be an octal integer with settings for the queue's permissions and control flags.

[Figure: the msqid identifies a struct msqid_ds in the kernel, which holds the msg_perm structure and the msg_first and msg_last pointers to a linked list of messages; the three messages shown have type 100 and length 1, type 200 and length 2, and type 300 and length 3.]

IPC Functions, Key Arguments, and Creation Flags:

Processes requesting access to an IPC facility must be able to identify it. To do this,

functions that initialize or provide access to an IPC facility use a key_t key argument.

(key_t is essentially an int type defined in <sys/types.h>.)

The key is an arbitrary value or one that can be derived from a common seed at run time. One way is with ftok( ), which converts the pathname of an existing file and a project identifier to a key value; processes that pass the same pair obtain the same key. Functions that initialize or get access to messages (also semaphores or shared memory) return an ID number of type int. IPC

functions that perform read, write, and control operations use this ID. If the key

argument is specified as IPC_PRIVATE, the call initializes a new instance of an IPC

facility that is private to the creating process. When the IPC_CREAT flag is supplied in

the flags argument appropriate to the call, the function tries to create the facility if it does

not exist already. When called with both the IPC_CREAT and IPC_EXCL flags, the

function fails if the facility already exists. This can be useful when more than one

process might attempt to initialize the facility. One such case might involve several

server processes having access to the same facility. If they all attempt to create the

facility with IPC_EXCL in effect, only the first attempt succeeds. If neither of these flags

is given and the facility already exists, the functions to get access simply return the ID of

the facility. If IPC_CREAT is omitted and the facility is not already initialized, the calls

fail. These control flags are combined, using logical (bitwise) OR, with the octal

permission modes to form the flags argument. For example, the statement below

initializes a new message queue if the queue does not exist.

msqid = msgget(ftok("/tmp", key), (IPC_CREAT | IPC_EXCL | 0400));

The first argument evaluates to a key based on the pathname "/tmp" and the project identifier key. The second argument evaluates to the combined permissions and control flags.
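The interplay of IPC_CREAT and IPC_EXCL can be sketched with a "create it, or else open the existing one" helper. The name open_queue, its created output parameter and the 0600 permissions are assumptions made for this illustration.

```c
#include <errno.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <sys/types.h>

/* Create the queue for (path, proj_id), or open it if it already exists.
 * Returns the msqid, or -1 on error. *created is set to 1 only when this
 * call created the queue. */
int open_queue(const char *path, int proj_id, int *created)
{
    int msqid;
    key_t key = ftok(path, proj_id);   /* path must name an existing file */

    if (key == (key_t)-1)
        return -1;

    /* IPC_CREAT | IPC_EXCL: succeed only if the queue did not exist. */
    msqid = msgget(key, IPC_CREAT | IPC_EXCL | 0600);
    if (msqid != -1) {
        *created = 1;
        return msqid;
    }
    if (errno != EEXIST)
        return -1;

    /* Someone else created it first: neither flag, just get its ID. */
    *created = 0;
    return msgget(key, 0);
}
```

When several server processes run this at start-up, exactly one of them sees created set to 1 and can take responsibility for initializing (and later removing) the queue.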


Sending and Receiving Messages

The msgsnd( ) and msgrcv( ) functions send and receive messages, respectively:

#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

int msgsnd(int msqid, const void *msgp, size_t msgsz, int msgflg);
ssize_t msgrcv(int msqid, void *msgp, size_t msgsz, long msgtype, int msgflg);

msgsnd( ) returns 0 on success; msgrcv( ) returns the number of bytes copied into the message text on success. Both return -1 on error.

The msqid argument must be the ID of an existing message queue. The msgp argument is a pointer to a structure that contains the type of the message and its text. The structure below is an example of what this user-defined buffer might look like:

struct msgbuf {
    long mtype;          /* message type */
    char mtext[MSGSZ];   /* message text of length MSGSZ */
};

The msgsz argument specifies the length of the message text in bytes; the mtype member is not counted.

The mtype member of a received message is the message's type as specified by the sending process. The msgtype argument of msgrcv( ) selects which message to receive: 0 takes the first message on the queue, a positive value takes the first message of that type, and a negative value takes the first message of the lowest type that is less than or equal to the absolute value of msgtype.

The argument msgflg specifies the action to be taken if one or more of the following are true:

• The number of bytes already on the queue is equal to msg_qbytes (maximum bytes in the queue)

• The total number of messages on all queues system-wide is equal to the system-imposed limit.

These actions are as follows:

• If (msgflg & IPC_NOWAIT) is non-zero, the message will not be sent and the calling

process will return immediately.

• If (msgflg & IPC_NOWAIT) is 0, the calling process will suspend execution until one

of the following occurs:

o The condition responsible for the suspension no longer exists, in which case the

message is sent.

o The message queue identifier msqid is removed from the system; when this occurs,

errno is set equal to EIDRM and -1 is returned.

o The calling process receives a signal that is to be caught; in this case the message

is not sent and the calling process resumes execution.

Upon successful completion, the following actions are taken with respect to the data

structure associated with msqid:

o msg_qnum is incremented by 1.

o msg_lspid is set equal to the process ID of the calling process.

o msg_stime is set equal to the current time.
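Sending and selecting messages by type can be sketched as below. The structure demo_msg, the helpers send_text and recv_text, and the value of MSGSZ are assumptions chosen for this illustration; both helpers pass IPC_NOWAIT so that they return -1 instead of suspending the caller.

```c
#include <string.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <sys/types.h>

#define MSGSZ 64   /* assumed maximum text length for this sketch */

struct demo_msg {
    long mtype;          /* message type, must be > 0 */
    char mtext[MSGSZ];   /* message text */
};

/* Queue a '\0'-terminated string under the given type. */
int send_text(int msqid, long type, const char *text)
{
    struct demo_msg m;
    size_t len = strlen(text) + 1;

    if (type <= 0 || len > MSGSZ)
        return -1;
    m.mtype = type;
    memcpy(m.mtext, text, len);
    /* msgsz counts only the text, not the mtype member. */
    return msgsnd(msqid, &m, len, IPC_NOWAIT);
}

/* Fetch a message selected by type (0 = first message on the queue).
 * out must have room for MSGSZ bytes. Returns the text length or -1. */
long recv_text(int msqid, long type, char *out)
{
    struct demo_msg m;
    ssize_t n = msgrcv(msqid, &m, MSGSZ, type, IPC_NOWAIT);

    if (n == -1)
        return -1;
    memcpy(out, m.mtext, (size_t)n);
    return (long)n;
}
```

A server that uses each client's PID as the message type would call recv_text with that PID to pick out one client's traffic, as described at the start of this chapter.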

Controlling message queues

The msgctl( ) function alters the permissions and other characteristics of a message queue. The owner or creator of a queue can change its ownership or permissions using msgctl(). Also, any process with permission to do so can use msgctl() for control operations.

#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

int msgctl(int msqid, int cmd, struct msqid_ds *buf);

Returns 0 on success or -1 on error.

The msqid argument must be the ID of an existing message queue. The cmd argument is one of:

IPC_STAT -- Place information about the status of the queue in the data structure pointed to by buf. The process must have read permission for this call to succeed.

IPC_SET -- Set the owner's user and group ID, the permissions, and the size (in number of bytes) of the message queue. A process must have the effective user ID of the owner, creator, or superuser for this call to succeed.

IPC_RMID -- Remove the message queue specified by the msqid argument.

The following code fragment illustrates the msgctl() function with the IPC_STAT and IPC_SET commands (buf is a struct msqid_ds):

#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>
...
if (msgctl(msqid, IPC_STAT, &buf) == -1) {
    perror("msgctl: msgctl failed");
    exit(1);
}
...
if (msgctl(msqid, IPC_SET, &buf) == -1) {
    perror("msgctl: msgctl failed");
    exit(1);
}
...
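A more complete sketch of the three commands follows, wrapped in small helpers whose names (queue_length, set_queue_bytes, remove_queue) are hypothetical. IPC_SET is normally preceded by IPC_STAT so that the fields that are not being changed keep their current values.

```c
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <sys/types.h>

/* IPC_STAT: number of messages currently on the queue, or -1 on error. */
long queue_length(int msqid)
{
    struct msqid_ds ds;

    if (msgctl(msqid, IPC_STAT, &ds) == -1)
        return -1;
    return (long)ds.msg_qnum;
}

/* IPC_SET: change the maximum number of bytes allowed on the queue.
 * Raising it above the system default generally needs privileges. */
int set_queue_bytes(int msqid, unsigned long nbytes)
{
    struct msqid_ds ds;

    if (msgctl(msqid, IPC_STAT, &ds) == -1)   /* keep owner and mode as is */
        return -1;
    ds.msg_qbytes = nbytes;
    return msgctl(msqid, IPC_SET, &ds);
}

/* IPC_RMID: remove the queue; buf is not used by this command. */
int remove_queue(int msqid)
{
    return msgctl(msqid, IPC_RMID, NULL);
}
```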

Disadvantages of Message Queues:

• System call overhead: each message is copied into the kernel and out again.

• Lower speed than shared memory as a result.


Chapter-9 Semaphores

Semaphores are a programming construct designed by E. W. Dijkstra in the late 1960s.

Dijkstra's model was the operation of railroads: consider a stretch of railroad in which

there is a single track over which only one train at a time is allowed. Guarding this track

is a semaphore. A train must wait before entering the single track until the semaphore is

in a state that permits travel. When the train enters the track, the semaphore changes

state to prevent other trains from entering the track. A train that is leaving this section of

track must again change the state of the semaphore to allow another train to enter. In

the computer version, a semaphore appears to be a simple integer. A process (or a thread) waits for permission to proceed by waiting for the integer to become 0. If it proceeds, it signals this by incrementing the integer by 1. When it is finished, the process changes the semaphore's value by subtracting one from it.

Semaphores let processes query or alter status information. They are often used to

monitor and control the availability of system resources such as shared memory

segments.

Semaphores can be operated on as individual units or as elements in a set. Because

System V IPC semaphores can be in a large array, they are extremely heavy weight.

Much lighter weight semaphores are available in the threads library and POSIX

semaphores (see below for brief). Threads library semaphores must be used with mapped memory. A semaphore set consists of a control structure and an array of individual semaphores. A set of semaphores may contain up to 25 elements (the actual limit is a system configuration parameter).

For every semaphore set in the system, the kernel maintains the following structure of information:

struct semid_ds {
    struct ipc_perm sem_perm;  /* permission structure */
    struct sem *sem_base;      /* pointer to first semaphore in the set */
    ushort sem_nsems;          /* number of semaphores in set */
    time_t sem_otime;          /* time of last semop */
    time_t sem_ctime;          /* time of last change */
};

The sem structure is the internal data structure used by the kernel to maintain the set of values for a given semaphore:

struct sem {
    ushort semval;   /* semaphore value, non-negative */
    short sempid;    /* pid of the last operation */
    ushort semncnt;  /* number of processes awaiting semval > cval */
    ushort semzcnt;  /* number of processes awaiting semval = 0 */
};

We can picture a particular semaphore in the kernel as being a semid_ds structure that points to an array of sem structures. If the semaphore has two members in its set, we would have the picture as shown.

[Figure: the semid identifies a struct semid_ds in the kernel (sem_perm structure, sem_base, sem_nsems, sem_otime, sem_ctime); sem_base points to an array of two sem structures, [0] and [1], each holding semval, sempid, semncnt and semzcnt.]

In a similar fashion to message queues, the semaphore set must be initialized using semget( ); the semaphore creator can change its ownership or permissions using semctl(); and semaphore operations are performed via the semop() function. These are now discussed below:

Initializing a Semaphore Set

The function semget() initializes or gains access to a semaphore set.

#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

int semget(key_t key, int nsems, int semflg);

Returns the semaphore ID (semid) on success or -1 on error.

The key argument is an access value associated with the semaphore ID.

The nsems argument specifies the number of elements in a semaphore array. The call fails when nsems is greater than the number of elements in an existing array; when the correct count is not known, supplying 0 for this argument ensures that it will succeed.

The semflg argument specifies the initial access permissions and creation control flags.

Semaphore Operations

semop( ) performs operations on a semaphore set.

#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

int semop(int semid, struct sembuf *sops, size_t nsops);

Returns 0 on success or -1 on error.

The semid argument is the semaphore ID returned by a previous semget() call. The sops argument is a pointer to an array of structures, each containing the following information about a semaphore operation:

• The semaphore number

• The operation to be performed

• Control flags, if any.

The sembuf structure specifies a semaphore operation, as defined in <sys/sem.h>.

struct sembuf {
    ushort_t sem_num;  /* semaphore number */
    short sem_op;      /* semaphore operation */
    short sem_flg;     /* operation flags */
};

The nsops argument specifies the length of the array, the maximum size of which is

determined by the SEMOPM configuration option; this is the maximum number of

operations allowed by a single semop( ) call. The operation to be performed is

determined as follows:

• A positive integer increments the semaphore value by that amount.

• A negative integer decrements the semaphore value by that amount. An attempt

to set a semaphore to a value less than zero fails or blocks, depending on whether

IPC_NOWAIT is in effect.

• A value of zero means to wait for the semaphore value to reach zero.

There are two control flags that can be used with semop():

IPC_NOWAIT -- Can be set for any operations in the array. Makes the function return without changing

any semaphore value if any operation for which IPC_NOWAIT is set cannot be

performed. The function fails if it tries to decrement a semaphore more than its current

value, or tests a nonzero semaphore to be equal to zero.

SEM_UNDO -- Allows individual operations in the array to be undone when the process exits.

This function takes a pointer, sops, to an array of semaphore operation structures. Each

structure in the array contains data about an operation to perform on a semaphore. Any

process with read permission can test whether a semaphore has a zero value. To

increment or decrement a semaphore requires write permission. When an operation

fails, none of the semaphores is altered.

The process blocks (unless the IPC_NOWAIT flag is set), and remains blocked until:

• the semaphore operations can all finish, so the call succeeds,


• the process receives a signal, or

• the semaphore set is removed.

Only one process at a time can update a semaphore. Simultaneous requests by

different processes are performed in an arbitrary order. When an array of operations is

given by a semop() call, no updates are done until all operations on the array can finish

successfully.

If a process with exclusive use of a semaphore terminates abnormally and fails to undo

the operation or free the semaphore, the semaphore stays locked in memory in the

state the process left it. To prevent this, the SEM_UNDO control flag makes semop()

allocate an undo structure for each semaphore operation, which contains the operation

that returns the semaphore to its previous state. If the process dies, the system applies

the operations in the undo structures. This prevents an aborted process from leaving a

semaphore set in an inconsistent state. If processes share access to a resource

controlled by a semaphore, operations on the semaphore should not be made with

SEM_UNDO in effect. If the process that currently has control of the resource

terminates abnormally, the resource is presumed to be inconsistent. Another process

must be able to recognize this to restore the resource to a consistent state. When

performing a semaphore operation with SEM_UNDO in effect, you must also have it in

effect for the call that will perform the reversing operation. When the process runs

normally, the reversing operation updates the undo structure with a complementary

value. This ensures that, unless the process is aborted, the values applied to the undo structure cancel to zero. When the undo structure reaches zero, it is removed.

NOTE:Using SEM_UNDO inconsistently can lead to excessive resource consumption

because allocated undo structures might not be freed until the system is rebooted.

Controlling Semaphores

#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

int semctl(int semid, int semnum, int cmd, union semun arg);

semctl() changes permissions and other characteristics of a semaphore set.

It must be called with a valid semaphore ID, semid. The semnum value selects a

semaphore within an array by its index. The cmd argument is one of the following

control flags:

GETVAL -- Return the value of a single semaphore.

SETVAL -- Set the value of a single semaphore. In this case, arg is taken as arg.val, an int.

GETPID -- Return the PID of the process that performed the last operation on the semaphore or array.

GETNCNT -- Return the number of processes waiting for the value of a semaphore to increase.

GETZCNT -- Return the number of processes waiting for the value of a particular semaphore to reach zero.

GETALL -- Return the values for all semaphores in a set. In this case, arg is taken as arg.array, a pointer to an array of unsigned shorts (see below).

SETALL -- Set values for all semaphores in a set. In this case, arg is taken as arg.array, a pointer to an array of unsigned shorts.

IPC_STAT -- Return the status information from the control structure for the semaphore set and place it in the data structure pointed to by arg.buf, a pointer to a buffer of type semid_ds.

IPC_SET -- Set the effective user and group identification and permissions. In this case, arg is taken as arg.buf.

IPC_RMID -- Remove the specified semaphore set.

A process must have an effective user identification of owner, creator, or superuser to

perform an IPC_SET or IPC_RMID command. Read and write permission is required as

for the other control commands.

The fourth argument union semun arg is optional, depending upon the operation

requested. If required it is of type union semun, which must be explicitly declared by the

application program as:

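union semun {
    int val;                  /* value for SETVAL */
    struct semid_ds *buf;     /* buffer for IPC_STAT and IPC_SET */
    unsigned short *array;    /* array for GETALL and SETALL */
} arg;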
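The control commands above can be sketched as follows. This is only an illustration under stated assumptions: the helper name sem_set_and_get is hypothetical, the set is created with IPC_PRIVATE so that no key is needed, and union semun is declared by the program as described above (on systems whose <sys/sem.h> already defines it, the declaration should be left out).

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* The application declares union semun itself, as described in the text. */
union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/*
 * Hypothetical helper: create a private set holding one semaphore, set its
 * value with SETVAL, read it back with GETVAL, then remove the set with
 * IPC_RMID. Returns the value read back, or -1 on any error.
 */
int sem_set_and_get(int value)
{
    union semun arg;
    int semid, result;

    semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (semid == -1)
        return -1;

    arg.val = value;
    if (semctl(semid, 0, SETVAL, arg) == -1) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }

    /* GETVAL does not use the fourth argument. */
    result = semctl(semid, 0, GETVAL);

    /* Remove the set; otherwise it would outlive the process. */
    if (semctl(semid, 0, IPC_RMID) == -1)
        return -1;

    return result;
}
```

Calling sem_set_and_get(3) from a main() function should give back 3 when the system allows the set to be created.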


Chapter-10 Shared Memory

Shared Memory is an efficient means of passing data between programs. One program

will create a memory portion which other processes (if permitted) can access.

When write access is allowed for more than one process, an outside protocol or

mechanism such as a semaphore can be used to prevent inconsistencies and

collisions.

A process creates a shared memory segment using shmget( ). The original owner of a

shared memory segment can assign ownership to another user with shmctl(). It can also

revoke this assignment. Other processes with proper permission can perform various

control functions on the shared memory segment using shmctl(). Once created, a

shared segment can be attached to a process address space using shmat(). It can be

detached using shmdt(). The attaching process must have the appropriate permissions

for shmat(). Once attached, the process can read or write to the segment, as allowed by

the permission requested in the attach operation. A shared segment can be attached

multiple times by the same process. A shared memory segment is described by a

control structure with a unique ID that points to an area of physical memory. The

identifier of the segment is called the shmid.

The structure definition for the shared memory segment control structures and

prototypes can be found in <sys/shm.h>.

For each shared memory, the kernel maintains the following structure:

struct shmid_ds {
    struct ipc_perm shm_perm;   /* operation permissions */
    int     shm_segsz;          /* segment size */
    ushort  shm_lpid;           /* pid of last operation */
    ushort  shm_cpid;           /* creator pid */
    ushort  shm_nattch;         /* current number attached */
    time_t  shm_atime;          /* last attach time */
    time_t  shm_dtime;          /* last detach time */
    time_t  shm_ctime;          /* last change time */
};


Initialising shared memory

shmget() is used to obtain access to a shared memory segment.

#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
int shmget(key_t key, size_t size, int shmflg);
Returns shared memory ID (shmid) on success or –1 on error.

Accessing a Shared Memory Segment

The key argument is an access value associated with the shared memory segment ID. The size
argument is the size in bytes of the requested shared memory. The shmflg argument
specifies the initial access permissions and creation control flags.

When the call succeeds, it returns the shared memory segment ID. This call is also
used to get the ID of an existing shared segment (from a process requesting sharing of
some existing memory portion).

Attaching a Shared Memory Segment

#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
void *shmat(int shmid, const void *shmaddr, int shmflg);
Returns the starting address of the shared memory on success or (void *) –1 on error.

shmid is the identifier returned by shmget( ).

The valid values for shmaddr are given below:

• If the shmaddr argument is zero, the system selects the address for the caller.

• If shmaddr is non-zero, the address used depends on whether the
caller specifies the SHM_RND value for the shmflg argument.

o If the SHM_RND value is not specified, the shared memory segment is
attached at the address specified by the shmaddr argument.

o If SHM_RND is specified, the shared memory segment is attached at the
address specified by the shmaddr argument, rounded down to the nearest
multiple of the constant SHMLBA. LBA stands for “lower boundary address”.

By default, the shared memory segment is attached for both reading and writing by the

calling process, if the calling process has read-write permissions for the segment. The

SHM_RDONLY value can be specified in the flag argument, specifying read-only

access.
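A minimal sketch of creating and attaching a segment, assuming a Linux-like system: the helper name share_int_with_child is hypothetical, the segment is created with IPC_PRIVATE, and fork() and waitpid() stand in for the "outside protocol" mentioned earlier (the parent simply waits for the child instead of using a semaphore). The shmdt() and shmctl() calls used for cleanup are described in the sections that follow.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: the parent creates and attaches a segment holding one
 * int, a child process stores `value` in it, and the parent reads the value
 * back after the child has exited. Returns the value seen by the parent, or
 * -1 on error.
 */
int share_int_with_child(int value)
{
    int shmid, result = -1;
    int *shared;
    pid_t pid;

    shmid = shmget(IPC_PRIVATE, sizeof(int), IPC_CREAT | 0600);
    if (shmid == -1)
        return -1;

    /* shmaddr of zero: the system selects the address; read-write attach. */
    shared = shmat(shmid, NULL, 0);
    if (shared == (void *)-1) {
        shmctl(shmid, IPC_RMID, NULL);
        return -1;
    }
    *shared = 0;

    pid = fork();                 /* the child inherits the attachment */
    if (pid == 0) {
        *shared = value;          /* written by the child process */
        _exit(0);
    }
    if (pid > 0 && waitpid(pid, NULL, 0) == pid)
        result = *shared;         /* read by the parent process */

    shmdt(shared);
    shmctl(shmid, IPC_RMID, NULL);
    return result;
}
```

Waiting for the child is enough here because only one process writes; with several writers a semaphore would be the more usual choice.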

Detaching a Shared Memory Segment

#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
int shmdt(const void *shmaddr);
Returns 0 on success or –1 on error.

shmdt( ) detaches the shared memory segment located at the address indicated by shmaddr. This call does not delete the shared memory segment.

Controlling a Shared Memory Segment

#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
int shmctl(int shmid, int cmd, struct shmid_ds *buf);
Returns 0 on success or –1 on error.

shmctl( ) is used to alter the permissions and other characteristics of a shared memory segment. The process must have an effective user ID of owner, creator or superuser to perform
this command. The cmd argument is one of the following control commands:


SHM_LOCK -- Lock the specified shared memory segment in memory. The process must have the effective ID of superuser to perform this command.

SHM_UNLOCK -- Unlock the shared memory segment. The process must have the effective ID of superuser to perform this command.

IPC_STAT -- Return the status information contained in the control structure and place it in the buffer pointed to by buf. The process must have read permission on the segment to perform this command.

IPC_SET -- Set the effective user and group identification and access permissions. The process must have an effective ID of owner, creator or superuser to perform this command.

IPC_RMID -- Remove the shared memory segment.

The buf argument is a pointer to a structure of type struct shmid_ds, which is defined in <sys/shm.h>.
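The IPC_STAT and IPC_RMID commands can be sketched as below. The helper name shm_lifecycle and its parameters are hypothetical, and the segment is created with IPC_PRIVATE purely to keep the example short.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/*
 * Hypothetical helper: create a private segment of `size` bytes, attach it,
 * copy `msg` into it, query the control structure with IPC_STAT, then detach
 * and remove the segment. On success stores the reported segment size in
 * *segsz and the attach count seen while attached in *nattch and returns 0;
 * returns -1 on error.
 */
int shm_lifecycle(const char *msg, size_t size,
                  size_t *segsz, unsigned long *nattch)
{
    struct shmid_ds ds;
    char *addr;
    int shmid, rc = -1;

    if (strlen(msg) + 1 > size)
        return -1;

    shmid = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
    if (shmid == -1)
        return -1;

    addr = shmat(shmid, NULL, 0);
    if (addr != (char *)-1) {
        strcpy(addr, msg);

        /* IPC_STAT copies the kernel's control structure into ds. */
        if (shmctl(shmid, IPC_STAT, &ds) == 0) {
            *segsz = (size_t)ds.shm_segsz;
            *nattch = (unsigned long)ds.shm_nattch;
            rc = 0;
        }
        if (shmdt(addr) == -1)
            rc = -1;
    }

    /* IPC_RMID removes the segment; buf is not used for this command. */
    if (shmctl(shmid, IPC_RMID, NULL) == -1)
        rc = -1;
    return rc;
}
```

While the segment is attached once, the attach count reported by IPC_STAT is expected to be 1.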

Address Spaces and Mapping

Since backing store files (the process address space) exist only in swap storage, they

are not included in the UNIX named file space. (This makes backing store files

inaccessible to other processes.) However, it is a simple extension to allow the logical

insertion of all, or part, of one, or more, named files in the backing store and to treat the

result as a single address space. This is called mapping. With mapping, any part of any

readable or writable file can be logically included in a process's address space. Like any

other portion of the process's address space, no page of the file is actually loaded

into memory until a page fault forces this action. Pages of memory are written to the file

only if their contents have been modified. So, reading from and writing to files is

completely automatic and very efficient. More than one process can map a single

named file. This provides very efficient memory sharing between processes. All or part

of other files can also be shared between processes.


Not all named file system objects can be mapped. Devices that cannot be treated as

storage, such as terminal and network device files, are examples of objects that cannot

be mapped. A process address space is defined by all of the files (or portions of files)

mapped into the address space. Each mapping is sized and aligned to the page

boundaries of the system on which the process is executing. There is no memory

associated with processes themselves.

A process page maps to only one object at a time, although an object address may be

the subject of many process mappings. The notion of a "page" is not a property of the

mapped object. Mapping an object only provides the potential for a process to read or

write the object's contents. Mapping makes the object's contents directly addressable by

a process. Applications can access the storage resources they use directly rather than

indirectly through read and write. Potential advantages include efficiency (elimination of

unnecessary data copying) and reduced complexity (single-step updates rather than the

read, modify buffer, write cycle). The ability to access an object and have it retain its

identity over the course of the access is unique to this access method, and facilitates

the sharing of common code and data.

Because the file system name space includes any directory trees that are connected

from other systems via NFS, any networked file can also be mapped into a process's

address space.
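As a rough illustration of mapping, the sketch below uses mmap(), which these notes do not name; it is assumed here as the usual system call for this purpose. The helper name map_and_patch and the file path passed to it are hypothetical.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write `text` to the file at `path`, map the file with
 * MAP_SHARED, overwrite its first byte through memory, then read the file
 * back with read() into `out` (at least strlen(text) + 1 bytes).
 * The file is removed afterwards. Returns 0 on success, -1 on error.
 */
int map_and_patch(const char *path, const char *text, char first, char *out)
{
    size_t len = strlen(text);
    char *p;
    int fd, rc = -1;

    if (len == 0)
        return -1;                      /* a zero-length mapping is an error */

    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd == -1)
        return -1;

    if (write(fd, text, len) == (ssize_t)len) {
        p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (p != MAP_FAILED) {
            p[0] = first;               /* an ordinary store changes the file */
            if (msync(p, len, MS_SYNC) == 0 &&
                lseek(fd, 0, SEEK_SET) == 0 &&
                read(fd, out, len) == (ssize_t)len) {
                out[len] = '\0';
                rc = 0;
            }
            munmap(p, len);
        }
    }
    close(fd);
    unlink(path);
    return rc;
}
```

The update is a single store into memory, which is the single-step update the text contrasts with the read, modify buffer, write cycle.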


Chapter-11 Sockets

Sockets are a generalized networking capability first introduced in 4.1cBSD and

subsequently refined into their current form with 4.2BSD. The sockets feature is

available with most current UNIX system releases. (Transport Layer Interface (TLI) is

the System V alternative). Sockets allow communication between two different

processes on the same or different machines.

Sockets are an application program interface (they provide the user an interface with the

system). They can be used for local interprocess communication and also across

TCP/IP networks. They provide both connection oriented and connectionless modes of

communication.

Socket

#include <sys/types.h>
#include <sys/socket.h>
int socket(int family, int type, int protocol);
Returns socket ID on success or –1 on error.

The first parameter, family, is the family or the domain the socket is supposed to function in. This determines
the address format to be used.

The two mainly used socket domains are:

• The UNIX domain ( or AF_UNIX, for Address Family UNIX ):- In this
domain, a socket is given a pathname within the system name space.

• The internet domain ( or AF_INET ):- Addresses in the internet domain
consist of a machine network address and an identifying number, called
the port. Internet domain names allow communication between machines.

The second parameter, type, is the type of the socket. Communication follows
some particular “style”. Currently, communication is either through a stream or
by a datagram.


The two mainly used socket types are:

• SOCK_STREAM ( for streams sockets )

• SOCK_DGRAM ( for datagram sockets )

Stream communication means:

• Communication takes place across a connection between two sockets

• The communication is bi-directional, reliable, error free and no message

boundaries are kept.

• Reading from a stream may return the data sent by one or

several calls to write( ). The whole data may be read at one go, or only a

part of the data if there is not enough room for the entire message, or if

not all the data from a large message has been transferred across.

• The protocol implementing such a style will retransmit messages received

with errors.

• The protocol will also return error messages if one tries to send a

message after the connection has been broken.

Datagram communication means:

• It is connectionless, and is bi-directional

• Each message is addressed individually

• If the address is correct, it will generally be received, although this is not

guaranteed.

• Often datagrams are used for requests that require a response from the

recipient. If no response arrives in a reasonable amount of time (timeout),

the request is repeated.

• The individual datagrams will be kept separate when they are read, that is

message boundaries are preserved.
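The difference in message boundaries between the two styles can be observed with a small sketch. It assumes socketpair(), a call these notes do not cover, to obtain two already connected UNIX-domain sockets; the helper name bytes_in_first_read is hypothetical.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a connected pair of UNIX-domain sockets of the
 * given type, write two 2-byte messages into one end, and return how many
 * bytes a single read() on the other end delivers (or -1 on error).
 */
int bytes_in_first_read(int type)
{
    int sv[2];
    char buf[16];
    int n = -1;

    if (socketpair(AF_UNIX, type, 0, sv) == -1)
        return -1;

    /* Both messages are queued before the reader looks at the socket. */
    if (write(sv[0], "ab", 2) == 2 && write(sv[0], "cd", 2) == 2)
        n = (int)read(sv[1], buf, sizeof buf);

    close(sv[0]);
    close(sv[1]);
    return n;
}
```

With SOCK_DGRAM the read returns exactly one message (2 bytes). With SOCK_STREAM the two writes are typically delivered together (4 bytes on Linux), since a stream keeps no message boundaries.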

A protocol is a set of rules, data formats and conventions that regulates the

transfer of data between participants while communicating. There is one protocol

for each socket type (stream, datagram etc.,) within each domain.


• The program that implements a protocol keeps track of the names that are

bound to sockets, sets up connections and transfers data between

sockets.

• It is possible for several protocols, differing only in low level details, to

implement the same style of communication within a particular domain.

Although it is possible to select which protocol should be used, for nearly

all uses it is sufficient to request the default protocol.

• Usually the protocol argument to socket( ) is kept 0, which invokes default

protocol.

Socket address structure

This structure contains information about the family (UNIX, internet etc.),
network and host address, and port (which service to ask for).

• The network, host address and the port are stored in a 14 byte long
string, whose contents are protocol-specific.

• The structure as defined in sys/socket.h is:

struct sockaddr {
    u_short sa_family;    /* address family: AF_UNIX for UNIX, AF_INET for the Internet */
    char    sa_data[14];  /* up to 14 bytes of protocol-specific address */
};

For the internet domain the following information is provided:

• Family name ( AF_INET )

• The IP address: the network and host address, written as four decimal
numbers separated by dots (dotted-decimal notation). The client must identify
the server it wants to communicate with by using this dotted-decimal address.

• The port number: the service the client wants to use on the server.
Port numbers below 1024 are reserved for specific services.


The server registers the service offered by the socket to a port via

the bind( ) call.

The port number for the client's socket is assigned automatically.

The structures for the internet domain are defined in the netinet/in.h
header file. The structures are:

struct in_addr {
    u_long s_addr;               /* 32 bit net_id/host_id, network byte ordered */
};

struct sockaddr_in {
    short          sin_family;   /* AF_INET */
    u_short        sin_port;     /* 16 bit port number, network byte ordered */
    struct in_addr sin_addr;     /* 32 bit netid/hostid, network byte ordered */
    char           sin_zero[8];  /* unused */
};
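Filling in an internet domain address might look like the following sketch. The helper name make_inet_addr is hypothetical; htons() and inet_pton() are standard calls, not introduced in these notes, that convert the port to network byte order and the dotted-decimal string to a 32 bit address.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/*
 * Hypothetical helper: fill *addr for the given dotted-decimal IP address
 * and port (given in host byte order). Returns 0 on success, -1 if the
 * address string is not a valid dotted-decimal address.
 */
int make_inet_addr(struct sockaddr_in *addr, const char *ip,
                   unsigned short port)
{
    memset(addr, 0, sizeof *addr);      /* also clears sin_zero */
    addr->sin_family = AF_INET;
    addr->sin_port = htons(port);       /* store in network byte order */

    if (inet_pton(AF_INET, ip, &addr->sin_addr) != 1)
        return -1;
    return 0;
}
```

A pointer to the filled structure is then cast to struct sockaddr * when it is passed to calls such as bind( ) or connect( ).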


Connection oriented socket calls

Server (connection oriented protocol): socket( ) -> bind( ) -> listen( ) -> accept( ), which blocks until a connection arrives from a client -> read( ) -> process request -> write( )

Client: socket( ) -> connect( ) (connection establishment) -> write( ) (data: request) -> read( )


Connectionless socket calls

Server (connectionless protocol): socket( ) -> bind( ) -> recvfrom( ), which blocks until data arrives from a client -> process request -> sendto( ) (data: reply)

Client: socket( ) -> bind( ) -> sendto( ) (data: request) -> recvfrom( )

bind

#include <sys/types.h>
#include <sys/socket.h>
int bind( int sock, struct sockaddr * address, int addresslength );
Returns 0 on success or –1 on error.

A socket is created without a name. Until a name is bound to a socket, other
processes have no way to reference the socket. This means no messages can
be exchanged.

The bind system call assigns a name to the unnamed socket. Binding an
address allows a process to register its address with the system. This makes it
possible for other processes to find it. Binding uses domain specific address
formats.

The second argument, struct sockaddr * address, is a pointer to a protocol
specific address:

struct sockaddr {
    u_short sa_family;    /* address family */
    char    sa_data[14];  /* up to 14 bytes of protocol specific address */
};

The third argument is the length of the address structure.

Server processes register their well-known addresses with the system.
This tells the system that any messages received at the address pointed
to by the address argument should be forwarded to the server process. Both
connection oriented and connectionless servers have to register their
sockets before accepting any client requests.

A client can register a specific address for itself. A connection oriented
client does not have to explicitly bind its local address; it will be bound
during the connection.

A connectionless client should make sure that the system assigns it a
unique address, so that the server has a valid address to send the data to.

Connect

#include <sys/types.h>
#include <sys/socket.h>
int connect( int sock, struct sockaddr * servaddr, int addrlen );
Returns 0 on success or –1 on error.

A client process connects a socket descriptor following the socket system call to
establish a connection with the server.
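Binding, as described in the bind section, can be sketched as follows. The helper name bind_loopback_any_port is hypothetical. Port 0 asks the system to choose a free port, and getsockname(), which is not covered in these notes, is assumed here as the way to learn which port was chosen.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a socket of the given type, bind it to the
 * loopback address with port 0 (the system chooses a free port), and return
 * the chosen port in host byte order, or -1 on error.
 */
int bind_loopback_any_port(int type)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int sock, port = -1;

    sock = socket(AF_INET, type, 0);    /* 0 selects the default protocol */
    if (sock == -1)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(0);
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    /* Name the socket, then ask the system which name it actually got. */
    if (bind(sock, (struct sockaddr *)&addr, sizeof addr) == 0 &&
        getsockname(sock, (struct sockaddr *)&addr, &len) == 0)
        port = ntohs(addr.sin_port);

    close(sock);
    return port;
}
```

A server offering a well-known service would pass its fixed port number instead of 0.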

Listen

#include <sys/types.h>
#include <sys/socket.h>
int listen( int sock, int backlog);
Returns 0 on success or –1 on error.

This system call is used by a connection oriented server to indicate that it is
willing to receive connections.

This call is executed after the socket and bind system calls have been executed,
and before the accept system call.

The backlog argument specifies how many connection requests may be
queued by the system while it waits for the server to execute the accept system
call.

In the time it takes the server to handle the request of an accept (the time taken
by the server to fork a child process, and then have the parent process execute
another accept( )), it is possible for additional connection requests to arrive from
other clients. What the backlog argument refers to is this queue of pending
requests for connections.

Accept

#include <sys/types.h>
#include <sys/socket.h>
int accept( int sock, struct sockaddr * clientname, int *addrlen);
Returns a new socket descriptor on success, with the address of the client stored in clientname, or –1 on error.

After a connection oriented server executes the listen system call, the server
waits for a connection from some client process by having the server execute the
accept call.

Accept performs the following functions:

• Blocks until a connection request is in the queue.

• Creates a new socket with the same properties as sock.


The clientname and addrlen arguments are used to return the address of
the connected peer process (the client). addrlen is called a ‘value result
argument’. The caller sets its value before the system call, and the system call
stores a result in the variable. Usually these value result arguments are integers
that the caller sets to the size of the buffer, with the system call changing this
value on return to the actual amount of data stored in it.

For this system call the caller sets addrlen to the size of the sockaddr
structure whose address is passed as the clientname argument. On return,
addrlen contains the actual number of bytes that the system call stores in
the clientname argument.

The system call returns up to 3 values:

An integer return code that is either an error indicator or a new socket

descriptor.

The address of the client process(clientname), and

The size of this address.
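The connection oriented call sequence can be tried out inside a single process, as in the sketch below. This is an illustration only: the helper name tcp_loopback_roundtrip is hypothetical, both ends use the loopback interface, and the client's connection request simply waits in the backlog queue until the server side calls accept( ). getsockname(), not covered in these notes, is assumed for finding the port the system assigned.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

/*
 * Hypothetical helper: run the server and client call sequences in one
 * process. The client sends `msg` (non-empty); the server side reads it into
 * `out` (capacity `outlen`) and NUL-terminates it. Returns the number of
 * bytes the server side received, or -1 on error.
 */
int tcp_loopback_roundtrip(const char *msg, char *out, size_t outlen)
{
    struct sockaddr_in addr, peer;
    socklen_t len = sizeof addr, peerlen = sizeof peer;
    int lsock = -1, csock = -1, conn = -1, n = -1;

    if (outlen == 0 || msg[0] == '\0')
        return -1;

    lsock = socket(AF_INET, SOCK_STREAM, 0);
    csock = socket(AF_INET, SOCK_STREAM, 0);
    if (lsock == -1 || csock == -1)
        goto done;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(0);               /* let the system pick a port */
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    /* Server: bind, learn the assigned port, listen with a backlog of 5. */
    if (bind(lsock, (struct sockaddr *)&addr, sizeof addr) == -1 ||
        getsockname(lsock, (struct sockaddr *)&addr, &len) == -1 ||
        listen(lsock, 5) == -1)
        goto done;

    /* Client: connect and send the request. */
    if (connect(csock, (struct sockaddr *)&addr, sizeof addr) == -1)
        goto done;
    if (write(csock, msg, strlen(msg)) != (ssize_t)strlen(msg))
        goto done;

    /* Server: accept returns a new descriptor; peerlen is value-result. */
    conn = accept(lsock, (struct sockaddr *)&peer, &peerlen);
    if (conn == -1)
        goto done;
    n = (int)read(conn, out, outlen - 1);
    if (n >= 0)
        out[n] = '\0';

done:
    if (conn != -1)
        close(conn);
    if (csock != -1)
        close(csock);
    if (lsock != -1)
        close(lsock);
    return n;
}
```

A real server would normally run in its own process and loop on accept( ), often forking a child for each connection as the Listen section describes.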

Send, sendto, recv, recvfrom

When a connection is established, the server and client can exchange data. This can be performed using:

• The standard read( ) and write( ) system calls.

• In the connection oriented mode, the send( ) and recv( ) system calls.

• In the connectionless mode, the sendto( ) and recvfrom( ) system calls.

#include <sys/types.h>
#include <sys/socket.h>
int send( int sock, char *buf, int bytes, int flags);
int sendto( int sock, char *buf, int bytes, int flags, struct sockaddr *to, int addr_to_len);
int recv( int sock, char *buf, int bytes, int flags);
int recvfrom( int sock, char *buf, int bytes, int flags, struct sockaddr *from, int *addr_from_len);
Returns the length of the data that was transferred or –1 on error.

The first three arguments are similar to the first three arguments of the read and write system calls.

The flags argument is 0 or formed by OR’ing one or more of the following constants.

MSG_OOB Send/receive out-of-band data.

MSG_PEEK Peek at the incoming message ( recv or recvfrom ).

MSG_DONTROUTE Send without using the routing tables ( send or sendto ).
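A sketch of the connectionless calls together with the MSG_PEEK flag is given below. The helper name udp_peek_then_receive is hypothetical, both sockets live in one process and use the loopback interface, and the sending socket is left unbound so that the system assigns its address on the first sendto( ).

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

/*
 * Hypothetical helper: send `msg` as one datagram over the loopback
 * interface, look at it with MSG_PEEK, then receive it normally into `out`
 * (capacity `outlen`). Returns the datagram length if both receives saw the
 * same length, or -1 otherwise.
 */
int udp_peek_then_receive(const char *msg, char *out, size_t outlen)
{
    struct sockaddr_in addr, from;
    socklen_t len = sizeof addr, fromlen = sizeof from;
    char peekbuf[64];
    ssize_t peeked, got;
    int rsock = -1, ssock = -1, rc = -1;

    if (outlen == 0 || strlen(msg) >= sizeof peekbuf)
        return -1;

    rsock = socket(AF_INET, SOCK_DGRAM, 0);
    ssock = socket(AF_INET, SOCK_DGRAM, 0);
    if (rsock == -1 || ssock == -1)
        goto done;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(0);
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    /* Receiver: bind and find out which port the system assigned. */
    if (bind(rsock, (struct sockaddr *)&addr, sizeof addr) == -1 ||
        getsockname(rsock, (struct sockaddr *)&addr, &len) == -1)
        goto done;

    /* Sender: each datagram carries its destination address. */
    if (sendto(ssock, msg, strlen(msg), 0,
               (struct sockaddr *)&addr, sizeof addr) != (ssize_t)strlen(msg))
        goto done;

    /* MSG_PEEK returns the datagram but leaves it queued. */
    peeked = recvfrom(rsock, peekbuf, sizeof peekbuf, MSG_PEEK,
                      (struct sockaddr *)&from, &fromlen);
    fromlen = sizeof from;                  /* value-result: reset it */
    got = recvfrom(rsock, out, outlen - 1, 0,
                   (struct sockaddr *)&from, &fromlen);

    if (peeked >= 0 && got == peeked) {
        out[got] = '\0';
        rc = (int)got;
    }

done:
    if (rsock != -1)
        close(rsock);
    if (ssock != -1)
        close(ssock);
    return rc;
}
```

The second recvfrom( ) returns the same datagram because the peek did not remove it from the queue.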


An A-Z Index of the Linux BASH command line

alias Create an alias

awk Find and Replace text within file(s)

break Exit from a loop

builtin Run a shell builtin

cal Display a calendar

case Conditionally perform a command

cat Display the contents of a file

cd Change Directory

chgrp Change group ownership

chmod Change access permissions

chown Change file owner and group

chroot Run a command with a different root directory

cksum Print CRC checksum and byte counts

clear Clear terminal screen

cmp Compare two files

comm Compare two sorted files line by line

command Run a command - ignoring shell functions

continue Resume the next iteration of a loop

cp Copy one or more files to another location

cron Daemon to execute scheduled commands

crontab Schedule a command to run at a later time

csplit Split a file into context-determined pieces

cut Divide a file into several parts

date Display or change the date & time

dc Desk Calculator

dd Data Dump - Convert and copy a file

declare Declare variables and give them attributes

df Display free disk space


diff Display the differences between two files

diff3 Show differences among three files

dir Briefly list directory contents

dircolors Colour setup for `ls'

dirname Convert a full pathname to just a path

dirs Display list of remembered directories

du Estimate file space usage

echo Display message on screen

ed A line-oriented text editor (edlin)

egrep Search file(s) for lines that match an extended expression

eject Eject CD-ROM

enable Enable and disable builtin shell commands

env Display, set, or remove environment variables

eval Evaluate several commands/arguments

exec Execute a command

exit Exit the shell

expand Convert tabs to spaces

export Set an environment variable

expr Evaluate expressions

factor Print prime factors

false Do nothing, unsuccessfully

fdformat Low-level format a floppy disk

fdisk Partition table manipulator for Linux

fgrep Search file(s) for lines that match a fixed string

find Search for files that meet a desired criteria

fmt Reformat paragraph text

fold Wrap text to fit a specified width.

for Expand words, and execute commands

format Format disks or tapes

free Display memory usage

fsck Filesystem consistency check and repair.


gawk Find and Replace text within file(s)

getopts Parse positional parameters

grep Search file(s) for lines that match a given pattern

groups Print group names a user is in

gzip Compress or decompress named file(s)

hash Remember the full pathname of a name argument

head Output the first part of file(s)

history Command History

hostname Print or set system name

id Print user and group id's

if Conditionally perform a command

import Capture an X server screen and save the image to file

info Help info

install Copy files and set attributes

join Join lines on a common field

kill Stop a process from running

less Display output one screen at a time

let Perform arithmetic on shell variables

ln Make links between files

local Create variables

locate Find files

logname Print current login name

logout Exit a login shell

lpc Line printer control program

lpr Off line print

lprint Print a file

lprintd Abort a print job

lprintq List the print queue

lprm Remove jobs from the print queue

ls List information about file(s)


m4 Macro processor

man Help manual

mkdir Create new folder(s)

mkfifo Make FIFOs (named pipes)

mknod Make block or character special files

more Display output one screen at a time

mount Mount a file system

mtools Manipulate MS-DOS files

mv Move or rename files or directories

nice Set the priority of a command or job

nl Number lines and write files

nohup Run a command immune to hangups

passwd Modify a user password

paste Merge lines of files

pathchk Check file name portability

popd Restore the previous value of the current directory

pr Convert text files for printing

printcap Printer capability database

printenv Print environment variables

printf Format and print data

ps Process status

pushd Save and then change the current directory

pwd Print Working Directory

quota Display disk usage and limits

quotacheck Scan a file system for disk usage

quotactl Set disk quotas

ram ram disk device

rcp Copy files between two machines.

read read a line from standard input

readonly Mark variables/functions as readonly


remsync Synchronize remote files via email

return Exit a shell function

rm Remove files

rmdir Remove folder(s)

rpm RPM Package Manager

rsync Remote file copy (Synchronize file trees)

screen Terminal window manager

sdiff Merge two files interactively

sed Stream Editor

select Accept keyboard input

seq Print numeric sequences

set Manipulate shell variables and functions

shift Shift positional parameters

shopt Shell Options

shutdown Shutdown or restart linux

sleep Delay for a specified time

sort Sort text files

source Run commands from a file `.'

split Split a file into fixed-size pieces

su Substitute user identity

sum Print a checksum for a file

symlink Make a new name for a file

sync Synchronize data on disk with memory

tac Concatenate and write files in reverse

tail Output the last part of files

tar Tape ARchiver

tee Redirect output to multiple files

test Evaluate a conditional expression

time Measure Program Resource Use

times User and system times


touch Change file timestamps

top List processes running on the system

traceroute Trace Route to Host

trap Run a command when a signal is set(bourne)

tr Translate, squeeze, and/or delete characters

true Do nothing, successfully

tsort Topological sort

tty Print filename of terminal on stdin

type Describe a command

ulimit Limit user resources

umask Users file creation mask

umount Unmount a device

unalias Remove an alias

uname Print system information

unexpand Convert spaces to tabs

uniq Uniquify files

units Convert units from one scale to another

unset Remove variable or function names

unshar Unpack shell archive scripts

until Execute commands (until error)

useradd Create new user account

usermod Modify user account

users List users currently logged in

uuencode Encode a binary file

uudecode Decode a file created by uuencode

v Verbosely list directory contents (`ls -l -b')

vdir Verbosely list directory contents (`ls -l -b')

watch Execute/display a program periodically

wc Print byte, word, and line counts

whereis Report all known instances of a command


which Locate a program file in the user's path.

while Execute commands

who Print all usernames currently logged in

whoami Print the current user id and name (`id -un')

xargs Execute utility, passing constructed argument list(s)

yes Print a string until interrupted