unix quick learn


Upload: nisanth

Post on 11-May-2015


DESCRIPTION

Learn Unix Commands in a day

TRANSCRIPT

Page 1: Unix Quick Learn

UNIX Overview

The UNIX operating system was designed to let a number of programmers access the computer at the same time and share its resources.

The operating system coordinates the use of the computer's resources, allowing one person, for example, to run a spell check program while another creates a document, another edits a document, another creates graphics, and another formats a document -- all at the same time, with each user oblivious to the activities of the others.

The operating system controls all of the commands from all of the keyboards and all of the data being generated, and permits each user to believe he or she is the only person working on the computer.

This real-time sharing of resources makes UNIX one of the most powerful operating systems ever.

Although UNIX was developed by programmers for programmers, it provides an environment so powerful and flexible that it is found in businesses, sciences, academia, and industry. Many telecommunications switches and transmission systems are also controlled by administration and maintenance systems based on UNIX.

While initially designed for medium-sized minicomputers, the operating system was soon moved to larger, more powerful mainframe computers. As personal computers grew in popularity, versions of UNIX found their way into these boxes, and a number of companies produce UNIX-based machines for the scientific and programming communities.

The uniqueness of UNIX

The features that made UNIX a hit from the start are:

· Multitasking capability
· Multiuser capability
· Portability
· UNIX programs
· Library of application software

Multitasking

Many computers do just one thing at a time, as anyone who uses a PC or laptop can attest. Try logging onto your company's network while opening your browser and a word processing program. Chances are the processor will freeze for a few seconds while it sorts out the multiple instructions.

UNIX, on the other hand, lets a computer do several things at once, such as printing out one file while the user edits another file. This is a major feature for users, since they don't have to wait for one application to end before starting another one.

Multiusers

The same design that permits multitasking permits multiple users to use the computer. The computer can take the commands of a number of users -- determined by the design of the computer -- to run programs, access files, and print documents at the same time.

The computer can't tell the printer to print all the requests at once, but it does prioritize the requests to keep everything orderly. It also lets several users access the same document by compartmentalizing the document so that the changes of one user don't override the changes of another user.


System portability

A major contribution of the UNIX system was its portability, permitting it to move from one brand of computer to another with a minimum of code changes. At a time when different computer lines of the same vendor didn't talk to each other -- let alone machines of multiple vendors -- that meant a great savings in both hardware and software upgrades.

It also meant that the operating system could be upgraded without having all the customer's data entered again. And new versions of UNIX were backward compatible with older versions, making it easier for companies to upgrade in an orderly manner.

UNIX tools

UNIX comes with hundreds of programs that can be divided into two classes:

· Integral utilities that are absolutely necessary for the operation of the computer, such as the command interpreter, and
· Tools that aren't necessary for the operation of UNIX but provide the user with additional capabilities, such as typesetting capabilities and e-mail.

Tools can be added or removed from a UNIX system, depending upon the applications required.

UNIX Communications

E-mail is commonplace today, but it has only come into its own in the business community within the last 10 years. Not so with UNIX users, who have been enjoying e-mail for several decades.

UNIX e-mail at first permitted users on the same computer to communicate with each other via their terminals. Then users on different machines, even made by different vendors, were connected to support e-mail. And finally, UNIX systems around the world were linked into a world wide web decades before the development of today's World Wide Web.

Applications libraries

UNIX as it is known today didn't just develop overnight. Nor were just a few people responsible for its growth. As soon as it moved from Bell Labs into the universities, every computer programmer worth his or her salt started developing programs for UNIX.

Today there are hundreds of UNIX applications that can be purchased from third-party vendors, in addition to the applications that come with UNIX.

How UNIX is organized

The UNIX system is functionally organized at three levels:

· The kernel, which schedules tasks and manages storage;
· The shell, which connects and interprets users' commands, calls programs from memory, and executes them; and
· The tools and applications that offer additional functionality to the operating system.

The three levels of the UNIX system: kernel, shell, and tools and applications.

The kernel

The heart of the operating system, the kernel controls the hardware and turns part of the system on and off at the programmer's command. If you ask the computer to list (ls) all the files in a directory, the kernel tells the computer to read all the files in that directory from the disk and display them on your screen.

The shell

There are several types of shell, most notably the command-driven Bourne Shell and the C Shell (no pun intended), and menu-driven shells that make it easier for beginners to use. Whatever shell is used, its purpose remains the same -- to act as an interpreter between the user and the computer.

The shell also provides the functionality of "pipes," whereby a number of commands can be linked together by a user, permitting the output of one program to become the input to another program.

Tools and applications

There are hundreds of tools available to UNIX users, although some have been written by third party vendors for specific applications. Typically, tools are grouped into categories for certain functions, such as word processing, business applications, or programming.
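To make the shell's "pipes" a little more concrete, here is a minimal sketch in which the output of each command becomes the input of the next. It assumes a Bourne-compatible shell (sh), and the three sample words are invented for the illustration.

```shell
# printf writes three words, one per line; sort puts them in
# alphabetical order; head keeps only the first line of the result.
first=$(printf 'pear\napple\nfig\n' | sort | head -n 1)
echo "$first"
```

Each | symbol joins two programs; none of the three commands needs to know about the others.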

Logging In & Out

When you have established contact with the Unix system, the login prompt will be displayed. You must give your username followed by your password:

login: lnp3jb
Password: secret1

The username can be up to 8 characters in length. Unix usernames contain only lowercase characters, and it is important that you type your username in lower case (if you don't, you will still be permitted to log in, but the shell will then not recognise case differences). The password must normally contain between 6 and 8 characters. On some unix systems the password must contain at least 1 non-alphabetic character.

System messages

When you log in a number of system messages may be displayed. The more filter will be used to control the output if the file contains more than a screenful of information. Just press the space bar to see the next screenful if it says 'more' at the bottom of the screen.

The message:

You have new mail

indicates that electronic mail has been sent to your mailbox.

The prompt

When your login procedure is completed you should see the system prompt. This indicates that the shell is running and is awaiting instructions from the user. The prompt can take many forms, and you can change it later on if you want to. Often the prompt will contain the % character, and a number in brackets. This number will represent the number of a command, and can be used to recall commands already issued. It may also display the name of the machine or system that you are logged onto. Some users prefer to have the name of the current working directory displayed in their prompt. For convenience, in this document, the % character will be used to represent the prompt.

Changing your password

Use the passwd command to change your password:

% passwd                          -where '%' is the prompt
Changing password for lnp5mw
Old password:                     -type in your old password
New password:                     -type in your new password
Retype new password:              -and again, to make sure
%

Logging out

When you have finished your unix session you must log out from the system. To do this give the command:

% logout

You should always wait for the message confirming that you have logged out.

On some unix systems you may receive the message:

logout: command not known

If this happens you should type:

exit

You may occasionally get the message:

There are stopped jobs

If this happens simply give the logout command again.

--------------------------------------------------------------------------------

PRACTICE

Log in to the unix system using your username and password. Change your password using the passwd command. You may find that the system will not change your password immediately. In this case you may have to use your old password next time that you log on.

--------------------------------------------------------------------------------


THE UNIX FILESTORE

--------------------------------------------------------------------------------

File hierarchy

Unix has a hierarchical tree-like filestore. The filestore contains files and directories.

The top-level directory is known as the root. Beneath the root are several system directories. The root is designated by the / character.

The directories below the root are designated by the pathnames:

/bin /etc /usr

Confusingly, the / character is also used as a separator in pathnames. Historically, user directories were often kept in the directory /usr. However, it is often desirable to organise user directories in a different manner.

Users have their own directory in which they can create and delete files, and create their own sub-directories. For example:

/user/ei/eib035

belongs to someone who has the username eib035.

Some typical system directories below the root directory:

/bin     contains many of the programs which will be executed by users
/etc     files used by system administrators
/dev     hardware peripheral devices
/lib     system libraries
/usr     normally contains applications software
/home    home directories for different systems

The current directory

This refers to your actual location in the filestore hierarchy. When you log in the current directory is set to the home directory. You can then change current directory, effectively moving around the filestore tree structure. The current directory is also called the "current working directory" and the "working directory". The current directory can be referred to in pathnames by the . character (a full stop).

Changing current directory

The command cd is used to change your current directory. For example:

% cd bin


will move you from your current directory, down one "branch" to the directory bin, if such a directory exists. Typing cd with no arguments takes you to your home directory.

Display current directory

The command pwd is used to display your current directory. For example:

% pwd
/home/sunserv1_b/lnp5jb/bin

Pathnames

Files and directories may be referred to by their absolute pathname. For example:

/home/sunserv1_b/lnp5jb/bin/hello

Files and directories may also be referred to by a relative pathname. For example, if your current directory is /home/sunserv1_b/lnp5jb, the above file can be referred to as:

bin/hello
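The equivalence of the two forms can be checked with a short sketch. It assumes a Bourne-compatible shell and the widely available mktemp command; the scratch directory and the contents of the file hello are invented for the example, so that no real files are touched.

```shell
# Build a small throwaway tree: <scratch>/bin/hello
top=$(mktemp -d)
mkdir "$top/bin"
echo 'hello' > "$top/bin/hello"

cd "$top"
# Relative pathname: resolved starting from the current directory.
rel=$(cat bin/hello)
# Absolute pathname: resolved starting from the root (/).
abs=$(cat "$top/bin/hello")
echo "$rel / $abs"
```

Both commands read the same file; only the starting point of the pathname differs.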

The home directory

Each user has a home directory. They will be attached to this directory when they log in. Jenny Brown's home directory is:

/home/sunserv1_b/lnp5jb

The symbol ~ can be used to refer to the home directory. If Jenny Brown wishes to refer to her file she can give:

~/bin/hello

rather than typing the long form:

/home/sunserv1_b/lnp5jb/bin/hello

The symbol ~ can also refer to the home directory of other users. For example Jenny can refer to a file in John Smith's home directory using:

~lnp5js/test.dat

The parent directory

The parent directory is the directory above the current directory. The parent directory can be referred to by the .. characters (two full stops). For example to refer to the file test.dat in the parent directory:

../test.dat
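A minimal sketch of the . and .. notation follows, assuming a Bourne-compatible shell and mktemp. The directory sub and the contents of test.dat are made up for the example.

```shell
# Scratch tree: <scratch>/test.dat and an empty directory <scratch>/sub
top=$(mktemp -d)
mkdir "$top/sub"
echo 'data' > "$top/test.dat"

cd "$top/sub"
# From inside sub, .. names the directory above it.
from_parent=$(cat ../test.dat)

# Move up one level; the same file is now in the current directory (.).
cd ..
from_here=$(cat ./test.dat)
echo "$from_parent $from_here"
```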


Linking files

The ln command can be used to link files and directories across the filestore system. The symbolic link function (ln -s) is the most useful. This enables a file or directory to appear to be in a particular directory when it is in fact stored somewhere else. This can save the user from having to type out long pathnames for frequently used files or directories. For example, if you want to use the files in /usr/games regularly, you can set up a symbolic link to this directory. If Jenny Brown is in her home directory and types:

% ln -s /usr/games fun

this will create what appears to be a new directory below her home directory, entitled fun. When she does cd fun she will move to /usr/games. If she now does pwd, the current directory will appear as /home/sunserv1_b/lnp5jb/fun. Some things may be a little surprising however: the parent directory, for example, will be that of the original file or directory.
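The same idea can be tried safely in a scratch area. The sketch below assumes a Bourne-compatible shell and mktemp; the directories games and home and the file list are stand-ins invented for the example, not the real /usr/games.

```shell
top=$(mktemp -d)
mkdir "$top/games" "$top/home"
echo 'chess' > "$top/games/list"

cd "$top/home"
# Create a symbolic link called fun that points at the games directory.
ln -s "$top/games" fun

# The file is reached through the link, although it is stored elsewhere.
via_link=$(cat fun/list)
echo "$via_link"
```

Removing the link later with rm fun removes only the link, not the directory it points to.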

--------------------------------------------------------------------------------

Exercises

Check which directory you are currently in. If necessary, move to your home directory. (Remember: cd will do this from anywhere). Move to the root directory. ("Move to..." means "change your current working directory to...". It is useful to picture the process as movement around the tree structure.) Work your way down one directory at a time to your home directory. Experiment with using relative and absolute pathnames; show how the two can produce the same results. Explore your system's filestore. Try to get into the home directory of someone else you know! (You may not be able to view their files.)

--------------------------------------------------------------------------------

UNIX COMMANDS

--------------------------------------------------------------------------------

Unix commands have the general format:

command [options] [item]

Items in square brackets are optional, and the words options and item are generic identifiers (i.e. options must be replaced by a particular option, e.g. -a).

Note that:


Commands are case sensitive. The command ls is different from LS. In fact LS is not recognised as a valid command.

Command options consist of a single character. The command to list all the files in a directory is ls -a and could not be ls -all (the latter would have to mean a combination of options.)

Command options can usually be combined or listed separately. For example:

ls -al or ls -a -l

The command item is given last. This is very often a file name. For example:

ls -a file1.f not ls file1.f -a
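As a quick check of the rule that single-character options can be combined or listed separately, the sketch below runs both forms and compares the results. It assumes a Bourne-compatible shell and mktemp; the two file names are invented.

```shell
top=$(mktemp -d)
cd "$top"
touch file1 .hidden

# The same two options, first combined and then listed separately.
combined=$(ls -al)
separate=$(ls -a -l)

if [ "$combined" = "$separate" ]; then
    echo "ls -al and ls -a -l give the same listing"
fi
```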

The echo command

The echo command 'echoes' its argument to the standard output. This means that in its simplest form it prints something out on screen. For example:

% echo Hello        - you type
Hello               - response from the shell
%

Who is logged on?

The command who gives a list of logged on users:

% who
root     console Jan 4 10:34
men6matw ttyp1   Jan 6 09:45 (ecusun1)
cbl6nd   ttyp2   Jan 6 10:10 (cblslcd)
cbl6ar   ttyp3   Jan 6 16:03 (cblsuna)
csc6ea   ttyp4   Jan 6 14:15 (csuna1)
root     ttyp5   Jan 6 10:40 (sun032)
ecl6rsh  ttyp6   Jan 6 15:39
csc6ea   ttyp8   Jan 6 14:15 (csuna1)
lnp5mw   ttyUf   Jan 6 16:16
lnp5jb   ttyp3   Jan 6 15:20 (sun051)

Also try the command finger. This command gives the full name of logged in users.

--------------------------------------------------------------------------------

PRACTICE

Type finger to get information on yourself and other users.

--------------------------------------------------------------------------------

Creating a directory

The mkdir command is used to create directories. The format of this command is:

% mkdir directory_name

Jenny Brown stores her unix scripts in a directory called scripts beneath her home directory. In order to create this directory she uses the command:

% mkdir scripts

Deleting a directory

The rmdir command is used to delete directories. The format of this command is:

% rmdir directory_name

Jenny Brown stores files for project work in a directory called proj. When the project has been completed she deletes the directory using the command:

% rmdir proj

Note that the directory must be empty before it can be deleted.
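The requirement that a directory be empty can be seen in a short sketch, assuming a Bourne-compatible shell and mktemp. The file notes is invented for the example.

```shell
top=$(mktemp -d)
cd "$top"

mkdir proj
touch proj/notes

# rmdir refuses to remove a directory that still contains a file.
if rmdir proj 2>/dev/null; then
    first_try=removed
else
    first_try=refused
fi
echo "first attempt: $first_try"

# Empty the directory, then try again; this time rmdir succeeds.
rm proj/notes
rmdir proj
```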

Listing contents of a directory

The command ls is used to list the contents of a directory. For example:

% ls
file1 scripts test.f test

Notice that directories are listed as well as files. To list all files, including hidden files, give the command:

% ls -a
.cshrc file1 bin test.f test

Hidden files begin with . (a full stop). Hidden files are normally system files, and will normally include the following:

% ls -a
.cshrc .forward .history .login .logout

.cshrc     contains commands that are executed every time you start off a C-shell, including when you log in
.forward   enables you to redirect your mail to another computer
.history   contains a record of previously executed commands
.login     contains commands that are executed at login time
.logout    contains commands that are executed at logout time

The purpose of some hidden files.

To identify directories in a listing give the command:

% ls -F
file1 bin/ test.f test

Notice how the directory is identified by the slash (/) character.
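The three forms of ls described above can be compared side by side in a scratch directory. This is a sketch assuming a Bourne-compatible shell and mktemp; the file names simply mirror those in the listings above.

```shell
top=$(mktemp -d)
cd "$top"
touch file1 .cshrc
mkdir bin

plain=$(ls)        # hidden files are left out
all=$(ls -a)       # hidden files are included, along with . and ..
marked=$(ls -F)    # directories are marked with a trailing /
echo "$marked"
```

When the output of ls is captured like this rather than shown on a terminal, each name appears on a line of its own.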

Deleting files

Files can be deleted using the rm command. For example:

% rm test.f

Displaying files

The command cat is used to display the contents of a file on the screen.

For example:

% cat file1

Creating files

The command cat can also be used to create a file. For example:

% cat > test.f
When typing in a new file
the input must be terminated by
^D

NOTE ^D means press the <ctrl> and the d keys simultaneously. Be careful not to type ^D when you have the shell prompt, because this might log you out. Normally you would use an editor for creating files. This example is given since it illustrates how to create a small file without needing to learn the use of an editor.
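Inside a script there is no keyboard at which to type ^D, so a "here-document" is one common way to supply the input and mark where it ends. The sketch below assumes a Bourne-compatible shell and mktemp; the word EOF is an arbitrary marker chosen for the example.

```shell
top=$(mktemp -d)
cd "$top"

# cat copies its standard input into test.f. The line EOF plays the
# part that ^D (end-of-file) plays when typing at the keyboard.
cat > test.f <<'EOF'
When typing in a new file
the input must be terminated by
EOF

# Display the file to confirm what was stored.
cat test.f
```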

Copying files

The command cp is used to copy a file. It takes the format:

% cp old_file new_file

For example:

% cp file1 file2

Renaming files

The command mv is used to rename a file.


For example:

% mv file2 temp

changes the name of file2 to temp.

Moving files

The command mv is also used to move a file to a new location in the filestore hierarchy. For example:

% mv file2 bin

moves the file file2 into the subdirectory bin.
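Copying, renaming and moving can be followed step by step in a scratch directory. This is a sketch assuming a Bourne-compatible shell and mktemp; the file contents are invented.

```shell
top=$(mktemp -d)
cd "$top"
echo 'first' > file1
mkdir bin

cp file1 file2      # copy: file1 and file2 now both exist
mv file2 temp       # rename: file2 becomes temp
mv temp bin         # move: bin is a directory, so temp goes inside it

ls bin
```

Whether mv renames or moves depends on the last argument: if it names an existing directory the file is moved into it, otherwise the file is renamed.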

Overwriting files

Commands such as rm and cp can be dangerous if not used with care. The command:

% cp file1 file2

will delete file2 if a file of that name already exists. If you have spelled the name of the new file incorrectly you may accidentally overwrite the contents of a file. Using the wildcard symbol * with the command rm can also be very dangerous. The command:

% rm test*

will delete all files starting with test. However if you inadvertently type an extra space (do not try this!):

% rm test * -do not try this!

the file test will be deleted if it exists. Then all other files in the directory will be deleted! Often no warning will be given.

To prevent accidental deletion of files you can use the -i option with commands such as rm. The format of the command is:

% rm -i file

You will be asked to confirm that files are to be deleted. You may find that this is set as the default on your system.
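One cautious habit, which is a suggestion rather than something the text above prescribes, is to preview what a wildcard matches with echo before giving the same pattern to rm. The sketch below assumes a Bourne-compatible shell and mktemp, and works in a scratch directory with invented file names. It also shows rm -i being answered with n from a pipe, where normally you would type the answer yourself.

```shell
top=$(mktemp -d)
cd "$top"
touch test test1 test.f keep.txt

# echo shows what the shell expands the pattern to; nothing is deleted.
preview=$(echo test*)
echo "would delete: $preview"

# With -i, rm asks before deleting; the answer n leaves the file alone.
echo n | rm -i keep.txt 2>/dev/null

# Only once the preview looks right is the same pattern given to rm.
rm test*
remaining=$(ls)
echo "left behind: $remaining"
```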

Wildcards

Wildcard characters can be used to identify directory and file names. The wildcard character * is used to refer to any combination of characters. For example:

% ls * - refers to all files


% cat test* - refers to all files starting with 'test', e.g. 'test', 'testing', 'test.c', etc.

The wildcard character ? is used to refer to a single character. For example:

% ls test?      - refers to files starting with 'test' followed by a single character, e.g. 'test1', 'test2', 'testz', etc.
% cat test.?    - refers to all files starting with 'test' with a single character after the full stop, e.g. 'test.c', 'test.f'
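The difference between * and ? is easiest to see by listing the same set of files with each pattern. The sketch assumes a Bourne-compatible shell and mktemp; the file names are taken from the examples above.

```shell
top=$(mktemp -d)
cd "$top"
touch test test1 test2 testing test.c test.f

star=$(ls test*)      # every name starting with test (all six files)
one=$(ls test?)       # test plus exactly one more character
dotted=$(ls test.?)   # test. plus exactly one more character

echo "$one"
echo "$dotted"
```

Note that test? does not match test itself, nor testing, because ? stands for exactly one character.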

--------------------------------------------------------------------------------

Exercises

Display your current working directory using the pwd command.
Make a directory called exercises.
Change your directory to the directory exercises. Display the current working directory.
Return to your home directory.
List the contents of your directory. Use the -l, -a and -F options and compare the output.
Change your directory to the directory exercises.
Create a file called example1 using the cat command containing the following text:

water, water everywhere
and all the boards did shrink;
water, water everywhere,
Nor drop to drink

List the contents of your directory. Use the -l option to obtain a long listing.

Viewing files with the more command

The command more is used to display the contents of a file on the screen. The command is particularly useful for viewing long files since the display stops at the bottom of the screen. The following is a listing of a program in the Icon programming language:

% more lookup.icn
# program to look up words (given at the terminal) in the
# computer usable version of the OALD
# last change 18.12.91
# set global parameters
global k
# main body
procedure main()
# input word to be searched for
  write("Give me a word: \n")
  word:=read()
# this the important line - call the 'lookup' procedure
  if not write(lookup(word)) then write("Not found in the dictionary.")
end
procedure lookup(voc)
# connect to the dictionary
(dict:=open("/home/sunserv1_a/ecl6rsh/oald.mitton/cuv2")) | stop("can't open the dictionary")
# lookup algorithm
  every k:=1 to *voc do {
--More-- (75%)

The message at the bottom of the screen means that 75% of the file has been viewed so far. (The amount shown on screen will depend on the type of terminal you are using.) You can now do the following:

To continue viewing press the space bar

To view the next line press <RETURN>

To quit press the <q> key

To jump to the next occurrence of a string of characters type /string

For a list of valid commands press the <h> key.

Viewing files with the pg command

The pg command is also available on some systems. This is an alternative to more.

% pg lookup.icn
# program to look up words (given at the terminal) in the
# computer usable version of the OALD
# last change 18.12.91

# set global parameters
global k

# main body
procedure main()
# input word to be searched for
  write("Give me a word: \n")
  word:=read()
# this the important line - call the 'lookup' procedure
  if not write(lookup(word)) then write("Not found in the dictionary.")
end

procedure lookup(voc)
# connect to the dictionary
(dict:=open("/home/sunserv1_a/ecl6rsh/oald.mitton/cuv2")) | stop("can't open the dictionary")
# lookup algorithm
  every k:=1 to *voc do {
    bit:=bite(voc)

Commands can be typed to the ':' prompt at the bottom of the screen: Type <RETURN> to view the next screen. Type <h> for a list of valid commands.

--------------------------------------------------------------------------------

PRACTICE

--------------------------------------------------------------------------------

If you have a file longer than 20 lines use pg to view it. Compare the use of pg with more. Use them both on the file /etc/passwd, and find the listing for your own username.

Searching for strings in files

The command grep is used to search a file for a string of characters. For example, to search the file lookup.icn for the character '#' (which designates comments in the program), use the command:

% grep # lookup.icn
# program to look up words (given at the terminal) in the
# computer usable version of the OALD
# last change 18.12.91
# set global parameters
# main body
# input word to be searched for
# this the important line - call the 'lookup' procedure
# connect to the dictionary
# lookup algorithm

A lot of pattern matching operations can be carried out with grep. The following example shows the use of a regular expression. In this example, the search is restricted to lines beginning with the 'p' character.

% grep "^p" lookup.icn
procedure main()          -output starts here
procedure lookup(voc)
procedure bite(voc2)

You will learn more about pattern matching expressions later.
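In the meantime, the two grep searches above can be tried on a small stand-in file. The sketch assumes a Bourne-compatible shell and mktemp; sample.icn and its four lines are invented for the example. In a Bourne-style shell the # character is quoted, since unquoted it would begin a comment.

```shell
top=$(mktemp -d)
cd "$top"
cat > sample.icn <<'EOF'
# main body
procedure main()
  write("a prompt")
end
EOF

# Quoting '#' stops the shell from treating it as the start of a comment.
comments=$(grep '#' sample.icn)

# ^ anchors the pattern to the beginning of the line, so the line
# containing "prompt" is not matched even though it contains a p.
starts_with_p=$(grep '^p' sample.icn)

echo "$comments"
echo "$starts_with_p"
```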


Control characters

The actual key sequences for the following operations can vary between different systems and different terminals. The most commonly used key sequences are described below. If it is different on your system, remember the correct sequence and use it whenever the key sequences below are referred to later in the text. Where possible the operation itself is named (e.g. end-of-file), and not just the key sequence.

Deleting the last character typed

If you make a typing mistake you can delete the last character typed by using your delete key, which is usually the one marked <DEL> or <DELETE>.

Deleting the entire line

If you make many typing mistakes you can delete the entire line by typing ^U.

NOTE Remember ^U means "press <CTRL> and <u> keys simultaneously".

Sending an interrupt

If you wish to terminate the execution of a command type ^C.

Sending an end-of-file character

In many Unix commands you need to finish your input with an end-of-file character. The default end-of-file character is ^D.

Printing on paper

This is usually called 'obtaining hard copy output', as distinct from output to the screen or a file. The command lpr sends a file to the line printer:

% lpr file1

Note that the command lp is used on some Unix systems. The command:

% lpr -Pprinter file

is used to submit the file to a specific printer.

--------------------------------------------------------------------------------

The locally developed command printers can be used to obtain a list of printers.

--------------------------------------------------------------------------------

Getting help

The command man is used to display help on the syntax of Unix commands. The format of this command is:

% man [option] [file]

For example to obtain help information on the who command, type:

% man who

The keyword option -k keyword is used to display a list of help files associated with the keyword. For example to display a list of all man files associated with password type the command:

% man -k password
getpass(3)   read a password
passwd(1)    change login password
passwd(5)    password file

The command man automatically invokes the more program for viewing files. You can use the normal more commands to continue viewing.

--------------------------------------------------------------------------------

If you have any problems that can't be solved by referring to the manual, please consult your supervisor or the Advisory Service. The Help Desk can be contacted in person in the User Access Area, on the telephone on extension 5366, or by email to helpdesk. Also the LUCS Unix system operators can be contacted on telephone extension 5380. With non-urgent problems, an email message to your supervisor is usually the most efficient way of getting help. (See next chapter on how to use email.)

--------------------------------------------------------------------------------

Exercises

1. Display a list of logged on users.

2. Obtain further information for a particular user using the finger command.

3. Use the man command to obtain further information on the finger command.

4. Use the man -k command to find what manual entries there are related to passwords.

5. Use the grep command to search the file example1 for occurrences of the string 'water'.


6. Use man and the keyword option to find out more information on communications and e-mail in Unix.

7. Print out a file on paper.

COMMUNICATIONS

--------------------------------------------------------------------------------

Mail

The mail command enables the user to send and receive electronic mail messages, both to and from users on the same Unix system and to and from remote users.

This is the basic mail command. Enhanced versions, such as programs that run under a windows program (e.g. mailtool), or screen-based versions of mail (e.g. elm) may be available, and you will probably find them preferable to mail. If so, much of the following can safely be ignored. Remember however that some version of mail will definitely be available on any unix system that you use.

Sending mail

To send a message to a user on your system, type:

% mail username

The cursor will move to the next line, and you will get a Subject: prompt. You can now type in the subject of your message, and then press <RETURN>. The cursor will go to the start of the next line and there will be no prompt. You now type in the text of your message. Terminate each line with <RETURN>. When you have finished the text of the message, type an end-of-file character (usually ^D), or a full-stop character. You should now return to your normal shell prompt. If the message is dispatched successfully, you will hear no more about it. The following is an example of the mail command in action:

% mail lnp6ttld
Subject: UNIX course
I don't think I'll ever be able to get the students
in the UNIX course to understand how to use e-mail.
^D
%

Entering the text of the message by this method is a rather crude process. Errors on the line being typed can be erased with your delete key, but once you have pressed <RETURN>, a line cannot be edited. A message may be aborted by pressing ^C twice.

--------------------------------------------------------------------------------


PRACTICE

--------------------------------------------------------------------------------

Send yourself a message. (You will find out where it has gone in the next section.)

Subcommands while entering mail

There are several commands you can type while entering mail:

<CTRL/Z>   will cancel the message, and leave the text in a file named dead.letter.

~e         invoke a text editor to edit your message.

~v         invoke a screen editor to edit your message.

~f         reads the contents of the message you have just read into your message text.

~r file    reads contents of file into your message text.

While this method is quick and easy to use, and quite adequate for short and simple messages, many users prefer to first create a file containing the text of the message, and then mail this file to the intended recipient. This enables you to use any system editor and formatter to create the message, and you do not need to send it immediately.

The following command shows how to send a file called note, containing the text of a message, to another user.

% mail lnp6ttld < note

To understand fully how this works see the section on 'Re-direction of standard input' in Chapter 8 below.

In this example the message will not contain a subject heading, unless one has already been included as the first line of the file note. The mail command has a -s option that can be used to include a subject header, as follows:

% mail -s UNIX lnp6ttld < note

The string following the -s is the subject; in this case, the subject is "UNIX".

Receiving mail

If new mail is waiting for you when you login, you will see the message:

You have new mail

To start the mail program type the command:

% mail

Each message is summarised on a numbered list. The current message is marked with a ">" character. The mail prompt character is "&". Type the number of the message you want to read, or just press <RETURN> to read through the list. The list of mail headers will look something like this:

% mail
Mail version SMI 4.0 Thu Oct 11 12:59:09 PDT 1990 Type ? for help.
"/usr/spool/mail/lnp5jb": 2 messages 2 new
>N 1 lnp5mw Thu Jan 9 15:10 11/262 hello
 N 2 lnp5js Thu Jan 9 15:11 10/287 party
&

This tells Jenny Brown (user lnp5jb) that she has two messages, one from user lnp5mw, and one from lnp5js. The date and time at which the messages were received are also listed, and so is the subject header (the last item on each line - here 'hello' and 'party'). The following commands can be entered at the mail prompt:

d Mark the current message for deletion

d n Mark message number n for deletion

u n Undelete message number n

s file Save the current message in file with the mail header and mark for deletion

w file Save the current message in file without the mail header and mark for deletion

r Reply to the current message

q Quit mail, removing deleted messages from your system mailbox. Undeleted messages that have been read are normally stored in your personal mailbox (see below)

x Exit mail, leaving your mailbox untouched, i.e. messages deleted in this session are restored

h Show list of message headers

? List the useful mail commands

! command Execute specified shell command

- Re-read previous message.

m recipient Send mail to named recipient

Files used by mail

~/mbox Your personal mailbox, located in your home directory. This is where messages that you have saved are stored, unless you specified another location when you saved them. You can access this file by issuing the command:

% mail -f mbox

~/.mailrc A file that can hold commands for mail to obey when it starts up.

--------------------------------------------------------------------------------

PRACTICE

--------------------------------------------------------------------------------

See if you have received any mail. If you have, save a message to your mailbox file. Send yourself another message, and this time discard it. Send a message to another user.

Sending mail to remote users

The following also applies to the elm mail program.

Sending mail to users on other computer systems is simple using mail. Simply type the full address of the remote user where the system username is used above. For example:

% mail [email protected]
% mail -s Hello [email protected] < note

These two examples correspond to the two ways of sending mail shown above: typing the message at the keyboard, and mailing the contents of a file.

It is also possible to use mail to look at folders of mail that you have already received. To do this type:

% mail -f folder_name

and it will treat the messages in the folder as incoming mail.

Sending on-line messages

As you have seen, messages sent using mail are received in a special buffer, and it is up to the recipient when to look at them and what to do with them. It is also possible to send a message that will simply appear on the screen of the recipient, if they are logged on. This is less useful than mail for the following reasons:

mail can be used irrespective of whether the recipient is logged on or not.

mail messages can be stored by the recipient. This means that files can be transferred by mail, and a record of transactions can be kept.

On-line messages can be confused with whatever the recipient has on screen and can easily disrupt what they are doing. They can be very annoying!

On the other hand, on-line messages do have the advantage of obtaining the immediate attention of another user, and it is possible to have an interactive conversation. Bearing these facts in mind, use the following command with caution!

write

The write command is used to send on-line messages to another user on the same machine.

The format of the write command is as follows:

% write username
text of message
^D

After typing the command, you enter your message, starting on the next line, terminating with the end-of-file character. The recipient will then hear a bleep, then receive your message on screen, with a short header attached. The following is a typical exchange. User lnp5jb types:

% write lnp8zz
Hi there - want to go to lunch?
^D
%

User lnp8zz will hear a beep and the following will appear on his/her screen:

Message from lnp5jb on sun050 at 12:42
Hi there - want to go to lunch?
EOF

If lnp8zz wasn't logged on, the sender would see the following:

% write lnp8zz
lnp8zz not logged in.

SunOS has the talk command. This has several advantages over write. Firstly, talk can call other machines on a network. Secondly, talk provides a clearer interface for the exchange of messages, dividing the screen into two windows for the interlocutors. Type

talk username@machine

to start a conversation.

--------------------------------------------------------------------------------

PRACTICE

--------------------------------------------------------------------------------

Try to have an extended on-line conversation with another user.

You can stop messages being flashed up on your screen if you wish. To turn off direct communications type:

% mesg n

It will remain off for the remainder of your session, unless you type:

% mesg y

to turn the facility back on. Typing just mesg lets you know whether it is on or off.

Remote logins

It is possible to log on to another machine on a Unix network, provided that you have permission to do so. To do this use the rlogin command. Type:

rlogin machine

and you will be asked for your password. It may be necessary for you to do this to make on-line communications with another user easier.

--------------------------------------------------------------------------------

Exercises

1. Send a message to another user on your Unix system, and get them to reply.

2. Create a small text file and send it to another user.

3. When you receive a message, save it to a file other than your mailbox. (Remember you can always send yourself a message if you don't have one.)

4. Send a message to a user on a different computer system.

5. Send a note to your course tutor telling him that you can use mail now.

FILE PERMISSIONS

--------------------------------------------------------------------------------

What are file permissions?

The Unix file security system can prevent unauthorised users from reading or altering files.

Every file and directory has specific permissions associated with it, giving different categories of user certain permissions to look at or change a file, and to run executable files.

NOTE Executable files are files containing commands that can themselves be executed as if the file itself were a command.

The file permissions can be displayed using the command:

% ls -l [filename]

For example, to display the permissions on the file lookup.icn, type the command:

% ls -l lookup.icn
-rw-r--r-- 1 lnp5jb 777 Dec 18 lookup.icn

The first set of characters in the output from the command (-rw-r--r--) gives the permissions. The username in the middle of the line (lnp5jb) is the owner of the file. This is the user who created the file. The following fields tell you the number of characters in the file, the date it was last modified and the name of the file.

Note that the first character specifies the file type. This is normally one of the following:

- indicates a file

d indicates a directory

The following nine characters represent permissions for different classes of users. Users on a Unix system are assigned to a group or groups, which might correspond to a particular department, or research group in the real world. Members of a particular group can be allowed access to files belonging to other members of the group.

The second, third and fourth characters in the permissions string represent permissions that apply to the owner of the file. The next three characters apply to members of the owner's group. The last three apply to all other users. The file in this example therefore has rw- for the owner, r-- for the group and r-- for others.

The three characters corresponding to each class of user each represent a different type of permission. The first character represents 'read' permission. This means that a user has permission to open a file and view the contents. If there is an r in this position then that class of users has read permission. In this example all users have read permission. In this, and in every case, a horizontal bar character (-) means that permission is denied.

The second position represents 'write' permission (the right to make changes to a file). In the example, only the owner has write permission. Normally, you will not want others to be allowed to make changes to your files, so write permission is only allowed to the owner.

The third position represents 'execute permission'. This means permission to 'execute', or run, a file that works like a command. In this example no-one has execute permission for the file lookup.icn (it is an Icon program, and it would have to be compiled before it could be executed, so execute permission would be useless). To summarise the above, this is how the permissions string is divided up:

-             rw-    r--    r--
type of file  owner  group  others

Here is another example, this time an executable file:

-rwxr-x--x 1 lnp5jb 562 Jan 10 hello

This tells us that hello is a file; the owner is lnp5jb, the owner has read, write and execute permission; the group has read and execute permission; others just have execute permission.
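The way the permissions string divides up can be checked at the terminal. The following is a minimal sketch, written for a POSIX shell (sh) rather than the C shell shown at the % prompt. It assumes the standard cut utility, which this course has not covered, and it uses a scratch file in a temporary directory as a stand-in for lookup.icn. The chmod command used to set the permissions is described in the next section.

```shell
# Make a scratch file and give it the same permissions as lookup.icn.
workdir=$(mktemp -d)
touch "$workdir/lookup.icn"
chmod u=rw,g=r,o=r "$workdir/lookup.icn"

# The first ten characters of the long listing are the file type
# followed by the nine permission characters.
perms=$(ls -l "$workdir/lookup.icn" | cut -c1-10)

# Split the string into its four parts.
echo "type of file: $(printf '%s\n' "$perms" | cut -c1)"
echo "owner:        $(printf '%s\n' "$perms" | cut -c2-4)"
echo "group:        $(printf '%s\n' "$perms" | cut -c5-7)"
echo "others:       $(printf '%s\n' "$perms" | cut -c8-10)"
```

Each cut -c range picks out one of the fields shown in the summary above.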

--------------------------------------------------------------------------------

PRACTICE

--------------------------------------------------------------------------------

What are the default permissions for your files and directories? Are they all the same?

When you copy a file what file permissions does the new file have?

Changing file permissions

The command chmod is used to change the permissions on a file. The format of this command is:

% chmod mode filename

For example, to add read permission for the group to the file file1, give the command:

% chmod g+r file1

chmod modes

In the command:

% chmod mode filename

the mode consists of three elements:

who

operator

permissions

The following options are possible:

who:

u user (owner)

g group

o other

a all

operators:

- remove permission

+ add permission

= assign permission

permissions:

r read

w write

x execute

For example:

chmod o-rw file1.f

removes read and write permissions from others.

chmod u+x test

adds execute permission for the owner.
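The three operators can be tried out in sequence on a scratch file. This is a rough sketch for a POSIX shell (sh); the helper function show and the use of cut to trim the listing are assumptions added for illustration, not part of the course material.

```shell
# Work on a scratch file so that no real files are affected.
workdir=$(mktemp -d)
f="$workdir/file1"
touch "$f"

# Helper (hypothetical): print just the ten-character permissions string.
show() { ls -l "$f" | cut -c1-10; }

chmod a=rw "$f"; p1=$(show)   # assign read and write to everyone:  -rw-rw-rw-
chmod o-rw "$f"; p2=$(show)   # remove read and write from others:  -rw-rw----
chmod u+x "$f";  p3=$(show)   # add execute for the owner:          -rwxrw----
chmod g=r "$f";  p4=$(show)   # assign exactly read to the group:   -rwxr-----

printf '%s\n' "$p1" "$p2" "$p3" "$p4"
```

Note the difference between + and =: the + operator leaves the other permissions for that class alone, while = replaces them.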

Permissions for directories

Read, write and execute permissions are set for directories as well as files. Read permission means that the user may see the contents of a directory (e.g. use ls for this directory). Write permission means that a user may create or remove files in the directory. Execute permission means that the user may enter the directory (i.e. make it their current directory).

--------------------------------------------------------------------------------

Exercises

1. Try to move to the home directory of someone else in your group. There are several ways to do this, and you may find that you are not permitted to enter certain directories. See what files they have, and what the file permissions are. (Remember that you can protect your own files from prying eyes, or from interference.)

2. Try to copy a file from another user's directory to your own.

3. Set permissions on all of your files and directories to those that you want. You may want to give read permission on some of your files and directories to members of your group.

STANDARD INPUT AND OUTPUT

--------------------------------------------------------------------------------

Standard input

Input to Unix commands is normally given from the keyboard. For example you can use the cat command interactively:

% cat
Hello - you type
Hello - response
there - you type
there - response
^D - you type
%

Note that input from the keyboard is terminated with the end-of-file character, usually ^D. For another example consider the spell command, which is the Unix spelling checker:

% spell - you type
Input to the spell ulitity - you type
is typed at the keyboard - you type
^D - you type
ulitity - response

The spell command outputs words that are incorrectly spelled in the input.

Standard output

Output from Unix commands is normally displayed on the screen. For example:

% spell
Input to the spell ulitity
is typed at the keyboard
^D
ulitity - output

--------------------------------------------------------------------------------

PRACTICE

--------------------------------------------------------------------------------

Try out the spell checker. See how it copes with British spellings (remember it's an American system), proper nouns, hyphens and recently coined vocabulary.

Re-direction of standard input

It is possible to redirect standard input so that the input is taken from a file. Imagine you wish to check for spelling errors in a report. The text can be put into the file report, which can then be fed into the spell command:

% cat > report
Input to the spell ulitity
can come from a file
^D
% spell < report
ulitity

The < character is used to re-direct the input from the file report to the command spell. The general format for re-direction of standard input is:

command < filename

Another common use of re-direction of standard input is to mail a file to another user. The command:

% mail lnp8zz < report

will mail the file report to local user lnp8zz.
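Neither spell nor mail is installed on every system, so here is a small sketch of the same idea using wc, the word count program. It is written for a POSIX shell (sh), and the file is created in a temporary directory purely for illustration.

```shell
# Put two lines of text into a scratch file called report.
workdir=$(mktemp -d)
printf 'Input to the spell utility\ncan come from a file\n' > "$workdir/report"

# wc reads the file through its standard input, so it never sees the
# filename and prints only the count.
lines=$(wc -l < "$workdir/report")
echo "report has $lines lines"
```

Compare this with giving the filename as an argument (wc -l report), where wc opens the file itself and prints the name alongside the count.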

Re-direction of standard output

You do not always want the output from a Unix command to be displayed on the screen. It has already been shown how it is possible to direct the output from the cat command to a file. Imagine you want a list of your files and directories kept in a file. You would use the command:

% ls > filelist

The > character is used to re-direct the output from the command to the file called filelist. The general format for re-direction of standard output is:

% command > filename

Note that output directed to the file /dev/null is effectively discarded. This is the system 'wastebasket'.

Another example involves directing the output of echo to a file:

echo "Hello there" > greeting

This would normally overwrite any existing contents of the file greeting. Study the following sequence:

% echo "Hello there" > greeting
% cat greeting
Hello there
% echo "This instead" > greeting
% cat greeting
This instead

It is possible to append output to a file, rather than overwriting it, by using the >> operator. For example:

% echo "Hello there" > greeting
% cat greeting
Hello there
% echo "and goodbye" >> greeting
% cat greeting
Hello there
and goodbye

Look carefully at the difference between these two examples.
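The two operators can also be combined in one short sequence. This sketch assumes a POSIX shell (sh) and writes to a scratch file in a temporary directory, so that nothing in your own directory is overwritten.

```shell
workdir=$(mktemp -d)
greeting="$workdir/greeting"

echo "Hello there"  >  "$greeting"   # creates the file
echo "This instead" >  "$greeting"   # > overwrites the previous contents
echo "and goodbye"  >> "$greeting"   # >> appends to what is already there

cat "$greeting"

# Output sent to /dev/null is simply thrown away.
echo "nobody will see this" > /dev/null
```

Only the second and third lines survive in the file, because the second echo replaced the first.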

Re-direction of input and output

It is possible to re-direct both standard input and output. If you have a report containing many spelling mistakes you may wish to keep a list of the mistakes in a file. You can do this using the following command:

% spell < report > errors
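The same pattern works with any command that reads standard input and writes standard output. Since spell may not be installed, the sketch below swaps in sort as a stand-in; it assumes a POSIX shell (sh), and the file names fruit and sorted are made up for the example.

```shell
workdir=$(mktemp -d)
printf 'pear\napple\nfig\n' > "$workdir/fruit"

# Input comes from the file fruit; output goes to the file sorted.
sort < "$workdir/fruit" > "$workdir/sorted"

cat "$workdir/sorted"
```

The input file is left unchanged; only the output file is created or overwritten.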

Piping

Output from one command can be sent ('piped') to the input of another command using the | character:

command1 | command2

A common use for pipes is to control the output of large files to the screen. It is possible to send output to the more command so that only one screenful at a time is output. If the command

% ls -l

is used to give a long listing of all files and directories there may be too many lines to see them all at once on the screen. (If you don't have many files, move to /etc where there should be plenty.) Output from ls -l can be piped to more as follows:

% ls -l /etc | more

You can then use the usual more commands to control the output.

In the output from ls -l, directories are identified by the d character at the start of each line. A list of just the directories can be obtained by piping the output of this command to the grep command, giving grep a pattern that matches only lines with the d character at the start of the line. The command is:

% ls -l | grep "^d"

The commands sort and grep are often used when piping. For example:

% cat phonenos | sort | lpr

will send an alphabetically sorted list of the phone numbers contained in the file phonenos to the line printer. The command:

% cat phonenos | grep leeds | sort | lpr

will send a sorted list of phone numbers containing the string 'leeds' to the line printer.
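The pipelines above can be tried without a printer by leaving off the final lpr stage. The following sketch assumes a POSIX shell (sh); the directory names and the entries in phonenos are invented for the example.

```shell
# Build a small scratch directory: two directories and two files.
workdir=$(mktemp -d)
mkdir "$workdir/docs" "$workdir/src"
touch "$workdir/notes.txt"
printf 'smith leeds 555-0101\njones york 555-0102\nbrown leeds 555-0103\n' > "$workdir/phonenos"

# Three commands in a row: list, keep lines starting with d, count them.
dir_count=$(ls -l "$workdir" | grep "^d" | wc -l)
echo "directories: $dir_count"

# Select the leeds entries, then sort them alphabetically.
leeds_sorted=$(grep leeds "$workdir/phonenos" | sort)
printf '%s\n' "$leeds_sorted"
```

Here grep is given the filename directly rather than being fed by cat; the result is the same as cat phonenos | grep leeds | sort.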

--------------------------------------------------------------------------------

Exercises

1. Put a listing of the files in your directory into a file called filelist. (Then delete it!)

2. Create a text file containing a short story, then use the spell program to check the spelling of the words in the file.

3. Redirect the output of the spell program to a file called errors.

4. Type the command ls -l and examine the format of the output. Pipe the output of the command ls -l to the word count program wc to obtain a count of the number of files in your directory.

AN INTRODUCTION TO THE EX LINE EDITOR

--------------------------------------------------------------------------------

What's ex for?

Editors available on Unix include:

ed basic line editor

ex line editor

vi screen editor

emacs screen editor

Ex is an enhanced and more friendly version of ed. Vi is a screen-based version of ex. Most users have no practical use for a line editor nowadays, and they are really a relic of an earlier age in computing. However, you may occasionally have to use ex, if for some reason you can't run a screen editor on your terminal. It is covered here mainly to teach something else, namely, the way that Unix handles texts. This is perhaps most transparent when you are using ex. Ex forces the user to use complicated pattern matching operations to do things that are comparatively easy with a screen editor, such as correcting small typing errors in the text. While taking this approach may at times seem unnecessarily difficult, it should be remembered that what follows here is just a stepping stone to other Unix utilities, such as vi (which you are far more likely to want to use as an editor than ex), and commands that use regular expressions, such as grep, tr and awk. Learning to use ex involves skills necessary for getting the most out of these utilities.

Using ex

Starting ex

The command ex is used to invoke the editor. The format of this command is:

% ex [filename]

A filename can be supplied if you wish to edit an existing file.

% ex oldfile
"oldfile" 10 lines 465 characters
:

Alternatively the filename may be used as the name of a new file:

% ex newfile
"newfile" [Newfile]
:

Notice that the prompt for ex commands is the ':' character.

Adding Text

To enter text simply type the command a (short for append), and then type in the text, as follows:

:a
This is the text

Input is terminated by typing a full stop ('.') on a new line:

:a
This is just one line of text
.
:

The command i is used to insert text before the current line.

Saving Your Data

The command w (short for 'write') is used to save your data. The format of this command is:

:w [filename]

If no filename is specified, the filename given when ex was invoked will be used. E.g.:

:w test.f
test.f 50 lines 576 characters
:

The number of lines and characters in the file will be displayed.

Quitting the Editor

The command q (short for 'quit') is used to quit the editor. Note that if changes have been made to the file and have not been saved, the editor will respond with a warning message:

No write since last change (:quit! overrides)

The command quit! (or just q!) must be given if you wish to quit without saving your changes.

Displaying Lines in the File

The p command (for 'print') is used to display lines in the file. The format of this command is:

:[line_range] p

If no range is supplied the current line is displayed.

Pressing <RETURN> is equivalent to moving on to and displaying the next line. With small files it is possible to display the entire file by pressing <RETURN> until the end of the file is reached.

Line Ranges

Ranges of lines that can be given to edit commands include:

Absolute line numbers

6 refers to line 6

1,6 refers to lines 1 to 6

Relative line numbers

-2 refers to 2 lines before the current line

+3 refers to 3 lines after the current line

-2,+3 refers to a range from 2 lines before the current line to 3 lines after the current line

Special symbols

$ refers to the last line in the file e.g. $p to display last line, 1,$p to display entire file

. refers to the current line e.g. .,$p to display from the current line to the end

Examples:

6d - deletes the sixth line
1,6d - deletes the first six lines
1,$d - deletes all lines
3a - appends text after line three
.,+10w new - saves the current line and the next ten lines to a file called new

The = operator gives the line number, with the last line the default, so typing = gives you the number of lines in a text. The number of the current line is obtained by typing .=.
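Because ex is interactive, line ranges are awkward to demonstrate in a script. As a rough stand-in, the sketch below uses sed (a non-interactive editor covered later in this course), which accepts absolute line numbers and the $ symbol in the same way. It assumes a POSIX shell (sh), and the file numbers is invented for the example. Relative addresses such as -2 and the current-line symbol . belong to ex and are not shown here.

```shell
workdir=$(mktemp -d)
printf 'one\ntwo\nthree\nfour\nfive\n' > "$workdir/numbers"

middle=$(sed -n '2,4p' "$workdir/numbers")   # like 2,4p in ex: lines 2 to 4
last=$(sed -n '$p' "$workdir/numbers")       # like $p in ex: the last line
count=$(sed -n '$=' "$workdir/numbers")      # like = in ex: the number of lines

printf '%s\n' "$middle"
echo "last line: $last"
echo "line count: $count"
```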

Deleting Lines

The d command is used to delete lines. The format of this command is:

:[line_range] d

If no line number is given the current line will be deleted. It is possible to supply a range of lines. For example:

:1,$d

will delete the entire file.

Searching

Searches are carried out by enclosing the search string in slashes ('/'):

/string/

The search will start at the current line.

:/Jane/
This is Jane's file

The special characters '^' and '$' can be used to assist the search. For example:

/^This/ will find a line beginning with 'This'
/file$/ will find a line ending in 'file'

The last string searched for is the default string. This means that you can repeat a search just by typing //.

Reverse Searches

Reverse searches are carried out by enclosing the search string in question marks ('?'):

:?string?

The search will start at the current line and search backwards through the file.

Making Substitutions

The s command is used to make substitutions. The format of this command is:

:[line_range]s/old_string/new_string/

If no line number is given substitutions will be made only on the current line. For example:

:s/old/new/

will substitute the first occurrence of the string 'old' with 'new' on the current line. The command:

:.,$s/old/new/

will substitute the first occurrence of the string 'old' with 'new' in every line from the current line to the end of the file.

Global Substitutions

The g flag (for 'global') is added to the end of the s command to make multiple substitutions on a line. For example:

:s/old/new/g

will substitute all occurrences of the string 'old' with 'new' on the current line. The command:

:1,$s/old/new/g

will substitute all occurrences of the string 'old' with 'new' in the file.

Search strings can also be used in conjunction with the s command in order to carry out more sophisticated global changes. The line range preceding a substitution may include a search for the string to be changed. For example:

:g/old/s//new/g

This means 'search globally for lines containing "old", then replace every occurrence on those lines with "new"'. Remember that the null string (in s//) stands for the last regular expression (RE) used, in this case 'old'. This has the same effect as:

:1,$s/old/new/g
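The effect of the g flag is easiest to see side by side. Since ex is interactive, this sketch uses sed (covered later in this course) as a stand-in, because it accepts the same s command and the same null-string shorthand. It assumes a POSIX shell (sh), and the sample line is made up.

```shell
line="old hat, old coat, old shoes"

# Without g: only the first occurrence on the line is replaced.
first_only=$(printf '%s\n' "$line" | sed 's/old/new/')

# With g: every occurrence on the line is replaced.
every_one=$(printf '%s\n' "$line" | sed 's/old/new/g')

# Search for 'old' first, then reuse it as the null string in s//.
searched=$(printf '%s\n' "$line" | sed '/old/s//new/g')

printf '%s\n' "$first_only" "$every_one" "$searched"
```

The last two commands give the same result, just as :g/old/s//new/g and :1,$s/old/new/g do in ex.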

Additional ex facilities

Additional commands available using the ex editor include:

c replaces lines

t transfers lines

m moves lines

j joins lines

l shows invisible characters

f gives the name of the file being edited

r inserts named file

e edits named file

u undo last change

The commands m and t above work in a similar way, in that they require two line addresses, one before and one after the command. The address in front refers to the source and the address after the destination. If either is omitted, the current line is assumed. Line addresses may be ranges, allowing blocks of text to be moved. Here are a few examples of commands:

:.m2

This moves the current line to a position after line 2.

:1,.m$

This moves a block (line 1 to the current line) to the end of the text.

:1,.t$

This copies the block to the end of the text, leaving the original block untouched.

--------------------------------------------------------------------------------

Exercises

1. Create a file using ex. Put the text of a message in the file and then mail it to someone (see chapter on mail).

2. Use ex to explore the file /etc/passwd. Search for your own listing, and those of others in your group. (You won't be able to save changes to the file.)

3. Find a text file to which you have access and copy it to your home directory. Try making some changes to it.

REGULAR EXPRESSIONS

--------------------------------------------------------------------------------

What are regular expressions?

A regular expression (RE) is a string of characters that can be used to match a set of character strings. For example, to globally search for all occurrences of the word "and" would require a search for "and", "And", "AnD", "AND", etc. Without regular expressions finding all possible occurrences of "and" would require eight separate searches. Using an RE the search could be done with one command.

Regular expressions are used by many Unix utilities, including:

ed

ex

vi

grep

sed

awk (The awk utility interprets a special-purpose programming language that makes it possible to handle simple data-reformatting jobs easily with just a few lines of code. Awk is not covered in this course, but the GAWK Manual is a good guide to its use.)

Regular expressions are used in searches and substitutions.

Character strings

A character string is the simplest regular expression; it simply matches the string itself. For example:

/hello/ - matches 'hello'
s/hello/goodbye/ - matches 'hello' and makes a substitution

Matching single characters

The '.' character is used to match any single character. For example:

/p.t/ - matches 'p' and 't' separated by a single character, e.g. 'pit', 'put', 'pot', etc.

Sets of characters

The expression /[RE]/ is used to match any one of a set of characters in a single character position. For example:

/x[ab2X]y/ - matches any of the following:
xay
xby
x2y
xXy

In the expression /[RE]/ a range of characters can be specified. For example:

[a-z] - matches any single lower case character
[0-9] - matches any single digit

Note however:

[0-57] - matches any one of the following:
0 1 2 3 4 5 7

i.e. 0-5 and 7. Sets of characters can be combined:

[a-d5-8X-Z] - matches any one of the following:
a b c d 5 6 7 8 X Y Z

It is possible to specify a set of characters which are not to be matched in the RE. For example:

[^0-9] - matches any single character which is not a digit
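These sets can be tried out with grep, which prints every input line that matches the RE. The following is a small sketch for a POSIX shell (sh); the list of test words is made up.

```shell
words=$(printf 'xay\nxby\nx2y\nxXy\nxcy\nx7y\n')

# Any one of a, b, 2 or X between x and y.
set_matches=$(printf '%s\n' "$words" | grep 'x[ab2X]y')

# The range 0-5, or the single character 7.
range_matches=$(printf '%s\n' "$words" | grep 'x[0-57]y')

# Any character that is not a digit.
negated=$(printf '%s\n' "$words" | grep 'x[^0-9]y')

echo "set:     $(echo $set_matches)"
echo "range:   $(echo $range_matches)"
echo "negated: $(echo $negated)"
```

Note that xcy fails the first test because c is not in the set, and x7y passes the second only because 7 is listed separately after the range.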

Anchors

An anchor is used to match a RE found at a particular position. For example:

/^RE/ - matches RE at the start of a line
/RE$/ - matches RE at the end of a line
/^RE$/ - matches RE as the whole line

Note that there are two separate uses of the '^' operator. One is as the start of line anchor, and the other as the 'logical not' operator. The latter function only applies inside square brackets.

Repetitions

Multiple occurrences of REs can be specified. For example:

a* - matches 0 or more occurrences of 'a'
aa* - matches 1 or more occurrences of 'a'
.* - matches any string of characters
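Anchors and repetitions can both be checked with grep. This is a minimal sketch for a POSIX shell (sh), using a few invented lines of text; grep -c prints the number of matching lines instead of the lines themselves.

```shell
text=$(printf 'This is a file\nfile this\nba\nbaaa\nb\n')

starts=$(printf '%s\n' "$text" | grep '^This')        # 'This' at the start of a line
ends=$(printf '%s\n' "$text" | grep 'file$')          # 'file' at the end of a line
whole=$(printf '%s\n' "$text" | grep '^file this$')   # the RE is the whole line

one_or_more=$(printf '%s\n' "$text" | grep -c 'baa*')   # b followed by 1 or more a's
zero_or_more=$(printf '%s\n' "$text" | grep -c 'ba*')   # b followed by 0 or more a's

echo "$starts / $ends / $whole"
echo "baa* matches $one_or_more lines, ba* matches $zero_or_more lines"
```

The line containing only b matches ba* but not baa*, which is the difference between '0 or more' and '1 or more'.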

Remembered regular expressions

A null RE stands for the last RE. For example:

:/[Tt]he.*car/p
The blue car exploded with a roar.
:s//(The blue car)/p
(The blue car) exploded with a roar.

The '&' character in a replacement string stands for the most recently matched string. For example:

:/[Tt]he.*car/p
The blue car exploded with a roar.
:s//(&)/p
(The blue car) exploded with a roar.
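The same replacement can be run non-interactively. This sketch uses sed, one of the utilities listed above, in place of ex, and assumes a POSIX shell (sh).

```shell
sentence="The blue car exploded with a roar."

# & in the replacement stands for whatever the RE matched,
# so the matched text is put back inside brackets.
wrapped=$(printf '%s\n' "$sentence" | sed 's/[Tt]he.*car/(&)/')

echo "$wrapped"
```

Using & means the replacement works whatever the RE happens to match; there is no need to retype the matched text.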

Sub-expressions

A sub-expression in a RE can be marked so that it can be referred to later.

\(string\) - defines an RE sub-expression
\n - refers to the nth RE sub-expression

NOTE The backslash is the escape character for REs. This means it neutralises the special meanings of special characters. For example:

:p
A line of text
:s/\(line\).*\(text\)/\2 \1/p
A text line
:
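Here is the same kind of swap as a command you can run directly. It is a sketch using sed in place of ex, and it assumes a POSIX shell (sh). The space between \2 and \1 in the replacement is what keeps the two words apart in the result.

```shell
# \(line\) is sub-expression 1 and \(text\) is sub-expression 2.
# Everything from 'line' to 'text' is replaced by the two words swapped.
swapped=$(printf 'A line of text\n' | sed 's/\(line\).*\(text\)/\2 \1/')

echo "$swapped"
```

The leading 'A ' is untouched because it lies outside the part of the line that the RE matched.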

Repetition

It is also possible to specify an exact number of occurrences of a RE, or a range. For example:

c\{4\} matches exactly 4 c's
c\{4,\} matches 4 or more c's
c\{2,4\} matches between 2 and 4 c's

For example, to find a line containing 5 consecutive digits:

/[0-9]\{5\}/
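The counted repetitions can be tested with grep. This is a small sketch for a POSIX shell (sh); the sample lines are invented.

```shell
lines=$(printf 'id 12345\nid 1234\ncccc\ncc\n')

# Lines containing at least five digits in a row.
five_digits=$(printf '%s\n' "$lines" | grep '[0-9]\{5\}')

# Count the lines containing four c's in a row.
four_c=$(printf '%s\n' "$lines" | grep -c 'c\{4\}')

# Count the lines containing between two and four c's in a row.
two_to_four_c=$(printf '%s\n' "$lines" | grep -c 'c\{2,4\}')

echo "five digits: $five_digits"
echo "c{4}: $four_c line, c{2,4}: $two_to_four_c lines"
```

Since grep looks for the RE anywhere in the line, a line with six digits would also match the first pattern; add anchors if the count has to be exact.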

A summary of special characters

Special characters in the search string

^ start of line anchor (or NOT operator inside [ ])

$ end of line anchor

. any character

* character repeated any number of times

\ escape character

[ ] contains range of characters

Special characters in the replacement string

& string matched in search string

\ escape character

Note that any regular expression can be used with grep. (It gets its name from the editor command g/RE/p which means 'globally search for RE and print it'.) This opens up many new possibilities for the use of grep. Unix commands that use regular expressions often make the use of an editor redundant.

--------------------------------------------------------------------------------

PRACTICE

--------------------------------------------------------------------------------

Obtain a listing of the members of your group from the password file using grep.

Introduction to sed

sed is a non-interactive stream editor used to filter and edit text. The command to invoke sed is:

sed [-n] [-e command] [-f edfile] [input_file]

For example:

sed "s/UNIX/Unix/g" thesis > thesis.new

This will process the file thesis line by line, outputting each line to the file thesis.new and replacing each occurrence of the string "UNIX" with "Unix".

In the above example every line of thesis will be output to thesis.new, irrespective of whether it has been changed or not. This is because the default output for sed is every line of the input. Using the -n option suppresses the default output, so that only specified lines are output. This means that no lines would be output by the following command:

sed -n "s/UNIX/Unix/g" thesis > thesis.new

since a change but no output has been specified. If a print command is added, as follows:

sed -n "s/UNIX/Unix/gp" thesis > thesis.new

then only those lines in which "UNIX" had been changed to "Unix" would be output.
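The three variations can be compared on a small file. The sketch below assumes a POSIX shell (sh); the contents of thesis are invented, and the results are kept in shell variables rather than written to thesis.new.

```shell
workdir=$(mktemp -d)
printf 'UNIX is old\nnothing here\nUNIX and UNIX\n' > "$workdir/thesis"

# Default: every line is output, changed or not.
all_lines=$(sed 's/UNIX/Unix/g' "$workdir/thesis")

# -n with no print command: nothing is output at all.
quiet=$(sed -n 's/UNIX/Unix/g' "$workdir/thesis")

# -n with the p flag: only the lines where a substitution was made.
changed=$(sed -n 's/UNIX/Unix/gp' "$workdir/thesis")

printf '%s\n' "$changed"
```

The middle line of the file appears in all_lines but not in changed, because no substitution was made on it.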

As you can also see in the example, the -e option is not necessary when there is only one editor command. It is possible to specify more than one command, and in this case each must be preceded by -e. For example:

% sed -e "s/a/A/" -e "s/b/B/" file1 > file2

This command will carry out the two substitutions on each line of file1.

The -f option enables the user to use a file containing editor commands, instead of typing out a series of commands with the -e option.
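The differences between the default output, -n, the p flag, -e and -f can be checked on a small file. This is only a sketch: the file contents and the name edfile are invented for the purpose, and Bourne shell syntax is assumed.

```shell
tmpdir=$(mktemp -d)
# A three-line stand-in for the thesis file (contents are made up).
printf 'UNIX is old\nLinux is newer\nUNIX and UNIX\n' > "$tmpdir/thesis"

# Default: every input line is output, changed or not.
all_lines=$(sed 's/UNIX/Unix/g' "$tmpdir/thesis")
# -n with no print command: nothing is output at all.
no_lines=$(sed -n 's/UNIX/Unix/g' "$tmpdir/thesis")
# -n with the p flag: only the lines on which a substitution happened.
changed_only=$(sed -n 's/UNIX/Unix/gp' "$tmpdir/thesis")
# Two commands, each introduced by -e; without g only the first match is changed.
two_edits=$(echo "abab" | sed -e 's/a/A/' -e 's/b/B/')

# The same kind of edits kept in a command file and applied with -f.
printf 's/UNIX/Unix/g\ns/old/mature/\n' > "$tmpdir/edfile"
from_file=$(sed -f "$tmpdir/edfile" "$tmpdir/thesis")

printf '%s\n' "$changed_only"
rm -r "$tmpdir"
```

A command file is worth the trouble once the same series of substitutions has to be applied to several files.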

sed examples

The sed command to list only files (exclude directories) is:

% ls -l | sed -n "/^-/p"
-rw------- 1 lnp5jb 1765 mbox
-rw------- 1 lnp5jb 320 example1

The sed command to extract a list of usernames from the password file is:

% sed "s/:.*//" /etc/passwd | more

What this does is to delete everything from the first ':' to the end of each line of the password file, leaving only the username.
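The username extraction can be tried without touching the real password file. The sketch below uses a made-up file in the same colon-separated layout, and also shows cut producing the same result; both the file and the user entries are hypothetical.

```shell
tmpdir=$(mktemp -d)
# A made-up file in /etc/passwd format, so the example does not depend on the real one.
cat > "$tmpdir/passwd" <<'EOF'
root:x:0:0:Super User:/root:/bin/sh
lnp5jb:x:501:20:Example User:/home/lnp5jb:/bin/csh
EOF

# The match starts at the first ':' and ".*" extends it to the end of the line.
names_sed=$(sed 's/:.*//' "$tmpdir/passwd")
# cut reaches the same result by keeping only the first ':'-delimited field.
names_cut=$(cut -d: -f1 "$tmpdir/passwd")

printf '%s\n' "$names_sed"
rm -r "$tmpdir"
```

sed works here by pattern, cut by field position; for regular field-structured files such as this one, cut is usually the more direct tool.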

--------------------------------------------------------------------------------

Exercises

1. Reproduce the effects of the above sed examples using grep instead. Note that grep is generally better for searches, such as this, while sed can be used to make changes to files.

2. Find the system's games directory and type quiz function ed-command to do the ed commands quiz. Don't worry if there are a couple of things that you haven't come across. Try it again and see if you improve your score.

PROCESSING LARGE TEXT CORPORA

--------------------------------------------------------------------------------

This section will focus on exploiting large files containing linguistic material with the use of the commands already covered plus many more.

Compressed files

Often large files are compressed to save disk space. If this is the case then the user must make the file revert to its original format in order to be able to do anything with it. A popular compressing command is called, simply, compress. The command:

% compress filename

will cause the file to be replaced by a compressed file with a .Z suffix. The command uncompress will cause it to revert to its original format. It is often not necessary to uncompress a file to use it. In fact, the file will often be owned by someone else, and you would have to copy it and then uncompress it, using up a great deal of disk space and processor time. It is often better to use zcat, which sends the uncompressed contents of a compressed file to the standard output, while leaving the compressed version of the file in the filestore.
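The idea of searching a compressed file without uncompressing it on disk can be sketched as follows. Note the substitution: gzip and gzip -dc stand in for compress and zcat here, on the assumption that gzip is installed and compress may not be; the principle is the same. The file name and contents are invented.

```shell
tmpdir=$(mktemp -d)
printf 'alpha\nbeta\ngamma beta\n' > "$tmpdir/corpus.txt"

# gzip stands in for compress (an assumption, see above).
# It replaces corpus.txt with corpus.txt.gz.
gzip "$tmpdir/corpus.txt"

# gzip -dc writes the uncompressed text to standard output, as zcat does,
# so grep can search it while only the compressed file stays on disk.
matches=$(gzip -dc "$tmpdir/corpus.txt.gz" | grep -c 'beta')
still_compressed=$(ls "$tmpdir")

echo "$matches"
rm -r "$tmpdir"
```

With a file produced by compress, the equivalent pipeline would be zcat filename.Z | grep pattern.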

--------------------------------------------------------------------------------
PRACTICE Try compressing and uncompressing some of your own files.

Find a large compressed file on your system and search it for some appropriate string using grep without uncompressing the file.

--------------------------------------------------------------------------------

Some useful commands for processing text files

The following is a summary of some useful commands for processing text files, some of which you have met already, some of which are new to you. Both have been included so that this section can easily be used for reference purposes. Not all of these commands are standard Unix, so they may not all work in the way you expect (or at all) on your system. For the same reasons, their syntax is somewhat incongruous and some use different input and output conventions. Not all are included in the command summary in the appendix below. See the relevant manual pages for more details.

sort sort into alphabetical order

sort -n sort into numerical order

sort -m merge sorted files into one sorted file

sort -r sort into reverse order (highest first)

sort -c check a file is already sorted

uniq remove duplicate lines (or partly-duplicate lines)

uniq -d output only duplicate lines

uniq -c count identical lines (or lines with identical fields)

grep find lines containing given string or pattern

grep -v find lines not containing given string or pattern

grep -c count lines containing given string or pattern

grep -n give line numbers of lines containing...

fgrep same as grep except that it does not recognise regular expressions

egrep same as grep except that it recognises all REs (grep only recognises certain special characters)

wc -c count characters

wc -w count words

wc -l count lines

NOTE wc -l file will output the number of lines in the file, and the file name.

wc -l < file just gives the bare line count.

head -17 output first 17 lines

tail -17 output last 17 lines

tail +30 output from line 30 onwards (written tail -n +30 on current systems)

cut -f3 delete all but third field of each line

cut -f3,5 delete all but third and fifth fields of each line

cut -f3-5,7 delete all but 3rd, 4th, 5th, 7th fields of each line

cut -c2-4,6-8 delete all but 2nd 3rd 4th, 6th 7th 8th characters

cut -f2 -d":" deletes all but the second field where ":" is the field delimiter (tab is the default)

paste combines files horizontally; corresponding lines are appended

paste -d">" pastes with delimiter defined as ">" (tab is default). The special characters "\n" (newline) and "\0" (null string) may be used.

cat concatenates files vertically (appends files to one another)

cat -n precedes each line with a line number in the output

cat -b as above, but does not number blank lines

cat -s reduces any number of successive blank lines to one blank line

tr "abc-e" "kmx-z" translates a, b, c, d, e to k, m, x, y, z respectively.

tr -d "xy" deletes all occurrences of x and y

tr -s "a" "b" translates all a to b and reduces any string of consecutive b to just one b.

To go down to the character, rather than field, level, sed is simplest for line by line processing. sed looks for patterns, so is not very good with column or field positions.
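A few of the commands summarised above can be tried together on throwaway files. This is a sketch only: the file names and contents are made up, and Bourne shell syntax is assumed.

```shell
tmpdir=$(mktemp -d)
printf 'jones:42:leeds\nsmith:17:york\n' > "$tmpdir/people"
printf 'A\nB\n' > "$tmpdir/grades"

# Keep only the second field, with ':' as the field delimiter.
second_field=$(cut -f2 -d":" "$tmpdir/people")
# Keep only the first three characters of each line.
first_three=$(cut -c1-3 "$tmpdir/people")
# Join corresponding lines of the two files, separated by '>'.
pasted=$(paste -d">" "$tmpdir/people" "$tmpdir/grades")
# Translate the range a-e to the range A-E.
upper=$(echo "abcde" | tr "a-e" "A-E")
# Translate every a to o, then squeeze the run of o's down to one.
squeezed=$(echo "baaad" | tr -s "a" "o")
# Reading from standard input gives the bare count with no file name.
line_count=$(wc -l < "$tmpdir/people" | tr -d ' ')

printf '%s\n' "$pasted"
rm -r "$tmpdir"
```

The final tr -d ' ' is there because some versions of wc pad the count with leading blanks.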

uniq needs an already-sorted file. A common idiom is

sort | uniq

to produce a sorted list of all the different lines in a file. uniq has a peculiar way of spacing its output, so it is difficult to use in a pipeline with another command such as cut. tr is useful for converting blanks to newlines (hence converting a text to a vertical list of words, which can then be sorted, counted etc.). The command:

% tr " " "\012" < filename

will do this. 012 is the octal code for the linefeed character. This is also useful for converting strings of blanks or tabs to single characters. 011 is the octal code for the tab character.
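Putting tr, sort and uniq together gives a simple word frequency count. The sketch below assumes a made-up two-line text; it uses tr -s so that strings of blanks or tabs collapse into a single newline, and writes the newline twice in the second string so that each character of the first string has a counterpart on every version of tr.

```shell
tmpdir=$(mktemp -d)
# Invented sample text; note the double blank between "saw" and "the".
printf 'the cat saw  the dog\nthe dog ran\n' > "$tmpdir/text"

# One word per line: blanks (and tabs, octal 011) become newlines (octal 012),
# and -s squeezes any run of newlines down to one.
# sort groups identical words, uniq -c counts each group,
# and sort -rn puts the most frequent word first.
freq=$(tr -s ' \011' '\012\012' < "$tmpdir/text" | sort | uniq -c | sort -rn)

# The commonest word and its count (awk copes with the leading blanks uniq -c adds).
top=$(printf '%s\n' "$freq" | head -n 1 | awk '{print $2, $1}')
# Number of different words in the text.
distinct=$(tr -s ' \011' '\012\012' < "$tmpdir/text" | sort | uniq | wc -l | tr -d ' ')

printf '%s\n' "$freq"
rm -r "$tmpdir"
```

awk is not covered in this course; it is used here only as a convenient way of picking fields out of the padded output of uniq -c.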

--------------------------------------------------------------------------------
PRACTICE Try out the following pipeline on a text file:

--------------------------------------------------------------------------------

tr " " "\012" < input_file | sort | uniq > output_file

--------------------------------------------------------------------------------

Using language corpora

A corpus (plural corpora) is a collection of language data. The corpora with which we will be concerned here are electronic, that is they are stored in a computer. Corpora may contain data about written or spoken language. They usually contain texts from one language, but they may also be multilingual. Corpora are usually designed and collated for a specific purpose. Many of the major corpora in use today aim to be representative of different domains of language use, and can facilitate comparative studies. For example, the average length of words in academic texts and newspaper reports could be compared by measuring words in texts from these two domains. Computers obviously make this type of number-crunching (or word-crunching) activity much easier than it would be if you had to count words and letters in a printed text. Corpora are particularly useful for checking the intuitions that we have and the generalisations that are made about language use.

Unix commands can be used to extract information from language corpora. The commands learned in this course can be entered directly or combined into simple scripts for this purpose.

Types of Corpora

There are many types of corpora, defined by the types of language that they represent and the formats in which that information is stored. Unix commands for handling strings are sufficiently flexible to handle many different formats. Users however need to be sensitive to the arcane minutiae of the format and markup of the different corpora that they use. The 'l' (list) command, entered as :l in the vi editor, can be used to view hidden characters (such as spaces and tabs) in a file.

The LOB and Brown corpora

Brown and LOB are parallel corpora, with very similar formats and tagging. Brown, which was constructed first, represents different types of written American English. LOB represents the same categories of British English. All words are lemmatised and given a word class tag. Here is a sample from the so-called 'vertical tagged' version of Brown:

^N01002001 ----- ----- -----
N01002010 - NP Alastair
N01002020 - BEDZ was
N01002030 - AT a
N01002040 - NN bachelor
N01002041 - . .
^N01002042 ----- ----- -----
N01002050 - ABN all
N01002060 - PP$ his
N01002070 - NN life
N01002080 - PP3A he
N01002090 - HVD had
N01002100 - BEN been
N01002110 - VBN inclined
N01002120 - TO to
N01003010 - VB regard
N01003020 - NNS women
N01003030 - IN as
N01003040 - PN something
N01003050 - WDTR which
N01003060 - MD must
N01003070 - RB necessarily
N01003080 - BE be
N01003090 - VBN subordinated
N01003100 - IN to
N01004010 - PP$ his

And the 'untagged' version of the same passage, plus the following lines:

N01 0010 DAN MORGAN TOLD HIMSELF HE WOULD FORGET Ann Turner. He
N01 0020 was well rid of her. He certainly didn't want a wife who was fickle
N01 0030 as Ann. If he had married her, he'd have been asking for trouble.
N01 0040 But all of this was rationalization. Sometimes he woke up in
N01 0050 the middle of the night thinking of Ann, and then could not get back
N01 0060 to sleep. His plans and dreams had revolved around her so much and for
N01 0070 so long that now he felt as if he had nothing. The easiest thing would
N01 0080 be to sell out to Al Budd and leave the country, but there was
N01 0090 a stubborn streak in him that wouldn't allow it. The best antidote
N01 0100 for the bitterness and disappointment that poisoned him was hard
N01 0110 work. He found that if he was tired enough at night, he went to sleep

Users can choose the version (from those available to them) which includes the information that they need. If you are only interested in word frequencies, then the grammatical information encoded in the tagged version is redundant, and the untagged version can be used. If however you are looking for the word 'set' used as a noun, then it would be necessary to use a tagged version, so that this word can be differentiated from 'set' used as a verb or adjective.
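The difference between the two needs can be sketched with a few made-up records laid out like the vertical tagged sample (reference, marker, word class tag, word). The records, the X01 reference numbers and the use of awk (which is not covered in this course) are all assumptions made for illustration; real corpus files are fixed-width and may need cut rather than whitespace-separated fields.

```shell
tmpdir=$(mktemp -d)
# Invented records in a four-column layout: reference, marker, tag, word.
cat > "$tmpdir/tagged" <<'EOF'
X01001010 - PP3A he
X01001020 - VBD set
X01001030 - AT the
X01001040 - NN set
X01001050 - IN of
X01001060 - NNS tools
EOF

# A plain frequency count only looks at the word in column 4; the tag is ignored.
all_set=$(awk '$4 == "set"' "$tmpdir/tagged" | wc -l | tr -d ' ')
# Requiring the tag NN in column 3 keeps only 'set' used as a singular noun,
# and prints the reference number of each such occurrence.
noun_set=$(awk '$3 == "NN" && $4 == "set" {print $1}' "$tmpdir/tagged")

echo "$all_set occurrences of set, noun use at: $noun_set"
rm -r "$tmpdir"
```

The first count could equally be made on the untagged version; the second cannot, since nothing in the untagged text distinguishes the noun from the verb.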

Processing LOB and Brown

The Susanne corpus

This corpus uses a section of the Brown corpus and marks it up with syntactic information.

N01:0010a - YB <minbrk> - [Oh.Oh]
N01:0010b - NP1m DAN Dan [O[S[Nns:s.
N01:0010c - NP1s MORGAN Morgan .Nns:s]
N01:0010d - VVDv TOLD tell [Vd.Vd]
N01:0010e - PPX1m HIMSELF himself [Nos:i.Nos:i]
N01:0010f - PPHS1m HE he [Fn:o[Nas:s.Nas:s]
N01:0010g - VMd WOULD will [Vdc.
N01:0010h - VV0v FORGET forget .Vdc]
N01:0010i - NP1f Ann Ann [Nns:o.
N01:0010j - NP1s Turner Turner .Nns:o]Fn:o]S]
N01:0010k - YF +. - .
N01:0010m - PPHS1m He he [S[Nas:s.Nas:s]
N01:0020a - VBDZ was be [Vsb.Vsb]
N01:0020b - RR well well [Tn:e[R:h.R:h]
N01:0020c - VVNt rid rid [Vn.Vn]
N01:0020d - IO of of [Po:u.
N01:0020e - PPHO1f her she .Po:u]Tn:e]S]
N01:0020f - YF +. - .
N01:0020g - PPHS1m He he [S[Nas:s.Nas:s]
N01:0020h - RR certainly certainly [R:m.R:m]
N01:0020i - VDD did do [Vde.
N01:0020j - XX +n<apos>t not .
N01:0020k - VV0v want want .Vde]
N01:0020m - AT1 a a [Ns:o101.
N01:0020n - NN1c wife wife .
N01:0020p - PNQSr who who [Fr[Nq:s101.Nq:s101]

The London-Lund corpus

This corpus differs from the others that we have looked at because it is a transcription of spoken English. Intonation is marked.

1 1 1 10 1 1 B 11 ((of ^Spanish)) . graph\ology#/

1 1 1 20 1 1 A 11 ^w=ell# ./

1 1 1 30 1 1 A 11 ((if)) did ^y/ou _set _that# - /

1 1 1 40 1 1 B 11 ^well !J\oe and _I#/

1 1 1 50 1 1 B 11 ^set it betw\een _us#/

1 1 1 60 1 1 B 11 ^actually !Joe 'set the :p\aper#/

1 1 1 70 1 1 B 20 and *((3 to 4 sylls))*/

1 1 1 80 1 1 A 11 *^w=ell# ./

1 1 1 90 1 1 A 11 "^m/\ay* I _ask#/

1 1 1 100 1 1 A 11 ^what goes !\into that paper n/ow#/

1 1 1 110 1 1 A 11 be^cause I !have to adv=ise# ./

1 1 1 120 1 1 A 21 ((a)) ^couple of people who are !d\oing [dhi: @]/

1 1 1 130 1 1 B 11 well ^what you :d\/o#/

1 1 1 140 1 2 B 12 ^is to - - ^this is sort of be:tween the :tw\/o of /

1 1 1 140 1 1 B 12 _us# /

1 1 1 150 1 1 B 11 ^what *you* :d\/o#/

1 1 1 160 2 1 B 23 is to ^make sure that your 'own . !c\andidate/

1 1 1 170 1 1 A 11 *^[\m]#*/

1 1 1 160 1 2(B 13 is . *.* ^that your . there`s ^something that your /

1 1 1 160 1 1(B 13 :own candidate can :h\/andle# - -/

CUVOALD

This acronym stands for the Computer Usable Version of the Oxford Advanced Learner's Dictionary. There are in fact two versions. The more useful one, usually found in a file called cuv2.dat, contains 68742 words including inflected forms and proper nouns. It is most often of use as a wordlist, but the file also contains a phonemic transcription and a part-of-speech tag for every word. Here is a sample of cuv2.dat:

verbs v3bz Kj
verdancy 'v3dnsI L@
verdant 'v3dnt OA
verdict 'v3dIkt K6
verdicts 'v3dIkts Kj
verdigris 'v3dIgrIs L@
verdure 'v3dj@R L@
verge v3dZ I2,K6 3A
verged v3dZd Ic,Id 3A
verger 'v3dZ@R K6
vergers 'v3dZ@z Kj
verges 'v3dZIz Ia,Kj 3A
verging 'v3dZIN Ib 3A
verifiable 'verIfaI@bl OA
verification ,verIfI'keISn M6
verifications ,verIfI'keISnz Mj
verified 'verIfaId Hc,Hd 6A
verifies 'verIfaIz Ha 6A
verify 'verIfaI H3 6A
verifying 'verIfaIIN Hb 6A
verily 'ver@lI Pu
verisimilitude ,verIsI'mIlItjud M6
verisimilitudes ,verIsI'mIlItjudz Mj
veritable 'verIt@bl OA
verities 'verItIz Mj
verity 'verItI M8
vermicelli ,v3mI'selI L@
vermiform 'v3mIfOm OA
vermilion v@'mIlI@n M6,OA

The coding conventions for the phonemic and syntactic tags are explained in a file that comes with the dictionary. Some examples of applications that use the dictionary can be found in the appendix of this course.

Other texts

Corpus building is currently a growth area, and there are many, many more corpora as well as the above examples. Currently available or under construction are a number of very large corpora, comprehensive corpora aiming to cover all registers of English, international English corpora, corpora of different languages and specialised corpora covering a single well-defined domain of language.

--------------------------------------------------------------------------------

Exercises

1. Find a large text file with a fixed field format (e.g. the Brown or LOB corpora) and inspect the format. Use zcat to view it if necessary.

3. Use cut to strip away the reference material and leave just the text field.

4. Use tr to strip away any tags that are actually in the text (e.g. attached to the words), so that you are left with just the words.

5. Make a sorted wordlist from the file.

6. Combine the above commands in a shell script so that you have a small program for extracting a wordlist.

INTRODUCTION TO THE VI SCREEN EDITOR

--------------------------------------------------------------------------------

What is vi

Vi is a screen editor. This means that you can see part of the file in a window on the screen, and editing operations can be controlled by moving a cursor around the text on screen.

Vi works in a different way from the editing functions of modern word processors. Its effective use requires a considerable amount of expertise on the part of the user. The user must have the ability to remember and manipulate opaquely named one-letter commands that can be combined in an arbitrary variety of different ways.

Vi is a screen-based version of ex. Its lack of user-friendliness is largely a result of this. In many ways it still works like a line editor, with complicated commands typed in by the user.

The main enhancements on ex are the window, which enables you to constantly view part or all of the file, the visible cursor and the commands that can be issued without moving to the command line. Once you have learned to start vi, you will probably not need to use ex again. Everything that you have learned with ex, you can do with vi. What is more, with vi you have a window and the possibility to use interactive commands. The only time that you might want to use ex now is if you have trouble running a screen-based utility on your terminal.

Using vi

The next section lists the commands needed to start and use vi. In this section, the key concepts underpinning the use of vi are explained so that you can understand what is happening when you use it.

The first thing to understand is that there are three modes:

command mode

insert mode

last line mode (or command line mode)

You start in command mode. The commands listed below for moving the cursor and changing the file are entered in command mode. To enter a command simply type it at the keyboard. What you type will not appear anywhere on screen. To abandon a command you have started, you can type <ESC>. If you are not sure which mode you are in at any time you can type <ESC> and return to command mode. When you leave the other modes you return to command mode. Insert mode is used to enter text. Insert mode is entered by issuing one of a variety of commands that involve entering text. Insert mode must be exited in order to issue more commands. A common mistake made is to attempt to enter a command while in insert mode, which results in the command appearing on screen as part of the text.

Last line mode is entered from command mode, and enables the user to type a command on the last line of the screen. Any ex command can be used in this way, simply by typing ':' followed by the command. The current line will be that where the cursor is positioned.

When you start vi you will see a screen similar to the one below. If you are starting a new file, or the file you are editing is less than 18 lines long, then the empty lines in the window will be marked by the '~' (tilde) character.

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------

This is a small file called 'vi.prac'.
This is the second and last line.
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
"vi.prac" 2 lines 103 characters

A typical vi screen

Note that it is necessary to press return at the end of each line of text that you enter. Otherwise, vi will interpret all of your text as a single line!

--------------------------------------------------------------------------------
PRACTICE Create a new file, enter several lines of text and save it.

Edit an existing file that you have, making several changes.

--------------------------------------------------------------------------------

vi reference

vi modes

command    Normal and initial state. <ESC> cancels partial command
insert     entered by the following commands: a, A, i, I, o, O, c, C, s, S, R. Terminates with <ESC> (or ^C).
last line  entered by :, /, ? or !. Input is read and echoed at the bottom of the screen. Commands executed by <RETURN> or <ESC>, terminated by ^C.

Entering and leaving vi

% vi file        edit file
% vi +n file     edit starting at line n
% vi + file      edit starting at end
% vi +/RE/ file  edit starting at RE
% view file      read only mode
ZZ               exit from vi, saving changes (same as :wq)
^Z               stop vi process, for later resumption

Some simple commands

The following are examples of some compound commands, using the operators listed later.

dw           delete word
de           delete word leaving punctuation
dd           delete line
4dd          delete 4 lines
xp           transpose characters
cwtext<ESC>  change word to text

File manipulation

The following are all last line mode commands, so must be preceded by a colon.

w          save changes
wq         save and quit
q          quit
q!         quit, discarding changes
e file     edit file
e!         re-edit current file, discarding changes
w file     write to file
w! file    overwrite file
! command  execute shell command, then return
f          show current file and line

Positioning within the file

^F    forward one screenful
^B    back one screenful
^D    scroll down half screen
^U    scroll up half screen
nG    go to line n (last line default)
/RE/  go to next occurrence of RE
%     find matching bracket

Marking

``  return to previous cursor position
mx  mark position with x
`x  go to mark x

Line positioning

H         top line of window (home)
M         middle line of window
L         last line of window
+         next line, at first non-white character
-         previous line, at first non-white character
<RETURN>  same as +
j         next line, same column (same as down arrow)
k         previous line, same column (same as up arrow)

Character positioning

0        beginning of line
^        first non-white in line
$        end of line
<SPACE>  forward (same as right arrow)
fx       find x forwards in current line
Fx       find x backwards in current line
;        repeat last find command in the same direction
,        repeat last find command in the opposite direction
n|       go to column n

Words, sentences, paragraphs

w  forward to start of next word (delimited by non-alphanumeric character)
b  back to start of last word
e  forward to end of next word
W  as w, with word delimited by blank only
B  as b, with word delimited by blank only
E  as e, with word delimited by blank only
)  forward to start of next sentence
(  back to start of last sentence
}  forward to start of next paragraph
{  back to start of last paragraph

Corrections during insert

^H     erase last character (or your usual delete key)
^W     erase last word
\      escape character
<ESC>  ends insert; back to command mode
^C     ends insert

Insert and replace commands

a   append after cursor
i   insert before cursor
A   append at end of line
I   insert before first non-blank
o   open line below current line
O   open line above current line
rx  replace single character with x
R   replace characters

Operators

The following can be doubled to apply to a line and also preceded by a number to indicate a number of lines. They can be combined with positional commands (e.g. d$ to delete to end of line).

d  delete
c  change
y  yank

Miscellaneous operations

x  delete character
X  delete character to left of cursor
C  change rest of line (same as c$)
D  delete rest of line (same as d$)
J  join lines
Y  yank (copy) lines

Yank and put

p    put back after cursor
P    put back before cursor
"xp  put from buffer x
"xy  yank to buffer x
"xd  delete to buffer x

Undo, redo and retrieve

u    undo last change
U    restore current line
.    repeat last command
"np  retrieve nth last delete

TEXT FORMATTING

--------------------------------------------------------------------------------

There are text formatting facilities available with all Unix implementations. They will not be investigated in any detail here. Many users will prefer to use a PC-based word processing package for document production. Those that want to format text on Unix will have vastly differing needs, and it would be impossible to go into all of the possibilities here. A flavour of the simpler programs is given here, and users can look elsewhere for more extensive documentation.

pr

This is a filter that will format a text, giving a choice of columns, page width, length etc. It is not capable of sophisticated formatting for document production.

nroff

The simplest of the proper formatters is nroff. You can format a plain text file with nroff, by simply typing:

% nroff text_file

Formatting commands can be inserted into text files. Some simple commands:

.ce centre text
.ll line length
.pl page length
.po page offset (left margin)
.sp blank line

These commands may be followed by a numerical argument, which will make the command apply to the specified number of lines, e.g. .sp 3 to leave three blank lines. Formatting commands must be placed at the beginning of a line to be recognised as such. Normally they appear as the only text on a line. Commands are normally composed of lower-case characters. Here is an example of a text containing some nroff instructions:

.ce
This is the title
.sp 2
And this is the text, which
will be formatted and justified when I run nroff. You will see
that the linebreaks will change, and the text will look tidier. That is what
formatting is all about.
.sp
That was a blank line.

The following is what the output from this file would look like:

                    This is the title


And this is the text, which will be formatted and justified when I run nroff. You will see that the line breaks will change, and the text will look tidier. That is what formatting is all about.

That was a blank line.

nroff macros

Macros are a special type of nroff command, identified by being in upper-case characters. Standard macro libraries can be invoked by using option flags with the nroff command, e.g.:

nroff -ms filename

for the standard macros. Other macro libraries can be invoked by the me, mn and mv options. Here are some standard macros:

.FS footnote starts
.FE footnote ends
.ND no date
.TL title
.PP start paragraph

The .PP tag, for example, is the equivalent of the following sequence of ordinary nroff instructions:

.sp 5 .ce 1 .sp 5

It is possible to write your own macros.

More details on nroff can be found in the manual.

MORE ON THE SHELL

--------------------------------------------------------------------------------

General

The role of the shell

A Unix shell is used to:

evaluate the command line. For example:

% car nofile
car: Command not found

Here the shell looks for a command called car. Since it cannot find this command it gives an error message.

perform variable substitution. For example:

% echo "In directory $HOME"
In directory /home/sunserv1_b/lnp5jb

Here the shell variable $HOME is evaluated and displayed.

handle pipelines. For example:

% who | wc -l

Here the output from who is piped through to the wc command which displays a count of the number of lines in its input.
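The three roles described above can be seen side by side in a short script. This is a sketch in Bourne shell syntax (not the C shell used at the % prompt in this course); the command name no_such_command_xyz is deliberately made up, and printf stands in for who so that the line count is predictable.

```shell
# 1. Evaluating the command line: the shell searches for the command,
#    cannot find it, and reports failure through a non-zero exit status.
if no_such_command_xyz 2>/dev/null; then found=yes; else found=no; fi

# 2. Variable substitution: $HOME is replaced by its value before echo runs.
msg=$(echo "In directory $HOME")

# 3. Pipelines: the output of printf becomes the input of wc.
lines=$(printf 'a\nb\nc\n' | wc -l | tr -d ' ')

echo "command found: $found"
echo "$msg"
echo "lines counted: $lines"
```

In each case the work is done by the shell itself before, or instead of, running the command named on the line.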

Types of shellsA number of shells are available for Unix systems, including:

Bourne shell

C shell

Korn shell

Graphical User Interface (GUI) shells

The Bourne shell, which was developed by Steve Bourne at Bell Laboratories, is one of the oldest shells and, as such, has gained a lot of popularity. It is widely used for shell programming because of its efficiency and because it is available on all Unix systems.

The C shell provides sophisticated interactive capabilities lacking in the Bourne shell. The C shell, which was developed at the University of California, Berkeley, has a syntax which resembles the C language. Features of the C shell include a command history buffer, command aliases and file name completion.

However the C shell does not allow efficient shell programs (also known as scripts) to be written. Due to the fact that C shell programs are written in a style similar to the C programming language, people who are unfamiliar with C may find the C shell difficult to program in.

The Korn shell combines the best features of the Bourne and C shells. Korn scripts are 95% upwardly compatible with Bourne scripts. The Korn shell interactive features include:

in-line editing

command editing

job control

Graphical User Interface (GUI) shells provide an iconic interface to Unix. GUI shells require the use of workstations (or powerful microcomputers) which perform part of the processing locally. The use of GUIs such as X-Windows is likely to become increasingly important in the near future. GUIs currently available include:

Sun View A Sun-specific GUI

Open Look GUI standard supported by Sun

Motif GUI standard supported by other suppliers

Vista eXceed Available on PCs; similar in style to Motif

There is a battle currently taking place in the market-place to establish the standard GUI.

Recommended shells

The Bourne shell is the oldest shell, and is widely used. The C shell has more utilities, however, and is probably more widely used now.

--------------------------------------------------------------------------------

The default shell for interactive shells at Leeds is the C shell. The Bourne shell is the default for shell programs.

--------------------------------------------------------------------------------

However the Bourne shell is recommended for shell programs. The Korn shell is not widely available and is not a standard part of Unix, but is perhaps the best option if available, unless you want to do a lot of C programming. You can change your default login shell using the command:

% chsh username /bin/sh     Bourne shell
% chsh username /bin/csh    C shell
% chsh username /bin/ksh    Korn shell

Warning! You probably don't want to try these commands now.

C shell features

The history mechanism

The history mechanism enables previously typed Unix commands to be re-invoked and edited. There are two forms. One is the quick substitution, which acts only on the immediately preceding command, e.g.:

% car message
car: Command not found
% ^r^t
This is the message file

This command replaces the first occurrence of 'r' with 't' in the last command.

A list of previously entered commands can be displayed using the history command:

% history
1 cd texts
2 vi lookup
3 who
4 history

Commands can be re-entered using the number. For example:

% !2

will re-execute the second command (vi lookup). It is possible to add extra options to commands re-executed. For example to redirect output from the who command to a file called list we could give the command (for the above list):

% !3 > list

You may also edit previous commands e.g:

% !2:s/vi/cat/
cat lookup

although it is usually easier to re-type the whole command. The last command may be referred to as !!, and you can count back using !-2, !-3 etc.

File name completion

Within the C shell when a file name is used in a command it is possible to specify only as many characters as will uniquely identify the file, and then press the <ESC> key to complete the filename:

% ls
mbox message
% cat me<ESC>
This is the message file

When you type <ESC>, the file name will be extended to 'message' on screen.

Command aliases

Command aliases provide a way of customising commands. For example:

% alias dir ls
% dir
mbox message

Note that command aliases are only valid during the execution of the current shell. It is normal practice to include alias definitions in your .cshrc file.

The following aliases could be useful to shorten long command names:

alias hh history
alias ll 'ls -al'
alias q logout

The quotes around ls -al are necessary because of the space in the command. This tells the shell that it is all one command.

--------------------------------------------------------------------------------

PRACTICE

--------------------------------------------------------------------------------

Put the above aliases in your .cshrc file. Think of some other aliases that you would use, such as shortened versions of commands or different names for commands that you will find easier to remember.

C shell startup files

Certain files are executed automatically.

These are:

.cshrc file

Executed whenever a new C shell is spawned

Useful for specifying command aliases

Since C shells may be spawned automatically by certain system commands (such as the mail system or a compiler) this file should NOT contain commands which send output to your terminal.

Contains a list of directories that are searched for commands. A line in the .cshrc file will give a value to the PATH system variable. The user can add pathnames to this list. It is conventional to store any of your own commands or shell scripts that you will use frequently in a directory called bin, and to add ~/bin to your search path.

.login file

Executed when you login.

Use for setting system wide variables, such as your terminal type.

Can be used to display information, such as who is logged on, or news from the system managers.

Shell processes

A process is an executing program. To display a list of processes use the ps command:

% ps
  PID TTY      TIME COMMAND
23268 ttyp1    0:01 ps
22520 ttyp1    0:00 csh

The PID is the Process Identifier. The TIME field gives the amount of CPU time used by the process.

Background processes

Normally processes run interactively, but they may also be run in the background, to enable the user to do something else while a process is running (this is known as 'multitasking'). This is usually necessary when you are running a very long job. To run a command in the background use the & character at the end of the command line, as follows:

% command &

Note that output from command will still be sent to standard output. If you fail to redirect standard output it will be sent to your terminal where it is likely to be confused with output from your interactive process.

For example, to sort logged on users using a background process give the command:

% who | sort > sortedwho &

Note that this would normally be a very short process and you would not in fact need to run it in the background.
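
The same idea can be sketched in Bourne shell syntax. This is only an illustration: the temporary file created by mktemp and the three sample words are assumptions chosen for the example, and wait is a standard shell built-in that is not covered above. The pipeline runs in the background with its output redirected to a file, and the script waits for it to finish before reading the result.

```shell
# Temporary file to receive the background job's output (the name is arbitrary).
out=`mktemp`

# Run a pipeline in the background; its output goes to the file, not the terminal.
(echo pear; echo apple; echo fig) | sort > "$out" &

# Block until the background jobs started by this shell have finished.
wait

sorted=`cat "$out"`
echo "$sorted"
rm -f "$out"
```

Without the wait, the script could try to read the file before the background job had written it.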

Controlling processes

You may wish to terminate a background process. To do this you must first find out its process ID (PID) using ps:

% ps
  PID TTY      TIME COMMAND
23397 ttyp1    0:01 who
23268 ttyp1    0:02 ps
22520 ttyp1    0:00 csh

Then use the kill command to terminate your process.

For example:

% kill 23397

If the process continues use the -9 argument:

% kill -9 23397

Another way of displaying your background processes is to use the jobs command:

% jobs
[1] + Running who | sort > sortedwho

The background process (or 'job') has been assigned the number 1, and this can be used to refer to it instead of the process ID. The job number is usually identified by preceding it with the '%' (per cent) character, so as to differentiate it from a process ID. So, for example, the command:

% kill %1

should kill this process. A job may also be stopped using ^Z if it is running interactively (you have already met this as a way of stopping vi). A stopped job can be resumed by simply typing its job number (e.g. %1 to run it in the foreground, or %1 & to run it in the background).

NOTE There are also the C shell commands fg and bg, which will bring a job to the foreground and push it to the background respectively.
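
As a rough sketch of the find-the-PID-then-kill sequence inside a script (written for the Bourne shell; sleep 60 merely stands in for a long job, and wait is a standard built-in not described above), the PID can be taken from the special parameter ! instead of being read from ps:

```shell
# Start a long-running command in the background.
sleep 60 &
pid=$!                       # $! holds the PID of the last background process
echo "started background process $pid"

# Ask the process to terminate, then collect its exit status.
kill "$pid"
status=0
wait "$pid" 2>/dev/null || status=$?
echo "process $pid ended with status $status"
```

A process that was killed reports a non-zero exit status, which is how the script can tell it did not finish normally.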

Controlling Processes After Logging Off

If you create a background process and log off, the background process will continue to execute. If you log in again and use the ps or jobs command to display your background processes, the original background process will not be displayed. The same will happen if you switch between windows when you are using a GUI. This is because these commands, by default, will only display processes that have been created (or 'spawned') by the original login process.

To display background processes spawned by a previous login session you will have to use the command:

% ps -fu lnp5jb
UID      PID  PPID  C    STIME TTY     TIME COMMAND
lnp5jb  7759  7757  0 10:37:21 ttyw7   0:00 who > sorted who
lnp5jb  5058  5057  0 09:57:02 ttyw5  10:03 longjob
lnp5jb  7760  7758 18 10:37:21 ttyv4   0:00 -csh [csh]
lnp5jb  7798  7760  6 10:37:42 ttyv4   0:00 ps -fu lnp5jb

Special characters

Certain characters have a special meaning to the shell. The backslash (\) is known as the escape character. A character following an escape character is not given its usual meaning. For example:

% echo "This is a very long message \
which is longer than 1 line"

In this example, because the command could not fit on one line, the \ character was given IMMEDIATELY before the <RETURN> key was pressed. This indicates that the <RETURN> has a special meaning, which is not its usual meaning of terminating the command. The double quotes character (") is used to group words together as a single expression. The single back quote (`) is used to identify a string which is to be executed rather than displayed. For example:

% echo "Users logged on are: `who`"

--------------------------------------------------------------------------------

PRACTICE

--------------------------------------------------------------------------------

Try this with and without the backquotes around who. Try it with date.
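
The difference can be seen in a small Bourne shell sketch. The basename command and the path given to it are assumptions chosen only because their output is predictable; any command could be used in their place.

```shell
# Inside double quotes, a backquoted string is executed and replaced by its output.
with="Last part of the path: `basename /usr/local/bin`"

# Without the backquotes the same words are simply displayed as text.
without="Last part of the path: basename /usr/local/bin"

echo "$with"
echo "$without"

# A backslash immediately before the end of the line continues the command.
echo This command is split \
over two lines
```

The first echo shows the result of running basename, while the second shows the command name itself.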

Shell parameters

Parameters can be set interactively in the C shell by using the set command:

% set jenny=/home/sunserv1_b/lnp5jb

To then use a parameter:

% cd $jenny
% pwd
/home/sunserv1_b/lnp5jb

The variable name is preceded by the $ prefix, to indicate that it is a variable. Curly brackets can be used to delimit the variable name if other characters need to come straight after it. For example:

% cat ${jenny}/test.dat

Note that the syntax for the Bourne shell is slightly different. You are most likely to use parameters in shell scripts, which may well be executed by a Bourne shell. The basic difference is that the set command is not used:

$ jenny=/home/sunserv1_b/lnp5jb
$ cd $jenny
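
A short Bourne shell sketch of assignment and of the curly brackets follows. The names datafile and backup, and the _old suffix, are invented for the illustration.

```shell
# Bourne shell assignment: no "set", and no spaces around the "=".
jenny=/home/sunserv1_b/lnp5jb

# Curly brackets mark where the variable name ends.
datafile=${jenny}/test.dat
backup=${jenny}_old     # without the brackets the shell would look for $jenny_old

echo "$datafile"
echo "$backup"
```

In the second case the brackets are essential, because an underscore is a legal character in a variable name.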

Special shell variables

--------------------------------------------------------------------------------

CDPATH tells the shell which directories to search when cd is given a relative pathname

--------------------------------------------------------------------------------

HOME the name of your home directory

MAIL the pathname of the file where your mail is placed

PATH the list of directories searched for commands

PS1 the primary prompt string

PS2 the secondary prompt string

! the process number of the last process run in the background

# the number of positional parameters

$ the process number of the current shell

? the exit status of the last command run (0 if it was completed successfully, non-zero otherwise).

You can see the values of these variables by typing set (with no arguments).
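
The single-character parameters in the list above can be inspected from within a Bourne shell script. The following is a minimal sketch: set -- is used only to give the script some positional parameters to count, and the names count, ok and failed are invented for the example.

```shell
# $# : the number of positional parameters (set -- replaces them for this demo).
set -- one two three
count=$#

# $? : the exit status of the last command run.
true
ok=$?                          # true always succeeds, so this is 0
failed=0
test 1 -gt 2 || failed=$?      # the test fails, so this is non-zero

# $$ : process number of the current shell; $! : that of the last background process.
sleep 1 &
echo "shell $$ started background process $!"
wait

echo "count=$count ok=$ok failed=$failed"
```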

SHELL PROGRAMMING

--------------------------------------------------------------------------------

Shell commands can be stored in a file which can be executed when required. A file containing shell commands is known as a script. For example:

% cat > list                - create the file
pwd
ls
^D
% chmod u+x list            - give execute permission
% list                      - execute the script
/home/sunserv2_a/lnp5jb
mbox message list bin
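
The same steps can be carried out non-interactively. The sketch below makes some assumptions: it works in a scratch directory made with mktemp -d (widely available, though not part of every UNIX), it uses a here-document in place of typing at the keyboard, and it runs the script as ./list in case the current directory is not in your search path.

```shell
# Work in a scratch directory so that none of your own files are touched.
dir=`mktemp -d`

# Create the script; the here-document stands in for typing after "cat > list".
cat > "$dir/list" <<'EOF'
#!/bin/sh
pwd
ls
EOF

chmod u+x "$dir/list"               # give execute permission
listing=`cd "$dir" && ./list`       # execute the script and capture its output
echo "$listing"

rm -rf "$dir"
```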

Control structures

As mentioned earlier, you are strongly recommended to carry out all shell programming in the Bourne Shell. This does not mean that you have to be running a Bourne Shell when you start a script. By default, all shell scripts are normally executed by the Bourne shell, whatever your normal interactive shell. This is possible because when you run a script a new shell is started ('spawned' according to the jargon) to run the commands. You can add (as the first line):

#! /bin/sh

to ensure that it is a Bourne Shell script. A C shell script would begin:

#! /bin/csh

Command parameters

The Bourne shell is capable of using parameters (see the section on parameters in the previous chapter). These may be defined by the assignment operator =, by the read command and by the for command. The Bourne shell also interprets parameters which are given as arguments to the command that executes the shell script. Such parameters are 'positional parameters', which means that they are interpreted as a list structure. This can be seen in the simple example below:

% ex simple                 - first create the script
"simple" [New file]
:a
echo $1
echo $2
echo $3
.
:wq
"simple" [New file] 3 lines, 24 characters
% chmod u+x simple          - make it executable
% simple one two three      - execute it
one
two
three
%

The three arguments given to the script ('one', 'two' and 'three') are read in by the script as variables named 1, 2 and 3, and so are referred to in the script as $1, $2 and $3 respectively. The special parameter * refers to all of the parameters, and the special parameter # refers to the number of parameters.

% ex simple2                - create a new script
"simple2" [New file]
:a
echo $*
echo $#
.
:wq
"simple2" [New file] 2 lines, 16 characters
% chmod u+x simple2
% simple2 one two three
one two three
3
%
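
Positional parameters behave in the same way inside a shell function, which makes them easy to try out without creating a new file each time. The function name show_args is invented for this sketch.

```shell
# A function receives its own positional parameters, just as a script does.
show_args() {
    echo "first: $1"
    echo "all: $*"
    echo "count: $#"
}

show_args one two three
```

Calling show_args with a different number of arguments shows how $* and $# change while $1 always refers to the first argument.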

read

The read command enables parameter values to be entered interactively by the user while the script is running. It is usual to provide a prompt for the user, as in the script listed below (called greeting):

echo "What's your name?"
read name
echo "Hello, $name"

This can give the following results:

% greeting                  - execute the script
What's your name?           - output
Jenny                       - the shell waits for your input
Hello, Jenny                - output
%

More than one parameter can be given to the read command, usually separated by one or more spaces, as in the following script (called count):

echo "How far can you count?"
read first second third
echo $first $second $third

which can run to give the following:

% count
How far can you count?
1 2 3                       - user input
1 2 3                       - script output

--------------------------------------------------------------------------------

PRACTICE

--------------------------------------------------------------------------------

See what happens with this script if you give it fewer than three parameters. Try it with more than three - is this what you expected? Can you explain this?

Try changing the script so that it echoes each parameter on a different line. This should show what is going on.
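
One way to carry out this experiment without typing the input each time is sketched below. The function name split_three is invented for the example, and the input is supplied through a pipe rather than from the keyboard.

```shell
# Read one line into three variables and show each on a line of its own.
split_three() {
    read first second third
    echo "first=$first"
    echo "second=$second"
    echo "third=$third"
}

echo "1 2 3 4 5" | split_three     # more words than variables
echo "1 2" | split_three           # fewer words than variables
```

The last variable named in the read command receives whatever is left of the line, and a variable with no word to receive is left empty.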

Control structures

Sometimes it is useful to use control structures (like those found in programming languages), for example to specify that a command is only carried out under certain conditions, or that the same thing is done to each of a list of arguments. The shell provides control of flow with the following statements:

if structured control branching

case multiway branching

for looping over a list of commands

while conditional looping

until conditional looping

if...then...else...fi

This structure allows conditional branching. It takes the following form:

if command_list_1
then
    command_list_2
[else
    command_list_3]         - this clause is optional
fi

Note that it is usual practice to indent the subordinate clauses to make the script easier to read, but it is not necessary. This structure depends on the exit status of command_list_1. Every time a command runs it returns an exit status: 0 (also known as a 'true' result) if it completes its run successfully, or a non-zero value, typically 1 ('false'), if it fails to end normally. command_list_2 is executed if and only if the exit status of the last command in command_list_1 is 0 (true). command_list_3 is executed if and only if that exit status is non-zero (false).

The test command is often used to generate an exit result. Equivalence operators may also be used such as = (equals) or != (not equal to). The following example shows the script trio in action:

% cat trio
if test $# = 3
then
    echo "There are three parameters."
fi
% trio one two three
There are three parameters.
% trio one two
%                           - no output
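
A variation on trio with the optional else clause is sketched below. The function name check_three and the wording of the second message are invented for the example, and the arithmetic operator -eq is used in place of =.

```shell
# Report whether exactly three parameters were given.
check_three() {
    if test $# -eq 3
    then
        echo "There are three parameters."
    else
        echo "Expected 3 parameters but got $#."
    fi
}

check_three one two three
check_three one two
```

Unlike trio, this version always produces some output, which makes it easier to see which branch was taken.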

test

The test command can be used in its simplest form to test whether a string exists (more exactly, whether or not it is a 'null string'), as in the following script:

% cat test.1
echo "Type something please:"
read a
if test "$a"
then
    echo "Thank you"
else
    echo "Thanks for nothing"
fi
% test.1
Type something please:
Hello
Thank you
% test.1
Type something please:
<RETURN>                    - the user types nothing
Thanks for nothing
%

There are also several options that can be used in a command of the form:

test [options] filename

The following options are available:

-d true if a file is a directory

-h true if a file is a symbolic link

-x true if file exists and is executable

-l tests the length of a string

-f true if the file exists and is a regular file

-r true if the file can be read

-s true if the file exists and is not empty

-w true if the file can be written to

= is equal to

!= is not equal to

There are also the following arithmetic operators which apply to integer values:

-eq is equal to

-ne is not equal to

-gt is greater than

-ge is greater than or equal to

-lt is less than

-le is less than or equal to

Note that the above operators are all for use with the test command, and cannot be used independently.
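
Several of the file options can be combined with if to describe what a name refers to. The following is a sketch only: the function name kind_of, the file names and the scratch directory made with mktemp -d are all assumptions made for the illustration.

```shell
# Classify a pathname using the -d, -f and -s options of test.
kind_of() {
    if test -d "$1"
    then
        echo "directory"
    else
        if test -f "$1"
        then
            if test -s "$1"
            then
                echo "non-empty file"
            else
                echo "empty file"
            fi
        else
            echo "missing"
        fi
    fi
}

dir=`mktemp -d`
: > "$dir/empty"              # the null command ":" creates an empty file
echo data > "$dir/full"

k1=`kind_of "$dir"`
k2=`kind_of "$dir/empty"`
k3=`kind_of "$dir/full"`
k4=`kind_of "$dir/none"`
echo "$k1, $k2, $k3, $k4"
rm -rf "$dir"
```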

case

When more than two directions for the control of flow are needed, if clauses may be nested, but the case structure provides a more elegant way of doing this. The case structure is of the form:

case string in
pattern) command_list_1;;
pattern) command_list_2;;
   --         --
pattern) command_list_N;;
esac

The shell attempts to match the string with each pattern in turn. When a pattern that matches string is found, the appropriate command list is executed, and the case command is then terminated.

The case command is often used to give the user a choice of options, as in the following:

% cat pick
echo "Type one of the following:"
echo " 1 - who am I?"
echo " 2 - who is logged on?"
echo " 3 - date"
echo " 4 - calendar"
read n
case $n in
    1) whoami ;;
    2) who ;;
    3) date ;;
    4) cal ;;
esac

Study the following, rather more complex, example:

% cat test.2
echo "Give me a letter:"
read letter
case $letter in
    [aeiou]) echo "That's a vowel!";;
    [b-df-hj-np-tv-z]) echo "That's a consonant!";;
    [A-Z]) echo "I said lower case!";;
    [0-9]) echo "I said a letter, not a number!";;
    *) echo "What's that?"
esac
echo "Thank you and goodbye"

% test.2
Give me a letter:
a
That's a vowel!
Thank you and goodbye
% test.2
Give me a letter:
x
That's a consonant!
Thank you and goodbye
% test.2
Give me a letter:
;
What's that?
Thank you and goodbye
%

Note that the last pattern in this case clause will match anything if a match has not already been found.
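
A further sketch shows a case statement used to interpret a yes or no answer. The function name answer_kind is invented, and the example relies on a standard feature of case not mentioned above: several patterns may share one command list if they are separated by the | character.

```shell
# Reduce the various ways of answering to "yes", "no" or "unknown".
answer_kind() {
    case "$1" in
        y|Y|yes) echo "yes" ;;
        n|N|no)  echo "no" ;;
        *)       echo "unknown" ;;
    esac
}

answer_kind Y
answer_kind no
answer_kind maybe
```

As in test.2, the final * pattern catches anything that the earlier patterns did not match.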

for

The for command can be used to apply a list of commands to each of a series of values. It has the general form:

for variable [in wordlist]
do
    command-list
done

The wordlist is a series of strings separated by spaces. The variable takes the value of each of these strings in turn, and the command list is run once for each value. Here is an example:

for n in one two three four five six seven
do
    echo $n
done

This script will output the list of words ('one', 'two', etc.)
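
The square brackets in the general form indicate that the wordlist is optional. When it is left out, the loop runs over the positional parameters instead, as in this sketch (the function name list_args is invented for the example):

```shell
# With a word list: the variable takes each word in turn.
for n in one two three
do
    echo "word: $n"
done

# Without "in wordlist": the loop runs over the positional parameters.
list_args() {
    for arg
    do
        echo "argument: $arg"
    done
}

list_args alpha beta
```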

while

The while command allows a sequence of commands to be executed repeatedly while certain conditions are met. It takes the form:

while command_list_1
do
    command_list_2
done

If command_list_1 is exited successfully, then command_list_2 is executed. This process continues until command_list_1 fails. Here is an example:

flag=y
while test "$flag" = y
do
    echo "Do it again?"
    read flag
done

The loop will be repeated while the value of the variable flag remains 'y'.

until

The until command tests for the opposite condition to the while command: command_list_2 is executed repeatedly until command_list_1 succeeds. The following behaves like the example with while above, except that the loop ends only when the user types 'n':

flag=y
until test "$flag" = n
do
    echo "Do it again?"
    read flag
done
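
The relationship between the two loops is perhaps easier to see with a counter than with keyboard input. In this sketch the function names are invented, and the standard expr command (not described above) is used to add one to the counter.

```shell
# Count from 1 to 3 while the counter is less than or equal to 3.
count_while() {
    n=1
    while test $n -le 3
    do
        echo "pass $n"
        n=`expr $n + 1`
    done
}

# The same count, expressed as "until the counter is greater than 3".
count_until() {
    n=1
    until test $n -gt 3
    do
        echo "pass $n"
        n=`expr $n + 1`
    done
}

count_while
count_until
```

Both functions produce the same three lines, because the until condition is the exact opposite of the while condition.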

--------------------------------------------------------------------------------

Exercises

1. Write a script called hello which outputs the following:

--------------------------------------------------------------------------------

your username

the time and date

who is logged on

Also output a line of asterisks (*********) after each section.

2. Put the command hello into your .login file so that the script is executed every time that you log on.

--------------------------------------------------------------------------------

3. Write a script that will count the number of files in each of your subdirectories.

Command summary

alias assigns an alias for commands, files or devices. Only available in the C shell.

cat concatenates (joins) files and lists the result. Often used to direct the contents of a single file to the standard output.

Qualifiers:

-n numbers the lines in the file(s)

-s eliminates consecutive blank lines

Example: % cat file1 file2 > file3

cd [directory] changes current working directory. (Default is home directory.)

Example: % cd /usr/etc

chmod mode file changes permissions of files and directories. Mode consists of three elements: [ugoa] [+-=] [rwxs]

Example: % chmod g+r project (adds read permission to group)

cmp compares two files and prints the line number and character where they differ.

Example: % cmp file1 file2

comm compares two files for common lines.

-1 suppresses lines that only occur in file1

-2 suppresses lines that only occur in file2

-3 suppresses lines that occur in both files

cp makes a copy of a file.

Qualifiers:

-i interactive mode (to protect destination file if it already exists)

Example: % cp -i file1 file2

date gives time and date

diff lists differences in two files or directories.

Qualifiers

-b ignores trailing blanks

-e prints ed changes needed to make files identical

ed accesses the ed line editor

grep searches a file for a pattern (see chapter 15)

head -n Prints first n lines

jobs lists the background jobs.

Qualifier:

-l displays process id

kill terminates background job

ln -s sets up a symbolic link to a file or directory.

Example: ln -s /usr/games fun

ls lists files in a directory (default current directory)

Qualifiers:

-a all files (including hidden files)

-c uses the time of last status change (rather than modification time) for sorting or display

-g give group identity

-l in long format

-s gives the size of each file in blocks

-t sorted by modification time

-u sorted by last access time

mail receives and sends mail

mkdir creates a directory

more lists the contents of a file a page at a time.

mv moves a file. Often used to simply rename a file.

Qualifiers:

-i interactive mode to protect destination file if it already exists

passwd changes your password

pg pager available on some systems

pr formats and outputs a file.

Qualifiers:

-ln where n is the page length (default 66)

-wn where n is the page width (default 72)

-n no. of columns

-hstring defines the header name as string

pwd displays name of current directory

rm deletes files

Qualifiers:

-i interactive prompt to protect files

rmdir delete directory (only works on empty directories).

sort sorts and merges files.

Qualifiers:

-b ignores blanks

-d dictionary order

-f fold upper to lower case

-i ignores characters outside the printable ASCII set

-n sorts numbers by value

-o directs output to a file

-r sorts in reverse order

spell checks spelling in a file

tail n lists the last n lines of a file if n is negative (e.g. tail -5), or starts listing at the nth line if n is positive (e.g. tail +5)

time displays the execution time of a command

unalias removes a previously defined alias

vi accesses the vi screen editor

wc counts the number of lines, words and characters in a file

Qualifiers:

-c counts only characters

-w counts only words

-l counts only lines

who who is logged on

write direct communications to users on the same machine
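
A few of the commands summarised above can be tried together in a short Bourne shell sketch. The file names, their contents and the scratch directory made with mktemp -d are assumptions chosen for the illustration. The utilities printf and tr do not appear in the summary, and head is written in its head -n 1 form.

```shell
dir=`mktemp -d`
printf 'apple\nbanana\ncherry\n' > "$dir/file1"
printf 'banana\ncherry\ndate\n'  > "$dir/file2"

# comm expects sorted input; -12 leaves only the lines common to both files.
common=`comm -12 "$dir/file1" "$dir/file2"`

# -23 leaves only the lines that occur in file1 alone.
only1=`comm -23 "$dir/file1" "$dir/file2"`

# cat joins the files, sort -r puts them in reverse order, head keeps the first line.
last=`cat "$dir/file1" "$dir/file2" | sort -r | head -n 1`

# wc -l counts lines (tr removes the padding that some versions of wc add).
lines=`cat "$dir/file1" "$dir/file2" | wc -l | tr -d ' '`

echo "common: $common"
echo "only in file1: $only1"
echo "last in order: $last"
echo "lines in total: $lines"
rm -rf "$dir"
```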

http://www.comp.lancs.ac.uk/computing/users/eiamjw/unix/