Lecture 5: Regular Expressions; grep; CSE4251 The Unix Programming Environment

Upload: alma-dwyer

Post on 16-Dec-2015

Lecture 5

Regular Expressions; grep

CSE4251 The Unix Programming Environment

Regular Expressions (R.E.)

• What is an R.E.?
  – a set of symbols/rules to describe a string pattern
    e.g., the strings a, aa, aaa, aaaa, ... can be described simply as a+
  – facilitates search/replace/delete ...
• Unix commands/utilities that support regular expressions:
  – grep (fgrep, egrep) - search a file for a string or regular expression
  – sed - stream editor
  – awk (nawk) - pattern scanning and processing language
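As a small illustration of the a+ pattern mentioned above, the sketch below filters a few sample strings. The sample strings and the use of -E (extended syntax) and -x (match the whole line) are choices made for this example, not something the slides prescribe.

```shell
# Sample strings, one per line (made up for this illustration).
samples=$(printf '%s\n' a aa aaaa b ab)

# -E selects extended regular expressions, where + means "one or more".
# -x requires the pattern to match the entire line.
matches=$(printf '%s\n' "$samples" | grep -E -x 'a+')

# Prints the three lines that consist only of the letter a.
printf '%s\n' "$matches"
```

Without -x, the line ab would also be printed, because grep reports any line that contains a match.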

Regular Expression Symbols

• .  any single character
• *  0 or more instances of the previous character
  e.g., a* matches the empty string, a, aa, aaa, aaaa, ...
• ?  0 or 1 instance of the previous character
• +  1 or more instances of the previous character
  (? and + belong to the extended syntax: use grep -E or egrep; plain grep treats them as ordinary characters)
• ^  beginning of a line
  e.g., ^A matches A only if A is at the beginning of a line
• $  end of a line
  e.g., A$ matches A only if A is at the end of a line
• \  turns off the special meaning of the next character
• […]  matches any one of the enclosed characters
  – [abc] matches a single a, b, or c
  – [a-z] matches any one of abcdef…xyz
• [^…]  matches any one character that is not enclosed
  – [^A-Za-z] matches a single character as long as it is not a letter
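The anchors and bracket expressions above can be tried on a handful of lines. This is a minimal sketch; the sample lines and variable names are invented for the illustration.

```shell
# Sample input, one string per line (hypothetical data).
lines=$(printf '%s\n' 'Apple pie' 'Big Apple' 'banana' 'BANANA' '12345')

# ^A : the line must begin with A.
starts_with_A=$(printf '%s\n' "$lines" | grep '^A')

# a$ : the line must end with a lowercase a.
ends_with_a=$(printf '%s\n' "$lines" | grep 'a$')

# [^A-Za-z]* with -x : the whole line consists of non-letter characters.
no_letters=$(printf '%s\n' "$lines" | grep -x '[^A-Za-z]*')

printf '%s\n' "$starts_with_A" "$ends_with_a" "$no_letters"
```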

Regular Expression Symbols

• \<  beginning of a word
  e.g., \<abc matches “abcd” but not “dabc”
• \>  end of a word
  e.g., abc\> matches “dabc” but not “abcd”
• \(…\)  stores the text matched by the pattern between \( and \)
  e.g., \(abc\)def matches “abcdef” and stores abc in \1, so \(abc\)def\1 matches “abcdefabc”.
  Up to 9 matches can be stored, e.g., \(ab\)c\(de\)f\1\2 matches “abcdefabde”.
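The word anchors and stored patterns above can be checked with the same strings the slide uses. This sketch assumes a grep that supports \< and \> (GNU grep does; they are an extension rather than part of POSIX), and the extra string abcdefxyz is added only as a non-matching control.

```shell
# Word anchors: abc at the start of a word vs. at the end of a word.
pair=$(printf '%s\n' abcd dabc)
word_start=$(printf '%s\n' "$pair" | grep '\<abc')
word_end=$(printf '%s\n' "$pair" | grep 'abc\>')

# Back-references: \1 and \2 repeat the text stored by the first and
# second \( \) groups.
refs=$(printf '%s\n' abcdefabc abcdefxyz abcdefabde)
one_ref=$(printf '%s\n' "$refs" | grep '\(abc\)def\1')
two_refs=$(printf '%s\n' "$refs" | grep '\(ab\)c\(de\)f\1\2')

printf '%s\n' "$word_start" "$word_end" "$one_ref" "$two_refs"
```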

Regular Expression Symbols

Repeat

• X{n}  the preceding item X is matched exactly n times
• X{n,}  the preceding item X is matched n or more times
• X{n,m}  the preceding item X is matched at least n times, but not more than m times
  (this is the grep -E form; plain grep writes the braces as \{n\}, \{n,\}, \{n,m\})
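The repeat counts can be sketched on a list of numbers. The input values are made up; -E selects the unescaped brace form and -x makes the count apply to the whole line.

```shell
# Digit strings of different lengths (hypothetical data).
nums=$(printf '%s\n' 7 42 123 4567 98765)

# Exactly three digits.
exactly_3=$(printf '%s\n' "$nums" | grep -E -x '[0-9]{3}')

# Four or more digits.
at_least_4=$(printf '%s\n' "$nums" | grep -E -x '[0-9]{4,}')

# Between two and four digits.
two_to_four=$(printf '%s\n' "$nums" | grep -E -x '[0-9]{2,4}')

# The same "exactly three" test in basic syntax, with escaped braces.
exactly_3_bre=$(printf '%s\n' "$nums" | grep -x '[0-9]\{3\}')

printf '%s\n' "$exactly_3" "$at_least_4" "$two_to_four"
```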

Subpattern

• A subpattern can be put inside ( ), and +, *, and ? can then be applied to the entire subpattern.
  e.g., a(bc)*d matches “ad”, “abcbcd”, “abcbcbcbcbcbcd”, ...
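A quick way to see the grouped repetition at work is to test a(bc)*d against a few candidates. The candidate strings are invented for this sketch, and grep -E is assumed because unescaped parentheses group only in the extended syntax.

```shell
# Candidates: the first three fit a(bc)*d, the last two do not.
cands=$(printf '%s\n' ad abcd abcbcd abd abcbd)

# (bc)* repeats the two-character group bc as a unit, zero or more times.
whole=$(printf '%s\n' "$cands" | grep -E -x 'a(bc)*d')

printf '%s\n' "$whole"
```

abd and abcbd are rejected because a lone b is not a complete copy of the group.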

More examples

• x[abc]?x matches "xax" or "xx"
• [abc]* matches "aaaaa" or "acbca"
• 0*10 matches "010" or "0000010" or "10"
• [Dd][Aa][Vv][Ee]
  – matches "Dave" or "dave" or "dAVE"
  – does not match "ave" or "da"
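These patterns can be verified the same way. In this sketch the non-matching strings (xabx, xdx, 1, 00) are extra controls that do not come from the slide.

```shell
# x[abc]?x : an optional a, b, or c between two x characters (needs -E for ?).
opt=$(printf '%s\n' xax xx xabx xdx | grep -E -x 'x[abc]?x')

# 0*10 : any number of leading zeros followed by 10.
zeros=$(printf '%s\n' 010 0000010 10 1 00 | grep -x '0*10')

# [Dd][Aa][Vv][Ee] : "dave" in any mix of upper and lower case.
dave=$(printf '%s\n' Dave dave dAVE ave da | grep '[Dd][Aa][Vv][Ee]')

printf '%s\n' "$opt" "$zeros" "$dave"
```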

Review: metacharacters for filename abbreviation (lecture 2)

• Metacharacters: * ? [ ] ~
• * matches any string of characters (including none)
    $ ls *.doc        # list all files ending with .doc
• ? matches a single character
    $ ls ?.doc        # list a.doc, b.doc, but not ab.doc
• […] matches any one of the enclosed characters
    $ ls [ab].doc     # only list a.doc or b.doc
    $ ls *[cx].doc    # list files ending with c.doc or x.doc
• ~ is a shortcut for your home directory
    $ cd ~            # change to home directory
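To try the filename metacharacters safely, the sketch below works in a scratch directory. The directory (created with mktemp -d, assumed to be available) and the file names are hypothetical; echo is used instead of ls so that the output shows exactly what the shell expanded.

```shell
start_dir=$PWD
dir=$(mktemp -d)                 # scratch directory for the experiment
cd "$dir" || exit 1
touch a.doc b.doc ab.doc abc.doc notes.txt

set -- *.doc                     # every name ending in .doc
n_doc=$#                         # number of names the shell produced
one_char=$(echo ?.doc)           # exactly one character before .doc
a_or_b=$(echo [ab].doc)          # a single a or b before .doc
ends_c=$(echo *[cx].doc)         # names ending in c.doc or x.doc

echo "$n_doc .doc files; ?.doc -> $one_char; *[cx].doc -> $ends_c"

cd "$start_dir" && rm -rf "$dir" # clean up
```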

Difference

• Interpreted by different programs
  – filename expansion is done by the shell
  – regular expressions are interpreted by the commands (programs) themselves
• Be careful when specifying an R.E. on the command line
  – It is a good idea to always quote an R.E. that contains special characters ('' or "") on the command line
  – Example:
      $ grep '[a-z]*' somefile.txt
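The effect of quoting can be made visible by putting echo in front of the command, so the shell prints the argument list grep would receive. This is only a sketch: the scratch directory and the file named apple are invented to give the shell something to expand a* into.

```shell
start_dir=$PWD
dir=$(mktemp -d)
cd "$dir" || exit 1
printf '%s\n' 'a*' 'aaa' 'b' > somefile.txt
touch apple                      # a file whose name matches the glob a*

# Unquoted: the shell replaces a* with the matching file name before grep
# would start, so grep would search for "apple" instead of the pattern.
unquoted=$(echo grep a* somefile.txt)

# Quoted: the pattern is passed through unchanged.
quoted=$(echo grep 'a*' somefile.txt)

printf '%s\n' "$unquoted" "$quoted"

cd "$start_dir" && rm -rf "$dir"
```

If no file name happens to match, most shells pass the unquoted pattern through as typed, which is why the mistake can go unnoticed until a matching file appears.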

grep - search for a string

• grep [-bchilnsvw] PATTERN [filename...]
  – reads files or standard/redirected input
  – searches for the specified pattern in each line
  – sends results to the standard output
• Examples:
    $ grep '^X11' *        - search all files for lines starting with the string "X11"
    $ grep -v text file    - print lines that do not match "text"
• Exit status
  – 0 - pattern found; 1 - not found; greater than 1 - an error occurred
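Because grep reports its result through the exit status, it can drive shell conditionals directly. The sketch below uses a temporary file with made-up contents; -q (suppress output) is a standard grep option, although it is not in the option list shown on the slide.

```shell
sample=$(mktemp)
printf '%s\n' 'X11 forwarding enabled' 'no display set' > "$sample"

# Record the exit status of each search.
found=0;   grep -q '^X11' "$sample"     || found=$?     # a line matches
missing=0; grep -q '^Wayland' "$sample" || missing=$?   # no line matches
failed=0;  grep -q 'X11' "$sample.does-not-exist" 2>/dev/null || failed=$?   # unreadable file

# Typical use: branch on whether the pattern was found.
if grep -q '^X11' "$sample"; then
    status_msg="found"
else
    status_msg="not found"
fi

echo "found=$found missing=$missing failed=$failed ($status_msg)"
rm -f "$sample"
```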

grep - options

• Some useful options
    -c  count the number of matching lines
    -i  ignore case
    -l  list only the files with matching lines
    -L  list the files that do not contain a match
    -v  display lines that do not match
    -n  print line numbers
    -r  recursively search the sub-directories
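The options can be compared side by side on a few small files. Everything in this sketch (the scratch directory, the .log file names and their contents) is hypothetical; -L and -r are assumed to be available, as they are in GNU and BSD grep.

```shell
start_dir=$PWD
dir=$(mktemp -d)
mkdir "$dir/sub"
printf '%s\n' 'Error: disk full' 'ok' 'error: retry' > "$dir/a.log"
printf '%s\n' 'ok' 'ok' > "$dir/b.log"
printf '%s\n' 'error: nested' > "$dir/sub/c.log"
cd "$dir" || exit 1

count=$(grep -c 'error' a.log)            # matching lines, case-sensitive
count_i=$(grep -c -i 'error' a.log)       # matching lines, ignoring case
numbered=$(grep -n -i 'error' a.log)      # matches prefixed with line numbers
with_match=$(grep -l 'ok' a.log b.log)    # files that contain a match
without=$(grep -L 'error' a.log b.log)    # files that contain no match
inverted=$(grep -v 'ok' a.log)            # lines that do not match
recursive=$(grep -r -l 'nested' .)        # search sub-directories too

printf '%s\n' "$count" "$count_i" "$numbered" "$without" "$recursive"

cd "$start_dir" && rm -rf "$dir"
```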

Regular Expressions for grep

    \X       turn off any special meaning of character X
    ^        beginning of line
    $        end of line
    .        any single character
    [...]    any single character in the range ...
    [^...]   any single character not in the range ...
    r*       zero or more occurrences of r

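grep with pipes

• grep reads standard input when no file is named, so a pipe can be used wherever a file is expected
  e.g., $ ls -l | grep 'a*'
  (the pattern is quoted so that the shell does not expand a* as a filename pattern)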
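A pipeline of this kind might look like the following sketch, which filters a directory listing. The scratch directory and its file names are made up for the example.

```shell
dir=$(mktemp -d)
touch "$dir/report.txt" "$dir/notes.txt" "$dir/image.png"

# grep has no file argument here, so it filters the output of ls.
txt_files=$(ls "$dir" | grep '\.txt$')

# Pipes can be chained: drop the .txt names, then count what is left.
n_other=$(ls "$dir" | grep -v '\.txt$' | grep -c '.')

printf '%s\n' "$txt_files" "$n_other"
rm -rf "$dir"
```

The dot is escaped in '\.txt$' so that it matches a literal period rather than any character.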

gamefile

northwest NW Charles Main 3.0 .98 3 34
western WE Sharon Gray 5.3 .97 5 23
southwest SW Lewis Dalsass 2.7 .8 2 18
southern SO Suan Chin 5.1 .95 4 15
southeast SE Patricia Heme 4.0 .7 4 17
eastern EA TB Savage 4.4 .84 5 20
northeast NE AM Main Jr. 5.1 .94 3 13
north NO Margot Webber 4.5 .89 5 9
central CT Ann Stephens 5.7 .94 5 13

$ grep NW gamefile
northwest NW Charles Main 3.0 .98 3 34

$ grep '^n' gamefile
northwest NW Charles Main 3.0 .98 3 34
northeast NE AM Main Jr. 5.1 .94 3 13
north NO Margot Webber 4.5 .89 5 9

$ grep '4$' gamefile
northwest NW Charles Main 3.0 .98 3 34

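$ grep TB Savage gamefile
grep: Savage: No such file or directory
gamefile:eastern EA TB Savage 4.4 .84 5 20

(Unquoted, only TB is the pattern and Savage is taken as a file name. Quote a pattern that contains a space: grep 'TB Savage' gamefile.)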
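The difference between the unquoted and quoted forms can be reproduced with a two-line excerpt of gamefile. This sketch writes the excerpt to a temporary file and assumes the current directory contains no file named Savage.

```shell
gamefile=$(mktemp)
printf '%s\n' 'eastern EA TB Savage 4.4 .84 5 20' \
              'north NO Margot Webber 4.5 .89 5 9' > "$gamefile"

# Unquoted: grep searches for TB in two "files", Savage and $gamefile.
# The missing file makes grep report an error through its exit status.
unquoted_status=0
grep TB Savage "$gamefile" >/dev/null 2>&1 || unquoted_status=$?

# Quoted: the two words form a single pattern.
quoted=$(grep 'TB Savage' "$gamefile")

echo "unquoted exit status: $unquoted_status"
printf '%s\n' "$quoted"
rm -f "$gamefile"
```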

$ grep -l 'SE' *
gamefile

$ grep '5\..' gamefile
western WE Sharon Gray 5.3 .97 5 23
southern SO Suan Chin 5.1 .95 4 15
northeast NE AM Main Jr. 5.1 .94 3 13
central CT Ann Stephens 5.7 .94 5 13

$ grep '\<north' gamefile
northwest NW Charles Main 3.0 .98 3 34
northeast NE AM Main Jr. 5.1 .94 3 13
north NO Margot Webber 4.5 .89 5 9

$ grep '\<north\>' gamefile
north NO Margot Webber 4.5 .89 5 9

$ grep -v "Suan Chin" gamefile
northwest NW Charles Main 3.0 .98 3 34
western WE Sharon Gray 5.3 .97 5 23
southwest SW Lewis Dalsass 2.7 .8 2 18
southeast SE Patricia Heme 4.0 .7 4 17
eastern EA TB Savage 4.4 .84 5 20
northeast NE AM Main Jr. 5.1 .94 3 13
north NO Margot Webber 4.5 .89 5 9
central CT Ann Stephens 5.7 .94 5 13

$ grep -c 'west' gamefile
3
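To reproduce the gamefile examples, the file can be recreated with a here-document. This sketch assumes single spaces between fields (the original spacing is not recoverable from the slides) and writes to a temporary file rather than the current directory. The -w run relies on GNU or BSD grep.

```shell
gamefile=$(mktemp)
cat > "$gamefile" <<'EOF'
northwest NW Charles Main 3.0 .98 3 34
western WE Sharon Gray 5.3 .97 5 23
southwest SW Lewis Dalsass 2.7 .8 2 18
southern SO Suan Chin 5.1 .95 4 15
southeast SE Patricia Heme 4.0 .7 4 17
eastern EA TB Savage 4.4 .84 5 20
northeast NE AM Main Jr. 5.1 .94 3 13
north NO Margot Webber 4.5 .89 5 9
central CT Ann Stephens 5.7 .94 5 13
EOF

west=$(grep -c 'west' "$gamefile")         # lines containing west
starts_n=$(grep -c '^n' "$gamefile")       # lines beginning with n
five_dot=$(grep -c '5\..' "$gamefile")     # a 5, a literal dot, any character
whole_word=$(grep -w 'north' "$gamefile")  # -w: north as a whole word only

# Two filters combined with a pipe: lines starting with n that mention Main.
n_main=$(grep '^n' "$gamefile" | grep -c 'Main')

echo "west=$west starts_n=$starts_n five_dot=$five_dot n_main=$n_main"
printf '%s\n' "$whole_word"
rm -f "$gamefile"
```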

$ grep -i "$LOGNAME" /etc/passwd
zhengm:x:503:504::/home/zhengm:/bin/bash
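The double quotes in the last command matter: they let the shell substitute the variable before grep runs, whereas single quotes would pass the text $LOGNAME through literally. The sketch below shows the difference on a made-up two-line excerpt in passwd format; the variable name and the login name alice are hypothetical.

```shell
name='alice'   # hypothetical login name, standing in for $LOGNAME
passwd_sample=$(printf '%s\n' 'root:x:0:0:root:/root:/bin/bash' \
                              'Alice:x:1000:1000::/home/alice:/bin/bash')

# Double quotes: the shell expands $name, so grep receives ^alice:
expanded=$(printf '%s\n' "$passwd_sample" | grep -i "^$name:")

# Single quotes: grep receives the literal characters $name and finds nothing.
# "|| true" keeps the no-match exit status from stopping a strict script.
literal_count=$(printf '%s\n' "$passwd_sample" | grep -c '$name' || true)

printf '%s\n' "$expanded"
echo "literal matches: $literal_count"
```

Anchoring the pattern with ^ and a trailing colon restricts the match to the login-name field, which avoids hits in the home-directory path.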