perl = practical extraction and report language developed by

68
Perl Perl = Practical Extraction and Report Language Developed by Larry Wall (late 80’s) as a replace- ment for awk. Has grown to become a replacement for awk, sed, grep, other filters, shell scripts, C programs, ... (i.e. ”kitchen sink”). An extremely useful tool to know because it: runs on Unix variants (Linux/Android/OSX/IOS/.. Windows/DOS variants, Plan 9, OS2, OS390, VMS.. very widely used for complex scripting tasks has standard libraries for many applications (Web/CGI, DB, ...) Perl has been influential: PHP, Python, Ruby, ... even Java (interpreted) perl-1

Upload: others

Post on 11-Feb-2022

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Perl = Practical Extraction and Report Language Developed by

Perl

Perl = Practical Extraction and Report Language

Developed by Larry Wall (late 80’s) as a replace-

ment for awk.

Has grown to become a replacement for awk, sed,

grep, other filters, shell scripts, C programs, ...

(i.e. ”kitchen sink”).

An extremely useful tool to know because it:

• runs on Unix variants (Linux/Android/OSX/IOS/..),

Windows/DOS variants, Plan 9, OS2, OS390,

VMS..

• very widely used for complex scripting tasks

• has standard libraries for many applications

(Web/CGI, DB, ...)

Perl has been influential: PHP, Python, Ruby, ...

even Java (interpreted)

perl-1

Page 2: Perl = Practical Extraction and Report Language Developed by

Perl (cont.)

Some of the language design principles for Perl:

• make it easy/concise to express common id-

ioms

• provide many different ways to express the

same thing

• use sensible defaults to minimise declarations

• don’t be afraid to use context as a syntactic

tool

• create a large language that users will learn

subsets of

Many of these conflict with design principles of languages

for teaching.

perl-2

Page 3: Perl = Practical Extraction and Report Language Developed by

Perl (cont.)

So what is the end product like?

• a language which makes it easy to build use-

ful systems

• readability can sometimes be a problem (lan-

guage is too rich?)

• interpreted

efficient? (although still remarkably efficient)

Summary: it’s easy to write concise, powerful,

obscure programs in Perl

perl-3

Page 4: Perl = Practical Extraction and Report Language Developed by

Reference Material

Getting started ...

• A Little Book on Perl

by Robert W. Sebesta

Prentice Hall, 2000, ISBN:0139279555

Getting serious ...

• Perl Cookbook

by Tom Christiansen, Nathan Torkington, Larry

Wall

O’Reilly@ Associates, 1998, ISBN:1565922433

• Mastering Algorithms With Perl

by Jon Orwant, Jarkko Hietaniemi, John Mac-

Donald, John Orwant

O’Reilly@ Associates, 1999, ISBN: 1565923987

• www.perl.com

by various authors, on-line repository of Perl

information

perl-4

Page 5: Perl = Practical Extraction and Report Language Developed by

Using Perl

Perl programs can be invoked in several ways ...

• giving the filename of the Perl program as acommand line argument

perl PerlCodeFile.pl

• giving the Perl program itself as a commandline argument

perl {\em{-e}} ’print "Hello, world\(\backslash\)n";’

• using the #! notation and making the pro-gram file executable

chmod 755 PerlCodeFile./PerlCodeFile

perl-5

Page 6: Perl = Practical Extraction and Report Language Developed by

Using Perl (cont.)

Advisable to always use -w option.

Causes Perl to print warnings about common er-rors.

perl -w PerlCodeFile.pl

perl -w -e ’PerlCode’

Can use options with #!

#!/usr/bin/perl -w

PerlCode

you can also get warnings via a pragma:

use warnings;

To catch other possible problems

use strict;

Some programmers always use strict, others find

it too annoying.

perl-6

Page 7: Perl = Practical Extraction and Report Language Developed by

Syntax Conventions

Perl uses non-alphabetic characters to introduce

various kinds of program entities (i.e. set a context

in which to interpret identifiers).

Char Kind Example Description# Comment # comment rest of line is a comment$ Scalar $count variable containing simple@ Array @counts list of values, indexed% Hash %marks set of values, indexed@ Subroutine @doIt a callable piece of Perl

perl-7

Page 8: Perl = Practical Extraction and Report Language Developed by

Syntax Conventions (cont.)

Any unadorned identifiers are either

• names of built in (or other) functions (e.g.

chomp, split, ...)

• control-structures (e.g. if, for, foreach,

...)

• literal strings (like the shell!)

The latter can be confusing to C/Java/PHP pro-grammers e.g.

$x = abc; is the same as $x = "abc";

perl-8

Page 9: Perl = Practical Extraction and Report Language Developed by

Variables

Perl supports three kinds of variable:

• scalars ... a single atomic value (number or

string)

• arrays ... a list of values, indexed by number

• hashes ... a group of values, indexed by

string

Variables do not need to be declared or initialised.

If used when not initialised, the value is 0 or

empty string or empty list.

Beware: spelling mistakes in variable names, e.g.

print "abc=$acb

n"; rather than print "abc=$abc

n";

prints no value, leaving you wondering what hap-

pened to $abc’s value.

perl-9

Page 10: Perl = Practical Extraction and Report Language Developed by

Variables (cont.)

Many scalar operations have the notion of a ”de-

fault source/target”.

If you don’t specify an argument, they assume

one called $ .

This makes it

• often very convenient to write brief programs

(minimal syntax)

• sometimes confusing to new users (”Where’s

the argument??”)

$ performs a similar role to ”it” in English text.

E.g. “The dog ran away. It ate a bone. It had lots of

fun.”

perl-10

Page 11: Perl = Practical Extraction and Report Language Developed by

Scalars

One scalar data type: string.

In expressions/assignments, values of scalar vari-

ables are automatically converted to a type ap-

propriate for the context where the context is

determined by the operators used

• Arithmetic operators

numeric context

• String operators (concatenation etc)

string context

This requires two sets of relational (comparison)

operators:

• Numeric: == != < <= >> >>= <=>>

• String: eq ne lt le gt ge cmp

perl-11

Page 12: Perl = Practical Extraction and Report Language Developed by

Scalars (cont.)

Examples:

$x = ’123’; # $x assigned string "123"$y = "123 "; # $y assigned string "123 "$z = 123; # $z assigned integer 123$i = $x + 1; # $x value converted to integer$j = $y + $z; # $y value converted to integer$a = $x == $y; # compare numeric versions of $x,$y (true)$b = $x eq $y; # compare string versions of $x,$y (false)$c = $x.$y; # concat string versions of $x,$y (explicit)$c = "$x$y"; # concat string versions of $x,$y (interpolation)

Note: $c = $x $y is invalid (Perl has no empty infix

operator)

(unlike predecessor languages such as awk, where $x $y

meant string concatenation)

perl-12

Page 13: Perl = Practical Extraction and Report Language Developed by

Scalars (cont.)

Scalars are converted to printable form when in-

terpolated into strings.

A couple of subtleties to note:

• for booleans ... true

1, false

"".

• for numbers ... Perl uses the default format

%.15g

Numeric formatting produces the most compactrepresentation it can with up to 15 significantdigits, e.g.

$x = 3/2; print "$x"; # displays 1.5$x = 4/2; print "$x"; # displays 2$x = 1/3; print "$x"; # displays 0.333333333333333

To get numbers formatted exactly how you want

... use printf.

perl-13

Page 14: Perl = Practical Extraction and Report Language Developed by

Scalars (cont.)

A very common pattern for modifying scalars is:

$var = $var op expression

Compound assignments for the most commonoperators allow you to write

$var op= expression

Examples:

$x += 1; # increment the value of $x$y *= 2; # double the value of $y$a .= "abc" # append "abc" to $a

perl-14

Page 15: Perl = Practical Extraction and Report Language Developed by

Logical Operators

Perl has two sets of logical operators, one like

C, the other like ”English”.

The second set has very low precedence, so can

be used between statements.

Operation Example MeaningAnd x @@ y false if x is false, otherwiseOr x || y true if x is true, otherwiseNot ! x true if x is not true, falseAnd x and y false if x is false, otherwiseOr x or y true if x is true, otherwiseNot not x true if x is not true, false

perl-15

Page 16: Perl = Practical Extraction and Report Language Developed by

Logical Operators (cont.)

Example of using statement-level logical opera-tions:

if (!open(FILE,"myFile")) {die "Can’t open myFile";

}# or

if (!open(FILE,"myFile")){ die "Can’t open myFile"; }

# can be replaced by Perl idiom

open(FILE,"myFile") or die "Can’t open myFile";

perl-16

Page 17: Perl = Practical Extraction and Report Language Developed by

Arrays (Lists)

An array is a sequence of scalars, indexed by

position (0,1,2,...)

The whole array is denoted by @array

Individual array elements are denoted by $array[index]

$#array gives the index of the last element.

Example:

$a[0] = "first string";$a[1] = "2nd string";$a[2] = 123;or, equivalently,@a = ("first string", "2nd string", 123);

print "Index of last element is $#a\(\backslash\)n";print "Number of elements is ", $#a+1, "\(\backslash\)n";

perl-17

Page 18: Perl = Practical Extraction and Report Language Developed by

Arrays (Lists) (cont.)

Alternative interpretations for @a ...

@a = ("abc", 123, ’x’);

\begin{slide}\Heading{ Control Structures}

{\em{All}} single Perl statements must be terminated by\makeatletter\begin{small}\begin{program}

$x = 1;print "Hello";

All statements with control structures must beenclosed in braces, e.g.

if ($x > 9999) {print "x is big\(\backslash\)n";

}

You don’t need a semicolon after a statement

group in {...}.

Statement blocks can also be used like anony-

mous functions.

perl-18

Page 19: Perl = Practical Extraction and Report Language Developed by

Function Calls

All Perl function calls ...

• are call by value (like C) (except scalars

aliased to @ )

• are expressions (although often ignore return

value)

Notation(s) for Perl function calls:

&func(arg1, arg2, ... argn)func(arg1, arg2, ... argn)func arg1, arg2, ... argn

perl-19

Page 20: Perl = Practical Extraction and Report Language Developed by

Control Structures

Selection is handled by if ... elsif ... else

if ( boolExpr1 ) statements1elsif ( boolExpr2 ) statements2...else statementsn

statement if ( expression );

perl-20

Page 21: Perl = Practical Extraction and Report Language Developed by

Control Structures (cont.)

Iteration is handled by while, until, for, foreach

while ( boolExpr ) {statements

}

until ( boolExpr ) {statements

}

for ( init ; boolExpr ; step ) {statements

}

foreach var (list) {statements

}

perl-21

Page 22: Perl = Practical Extraction and Report Language Developed by

Control Structures (cont.)

Example (compute pow = kn):

# Method 1 ... while$pow = $i = 1;while ($i <= $n) {

$pow = $pow * $k;$i++;

}# Method 2 ... for$pow = 1;for ($i = 1; $i <= $n; $i++) {

$pow *= $k;}# Method 3 ... foreach$pow = 1;foreach $i (1..$n) { $pow *= $k; }

# Method 4 ... foreach $_$pow = 1;foreach (1..$n) { $pow *= $k; }

# Method 5 ... builtin operator$pow = $k ** $n;

perl-22

Page 23: Perl = Practical Extraction and Report Language Developed by

Control Structures (cont.)

Example of fine-grained loop control:

OUTER:while (i < 1000){

INNER:for ($i = 0; $i < 99; $i++){

last OUTER if $i > 90; # terminates both loops$i += 3;next if $i > 80; # next iteration of INNERif ($i > 70) { next; } # next iteration of INNER$i += 4;redo if $i < 70; # next iteration of INNERnext OUTER if $i == 42; # next iteration of OUTER

}}

perl-23

Page 24: Perl = Practical Extraction and Report Language Developed by

Terminating

Normal termination: call exit 0 at the end of

the script.

• accepts a list of arguments

• concatenates them all into a single string

• appends file name and line number

• prints this string

• and then terminates the Perl interpreter

Example:

if (! -r "myFile") {die "Can’t read myFile\(\backslash\)n";

}# ordie "Can’t read myFile\(\backslash\)n" if (! -r "myFile");

Convention: omit parentheses on die argument list if just

one argument

perl-24

Page 25: Perl = Practical Extraction and Report Language Developed by

Defining Functions

Perl functions (or subroutines) are defined viasub, e.g.

sub sayHello {print "Hello!\(\backslash\)n";

}

And used by calling, with or without &, e.g.

&sayHello; # arg list optional with &sayHello(); # more common, show empty arg list explicitly

perl-25

Page 26: Perl = Practical Extraction and Report Language Developed by

Defining Functions (cont.)

Function arguments are passed via a list variable@ , e.g.

sub mySub {@args = @_;print "I got ",$#args+1," args\(\backslash\)n";print "They are (@args)\(\backslash\)n";

}

Note that @args is a global variable.

To make it local, precede by my, e.g.

my @args = @_;

perl-26

Page 27: Perl = Practical Extraction and Report Language Developed by

Defining Functions (cont.)

Can achieve similar effect to the C function

int f(int x, int y, int z) {int res;...return res;

}

by using array assignment in Perl

sub f {my ($x, $y, $z) = @_;my $res;...return $res;

}

perl-27

Page 28: Perl = Practical Extraction and Report Language Developed by

Defining Functions (cont.)

For functions with array or hash parameters

sub ff {my ($x, @y, %z) = @_;...

}

Prototypes indicate expected parameter struc-

ture. Use one $ for each scalar and @ for a list.

Examples:

sub f($$$);sub ff($@%);

\begin{slide}\Heading{ Input/Output}

Files are accessed via {\em{handles}} ~ {\small (cf. Unix

<@><<</@>Handle<@>>></@> for an input file means "read the

E.g. ~~ <@> $line = <<STDIN>>; </@>

... stores the next line from standard input in the variable

Output file handles are used as the first argument to the

E.g. ~~ <@> print REPORT "Report for $today\(\backslash\)n";</@>

perl-28

Page 29: Perl = Practical Extraction and Report Language Developed by

... writes a line to the file attached to the <@>REPORT</@>

{\small Note: no comma after the handle ID}\end{slide}

\begin{slide}\ContHeading{ Input/Output}

Example (a simple <@>cat</@>):\makeatletter\begin{small}\begin{program}

#!/usr/bin/perl# Copy stdin to stdout

while ($line = <<STDIN>>) {print $line;

}

However, this can be simplified to:

while (<<STDIN>>) { print; }

# or even

print <<STDIN>>;

Defaults:

• the default destination variable for input is $

• the default argument for print is also $

Page 30: Perl = Practical Extraction and Report Language Developed by

Input/Output (cont.)

Handles can be explicitly attached to files via theopen command:

open(DATA, "<< data"); # read from a file called "data"open(RES, ">> results"); # write to a file called "results"open(XTRA, ">>> stuff"); # append to a file called "stuff"

Handles can even be attched to pipelines to read/writeto Unix commands:

open(DATE, "/bin/date |"); # read from output of dateopen(FEED, "| more"); # send output through the more

Opening a handle may fail:

open(DATA, "<< data") or die "Can’t open data file";

Handles are closed using close(HandleName) (or

automatically closed on exit).

perl-29

Page 31: Perl = Practical Extraction and Report Language Developed by

Input/Output (cont.)

The special file handle <>>

• treats all command-line arguments as file names

• opens and reads from each of them in turn

If there are no command line arguments, then

<>> == <STDIN>>

Example:

perl -e ’print <<>>;’ a b c

Displays the contents of the files a, b, c on std-

out.

perl-30

Page 32: Perl = Practical Extraction and Report Language Developed by

Perl and External Commands

Perl is shell-like in the ease of invoking other

commands/programs.

Several ways of interacting with external com-

mands/programs:

‘cmd‘; capture entire output of cmd as single

system("cmd")execute cmd and capture its exit status

open(F,"cmd|")collect cmd output by reading from a

perl-31

Page 33: Perl = Practical Extraction and Report Language Developed by

Perl and External Commands (cont.)

External command Examples:

$files = ‘ls $d‘; # string output from command

$status = system("ls $d"); # numeric status of command

open(FILES, "ls $d |"); # display dir contents, onewhile (<~FILES~>) {

chomp;@fields = split; # split words in $_ to @_print "Next file is $fields[$#fields]\(\backslash\)n";

}

perl-32

Page 34: Perl = Practical Extraction and Report Language Developed by

File Test Operators

Perl provides an extensive set of operators to

query file information:

-r, -w, -x file is readable, writeable, executable

-e, -z, -s file exists, has zero size, has non-zero

-f, -d, -l file is a plain file, directory, sym link

Cf. the Unix test command.

Used in checking I/O operations, e.g.

-r "dataFile" && open DATA,"<dataFile";

perl-33

Page 35: Perl = Practical Extraction and Report Language Developed by

Special Variables

Perl defines numerous special variables to hold

information about its execution environment.

These variables typically have names consisting

of a single punctuation character e.g. $! $@

$# $$ $% ... (English names are also available)

The $ variable is particularly important:

• acts as the default location to assign result

values (e.g. <STDIN>>)

• acts as the default argument to many oper-

ations (e.g. print)

Careful use of $ can make programs concise,

uncluttered.

Careless use of $ can make programs cryptic.

perl-34

Page 36: Perl = Practical Extraction and Report Language Developed by

Special Variables (cont.)

$ default input and pattern match

@ARGV list (array) of command line arguments

$0name of file containing executing P

shell)$i matching string for ith regexp in pattern

$! last error from system call such as op

$. line number for input file stream

$/ line separator, none if undefined

$$ process number of executing Perl script

%ENV lookup table of environment variables

perl-35

Page 37: Perl = Practical Extraction and Report Language Developed by

Special Variables (cont.)

Example (echo in Perl):

for ($i = 0; $i < @ARGV; $i++) {print "$ARGV[$i] ";

}print "\(\backslash\)n";

or

foreach $arg (@ARGV) {print "$arg ";

}print "\(\backslash\)n";

or

print "@ARGV\(\backslash\)n";

perl-36

Page 38: Perl = Practical Extraction and Report Language Developed by

Perl Regular Expressions

Because Perl makes heavy use of strings, regular

expressions are a very important component of

the language.

They can be used:

• in conditional expressions to test whether a

string matches a pattern

e.g. checking the contents of a string

if ($name =~ /[0-9]/) { print "name contains digit\(\backslash\)n";

• in assignments to modify the value of a string

e.g. convert McDonald to MacDonald

$name =~ s/Mc/Mac/;

perl-37

Page 39: Perl = Practical Extraction and Report Language Developed by

Perl Regular Expressions (cont.)

Perl extends POSIX regular expressions with some

shorthand:

d matches any digit, i.e. [0-9]

D matches any non-digit, i.e. [^0-9]

w matches any ”word” char, i.e. [a-zA-Z

W matches any non ”word” char, i.e. [^a-zA-Z

s

matches any whitespace, i.e. [

t

n

r

f]

S

matches any non-whitespace, i.e. [^

t

n

r

f]

perl-38

Page 40: Perl = Practical Extraction and Report Language Developed by

Perl Regular Expressions (cont.)

Perl also adds some new anchors to regexps:

b matches at a word boundary

B matches except at a word boundary

And generalises the repetition operators:

patt* matches 0 or more occurences of patt

patt+ matches 1 or more occurences of patt

patt? matches 0 or 1 occurence of patt

patt{n,m} matches between n and m occurences

perl-39

Page 41: Perl = Practical Extraction and Report Language Developed by

Perl Regular Expressions (cont.)

The default semantics for pattern matching is

”first, then largest”.

E.g. /ab+/ matches abbbabbbb not abbbabbbb

or abbbabbbb

A pattern can also be qualified so that it looks

for the shortest match.

If the repetition operator is followed by ? the

”first, then shortest” string that matches the

pattern is chosen.

E.g. /ab+?/ would match abbbabbbb

perl-40

Page 42: Perl = Practical Extraction and Report Language Developed by

Perl Regular Expressions (cont.)

Regular expressions can be formed by interpolat-

ing strings in between /.../.

Example:

$pattern = "ab+";$replace = "Yod";$text = "abba";

$text =~ s/$pattern/$replace/;

# converts "abba" to "Yoda"

Note: Perl doesn’t confuse the use of $ in $var and abc$,

because the anchor occurs at the end.

perl-41

Page 43: Perl = Practical Extraction and Report Language Developed by

Pattern Matcher

A Perl script to accept a pattern and a stringand show the match (if any):

#!/usr/bin/perl

$pattern = $ARGV[0]; print "pattern=/$pattern/\(\backslash\)n";

$string = $ARGV[1]; print "string =\(\backslash\)"$string\(\backslash\)

$string =~ /$pattern/; print "match =\(\backslash\)"$&\(\backslash\)"\(\b

You might find this a useful tool to test out your under-

standing of regular expressions.

perl-42

Page 44: Perl = Practical Extraction and Report Language Developed by

Lists as Strings

Recall the marks example from earlier on; we

used "54,67,88" to effectively hold a list of marks.

Could we turn this into a real list if e.g. we

wanted to compute an average?

The split operation allows us to do this:

Syntax: split(/pattern/,string) returns a list

The join operation allows us to convert from list

to string:

Syntax: join(’char’,list) returns a string

(Don’t confuse this with the join filter in the shell. Perl’s

join acts more like paste.)

perl-43

Page 45: Perl = Practical Extraction and Report Language Developed by

Lists as Strings (cont.)

Examples:

$marks = "99,67,85,48,77,84";

@listOfMarks = split(/,/, $marks);# assigns (99,67,85,48,77,84) to @listOfMarks

$sum = 0;foreach $m (@listOfMarks) {

$sum += $m;}

$newMarks = join(’:’,@listOfMarks);# assigns "99:67:85:48:77:84" to $newMarks

Note: argument parentheses are optional: I use them for

more than one argument or where it looks ambiguous

(though not for printf).

perl-44

Page 46: Perl = Practical Extraction and Report Language Developed by

Lists as Strings (cont.)

Complex splits can be achieved by using a fullregular expression rather than a single delimitercharacter.If part of the regexp is parenthesised, the corre-sponding part of each delimiter is retained in theresulting list.

split(/[#@]+/, ’ab##@#c#d@@e’); # gives (ab, c, d, e)split(/([#@]+)/, ’ab##@#c#d@@e’); # gives (ab, ##@#, c,split(/([#@])+/, ’ab##@#c#d@@e’); # gives (ab, #, c, #,

And as a specially useful case, the empty reg-

exp is treated as if it matched between every

character, splitting the string into a list of single

characters:

split(//, ’hello’); # gives (h, e, l, l, o)

perl-45

Page 47: Perl = Practical Extraction and Report Language Developed by

Associative Arrays (Hashes)

As well as arrays indexed by numbers, Perl sup-

ports arrays indexed by strings: hashes.

Conceptually, as hash is a set (not list) of (key, value)

pairs.

We can deal with an entire hash at a time via%hashName, e.g.

# Key Value

%days = ( "Sun" => "Sunday","Mon" => "Monday","Tue" => "Tuesday","Wed" => "Wednesday","Thu" => "Thursday","Fri" => "Friday","Sat" => "Saturday" );

perl-46

Page 48: Perl = Practical Extraction and Report Language Developed by

Associative Arrays (Hashes) (cont.)

Individual components of a hash are accessed via

$hashName{keyString}

Examples:

$days{"Sun"} # returns "Sunday"$days{"Fri"} # returns "Friday"$days{"dog"} # is undefined (interpreted as "")$days{0} # is undefined (interpreted as "")

# inserts a new (key,value)$days{dog} = "Dog Day Afternoon"; # bareword OK as key

# replaces value for key "Sun"$days{"Sun"} = Soonday; # bareword OK as value

perl-47

Page 49: Perl = Practical Extraction and Report Language Developed by

Associative Arrays (Hashes) (cont.)

Consider the following two assignments:

@f = ("John", "blue", "Anne", "red", "Tim", "green");%g = ("John" => "blue", "Anne" => "red", "Tim" => "green");

The first produces an array of strings that can

be accessed via position, such as $f[0]

The second produces a lookup table of names

and colours, e.g. $g{"Tim"}.

(In fact the symbols => and comma have identical

meaning in a list, so either right-hand side could

have been used. However, always use the arrow

form exclusively for hashes.)

perl-48

Page 50: Perl = Practical Extraction and Report Language Developed by

Associative Arrays (Hashes) (cont.)

Consider iterating over each of these data struc-

tures:

foreach $x (@f) {

print "$x\(\backslash\)n";}

JohnblueAnneredTimgreen

foreach $x (keys %g) {print "$x => $g{$x}\(\backslash\)n";

}

Anne => redTim => greenJohn => blue

The data comes out of the hash in a fixed but

arbitrary order (due to the hash function).

perl-49

Page 51: Perl = Practical Extraction and Report Language Developed by

Associative Arrays (Hashes) (cont.)

There are several ways to examine the (key, value)pairs in a hash:

foreach $key (keys %myHash) {print "($key, $myHash{$key})\(\backslash\)n";

}

or, if you just want the values without the keys

foreach $val (values %myHash) {print "(?, $val)\(\backslash\)n";

}

or, if you want them both together

while (($key,$val) = each %myHash) {print "($key, $val)\(\backslash\)n";

}

Note that each method produces the keys/values

in the same order. It’s illegal to change the hash

within these loops.

perl-50

Page 52: Perl = Practical Extraction and Report Language Developed by

Associative Arrays (Hashes) (cont.)

Example (collecting marks for each student):

• a data file of (name,mark) pairs, space-separated,

one per line

• out should be (name,marksList), with comma-

separated marks

while (<>) {chomp; # remove newline($name, $mark) = split; # separate data fields$marks{$name} .= ",$mark"; # accumulate marks

}foreach $name (keys %marks) {

$marks{$name} =~ s/,//; # remove comma prefixprint "$name $marks{$name}\(\backslash\)n";

}

perl-51

Page 53: Perl = Practical Extraction and Report Language Developed by

Associative Arrays (Hashes) (cont.)

The delete function removes an entry (or en-

tries) from an associative array.

To remove a single pair:

delete $days{"Mon"}; # "I don’t like Mondays"

To remove multiple pairs:

delete @days{ ("Sat","Sun") }; # Omigosh! No weekend!

To clean out the entire hash:

foreach $d (keys %days) { delete $days{$d}; }# or, more simply%days = ();

perl-52

Page 54: Perl = Practical Extraction and Report Language Developed by

Perl Library Functions

Perl has literally hundreds of functions for all

kinds of purposes:

• file manipulation, database access, network

programming, etc. etc.

It has an especially rich collection of functions

for strings.

E.g. tr/abc/ABC/ converts a to A, b to B, c to C

throughout a string

Consult on-line Perl manuals, reference books,

example programs for further information.

perl-53

Page 55: Perl = Practical Extraction and Report Language Developed by

Defining Functions

Perl functions (or subroutines) are defined viasub, e.g.

sub sayHello {print "Hello!\(\backslash\)n";

}

And used by calling, with or without &, e.g.

&sayHello; # arg list optional with &sayHello(); # more common, show empty arg list explicitly

perl-54

Page 56: Perl = Practical Extraction and Report Language Developed by

Defining Functions (cont.)

Function arguments are passed via a list variable@ , e.g.

sub mySub {@args = @_;print "I got ",@#args+1," args\(\backslash\)n";print "They are (@args)\(\backslash\)n";

}

Note that @args is a global variable.

To make it local, precede by my, e.g.

my @args = @_;

perl-55

Page 57: Perl = Practical Extraction and Report Language Developed by

Defining Functions (cont.)

Can achieve similar effect to the C function

int f(int x, int y, int z) {int res;...return res;

}

by using array assignment in Perl

sub f {my ($x, $y, $z) = @_;my $res;...return $res;

}

perl-56

Page 58: Perl = Practical Extraction and Report Language Developed by

Defining Functions (cont.)

Lists (arrays and hashes) with any scalar argu-

ments to produce a single argument list.

This in effect means you can only pass a single

array or hash to a Perl function and it must be

the last argument.

sub good {my ($x, $y, @list) = @_;

This will not work (x and y will be undefined):

sub bad {my (@list, $x, $y) = @_;

And this will not work (list2 will be undefined):

sub bad {my (@list1, @list2) = @_;

perl-57

Page 59: Perl = Practical Extraction and Report Language Developed by

References

References

• are like C pointers (refer to some other objects)

• can be assigned to scalar variables

• are dereferenced by evaluating the variable

Example:

$aref = [1,2,3,4];print @$aref; # displays whole array... $$aref[0]; # access the first element... ${$aref}[1]; # access the second element... $aref->[2]; # access the third element

perl-58

Page 60: Perl = Practical Extraction and Report Language Developed by

Parameter Passing

Scalar variable are aliased to the correspond-ing element of @ . This means a function canchanged them. E.g. this code sets x to 42.

sub assign {$_[0] = $_[1];

}assign($x, 42);

Arrays & hashes are passed by value.

If a function needs to change an array/hash pass

a reference.

Also use references if you need to pass multiple

hashes or arrays.

%h = (jas=>100,arun=>95,eric=>50);@x = (1..10)

mySub(3, \(\backslash\)%h, \(\backslash\)@x);mysub(2, \(\backslash\)%h, [1,2,3,4,5]);mysub(5, {a=>1,b=>2}, [1,2,3]);

Notation:

• [1,2,3] gives a reference to (1,2,3)

• {a=>1,b=>2} gives a reference to (a=>1,b=>2)perl-59

Page 61: Perl = Practical Extraction and Report Language Developed by

Perl Prototypes

Perl prototypes declare the expected parameter

structure for a function.

Unlike other language (e.g. C) Perl prototype’s

main purpose is not type checking.

Perl prototypes’ main purpose is to allow more

convenient calling of functions.

You also get some error checking - sometimes

useful, sometimes less so.

Some programmers recommend against using pro-

totypes.

Use in COMP2041/9041 optional.

You can use prototypes to define functions that

can be called like builtins.

perl-60

Page 62: Perl = Practical Extraction and Report Language Developed by

If Perl has seen a prototype, you can omit paren-

thesises from around the parameters. For exam-

ple:

sub my_function($$@);#...my_function $a,2,@list;

Prototypes can also cause implicitly a reference

to be passed when an array is specified as a pa-

rameter.

If we defined our implementraion of push like

this:

sub mypush {my ($array_ref,@elements) = @_;if (@elements) {

@$array_ref = (@$array_ref, @elements);} else {

@$array_ref = (@$array_ref, $_);}

It has to be called like this:

mypush(\@array, $x);

But if we add this prototype:

Page 63: Perl = Practical Extraction and Report Language Developed by

sub mypush2(\(\backslash\)@@)

It can be called just like the builtin push:

mypush @array, $x;

Google perlsub more more info.

Page 64: Perl = Practical Extraction and Report Language Developed by

Recursive example

sub fac {my ($n) = @_;

return 1 if $n < 1;

return $n * fac($n-1);}

which behaves as

print fac(3); # displays 6print fac(4); # displays 24print fac(10); # displays 3628800print fac(20); # displays 2.43290200817664e+18

perl-61

Page 65: Perl = Practical Extraction and Report Language Developed by

Eval

The Perl builtin function eval evaluates (exe-

cutes) a supplied string as Perl.

For example, this Perl will print 43:

$perl = ’$answer = 6 * 7;’;eval $perl;print "$answer\(\backslash\)n";

and this Perl will print 55:

@numbers = 1..10;$perl = join("+", @numbers);print eval $perl, "\(\backslash\)n";

and this Perl also prints 55:

$perl = ’$sum=0; $i=1; while ($i <= 10) {$sum+=$i++}’;eval $perl;print "$sum\(\backslash\)n";

and this Perl could do anything:

$input_line = <STDIN>;eval $input_line;

perl-62

Page 66: Perl = Practical Extraction and Report Language Developed by

Pragmas and Modules

Perl provides a way of controlling some aspects

of the interpreter’s behaviour (through pragmas)

and is an extensible language through the use of

compiled modules, of which there is a large num-

ber. Both are introduced by the use keyword.

Pragmas

• use English; - allow names for built-in vars,

e.g., $NF = $. and $ARG = $ .

• use integer; - truncate all arithmetic opera-

tions to integer, effective to the end of the

enclosing block.

• use strict ’vars’; - insist on all variables

declared using my.

perl-63

Page 67: Perl = Practical Extraction and Report Language Developed by

Pragmas (cont.)

Standard modules These are heaps of these,

some grouped into packages. The package name

is prefixed with ::

Examples:

• use DB File; - functions for maintaining an

external hash

• use Getopt::Std; - functions for processing

command-line flags

• use File::Find; - find function like the shell’s

find

• use Math::BigInt; - unlimited-precision arith-

metic

• use CGI; - see next week’s lecture.

perl-64

Page 68: Perl = Practical Extraction and Report Language Developed by

Summary

• Structure your scripts so they have documen-

tation, constants, function prototypes, usage

handling, main logic and function definitions,

in that order.

• Make sure you understand all the examples.

• Consider that the built-in functions in the

examples are the minimum that you need to

know, and consult documentation for other

string and scalar functions.

• Perl is powered by regular expressions. Get

used to using regexps and their operators.

Quickly.

• Initially you may want to stick to predictable

ways of processing data, usually by splitting

into field arrays and processing each element

in turn.

• With experience, try to exploit some of the

more powerful attributes of the language,

such as list assignment, slices, map, grep, etc.

perl-65