Programming for Linguists

An Introduction to Python

24/11/2011

From Last Week

Ex 1)

def name():
    name = raw_input("What is your first name? ")
    length = len(name)
    last_letter = name[-1]
    print name, "contains", length, "letter(s) and ends with a(n)", last_letter

name()

Ex 2)

def play():
    sentence = raw_input("Sentence? ")
    print sentence.upper()
    print sentence.lower()
    print sentence.title()
    print "The lowest index of the letter 'a' is", sentence.index("a")
    if sentence.index("a") > 3:
        print sentence.replace("a", "e")

play()

Ex 3)

def tkofschip():
    verb = raw_input("Please enter the root of a Dutch verb\n")
    if verb.endswith("t") or verb.endswith("k") or verb.endswith("f") or verb.endswith("s") or verb.endswith("c") or verb.endswith("h") or verb.endswith("p"):
        print verb + "te"
    else:
        print verb + "de"

tkofschip()

Fruitful Functions

Functions which produce a result: calling the function generates a value

>>> len("python")
6

They often contain a return statement: return immediately from the function and use the following expression as the return value

Try:

def word_length1(word):
    return len(word)

def word_length2(word):
    print len(word)

a = word_length1("hello")
b = word_length2("hello")
type(a)
type(b)

The return statement gives you a value which you can use in the rest of your script

The print statement does not give you a value

You can use multiple return statements, e.g.:

def absolute_value(x):
    if x >= 0:
        return x
    else:
        return -x
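A minimal sketch of the difference between return and print, reusing the two word_length functions from the "Try" slide. It is written with print(...) in parentheses so that it also runs under Python 3; the variable doubled is only there for illustration.

```python
def word_length1(word):
    # return hands the value back to the caller
    return len(word)

def word_length2(word):
    # print only shows the value on screen; nothing is returned
    print(len(word))

a = word_length1("hello")   # a holds the integer 5
b = word_length2("hello")   # 5 is shown on screen, but b holds None

# a can be used in further calculations, b cannot
doubled = a * 2
```

This is why type(a) reports an integer while type(b) reports NoneType.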

You can return a value, a variable, a function, a boolean expression

As soon as a return statement executes, the function terminates

Code that appears after a return statement = dead code
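As a small illustration, assuming a hypothetical helper called is_long (not part of the course material): the function returns a boolean expression, and the line after the return statement is dead code that never runs.

```python
def is_long(word):
    # the boolean expression is evaluated and its value (True/False) returned
    return len(word) > 5
    # dead code: the function has already terminated on the line above
    print("this line is never executed")

long_word = is_long("linguistics")
short_word = is_long("verb")
```

Here long_word is True and short_word is False, and nothing is printed.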

Write a compare function that returns 1 if x > y, 0 if x == y, and -1 if x < y

def compare(x, y):
    if x == y:
        return 0
    elif x > y:
        return 1
    else:
        return -1

Recursion

As we saw: one function can call another

A function can also call itself

A function that calls itself = recursive

The process = recursion

Try this:

def countdown(n):
    if n <= 0:
        print 'Happy New Year!'
    else:
        print n
        n = n - 1
        countdown(n)

countdown(10)
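Recursion can also be combined with a return statement. The sketch below uses a hypothetical function reverse (an illustration, not one of the course exercises): the base case stops the recursion, and every other call works on a shorter string.

```python
def reverse(word):
    # base case: an empty string is its own reverse
    if word == "":
        return ""
    # recursive case: reverse everything after the first letter,
    # then put the first letter at the end
    return reverse(word[1:]) + word[0]

backward = reverse("python")
```

Each call removes one letter, so the base case is always reached and backward ends up as "nohtyp".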

Infinite Recursion

If a recursion never reaches a base case, it goes on making recursive calls forever, and in principle the program never terminates

Generally not a good idea

In practice, Python stops the program with an error message when the maximum recursion depth is reached

e.g.

def recurse():
    recurse()

recurse()
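A sketch of what happens when this runs, assuming you want to catch the error rather than let the script crash. try/except has not been covered in the course yet; it is used here only to show the error. The names limit and stopped are illustrative.

```python
import sys

def recurse():
    recurse()

# the maximum depth Python allows (typically 1000)
limit = sys.getrecursionlimit()

try:
    recurse()
    stopped = False
except RuntimeError:
    # Python 2 raises RuntimeError; Python 3 raises RecursionError,
    # which is a kind of RuntimeError, so this catches both
    stopped = True
```

After the try block, stopped is True: Python gave up once the limit was reached.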

The while statement: used to perform identical or similar tasks repeatedly

def countdown(n):
    while n > 0:
        print n
        n = n - 1
    print "Happy New Year!"

3 steps:

1. evaluate the condition, yielding True or False

2. if the condition is True, execute the statements inside the body and return to step 1 ( = loop)

3. if the condition is False, exit the while statement and continue with the execution of the next statement

Mind the difference in indentation between the statements inside and outside the while statement!
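The three steps can be traced in a small sketch. The counter passes is a hypothetical variable added only to record how often the body runs.

```python
n = 3
passes = 0

while n > 0:               # step 1: evaluate the condition
    passes = passes + 1    # step 2: the condition was True, run the body ...
    n = n - 1              # ... and change n so the condition can become False

# step 3: the condition was False, execution continues here
finished = (n == 0)
```

The body runs three times (for n = 3, 2 and 1); the fourth check of the condition fails and the loop ends with n equal to 0.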

The statements inside the body should change the value of one or more variables so that the condition becomes False at a certain point

A loop that goes on forever = infinite loop

You can use the break statement to jump out of the loop

This program will echo the keyboard input until the user types "done"

while True:
    line = raw_input("> ")
    if line == "done":
        break
    print line

Write a function that takes a string as an argument and prints the letters one by one using a while statement

index = 0
while index < len(fruit):
    letter = fruit[index]
    print letter
    index = index + 1

def print_letters(word):
    index = 0
    while index < len(word):
        letter = word[index]
        print letter
        index = index + 1

Write a similar function that takes a string as an argument and prints the letters backward using a while statement

def print_backward(word):
    index = len(word) - 1
    while index >= 0:
        letter = word[index]
        print letter
        index = index - 1

Lists

A list is a sequence of values

The values can be of any type

The values = elements/items

A list is written between square brackets [ ]

To create a new list:

my_list = [10, "cheese", 5.6, "this is a sentence"]

(Avoid calling the variable list: that would hide Python's built-in list( ) function)

A list can contain another list (nested list):

['hello', 15, ['my name is', 'Franky']]

To access the elements: use an index or a slice, e.g. my_list[0] or my_list[:20]

You can change existing lists:

numbers = [17, 21]
numbers[1] = 10
print numbers
[17, 10]
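A short sketch of indexing into the nested list from this slide. The variable names (nested, inner, name, first_two, length) are chosen for illustration.

```python
nested = ['hello', 15, ['my name is', 'Franky']]

inner = nested[2]         # the inner list: ['my name is', 'Franky']
name = nested[2][1]       # index twice to reach into the inner list: 'Franky'
first_two = nested[:2]    # a slice gives a new list: ['hello', 15]
length = len(nested)      # 3, because the inner list counts as one element

nested[1] = 16            # lists are mutable: the second element is replaced
```

Note that first_two was made before the assignment, so it still contains 15.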

A list index works the same way as a string index:

any integer expression can be used as an index

if you try to read or write an element that does not exist, you get an IndexError

if an index has a negative value, it counts backward from the end of the list

The in operator also works on lists
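These rules can be checked in a small sketch. The word list is borrowed from the for-loop slide; try/except is used only to show the IndexError without stopping the script.

```python
words = ['work', 'run', 'play', 'jump']

last = words[-1]             # negative index counts from the end: 'jump'
second = words[2 - 1]        # any integer expression works: words[1] is 'run'
has_run = 'run' in words     # True
has_walk = 'walk' in words   # False

try:
    words[10]                # there is no element with index 10
    out_of_range = False
except IndexError:
    out_of_range = True
```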

Traversing a List

For loop

words = ['work', 'run', 'play', 'jump']
for word in words:
    print word

If you need to update all elements: range function

numbers = [1, 3, 5, 10]
for elem in range(len(numbers)):
    numbers[elem] = numbers[elem] * 2
print numbers

This loop traverses the list and updates each element

List Operations

The + operator concatenates lists

The * operator repeats a list a given number of times

The slice operator [n:m] gives you a slice of the list

Try this:

a = [1, 2, 3]

b = [4, 5]

print a + b

print a*2

print a[1:2]

List Methods

Python provides methods that operate on lists

append method:

a = ['a', 'b', 'c']
a.append('d')
print a
['a', 'b', 'c', 'd']

Deleting elements:

Using the index in the list:

the pop method modifies the list and returns the element that was removed

the del statement modifies the list without returning the removed element

If you do not know the index of the element: the remove method deletes the first occurrence of a given value

t = ['a', 'b', 'c', 'c', 'd']
x = t.pop(1)
print t
print x
del t[0]
print t
t.remove('c')
print t
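The same steps again as a sketch, with the state of the list noted after each one. The final pop( ) without an index is an addition for illustration: it removes and returns the last element.

```python
t = ['a', 'b', 'c', 'c', 'd']

x = t.pop(1)      # x is 'b'; t is now ['a', 'c', 'c', 'd']
del t[0]          # nothing is returned; t is now ['c', 'c', 'd']
t.remove('c')     # only the first 'c' goes; t is now ['c', 'd']

last = t.pop()    # no index given: removes the last element, 'd'
```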

s = [2, 1, 4, 3]

s.count(4)             1   (count needs an argument: how often 4 occurs in s)
s.sort()               [1, 2, 3, 4]
s.extend([5, 6, 7])    [1, 2, 3, 4, 5, 6, 7]
s.insert(0, 8)         [8, 1, 2, 3, 4, 5, 6, 7]
s.reverse()            [7, 6, 5, 4, 3, 2, 1, 8]
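A sketch of these methods run in sequence. One detail worth knowing: sort, extend, insert and reverse change the list in place and return None, so their result should not be assigned back to s. The names occurrences and result are illustrative.

```python
s = [2, 1, 4, 3]

occurrences = s.count(4)   # 1: the value 4 occurs once
result = s.sort()          # sorts s in place; result is None
s.extend([5, 6, 7])        # s is now [1, 2, 3, 4, 5, 6, 7]
s.insert(0, 8)             # put 8 at index 0: [8, 1, 2, 3, 4, 5, 6, 7]
s.reverse()                # s is now [7, 6, 5, 4, 3, 2, 1, 8]
```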

From String to List

From a word to a list of letters: list( )

s = "spam"
print list(s)
['s', 'p', 'a', 'm']

From a sentence to a list of words: .split( )

s = "This is a sentence"
print s.split()
['This', 'is', 'a', 'sentence']

The split( ) method can also be used to split a string on characters other than spaces

s = "spam-spam-spam"
print s.split("-")
['spam', 'spam', 'spam']

"-" is called a delimiter in this case

From List to String

join( ) is the inverse of split( )

words = ['this', 'is', 'a', 'sentence']
delimiter = " "
delimiter.join(words)
"this is a sentence"
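A sketch of a round trip between strings and lists, assuming only the built-in split( ) and join( ) string methods. Variable names are illustrative.

```python
sentence = "this is a sentence"

tokens = sentence.split()           # ['this', 'is', 'a', 'sentence']
rebuilt = " ".join(tokens)          # back to 'this is a sentence'
hyphenated = "-".join(tokens)       # any string can act as delimiter

letters = list("spam")              # ['s', 'p', 'a', 'm']
word = "".join(letters)             # an empty delimiter glues the letters together

# split() without an argument discards extra whitespace,
# so the round trip is not always exact
untidy = "  too   many  spaces "
tidy = " ".join(untidy.split())     # 'too many spaces'
```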

List Arguments

You can also pass a list into a function as argument

def del_first(list1):
    del list1[0]
    return list1

del_first([1, 2, 3])
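One consequence worth seeing in a sketch: the function receives the list itself, not a copy, so del_first also changes the caller's list. The function without_first is a hypothetical alternative that leaves the original untouched by returning a slice.

```python
def del_first(list1):
    del list1[0]
    return list1

numbers = [1, 2, 3]
result = del_first(numbers)
# numbers itself has changed to [2, 3]; result is the very same list

def without_first(list1):
    # hypothetical alternative: a slice builds a new list
    return list1[1:]

original = [1, 2, 3]
shorter = without_first(original)
# original is still [1, 2, 3]; shorter is [2, 3]
```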

Regular Expressions for Detecting Word Patterns

Many linguistic processing tasks involve pattern matching, e.g. .startswith( ) and .endswith( )

To use regular expressions in Python we need to import the re library:

import re

Some Basic Regular Expression Meta-characters

.          Wildcard, matches any character
^abc       Matches some pattern abc at the start of a string
abc$       Matches some pattern abc at the end of a string
[abc]      Matches one of a set of characters
[A-Z0-9]   Matches one of a range of characters

a|b|c      Matches one of the specified strings (disjunction)
*          Zero or more of the previous item(s)
+          One or more of the previous item(s)
?          Zero or one of the previous item(s) (i.e. optional)
{n}        Exactly n repeats where n is a non-negative integer
{n,}       At least n repeats

{,n}       No more than n repeats
{m,n}      At least m and no more than n repeats
a(b|c)+    Parentheses that indicate the scope of the operators,
           e.g. w(i|e|ai|oo)t matches wit, wet, wait and woot
<.*>       Matches any token (in NLTK's token-search notation; in plain re it would match text between angle brackets)

In general, when using regular expressions it is best to write the pattern as a raw string: put r before the quotes, r'...', so that backslashes reach the re library unchanged
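A sketch that tries several of these meta-characters on a small word list. The helper matching and the word list are made up for illustration; only re.search from the standard re library is assumed.

```python
import re

def matching(pattern, words):
    # collect every word in which the pattern is found
    result = []
    for w in words:
        if re.search(pattern, w):
            result.append(w)
    return result

words = ['wit', 'wet', 'wait', 'woot', 'want', 'walked', 'talked', 'Talk2']

w_words = matching(r'^w(i|e|ai|oo)t$', words)   # disjunction inside parentheses
ed_words = matching(r'ed$', words)              # words ending in -ed
special = matching(r'[A-Z0-9]', words)          # contains a capital or a digit
second_a = matching(r'^.a', words)              # any first character, then a
double_o = matching(r'o{2}', words)             # exactly two o's in a row
w_to_d = matching(r'^w.*d$', words)             # starts with w, ends with d
```

With this word list, w_words contains wit, wet, wait and woot; 'want' is excluded because 'an' is not one of the listed alternatives.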

Counting all vowels in a given word:

word = 'supercalifragilisticexpialidocious'
vowels = re.findall(r'[aeiou]', word)
nr_vowels = len(vowels)

The re.findall( ) function finds all (non-overlapping) matches of the given regular expression
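A sketch of what "non-overlapping" means in practice, building on the vowel example. The variables clusters and pairs are added for illustration.

```python
import re

word = 'supercalifragilisticexpialidocious'

vowels = re.findall(r'[aeiou]', word)
nr_vowels = len(vowels)                        # 16 vowels

# sequences of two or more vowels in a row
clusters = re.findall(r'[aeiou]{2,}', word)    # ['ia', 'iou']

# matches never share characters: 'aaaa' holds two matches of 'aa', not three
pairs = re.findall(r'aa', 'aaaa')
```

After a match is found, the search continues from the end of that match, which is why the middle 'aa' in 'aaaa' is not reported.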

You can find a list of all regular expression operations in Python at:

http://docs.python.org/library/re.html

For Next Week

Ex 1) Write a script that reads 5 words that are typed in by a user and tells the user which word is shortest and which is longest

Ex 2) Write a function that takes a sentence as an argument and calculates the average word length of the words in that sentence

Ex 3) Take a short text of about 5 sentences. Write a script that will split up the text into sentences (tip: use the punctuation as boundaries) and calculate the average sentence length, the average word length and the standard deviation for both values

How to calculate the standard deviation: http://en.wikipedia.org/wiki/Standard_deviation

No lecture next week

Some extra exercises will be posted on Blackboard instead.

Thank you!