Programming for Linguists An Introduction to Python

Upload: sabina-bates

Post on 25-Dec-2015


Programming for Linguists

An Introduction to Python

Contact

Claudia Peersman

[email protected]

Lange Winkelstraat 40, room L202 (2nd floor)

Literature

- “Think Python: How to Think Like a Computer Scientist” by Allen B. Downey, freely available at: http://greenteapress.com/thinkpython/thinkpython.html
- “Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit” by Steven Bird, Ewan Klein, and Edward Loper, freely available at: http://www.nltk.org/book


The Python programming language

Part 1

Formal vs. natural languages

The way of the program

Programming for linguists?

What is a program?

Debugging

Your first program

Formal vs. natural languages

Natural languages: spoken languages, e.g. English, Dutch, French…
- not designed by people
- evolved naturally

Formal languages: designed by people for specific applications, e.g.:
- in mathematics: notation which denotes relationships among numbers and symbols
- in chemistry: notation which represents the chemical structure of molecules

Many features in common: tokens, structure, syntax and semantics

A lot of differences:

    Natural languages       Formal languages
    Ambiguity               Nearly unambiguous
    Redundancy              Compact
    Idioms and metaphors    Literal: they mean exactly what they say

Some examples

    5 + 5 = 10
    H2O

    5 + 5 = 1$0 ???
    Zz ???

Illegal tokens: $ and Zz

    5 +: 5 = 10 ???

Legal tokens, but illegal structure: +:

The way of the program

Programming = the art of problem solving:
- formulate problems
- think creatively about possible solutions
- express a solution clearly and accurately
- trial and error

Low-level vs. high-level languages

Low-level languages = “machine languages”: the only languages a computer can execute directly

High-level languages like Python, Perl, Java and C++ have to be translated into a low-level language before they can be executed. This is done by:
- compilers
- interpreters

An interpreter:
- processes the program a little at a time
- alternates between reading lines and performing computations

A compiler:
- translates the high-level program completely first
- once a program is compiled, it can be executed repeatedly without further translation

Programming for linguists?

Aim: handle large linguistic corpora
- automatic frequency counts
- distribution of linguistic features across different categories, corpora
- look up context

Existing tools are limited and cost money

About Python…

- high-level language
- open source
- executed by an interpreter in two ways:
  - interactive mode
  - script mode

Interactive mode:
- open the interpreter
- the >>> prompt = ready to begin
- type a command
- the interpreter prints the result

    >>> 1 + 1
    2

Script mode:
- open a new window in the interpreter
- type a number of commands
- save the program as a Python script, e.g. test.py
- the program is executed whenever you tell the interpreter to run it
- the results are printed when the script is run

Which mode to use?

Interactive mode:
- good for testing small parts of the program before you go on
- does not save the program!

Script mode:
- put together all small parts of code in a sequence of instructions for the computer to execute
- save your program
- use it again in the future

What is a program?

A sequence of instructions that specifies how to perform a computation

For linguists: the computation can also be, e.g., looking up the context of words in a text, or calculating average word lengths, sentence lengths, …
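To make "calculating average word lengths" concrete, here is a minimal sketch. The sample sentence and all variable names are invented for illustration, and the code uses print(...) with a single argument so that it should run under both the Python 2 of these slides and Python 3.

```python
# Hypothetical sample text; any short string will do.
text = "the cat sat on the mat"

# Split the text into words on whitespace.
words = text.split()

# Add up the lengths of all words and divide by the number of words.
# float() keeps the division from being rounded down to a whole number.
total_length = sum(len(word) for word in words)
average_length = total_length / float(len(words))

print(average_length)
```

Trying it on a text this short means the result can be checked by hand: six words with 17 letters in total.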

Some basic instructions

- input: data you type, a text you load
- output: display data on the screen, send data to a file
- math: perform basic mathematical operations like addition, subtraction, multiplication and division
- conditional execution: check for certain conditions and execute the appropriate instructions
- repetition: perform some action repeatedly, usually with some variation

Programming = breaking a large, complex task into smaller and smaller subtasks until the subtasks are simple enough to be performed with one of these basic instructions


Debugging

Bugs = programming errors

Debugging = the process of tracking down programming errors

Three kinds of bugs:
- syntax errors
- runtime errors
- semantic errors

Syntax errors

- refer to the structure of the program and the rules about that structure
- if there is even a single syntax error in your code:
  - Python will display an error message
  - the execution of your program will quit immediately

An example

Parentheses:

    (1 + 2)    correct syntax
    2)         syntax error

Syntax errors are very common in the beginning. The more you practice and gain experience, the fewer mistakes you will make and the faster you will find them.

Runtime errors

- also called exceptions
- do not appear until after the program has started to run
- Python will display an error message

For example: you give the instruction to open a file, but you have typed in the wrong file name or the wrong directory
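The file example can be sketched as follows. The file name is hypothetical and assumed not to exist, and try/except is a construct these slides do not cover; it is used here only so that the error message can be shown without stopping the script.

```python
# Hypothetical file name that is assumed not to exist on disk.
filename = "no_such_corpus_file.txt"

try:
    corpus = open(filename)
    opened = True
    corpus.close()
except IOError as error:
    # The error only shows up while the program is running, at the open() call.
    opened = False
    print("Could not open the file: " + str(error))
```

Without the try/except wrapper, Python would display the same error message and stop the program at the open() line.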

Semantic errors

The program will run without any error message, but it will not produce the results you wanted: the meaning of the program (semantics) is wrong

Tricky errors, because:
- Python will not display an error message!
- you need to work backward, looking at the output of the program and trying to figure out what it is doing exactly

An example

The Python file methods read() vs. readline() vs. readlines()
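Why these three are a typical source of semantic errors can be sketched as follows: all of them run without an error message, but each returns something different. The sketch assumes an in-memory stand-in for a file (io.StringIO) with made-up content, so that no real file is needed.

```python
import io

# An in-memory stand-in for a two-line text file (the content is made up).
sample = u"first line\nsecond line\n"

# read() returns the whole text as one string.
whole_text = io.StringIO(sample).read()

# readline() returns only the next line, including its newline character.
one_line = io.StringIO(sample).readline()

# readlines() returns a list with one string per line.
all_lines = io.StringIO(sample).readlines()

print(repr(whole_text))
print(repr(one_line))
print(repr(all_lines))
```

A program that counts "lines" by taking the length of the result would get the number of characters with read() and readline(), and the number of lines only with readlines().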

Debugging is just as important as programming itself:
- do not only learn how to write a program
- learn to write a program that works
- learn to write a program that does what you want it to do

Always try out small pieces of code before you go on with writing your program

Try out your code on short pieces of text, so that you can verify your results manually

Your first program

Open IDLE (desktop)

The first program is usually called “Hello, world!”

In Python:

    >>> print "Hello, world!"

or

    >>> print 'Hello, world!'

Mind the quotation marks!


This is the print statement

The quotation marks mark the beginning and the end of the text to be displayed

The quotation marks do not appear in the result

Why we teach Python:

e.g. in Java:

    public class Hello
    {
        public static void main( String[] args )
        {
            System.out.println( "Hello, World!" );
        }
    }

Make some mistakes

What happens if you:
- leave out one of the quotation marks
- replace " by ' or vice versa in one case
- spell “print” wrong
- double the quotation marks
- double the quotation marks, but change the order

By making mistakes on purpose you will:
- learn which details are important in writing program code
- learn to debug more efficiently, because you get to know what the error messages mean

Try it yourself

We will make time to try out new things as we proceed

Programming is a new way of thinking for linguists

If there is a problem or you have a question, do not hesitate to mention it immediately

Values and types

Values = basic elements of a program, e.g. print "Hello, world!"

Each value has a type:
- integer
- string
- float

Integer: whole numbers, e.g. 105

String: a sequence of characters, e.g. "Hello, World!"

Float: numbers with a decimal point, e.g. 10.5

The interpreter can tell you the type of a value:

    >>> type(105)
    <type 'int'>
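The same check can be run in a script. This is a minimal sketch with invented variable names; it reads the bare type name through `__name__`, because the full display differs between versions (Python 2 shows `<type 'int'>`, Python 3 shows `<class 'int'>`).

```python
# type(value).__name__ gives the bare type name as a string.
integer_type = type(105).__name__
string_type = type("Hello, World!").__name__
float_type = type(10.5).__name__

print(integer_type)
print(string_type)
print(float_type)
```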

Try to find out what the type is of the following values:

    "Hello!"
    3.1415
    Dag Jan
    "123"
    "123.456"

Try this:

    >>> print 123,456

Float types always have a dot, never a comma

To which kind of error could this lead?
- runtime error
- syntax error
- semantic error
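What the comma does can be sketched in an assignment. The variable names are invented, and the tuple type that appears here is not covered on these slides; it is simply how Python stores the two separate values.

```python
# With a comma, Python reads two separate integers (collected here in a tuple).
with_comma = 123,456

# With a dot, Python reads a single floating-point number.
with_dot = 123.456

print(type(with_comma).__name__)
print(type(with_dot).__name__)
```

Neither line produces an error message, which is why a decimal comma is easy to overlook.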

Variables

A variable is a name that refers to a value

An assignment statement creates new variables and assigns values to them

You can choose the name yourself, e.g.

    >>> text = "Everything except 'Hello, world!'"
    >>> age = 26
    >>> pi = 3.1415

The variables now carry the values we assigned to them:

    >>> print text
    >>> print age
    >>> print pi

The interpreter can again tell you the type:

    >>> type(text)

Variable names:
- can be arbitrarily long
- can contain both letters and numbers
- have to begin with a letter (or an underscore)
- can contain uppercase letters
- are case sensitive!

If you use an illegal character in your name, you will get a syntax error message, e.g. my name, live@

You cannot choose a name that is a keyword in Python:

    and       del       from      not       while
    as        elif      global    or        with
    assert    else      if        pass      yield
    break     except    import    print
    class     exec      in        raise
    continue  finally   is        return
    def       for       lambda    try

Tip: try to choose names which describe what the variable is used for
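The keyword list does not have to be memorised: the standard library module `keyword` can report it. A minimal sketch, with invented variable names; the printed list belongs to whichever interpreter runs the code, so it differs slightly between Python 2 and Python 3.

```python
import keyword

# keyword.kwlist holds the keywords of the running interpreter.
print(keyword.kwlist)

# keyword.iskeyword() checks a single candidate name.
for_is_keyword = keyword.iskeyword("for")
age_is_keyword = keyword.iskeyword("age")

print(for_is_keyword)
print(age_is_keyword)
```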

Statements

Units of code that the Python interpreter can execute

So far we have seen the print statement and the assignment statement

A program usually contains a series of statements that are executed in an order predetermined by the programmer

e.g.

    >>> age1 = 20
    >>> age2 = 40
    >>> print age2
    40
    >>> average_age = (age1 + age2)/2
    >>> print average_age
    30


You always have to assign a value to a variable before you can work with it

Variables have to be spelled in the same way throughout the program

If you assign a new value to an existing variable, the old value is deleted

e.g.

    >>> age = 20
    >>> age = age + 20
    >>> print age

Operators and operands

Operators = special symbols that represent computations, e.g. +, -, *, /, **

Operands = the values the operator is applied to, e.g. 2 + 2

Try 2/3

When both operands are integers, the result is again an integer

If you want a floating-point result, you have to make one of the operands a floating-point number:

    >>> 2/3.0
    0.66666666666666663

You can also give a command at the beginning of your script:

    from __future__ import division
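A minimal sketch of the first option, with invented variable names: converting one operand with float() gives a floating-point result. The `//` operator is not mentioned on these slides; it is included as an assumption-free contrast, since it always rounds down to a whole number.

```python
numerator = 2
denominator = 3

# Converting one operand to a float forces floating-point division.
float_result = numerator / float(denominator)

# The // operator always rounds down to a whole number (floor division).
floor_result = numerator // denominator

print(float_result)
print(floor_result)
```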

Expressions

A combination of values, variables, and operators

Try:

    >>> x = 5
    >>> x + 1

Now make a script of it (File > New window) and run it (Run > Run module)

In a script, an expression all by itself does not print a result!

How can you modify the script so that it does produce a result?

Order of operations

The order of evaluation depends on the rules of precedence

For mathematical operators, Python follows mathematical conventions:
1. Parentheses
2. Exponentiation
3. Multiplication and division
4. Addition and subtraction
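The precedence rules above can be checked with a few small expressions. The numbers and variable names in this sketch are chosen only for illustration.

```python
# Multiplication binds more tightly than addition: 3 * 4 first, then + 2.
without_parentheses = 2 + 3 * 4

# Parentheses override the default order: 2 + 3 first, then * 4.
with_parentheses = (2 + 3) * 4

# Exponentiation binds more tightly than multiplication: 3 ** 2 first, then * 2.
power_first = 2 * 3 ** 2

print(without_parentheses)
print(with_parentheses)
print(power_first)
```

When in doubt, adding parentheses makes the intended order explicit and costs nothing.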

String operations

In general: no mathematical operations on strings, e.g.

    >>> "hello" / "hi"
    TypeError: unsupported operand type(s) for /: 'str' and 'str'

Except: the + and * operators

Try:

    "hello" + "hi"
    "hello" * 2

string + string = concatenation

string * int = repetition
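The two string operators can be sketched in a short script. The variable names are invented for illustration.

```python
greeting = "hello"

# + joins two strings end to end (concatenation).
concatenated = greeting + "hi"

# * with an integer repeats the string (repetition).
repeated = greeting * 2

print(concatenated)
print(repeated)
```

Note that concatenation does not insert a space: if one is wanted, it has to be part of one of the strings.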

Boolean expressions

An expression that is either True or False, e.g. with the operator ==

    >>> 5 == 5
    True
    >>> 5 == 6
    False

True and False are of <type 'bool'>; they are not strings

Relational operators

    x == y    x is equal to y
    x != y    x is not equal to y
    x > y     x is greater than y
    x < y     x is smaller than y
    x >= y    x is greater than or equal to y
    x <= y    x is smaller than or equal to y

Remember that:
- = is an assignment operator, used to assign a value to a variable
- == is a relational operator, used to express equality
- the order of the symbols is again important: =<, => and =! do not work!

Logical operators

- and
- or
- not

Applied to boolean expressions, they again give a boolean value: True or False

Which would return True?

    x > 0 and x < 5
    x == 3 or x == 4
    not (x > 5)
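The answer depends on the value of x. As a minimal sketch, assume the hypothetical value x = 4; changing it and rerunning the script shows how each result flips.

```python
x = 4  # hypothetical value, chosen only for illustration

# and: True only if both comparisons hold.
both_true = x > 0 and x < 5

# or: True if at least one comparison holds.
either_true = x == 3 or x == 4

# not: True if the comparison does not hold.
negated = not (x > 5)

print(both_true)
print(either_true)
print(negated)
```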

Conditional execution

Conditional statements check conditions and change the behaviour of the program accordingly

if statement, e.g.

    >>> if x > 0:
    ...     print "x is positive"  # body

The print statement is executed only if the condition is True

There is no limit on the number of statements that can appear in the body

There has to be at least one statement in the body

You can use pass as a temporary substitute for code you have not written yet:

    if x > 0:
        pass

Chained conditionals

Sometimes there are more than 2 possibilities: if, elif (else if) and else, e.g.

    if x > y:
        print "x is greater than y"
    elif x < y:
        print "x is smaller than y"
    else:
        print "x is equal to y"

There is no limit on the number of elif statements

Every elif statement has to contain at least one statement

The else statement has to come at the end, but it is optional

Each condition is checked in order

What would the result be if x = 8?

    if x == 0:
        print "x is 0"
    elif x > 0:
        print "x is greater than 0"
    elif x > 0 and x < 10:
        print "x is between 0 and 10"
    else:
        print "x is smaller than 0"

If one of the conditions is True, the corresponding branch executes and the statement ends

Even if more than one condition is True, only the first True branch executes!
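The "first True branch wins" rule can be sketched with the conditions from the x = 8 example. The function `describe` is a hypothetical wrapper (functions are not covered on these slides); it is used only so that several values of x can be tried in one script.

```python
def describe(x):
    """Return the message of the first branch whose condition is True."""
    if x == 0:
        return "x is 0"
    elif x > 0:
        return "x is greater than 0"
    elif x > 0 and x < 10:
        # Never reached: any x that passes this test already passed x > 0.
        return "x is between 0 and 10"
    else:
        return "x is smaller than 0"

message = describe(8)
print(message)
```

To make the third branch reachable, the more specific condition would have to be checked before the more general one.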

Some comments

As programs grow and become more complicated, they get more difficult to read

You can add notes which explain (for yourself and for others who read your code) what the program is doing:
- start the note with “#” and add your comment
- everything from the # to the end of the line is ignored by the program
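A minimal sketch of both placements of a comment, with invented variable names: on a line of its own and after a statement.

```python
# Compute the age ten years from now (this whole line is a comment).
age = 26
age_in_ten_years = age + 10  # a comment can also follow a statement

print(age_in_ten_years)
```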

Exercises

Put 2 numbers in different variables

Print the results for the operators +, -, *, /, ** when they are applied to these 2 variables (with a floating-point number as the result of the division)

    x = 2
    y = 3
    print "x =", x
    print "y =", y
    print "x + y =", x + y
    print "x - y =", x - y
    print "x * y =", x * y
    print "x / y =", x / float(y)
    print "x**y =", x**y

For next week…

Write a script called yourname_ex1.py that calculates the average of 5 weights, each stored in its own variable:
- 36.5 kg
- 47.8 kg
- 33 kg
- 68.3 kg
- 72 kg

Write a script called yourname_ex2.py that assigns an integer value to a variable “age” and prints “you are a minor” if the value is under 18, “you are over 18” if the value is 18 or more, and “you are kidding” if the value is less than 0.

Please mail by Tuesday next week:
- the scripts from the exercises
- the subject of your dissertation

[email protected]

Thank you