functions and modules in python

Functions and modules, Karin Lagesen

Upload: karinlag

Post on 28-Nov-2014


DESCRIPTION

Day 4 of an introductory python course for biologists. Theme: functions and modules.

TRANSCRIPT

Page 1: Functions and modules in python

Functions and modules

Karin Lagesen


Page 2: Functions and modules in python

Homework: TranslateProtein.py

● Input files are in /projects/temporary/cees-python-course/Karin
    ● translationtable.txt - tab separated
    ● dna31.fsa
● Script should:
    ● Open the translationtable.txt file and read it into a dictionary
    ● Open the dna31.fsa file and read the contents
    ● Translate the DNA into protein using the dictionary
    ● Print the translation in fasta format to the file TranslateProtein.fsa. Each protein line should be 60 characters long.

Page 3: Functions and modules in python

Modularization

● Programs can get big
● Risk of doing the same thing many times
● Functions and modules encourage
    ● re-usability
    ● readability
    ● maintainability

Page 4: Functions and modules in python

Functions

● Most common way to modularize a program

● Take values as parameters, execute code on them, return results
● Functions are also built in to Python:
    ● open(filename, mode)
    ● sum([list of numbers])
● These do something with their parameters and return the results
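As a small illustration of calling built-in functions, the sketch below passes a list to sum() and to len() (another built-in, not named on the slide). The list codon_counts is made up for this example.

```python
# Hypothetical data, invented for this illustration only.
codon_counts = [3, 5, 2]

# sum() takes a list of numbers as its parameter and returns their total.
total = sum(codon_counts)

# len() takes a sequence and returns how many items it holds.
n_items = len(codon_counts)

# print(...) with one argument works the same in Python 2 and Python 3.
print(total)    # 10
print(n_items)  # 3
```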

Page 5: Functions and modules in python

Functions – how to define

def FunctionName(param1, param2, ...):
    """ Optional function description (docstring) """
    FUNCTION CODE ...
    return DATA

● keyword: def – says this is a function

● functions need names

● parameters are optional, but common

● docstring useful, but not mandatory

● FUNCTION CODE does something

● keyword to return results: return
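A minimal sketch of the pieces listed above, put together in one function. The function gc_content and the test string are invented here for illustration and are not part of the course material.

```python
def gc_content(dna):
    """Return the fraction of G and C bases in a DNA string."""
    # Hypothetical helper: name and behaviour are assumptions for this example.
    dna = dna.upper()
    if len(dna) == 0:
        # Avoid dividing by zero for an empty string.
        return 0.0
    gc = dna.count("G") + dna.count("C")
    # float() keeps the division from being rounded down in Python 2.
    return float(gc) / len(dna)

fraction = gc_content("ATGC")
print(fraction)  # 0.5
```

The def line names the function and its single parameter, the docstring describes it, the indented lines are the function code, and return hands the result back to the caller.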

Page 6: Functions and modules in python

Function example

>>> def hello(name):
...     results = "Hello World to " + name + "!"
...     return results
...
>>> hello()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: hello() takes exactly 1 argument (0 given)
>>> hello("Lex")
'Hello World to Lex!'
>>>

● Task: make script from this – take name from command line

● Print results to screen

Page 7: Functions and modules in python

Function example script

import sys

def hello(name):
    results = "Hello World to " + name + "!"
    return results

name = sys.argv[1]
functionresult = hello(name)
print functionresult

[karinlag@freebee]% python hello.py
Traceback (most recent call last):
  File "hello.py", line 8, in ?
    name = sys.argv[1]
IndexError: list index out of range
[karinlag@freebee]% python hello.py Lex
Hello World to Lex!
[karinlag@freebee]%

Page 8: Functions and modules in python

Returning values

● Returning is not mandatory; if there is no return statement, None is returned by default

● Can return more than one value - results will be shown as a tuple

>>> def test(x, y):
...     a = x*y
...     return x, a
...
>>> test(1,2)
(1, 2)
>>>
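Both points can be checked in a short sketch. The function names no_return and product_and_sum are hypothetical, chosen only for this example.

```python
def no_return(x):
    # No return statement, so calling this function gives back None.
    doubled = x * 2

def product_and_sum(x, y):
    # Two values after return are packed into a single tuple.
    return x * y, x + y

nothing = no_return(3)
both = product_and_sum(2, 5)

# A returned tuple can also be unpacked into separate variables.
product, total = product_and_sum(2, 5)

print(nothing)  # None
print(both)     # (10, 7)
```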

Page 9: Functions and modules in python

Function scope

● Variables defined inside a function can only be seen there!

● To access the value of a variable defined inside a function, return it: return variable

Page 10: Functions and modules in python

Scope example

>>> def test(x):
...     z = 10
...     print "the value of z is " + str(z)
...     return x*2
...
>>> z = 50
>>> test(3)
the value of z is 10
6
>>> z
50
>>> x
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'x' is not defined
>>>
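The same idea as a script-style sketch: the value computed inside the function reaches the outside only through return. The function count_bases is a made-up example, not part of the course code.

```python
def count_bases(dna):
    # n_bases exists only while the function is running.
    n_bases = len(dna)
    return n_bases

# The returned value is stored under a new name outside the function.
length = count_bases("ATGCA")

# Asking for n_bases out here raises NameError, because the name
# was only ever defined inside count_bases.
try:
    n_bases
    visible_outside = True
except NameError:
    visible_outside = False

print(length)           # 5
print(visible_outside)  # False
```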

Page 11: Functions and modules in python

Parameters

● Functions can take parameters – not mandatory

● Arguments are matched to parameters in the order in which they are given

>>> def test(x, y):
...     print x*2
...     print y + str(x)
...
>>> test(2, "y")
4
y2
>>> test("y", 2)
yy
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in test
TypeError: unsupported operand type(s) for +: 'int' and 'str'
>>>

Page 12: Functions and modules in python

Named parameters

● Arguments can be passed by parameter name, in any order

>>> def test(x, y):
...     print x*2
...     print y + str(x)
...
>>> test(2, "y")
4
y2
>>> test("y", 2)
yy
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in test
TypeError: unsupported operand type(s) for +: 'int' and 'str'
>>> test(y="y", x=2)
4
y2
>>>

Page 13: Functions and modules in python

Default parameters

● Parameters can be given a default value
● With a default, the parameter does not have to be specified; the default will be used
● Can still name the parameter when calling the function

>>> def hello(name = "Everybody"):
...     results = "Hello World to " + name + "!"
...     return results
...
>>> hello("Anna")
'Hello World to Anna!'
>>> hello()
'Hello World to Everybody!'
>>> hello(name = "Annette")
'Hello World to Annette!'
>>>
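Defaults are often combined with a required parameter. The sketch below assumes a hypothetical helper, wrap_sequence, whose width parameter defaults to 60, the line length used for fasta output in this course.

```python
def wrap_sequence(seq, width=60):
    """Split seq into a list of pieces of at most width characters."""
    # Hypothetical helper, invented for this illustration.
    lines = []
    for i in range(0, len(seq), width):
        lines.append(seq[i:i + width])
    return lines

# width is not given, so the default of 60 is used.
default_lines = wrap_sequence("A" * 130)

# The default is overridden by naming the parameter.
narrow_lines = wrap_sequence("ATGCATGC", width=3)

print(len(default_lines))  # 3
print(narrow_lines)        # ['ATG', 'CAT', 'GC']
```

Parameters with defaults must come after the parameters without them in the def line.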

Page 14: Functions and modules in python

Exercise TranslateProteinFunctions.py
● Use script from homework
● Create the following functions:
    ● get_translation_table(filename)
        – return dict with codons and protein codes
    ● read_dna_string(filename)
        – return tuple with (descr, DNA_string)
    ● translate_protein(dictionary, DNA_string)
        – return the protein version of the DNA string
    ● pretty_print(descr, protein_string, outname)
        – write result to outname in fasta format

Page 15: Functions and modules in python

TranslateProteinFunctions.py

import sys

YOUR CODE GOES HERE!!!!

translationtable = sys.argv[1]
fastafile = sys.argv[2]
outfile = sys.argv[3]

translation_dict = get_translation_table(translationtable)
description, DNA_string = read_dna_string(fastafile)
protein_string = translate_protein(translation_dict, DNA_string)
pretty_print(description, protein_string, outfile)

Page 16: Functions and modules in python

get_translation_table

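def get_translation_table(translationtable):
    # Open the file named by the parameter, not a hard-coded filename
    fh = open(translationtable, 'r')
    trans_dict = {}
    for line in fh:
        # Each line holds a codon and its amino acid, tab separated
        codon = line.split()[0]
        aa = line.split()[1]
        trans_dict[codon] = aa
    fh.close()
    return trans_dict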

Page 17: Functions and modules in python

read_dna_string

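def read_dna_string(fastafile):
    fh = open(fastafile, "r")
    # First line is the header: drop the leading ">" and the trailing newline
    line = fh.readline()
    header_line = line[1:-1]
    seq = ""
    for line in fh:
        # Drop the trailing newline and append the rest to the sequence
        seq += line[:-1]
    fh.close()
    return (header_line, seq)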

Page 18: Functions and modules in python

translate_protein

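def translate_protein(translation_dict, DNA_string):
    aa_seq = ""
    # Step through the DNA three bases at a time; stopping at
    # len(DNA_string)-2 includes the last complete codon and
    # skips any leftover bases at the end
    for i in range(0, len(DNA_string)-2, 3):
        codon = DNA_string[i:i+3]
        one_letter = translation_dict[codon]
        aa_seq += one_letter
    return aa_seq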
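One way to check a codon loop by hand is to run it on a very short string. The sketch below is a self-contained variant with a three-entry toy table made up for this example (the real table comes from translationtable.txt). It uses len(dna) - 2 as the stop value so that a final complete codon is always included and leftover bases are ignored.

```python
# Toy table, invented for this illustration; "*" marks a stop codon here.
toy_table = {"ATG": "M", "AAA": "K", "TAA": "*"}

def toy_translate(table, dna):
    """Translate dna codon by codon, ignoring leftover bases at the end."""
    protein = ""
    # The last start position with a full codon after it is len(dna) - 3,
    # so the range has to stop at len(dna) - 2.
    for i in range(0, len(dna) - 2, 3):
        protein += table[dna[i:i + 3]]
    return protein

full = toy_translate(toy_table, "ATGAAATAA")   # three complete codons
partial = toy_translate(toy_table, "ATGAAATA")  # two codons plus "TA" left over

print(full)     # MK*
print(partial)  # MK
```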

Page 19: Functions and modules in python

pretty_print

def pretty_print(description, protein_string, outfile):
    fh = open(outfile, "w")
    fh.write(">" + description + "\n")
    for i in range(0, len(protein_string), 60):
        fh.write(protein_string[i:i+60] + "\n")
    fh.close()

Page 20: Functions and modules in python

Modules

● A module is a file with functions, constants and other code in it

● Module name = filename without .py
● Can be used inside another program
● Needs to be import-ed into program
● Lots of builtin modules: sys, os, os.path....
● Can also create your own

Page 21: Functions and modules in python

Using module

● One of two import statements:

1: import modulename

2: from module import function/constant

● If method 1:
    ● modulename.function(arguments)
● If method 2:
    ● function(arguments) – module name not needed
    ● beware of function name collision
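The two import styles side by side, sketched with os.path.basename from the standard library. The path is a made-up example.

```python
# Method 1: import the module, then prefix calls with the module name.
import os.path

# Method 2: import one function, then call it without any prefix.
from os.path import basename

path = "/projects/demo/dna31.fsa"  # hypothetical path, for illustration only

name_method_1 = os.path.basename(path)
name_method_2 = basename(path)

print(name_method_1)  # dna31.fsa
print(name_method_2)  # dna31.fsa
```

With method 2, defining your own function called basename later in the script would replace the imported one, which is the name collision the slide warns about.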

Page 22: Functions and modules in python

Operating system modules – os and os.path

● Modules dealing with files and operating system interaction

● Commonly used functions:
    ● os.getcwd() - get working directory
    ● os.chdir(path) – change working directory
    ● os.listdir(path) - get a list of all files in the directory path (use "." for the working directory)
    ● os.mkdir(path) – create directory
    ● os.path.join(dirname, dirname/filename...) – join the parts into one path
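A sketch that exercises each of these functions once. So that it is safe to run anywhere, it works inside a scratch directory made with tempfile.mkdtemp() and removes it again with shutil.rmtree(); those two modules are not covered on the slides, and the directory and file names are made up.

```python
import os
import os.path
import shutil
import tempfile

start_dir = os.getcwd()                         # remember where we started

scratch = tempfile.mkdtemp()                    # assumption: throwaway directory
results_dir = os.path.join(scratch, "results")  # hypothetical directory name
os.mkdir(results_dir)

# Create an empty file so that the listing has something to show.
fh = open(os.path.join(results_dir, "dna31.fsa"), "w")
fh.close()

os.chdir(results_dir)                           # move into the new directory
files_here = os.listdir(".")                    # list its contents
os.chdir(start_dir)                             # and move back again

print(files_here)                               # ['dna31.fsa']

shutil.rmtree(scratch)                          # clean up the scratch directory
```

os.path.join inserts the right separator for the operating system, which is why it is preferred over gluing strings together with "/".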

Page 23: Functions and modules in python

Your own modules

● Three steps:

1. Create file with functions in it. Module name is same as filename without .py

2. In other script, do import modulename

3. In other script, use function like this: modulename.functionname(args)
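The three steps can be tried out in a single runnable sketch. Normally step 1 is done in a text editor; here the module file is written from Python into a scratch directory so that the example is self-contained. The module name demo_seqtools and its function reverse are hypothetical, and tempfile and sys.path are used only to make the demo work from any location.

```python
import os.path
import sys
import tempfile

# Step 1: create a file with a function in it.
# The module name is the filename without .py: demo_seqtools.
module_dir = tempfile.mkdtemp()
fh = open(os.path.join(module_dir, "demo_seqtools.py"), "w")
fh.write("def reverse(seq):\n")
fh.write("    return seq[::-1]\n")
fh.close()

# Python searches the directories in sys.path for modules. A module lying
# next to your script is found automatically; the scratch directory is not,
# so it is added here.
sys.path.insert(0, module_dir)

# Step 2: import the module.
import demo_seqtools

# Step 3: use the function as modulename.functionname(args).
reversed_seq = demo_seqtools.reverse("ATGC")
print(reversed_seq)  # CGTA
```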

Page 24: Functions and modules in python

Separating module use and main use

● Files containing python code can be:
    ● script file
    ● module file
● Module functions can be used in scripts
● But: modules can also be scripts
● Question is – how do you know if the code is being executed in the module script or an external script?

Page 25: Functions and modules in python

Module use / main use

● When a script is being run, within that script a variable called __name__ will be set to the string "__main__"

● Can test on this string to see if this script is being run

● Benefit: can define functions in script that can be used in module mode later
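A minimal sketch of the test on __name__, reusing the hello function from earlier in the course. The variable imported_name is added only to show what __name__ looks like inside a module that has been imported.

```python
import os

def hello(name):
    return "Hello World to " + name + "!"

# Inside an imported module, __name__ holds the module's own name.
imported_name = os.__name__
print(imported_name)  # os

# In the file that is being run directly, __name__ is "__main__",
# so this block runs. If this file is imported instead, it is skipped,
# but hello() is still available to the importing script.
if __name__ == "__main__":
    print(hello("Lex"))
```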

Page 26: Functions and modules in python

Module mode / main mode

import sys

<code as before>

translationtable = sys.argv[1]
fastafile = sys.argv[2]
outfile = sys.argv[3]

translation_dict = get_translation_table(translationtable)
description, DNA_string = read_dna_string(fastafile)
protein_string = translate_protein(translation_dict, DNA_string)
pretty_print(description, protein_string, outfile)

When this file is used, whether it is run as a script or imported as a module, this code will always run, no matter what!

Page 27: Functions and modules in python

Module use / main use

# this is a script
import sys
import TranslateProteinFunctions

description, DNA_string = TranslateProteinFunctions.read_dna_string(sys.argv[1])
print description

[karinlag@freebee]% python modtest.py dna31.fsa
Traceback (most recent call last):
  File "modtest.py", line 2, in ?
    import TranslateProteinFunctions
  File "TranslateProteinFunctions.py", line 44, in ?
    fastafile = sys.argv[2]
IndexError: list index out of range
[karinlag@freebee]Karin%

Page 28: Functions and modules in python

TranslateProteinFunctions.py with main

import sys

<code as before>

if __name__ == "__main__":
    translationtable = sys.argv[1]
    fastafile = sys.argv[2]
    outfile = sys.argv[3]

    translation_dict = get_translation_table(translationtable)
    description, DNA_string = read_dna_string(fastafile)
    protein_string = translate_protein(translation_dict, DNA_string)
    pretty_print(description, protein_string, outfile)

Page 29: Functions and modules in python

ConcatFasta.py

● Create a script that has the following:
    ● function get_fastafiles(dirname)
        – gets all the files in the directory, checks if they are fasta files (end in .fsa), returns list of fasta files
        – hint: you need os.path to create full relative file names
    ● function concat_fastafiles(filelist, outfile)
        – takes a list of fasta files, opens and reads each of them, writes them to outfile
    ● if __name__ == "__main__":
        – do what needs to be done to run script
● Remember imports!
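As a starting hint for the file check only, not a solution to the exercise: the string method endswith() tells you whether a name ends in a given extension. The list of names below is made up for this sketch.

```python
# Hypothetical file names, invented for this illustration.
names = ["dna31.fsa", "notes.txt", "dna32.fsa"]

fasta_names = []
for name in names:
    # endswith() returns True when the string ends with the given text.
    if name.endswith(".fsa"):
        fasta_names.append(name)

print(fasta_names)  # ['dna31.fsa', 'dna32.fsa']
```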