An Introduction to Python and Its Use in Bioinformatics
Dr. Nancy Warter-Perez
April 19, 2005
4/19/05 Introduction to Python 2
Overview
  What is Bioinformatics?
  Overview of program/script development (PP Ch3)
  Python Basics (PP Ch1)
  Python Types and Operators
    Numbers and Arithmetic operators (PP Ch2)
    Strings (PP Ch4)
    Lists and Dictionaries (PP Ch5)
    Input & Output (PP Ch2)
  Programming Workshop #1

What is Bioinformatics?
  Fredj Tekaia at the Institut Pasteur offers this definition of bioinformatics: "The mathematical, statistical and computing methods that aim to solve biological problems using DNA and amino acid sequences and related information."

Classical Bioinformatics
  According to Damian Counsell from bioinformatics.org: "use computers to store, retrieve, analyze or predict the composition or the structure of biomolecules. As computers become more powerful you could probably add simulate to this list of bioinformatics verbs. 'Biomolecules' include your genetic material---nucleic acids---and the products of your genes: proteins. These are the concerns of 'classical' bioinformatics, dealing primarily with sequence analysis."

"New" Bioinformatics
  comparative genomics - look for differences and similarities between all the genes of multiple species
  functional genomics - identify gene functions and associations
  proteomics - catalogue the activities and characterize interactions between all gene products (in humans)
  structural genomics - crystallize and/or predict the structures of all proteins (in humans)

Program Development
  Problem solving
    Problem specification
    Algorithm design
    Test by hand
  Implementation
    Code in target language
    Test code / debug
  Result: Program/Script

What is Python?
  A portable, interpreted, object-oriented programming language
  Elegant syntax
  Powerful high-level built-in data types
    Numbers, strings, lists, dictionaries
  Full set of string operations

Why Python?
  Previously used C++
  Scripting languages useful for bioinformatics
    Perl is the "bioinformatics standard"
    Python is more "robust" for larger software projects

Useful Tutorials
  DNA from the Beginning
    http://www.dnaftb.org/dnaftb/
  Python Tutorial
    http://www.python.org/doc/current/tut/tut.html

Python Development
  Open-source software
  Python interpreter: runs on Windows; download it in two parts:
    1. The interpreter and core of Python: http://www.python.org/2.3.3/ (get the Python-2.3.3.exe file; a newer release, 2.4.1, is also available if you prefer)
    2. An integrated development environment for Python called PythonWin, by Mark Hammond: http://starship.python.net/crew/mhammond/win32/Downloads.html

Python Basics - Comments
  Python comments
    # line comment
  Header comments
    # Description of program
    # Written by:
    # Date created:
    # Last Modified:

Python Basics - Variables
  Python variables are not "declared". To assign a variable, just type: identifier = literal
  Identifiers have the following restrictions:
    Must start with a letter or underscore (_)
    Case sensitive
    Must consist of only letters, numbers, or underscores
    Must not be a reserved word
  Identifiers have the following conventions:
    All-uppercase names are used for constants
    Variable names are meaningful, and thus often multi-word (but not too long)
      Convention 1: alignment_sequence (align_seq)
      Convention 2: AlignmentSequence (AlignSeq)
    Python-specific conventions: avoid names of the form _X, __X__, __X, and _
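
The naming rules above can be checked directly in the interpreter. This is a minimal sketch written in Python 3 syntax (the slides target Python 2.3); every variable name in it is illustrative only.

```python
import keyword

# Constants: all uppercase by convention (Python does not enforce this).
MAX_SEQ_LENGTH = 1000

# Convention 1: lowercase words joined by underscores.
align_seq = "ACGT"

# Convention 2: capitalized words run together.
AlignSeq = "ACGT"

# Identifiers are case sensitive, so these are two different variables.
score = 1
Score = 2

# Reserved words cannot be used as identifiers.
is_reserved = keyword.iskeyword("while")

# str.isidentifier() checks the character rules; a leading digit is rejected.
digit_first_ok = "2nd_seq".isidentifier()

print(score, Score, is_reserved, digit_first_ok)  # 1 2 True False
```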

Numbers
  Normal integers - represent whole numbers
    Ex: 3, -7, 123, 76
  Long integers - unlimited size
    Ex: 9999999999999999999999L
  Floating-point - represent numbers with decimal places
    Ex: 1.2, 3.14159, 3.14e-10
  Octal and hexadecimal numbers
    Ex: 0177, 0x9ff, 0xff
  Complex numbers
    Ex: 3+4j, 3.0+4.0j, 3J
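
For readers on a current interpreter, here is a short sketch of the same literals in Python 3 syntax. Two spellings differ from the Python 2.3 forms on the slide: the L suffix is gone because every integer is unbounded, and octal literals are written with a 0o prefix. The variable names are illustrative.

```python
whole = 123                    # integer
big = 9999999999999999999999   # no "L" suffix needed in Python 3
small = 3.14e-10               # floating point in scientific notation
octal = 0o177                  # Python 2.3 wrote this as 0177
hexa = 0xff                    # hexadecimal
z = 3 + 4j                     # complex number

print(octal, hexa)     # 127 255
print(abs(z))          # 5.0
print(z.real, z.imag)  # 3.0 4.0
```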

Python Basics - Arithmetic Operations
  Operator                Example (y=5; z=3)   Result
  +   add                 x = y + z            x = 8
  -   subtract            x = y - z            x = 2
  *   multiply            x = y * z            x = 15
  /   divide              x = y / z            x = 1   (integer division truncates)
  %   modulus/remainder   x = y % z            x = 2
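
The table can be reproduced with a few lines of code. This sketch uses Python 3, where one result differs from the Python 2.3 behaviour on the slide: `/` is true division, and `//` gives the truncated integer result the slide shows.

```python
y = 5
z = 3

print(y + z)   # 8
print(y - z)   # 2
print(y * z)   # 15
print(y // z)  # 1  (floor division; Python 2.3 gave this for y / z)
print(y % z)   # 2
print(y / z)   # true division in Python 3: a float between 1 and 2

# Quotient and remainder together.
quotient, remainder = divmod(y, z)
```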

Python Basics - Arithmetic Operations
  Operator                Example (y=5; z=3)   Result
  <<  shift left          x = y << 1           x = 10
  >>  shift right         x = y >> 2           x = 1
  **  raise to power      x = y ** z           x = 125
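
A brief sketch of why the shift results come out as they do: shifting moves the binary digits of the number, so each left shift doubles the value and each right shift halves it, discarding the remainder. The variable names are illustrative.

```python
y = 5            # binary 101
z = 3

left = y << 1    # binary 1010
right = y >> 2   # binary 1
power = y ** z   # 5 * 5 * 5

print(bin(y), bin(left), bin(right))  # 0b101 0b1010 0b1
print(left, right, power)             # 10 1 125
```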

Python Basics - Relational and Logical Operators
  Relational operators
    ==      equal
    !=, <>  not equal
    >       greater than
    >=      greater than or equal
    <       less than
    <=      less than or equal
  Logical operators
    and     and
    or      or
    not     not

Python Basics - Relational Operators
  Assume x = 1, y = 4, z = 14
  Expression         Value   Interpretation
  x < y + z          1       True
  y == 2 * x + 3     0       False
  z <= x + y         0       False
  z > x              1       True
  x != y             1       True
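
The rows of this table can be evaluated directly. The sketch below is in Python 3, where comparisons return the bool values True and False; converting them with int() gives the 1 and 0 shown in the Value column. The `results` dictionary is just a convenient container for the example.

```python
x, y, z = 1, 4, 14

results = {
    "x < y + z": x < y + z,
    "y == 2 * x + 3": y == 2 * x + 3,
    "z <= x + y": z <= x + y,
    "z > x": z > x,
    "x != y": x != y,
}

for expression, value in results.items():
    print("%-16s %d  %s" % (expression, int(value), value))
```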

Python Basics - Logical Operators
  Assume x = 1, y = 4, z = 14
  Expression             Value   Interpretation
  x<=1 and y==3          0       False
  x<=1 or y==3           1       True
  not (x > 1)            1       True
  not x > 1              1       True (not binds more loosely than >, so this is the same as not (x > 1))
  not (x<=1 or y==3)     0       False
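
Operator precedence is the subtle point with `not`. In Python, `not` binds less tightly than the comparison operators, so `not x > 1` is parsed as `not (x > 1)`; explicit parentheses are needed to negate `x` first. A short sketch, with illustrative variable names:

```python
x, y = 1, 4

a = x <= 1 and y == 3         # True and False
b = x <= 1 or y == 3          # True or False
c = not (x > 1)               # not False
d = not x > 1                 # parsed as not (x > 1), same as c
e = (not x) > 1               # not x is False, and False > 1 is False
f = not (x <= 1 or y == 3)    # not True

print(a, b, c, d, e, f)  # False True True True False False
```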

Strings
  Enclosed in single or double quotes
    Ex: 'Hello!', "Hello!", "3.5", "a", 'a'
  Sequence of characters:
    mystring = "hello world!"
    mystring[0] -> "h"
    mystring[1] -> "e"
    mystring[2] -> "l"
    mystring[-1] -> "!"    (-1 is last, -2 next to last, etc.)
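
As a small illustration of the indexing rules, a negative index -k refers to the same character as len(mystring) - k. The sketch uses the slide's string; the other variable names are illustrative.

```python
mystring = "hello world!"

first = mystring[0]
last = mystring[-1]
next_to_last = mystring[-2]

# A negative index counts back from the end of the string.
same_char = mystring[len(mystring) - 2]

print(first, last, next_to_last)  # h ! d
```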

String Operations
  mystring = "Hello World!"
  Expression             Value            Purpose
  len(mystring)          12               Number of characters in mystring
  "hello" + "world"      "helloworld"     Concatenate strings
  "%s world" % "hello"   "hello world"    Format strings (like sprintf)
  "world" == "hello"     0 or False       Test for equality
  "world" == "world"     1 or True
  "a" < "b"              1 or True        Alphabetical ordering
  "b" < "a"              0 or False
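
The operations in this table, run as code (Python 3 syntax). The last line adds one detail worth knowing: string comparison goes by character code, so every uppercase letter sorts before every lowercase letter.

```python
mystring = "Hello World!"

print(len(mystring))          # 12
print("hello" + "world")      # helloworld
print("%s world" % "hello")   # hello world
print("world" == "hello")     # False
print("world" == "world")     # True
print("a" < "b")              # True
print("Z" < "a")              # True: uppercase codes come before lowercase
```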

Strings (2)
  Slicing:
    mystring = "spoon!"
    mystring[2:] -> "oon!"
    mystring[:3] -> "spo"    # note: the element at the end index is never included!
    mystring[1:3] -> "po"
  Many useful built-in functions:
    mystring.upper() -> "SPOON!"
    mystring.replace('o', 'O') -> "spOOn!"
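
Because the end index is excluded, two slices that share a boundary always reassemble the original string. The sketch below shows this and applies slicing to a short DNA string; the sequence and the names `dna` and `first_codon` are made up for illustration.

```python
mystring = "spoon!"

head = mystring[:3]     # characters 0, 1, 2
tail = mystring[3:]     # character 3 onward
middle = mystring[1:3]  # characters 1 and 2

# String methods return new strings; mystring itself is unchanged.
shout = mystring.upper()
swapped = mystring.replace("o", "O")

# Hypothetical sequence: take the first three bases as a codon.
dna = "ATGGCC"
first_codon = dna[:3]

print(head, tail, middle)           # spo on! po
print(shout, swapped, first_codon)  # SPOON! spOOn! ATG
```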

Strings (3)
  "%" operator: a sort of "fill in the blanks" operation:
    mystring = "%s has %d marbles" % ("John", 35)
    mystring -> "John has 35 marbles"
  The format string on the left holds the "blanks"; the tuple on the right holds the values to put in the blanks.
    %s       replace with string
    %d, %i   replace with integer
    %f       replace with float
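
A short sketch of the % operator with each kind of blank. The precision forms `%.1f` and `%.2f` are standard printf-style options not shown on the slide; the values are illustrative.

```python
name = "John"
count = 35

mystring = "%s has %d marbles" % (name, count)

# A precision after the dot controls how many decimals %f prints.
label = "%s: %.1f" % ("I", 4.5)
fraction = "GC content: %.2f" % 0.4567

print(mystring)  # John has 35 marbles
print(label)     # I: 4.5
print(fraction)  # GC content: 0.46
```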

Lists
  mylist = ["a", "b", 3.58, "d", 4, 0]
  Expression            Value                            Purpose
  mylist[0]             a                                Indexing
  mylist[2]             3.58
  mylist[-1]            0                                Negative indexing (counts from end)
  mylist[-2]            4
  mylist[1:4]           ["b", 3.58, "d"]                 Slicing (like strings)
  "b" in mylist         1 or True                        Membership test
  "e" not in mylist     1 or True
  mylist.append(8)      ["a", "b", 3.58, "d", 4, 0, 8]   Add to end of list
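
The list operations above as a runnable sketch (Python 3 syntax). Note that append() changes the list in place and returns None; the list shown in the table is the state of mylist afterwards.

```python
mylist = ["a", "b", 3.58, "d", 4, 0]

print(mylist[0], mylist[2])    # a 3.58
print(mylist[-1], mylist[-2])  # 0 4
print(mylist[1:4])             # ['b', 3.58, 'd']
print("b" in mylist)           # True
print("e" not in mylist)       # True

returned = mylist.append(8)    # modifies mylist; the return value is None
print(mylist)                  # ['a', 'b', 3.58, 'd', 4, 0, 8]
```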

Tuples
  Tuples - sequences of values
    Like lists, but cannot be changed after creation
      mytuple = (1, "a", "bc", 3, 87.2)
      mytuple[2] -> "bc"
      mytuple[1] = "3"    Error!
  Used when you want to pass several variables around at once
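
A sketch of both points: assigning to a tuple element raises TypeError, and a tuple is a convenient way to return several values from a function at once. The function `min_max` is a hypothetical helper written for this example.

```python
mytuple = (1, "a", "bc", 3, 87.2)

try:
    mytuple[1] = "3"
except TypeError as error:
    message = str(error)  # tuples do not support item assignment


def min_max(values):
    """Return the smallest and largest value together as a tuple."""
    return min(values), max(values)


# Tuple unpacking assigns each element to its own name.
low, high = min_max([3, 1, 2])
print(mytuple[2], low, high)  # bc 1 3
```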

Dictionaries
  Dictionaries - map 'keys' to 'values'
    Like lists, but indices can be of any type
    Also, keys are in no particular order
    E.g.:
      mydict = {'b': 3, 'a': 4, 75: 2.85}
      mydict['b'] -> 3
      mydict[75] -> 2.85
      mydict['a'] -> 4

Dictionaries
  mydict = {"r": 1, "g": 2, "y": 3.5, 8.5: 8, 9: "nine"}
  Expression                  Value                       Purpose
  mydict.keys()               ['y', 8.5, 'r', 'g', 9]     List of the keys
  mydict.values()             [3.5, 8, 1, 2, 'nine']      List of the values
  mydict["y"]                 3.5                         Value lookup
  mydict.has_key("r")         True or 1                   Check for keys
  mydict.update({"a": 75})    {8.5: 8, 'a': 75, 'r': 1, 'g': 2, 'y': 3.5, 9: 'nine'}    Add pairs to dictionary
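
For readers on Python 3, the same operations look slightly different: has_key() was removed in favour of the `in` operator, and keys() and values() return views that can be wrapped in list(). This is a sketch of the Python 3 equivalents, not the Python 2.3 code the slide describes.

```python
mydict = {"r": 1, "g": 2, "y": 3.5, 8.5: 8, 9: "nine"}

keys = list(mydict.keys())      # view converted to a list
values = list(mydict.values())
lookup = mydict["y"]
has_r = "r" in mydict           # Python 3 replacement for has_key("r")

mydict.update({"a": 75})        # adds the pair in place; returns None

print(lookup, has_r, mydict["a"])  # 3.5 True 75
```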

Dictionaries - Other Considerations
  Slicing not allowed
  Referencing an invalid key is an error:
    >>> mydict = {8.5: 8, 'a': 75, 'r': 1, 'g': 2, 'y': 3.5, 9: 'nine'}
    >>> mydict["red"]
    Traceback (most recent call last):
      File "<interactive input>", line 1, in ?
    KeyError: 'red'
  Use mydict.get("red") instead; it returns None if the key is not found
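
A sketch of the two lookup styles side by side. get() also accepts an optional second argument, a default to return in place of None; the default of 0 below is chosen only for illustration.

```python
mydict = {8.5: 8, "a": 75, "r": 1, "g": 2, "y": 3.5, 9: "nine"}

# Square-bracket lookup raises KeyError for a missing key.
try:
    mydict["red"]
    missing = False
except KeyError:
    missing = True

value = mydict.get("red")        # None when the key is absent
fallback = mydict.get("red", 0)  # the supplied default instead of None
found = mydict.get("r")          # the stored value when the key exists

print(missing, value, fallback, found)  # True None 0 1
```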

Input/Output
  Function raw_input() is designed to read a line of input from the user
    One optional argument: a string to prompt the user
    If an int or float is desired, simply convert the string:
      int(mystring) -> convert to int (if possible)
      float(mystring) -> convert to float (if possible)
  >>> mystr = raw_input("Enter a string:")
  Enter a string:Hello World!
  >>> mystr
  'Hello World!'
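
The "(if possible)" caveat matters in practice: int() and float() raise ValueError when the text is not a number. Below is a minimal sketch of a conversion helper; `to_number` is a hypothetical function, and the code is Python 3, where raw_input() has been renamed input().

```python
def to_number(text):
    """Return text as an int if possible, else a float, else None."""
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            pass  # try the next conversion
    return None


print(to_number("42"))     # 42
print(to_number("3.5"))    # 3.5
print(to_number("hello"))  # None

# Interactive use (Python 3):
#     number = to_number(input("Enter a number:"))
```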

Output
  Function print
    Prints each argument, followed by a space
    After all arguments, prints a newline
    Put a comma after the last argument to prevent the newline
    "Add" strings to avoid spaces
  print "a","b","c"     -> a b c    (newline)
  print "a","b","c",    -> a b c    (no newline)
  print "a"+"b"+"c"     -> abc      (no spaces)

Output Example
>>> print "hello","world";print "hello","again"
hello world
hello again
>>> print "hello","world",;print "hello","again"
hello world hello again
>>> print "hello %s world" % "cold and cruel"
hello cold and cruel world
>>> print "hello","cold"+ " " + "and","cruel","world"
hello cold and cruel world
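
In Python 3, print is a function, and the trailing-comma trick is replaced by keyword arguments. The sketch below approximates the behaviour described above: end=" " stands in for the trailing comma, and sep="" is an alternative to adding strings together. Output is written to an in-memory buffer (an assumption made here so the result can be inspected); omit file=buffer to print to the screen.

```python
import io

buffer = io.StringIO()

print("a", "b", "c", file=buffer)           # "a b c" then a newline
print("a", "b", "c", end=" ", file=buffer)  # "a b c" then a space, no newline
print("a" + "b" + "c", file=buffer)         # "abc": no spaces between parts
print("a", "b", "c", sep="", file=buffer)   # "abc" again, using sep

output = buffer.getvalue()
print(output, end="")
```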

Creating a Python Program
  Enter your program in the editor
    Notice that the editor color-codes comments, keywords, etc.
    Also notice that it indents automatically
      Don't override it! Indentation is how Python tells where block statements end
      If the editor doesn't indent to the proper location, that indicates a bug
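
A small sketch of indentation marking blocks. The function `sign_label` is a hypothetical example: the lines indented under `if` and `else` belong to those blocks, and the return statement, indented one level less, runs after either branch.

```python
def sign_label(value):
    if value > 0:
        label = "positive"      # inside the if block
    else:
        label = "not positive"  # inside the else block
    return label                # dedented: the if/else has ended


print(sign_label(4.5))   # positive
print(sign_label(-3.5))  # not positive
```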

Running your Program
  To build your program
    Under File->Run…
    Select No Debugging in the drop-down window
    Fix any errors, then run again

Programming Workshop #1
  Write a Python program to compute the hydrophobicity of an amino acid

  Amino Acid   Hydrophobicity Value
  A             1.8
  C             2.5
  D            -3.5
  E            -3.5
  F             2.8
  G            -0.4
  H            -3.2
  I             4.5
  K            -3.9
  L             3.8
  M             1.9
  N            -3.5
  P            -1.6
  Q            -3.5
  R            -4.5
  S            -0.8
  T            -0.7
  V             4.2
  W            -0.9
  Y            -1.3

  The program will prompt the user for an amino acid and will display the hydrophobicity
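
One possible approach to the workshop, offered as a sketch rather than the expected solution: store the table in a dictionary and look up the user's entry with get(). The names `HYDROPHOBICITY`, `hydrophobicity`, and `main` are hypothetical, accepting lowercase input is an added assumption, and the code is Python 3 (in Python 2.3, use raw_input and the print statement).

```python
# Values copied from the workshop table above.
HYDROPHOBICITY = {
    "A": 1.8, "C": 2.5, "D": -3.5, "E": -3.5, "F": 2.8,
    "G": -0.4, "H": -3.2, "I": 4.5, "K": -3.9, "L": 3.8,
    "M": 1.9, "N": -3.5, "P": -1.6, "Q": -3.5, "R": -4.5,
    "S": -0.8, "T": -0.7, "V": 4.2, "W": -0.9, "Y": -1.3,
}


def hydrophobicity(amino_acid):
    """Return the table value for a one-letter code, or None if unknown."""
    return HYDROPHOBICITY.get(amino_acid.strip().upper())


def main():
    code = input("Enter an amino acid (one-letter code): ")
    value = hydrophobicity(code)
    if value is None:
        print("Unknown amino acid: %s" % code)
    else:
        print("Hydrophobicity of %s: %.1f" % (code.strip().upper(), value))


# Call main() to run the program interactively.
```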