# stat115 stat225 bist512 bio298 - intro to computational biology python tutorial ii monty python,...

Post on 29-Dec-2015


Python Tutorial

Python Tutorial II: Monty Python, Game of Life and Sequence Alignment

Feb 1, 2011

Daniel Fernandez and Alejandro Quiroz

dfernan@gmail.com, aquiroz@hsph.harvard.edu

STAT115 STAT225 BIST512 BIO298 - Intro to Computational Biology

1st ACT (1 hour): Random Module, Monty Hall, Game of Life, Sequence Alignment

INTERMISSION: Chillout session (10 min)

2nd ACT (1 hour 50 min): Homework help for Q5, Q6, Q7 and Q8.

Important Module: random

| Method | Result |
|---|---|
| randint(x, y) | Random integer between the integers x and y (both included) |
| random() | Random float in [0.0, 1.0) |
| distname(a, b) | Draw from a named distribution: uniform, triangular, Gaussian, lognormal, negative exponential, gamma, beta, Pareto, Weibull |
| choice(list) | Choose an element from a list at random |
| sample(list, k) | Choose k elements from a list at random, without replacement! |
| shuffle(list) | Shuffle the elements of list in place |
| seed() | Change the seed used to generate random numbers |

Example. Simulate Flip of a Coin.

    import random

    coin = ["heads", "tails"]
    num_heads = 0
    num_tails = 0
    # Flip the coin 1000 times and tally each outcome.
    for i in range(0, 1000):
        flip = random.choice(coin)
        if flip == "heads":
            num_heads += 1
        else:
            num_tails += 1
    print("number of heads: %d" % num_heads)
    print("number of tails: %d" % num_tails)
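The methods in the table can be tried out in a few lines. This is only an illustrative sketch: the seed value, the `bases` list and the variable names are arbitrary choices, not part of the course material.

```python
import random

random.seed(115)  # arbitrary fixed seed, so repeated runs give the same draws

bases = ["A", "C", "G", "T"]      # hypothetical example list

die = random.randint(1, 6)        # integer in 1..6, both endpoints included
u = random.random()               # float in [0.0, 1.0)
g = random.gauss(0.0, 1.0)        # one member of the "distname(a, b)" family
one = random.choice(bases)        # a single element
two = random.sample(bases, 2)     # two distinct elements (no replacement)

deck = list(range(10))
random.shuffle(deck)              # shuffles in place and returns None

print(die, u, g, one, two, deck)
```

Calling `random.seed` with the same value before a run is what makes a simulation reproducible.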

Monty Hall Problem

Suppose you're on a game show, and you're given a choice of three doors: behind one door is a car; behind the others, goats. You pick a door, say number 3, and the host, who knows what's behind the doors, opens another door, say number 2, which has a goat. He says to you, "Do you want to pick door number 1?" Is it to your advantage to switch your choice of doors?

Run montyhall.py to see the results. Read montyhall.py and try to understand what the program does.
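The course's montyhall.py is not reproduced in this transcript. As a rough stand-in, the game can be simulated along the following lines; the function names `play_round` and `win_rate`, the number of trials and the seed are all assumptions made for this sketch.

```python
import random


def play_round(switch, rng):
    """Play one game; return True if the player ends up with the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a door that is neither the player's pick nor the car.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the only door that is neither picked nor opened.
        pick = [d for d in doors if d != pick and d != opened][0]
    return pick == car


def win_rate(switch, trials=10000, seed=0):
    """Fraction of games won over many simulated rounds."""
    rng = random.Random(seed)
    wins = sum(play_round(switch, rng) for _ in range(trials))
    return wins / float(trials)


print("stay:   %.3f" % win_rate(False))
print("switch: %.3f" % win_rate(True))
```

With enough trials, the rate for switching should settle near 2/3 and the rate for staying near 1/3.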

Visual simulation. Python source.

Solution: montyhall.py

Usage: python montyhall.py

Exercise 1. Read a fasta file.

Write a Python module for reading FASTA files and add it to your utils.py module. If you are feeling lazy, read the Q7 code.

Solution: ex1_fasta.py

Usage: from ex1_fasta import *
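For orientation, a FASTA reader might look like the sketch below. It is not the course's ex1_fasta.py; the names `parse_fasta` and `read_fasta_file` and the choice of returning (header, sequence) tuples are assumptions made here.

```python
def parse_fasta(lines):
    """Return a list of (header, sequence) tuples from FASTA-formatted lines."""
    records = []
    header = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def read_fasta_file(path):
    """Convenience wrapper: parse the FASTA file at the given path."""
    with open(path) as handle:
        return parse_fasta(handle)


example = [">seq1 demo", "ACGT", "TTGA", "", ">seq2", "GGCC"]
print(parse_fasta(example))
```

Parsing from any iterable of lines keeps the core function easy to test without touching the file system.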

Exercise 2. Complementary DNA sequence and palindromic sequence

Write a program that takes as input a DNA sequence written 5' to 3' and returns its reverse complement, that is, the complementary strand also written 5' to 3'.

Also make the program report whether or not the sequence is palindromic.

HINT: http://en.wikipedia.org/wiki/Complementarity_(molecular_biology)
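One possible approach to this exercise is sketched below. It is not the course's solution file; the names `COMPLEMENT`, `reverse_complement` and `is_palindromic` are illustrative, and the sketch assumes the input contains only the letters A, C, G and T. A DNA sequence is treated as palindromic when it equals its own reverse complement.

```python
# Watson-Crick base pairing (assumes input uses only A, C, G, T).
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, in upper case."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_palindromic(seq):
    """A DNA palindrome reads the same as its reverse complement."""
    return seq.upper() == reverse_complement(seq)


print(reverse_complement("ATGC"))   # GCAT
print(is_palindromic("GAATTC"))     # True
print(is_palindromic("ATGC"))       # False
```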

Solution: ex2_complimentarydna.py

Usage: python ex2_complimentarydna.py

GAME OF LIFE

Life is a "game" or cellular automaton (an evolving computational state system) developed by the Cambridge mathematician John Conway.

The idea is simple: start with a board of dimensions (x, y). Populate the board with an initial pattern of occupied and empty cells. In every turn, the rules are:

(i) if an empty cell has three neighbors, fill it next turn;

(ii) if an occupied cell has zero or one neighbor, it dies of loneliness; and

(iii) if an occupied cell has four or more neighbors, it dies of overcrowding.

In all other cases a cell keeps its state: an occupied cell with two or three neighbors survives, and an empty cell without exactly three neighbors stays empty.

You can get really strange, unpredictable behavior out of very simple initial patterns, and many mathematicians have spent a lot of time thinking about how this works.
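The three rules above, together with the implied rule that an occupied cell with two or three neighbors survives, can be written as a single update step. This sketch is independent of the course's LifeGame.py; the list-of-lists board, the names `count_neighbors` and `step`, and the choice to treat everything beyond the board edge as empty are assumptions.

```python
def count_neighbors(board, r, c):
    """Count occupied cells among the up to eight neighbors of (r, c)."""
    rows, cols = len(board), len(board[0])
    total = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # a cell is not its own neighbor
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols:  # off-board counts as empty
                total += board[rr][cc]
    return total


def step(board):
    """Return the next generation; the input board is left unchanged."""
    new_board = []
    for r, row in enumerate(board):
        new_row = []
        for c, cell in enumerate(row):
            n = count_neighbors(board, r, c)
            if cell == 0:
                new_row.append(1 if n == 3 else 0)       # rule (i)
            else:
                new_row.append(1 if n in (2, 3) else 0)  # rules (ii), (iii)
        new_board.append(new_row)
    return new_board


blinker = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
for row in step(blinker):
    print(row)
```

The horizontal bar of three cells flips to a vertical bar after one step and back again after two. Building a new board rather than updating in place matters, because every cell must see the previous generation's neighbors.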


Run the game of life (in the terminal)

First install the Jython Standard package. Then add to your .bash_profile:

    # For Jython
    export JYTHON_HOME=/Users/dfernan/bin/jython2.5.2/
    export PATH=$JYTHON_HOME:$PATH
    export CLASSPATH=$JYTHON_HOME/jython.jar:$CLASSPATH

Then run:

    jython LifeGame.py

Solution: LifeGame.py (GridMutator.py)

Usage: jython LifeGame.py

HH Question 5. Melting Temp

Usage: python q5.py q5_input.txt q5.output 20 55

HH Question 6. Longest Sequence

Any ideas for retrieving the longest exact matching sequence between two sequences?
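One idea, offered as a sketch rather than the intended homework answer, is a dynamic-programming table in which each entry holds the length of the longest common run ending at a given pair of positions. The function name `longest_common_substring` is made up for this illustration, and when several matches tie for longest, this version returns the one that ends earliest in the first sequence.

```python
def longest_common_substring(a, b):
    """Return one longest stretch of characters found contiguously in both."""
    best_len = 0
    best_end = 0  # end index (exclusive) of the best match within a
    # prev[j] = length of the common run ending at a[i-2] and b[j-1]
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended one position earlier in both.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best_end = i
        prev = curr
    return a[best_end - best_len:best_end]


print(longest_common_substring("GATTACA", "TTTACAG"))  # TTACA
```

Only two rows of the table are kept at a time, so memory grows with the length of the second sequence while the running time grows with the product of the two lengths.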

How do you read a FASTA file? Write a function that takes a file name as input and returns a list containing each sequence in the FASTA file. If lazy, just look at homework Q7.

Solution: fasta.py, Q8_input.fasta

Usage: Use it as a Python module containing the fasta class and the read_fasta function.

Sequence Alignment

How many operations? _____


HOMOLOGOUS: Paralogs, Orthologs


Align the following sequences and explain the alignment. Below are the sequences and the match/mismatch (sub-)BLOSUM matrix (HH1 and HH7).

Dynamic Programming: the art of dividing a problem into simpler (sub)problems and then applying the sub-solutions recursively in order to obtain the final solution.

New best alignment = Best previous alignment + align(i, j)

How many operations? _____ Memory cost? _______
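The recurrence "new best alignment = best previous alignment + align(i, j)" can be made concrete with a small global-alignment sketch in the style of Needleman-Wunsch. The scoring used here (match +1, mismatch -1, gap -2) is an assumption chosen for simplicity and stands in for the BLOSUM sub-matrix on the slide; the function name `global_align` is also illustrative.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for a global alignment of a and b."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + s,   # align a[i-1] with b[j-1]
                              score[i - 1][j] + gap,     # gap in b
                              score[i][j - 1] + gap)     # gap in a
    # Trace back from the bottom-right corner to recover one best alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        s = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + s:
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))


result = global_align("ACGT", "AGT")
print(result[0])
print(result[1])
print(result[2])
```

Each table entry depends only on its three neighbors (upper-left, upper, left), so filling the table takes on the order of n × m steps for sequences of lengths n and m, and this version also stores the full n × m table to allow the traceback.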