MAT 4830 Mathematical Modeling 4.1 Background on DNA
PNW MAA Meeting
The 7th annual Northwest Undergraduate Mathematics Symposium (NUMS) will be held in conjunction with the 2015 Spring meeting of the PNW MAA Section at the University of Washington Tacoma on April 10-11.
http://www.tacoma.uw.edu/maa-nums
Remarks
No handouts.
Need to read the textbook for more info.
All individual HW for this chapter (4.1-4.6).
Techniques learned can be applied to other applications.
Disclaimer
This is not a biology class!
I do not know much biology.
We will ignore all possible theological questions and implications.
Our Learning Philosophy
Acquire the minimum background needed to start the analysis/modeling.
Ignore the complexity of the biochemical process.
Our Learning Philosophy
Concentrate on certain mathematical problems.
Very interesting problems once we get through the terminology.
Bases
4 types of smaller molecules:
Purines: Adenine (A), Guanine (G)
Pyrimidines: Cytosine (C), Thymine (T)
Base Substitution
A common form of mutation. A base is replaced by another base.
A A T C G C
G A T G G C
Base Substitution
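Transition: a purine replaced by a purine, or a pyrimidine replaced by a pyrimidine.
Transversion: a purine replaced by a pyrimidine, or a pyrimidine replaced by a purine.
A A T C G C
G A T G G C
Position 1 (A to G) is a transition; position 4 (C to G) is a transversion.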
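The classification rule can be sketched in a few lines of Python. This is only an illustration: the function name `classify_substitution` and the variable names are assumptions, not something from the course materials.

```python
# Base classes as listed on the "Bases" slide.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def classify_substitution(old, new):
    """Return 'transition', 'transversion', or 'none' for one site."""
    if old == new:
        return "none"
    # Same chemical class on both sides means a transition.
    same_class = {old, new} <= PURINES or {old, new} <= PYRIMIDINES
    return "transition" if same_class else "transversion"

# The two six-base sequences from the slide.
before = "AATCGC"
after = "GATGGC"

for position, (old, new) in enumerate(zip(before, after), start=1):
    kind = classify_substitution(old, new)
    if kind != "none":
        print(position, old, "->", new, kind)
```

Run on the slide's sequences, this reports a transition at position 1 and a transversion at position 4.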
Example
S0 : Ancestral sequence
S1 : Descendant of S0
S2 : Descendant of S1
S0 : ATGTCGCCTGATAATGCC
S1 : ATGCCGCTTGACAATGCC
S2 : ATGCCGCGTGATAATGCC
Observed mutations (comparing S0 with S2): 2
Actual mutations: 5 (3 from S0 to S1, then 2 from S1 to S2; some are hidden mutations)
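The two counts can be checked with a short Python sketch. The helper name `count_differences` is an assumption made for illustration. The "actual" count here assumes at most one substitution per site in each step, which is all the three sequences can show.

```python
def count_differences(seq_a, seq_b):
    """Count the sites at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))

S0 = "ATGTCGCCTGATAATGCC"  # ancestral sequence
S1 = "ATGCCGCTTGACAATGCC"  # descendant of S0
S2 = "ATGCCGCGTGATAATGCC"  # descendant of S1

# Observed: compare only the first and last sequences.
observed = count_differences(S0, S2)

# Actual: add up the changes along the line of descent.
actual = count_differences(S0, S1) + count_differences(S1, S2)

hidden = actual - observed
print("observed:", observed, "actual:", actual, "hidden:", hidden)
```

The gap between the two counts comes from sites that changed more than once, for example a site that mutated and then mutated back.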
What Do We Want?
Compare the initial and final DNA sequences
Develop mathematical models to reconstruct the number of mutations likely to have occurred.
Reality…
Seldom do we actually have an ancestral DNA sequence, much less several from different times along a line of descent.
Instead, we have sequences from several currently living descendants, but no direct information about any of their ancestors.
Reality…
When we compare two sequences, and imagine the mutation process that produced them, the sequence of their most recent common ancestor, from which they both evolved, is unknown.
Orthologous Sequences
Given a DNA sequence from some organism, there are good search algorithms to find similar sequences for other organisms in DNA databases.
If a gene has been identified for one organism, we can quickly locate likely candidate sequences for similar genes in related organisms.
Orthologous Sequences
If the genes have similar function, we can reasonably assume the sequences are descended from a common ancestral sequence (orthologous).
Definition
Given two events $E$ and $F$, the conditional probability of $F$ given $E$ is defined by
$$P(F \mid E) = \frac{P(F \cap E)}{P(E)}$$
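A small numerical illustration of the definition, as a sketch only. The fair six-sided die and the events chosen here are assumptions made for this example and are not part of the course material.

```python
from fractions import Fraction

# Sample space: one roll of a fair six-sided die (illustrative assumption).
outcomes = set(range(1, 7))
E = {n for n in outcomes if n % 2 == 0}  # the roll is even
F = {n for n in outcomes if n > 3}       # the roll is greater than 3

def probability(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(outcomes))

# P(F | E) = P(F and E) / P(E)
p_f_given_e = probability(F & E) / probability(E)
print(p_f_given_e)  # 2/3
```

Using `Fraction` keeps the result exact, which matches the fractions that appear in the hand computation.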
Example
Suppose the 40-base ancestral and descendant DNA sequences are
S0 : ACTTGTCGGATGATCAGCGGTCCATGCACCTGACAACGGT
S1 : ACATGTTGCTTGACGACAGGTCCATGCGCCTGAGAACGGC
Example
Count the frequency of base substitutions.
S1 \ S0    A    G    C    T
A          7    0    1    1
G          1    9    2    0
C          0    2    7    2
T          1    0    1    6
(Rows: base in S1. Columns: base in S0.)
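The frequency table can be rebuilt from the two sequences with a short Python sketch. The names `BASES`, `pair_counts`, and `table` are illustrative assumptions. Rows are indexed by the base in S1 and columns by the base in S0, in the order A, G, C, T.

```python
from collections import Counter

BASES = "AGCT"  # row and column order used in the table

S0 = "ACTTGTCGGATGATCAGCGGTCCATGCACCTGACAACGGT"  # ancestral
S1 = "ACATGTTGCTTGACGACAGGTCCATGCGCCTGAGAACGGC"  # descendant

# Count each (descendant base, ancestral base) pair, site by site.
pair_counts = Counter(zip(S1, S0))

# table[r][c]: sites with base BASES[r] in S1 and base BASES[c] in S0.
table = [[pair_counts[(i, j)] for j in BASES] for i in BASES]

print("S1\\S0  " + "  ".join(BASES))
for base, row in zip(BASES, table):
    print(base + "      " + "  ".join(str(n) for n in row))
```

A `Counter` returns 0 for pairs that never occur, so missing substitutions need no special handling.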
Example
We can estimate
$P(S_1 = i \mid S_0 = j)$
$P(S_1 = A \mid S_0 = A) = 7/9$
$P(S_1 = G \mid S_0 = A) = 1/9$
$P(S_1 = C \mid S_0 = A) = 0$
$P(S_1 = T \mid S_0 = A) = 1/9$
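These four estimates are the counts for ancestral base A, each divided by the number of A's in S0. A minimal Python sketch follows. The counts are copied from the A column of the frequency table, and the variable names are illustrative assumptions.

```python
from fractions import Fraction

# Counts of each descendant base at sites where S0 has A
# (the A column of the frequency table).
counts_given_A = {"A": 7, "G": 1, "C": 0, "T": 1}

# Number of sites where S0 has A.
total = sum(counts_given_A.values())

# Estimate P(S1 = i | S0 = A) as count / total.
estimates = {base: Fraction(n, total) for base, n in counts_given_A.items()}

for base, p in estimates.items():
    print(f"P(S1 = {base} | S0 = A) = {p}")
```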
Example
Q1: What is the sum of the 16 numbers in the table? Why?
Example
Q2: What is the meaning of a row sum in the table?
Example
We can form a table of conditional probabilities
S1 \ S0    A      G      C      T
A          7/9    0      1/11   1/9
G          1/9    9/11   2/11   0
C          0      2/11   7/11   2/9
T          1/9    0      1/11   6/9
Each entry is $P(S_1 = i \mid S_0 = j)$, with row $i$ the base in $S_1$ and column $j$ the base in $S_0$.
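The whole table of conditional probabilities can be produced from the frequency table by dividing each entry by its column total. The sketch below hard-codes the counts from the frequency table. The names `counts`, `column_totals`, and `conditional` are illustrative assumptions.

```python
from fractions import Fraction

BASES = "AGCT"  # row order (S1) and column order (S0)

# Frequency table: rows are the base in S1, columns the base in S0.
counts = [
    [7, 0, 1, 1],
    [1, 9, 2, 0],
    [0, 2, 7, 2],
    [1, 0, 1, 6],
]

# Number of sites in S0 with each base (one total per column).
column_totals = [sum(row[j] for row in counts) for j in range(len(BASES))]

# conditional[r][c] estimates P(S1 = BASES[r] | S0 = BASES[c]).
conditional = [
    [Fraction(row[j], column_totals[j]) for j in range(len(BASES))]
    for row in counts
]

print("S1\\S0 " + "".join(f"{b:>6}" for b in BASES))
for base, row in zip(BASES, conditional):
    # Fraction reduces automatically, so 6/9 is printed as 2/3.
    print(f"{base:<6}" + "".join(f"{str(p):>6}" for p in row))
```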
Example
Q3: What is the sum of the entries in any column of this new table? Why?