![Page 1: Bioinformatics I Sequence Analysis - Purdue University · Consider how humans and mouse have diverged... •Mouse and human had a common ancestor about 80 MYA. •Evolution occurs](https://reader033.vdocuments.mx/reader033/viewer/2022041907/5e64d1e7d44c7d3a745d9a37/html5/thumbnails/1.jpg)
Bioinformatics I Sequence Analysis
![Page 2: Bioinformatics I Sequence Analysis - Purdue University · Consider how humans and mouse have diverged... •Mouse and human had a common ancestor about 80 MYA. •Evolution occurs](https://reader033.vdocuments.mx/reader033/viewer/2022041907/5e64d1e7d44c7d3a745d9a37/html5/thumbnails/2.jpg)
Prerequisites
•Comp Sci 1 and 2. Can you program?
•Molecular biology. Do you know the central dogma?
•Molecular biochemistry. What molecular functions can you name?
![Page 3: Bioinformatics I Sequence Analysis - Purdue University · Consider how humans and mouse have diverged... •Mouse and human had a common ancestor about 80 MYA. •Evolution occurs](https://reader033.vdocuments.mx/reader033/viewer/2022041907/5e64d1e7d44c7d3a745d9a37/html5/thumbnails/3.jpg)
You need this:
Z&B
![Page 4: Bioinformatics I Sequence Analysis - Purdue University · Consider how humans and mouse have diverged... •Mouse and human had a common ancestor about 80 MYA. •Evolution occurs](https://reader033.vdocuments.mx/reader033/viewer/2022041907/5e64d1e7d44c7d3a745d9a37/html5/thumbnails/4.jpg)
...and this.
www.geneious.com
![Page 5: Bioinformatics I Sequence Analysis - Purdue University · Consider how humans and mouse have diverged... •Mouse and human had a common ancestor about 80 MYA. •Evolution occurs](https://reader033.vdocuments.mx/reader033/viewer/2022041907/5e64d1e7d44c7d3a745d9a37/html5/thumbnails/5.jpg)
Chris, don’t forget to...
• Review the syllabus
• Set office hours
![Page 6: Bioinformatics I Sequence Analysis - Purdue University · Consider how humans and mouse have diverged... •Mouse and human had a common ancestor about 80 MYA. •Evolution occurs](https://reader033.vdocuments.mx/reader033/viewer/2022041907/5e64d1e7d44c7d3a745d9a37/html5/thumbnails/6.jpg)
Consider how human and mouse have diverged...
•Mouse and human had a common ancestor about 80 MYA.
•Evolution occurs by point mutations, insertions, deletions and rearrangements.
•Individual mouse genes and human genes are 80 to 95% identical.
•However, gene locations are scrambled! (Maybe gene location matters more than its sequence???)
actual content starts here
![Page 7: Bioinformatics I Sequence Analysis - Purdue University · Consider how humans and mouse have diverged... •Mouse and human had a common ancestor about 80 MYA. •Evolution occurs](https://reader033.vdocuments.mx/reader033/viewer/2022041907/5e64d1e7d44c7d3a745d9a37/html5/thumbnails/7.jpg)
Inter-chromosomal rearrangement
![Page 8: Bioinformatics I Sequence Analysis - Purdue University · Consider how humans and mouse have diverged... •Mouse and human had a common ancestor about 80 MYA. •Evolution occurs](https://reader033.vdocuments.mx/reader033/viewer/2022041907/5e64d1e7d44c7d3a745d9a37/html5/thumbnails/8.jpg)
Mouse chromosomes re-constructed from human
![Page 9: Bioinformatics I Sequence Analysis - Purdue University · Consider how humans and mouse have diverged... •Mouse and human had a common ancestor about 80 MYA. •Evolution occurs](https://reader033.vdocuments.mx/reader033/viewer/2022041907/5e64d1e7d44c7d3a745d9a37/html5/thumbnails/9.jpg)
Large-scale evolutionary changes in chromosomes
• Duplication
“...the most important evolutionary force since the emergence of the universal common ancestor.” -- Susumu Ohno
![Page 10: Bioinformatics I Sequence Analysis - Purdue University · Consider how humans and mouse have diverged... •Mouse and human had a common ancestor about 80 MYA. •Evolution occurs](https://reader033.vdocuments.mx/reader033/viewer/2022041907/5e64d1e7d44c7d3a745d9a37/html5/thumbnails/10.jpg)
Inversions
•A,B,C,D are genes or syntenic groups.
5' A B C D 3'  --->  5' A C B D 3'
Messages in C,B region are now read from the opposite (-) strand.
![Page 11: Bioinformatics I Sequence Analysis - Purdue University · Consider how humans and mouse have diverged... •Mouse and human had a common ancestor about 80 MYA. •Evolution occurs](https://reader033.vdocuments.mx/reader033/viewer/2022041907/5e64d1e7d44c7d3a745d9a37/html5/thumbnails/11.jpg)
Transposition
A syntenic group appears on the opposite strand, different location.
![Page 12: Bioinformatics I Sequence Analysis - Purdue University · Consider how humans and mouse have diverged... •Mouse and human had a common ancestor about 80 MYA. •Evolution occurs](https://reader033.vdocuments.mx/reader033/viewer/2022041907/5e64d1e7d44c7d3a745d9a37/html5/thumbnails/12.jpg)
Syntenic group in bacteria: Trp operon
![Page 13: Bioinformatics I Sequence Analysis - Purdue University · Consider how humans and mouse have diverged... •Mouse and human had a common ancestor about 80 MYA. •Evolution occurs](https://reader033.vdocuments.mx/reader033/viewer/2022041907/5e64d1e7d44c7d3a745d9a37/html5/thumbnails/13.jpg)
One way inversion can happen in Eukaryotes
Remember Meiosis?
![Page 14: Bioinformatics I Sequence Analysis - Purdue University · Consider how humans and mouse have diverged... •Mouse and human had a common ancestor about 80 MYA. •Evolution occurs](https://reader033.vdocuments.mx/reader033/viewer/2022041907/5e64d1e7d44c7d3a745d9a37/html5/thumbnails/14.jpg)
Normal synapsis does not change the order of genes, just swaps alleles.
[Figure: Normal prophase 1. Non-sister chromatids, one carrying alleles A B C D and the other a b c d, synapse at the site of homologous recombination. The recombinant chromatids are labelled b, c, D, A and B, C, d, a.]
![Page 15: Bioinformatics I Sequence Analysis - Purdue University · Consider how humans and mouse have diverged... •Mouse and human had a common ancestor about 80 MYA. •Evolution occurs](https://reader033.vdocuments.mx/reader033/viewer/2022041907/5e64d1e7d44c7d3a745d9a37/html5/thumbnails/15.jpg)
Illegitimate synapsis changes the order and direction of genes, and swaps alleles.
[Figure: Abnormal prophase 1. An illegitimate synapse forms between the chromatids carrying A B C D and a b c d. The recombinant chromatids are labelled c, b, D, A and C, B, d, a.]
INVERSION
![Page 16: Bioinformatics I Sequence Analysis - Purdue University · Consider how humans and mouse have diverged... •Mouse and human had a common ancestor about 80 MYA. •Evolution occurs](https://reader033.vdocuments.mx/reader033/viewer/2022041907/5e64d1e7d44c7d3a745d9a37/html5/thumbnails/16.jpg)
Edit distance as a series of mutations
ACGGCGTGCTTTAGAACATAG
AAGGCGTGCTTTAGAACATAG
AAGGCGTGCGTTAGAACATAG
ACGGCGTGCGTAAGGACAATAG
• • •
time
distance = 4
homoplasy
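The distance between the first and last sequences on this slide can be checked with a standard dynamic-programming edit distance (substitutions, insertions, and deletions each cost 1). This is a minimal sketch, not necessarily the method used in the course; the function name `edit_distance` is chosen here for illustration.

```python
def edit_distance(s: str, t: str) -> int:
    """Levenshtein distance: minimum number of single-base
    substitutions, insertions, and deletions turning s into t."""
    prev = list(range(len(t) + 1))  # distances from the empty prefix of s
    for i, a in enumerate(s, start=1):
        curr = [i]  # deleting the first i bases of s
        for j, b in enumerate(t, start=1):
            curr.append(min(
                prev[j] + 1,             # delete a
                curr[j - 1] + 1,         # insert b
                prev[j - 1] + (a != b),  # match or substitute
            ))
        prev = curr
    return prev[-1]


first = "ACGGCGTGCTTTAGAACATAG"
last = "ACGGCGTGCGTAAGGACAATAG"
print(edit_distance(first, last))
```

The C at position 2 mutates to A and later back to C, so that pair of events leaves no trace in the final comparison. This is the homoplasy noted on the slide: the measured distance can undercount the mutations that actually occurred.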
![Page 17: Bioinformatics I Sequence Analysis - Purdue University · Consider how humans and mouse have diverged... •Mouse and human had a common ancestor about 80 MYA. •Evolution occurs](https://reader033.vdocuments.mx/reader033/viewer/2022041907/5e64d1e7d44c7d3a745d9a37/html5/thumbnails/17.jpg)
Edit distance (genomic) as a series of reversals
123456789
432156789
436512789
435612789
435698721
123498765
123789465
123789645
873219645
ancestral mammal
human mouse
![Page 18: Bioinformatics I Sequence Analysis - Purdue University · Consider how humans and mouse have diverged... •Mouse and human had a common ancestor about 80 MYA. •Evolution occurs](https://reader033.vdocuments.mx/reader033/viewer/2022041907/5e64d1e7d44c7d3a745d9a37/html5/thumbnails/18.jpg)
rug rat mouse rat
![Page 19: Bioinformatics I Sequence Analysis - Purdue University · Consider how humans and mouse have diverged... •Mouse and human had a common ancestor about 80 MYA. •Evolution occurs](https://reader033.vdocuments.mx/reader033/viewer/2022041907/5e64d1e7d44c7d3a745d9a37/html5/thumbnails/19.jpg)
The Pancake Flipping Problem
A sloppy cook at a pancake diner makes pancakes of all different sizes and stacks them haphazardly.
The waiter likes the pancakes to be stacked with the largest on the bottom and the smallest on top. On the way to the table, using only one hand with a spatula, he flips the pancakes until they are arranged by size, largest on bottom, smallest on top.
•What is the algorithm for flipping?
•What is the algorithm for finding the fewest flips?
•Same problem, pancakes burned on one side.
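One simple answer to the first question is sketched below: repeatedly bring the largest unsorted pancake to the top, then flip it down into place. It is an illustration only and makes no attempt to find the fewest flips. The list is read top to bottom (index 0 is the top of the stack), and the names `flip` and `pancake_sort` are invented for this sketch.

```python
def flip(stack, k):
    """Reverse the top k pancakes (index 0 is the top of the stack)."""
    return stack[:k][::-1] + stack[k:]


def pancake_sort(stack):
    """Sort with prefix flips only; returns (sorted_stack, flip_sizes)."""
    stack = list(stack)
    flips = []
    for size in range(len(stack), 1, -1):
        # Position of the largest pancake among the top `size`.
        biggest = stack.index(max(stack[:size]))
        if biggest == size - 1:
            continue  # already in its final place
        if biggest != 0:
            stack = flip(stack, biggest + 1)  # bring it to the top
            flips.append(biggest + 1)
        stack = flip(stack, size)  # flip it down to position size - 1
        flips.append(size)
    return stack, flips


result, flips = pancake_sort([6, 4, 2, 3, 1, 5])
print(result, len(flips))
```

Each pancake needs at most two flips under this strategy, so the total never exceeds roughly twice the stack height; finding the true minimum is the harder second question.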
![Page 20: Bioinformatics I Sequence Analysis - Purdue University · Consider how humans and mouse have diverged... •Mouse and human had a common ancestor about 80 MYA. •Evolution occurs](https://reader033.vdocuments.mx/reader033/viewer/2022041907/5e64d1e7d44c7d3a745d9a37/html5/thumbnails/20.jpg)
In class exercise:
Given the arrangement below, flip the pancakes until they are in order. How many flips? (You can order the numbers instead of the pancakes.)
642315 ---> ... ---> 123456
![Page 21: Bioinformatics I Sequence Analysis - Purdue University · Consider how humans and mouse have diverged... •Mouse and human had a common ancestor about 80 MYA. •Evolution occurs](https://reader033.vdocuments.mx/reader033/viewer/2022041907/5e64d1e7d44c7d3a745d9a37/html5/thumbnails/21.jpg)
In class exercise: work in pairs
•Write detailed instructions on how to stack six pancakes in order by flipping. The instructions should not depend on the starting order.
•Give your instructions to your partner. Follow your partner's written instructions to stack the following six "pancakes" in order: 125436 ---> ... ---> ... ---> 123456
•On the board: Convert these instructions to pseudocode.
![Page 22: Bioinformatics I Sequence Analysis - Purdue University · Consider how humans and mouse have diverged... •Mouse and human had a common ancestor about 80 MYA. •Evolution occurs](https://reader033.vdocuments.mx/reader033/viewer/2022041907/5e64d1e7d44c7d3a745d9a37/html5/thumbnails/22.jpg)
Minimum inversions algorithm using maximal cycle decomposition
Vineet Bafna and Pavel A. Pevzner. Molecular Biology and Evolution 12 (2): 239. (1995)
01234567 ==> 03152647
![Page 23: Bioinformatics I Sequence Analysis - Purdue University · Consider how humans and mouse have diverged... •Mouse and human had a common ancestor about 80 MYA. •Evolution occurs](https://reader033.vdocuments.mx/reader033/viewer/2022041907/5e64d1e7d44c7d3a745d9a37/html5/thumbnails/23.jpg)
Reversals put genes on the opposite strand.
1 2 3 4 5 6 7 8 9
-4 -3 -2 -1 5 6 7 8 9
-4 -3 -6 -5 1 2 7 8 9
-4 -3 5 6 1 2 7 8 9
-4 -3 5 6 -9 -8 -7 -2 -1
1 2 3 4 -9 -8 -7 -6 -5
1 2 3 7 8 9 -4 -6 -5
1 2 3 7 8 9 6 4 -5
-8 -7 -3 -2 -1 9 6 4 -5
ancestral mammal
human mouse
“-” indicates that the gene is on the reverse complement strand.
“pancakes burned on one side” problem
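A signed reversal can be written in a few lines: reverse a segment and negate every gene in it. The sketch below replays the first lineage on this slide. The function name `reversal` and the half-open interval convention `[i, j)` are choices made for this illustration.

```python
def reversal(genome, i, j):
    """Reverse genome[i:j] and flip the sign (strand) of each gene in it."""
    return genome[:i] + [-g for g in reversed(genome[i:j])] + genome[j:]


genome = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # ancestral order, all on the + strand
# Segments reversed at each step, as half-open index intervals.
for i, j in [(0, 4), (2, 6), (2, 4), (4, 9)]:
    genome = reversal(genome, i, j)
    print(" ".join(str(g) for g in genome))
```

Applying the same reversal twice restores the original order and signs. In the pancake picture, the sign is the burned side: a flip both reverses the order and turns each pancake over.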
![Page 24: Bioinformatics I Sequence Analysis - Purdue University · Consider how humans and mouse have diverged... •Mouse and human had a common ancestor about 80 MYA. •Evolution occurs](https://reader033.vdocuments.mx/reader033/viewer/2022041907/5e64d1e7d44c7d3a745d9a37/html5/thumbnails/24.jpg)
Plotting rearranged genomes
[Figure: plot of the gene order 1 2 3 4 5 6 7 8 9 on one axis against the rearranged order -4 -3 5 6 -9 -8 -7 -2 -1 on the other.]
![Page 25: Bioinformatics I Sequence Analysis - Purdue University · Consider how humans and mouse have diverged... •Mouse and human had a common ancestor about 80 MYA. •Evolution occurs](https://reader033.vdocuments.mx/reader033/viewer/2022041907/5e64d1e7d44c7d3a745d9a37/html5/thumbnails/25.jpg)
Alignment matrix
[Figure: 9 × 9 alignment matrix with ACTGAACCT along both axes; nine cells contain 1 and the rest contain 0.]
1=aligned (associated)
0=not aligned
•Boolean matrix.
•Only one “1” per row.
•Only one “1” per column.
•If A(i,j)==1, then A(m,n)=0 for all (m<i && n>j)
•If A(i,j)==1, then A(m,n)=0 for all (m>i && n<j)
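The constraints listed above can be checked mechanically. This is a minimal sketch, assuming the matrix is a list of rows of 0/1 values and reading "only one" as "at most one", so that unaligned positions are allowed. The function name `is_valid_alignment` is made up for illustration.

```python
def is_valid_alignment(matrix):
    """Check the alignment-matrix rules: at most one 1 per row and per
    column, and no 1 above-right or below-left of another 1."""
    # Collect the aligned cells in row-major order.
    ones = [(i, j)
            for i, row in enumerate(matrix)
            for j, value in enumerate(row) if value]
    rows = [i for i, _ in ones]
    cols = [j for _, j in ones]
    if len(set(rows)) != len(rows) or len(set(cols)) != len(cols):
        return False  # two 1s share a row or a column
    # Rows already increase, so the columns must increase too;
    # otherwise two aligned pairs would cross.
    return all(c1 < c2 for c1, c2 in zip(cols, cols[1:]))


identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
crossing = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]
print(is_valid_alignment(identity), is_valid_alignment(crossing))
```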
![Page 26: Bioinformatics I Sequence Analysis - Purdue University · Consider how humans and mouse have diverged... •Mouse and human had a common ancestor about 80 MYA. •Evolution occurs](https://reader033.vdocuments.mx/reader033/viewer/2022041907/5e64d1e7d44c7d3a745d9a37/html5/thumbnails/26.jpg)
Alignment matrix
[Figure: the alignment matrix with ACTGAACCT along both axes, drawn in a simplified form.]
easier to draw this way
![Page 27: Bioinformatics I Sequence Analysis - Purdue University · Consider how humans and mouse have diverged... •Mouse and human had a common ancestor about 80 MYA. •Evolution occurs](https://reader033.vdocuments.mx/reader033/viewer/2022041907/5e64d1e7d44c7d3a745d9a37/html5/thumbnails/27.jpg)
Alignment matrix
[Figure: the alignment matrix with ACTGAACCT along both axes, drawn in a further simplified form.]
easier still
![Page 28: Bioinformatics I Sequence Analysis - Purdue University · Consider how humans and mouse have diverged... •Mouse and human had a common ancestor about 80 MYA. •Evolution occurs](https://reader033.vdocuments.mx/reader033/viewer/2022041907/5e64d1e7d44c7d3a745d9a37/html5/thumbnails/28.jpg)
Alignment matrix
reverse alignment
(NOT biologically relevant!)
[Figure: alignment matrix with ACTGAACCT along one axis and its reverse, TCCAAGTCA, along the other.]
![Page 29: Bioinformatics I Sequence Analysis - Purdue University · Consider how humans and mouse have diverged... •Mouse and human had a common ancestor about 80 MYA. •Evolution occurs](https://reader033.vdocuments.mx/reader033/viewer/2022041907/5e64d1e7d44c7d3a745d9a37/html5/thumbnails/29.jpg)
Alignment matrix
reverse complement alignment
(yes, biologically relevant, for nucleotides!)
[Figure: alignment matrix with ACTGAACCT along one axis and its reverse complement, AGGTTCAGT, along the other.]
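The two axis sequences used on the reverse and reverse complement slides can be derived from ACTGAACCT in a couple of lines. This sketch assumes plain uppercase DNA with only A, C, G, and T; the helper name `reverse_complement` is illustrative.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Complement every base, then read the result backwards."""
    return seq.translate(COMPLEMENT)[::-1]


seq = "ACTGAACCT"
print(seq[::-1])                # plain reverse
print(reverse_complement(seq))  # sequence of the opposite strand, 5' to 3'
```

The plain reverse has no biological counterpart, whereas the reverse complement is what is read from the opposite strand of the same double helix.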
![Page 30: Bioinformatics I Sequence Analysis - Purdue University · Consider how humans and mouse have diverged... •Mouse and human had a common ancestor about 80 MYA. •Evolution occurs](https://reader033.vdocuments.mx/reader033/viewer/2022041907/5e64d1e7d44c7d3a745d9a37/html5/thumbnails/30.jpg)
Plotting rearranged genomes
Z&B p. 158
[Figure: plot of the gene order 1 2 3 4 5 6 7 8 9 on one axis against the rearranged order -4 -3 5 6 -9 -8 -7 -2 -1 on the other.]
Each number represents a long sequence.
Each negative number represents its reverse complement.
![Page 31: Bioinformatics I Sequence Analysis - Purdue University · Consider how humans and mouse have diverged... •Mouse and human had a common ancestor about 80 MYA. •Evolution occurs](https://reader033.vdocuments.mx/reader033/viewer/2022041907/5e64d1e7d44c7d3a745d9a37/html5/thumbnails/31.jpg)
Dot Plot
Each position in the matrix D[i,j] is either
dot, if A[i] == B[j]
blank, otherwise.
AAGACGTTTA GACGTACT
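The definition of D[i,j] translates directly into code. The sketch below assumes the two strings on this slide, AAGACGTTTA and GACGTACT, are the sequences A and B. It prints `*` for a dot and `.` for a blank so that the rows stay visibly aligned in plain text.

```python
def dot_plot(a: str, b: str):
    """D[i][j] is True when a[i] == b[j], i.e. where a dot is drawn."""
    return [[a[i] == b[j] for j in range(len(b))] for i in range(len(a))]


def render(a: str, b: str) -> str:
    """Text rendering: b across the top, a down the side."""
    lines = ["  " + " ".join(b)]
    for base, row in zip(a, dot_plot(a, b)):
        lines.append(base + " " + " ".join("*" if hit else "." for hit in row))
    return "\n".join(lines)


print(render("AAGACGTTTA", "GACGTACT"))
```

Shared substrings such as GACGT show up as unbroken runs of dots along a diagonal.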
![Page 32: Bioinformatics I Sequence Analysis - Purdue University · Consider how humans and mouse have diverged... •Mouse and human had a common ancestor about 80 MYA. •Evolution occurs](https://reader033.vdocuments.mx/reader033/viewer/2022041907/5e64d1e7d44c7d3a745d9a37/html5/thumbnails/32.jpg)
A more advanced dot plot
With thousands of bases, it is impractical to plot every dot in the matrix. Instead we look for stretches of sequence with few mismatches. If the number of mismatches is less than the cutoff, plot a dot or line.
AAGACGTTTA GACGTACT
Show all diagonals with at least 4 out of 5 matches.
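The filtering rule can be sketched as follows: slide a fixed-length diagonal over every start position and keep those with enough matches. The parameter names `window` and `stringency` are this sketch's labels for the diagonal length (5) and the minimum number of matches (4). The two strings on the slide are assumed to be the two sequences being compared.

```python
def windowed_dots(a: str, b: str, window: int = 5, stringency: int = 4):
    """Return start positions (i, j) of diagonals of length `window`
    in which at least `stringency` positions match."""
    hits = []
    for i in range(len(a) - window + 1):
        for j in range(len(b) - window + 1):
            matches = sum(a[i + k] == b[j + k] for k in range(window))
            if matches >= stringency:
                hits.append((i, j))
    return hits


for i, j in windowed_dots("AAGACGTTTA", "GACGTACT"):
    print(i, j)
```

This brute-force version compares every pair of windows, which is fine for short sequences; tools that handle whole genomes would need something faster.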
"window size" is the length of a diagonal, "stringency" is minimum number of matches in the window.
![Page 33: Bioinformatics I Sequence Analysis - Purdue University · Consider how humans and mouse have diverged... •Mouse and human had a common ancestor about 80 MYA. •Evolution occurs](https://reader033.vdocuments.mx/reader033/viewer/2022041907/5e64d1e7d44c7d3a745d9a37/html5/thumbnails/33.jpg)
Dot matrix with reverse complement
AAGACGTTTA GACGTACT
Base matches its complement
Reverse diagonal means inverse alignment
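To see why reverse-complement matches fall on a reverse diagonal, place a dot wherever a base in one sequence pairs with a base in the other, then follow runs in which i increases while j decreases. This is a minimal sketch, again assuming AAGACGTTTA and GACGTACT are the two sequences; `complement_dots` and `antidiagonal_run` are names invented here.

```python
PAIR = {"A": "T", "C": "G", "G": "C", "T": "A"}


def complement_dots(a: str, b: str):
    """Set of (i, j) where a[i] is the complement of b[j]."""
    return {(i, j)
            for i, x in enumerate(a)
            for j, y in enumerate(b)
            if PAIR.get(x) == y}


def antidiagonal_run(dots, i, j):
    """Length of the run of dots starting at (i, j) and continuing
    down-left: (i + 1, j - 1), (i + 2, j - 2), ..."""
    length = 0
    while (i + length, j - length) in dots:
        length += 1
    return length


dots = complement_dots("AAGACGTTTA", "GACGTACT")
print(antidiagonal_run(dots, 3, 4))
```

Both sequences contain ACGT, which is its own reverse complement, so the run starting at (3, 4) traces that segment along a reverse diagonal.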
![Page 34: Bioinformatics I Sequence Analysis - Purdue University · Consider how humans and mouse have diverged... •Mouse and human had a common ancestor about 80 MYA. •Evolution occurs](https://reader033.vdocuments.mx/reader033/viewer/2022041907/5e64d1e7d44c7d3a745d9a37/html5/thumbnails/34.jpg)
Install Geneious
• www.geneious.com
• Download “Free Trial”
• Install (follow instructions of installer program)
• Start Geneious Pro
– From top menu: Pro->Enter a license key...
– Licensee’s name= your name
– paste license key
![Page 35: Bioinformatics I Sequence Analysis - Purdue University · Consider how humans and mouse have diverged... •Mouse and human had a common ancestor about 80 MYA. •Evolution occurs](https://reader033.vdocuments.mx/reader033/viewer/2022041907/5e64d1e7d44c7d3a745d9a37/html5/thumbnails/35.jpg)
Explore Geneious
• Sample Documents/Nucleotide Documents/DCN gene
• DotPlot (self), high sensitivity. Find a WindowSize=50 with at least Threshold=80
• Select. Go to Sequence View.
• Extract.
• Select extracted sequences. Align. (use defaults)
• What is the Pairwise % Identity?