Post on 19-Dec-2015
![Page 1: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/1.jpg)
Biology and
Bioinformatics
Gabor T. Marth
Department of Biology, Boston College
marth@bc.edu
BI820 – Seminar in Quantitative and Computational Problems in Genomics
![Page 2: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/2.jpg)
The animal cell
![Page 3: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/3.jpg)
DNA – the carrier of the genetic code
![Page 4: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/4.jpg)
DNA organization – chromosomes
![Page 5: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/5.jpg)
Translation of genetic information
![Page 6: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/6.jpg)
DNA sequencing informatics
![Page 7: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/7.jpg)
DNA organization
![Page 8: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/8.jpg)
Genome annotation
![Page 9: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/9.jpg)
De novo gene prediction
![Page 10: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/10.jpg)
Similarity-based gene prediction
![Page 11: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/11.jpg)
Gene localization
![Page 12: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/12.jpg)
Genetic mapping
![Page 13: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/13.jpg)
Gene function
![Page 14: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/14.jpg)
Expression analysis
![Page 15: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/15.jpg)
Protein structure
![Page 16: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/16.jpg)
RNA structure
![Page 17: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/17.jpg)
Protein structure prediction
![Page 18: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/18.jpg)
RNA structure prediction
![Page 19: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/19.jpg)
DNA evolution
![Page 20: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/20.jpg)
Evolution of chromosome organization
![Page 21: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/21.jpg)
Evolution of gene structure
![Page 22: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/22.jpg)
Evolution of DNA sequence
![Page 23: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/23.jpg)
Comparative genomics
![Page 24: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/24.jpg)
Phylogenetics
![Page 25: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/25.jpg)
Mechanisms of molecular evolution
![Page 26: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/26.jpg)
Sequence variations
• the Human Genome Project produced a reference genome sequence that is about 99.9% identical to the genome of any individual human
• sequence variations make our genetic makeup unique
SNP
• single-nucleotide polymorphisms (SNPs) are the most abundant type, but other types of variation exist and are important
![Page 27: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/27.jpg)
Why do we care about variations?
phenotypic differences
demographic history
inherited diseases
![Page 28: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/28.jpg)
How do we find polymorphisms?
• look at multiple sequences from the same genome region
• diverse sequence resources can be used: expressed sequence tags (EST), whole-genome shotgun reads (WGS), bacterial artificial chromosome clones (BAC)
• diversion: sequencing informatics
![Page 29: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/29.jpg)
SNP discovery – Methods
Sequence clustering
Cluster refinement
Multiple alignment
SNP detection
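The last step of the pipeline above (SNP detection on a multiple alignment) can be sketched as a column-by-column scan. This is a minimal illustration, not the tool used in the course: the function name `find_candidate_snps`, the `min_minor_count` parameter, and the toy sequences are assumptions, and real pipelines also weigh base quality values.

```python
from collections import Counter

def find_candidate_snps(aligned, min_minor_count=1):
    """Return (column, allele_counts) for alignment columns with >1 base.

    `aligned` is a list of equal-length strings (gaps as '-').
    `min_minor_count` is a hypothetical filter on the second most
    common allele; it is not taken from the slides.
    """
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("aligned sequences must have equal length")
    snps = []
    for col, bases in enumerate(zip(*aligned)):
        # Count only unambiguous bases; gaps and N are ignored.
        counts = Counter(b.upper() for b in bases if b.upper() in "ACGT")
        if len(counts) < 2:
            continue
        minor = sorted(counts.values())[-2]
        if minor >= min_minor_count:
            snps.append((col, dict(counts)))
    return snps

# Toy alignment: three reads that differ at one column (0-based index 11).
reads = [
    "ACCTAGGAGACTGAACTTACTG",
    "ACCTAGGAGACCGAACTTACTG",
    "ACCTAGGAGACCGAACTTACTG",
]
print(find_candidate_snps(reads))  # [(11, {'T': 1, 'C': 2})]
```

Raising `min_minor_count` to 2 would drop this site, which is one simple way to trade sensitivity for fewer sequencing-error calls.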
![Page 30: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/30.jpg)
SNP discovery – Computer tools
![Page 31: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/31.jpg)
SNP discovery – Mining Projects
>CloneX ACGTTGCAACGTGTCAATGCTGCA
>CloneY ACGTTGCAACGTGTCAATGCTGCA
ACCTAGGAGACTGAACTTACTG
ACCTAGGAGACCGAACTTACTG
~30,000 clones
25,901 clones (7,122 finished, 18,779 draft with base quality values)
21,020 clone overlaps (124,356 fragment overlaps)
507,152 high-quality candidate SNPs (validation rate 83-96%)
Marth et al., Nature Genetics 2001
![Page 32: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/32.jpg)
SNP databases and characteristics
• access to variation data
• SNP properties
• reliability of information
• characterizing known polymorphic sites in sample collections – genotyping
![Page 33: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/33.jpg)
Where do variations come from?
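• sequence variations are the result of mutation events
• mutations are propagated down through generations
[Figure: genealogy descending from the most recent common ancestor (MRCA); sequences shown are TAAAAAT and the mutated TAACAAT; sampled sequences: TAAAAAT TAAAAAT TAACAAT TAACAAT TAACAAT]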
![Page 34: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/34.jpg)
Mutation rate
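[Figure: two genealogies, each descending from an MRCA. First pair of sequences: accgttatgtaga, accgctatgtaga (1 difference). Second pair: actgttatgtaga, accgctatataga (3 differences)]
• higher mutation rate (µ) gives rise to more SNPs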
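To make the dependence on µ concrete, here is a small sketch of the expected number of SNPs (segregating sites) in a sample. It assumes the standard neutral model with constant population size and infinite sites, which the slide does not state; under that model the expectation is 4Nµ per site, times the sequence length, times the harmonic sum over 1..n-1. The function name and the example parameter values are illustrative only.

```python
def expected_segregating_sites(n_samples, n_eff, mu_per_site, length):
    """Expected number of SNPs among n_samples sequences.

    Assumes a constant-size, neutrally evolving diploid population
    (an assumption, not taken from the slides).
    """
    theta = 4 * n_eff * mu_per_site * length         # scaled mutation rate
    a_n = sum(1.0 / i for i in range(1, n_samples))  # harmonic sum 1..n-1
    return theta * a_n

# Illustrative values: 10 sequences, N = 10,000, 1 Mb region.
low = expected_segregating_sites(10, 10_000, 1e-8, 1_000_000)
high = expected_segregating_sites(10, 10_000, 2e-8, 1_000_000)
print(low, high)  # doubling mu doubles the expected SNP count
```

The same formula shows why effective population size matters as much as µ: the two enter only through their product.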
![Page 35: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/35.jpg)
Recombination
[Figure: recombination diagram; the sequence accgttatgtaga is repeated for every lineage shown]
![Page 36: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/36.jpg)
Demographic history
small (effective) population size N
large (effective) population size N
• different world populations have varying long-term effective population sizes (e.g. African N is larger than European)
![Page 37: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/37.jpg)
Modeling
[Figure: population size history (past to present) for four demographic models: stationary, expansion, collapse, bottleneck. For each model, two panels are shown: MD (simulation) and AFS (direct form)]
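For the stationary model only, the expected spectrum has a simple closed form, sketched below. This assumes that AFS stands for allele frequency spectrum and that the standard neutral, constant-size model applies, where the expected number of sites with a derived allele seen i times in n samples is proportional to 1/i. The function name is hypothetical, and the expansion, collapse, and bottleneck cases from the slide are not reproduced here.

```python
def neutral_folded_afs(n_samples):
    """Expected fraction of SNPs at each minor allele count 1..n//2.

    Stationary (constant-size) neutral model; this is an illustrative
    assumption, not the "direct form" computation named on the slide.
    """
    raw = []
    for i in range(1, n_samples // 2 + 1):
        # Minor allele count i arises from derived counts i and n - i.
        weight = 1.0 / i + 1.0 / (n_samples - i)
        if i == n_samples - i:
            weight /= 2.0  # both classes coincide when i = n/2
        raw.append(weight)
    total = sum(raw)
    return [w / total for w in raw]

for count, fraction in enumerate(neutral_folded_afs(20), start=1):
    print(count, round(fraction, 3))
```

An observed spectrum with an excess of rare alleles relative to these fractions is the usual hint of population expansion; a deficit points toward collapse or a bottleneck.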
![Page 38: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/38.jpg)
Ancestral inference
[Figure: plot over minor allele count (x-axis, 1 to 10; y-axis, 0 to 0.15); labels: "bottleneck", "modest but uninterrupted expansion"]
![Page 39: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/39.jpg)
The signatures of selection
• selective mutations influence the genealogy itself; in the case of neutral mutations the processes of mutation and genealogy are decoupled
![Page 40: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/40.jpg)
Association and haplotype structure
[Figure: plot with axes labeled r2, recombination fraction, and distance (kb); series: European, Asian, African American]
“linkage disequilibrium”
“haplotype blocks”
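Linkage disequilibrium between two SNPs is commonly summarized by r2, the squared correlation between their alleles. A minimal sketch follows, assuming phased haplotypes coded as 0/1 at two biallelic sites; the function name and the input format are illustrative choices.

```python
def ld_r2(haplotypes):
    """r^2 between two biallelic sites.

    `haplotypes` is a list of (allele_at_site_1, allele_at_site_2)
    tuples with alleles coded 0 or 1 (a hypothetical input format).
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n    # allele-1 frequency, site 1
    p_b = sum(b for _, b in haplotypes) / n    # allele-1 frequency, site 2
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic in the sample")
    d = p_ab - p_a * p_b                       # disequilibrium coefficient D
    return d * d / denom

print(ld_r2([(0, 0), (0, 0), (1, 1), (1, 1)]))  # complete association
print(ld_r2([(0, 0), (0, 1), (1, 0), (1, 1)]))  # no association
```

Runs of neighboring SNPs with high pairwise r2 are what the slide refers to as haplotype blocks.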
![Page 41: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/41.jpg)
Computer simulations: the Coalescent
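As a rough sketch of what such a simulation does, the code below draws coalescence times for a sample under the basic (Kingman) coalescent: with k lineages present, the waiting time to the next coalescence is exponential with rate k(k-1)/2, measuring time in units of 2N generations. Constant population size, no recombination, and the function name are all assumptions made for illustration.

```python
import random

def simulate_coalescent_times(n_samples, rng):
    """Return (t_mrca, total_branch_length) for one simulated genealogy.

    Times are in units of 2N generations. Constant population size is
    assumed here; demographic models would rescale the waiting times.
    """
    t_mrca = 0.0
    total_length = 0.0
    for k in range(n_samples, 1, -1):
        rate = k * (k - 1) / 2.0       # number of lineage pairs
        wait = rng.expovariate(rate)   # time while k lineages persist
        t_mrca += wait
        total_length += k * wait       # k branches extend during this wait
    return t_mrca, total_length

rng = random.Random(1)
runs = [simulate_coalescent_times(10, rng) for _ in range(20000)]
mean_tmrca = sum(t for t, _ in runs) / len(runs)
mean_length = sum(length for _, length in runs) / len(runs)
print(mean_tmrca, mean_length)
```

Neutral mutations can then be dropped onto the genealogy in proportion to branch length, which is why the total branch length controls the expected number of SNPs in the sample.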
![Page 42: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/42.jpg)
Medical utility?
clinical phenotype
molecular markers
?
functional understanding
![Page 43: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/43.jpg)
Mapping disease-causing loci
genetic linkage
association between allele and phenotype
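A basic test for association between an allele and a phenotype compares allele counts in cases and controls with a chi-square test on the 2x2 table. The sketch below is one common formulation, not necessarily the one used in the course; the function name and the counts in the example are made up for illustration.

```python
import math

def allelic_association(case_counts, control_counts):
    """Chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Each argument is (count_of_allele_1, count_of_allele_2).
    Returns (chi2, p_value).
    """
    a, b = case_counts
    c, d = control_counts
    n = a + b + c + d
    margins = (a + b, c + d, a + c, b + d)
    if 0 in margins:
        raise ValueError("every row and column needs a nonzero total")
    chi2 = n * (a * d - b * c) ** 2 / math.prod(margins)
    # Survival function of the chi-square distribution with 1 d.f.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical counts: allele 1 is more frequent among cases.
chi2, p_value = allelic_association((60, 40), (40, 60))
print(chi2, p_value)
```

A small p-value flags the marker as associated with the phenotype, but the causal variant may be a nearby site in linkage disequilibrium with it.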
![Page 44: Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College marth@bc.edu BI820 – Seminar in Quantitative and Computational Problems](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d3f5503460f94a18efa/html5/thumbnails/44.jpg)
Forensic applications