Algorithms for Biological Sequence Analysis
Kun-Mao Chao (趙坤茂)
Department of Computer Science and Information Engineering
National Taiwan University, Taiwan
Date: September 14, 2009
WWW: http://www.csie.ntu.edu.tw/~kmchao
About this course
• Course: Algorithms for biological sequence analysis
• Some basic knowledge of algorithm development and program design is required.
• We will focus on sequence-related algorithmic problems. Genomic sequences are our main target.
– The oldest language
– The largest program
• Fall semester, 2009
• 13:20 – 16:20 Monday, 107 CSIE Building
• 3 credits
• Web site: http://www.csie.ntu.edu.tw/~kmchao/seq09fall
Coursework:
• Homework assignments and class participation (15%)
• Two midterm exams (60%; 30% each):
– October 26, 2009 (tentative)
– December 7, 2009 (tentative)
• Oral presentation of selected papers (25%)
Outline
Part I: Sequence Homology
– Introduction to basic algorithmic strategies
– Pairwise sequence alignment
– Multiple sequence alignment
– Chaining algorithms for genomic sequence analysis
– Suboptimal alignment
– Comparative genomics
– Compressed / constrained sequence comparison
– Hidden Markov models (the Viterbi algorithm, etc.)
Part II: Sequence Composition
– Maximum-sum and maximum-density segments
– SNP and haplotype data analysis
– Approximate gapped palindromes
– Genome annotation
– Other advanced topics
A Brief History of Genetics
• 1859 Charles Darwin published “The Origin of Species.”
• 1865 Genes are particulate factors. [Gregor Mendel]
• 1869 Discovery of nucleic acid [Friedrich Miescher]
• 1903 Chromosomes are hereditary units. [Walter Sutton]
• 1910 Genes lie on chromosomes. [Thomas Hunt Morgan]
• 1913 Chromosomes are linear arrays of genes. [Alfred Sturtevant]
A Brief History of Genetics (cont’d)
• 1931 Recombination occurs by crossing over. [Harriet Creighton and Barbara McClintock]
• 1944 DNA is the genetic material. [Oswald Avery, Colin MacLeod and Maclyn McCarty]
• 1953 DNA is a double helix. [James Watson and Francis Crick]
• 1961-1967 Genetic code is triplet. [Marshall Nirenberg, Har Gobind Khorana, Sydney Brenner & Francis Crick]
• 1977 DNA was sequenced for the first time. [Fred Sanger, Walter Gilbert, and Allan Maxam]
• 21st century: Many genomes completely sequenced
Milestones of Bioinformatics
• 1962 Pauling's theory of molecular evolution
• 1965 Margaret Dayhoff's Atlas of Protein Sequences
• 1970 Needleman-Wunsch algorithm
• 1977 DNA sequencing and software to analyze it (Staden)
• 1981 Smith-Waterman algorithm developed
• 1981 The concept of a sequence motif (Doolittle)
• 1982 GenBank Release 3 made public
• 1982 Phage lambda genome sequenced
Milestones of Bioinformatics (cont’d)
• 1983 Sequence database searching algorithm (Wilbur-Lipman)
• 1985 FASTP/FASTN: fast sequence similarity searching
• 1988 National Center for Biotechnology Information (NCBI) created at NIH/NLM
• 1988 EMBnet network for database distribution
• 1990 BLAST: fast sequence similarity searching
• 1991 EST: expressed sequence tag sequencing
• 1993 Sanger Centre, Hinxton, UK
• 1994 EMBL European Bioinformatics Institute, Hinxton, UK
Milestones of Bioinformatics (cont’d)
• 1995 First bacterial genomes completely sequenced
• 1996 Yeast genome completely sequenced
• 1997 PSI-BLAST
• 1998 Worm (multicellular) genome completely sequenced
• 1999 Fly genome completely sequenced
Milestones of Bioinformatics (cont’d)
• Human Genome Project (1990-2003)
• Mouse 2002
• Rat 2004
• Chimpanzee 2005
• Completed Genomes
Chimpanzee Genome
The Primate Family Tree
Source: Nature
A New Book
Published by Springer in 2009 (ISBN 978-1848003194)
Sequence Comparison: Theory and Methods
by Kun-Mao Chao and Louxin Zhang