![Page 1: Computing the Tree of Life The University of Texas at Austin Department of Computer Sciences Tandy Warnow](https://reader035.vdocuments.mx/reader035/viewer/2022062716/56649dcf5503460f94ac3ae2/html5/thumbnails/1.jpg)
Computing the Tree of Life
The University of Texas at Austin
Department of Computer Sciences
Tandy Warnow
![Page 2: Computing the Tree of Life The University of Texas at Austin Department of Computer Sciences Tandy Warnow](https://reader035.vdocuments.mx/reader035/viewer/2022062716/56649dcf5503460f94ac3ae2/html5/thumbnails/2.jpg)
Phylogeny
From the Tree of Life Website, University of Arizona
[Figure: phylogeny with leaves Orangutan, Gorilla, Chimpanzee, Human]
![Page 3: Computing the Tree of Life The University of Texas at Austin Department of Computer Sciences Tandy Warnow](https://reader035.vdocuments.mx/reader035/viewer/2022062716/56649dcf5503460f94ac3ae2/html5/thumbnails/3.jpg)
DNA Sequence Evolution
[Figure: a rooted tree of DNA sequences evolving over time]
-3 mil yrs: AAGACTT
-2 mil yrs: TGGACTT, AAGGCCT
-1 mil yrs: AGGGCAT, TAGCCCT, AGCACTT
today: TAGCCCA, TAGACTT, AGCGCTT, AGCACAA, AGGGCAT
![Page 4: Computing the Tree of Life The University of Texas at Austin Department of Computer Sciences Tandy Warnow](https://reader035.vdocuments.mx/reader035/viewer/2022062716/56649dcf5503460f94ac3ae2/html5/thumbnails/4.jpg)
Molecular Phylogenetics
Input sequences: TAGCCCA, TAGACTT, TGCACAA, TGCGCTT, AGGGCAT
[Figure: a tree with leaves labeled U, V, W, X, Y]
(Tree is unrooted)
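One common first step when going from sequences to a tree is to measure how different each pair of sequences is. The sketch below computes pairwise Hamming distances (the number of mismatched positions) for the five sequences on this slide. It is only an illustration: the slide does not say which distance or reconstruction method was used, and the names `SEQUENCES`, `hamming`, and `distance_table` are hypothetical.

```python
from itertools import combinations

# The five sequences shown on the slide.
SEQUENCES = ["TAGCCCA", "TAGACTT", "TGCACAA", "TGCGCTT", "AGGGCAT"]


def hamming(s, t):
    """Count the positions at which two equal-length sequences differ."""
    if len(s) != len(t):
        raise ValueError("sequences must have equal length")
    return sum(a != b for a, b in zip(s, t))


def distance_table(seqs):
    """Map each unordered pair of sequences to its Hamming distance."""
    return {(s, t): hamming(s, t) for s, t in combinations(seqs, 2)}


if __name__ == "__main__":
    for (s, t), d in distance_table(SEQUENCES).items():
        print(s, t, d)
```

Raw mismatch counts ignore repeated substitutions at the same site, so practical distance-based methods usually apply a model-based correction before building a tree.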
![Page 5: Computing the Tree of Life The University of Texas at Austin Department of Computer Sciences Tandy Warnow](https://reader035.vdocuments.mx/reader035/viewer/2022062716/56649dcf5503460f94ac3ae2/html5/thumbnails/5.jpg)
Evolutionary trees and the pharmaceutical industry
• Big genome sequencing projects just produce data -- so what? Evolutionary history relates all organisms and genes, and evolutionary trees are used to make important biological discoveries.
• The pharmaceutical industry uses phylogenies for many applications, such as the development of influenza vaccines!
• Inaccuracies in the phylogenies lead to inaccurate predictions (e.g., vaccines that don’t work, drugs that don’t have the required properties). Current software isn’t accurate enough, or fast enough!
• This means $$$!
![Page 6: Computing the Tree of Life The University of Texas at Austin Department of Computer Sciences Tandy Warnow](https://reader035.vdocuments.mx/reader035/viewer/2022062716/56649dcf5503460f94ac3ae2/html5/thumbnails/6.jpg)
We are world leaders in research in Computational Phylogenetics
• “DCM-boosting” for phylogeny reconstruction - improves accuracy and speeds up heuristics for NP-hard problems (Warnow, UT-Austin)
• GRAPPA -- software for whole genome phylogeny (Moret, UNM)
• Visualization of large trees, and sets of trees (Amenta, UC Davis)
• Phylogenetic databases (Miranker, UT-Austin)
![Page 7: Computing the Tree of Life The University of Texas at Austin Department of Computer Sciences Tandy Warnow](https://reader035.vdocuments.mx/reader035/viewer/2022062716/56649dcf5503460f94ac3ae2/html5/thumbnails/7.jpg)
DCM-boosting phylogenetic reconstruction methods [Nakhleh et al. ISMB 2001]
• DCM-boosting makes fast methods more accurate
• DCM-boosting speeds up heuristics for hard optimization problems
[Figure: error rate (0 to 0.8) versus number of taxa (0 to 1600) for NJ and DCM-NJ]
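For readers unfamiliar with NJ (neighbor joining), the base method in the plot, here is a minimal sketch of the textbook algorithm. It is not the authors' code and does not implement DCM-boosting. The function name `neighbor_joining`, the dict-of-dicts input format, and the internal node names `node0`, `node1`, ... are assumptions made for illustration.

```python
def neighbor_joining(names, dist):
    """Build an unrooted tree from pairwise distances with neighbor joining.

    names: list of leaf labels.
    dist:  dict of dicts, dist[a][b] is the distance between distinct a and b.
    Returns a list of (parent, child, branch_length) edges.
    """
    d = {a: dict(dist[a]) for a in names}  # working copy of the matrix
    nodes = list(names)
    edges = []
    next_id = 0
    while len(nodes) > 2:
        n = len(nodes)
        # Row sums: total distance from each node to all the others.
        r = {a: sum(d[a][b] for b in nodes if b != a) for a in nodes}
        # Pick the pair minimizing Q(a, b) = (n - 2) d(a, b) - r(a) - r(b).
        best = None
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                q = (n - 2) * d[a][b] - r[a] - r[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        u = f"node{next_id}"  # new internal node joining a and b
        next_id += 1
        len_a = 0.5 * d[a][b] + (r[a] - r[b]) / (2 * (n - 2))
        len_b = d[a][b] - len_a
        edges += [(u, a, len_a), (u, b, len_b)]
        # Distances from the new node to every remaining node.
        d[u] = {}
        for c in nodes:
            if c not in (a, b):
                d[u][c] = d[c][u] = 0.5 * (d[a][c] + d[b][c] - d[a][b])
        nodes = [c for c in nodes if c not in (a, b)] + [u]
    a, b = nodes
    edges.append((a, b, d[a][b]))  # connect the last two nodes
    return edges


if __name__ == "__main__":
    # Distances read off a tree with leaf branches A=1, B=2, C=4, D=5
    # and an internal branch of length 3 separating {A, B} from {C, D}.
    example = {
        "A": {"B": 3, "C": 8, "D": 9},
        "B": {"A": 3, "C": 9, "D": 10},
        "C": {"A": 8, "B": 9, "D": 9},
        "D": {"A": 9, "B": 10, "C": 9},
    }
    for edge in neighbor_joining(["A", "B", "C", "D"], example):
        print(edge)
```

Each round of this sketch recomputes every pairwise score, so it runs in cubic time in the number of taxa, which is part of why NJ counts as a fast method next to heuristics for NP-hard criteria.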
![Page 8: Computing the Tree of Life The University of Texas at Austin Department of Computer Sciences Tandy Warnow](https://reader035.vdocuments.mx/reader035/viewer/2022062716/56649dcf5503460f94ac3ae2/html5/thumbnails/8.jpg)
Whole-Genome Phylogenetics
[Figure: tree diagram with labels A, B, C, D, E, F and W, X, Y, Z]
![Page 9: Computing the Tree of Life The University of Texas at Austin Department of Computer Sciences Tandy Warnow](https://reader035.vdocuments.mx/reader035/viewer/2022062716/56649dcf5503460f94ac3ae2/html5/thumbnails/9.jpg)
Benchmark gene order dataset: Campanulaceae
• 12 genomes + 1 outgroup (Tobacco), 105 gene segments
• NP-hard optimization problems: breakpoint and inversion phylogenies
1997: BPAnalysis (Blanchette and Sankoff): 200 years (est.)
2000: Using GRAPPA v1.1 on the 512-processor Los Lobos Supercluster machine: 2 minutes (200,000-fold speedup per processor)
2003: Using latest version of GRAPPA: 2 minutes on a single processor (1-billion-fold speedup per processor)
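To make the "breakpoint" criterion concrete, the sketch below computes the breakpoint distance between two gene orders: the number of gene adjacencies present in one genome but absent from the other. It assumes signed, linear gene orders over the same genes (a simplification chosen for illustration), and it is not GRAPPA or BPAnalysis code. The function names are hypothetical.

```python
def adjacencies(genome):
    """Return the set of signed adjacencies of a linear gene order.

    genome is a list of nonzero integers; the sign gives gene orientation.
    The adjacency (a, b) read left to right equals (-b, -a) read right to
    left, so each adjacency is stored in one canonical form.
    """
    # Frame the genome with sentinel genes 0 and n + 1 so the ends count too.
    framed = [0] + list(genome) + [len(genome) + 1]
    return {min((a, b), (-b, -a)) for a, b in zip(framed, framed[1:])}


def breakpoint_distance(g1, g2):
    """Count adjacencies of g1 that do not occur in g2."""
    if sorted(map(abs, g1)) != sorted(map(abs, g2)):
        raise ValueError("genomes must contain the same genes")
    return len(adjacencies(g1) - adjacencies(g2))


if __name__ == "__main__":
    identity = [1, 2, 3, 4, 5]
    inverted = [1, -3, -2, 4, 5]  # genes 2 and 3 reversed as a block
    print(breakpoint_distance(identity, inverted))
```

Computing this distance for one pair of genomes is easy; the NP-hard part named on the slide is finding the tree and ancestral gene orders that minimize the total distance over all branches.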
![Page 12: Computing the Tree of Life The University of Texas at Austin Department of Computer Sciences Tandy Warnow](https://reader035.vdocuments.mx/reader035/viewer/2022062716/56649dcf5503460f94ac3ae2/html5/thumbnails/12.jpg)
GRAPPA (Genome Rearrangement Analysis under Parsimony and other Phylogenetic Algorithms)
http://www.cs.unm.edu/~moret/GRAPPA/
• Heuristics for NP-hard optimization problems
• Fast polynomial time distance-based methods
• Contributors: U. New Mexico, U. Texas at Austin, Università di Bologna, Italy
• Fastest and most accurate software for whole genome phylogeny worldwide
![Page 13: Computing the Tree of Life The University of Texas at Austin Department of Computer Sciences Tandy Warnow](https://reader035.vdocuments.mx/reader035/viewer/2022062716/56649dcf5503460f94ac3ae2/html5/thumbnails/13.jpg)
Opportunities
• New phylogenetic reconstruction software can improve pharmaceutical R&D (making more accurate solutions achievable in hours or days, rather than months or years)
• Research software is freely available (open source), but users need the latest tools now, with proper interfaces -- a business opportunity.
![Page 14: Computing the Tree of Life The University of Texas at Austin Department of Computer Sciences Tandy Warnow](https://reader035.vdocuments.mx/reader035/viewer/2022062716/56649dcf5503460f94ac3ae2/html5/thumbnails/14.jpg)
Participants and Funding
• University of Texas Computer Scientists: Warnow, Dhillon, Hunt, and Miranker
• University of Texas biologists: Jansen, Linder, and Hillis
• Other institutions: UNM, UC Davis, Central Washington, CUNY, JGI
• Funding: Three NSF ITR grants, NSF Biocomplexity, David and Lucile Packard Foundation
![Page 15: Computing the Tree of Life The University of Texas at Austin Department of Computer Sciences Tandy Warnow](https://reader035.vdocuments.mx/reader035/viewer/2022062716/56649dcf5503460f94ac3ae2/html5/thumbnails/15.jpg)
Phylolab, U. Texas
Please visit us at http://www.cs.utexas.edu/users/phylo/