![Page 1: A Fully Resolved Consensus Between Fully Resolved Phylogenetic Trees Jos é Augusto Amgarten Quitzau João Meidanis Scylla Bioinformatics, Brazil University](https://reader036.vdocuments.mx/reader036/viewer/2022062421/56649d5f5503460f94a3f109/html5/thumbnails/1.jpg)
A Fully Resolved Consensus Between Fully Resolved
Phylogenetic Trees
José Augusto Amgarten Quitzau, João Meidanis
Scylla Bioinformatics, Brazil; University of Campinas, Brazil
Phylogeny reconstruction methods
Phylogeny reconstruction methods aim at inferring the phylogenetic tree that best describes the evolutionary history for a set of taxa.
Which tree to choose?
“The field of systematics has been in considerable turmoil as various investigators developed different methods of classification and argued their merits. I guarantee you that no one method or view has all the good points.”
Walter M. Fitch – 1984
Consensus as tree constructor
Consensus trees have been used traditionally in tree comparison and calculation of bootstrap values
We propose the use of consensus as a tree constructor
It can be efficiently implemented as long as we keep trees fully resolved
Splits
Every edge in a phylogenetic tree divides the leaves into two **subgroups**.
Each such pair of subgroups is a **split** of the tree.
[Figure: unrooted tree with leaves A, B, C, D, E, F, G, H]
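As a minimal sketch of the definition above, the following Python snippet enumerates the splits of a tree. The nested-tuple encoding and the helper names `leaf_set` and `splits` are assumptions made for illustration and do not come from the slides; the example is the eight-leaf tree whose splits are listed later in the talk.

```python
def leaf_set(node):
    """Leaf labels below a node (a node is a label or a tuple of child nodes)."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaf_set(child) for child in node))


def splits(tree):
    """Splits of an unrooted tree, each stored as an unordered pair of leaf sets."""
    all_leaves = leaf_set(tree)
    found = set()

    def visit(node):
        side = leaf_set(node)
        # The edge above this node separates `side` from all remaining leaves.
        found.add(frozenset([side, all_leaves - side]))
        if not isinstance(node, str):
            for child in node:
                visit(child)

    # The top-level tuple is an arbitrary internal node; it has no edge above it.
    for child in tree:
        visit(child)
    return found


# Hypothetical encoding of the example tree on leaves A..H.
tree = (("A", "B"), ("C", "D"), ((("E", "F"), "G"), "H"))
print(len(splits(tree)))  # 13
```

A fully resolved unrooted tree on 8 leaves has 2 * 8 - 3 = 13 edges, which matches the 13 splits found here: 8 trivial splits (one leaf against the rest) and 5 internal ones.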
Tree weight
Our method relies on **weighing** trees and taking the one with maximum weight
Let the **frequency** of a split in a collection of trees be the number of trees that contain the split divided by the total number of trees in the collection
Let the **weight** of an unrooted phylogenetic tree be the product of the frequencies of its splits
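The two definitions can be illustrated with a small worked example on four leaves. This is only a sketch: the helper names (`split`, `quartet_tree`, `frequency`, `weight`) and the toy collection are assumptions chosen for illustration, and exact fractions are used so that the products are easy to check by hand.

```python
from fractions import Fraction
from math import prod


def split(side, other):
    """A split as an unordered pair of leaf sets."""
    return frozenset([frozenset(side), frozenset(other)])


def quartet_tree(pair):
    """Splits of the tree on A, B, C, D whose internal edge separates `pair`."""
    leaves = set("ABCD")
    trivial = {split(x, leaves - {x}) for x in leaves}
    return trivial | {split(pair, leaves - set(pair))}


def frequency(s, collection):
    """Fraction of trees in the collection that contain split `s`."""
    return Fraction(sum(s in tree for tree in collection), len(collection))


def weight(tree, collection):
    """Product of the frequencies of the tree's splits."""
    return prod(frequency(s, collection) for s in tree)


# Toy collection: three copies of AB|CD and one copy of AC|BD.
collection = [quartet_tree("AB"), quartet_tree("AB"),
              quartet_tree("AC"), quartet_tree("AB")]

print(weight(quartet_tree("AB"), collection))  # 3/4
print(weight(quartet_tree("AC"), collection))  # 1/4
print(weight(quartet_tree("AD"), collection))  # 0
```

Trivial splits occur in every tree, so they contribute a factor of 1; a tree containing a split that never occurs in the collection gets weight 0.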
Most probable tree
A **most probable tree** for a collection of fully resolved phylogenetic trees is a tree that maximizes the weight, that is, the product of the frequencies of its splits in the collection.
Example
Solution
w = 0.0703125
Running time
The tree weight formula can be written as a product of the frequencies of the small subgroups
We designed an algorithm that finds all most probable trees for a given set of fully resolved phylogenetic trees
The complexity of the algorithm is O(l^3 t^2 log(lt)), where l is the number of leaves and t is the number of trees
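To give a feel for how a maximum-weight tree can be assembled from pieces, here is a hedged dynamic-programming sketch. It is not the authors' algorithm and makes no claim to the running time above: instead of small subgroups it roots every tree at a fixed reference leaf, turns each split into the cluster that excludes that leaf, and combines only clusters that occur in the input trees (any other tree would have weight 0). All names and the nested-tuple encoding are assumptions, and it returns one maximum-weight tree rather than all of them.

```python
from fractions import Fraction
from functools import lru_cache


def leaf_set(node):
    """Leaf labels below a node (a node is a label or a tuple of child nodes)."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaf_set(child) for child in node))


def clusters(tree, ref):
    """For every edge, the side of its split that does not contain leaf `ref`."""
    all_leaves = leaf_set(tree)
    out = set()

    def visit(node):
        side = leaf_set(node)
        out.add(all_leaves - side if ref in side else side)
        if not isinstance(node, str):
            for child in node:
                visit(child)

    for child in tree:
        visit(child)
    return out


def most_probable(trees):
    """Return (weight, tree) for one maximum-weight fully resolved tree."""
    all_leaves = leaf_set(trees[0])
    ref = min(all_leaves)
    per_tree = [clusters(t, ref) for t in trees]
    candidates = set().union(*per_tree)
    freq = {c: Fraction(sum(c in s for s in per_tree), len(trees))
            for c in candidates}

    @lru_cache(maxsize=None)
    def best(cluster):
        if len(cluster) == 1:
            return Fraction(1), next(iter(cluster))
        top_w, top_tree = Fraction(0), None
        # Try every way of writing the cluster as two disjoint candidates.
        for part in candidates:
            rest = cluster - part
            if part < cluster and rest in candidates:
                (w1, t1), (w2, t2) = best(part), best(rest)
                if w1 * w2 > top_w:
                    top_w, top_tree = w1 * w2, (t1, t2)
        return freq[cluster] * top_w, top_tree

    w, subtree = best(all_leaves - {ref})
    return w, (ref, subtree)


t_ab = (("A", "B"), ("C", "D"))
t_ac = (("A", "C"), ("B", "D"))
best_weight, best_tree = most_probable([t_ab, t_ab, t_ac, t_ab])
print(best_weight)  # 3/4
```

Memoising `best` by cluster means each candidate cluster is solved once, which is the essence of the dynamic-programming idea; the brute-force pairing loop is kept deliberately simple.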
Experiments
**Data sets** used to test the new method:
Synthetic data: from Gascuel’s LIRMM site
K2P – Kimura 2 Parameter, no MC
K2Pm – Kimura 2 Parameter, with MC
COV – Covarion model, no MC
COVm – Covarion model, with MC
Real data: Ribosomal RNA
Experiments
**Programs** used to test the new method (19):

| Software | Method | Model |
|---|---|---|
| fastMe | Minimum evolution | JC, K2P |
| Mega | Minimum evolution | JC, K2P, TN |
| Mega | Maximum parsimony | |
| Mega | Neighbor joining | JC, K2P, TN |
| dnacomp | DNA compatibility | |
| dnaml | Maximum likelihood | |
| dnapars | Maximum parsimony | |
| neighbor | Neighbor joining | JC, K2P |
| neighbor | UPGMA | JC, K2P |
| weighbor | Weighted neighbor joining | JC, K2P |
Most probable = Median
Reflects general tendency
Results: average split distance
| Data set | Minimum Distance |
|---|---|
| K2P | 43.44 |
| K2Pm | 77.78 |
| COV | 52.67 |
| COVm | 69.11 |
| Ribosomal | 60.71 |

Consensus consistently yields minimum average split distance
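The slides do not spell out how the split distance is computed or normalized. As a sketch under the assumption that it counts the splits present in exactly one of two trees (the symmetric difference of their split sets), an average distance from a candidate tree to a collection could be computed as follows; the function names and the four-leaf toy data are illustrative only.

```python
def leaf_set(node):
    """Leaf labels below a node (a node is a label or a tuple of child nodes)."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaf_set(child) for child in node))


def splits(tree):
    """Splits of an unrooted tree, each stored as an unordered pair of leaf sets."""
    all_leaves = leaf_set(tree)
    found = set()

    def visit(node):
        side = leaf_set(node)
        found.add(frozenset([side, all_leaves - side]))
        if not isinstance(node, str):
            for child in node:
                visit(child)

    for child in tree:
        visit(child)
    return found


def split_distance(tree_a, tree_b):
    """Number of splits found in exactly one of the two trees (assumed definition)."""
    return len(splits(tree_a) ^ splits(tree_b))


def average_split_distance(candidate, collection):
    """Mean split distance from the candidate to every tree in the collection."""
    return sum(split_distance(candidate, t) for t in collection) / len(collection)


t_ab = (("A", "B"), ("C", "D"))
t_ac = (("A", "C"), ("B", "D"))
collection = [t_ab, t_ab, t_ac, t_ab]

print(average_split_distance(t_ab, collection))  # 0.5
print(average_split_distance(t_ac, collection))  # 1.5
```

In this toy collection the tree with the highest weight, AB|CD, is also the one with the smallest average distance to the inputs.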
May result in better tree
Results: distance to “real” tree
| Data set | Consensus Not Worse Than ... of input trees |
|---|---|
| K2P | 72 % |
| K2Pm | 39 % |
| COV | 78 % |
| COVm | 72 % |
| Ribosomal | 100 % |

Consensus consistently not worse off than majority of input trees
Theoretical foundations
[Figure: the example tree with leaves A, B, C, D, E, F, G, H]
All splits of a tree
[Figure: the example tree with leaves A, B, C, D, E, F, G, H]
A | BCDEFGH
B | ACDEFGH
AB | CDEFGH
C | ABDEFGH
D | ABCEFGH
H | ABCDEFG
G | ABCDEFH
F | ABCDEGH
E | ABCDFGH
CD | ABEFGH
EF | ABCDGH
EFG | ABCDH
ABCD | EFGH
Small subgroup of each split
[Figure: the example tree with leaves A, B, C, D, E, F, G, H]
**A** | BCDEFGH
**B** | ACDEFGH
**AB** | CDEFGH
**C** | ABDEFGH
**D** | ABCEFGH
**H** | ABCDEFG
**G** | ABCDEFH
**F** | ABCDEGH
**E** | ABCDFGH
**CD** | ABEFGH
**EF** | ABCDGH
**EFG** | ABCDH
**ABCD** | EFGH
Small subgroups
A, B, AB, C, D, H, G, F, E, CD, EF, EFG, ABCD
Maximal clusters (n-trees)
A, B, AB, C, D, H, G, F, E, CD, EF, EFG, ABCD
Fundamental theoretical result
A, B, AB, C, D, H, G, F, E, CD, EF, EFG, ABCD
● The small subgroup set of a phylogenetic tree is always a finite set of **n-trees**
● There are exactly three n-trees in this set, and all n-trees are maximal if and only if the phylogenetic tree is fully resolved
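The three n-trees can be observed on the example tree with a short sketch. It takes the small subgroup of each split to be its smaller side, and it assumes a tie-breaking rule not stated on the slides (on equal sizes, take the side with the alphabetically smallest leaf), which reproduces the choice of ABCD over EFGH shown earlier. The maximal clusters are then the small subgroups not contained in any other; the function names are hypothetical.

```python
def leaf_set(node):
    """Leaf labels below a node (a node is a label or a tuple of child nodes)."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaf_set(child) for child in node))


def small_subgroups(tree):
    """Smaller side of every split; ties go to the side with the smallest label."""
    all_leaves = leaf_set(tree)
    found = set()

    def visit(node):
        side = leaf_set(node)
        other = all_leaves - side
        # Assumed tie-break: compare sizes first, then the smallest leaf label.
        found.add(min(side, other, key=lambda s: (len(s), min(s))))
        if not isinstance(node, str):
            for child in node:
                visit(child)

    for child in tree:
        visit(child)
    return found


def maximal_clusters(subgroups):
    """Subgroups that are not a proper subset of any other subgroup."""
    return {s for s in subgroups if not any(s < t for t in subgroups)}


# Hypothetical encoding of the example tree on leaves A..H.
tree = (("A", "B"), ("C", "D"), ((("E", "F"), "G"), "H"))
small = small_subgroups(tree)
roots = sorted("".join(sorted(s)) for s in maximal_clusters(small))
print(roots)  # ['ABCD', 'EFG', 'H']
```

Every other small subgroup is nested inside one of these three: A, B, AB, C, D and CD inside ABCD, and E, F, EF and G inside EFG, while H stands alone.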
Implementation details
D, E, F, G, EF, GH, ABC
Dynamic programming
D, E, F, G, EF, GH, ABC
Implementation details
D, E, F, G, EF, GH
FGH, DEF, ABC, D, E, DE
L \ ABC
Implementation details
To Do List
Rooted trees
Polytomies
Non uniform weights for input trees
Acknowledgments
Scylla Bioinformatics and Institute of Computing, Unicamp, for machine time, infrastructure, and support
Brazilian Research Financing Agency CNPq, grant 470420/2004-9