Lecture 13 – Performance of Methods

Folks often use the term “reliability” without a very clear definition of what it is.

Methods of assessing performance - Simulation.

The enormous advantage is that a large number of replicates can be examined, which allows us to account for stochasticity.

Prospective simulations: a set of conditions is specified a priori, and this defines the conditions under which data are simulated.

Retrospective simulations: simulation conditions are defined by analysis of a particular data set that’s relevant to some question.
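A prospective simulation can be sketched in a few lines. This is only an illustration, not any published study design: the four-taxon tree ((A,B),(C,D)), the Jukes-Cantor model, the branch lengths, and all function names are assumptions chosen for the example. Conditions are fixed up front, many replicate data sets are simulated, and we record how often a simple parsimony count recovers the true split.

```python
import math
import random

def jc_same_prob(t):
    """Probability a site is unchanged after branch length t under Jukes-Cantor."""
    return 0.25 + 0.75 * math.exp(-4.0 * t / 3.0)

def evolve(state, t, rng):
    """Evolve one nucleotide along a branch of length t (expected subs/site)."""
    if rng.random() < jc_same_prob(t):
        return state
    return rng.choice([b for b in "ACGT" if b != state])

def simulate_site(long_bl, short_bl, rng):
    """One site on the true tree ((A,B),(C,D)); A and C get the long branches."""
    x = rng.choice("ACGT")            # internal node joining A and B
    y = evolve(x, short_bl, rng)      # internal node joining C and D
    a = evolve(x, long_bl, rng)
    b = evolve(x, short_bl, rng)
    c = evolve(y, long_bl, rng)
    d = evolve(y, short_bl, rng)
    return a, b, c, d

def parsimony_choice(sites):
    """Pick the split supported by the most parsimony-informative sites.
    Ties go to the first split listed (the true one), a simplification."""
    counts = {"AB|CD": 0, "AC|BD": 0, "AD|BC": 0}
    for a, b, c, d in sites:
        if a == b and c == d and a != c:
            counts["AB|CD"] += 1
        elif a == c and b == d and a != b:
            counts["AC|BD"] += 1
        elif a == d and b == c and a != b:
            counts["AD|BC"] += 1
    return max(counts, key=counts.get)

def proportion_correct(n_reps, n_sites, long_bl, short_bl, seed=1):
    """Fraction of replicate data sets in which parsimony recovers AB|CD."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_reps):
        sites = [simulate_site(long_bl, short_bl, rng) for _ in range(n_sites)]
        hits += parsimony_choice(sites) == "AB|CD"
    return hits / n_reps

# Hypothetical conditions: two long non-sister branches vs. all branches equal.
print(proportion_correct(50, 500, long_bl=1.5, short_bl=0.05))
print(proportion_correct(50, 500, long_bl=0.1, short_bl=0.1))
```

Because the conditions are chosen by the investigator, the same script can be rerun across a grid of branch lengths or sequence lengths, which is exactly what makes it easy to explore (or to bias) the comparison.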


Huelsenbeck & Hillis (1993. Syst. Biol., 42:247) led to a host of prospective simulation studies that have tremendously advanced our understanding of the conditions across which phylogenetic estimation methods perform well.

Prospective simulations have been very important.

Tateno et al. (1994. Mol. Biol. Evol., 11:261-277) simulated data under an F84+G model of sequence evolution.

They then compared how well NJ on G-corrected distances estimated the tree relative to ML under an equal-rates model.

Of course, NJ on G-corrected distances performed better than ML with an equal-rates model, but this is not an appropriate comparison: the data were simulated with gamma-distributed rates (G), which the NJ distances accounted for and the ML model ignored.

Another weakness is that we must use relatively simple models to simulate the data, and this may compromise the generality of our results.


Danger – Easy to stack the deck.

Methods of assessing performance - Congruence.

Use of well-corroborated phylogenies.

“Trees of natural taxa, well supported by many independent lines of evidence, should be used in the same way as the known phylogenies of simulations and of certain laboratory and domesticated groups, i.e., as standards for evaluating the accuracy of different phylogenetic methods.” (Miyamoto & Fitch. 1995. Syst. Biol. 44:64)

The advantage is that the data have been produced by the actual complex evolutionary process that has led to the diversity of the group being used, circumventing the weakness of simulation.

There are several weaknesses, though.

The history of the group can’t be manipulated to explore different combinations of branch lengths and properties of the data.

Replication is non-existent.

Assumes gene tree equals species tree (coalescent stochasticity is ignored as is HGT/hybridization).

Methods of assessing performance - Experimental phylogenies.

Sequences evolve via natural processes, and the tree topology can be anything the investigator chooses.

We can store ancestors and access the ancestral character states directly.

[Figure: experimental phylogeny diagrams with lineages labeled A through H.]

Subject A & D to similar selection

Bull et al. (1997. Genetics. 147:1497)

Criteria - Consistency

A statistically consistent estimator is one that converges to the true value of the parameter being estimated as the amount of data increases.
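The definition can be illustrated with a much simpler estimator than a tree search. The sketch below is an assumption-laden toy, not the simulation behind the lecture's figure: it generates pairwise data under Jukes-Cantor at a hypothetical true distance of 0.3 and checks that the Jukes-Cantor distance estimate closes in on that value as the number of sites grows. Function names and settings are invented for the example.

```python
import math
import random

def expected_p(d):
    """Expected proportion of differing sites at Jukes-Cantor distance d."""
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))

def jc_distance(p):
    """Jukes-Cantor distance from an observed proportion of differences p.
    Only defined for p < 0.75, which holds comfortably for the settings below."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

def mean_abs_error(true_d, n_sites, n_reps, rng):
    """Average |estimate - truth| over replicate data sets of n_sites sites."""
    p = expected_p(true_d)
    total = 0.0
    for _ in range(n_reps):
        # Each site differs independently with probability p under the model.
        diffs = sum(rng.random() < p for _ in range(n_sites))
        total += abs(jc_distance(diffs / n_sites) - true_d)
    return total / n_reps

rng = random.Random(13)
errors = {n: mean_abs_error(0.3, n, 100, rng) for n in (100, 1000, 10000)}
for n, err in errors.items():
    print(n, round(err, 4))
```

The error shrinking toward zero with more sites is the behavior a consistent estimator shows; an inconsistent one would settle on the wrong value no matter how many sites are added.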

[Figure: probability of recovering the correct tree (Prob. Correct) plotted against sequence length.]

So under conditions simulated here (FZ tree and GTR+I+G), 3 of the 9 methods are inconsistent.

[Figure annotation: average gene length.]

Criteria - Efficiency

[Figure: probability of recovering the correct tree (Prob. Correct) plotted against sequence length.]

In the figure above, estimation with GTR+G is more efficient than JC+G.

All are consistent, but MP and ML with ER models are most efficient.

How many data are required to get the right answer?

[Figure annotation: true model.]

Criteria - Robustness

[Figure: probability of recovering the correct tree (Prob. Correct) plotted against sequence length.]

How sensitive is a method to violations of its assumptions?

Sequences were simulated with GTR+I+G, but any model that incorporates among-site rate variation (ASRV) in some way (I, G, or I+G) is consistent.

ML is robust to violations, as long as something is done to accommodate ASRV.
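Why ignoring ASRV hurts can be seen in a pairwise-distance analog, which is far simpler than the tree-level simulations summarized above and is offered only as an illustration. The sketch assumes Jukes-Cantor substitution with gamma-distributed rates (mean 1, shape alpha); the true distance of 1.0 and alpha of 0.5 are hypothetical values. It computes the proportion of differences expected with unlimited data, then applies the equal-rates correction and the gamma correction to it.

```python
import math

def expected_p_gamma(d, alpha):
    """Expected proportion of differing sites under JC with gamma rates
    (shape alpha, mean 1) at true distance d."""
    return 0.75 * (1.0 - (1.0 + 4.0 * d / (3.0 * alpha)) ** (-alpha))

def jc_distance(p):
    """Equal-rates Jukes-Cantor correction (ignores ASRV)."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

def jc_gamma_distance(p, alpha):
    """Jukes-Cantor correction that accommodates gamma-distributed rates."""
    return 0.75 * alpha * ((1.0 - 4.0 * p / 3.0) ** (-1.0 / alpha) - 1.0)

true_d, alpha = 1.0, 0.5                 # hypothetical values for illustration
p_limit = expected_p_gamma(true_d, alpha)  # what p converges to with infinite data

print("equal-rates estimate:", round(jc_distance(p_limit), 3))
print("gamma-corrected estimate:", round(jc_gamma_distance(p_limit, alpha), 3))
```

Even with infinite data the equal-rates correction stays well below the true distance, while the gamma correction recovers it; more sequence cannot repair a model that leaves rate variation out.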

Interaction of Topology and Performance

[Figure panels: FZ tree; Inverse-FZ tree; Equal-branch-length tree.]

Efficiency of Parsimony in the Inverse-Felsenstein Zone

Swofford et al. (2001. Syst. Biol., 50:525-539) examined the situation in detail.

The probability that a state shared by the long branch taxa actually evolved on the internal branch and changed nowhere else represents the probability that any site pattern of the form xxyy is the result of a true synapomorphy.

The probability of the site pattern xxyy being seen in the data under any scenario is 0.1172.

Thus ca. 97% of the sites that have the pattern xxyy will have experienced multiple hits.
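The arithmetic linking these two numbers can be made explicit. The sketch below only back-calculates from the rounded figures quoted in these notes; the implied synapomorphy probability it prints is an inference from "ca. 97%" and 0.1172, not a value taken from Swofford et al.

```python
# Values quoted in the notes; the 97% figure is rounded ("ca.").
p_xxyy = 0.1172
frac_multiple_hits = 0.97

# Back-calculated, not taken from the paper: the probability that a site shows
# xxyy because of a single change on the internal branch and nowhere else.
implied_p_synapomorphy = (1.0 - frac_multiple_hits) * p_xxyy

print(round(implied_p_synapomorphy, 4))
```

In other words, roughly 3 in 100 xxyy sites reflect a true synapomorphy; parsimony gets the right tree here mostly from sites that support it for the wrong reason.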
