Page 1: PARSIMONY AND WEIGHTING: A REPLY TO TURNER AND ZANDEE

Cladistics (1995) 11:91-104

FORUM

PARSIMONY AND WEIGHTING: A REPLY TO TURNER AND ZANDEE

Pablo A. Goloboff

Consejo Nacional de Investigaciones Científicas y Técnicas, Fundación e Instituto “Miguel Lillo,” Miguel Lillo 205, 4000 San Miguel de Tucumán, Argentina

Received for publication 10 March 1995; accepted 18 August 1995

Introduction

Turner and Zandee (1995) criticize my (Goloboff, 1993a) method for character weighting. I had proposed that data should always be analyzed taking into account character weights, regardless of the results under equal weights. With that in mind, I proposed a method, derived from Farris’ (1969) ideas on weighting, based on the notion that comparing the fit of different trees using concave, decreasing functions of the homoplasy automatically weights the characters according to the homoplasy they have on the trees being compared. For strongly concave functions, characters with homoplasy have very little influence in tree comparisons, while under less concave functions (i.e. functions approaching linearity) characters with homoplasy are allowed almost as much influence as those without (i.e. trees are compared only according to prior weights). In Pee-Wee, a computer program that implements the method (Goloboff, 1993b), I measured the fit of character i with $f_i = k/(k + es_i)$, where k is a constant (equal to or greater than unity) that changes the concavity of the fitting function (to allow homoplastic characters to have less or more influence), and $es_i$ is the number of extra steps. Pee-Wee searches for trees which maximize the total fit, $F = \sum f_i$, with searching algorithms analogous to those of other parsimony programs. As concave decreasing functions of the homoplasy were already used to estimate character weights (Farris, 1969; Farris, 1989), I proposed that trees of maximum fit can also be seen as the trees which imply the characters to be maximally reliable.
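The fitting function just described can be illustrated with a short sketch. This is not Pee-Wee's code: the function names `fit`, `total_fit` and `cost_of_next_step` are assumptions introduced here for illustration. The sketch shows how much fit a character loses for one more extra step, depending on how many extra steps it already has and on the value of k.

```python
from fractions import Fraction


def fit(extra_steps, k):
    """Fit of one character: f_i = k / (k + es_i)."""
    return Fraction(k, k + extra_steps)


def total_fit(extra_steps_per_char, k):
    """Total fit of a tree: F = sum of f_i over all characters."""
    return sum(fit(es, k) for es in extra_steps_per_char)


def cost_of_next_step(extra_steps, k):
    """Fit lost when a character gains one more extra step.

    This loss acts as the weight the character receives in a comparison.
    """
    return fit(extra_steps, k) - fit(extra_steps + 1, k)


if __name__ == "__main__":
    # For each k, print the cost of the next step for es = 0, 1, 2, 3.
    for k in (1, 3, 100):
        costs = [float(cost_of_next_step(es, k)) for es in range(4)]
        print(k, [round(c, 4) for c in costs])
```

Under a strongly concave function (k = 1) a step added to a character that already has three extra steps costs a tenth of what it costs in a character with none; with k = 100 the two costs are nearly equal, which is the sense in which the function approaches linearity.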

Turner and Zandee now present some supposedly new findings regarding my method, and they conclude from those that it has some serious faults. I shall here show that the “new” findings are not such-they are well-known facts-and that they constitute in themselves no evidence against my method of weighting, or any other. Most of the present arguments had already been presented in my 1993a paper, or by other authors, but Turner and Zandee have either missed or ignored them. To avoid simply referring the reader to other papers, I shall review the arguments here, and it will then become clear that there is no basis for Turner and Zandee’s criticism.

0748-3007/95/010091+14/$12.00/0 © 1995 The Willi Hennig Society

Priority

Turner and Zandee claim having discovered that trees of maximum fit (like those produced under successive weighting) may be of non-minimum length under equal weights. After suggesting that the behaviour of the fitting function for an individual character is easily predicted (always decreasing with extra steps), they state that

“The behaviour of F is less easily predicted. Because $F = \sum f_i$, and $f_i$ for each character depends on the topology of the tree, there is no straightforward relation between F and, e.g. total number of extra steps. . . . as $k \to \infty$, F becomes equal to the number of characters in the data matrix, and thus the lengths of trees under implied weights become equal to their lengths under unweighted parsimony analysis. Thus, for $k = \infty$ [sic!] selecting trees with the highest fit becomes equal to selecting the most parsimonious trees (MPTs). However, for low values of k this is not necessarily the case.”

It is rather obvious throughout their paper that they consider the potential to lead to trees of non-minimum length under equal weights a serious defect that invalidates my method. They further conjecture that

“The behaviour described here seems to indicate that there is a minimal value of k above which all fittest trees are MPTs. For lower values of k less parsimonious (under equal weighting) trees may have a better fitness. We offer no proof for this conjecture but in our experiments we have not met a single counterexample”.

Because of their discovery that very high values of k lead to preference for trees shortest under equal prior weights, they recommend that only high values of k be used-or better, some other fitting function that never produces "longer" trees and is not affected by variables like k.

Turner and Zandee, however, are neither the first ones to realize that maximum fit trees may be non-shortest, nor the first ones to realize that higher values of k produce results more similar to those under equal weights. The possibility of non-shortest trees is documented not only in the paper of mine that Turner and Zandee criticize (Goloboff, 1993a: 88), but also in an empirical analysis cited there. In that analysis, the fittest trees were actually

“one step longer than the shortest trees for equally weighted characters; a comparison of those trees may clarify the way in which the weighting method used here works. The fittest trees save one step in characters 15 (numerous labial cuspules), 30 (type of trichobothrial bases), and 36 (separation of PMS and PLS), while requiring two additional extra steps in character 9 (teeth of the ITC), and one in characters 6 (maxillary cuspules) and 44 (trichobothrial pattern). All of the characters in which steps could be saved [by preferring the shortest tree] (as well as character 30) have three or more extra steps in both the fittest and the shortest trees. In contrast, character 15 has only one extra step on the fittest tree, and character 36 only two. The fittest tree, although slightly longer, saves steps for those characters which have less homoplasy, and is therefore to be preferred.” (Goloboff, 1993c: 22).

Other published analyses applying the method which resulted in trees of non-minimum length are those of Goloboff (1995), Szumik (1994) and Szumik (1995). But perhaps the best proof that parsimony analysis under implied weights was known to produce non-shortest trees is the existence of a program to apply the method, Pee-Wee. I would never have bothered to program Pee-Wee if trees of best fit were always shortest trees, that is, if one could “easily predict” the behavior of F as a function of the total, unweighted number of steps. It would have been much easier to produce a program that simply took results from Hennig86 or PAUP (both effective at finding shortest trees) and selected among them. The method was always presented with the understanding, in fact, with the aim, that the trees it produces need not be shortest under equal weights¹. As for the influence of k, the variable had been introduced because in that way “the function can be modified to be less steep . . . For k=0 [1 in the present notation] the function downweights as drastically as [the consistency index], but for higher values of k the function weights less drastically against characters with homoplasy” (Goloboff, 1993a: 89).

It then follows directly that the highest values of k will weight least drastically against characters with homoplasy, i.e. will consider all characters equally influential (see also Goloboff, 1995: 7). Turner and Zandee have “discovered” that k influences weighting exactly as it was intended.
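How a longer tree can have a higher fit at low k and a lower fit at high k can be sketched with a small calculation. The extra-step counts below are HYPOTHETICAL: they were chosen only to agree with the qualitative description quoted above from Goloboff (1993c) and are not the values from that analysis. All other characters are assumed to have the same number of steps on both trees, so they cancel out of the comparison. The names `LONGER_TREE`, `SHORTER_TREE`, `total_fit` and `preferred` are likewise assumptions made for this illustration.

```python
def total_fit(extra_steps, k):
    """Total fit F = sum over characters of k / (k + es_i)."""
    return sum(k / (k + es) for es in extra_steps.values())


# HYPOTHETICAL extra steps, keyed by character number (invented values).
# The longer tree saves one step in characters 15, 30 and 36, and needs
# two more steps in character 9 and one more in characters 6 and 44.
LONGER_TREE = {15: 1, 36: 2, 30: 3, 9: 5, 6: 4, 44: 4}
SHORTER_TREE = {15: 2, 36: 3, 30: 4, 9: 3, 6: 3, 44: 3}


def length_difference(tree_a, tree_b):
    """Difference in total extra steps (equal to the difference in length)."""
    return sum(tree_a.values()) - sum(tree_b.values())


def preferred(k):
    """Return which of the two trees has the higher total fit at concavity k."""
    if total_fit(LONGER_TREE, k) > total_fit(SHORTER_TREE, k):
        return "longer"
    return "shorter"


if __name__ == "__main__":
    print("extra length of longer tree:", length_difference(LONGER_TREE, SHORTER_TREE))
    for k in (1, 3, 1000):
        print(k, preferred(k))
```

With these invented numbers the tree that is one step longer has the higher fit at k = 3, because its savings fall on the characters with little homoplasy, while at k = 1000 the shorter tree has the higher fit.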

Parsimony

Realizing that my method may lead to trees of non-minimum length, Turner and Zandee criticize it from two angles. The first has more to do with terminology than with substance: my method is in conflict with the parsimony criterion. The second seems at first to have more substance: even if one was to abandon parsimony in favor of character weighting, one would not know how to assign weights to the characters-and therefore the choice of trees would ultimately be based on arbitrary decisions. I will discuss in this section whether equal weighting is required for parsimony, leaving for subsequent sections the discussion of weight assignments.

Turner and Zandee call trees shortest under equal weights “MPTs” or “most parsimonious trees”, and the advantages of parsimony in phylogeny and systematics are indeed well established (e.g. Farris, 1977, 1978, 1979a, b, 1980a, b, 1982, 1983). But Turner and Zandee do not show that what they call “parsimony” is the same principle widely accepted in systematics. In fact, that principle is nothing of the sort of what Turner and Zandee call “parsimony”, i.e. minimum length under equal weights. Therefore, their attempt to disqualify differential weighting by just calling it something other than parsimony is no more than a manipulation of definitions.

I presented my method as a more refined way to apply parsimony, not as an alternative, and rejected explicitly (citing Farris, 1983) the idea that parsimony required equal weights:

“Farris (1983) discussed the relationship between parsimony and weighting, and showed that the most parsimonious cladogram is the hypothesis with greatest explanatory power, given the weights that the characters deserve . . . but some authors still advocate weighting as only a means to select among trees shortest under equal weights. Farris’ argument, however, applies even when the shortest trees for the weighted data are not shortest under equal weights . . . Then, the tree(s) obtained under equal weights could be defended only with a claim that all the characters provide equally strong evidence. But that claim of equality is rejected by almost every published cladistic analysis . . . In conclusion, if the data are properly weighted, those results always are to be preferred, regardless of the results under equal weights. Although few authors have held this position explicitly (e.g. Kluge and Farris, 1969; Farris, 1969; Platnick et al., 1991), it is not that parsimony does not preclude weighting, but rather that it requires weighting.” (Goloboff, 1993a: 83; italics in the original).

¹ It may be worth noting that my method was originally presented at the 1992 Meeting of the Society. R. Zandee attended that meeting; he gave a paper in the same session where I gave mine. That my method could lead to trees of non-minimum length under equal weights was perfectly obvious to at least part of my audience. In the discussion that followed my presentation, for example, Mary Mickevich considered that a defect.

But despite Farris’ (1983) work cited above, parsimony is still seen in two different ways. Some see parsimony as a rule dictating that one character versus one is always a tie, and that two characters must always win over one-essentially precluding weighting or allowing it only limited influence. Thus, parsimony would be only a means to resolve character conflict, and somehow unnecessary for perfectly congruent data. Turner and Zandee’s position on this matter is that

“homoplasy is in the first place the result of incorrect assumptions of homology and should be treated as such by re-assessing these assumptions in the light of the initial analyses . . . In itself, the re-assessment of homology assumptions constitutes reweighting of the characters, but based on biological grounds (observations made on the specimens), until all characters are equally reliable as markers of phylogeny. The only remaining basis for weighting characters then becomes a parsimony argument, namely in order to select those MPTs in which the maximum number of characters are congruent.” (italics added).

Their description of parsimony as leading “to select trees in which the maximum number of characters are congruent” suggests clique methods more than parsimony. Parsimony under equal weights seeks trees in which the characters are maximally congruent, trees which may well have fewer perfectly congruent characters. Leaving that imprecision aside, it seems clear that they view parsimony as requiring that “all characters are equally reliable as markers of phylogeny”. But an alternative view is that parsimony indicates that a tree better explains a character to the extent that it requires fewer instances of homoplasy for that character, saying nothing of how differences in homoplasy among different characters should be compared. Parsimony allows one to resolve character conflict only by indicating that an alternative tree may increase the explanatory power for some characters and decrease it for others. If the degree to which the alternative tree better explains some characters exceeds the degree to which it worse explains others, the alternative is clearly to be preferred, and vice versa. But deciding whether the improvement in some characters is equivalent to the worsening in others requires that the weights of the characters involved, equal or otherwise, be taken into account.

Names are only a matter of convention. Although agreement on the meaning of “parsimony” is obviously necessary for facilitating communication, what has to be logically justified are the methods and procedures themselves, not their names. Therefore, what actually matters here is whether trees with fewer raw steps are always preferable to trees that minimize weighted amounts of homoplasy, regardless of which of those two procedures would fit Turner and Zandee’s own definition of “parsimony”. In my paper, I briefly cited Farris (1983), as that author had explicitly defended the minimization of weighted homoplasies, regardless of length under equal weights. To make Farris’ (1983) point of view clearer, I now expand the citation:

“In portraying weighting as an alternative to parsimony, Watrous and Wheeler apparently intended to equate the parsimony criterion with simple counting of equally weighted homoplasies. That usage reflects both a lack of familiarity with the way in which parsimony has long been used by other phylogeneticists and a misunderstanding of the nature of character weighting. . . . Genealogies are selected to avoid requirements for ad hoc hypotheses of homoplasy because characters imposing such requirements are regarded as evidence favoring alternative genealogies over the one selected. In the absence of any convincing reason for doing otherwise, the characters of a study are often treated in practice as if they all provided equally cogent evidence on phylogenetic relationship. No one supposes, however, that characters in general all deserve the same weight, that they all yield equally strong evidence. Drawing conclusions despite conflicting evidence requires that some evidence be dismissed as homoplasy. It is surely preferable to dismiss weaker evidence in deference to stronger. A decision reached by weighting characters, at any rate, can hardly rest on a basis different from parsimony. The effect of giving a character a weight X is just to proceed as if the data included X independent characters all showing the same distribution of states. Now suppose that many independent characters support one placement of a taxon, while just one supports an alternative placement. Possible reinterpretations aside, if the characters are weighted equally, weight of evidence favors the first placement. . . . If character weights are not all equal, either placement might be supported by the greater weight of evidence, depending on the character weights.” (Farris, 1983: 10-11; italics added).

The passage, incidentally, helps settle the terminological question, as Farris (1983) used the name “parsimony” for the minimization of weighted homoplasies, not for the notion that all characters count equally. The passage also shows I am hardly alone in considering that “longer” trees may be acceptable alternatives. Explicit discussions of which I am aware suggest why trees with more raw steps may be preferable in some cases, pointing out that it may be better to save homoplasy in a few good characters at the expense of increasing it in several very poor ones. I will admit that those discussions may not be the last word. For example, it might be found, on further analysis, that there is a fault in my reasoning, or Farris’. But no fault in reasoning has ever been pointed out, and until it is shown why weaker evidence cannot be dismissed in deference to the stronger, the premise of my method is untouched, and then there is no reason to reject its possible consequence: trees with more raw steps.

It is then clear that the possibility of “producing trees of non-minimum length” is in itself an empty criticism, for any weighting method. If trees with fewest raw steps are not always most explanatory (as not all characters provide equally strong evidence of relationships), we obviously need a method that does what Turner and Zandee consider unfortunate-to choose trees not in regard to their raw number of steps. That is exactly the aim of my method. Successive weighting also evaluates trees not by reference to raw number of steps (at least after the initial round), and therefore is as immune to the criticism of producing trees with more raw steps as is my method.

Given that weighting is desirable, the results of a parsimony analysis under a particular weighting scheme can only be criticized on the grounds that the weights have not been assigned in the best way. And this leads to Turner and Zandee’s second objection to weighting based on homoplasy.

Reliability

Turner and Zandee’s attack on weighting based on homoplasy goes beyond calling it something other than parsimony. They claim that there is no reason homoplasy should decrease our confidence in a character, and one therefore has no way to identify with certainty the bad characters:

“Character weighting presupposes that some characters are phylogenetically more informative than others. Even taking this assumption for granted, it does not follow necessarily that more homoplasious characters are per se less reliable as indicators of phylogeny . . . If the possibilities for re-assessment of homology statements (i.e. hypotheses of synapomorphies) have been exhausted, the remaining homoplasy may be indicative of the unreliability of the affected characters as markers of phylogeny, but still not necessarily so.” (italics in the original).

Their first claim that homoplasy says nothing about character reliability is rather surprising. A character with no homoplasy on a tree perfectly defines a group of that tree; if one instead created a group based exclusively on a character with much homoplasy, the resulting tree would be very different from the original one. Not that this is the first time actual arguments are advanced in favor of downweighting characters with more homoplasy. That same idea had been presented before, with a different wording, by Farris (1969), Carpenter (1988), and in the paper of mine that Turner and Zandee are attacking (Goloboff, 1993a: 84). Turner and Zandee do not discuss those arguments, much less refute them; they simply try to give the impression that nothing has ever been said in favor of using homoplasy as indicating unreliability.

Their second claim, that “when the possibilities for re-assessment of homology have been exhausted,” homoplasy may or may not indicate unreliability, is hard to interpret. Perhaps they are trying to say that homoplasy indicates unreliability in general, but fear in some specific cases it could make us downweight “good” characters. That is true as far as it goes, but no empirical method could possibly be free of that suspicion: no empirical method could render conclusions necessarily true, given any observations.

It is also worth commenting on Turner and Zandee’s assertion that “character weighting presupposes that some characters are phylogenetically more informative than others.” That some characters are more reliable (less homoplastic) than others is not something that must be assumed before the analysis. It is instead something that becomes obvious when the analysis is done. In most real cases, when it is assumed that all characters are equally informative, the results contradict that very assumption, as some characters have significantly higher amounts of homoplasy. But not even then would Turner and Zandee “take for granted” the “presupposition that some characters are phylogenetically more informative than others”; they would possibly choose among “MPTs”

“by applying other measures based on the same line of reasoning [as the parsimony argument] (e.g. OCCI [Rodrigo, 1992] or average RI [Turner, 1995]). We can see no foundation for preferring any particular k-value that does not result in a subset of the set of MPTs”.

In my 1993a paper I had not discussed the OCCI, but I did discuss the possibility of selecting trees with higher average retention index (ri). I concluded this was illogical, because it would always give lower weights to characters with more informative variation (i.e. characters which would require more extra steps on a bush), regardless of their homoplasy. Consider two characters, X and Y, and 14 taxa; X has state 0 in seven taxa, and state 1 in the other seven; Y has state 1 in two taxa, one in which X has state 0, and the other in which X has state 1. Both characters are in conflict; there are two equally parsimonious trees; one tree accounts perfectly for character X (the X tree), the other for Y (the Y tree). The average ri for the X tree is (1.00+0.00)/2=0.500; the average ri for the Y tree is (0.83+1.00)/2=0.915. Just because character Y has the apomorphic state in fewer taxa, it would be preferred. Consider now having five characters distributed like X. There is only one shortest tree, the X tree, with seven steps; but its average ri is 0.833, while the average ri for the Y tree, 11 steps long, is higher, 0.858. Again, the average ri in tree Y is higher just because Y has the apomorphic state in fewer terminals than the five X characters. Turner and Zandee “can’t see a foundation for preferring” a measure I proposed because it “does not result in a subset of the set of MPTs”, yet they propose a measure with exactly that property, and additional defects. Perhaps Turner and Zandee think that the average ri cannot be meaningfully used to compare trees with different numbers of steps, but then why would it make any sense using it to compare trees with the same number of steps?
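The arithmetic of the X and Y example can be checked with a short sketch. It assumes the usual per-character retention index, ri = (g − s)/(g − m), where s is the number of steps on the tree, m the minimum possible number of steps and g the number of steps on a bush (for a binary character, the number of taxa with the rarer state). The function and constant names are assumptions made for this illustration. The text rounds 5/6 to 0.83 before averaging, so its 0.915 and 0.858 correspond to the unrounded 0.917 and 0.861 computed here.

```python
def retention_index(steps, min_steps, max_steps):
    """Per-character retention index: ri = (g - s) / (g - m)."""
    return (max_steps - steps) / (max_steps - min_steps)


def average_ri(characters):
    """Mean ri over a list of (steps, min_steps, max_steps) tuples."""
    return sum(retention_index(*c) for c in characters) / len(characters)


def length(characters):
    """Total number of steps over all characters."""
    return sum(steps for steps, _, _ in characters)


# (steps on the tree, minimum steps m, steps on a bush g)
# X: state 1 in 7 of 14 taxa, so m = 1 and g = 7.
# Y: state 1 in 2 of 14 taxa, so m = 1 and g = 2.
X_ON_X_TREE = (1, 1, 7)
Y_ON_X_TREE = (2, 1, 2)
X_ON_Y_TREE = (2, 1, 7)
Y_ON_Y_TREE = (1, 1, 2)

# One X character and one Y character: both trees have three steps.
x_tree = [X_ON_X_TREE, Y_ON_X_TREE]
y_tree = [X_ON_Y_TREE, Y_ON_Y_TREE]

# Five characters distributed like X, plus Y.
x_tree_5 = [X_ON_X_TREE] * 5 + [Y_ON_X_TREE]
y_tree_5 = [X_ON_Y_TREE] * 5 + [Y_ON_Y_TREE]

if __name__ == "__main__":
    for name, tree in [("X tree", x_tree), ("Y tree", y_tree),
                       ("X tree, 5 X", x_tree_5), ("Y tree, 5 X", y_tree_5)]:
        print(name, length(tree), round(average_ri(tree), 3))
```

In both cases the Y tree has the higher average ri, and in the second case it does so despite being four steps longer.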

Not only do Turner and Zandee ignore those published arguments that may counter their own opinions; they also present false dichotomies. Their idea that “weighting” based on “re-assessing assumptions of homology” is the only logical alternative to weighting is misleading, because they are not actually alternatives. The reexamination of specimens or “checking, correcting and rechecking” of Hennig (1966) is certainly advisable, and no proponent of successive or implied character weighting opposed it. On the contrary, Carpenter (1988: 293) explicitly recommended it. But the reexamination cannot resolve conflict, it can only remove the apparent conflict. One is still left with the problem of resolving those conflicts that survive reexamination, and it is in these cases where weighting is applicable. Calling the reexamination “character weighting” is also terminologically misleading, because the only things affected are some entries in the matrix; the steps between states of those taxa that needed no reexamination still cost the same as before. If the reexamination “weights” something, it is not a “character”, but instead the observation expressed in a given cell of a matrix. Improving the way in which a matrix describes observations made on specimens is perfectly compatible with well-justified weighting schemes; both operate at different levels.

Successive Weighting and Self-Consistency

If I proposed a new method for weighting, it was because I considered that it represented an improvement over preexisting ones. Of preexisting methods, only successive weighting was logically justified and deserved explicit discussion. But Turner and Zandee misunderstand my objections to successive weighting:

“Recently, Goloboff (1993a) proposed a new scheme for weighting a set of characters. His main concern about previously proposed schemes (e.g. successive weighting: Farris 1969) was that there are no unambiguous criteria for the weighting procedure and that the resulting trees are not always self-consistent. Thus, the result of successive weighting is dependent on the initial weighting”.

In fact, I (Goloboff, 1993a: 85-86) had attributed to Farris (1969) the notion of self-consistency, suggested that successive weighting satisfies in essence this requirement, and considered that self-consistency, being a necessary but not a sufficient condition, is not the only concern. Of my three main objections to successive weighting, they mention only part of one: that different sets of initial weights may lead to different final solutions. But if there was a way to choose among possible initial sets of weights, this would not be a problem. This is a problem only because there is as yet no way to decide between different starting points (using equal weights as starting point would not solve the problem because the final solution itself rejects the idea of equal weights). Perhaps the difficulty of choosing among starting points is what they mean by “no unambiguous criteria for the weighting procedure”, but that phrase, which appears nowhere in my paper, could be taken to mean almost anything. I had two other objections to successive weighting. In successive weighting, trees can be compared during searches according to the implications of reliability of other trees, not their own. Successive weighting may produce trees which, according to the weight function, do not imply that the characters are maximally reliable (this could also be phrased as “lack of an optimality criterion”). These other objections are not even mentioned by Turner and Zandee.

Farris’ (1969) successive weighting is an iterative method, but mine is not. Turner and Zandee, however, fail to apprehend the reasons why my method does not iterate:

“Because the weight of a character depends solely on the number of extra steps. . . it is possible to evaluate each tree without reference to other trees, just as the total number of steps on a tree is independent of the number of steps on other trees. Goloboff’s weighting scheme therefore allows character weighting to proceed simultaneously with tree reconstruction. This has the advantage of requiring only a single pass through the data in order to come up with the ‘best’ trees. Whether these trees are actually self consistent is a point not addressed by him”. (italics added).

That my weighting method allows one “to come up with the ‘best’ trees” in a “single pass” hardly follows from the fact that the fitting function takes into account only the number of extra steps. Indeed, my paper introduced the method (Goloboff, 1993a: 87) using as an example the unit consistency index, which does not depend only on the number of extra steps. Simply, my method does not iterate because it maximizes some quantity; whether that quantity is based on extra steps or something else, has nothing to do with iterations. But Turner and Zandee consider most important one of the least important aspects of my paper, proposing a fit measure based on the number of extra steps. Even though they would not use my fit measure to select among trees, they think that

“. . . the concept of counting number of extra steps remains valuable in that it is independent of the number of states per character, unlike CI and RC. In addition, the number of extra steps a character takes on a tree depends solely on the topology of the tree in question, and can be calculated for any optimization scheme. These are valuable properties which make $ES_i$ a good basis for a quality measure of trees, because it allows trees to be selected or discarded independently of other trees”.

But selecting trees based on $\sum ES_i$ is identical to selecting trees based on total steps, $S = \sum s_i$, or any of the other statistics they mention. As extra steps differs (over all possible trees) from total steps by a constant, and CI and RC decrease monotonically with total steps, any tree which maximizes CI, or RC, will also minimize both $\sum ES_i$ and S. Among different trees, the number of steps and the number of extra steps for a character have exactly the same topology dependence, and both can “be calculated for any optimization scheme”. The very idea that one can “calculate the number of steps for an optimization scheme” is actually a strange one, as “optimization” is a procedure used to calculate the minimum number of steps required by a tree. Furthermore, it is hard to imagine how a tree could be selected “independently” of other trees. “Selecting a tree” may only mean it is considered better than the others, according to any measure employed. Any difference that could result from evaluating trees using total extra steps instead of total steps exists only in Turner and Zandee’s imagination.
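The equivalence of these statistics as tree selectors can be checked on a toy case. The sketch below assumes the usual whole-tree definitions CI = M/S, RI = (G − S)/(G − M) and RC = CI × RI, where M and G are the summed minimum and maximum possible steps and are the same for every tree. The step counts are HYPOTHETICAL, invented for this illustration, as are the names `tree_statistics` and `ranking`.

```python
def tree_statistics(steps, min_steps, max_steps):
    """Whole-tree statistics from per-character step counts."""
    total = sum(steps)          # S: total steps on this tree
    minimum = sum(min_steps)    # M: same constant for every tree
    maximum = sum(max_steps)    # G: same constant for every tree
    ci = minimum / total
    ri = (maximum - total) / (maximum - minimum)
    return {"S": total, "ES": total - minimum, "CI": ci, "RC": ci * ri}


# HYPOTHETICAL data: four characters, three trees (invented values).
MIN_STEPS = [1, 1, 2, 1]
MAX_STEPS = [4, 3, 5, 6]
TREES = {"A": [1, 2, 3, 2], "B": [2, 2, 2, 4], "C": [3, 1, 4, 1]}


def ranking(statistic, best_is_high):
    """Tree names ordered from best to worst under the given statistic."""
    values = {name: tree_statistics(steps, MIN_STEPS, MAX_STEPS)[statistic]
              for name, steps in TREES.items()}
    return sorted(values, key=values.get, reverse=best_is_high)


if __name__ == "__main__":
    print(ranking("S", best_is_high=False))
    print(ranking("ES", best_is_high=False))
    print(ranking("CI", best_is_high=True))
    print(ranking("RC", best_is_high=True))
```

All four rankings coincide, since total extra steps is S minus a constant and the whole-tree CI and RC decrease as S grows.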

The accusation quoted above, that I had not discussed whether trees maximizing concave functions of the homoplasy are self-consistent is equally unfounded. My discussion actually used as basis the notion of self-consistency-i.e. whether conflict was resolved in favor of characters with less homoplasy. I concluded that only by maximizing concave functions of the homoplasy was self-consistency automatically achieved; maximizing a linear function (i.e. finding shortest trees under prior weights) could resolve character conflict inconsistently.

Strength

In my paper I concluded that only by maximizing concave functions of the homoplasy was self-consistency automatically achieved. I pointed out that “if fit is measured as a linear function of the homoplasy, the reliability of the characters is fixed beforehand, and character conflict is resolved according to those prior weights” (Goloboff, 1993a: 86-87), obviously implying that shortest trees may be inconsistent. However, Turner and Zandee propose that

“Goloboff’s concern that MPTs may not be self-consistent is unfounded if self-consistency can be equated with maximum fitness according to his formula for F. At sufficiently high values of k at least some MPTs are always in the set of fittest trees”.

which ignores the fact that self-consistency is defined by reference to resolution of character conflict relative to (implied) character reliability, not by reference to the maximization of some absolute magnitude. This is simply another attempt at manipulating definitions, more futile in this case because they had earlier accused me of not addressing whether maximum fit trees are self-consistent. That earlier accusation shows that they know that self-consistency cannot be simply redefined as “maximum fit”, since, if it could, maximum fit trees would be self-consistent by definition. Their attempt to defend shortest trees by plays on words is, in this case, merely hypocritical.

For very high values of k, the fitting function approaches linearity and, as already pointed out, this is bound to ignore the relative amounts of homoplasy in different characters during tree comparisons. In other words, it is equivalent to weighting only according to prior weights (which will normally be all equal).
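The approach to linearity can be made explicit with a short derivation. It uses only the definitions $f_i = k/(k + es_i)$ and $F = \sum f_i$ given in the Introduction; the symbol n, for the number of characters, is the only notation introduced here.

```latex
\begin{align*}
f_i &= \frac{k}{k + es_i} = 1 - \frac{es_i}{k + es_i}, \\
F &= \sum_{i=1}^{n} f_i = n - \sum_{i=1}^{n} \frac{es_i}{k + es_i}, \\
\lim_{k \to \infty} F &= n, \qquad
\lim_{k \to \infty} k\,(n - F)
  = \lim_{k \to \infty} \sum_{i=1}^{n} \frac{k \, es_i}{k + es_i}
  = \sum_{i=1}^{n} es_i .
\end{align*}
```

For large k the deficit n − F is therefore approximately $(\sum es_i)/k$, so that ranking trees by F approaches ranking them by their total number of extra steps, with every extra step counting about the same regardless of the character in which it occurs.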

Note that Turner and Zandee’s attempt to make trees with fewest raw steps “self-consistent” could be accomplished more sensibly not by using a milder weighting function, as they propose, but instead by using a much stronger one. For weights assigned with an extremely strong weighting function, as in compatibility analysis, almost any tree (including of course those shortest under equal weights) will be shortest under its implied weights. But what is required is not just self-consistency, it is self-consistency under a reasonable weighting function. An extremely strong weighting function is difficult to justify. Farris’ (1983: 31-33) criticism of those functions is especially lucid: that the eyes were acquired independently in octopi and vertebrates hardly means that the eyes were necessarily acquired independently in each species of vertebrate and in each species of octopus.

Farris’ (1983) argument on cliques shows that weighting functions should not be too strong. They should not be too weak, either, because "characters which have failed repeatedly to adjust to the expectation of hierarchic correlation are more likely to fail again in the future, and so they are less likely to predict accurately the distribution of as yet unobserved characters" (Goloboff, 1993a: 84). Giving all characters equal weight when the trees being examined imply that some are much more reliable than others does not seem the best course of action. It is then clear that neither extreme is desirable: neither "all or none" nor "no-weighting" functions. Where exactly should we draw the line? This is what deserves further investigation, as I had noted in my paper. For example, if it was shown for real data that the trees for a given concavity often predict better the distribution of subsequently found characters, that would be a good reason to prefer that concavity. That would require a carefully designed study. I also suggested the possibility that different concavities might have to be used for different numbers of taxa; this could perhaps be the case because the potential for detecting unreliability (homoplasy) is less for fewer taxa. Until those questions are properly answered, it has to be acknowledged that the characters will have to be weighted approximately, with some imprecision. But imprecision is not the same thing as arbitrariness. As noted by Farris (1995), imprecision seems generally preferable to the specious precision obtained from ludicrous premises.

Given that the estimation of weights on the basis of homoplasy is approximate, it is desirable to express this in the formula to be used. It is with this aim that the constant k was introduced. But Turner and Zandee object to the influence of k:

"We have shown above that F is not well-behaved in that (1) different values of k may result in different sets of fittest trees, and (2) even within the set of MPTs, the fittest tree may depend on the value of k. This behaviour is not completely unexpected because different weighting schemes (i.e. different values of k) are expected to give different results. Our (and Goloboff's) initial question as to the appropriate value of k could not be answered. There seems to be no foundation for any particular choice. Therefore, F seems inappropriate as a tool for weighting characters . . . An ideal function for implied weighting should also be independent of any buffering constant, unlike F which is dependent on k".

Turner and Zandee present my same reasoning, but backwards. One of their stated goals is to propose an answer to my question of the exact concavity that should be used. I had asked that question because I knew that different concavities could lead to different results. Turner and Zandee contend that because different concavities can lead to different results, any possible choice between concavities would be arbitrary, and then we have to prefer no concavity at all. But their claim that there is "no foundation for any particular choice" of weighting strength is false; as shown above, very mild or very strong functions are illogical, and it is clear they should not be chosen. Turner and Zandee would always prefer the same set of trees, because they have chosen one of those illogical extremes: that confidence in homoplastic characters should not decrease in the slightest². They seem to base their preference, however, only on the mistaken idea that choosing not to weight at all avoids the problem of having to choose how strongly to weight. But the impression that deciding the degree of strength for a weighting function is avoided by preferring "no strength at all" is simply a delusion. Turner and Zandee have not actually avoided a decision, they simply reached one without any justification. It is then clear that my question about determining strength could never have been answered by Turner and Zandee, as they did not actually confront the problem; they only evaded it.

²Note that under the other extreme, with characters completely reliable or completely unreliable, choosing "best fit" trees is equivalent to choosing compatibility trees. Felsenstein (1981) had suggested that a continuum existed between parsimony and clique analysis, with "all or none" weighting functions representing cliques. That is strictly true only when the weights themselves are to be maximized. If "clique" weighting functions are used to simply re-weight the characters, it is not possible to choose trees on the basis of self-consistency, as in that case any tree is shortest under its implied weights.
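The dependence of the preferred tree on k can be made concrete with a toy case. This is a sketch under stated assumptions: the two extra-step vectors below are invented for illustration (they are not from Turner and Zandee's data or from mine), and `total_fit` and `preferred` are hypothetical names. Tree A concentrates its homoplasy in one character; tree B is shorter under equal weights but spreads homoplasy over two characters.

```python
# Hypothetical extra steps per character on two competing trees.
TREE_A = [4, 0, 0]  # 4 extra steps in total, all in one character
TREE_B = [0, 1, 1]  # 2 extra steps in total, in two characters


def total_fit(extra_steps, k):
    """Total fit F: the sum over characters of k / (k + es)."""
    return sum(k / (k + es) for es in extra_steps)


def preferred(k):
    """Name of the tree with the higher total fit for concavity constant k."""
    return "A" if total_fit(TREE_A, k) > total_fit(TREE_B, k) else "B"


if __name__ == "__main__":
    for k in (1, 3, 10, 1000):
        print(f"k={k:>4}: F(A)={total_fit(TREE_A, k):.3f} "
              f"F(B)={total_fit(TREE_B, k):.3f} -> tree {preferred(k)}")
```

With k = 1 the longer tree A has the higher fit; from k = 3 upward the shorter tree B is preferred, as it is under equal weights. The choice of k therefore matters, which is a reason to study which concavities are defensible rather than a reason to abandon weighting.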

Precision

That one should not want spurious precision is clear from the preceding section. That in itself suffices to make suspect the accusation of Turner and Zandee that

“. . . probably due to shortcuts taken during fitness calculation, the results reported by Pee-Wee are inexact. As can be seen in Table 3, the two trees reported as fittest at kO actually differ in fitness by 0.008 (i.e. within the margin reported by Pee-Wee). . . . We can thus not guarantee that trees obtained by Pee-Wee (and reported here) as fittest . . . are indeed the best fitting ones”.

But their criticism has another defect as well. Although it is true that the results reported by Pee-Wee are not "exact" (i.e. a finite number of significant digits is used), Turner and Zandee are wrong in claiming that 0.008 is "within the margin reported by the program." The margin is actually 0.009, and not for the total fit. The program calculates the fit as truncated; what is truncated is the individual fit, so that what is always within 0.009 is the fit for an individual character, not the total fit. When individual fits are summed the errors might add up to more than 0.009 (which the program would subsequently multiply by 10). Then, an error of 0.008 is not within the margin of the program, and that difference is in itself not a reason to believe that any shortcut is causing an error. But there is another reason to discard that possibility: all the shortcuts in the program work exclusively by saving time in the calculation of numbers of steps and in collapsing trees during searches³. Once the numbers of steps have been calculated, the fit for each character is obtained by simply dividing k by (k+es). Errors in calculation of numbers of steps (possible but rare) would cause errors in total fit of more than 0.008. Besides, the error would quickly become apparent if the shortcuts for tree searches are deactivated, or the fit for the trees resulting from a search is recalculated with a complete optimization. The "error" reported by Turner and Zandee is simply that which results from dividing an integer.

The documentation of Pee-Wee also points out that giving all characters a higher prior weight increases the precision with which the fit is measured. This is because, during fit calculation, k is first multiplied by the prior weight and subsequently divided by (k+es); precision would not be increased if one first divided k by (k+es) and subsequently multiplied by the prior weight. The maximum prior weight allowed for a character in the program is 10, however, as numbers during partial fit calculations could exceed 32 767 (16 bits) if much higher weights were used in conjunction with higher values of k. With all characters given a prior weight of 10, the rounding error for the fit of an individual character is within 0.0009, or less than 1 in 1000. Given the limitations necessary for any computer implementation, therefore, Pee-Wee provides a reasonable approximation. Once the necessarily discrete nature of the fit measure for an individual character is recognized, it can be seen that Pee-Wee’s trial-and-error methods are exactly equivalent to those of other parsimony programs.

³The method for implicit calculation of tree lengths described in Goloboff (1994) is not a concern here, because that method finds the exact length in every case. The only shortcut that could cause an error in fit calculation might be the one used to estimate quickly the state assignments when the tree is clipped in two (i.e. the "qsearch" command of Pee-Wee). That shortcut (which can optionally be deactivated, slowing down the program) may produce errors, although very rarely.
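The arithmetic behind these margins can be sketched as follows. This is an illustration only, not Pee-Wee's source code: it assumes, hypothetically, that an individual fit is stored as an integer number of hundredths (`SCALE = 100`) obtained by integer division, and the function names are invented. Under that assumption the truncation error for one character stays below 0.01 with prior weight 1 and below 0.001 per unit of weight with prior weight 10, and multiplying before dividing is what yields the extra digit.

```python
from fractions import Fraction

SCALE = 100  # assumed storage unit: hundredths of a unit of fit


def truncated_fit(es, k, weight=1):
    """Multiply by the prior weight first, then integer-divide by (k + es)."""
    return (SCALE * weight * k) // (k + es)


def divide_first_fit(es, k, weight=1):
    """Integer-divide first, then multiply: no precision is gained."""
    return ((SCALE * k) // (k + es)) * weight


def truncation_error(es, k, weight=1):
    """Exact fit minus stored fit, per unit of prior weight."""
    exact = Fraction(k, k + es)
    stored = Fraction(truncated_fit(es, k, weight), SCALE * weight)
    return exact - stored


if __name__ == "__main__":
    es, k = 4, 3  # exact fit is 3/7 = 0.428571...
    print(truncated_fit(es, k, 1), float(truncation_error(es, k, 1)))
    print(truncated_fit(es, k, 10), float(truncation_error(es, k, 10)))
    print(divide_first_fit(es, k, 10))
```

In this sketch two characters with es = 4 and k = 3 already accumulate a summed error above 0.009 at prior weight 1, which illustrates why a per-character margin cannot be read as a margin for the total fit.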

A more conceptual problem with Turner and Zandee’s criticism regarding lack of precision in Pee-Wee is that it is very doubtful that more precision is desirable. This is because, as suggested in the preceding section, the formula for calculating fit as a function of the homoplasy is an estimation of the reliability, not an exact measurement of it. This points to a feature desirable in a program like Pee-Wee, but that still has not been implemented: to retain, during searches, trees that differ in fit by a small amount. This would recognize the fact that weighting is not of an exact nature. Future versions of Pee-Wee may include this option, but it has still not been implemented (in part, because only trees with a slightly lower fit but saving steps in at least one character should be retained; it is meaningless to retain a tree with a small decrease of fit due to only increasing steps in a bad character; this requires some extra computational work). Note that this option (desirable, although computationally difficult) would amount exactly to that objected to by Turner and Zandee, a decrease in the precision.

Character Independence

Although Turner and Zandee find many defects in my method, they nonetheless recognize that avoiding the iterations of successive weighting is a practical advantage, as it usually saves time and computational effort. They think that one could still search for trees of “maximum fit”, but to avoid the possibility that longer trees are selected, a different fitting formula should be used, one which

“must differentiate between trees of different length, preferring the most parsimonious trees. The weight function should have a combination of ES, and the total number of (extra) steps for all characters on the tree in the denominator, i.e. it should weight against longer trees”.

Obviously, their fitting function would have the effect of producing trees shortest under equal weights, which seems reasonable at first. But this would be done by maximizing the “fit” for all characters, measured with a function which (as shown below) cannot possibly be interpreted as measuring the fit for individual characters. Turner and Zandee’s proposal, more than providing a justification for preferring shortest trees, casts doubts on their reasons for that preference.

In their stubbornness to consider only trees shortest under equal weights, they are proposing to measure the fit for a character according not only to its own agreement with the tree, but also according to the agreement of other characters. That is bound to create problems. For example, consider having 10 taxa A-J plus an all-zero root, where character X has state 0 for A-E and 1 for F-J, and a set of 8 perfectly congruent characters determine a pectinate alphabetical tree. Turner and Zandee (and presumably any other phylogeneticist) would agree that character X has a perfect fit to the tree of no extra steps, (A(B(C(D(E(F(G(H(I J))))))))). Now consider the tree of 9 extra steps, with the sequence of taxa backwards within each group, (E(D(C(B(A((((F G)H)I)J)))))). In that tree, F-J form a monophyletic group, so that X, just like before, can be accounted for without homoplasy. But although X is in perfect agreement with the tree, Turner and Zandee (and presumably no other phylogeneticist) would consider that it “fits” this second tree much more poorly than it did the first, because the other characters fit the tree more poorly. Turner and Zandee do not give an actual formula for a fitting function, and therefore it is not possible to give actual figures. But any formula satisfying their requirement of measuring fit for individual characters taking into account total length will have the absurd consequences just described, and perhaps the even more absurd implication that an individual character may better “fit” trees which require more steps for the character. Although Turner and Zandee would never accept that a tree with more total steps may have a better fit to the data, their function could easily lead one to consider a tree that requires more steps for a given character as having a better fit to that character, something that can hardly deserve the name of “parsimony”.
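The step counts in this example can be checked with a short Fitch-optimization sketch. The encoding is an assumption made for illustration: trees are nested tuples with the all-zero root attached as an outgroup leaf, and the 8 congruent characters are taken to be the groups B-J, C-J, ..., I-J of the pectinate tree. All function names are hypothetical.

```python
TAXA = "ABCDEFGHIJ"


def fitch_steps(tree, states):
    """Fitch pass on a rooted binary tree of nested tuples; returns
    (possible state set at this node, number of steps below it)."""
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_steps = fitch_steps(tree[0], states)
    right_set, right_steps = fitch_steps(tree[1], states)
    steps = left_steps + right_steps
    common = left_set & right_set
    if common:
        return common, steps
    return left_set | right_set, steps + 1


def character(ones):
    """Binary character: state 1 for taxa in `ones`, 0 elsewhere and at the root."""
    states = {taxon: (1 if taxon in ones else 0) for taxon in TAXA}
    states["root"] = 0
    return states


def extra_steps(tree, states):
    """Steps beyond the minimum of 1 for a variable binary character."""
    return fitch_steps(tree, states)[1] - 1


def pectinate(items):
    """Build (i1,(i2,(...(i_n-1, i_n)))) from a list of leaves or subtrees."""
    tree = items[-1]
    for item in reversed(items[:-1]):
        tree = (item, tree)
    return tree


TREE_1 = ("root", pectinate(list(TAXA)))
F_TO_J = (((("F", "G"), "H"), "I"), "J")
TREE_2 = ("root", pectinate(["E", "D", "C", "B", "A", F_TO_J]))

X = character("FGHIJ")
CONGRUENT = [character(TAXA[i:]) for i in range(1, 9)]  # B-J, C-J, ..., I-J

if __name__ == "__main__":
    for name, tree in (("first tree", TREE_1), ("second tree", TREE_2)):
        others = sum(extra_steps(tree, c) for c in CONGRUENT)
        print(f"{name}: X extra steps = {extra_steps(tree, X)}, "
              f"congruent characters extra steps = {others}")
```

Under these assumptions character X requires no extra steps on either tree, while the 8 congruent characters require 0 extra steps on the first tree and 9 on the second. Any per-character fit that changes between the two trees for X is therefore responding to the other characters, not to X.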

The problem of character independence needed no mention in my 1993a paper, but Turner and Zandee’s fitting function requires that attention be brought to it. Characters in phylogenetic analysis are usually considered as independent lines of evidence. That is, to the extent that there is no necessary correlation (logical or functional) between several characters, the fact that they suggest a similar phylogenetic conclusion, when they do, is significant. To the extent that there is no independence among characters, there is no particular meaning in their agreement; that agreement is just an expression of mutual dependence. Characters are then expected to be independent of each other, and well-justified methods will measure the fit for each character accordingly, independently of the fit of others. Turner and Zandee’s method, unlike mine, would consider that a character fits a tree well only if other characters also fit the tree well. It is as if characters were no longer considered independent pieces of evidence, as if the degree to which a character conforms to a hypothesis of relationships depended on how well other characters conform. Ironically, Turner and Zandee propose their fitting function to salvage what they believe is a requirement of parsimony: trees of minimum length under equal weights. In so doing, they forgot something that, unlike equal weights, is really a requirement of parsimony. Parsimony requires, to produce meaningful results, that characters be independent.

I thank J. M. Carpenter, J. S. Farris, N. I. Platnick, G. Scrocchi, and O. Seberg for comments on the manuscript and encouragement.

REFERENCES

CARPENTER, J. M. 1988. Choosing among multiple equally parsimonious cladograms. Cladistics 4: 291-296.

FARRIS, J. S. 1969. A successive approximations approach to character weighting. Syst. Zool. 18: 374-385.

FARRIS, J. 1977. On the phenetic approach to vertebrate classification. In: Goody, P. and B. Hecht (eds.). Major patterns in Vertebrate evolution. Plenum Press, New York, pp. 823-850.

FARRIS, J. 1978. The 11th annual numerical taxonomy conference - and part of the 10th. Syst. Zool. 27: 229-238.

FARRIS, J. 1979a. On the naturalness of phylogenetic classification. Syst. Zool. 28: 200-214.

FARRIS, J. 1979b. The information content of the phylogenetic system. Syst. Zool. 28: 483-519.

FARRIS, J. 1980a. Naturalness, information, invariance, and the consequences of phenetic criteria. Syst. Zool. 29: 360-381.

FARRIS, J. 1980b. The efficient diagnoses of the phylogenetic system. Syst. Zool. 29: 386-401.

FARRIS, J. 1982. Simplicity and informativeness in systematics and phylogeny. Syst. Zool. 31: 413-444.

FARRIS, J. S. 1983. The logical basis of phylogenetic analysis. In: N. Platnick and V. Funk (eds.). Proceedings of the Second Meeting of the Willi Hennig Society. Advances in Cladistics 2. Columbia University Press, New York, pp. 7-36.

FARRIS, J. S. 1989. The retention index and rescaled consistency index. Cladistics 5: 417-419.

FARRIS, J. S. 1995. Conjectures and refutations. Cladistics 11: 105-108.

FELSENSTEIN, J. 1981. A likelihood approach to character weighting, and what it tells us about parsimony and compatibility. Biol. J. Linn. Soc. 16: 183-196.

GOLOBOFF, P. A. 1993a. Estimating character weights during tree search. Cladistics 9: 83-91.

GOLOBOFF, P. A. 1993b. Pee-Wee. Ver. 2. The American Museum of Natural History, New York.

GOLOBOFF, P. A. 1993c. A reanalysis of Mygalomorph spider families (Araneae). Am. Mus. Novit. 3056: 1-32.

GOLOBOFF, P. A. 1994. Character optimization and calculation of tree lengths. Cladistics 9: 433-436.

GOLOBOFF, P. A. 1995. A revision of the South American spiders of the family Nemesiidae (Araneae, Mygalomorphae). Part I: species from Peru, Chile, Argentina, and Uruguay. Bull. Am. Mus. Nat. Hist. 224: 1-189.

HENNIG, W. 1966. Phylogenetic systematics. University of Illinois Press, Urbana.

SZUMIK, C. A. 1994. Oligembia vetusta, a new fossil teratembiid (Embioptera) from Dominican Amber. J. N.Y. Entomol. Soc. 102: 67-73.

SZUMIK, C. A. (in press). The higher classification of the order Embioptera: A cladistic analysis. Cladistics.

TURNER, H. AND R. ZANDEE. 1995. The behaviour of Goloboff's tree fitness measure F. Cladistics 11: 57-72.

