the probabilities of trees and cladograms under...

8
Research Article The Probabilities of Trees and Cladograms under Ford’s -Model Tomás M. Coronado , Arnau Mir, and Francesc Rosselló Balearic Islands Health Research Institute (IdISBa) and Department of Mathematics and Computer Science, University of the Balearic Islands, 07122 Palma, Spain Correspondence should be addressed to Francesc Rossell´ o; [email protected] Received 1 February 2018; Accepted 8 March 2018; Published 18 April 2018 Academic Editor: B´ ela T´ othm´ er´ esz Copyright © 2018 Tom´ as M. Coronado et al. is is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Ford’s -model is one of the most popular random parametric models of bifurcating phylogenetic tree growth, having as specific instances both the uniform and the Yule models. Its general properties have been used to study the behavior of phylogenetic tree shape indices under the probability distribution it defines. But the explicit formulas provided by Ford for the probabilities of unlabeled trees and phylogenetic trees fail in some cases. In this paper we give correct explicit formulas for these probabilities. 1. Introduction e study of random growth models of rooted phylogenetic trees and the statistical properties of the shapes of the phylogenetic trees they produce was initiated almost one century ago by Yule [1] and it has gained momentum in the last 20 years: see, for instance, [2–8]. e final goal of this line of research is to understand the relationship between the forces that drive evolution and the topological properties of “real-life” phylogenetic trees [3, 9]; see also [10, Chapter 33]. One of the most popular such models is Ford’s -model for rooted bifurcating phylogenetic trees or cladograms [4]. It consists of a parametric model that generalizes both the uniform model (where new leaves are added equiprobably to any arc, giving rise to the uniform probability distribution on the sets of cladograms with a fixed set of taxa) and Yule’s model (where new leaves are added equiprobably only to pendant arcs, i.e., to arcs ending in leaves) by allocating a possibly different probability (that depends on a parameter and hence its name, “-model”) to the addition of the new leaves to pendant arcs or to internal arcs. When models like Ford’s model are used to contrast topological properties of phylogenetic trees contained in databases like TreeBase (https://treebase.org), only their general properties (moments, asymptotic behavior) are employed. But, in the course of a research where we have needed to compute the probabilities of several specific cladograms under this model [11], we have noticed that the explicit formulas that Ford gives in [4, §3.5] for the probabilities of cladograms and of tree shapes (unlabeled rooted bifurcating trees) are wrong, failing for some trees with ⩾8 leaves; see Propositions 29 and 32 in [4], with the definition of given in page 30 therein, for Ford’s formulas. So, to help the future user of Ford’s model, in this paper we give the correct explicit formulas for these prob- abilities. is paper is accompanied by the GitHub page https://github.com/biocom-uib/prob-alpha where the interested reader can find a SageMath [12] module to compute these probabilities and their explicit values on the sets T of cladograms with leaves labeled 1,...,, for every from 2 to 8. 2. Preliminaries 2.1. Definitions, Notations, and Conventions. roughout this paper, by a tree , we mean a rooted bifurcating tree. As it is customary, we understand as a directed graph, with its arcs pointing away from the root, which we shall denote by . en, all nodes in have out-degree either 0 (its leaves, which form the set ()) or 2 (its internal nodes, which form the set int ()). e children of an internal node V are those Hindawi e Scientific World Journal Volume 2018, Article ID 1916094, 7 pages https://doi.org/10.1155/2018/1916094

Upload: others

Post on 25-Aug-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: The Probabilities of Trees and Cladograms under Ford’sdownloads.hindawi.com/journals/tswj/2018/1916094.pdf · symbolic computation program, let us mention that if we take expression

Research ArticleThe Probabilities of Trees and Cladograms underFordrsquos 120572-Model

Tomaacutes M Coronado ArnauMir and Francesc Rosselloacute

Balearic Islands Health Research Institute (IdISBa) and Department of Mathematics and Computer ScienceUniversity of the Balearic Islands 07122 Palma Spain

Correspondence should be addressed to Francesc Rossello cescrossellouibes

Received 1 February 2018 Accepted 8 March 2018 Published 18 April 2018

Academic Editor Bela Tothmeresz

Copyright copy 2018 Tomas M Coronado et al This is an open access article distributed under the Creative Commons AttributionLicense which permits unrestricted use distribution and reproduction in any medium provided the original work is properlycited

Fordrsquos 120572-model is one of the most popular random parametric models of bifurcating phylogenetic tree growth having as specificinstances both the uniform and the Yule models Its general properties have been used to study the behavior of phylogenetictree shape indices under the probability distribution it defines But the explicit formulas provided by Ford for the probabilitiesof unlabeled trees and phylogenetic trees fail in some cases In this paper we give correct explicit formulas for these probabilities

1 Introduction

The study of random growth models of rooted phylogenetictrees and the statistical properties of the shapes of thephylogenetic trees they produce was initiated almost onecentury ago by Yule [1] and it has gained momentum in thelast 20 years see for instance [2ndash8] The final goal of thisline of research is to understand the relationship betweenthe forces that drive evolution and the topological propertiesof ldquoreal-liferdquo phylogenetic trees [3 9] see also [10 Chapter33] One of the most popular such models is Fordrsquos 120572-modelfor rooted bifurcating phylogenetic trees or cladograms [4]It consists of a parametric model that generalizes both theuniform model (where new leaves are added equiprobablyto any arc giving rise to the uniform probability distributionon the sets of cladograms with a fixed set of taxa) and Yulersquosmodel (where new leaves are added equiprobably only topendant arcs ie to arcs ending in leaves) by allocating apossibly different probability (that depends on a parameter120572 and hence its name ldquo120572-modelrdquo) to the addition of the newleaves to pendant arcs or to internal arcs

When models like Fordrsquos model are used to contrasttopological properties of phylogenetic trees contained indatabases like TreeBase (httpstreebaseorg) onlytheir general properties (moments asymptotic behavior)are employed But in the course of a research where we

have needed to compute the probabilities of several specificcladograms under this model [11] we have noticed thatthe explicit formulas that Ford gives in [4 sect35] for theprobabilities of cladograms and of tree shapes (unlabeledrooted bifurcating trees) are wrong failing for some treeswith 119899 ⩾ 8 leaves see Propositions 29 and 32 in [4] with thedefinition of 119902 given in page 30 therein for Fordrsquos formulas

So to help the future user of Fordrsquos model in thispaper we give the correct explicit formulas for these prob-abilities This paper is accompanied by the GitHub pagehttpsgithubcombiocom-uibprob-alpha wherethe interested reader can find a SageMath [12] module tocompute these probabilities and their explicit values on thesetsT119899 of cladograms with 119899 leaves labeled 1 119899 for every119899 from 2 to 8

2 Preliminaries

21 Definitions Notations and Conventions Throughout thispaper by a tree 119879 we mean a rooted bifurcating tree As itis customary we understand 119879 as a directed graph with itsarcs pointing away from the root which we shall denote by119903119879 Then all nodes in 119879 have out-degree either 0 (its leaveswhich form the set 119871(119879)) or 2 (its internal nodes which formthe set 119881int(119879)) The children of an internal node V are those

Hindawie Scientific World JournalVolume 2018 Article ID 1916094 7 pageshttpsdoiorg10115520181916094

2 The Scientific World Journal

1 2 3 4

1 2 3 4

n

lowast

lowastn

olowast

lowastn

o

n

Figure 1 An example of images under the forgetful mappings between (ordered and unordered) cladograms and tree shapes In the orderedobjects the ordering is represented by the nodesrsquo colors gray ≺ white

nodes 119906 such that (V 119906) is an arc in 119879 and they form the setchild(V) A node 119909 is a descendant of a node V when thereexists a directed path from V to 119909 in 119879 For every node V thesubtree119879V of 119879 rooted at V is the subgraph of119879 induced on theset of descendants of V

A tree 119879 is ordered when it is endowed with an ordering≺V on every set child(V) A cladogram (resp an orderedcladogram) on a set of taxa Σ is a tree (resp an ordered tree)with its leaves bijectively labeled in Σ Whenever we want tostress the fact that a tree is not a cladogram that is it is anunlabeled tree we shall use the term tree shape

It is important to point out that although ordered treeshave no practical interest from the phylogenetic point of viewbecause the orderings on the sets of children of internal nodesdo not carry any phylogenetic information they are usefulfrom the mathematical point of view because the existenceof the orderings allows one to easily prove certain extraproperties that can later be translated to the unordered setting(cf Proposition 1)

An isomorphism of ordered trees is an isomorphism ofrooted trees that moreover preserves these orderings Anisomorphism of cladograms (resp of ordered cladograms)is an isomorphism of trees (resp of ordered trees) thatpreserves the leavesrsquo labels We shall always identify a treeshape an ordered tree shape a cladogram or an orderedcladogram with its isomorphism class and in particular weshall make henceforth the abuse of language of saying thattwo of these objects 119879 1198791015840 are the same in symbols 119879 = 1198791015840when they are (only) isomorphic We shall denote byTlowast119899 andOTlowast119899 respectively the sets of tree shapes and of ordered treeshapes with 119899 leaves Given any finite set of taxa Σ we shalldenote by TΣ and OTΣ respectively the sets of cladogramsand of ordered cladograms on Σ When the specific set Σ isunrelevant and only its cardinal matters we shall write T119899and OT119899 (with 119899 = |Σ|) instead of TΣ and OTΣ and thenwe shall understand that Σ is [119899] = 1 2 119899

There exist natural isomorphism-preserving forgetfulmappings

n

o

n

lowastn

olowast

lowastn

lowast

that ldquoforgetrdquo the orderings or the labels of the trees Inparticular we shall call the image of a cladogram under 120587its shape Figure 1 depicts an example of images under theseforgetful mappings

Let us introduce some more notations For every nodeV in a tree 119879 120581119879(V) is its number of descendant leaves Forevery internal node V in an ordered tree 119879 with childrenV1≺VV2 its numerical split is the ordered pair NS119879(V) =(120581119879(V1) 120581119879(V2)) If instead 119879 is unordered and if child(V) =V1 V2 with 120581119879(V1) ⩽ 120581119879(V2) then NS119879(V) = (120581119879(V1) 120581119879(V2))In both cases themultiset of numerical splits of 119879 is NS(119879) =NS119879(V) | V isin 119881int(119879) For instance if 119879 is the cladogramdepicted in Figure 2 then

NS (119879) = (1 1) (1 1) (1 1) (2 2) (1 4) (2 5) (1)

A symmetric branch point in a tree 119879 is an internal nodeV such that if V1 and V2 are its children then the subtrees 119879V1and119879V2 of119879 rooted at themhave the same shape For instancethe symmetric branch points in the cladogram depicted inFigure 2 are those filled in black

Given two cladograms 119879 and1198791015840 on Σ and Σ1015840 respectivelywith Σ cap Σ1015840 = 0 their root join is the cladogram 119879 ⋆ 1198791015840 onΣcupΣ1015840 obtained by connecting the roots of119879 and1198791015840 to a (new)common root 119903 see Figure 3 If 119879 1198791015840 are ordered cladograms119879⋆1198791015840 is ordered by inheriting the orderings on 119879 and 1198791015840 andordering the children of the new root 119903 as 119903119879≺1199031199031198791015840 If 119879 and1198791015840 are tree shapes a similar construction yields a tree shape

The Scientific World Journal 3

119879⋆1198791015840 if they are moreover ordered then 119879⋆1198791015840 becomes anordered tree shape as explained above

22The 120572-Model Fordrsquos 120572-model [4] defines for every 119899 ⩾ 1a family of probability density functions 119875(lowast)120572119899 on Tlowast119899 thatdepends on one parameter 120572 isin [0 1] and then it translatesthis family into three other families of probability densityfunctions 119875120572119899 on T119899 119875(119900lowast)120572119899 on OTlowast119899 and 119875(119900)120572119899 on OT119899by imposing that the probability of a tree shape is equallydistributed among its preimages under 120587 120587119900lowast and 120587 ∘ 120587119900 =120587119900lowast ∘ 120587lowast respectively

It is well known [13] that every119879 isin T119899 can be obtained ina unique way by adding recurrently to a single node labeled 1

new leaves labeled 2 119899 to arcs (ie splitting an arc (119906 V)into two arcs (119906 119908) and (119908 V) and then adding a new arc fromthe inserted node119908 to a new leaf) or to a new root (ie addinga new root 119908 and new arcs from 119908 to the old root and to anew leaf) The value of 119875(lowast)120572119899 (119879lowast) for 119879lowast isin Tlowast119899 is determinedthrough all possible ways of constructing cladograms withshape 119879lowast in this way More specifically

(1) if 1198791 and 1198792 denote respectively the only cladogramsinT1 andT2 let 11987510158401205721(1198791) = 11987510158401205722(1198792) = 1

(2) for every 119898 = 3 119899 let 119879119898 isin T119898 be obtained byadding a new leaf labeled119898 to 119879119898minus1 Then

1198751015840120572119898 (119879119898) = 120572119898 minus 1 minus 120572 sdot 1198751015840120572119898minus1 (119879119898minus1) if the new leaf is added to an internal arc or to a new root1 minus 120572119898 minus 1 minus 120572 sdot 1198751015840120572119898minus1 (119879119898minus1) if the new leaf is added to a pendant arc

(2)

(3) When the desired number 119899 of leaves is reached theprobability of every tree shape 119879lowast119899 isin Tlowast119899 is defined as

119875(lowast)120572119899 (119879lowast119899 ) = sum120587(119879119899)=119879

lowast

119899

1198751015840120572119899 (119879119899) (3)

For instance Figure 4 shows the construction of twocladograms in T5 with the same shape and how theirprobability 11987510158401205725 is built using the recursion in Step (2)If we generate all cladograms in T5 with this shape wecompute their probabilities 11987510158401205725 and then we add up all theseprobabilities we obtain the probability 119875(lowast)1205725 of this shapewhich turns out to be 2(1 minus 120572)(4 minus 120572) cf [4 Figure 23]

Once119875(lowast)120572119899 is defined onTlowast119899 it is transported toT119899OTlowast119899 and OT119899 by defining the probability of an object in one ofthese sets as the probability of its image inTlowast119899 divided by thenumber of preimages of this image

(i) For every 119879 isin T119899 if 120587(119879) = 119879lowast isin Tlowast119899 and it has 119896symmetric branch points then

119875120572119899 (119879) = 2119896119899 sdot 119875(lowast)120572119899 (119879lowast) (4)

because |120587minus1(119879lowast)| = 1198992119896 (see eg [4 Lemma 31])(ii) For every 119879119900 isin OT119899 if 120587119900(119879119900) = 119879 isin T119899 then

119875(119900)120572119899 (119879119900) = 12119899minus1 sdot 119875120572119899 (119879) (5)

because |120587minus1119900 (119879)| = 2119899minus1 (119879 has 2119899minus1 differentpreimages under 120587119900 obtained by taking all possibledifferent combinations of orderings on the 119899 minus 1 setschild(V) V isin 119881int(119879lowast))

(iii) For every 119879lowast119900 isin OTlowast119899 if 120587119900lowast(119879lowast119900 ) = 119879lowast isin Tlowast119899 and ithas 119896 symmetric branch points then

119875(119900lowast)120572119899 (119879lowast119900 ) = 12119899minus119896minus1 sdot 119875(lowast)120572119899 (119879lowast) (6)

because |120587minus1119900lowast(119879lowast)| = 2119899minus1minus119896 (from the 2119899minus1 possiblepreimages of 119879lowast under 120587119900lowast defined by all possibledifferent combinations of orderings on the 119899 minus 1 setschild(V) V isin 119881int(119879lowast) those differing only on theorderings on the children of the 119896 symmetric branchpoints are actually the same ordered tree shape)

The family (119875(119900lowast)120572119899 )119899 of probabilities of ordered tree shapessatisfies the usefulMarkov branching recurrence (in the senseof [2 sect4]) given by the following proposition In it and in thesequel let for every 119886 119887 isin Z+

119902120572 (119886 119887) = Γ120572 (119886) Γ120572 (119887)Γ120572 (119886 + 119887) sdot 120593120572 (119886 119887) (7)

where

120593120572 (119886 119887) = 1205722 (119886 + 119887119886 ) + (1 minus 2120572)(119886 + 119887 minus 2119886 minus 1 ) (8)

and Γ120572 Z+ rarr R is the mapping defined by Γ120572(1) = 1 andfor every 119899 ⩾ 2 Γ120572(119899) = (119899 minus 1 minus 120572) sdot Γ120572(119899 minus 1)Proposition 1 For every 0 lt 119898 lt 119899 and for every 119879lowast119898 isin OTlowast119898and 119879lowast119899minus119898 isin OTlowast119899minus119898

119875(119900lowast)120572119899 (119879lowast119898 ⋆ 119879lowast119899minus119898)= 119902120572 (119898 119899 minus 119898)119875(119900lowast)120572119898 (119879lowast119898) 119875(119900lowast)120572119899minus119898 (119879lowast119899minus119898) (9)

This recurrence together with the fact that 119875(119900lowast)1205721 of asingle node is 1 implies that for every 119879lowast119900 isin OTlowast119899

119875(119900lowast)120572119899 (119879lowast119900 ) = prod(119886119887)isinNS(119879lowast

119900)

119902120572 (119886 119887) (10)

For proofs of Proposition 1 and (10) see Lemma 27 andProposition 28 in [4] respectively

4 The Scientific World Journal

1 2 3 4 5 6 7

Figure 2 A cladogram in T7 The black nodes are its symmetricbranch points

T

r

T

Figure 3 The root join 119879 ⋆ 11987910158403 Main Results

Our first result is an explicit formula for 119875120572119899(119879) for every119899 ⩾ 1 and 119879 isin T119899Proposition 2 For every 119879 isin T119899 its probability under the120572-model is

119875120572119899 (119879) = 2119899minus1119899 sdot Γ120572 (119899) prod(119886119887)isinNS(119879)

120593120572 (119886 119887) (11)

Proof Given 119879 isin T119899 let 119879119900 be any ordered cladogram suchthat 120587119900(119879119900) = 119879 and let 119879lowast119900 = 120587lowast(119879119900) isin OTlowast119899 and 119879lowast =120587(119879) = 120587119900lowast(119879lowast119900 ) If 119879lowast has 119896 symmetric branch points thenby (4) (6) and (10)

119875120572119899 (119879) = 2119896119899 sdot 119875(lowast)120572119899 (119879lowast) = 2119896

119899 sdot 2119899minus119896minus1 sdot 119875(119900lowast)120572119899 (119879lowast119900 )= 2119899minus1119899 prod

(119886119887)isinNS(119879lowast119900)

119902120572 (119886 119887) (12)

Now on the one hand it is easy to check that

NS (119879)= (min 119886 119887 max 119886 119887) | (119886 119887) isin NS (119879lowast0 ) (13)

and therefore since 119902120572 is symmetric

119875120572119899 (119879) = 2119899minus1119899 prod(119886119887)isinNS(119879)

119902120572 (119886 119887) (14)

It remains to simplify this product If for every V isin 119881int(119879)we denote its children by V1 and V2 then

prod(119886119887)isinNS(119879)

119902120572 (119886 119887)= prod

Visin119881int(119879)

Γ120572 (120581119879 (V1)) Γ120572 (120581119879 (V2))Γ120572 (120581119879 (V)) 120593120572 (NS (V)) (15)

For every V isin 119881int(119879) 119903119879 the term Γ120572(120581119879(V)) appearstwice in this product in the denominator of the factorcorresponding to V itself and in the numerator of the factor

corresponding to its parent Therefore all terms Γ120572(120581119879(V)) inthis product vanish except Γ120572(120581119879(119903119879)) = Γ120572(119899) (that appears inthe denominator of its factor) and every Γ120572(120581119879(V)) = Γ120572(1) =1 with V a leaf Thus

119875120572119899 (119879) = 2119899minus1119899 sdot 1Γ120572 (119899) sdot prodVisin119881int(119879)120593120572 (NS (V)) (16)

as we claimed

Remark 3 Ford states (see [4 Proposition 32 and page 30])that if 119879 isin T119899 then

119875120572119899 (119879) = 2119896119899 prod(119886119887)isinNS(119879)

119902120572 (119886 119887) (17)

where 119896 is the number of symmetric branching points in 119879and

119902120572 (119886 119887) = 2119902120572 (119886 119887) if 119886 = 119887119902120572 (119886 119887) if 119886 = 119887 (18)

If we simplifyprod(119886119887)isinNS(119879)119902120572(119886 119887) as in the proof of Proposi-tion 2 this formula for 119875120572119899(119879) becomes

119875120572119899 (119879) = 2119896+119898119899 sdot Γ120572 (119899) sdot prod(119886119887)isinNS(119879)

120593120572 (119886 119887) (19)

where119898 is the number of internal nodes whose children havedifferent numbers of descendant leaves This formula doesnot agree with the one given in Proposition 2 above because

119896 + 119898 = 119899 minus 1 minus 10038161003816100381610038161003816V isin 119881int (119879) | child (V)= V1 V2 120581119879 (V1) = 120581119879 (V2) but 120587 (119879V1)= 120587 (119879V2)10038161003816100381610038161003816

(20)

and hence it may happen that 119896 + 119898 lt 119899 minus 1 Thefirst example of a cladogram with this property (and theonly one up to relabeling with at most 8 leaves) is thecladogram isin T8 depicted in Figure 5 For it our formulagives (see (822) in the document ProblsAlphapdf inhttpsgithubcombiocom-uibprob-alpha)

1198751205728 () = (1 minus 120572)2 (2 minus 120572)126 (7 minus 120572) (6 minus 120572) (5 minus 120572) (3 minus 120572) (21)

while expression (19) assigns to a probability of half thisvalue

(1 minus 120572)2 (2 minus 120572)252 (7 minus 120572) (6 minus 120572) (5 minus 120572) (3 minus 120572) (22)

This last value cannot be right for several reasons Firstly by[4 sect312] when 120572 = 12 Fordrsquos model is equivalent to theuniform model where every cladogram in T119899 has the sameprobability

110038161003816100381610038161198611198791198991003816100381610038161003816 = 1(2119899 minus 3) (23)

The Scientific World Journal 5

1 2 1 2 3 1 2 34 1 2 345

1 2 1 3 2 1 3 24 1 5 243

P2 (T2) = 1

P2 (T2) = 1

P3 (T3) =

2 minus P4 (T4) =

1 minus

3 minus middot

2 minus P5 (T5) =

4 minus middot1 minus

3 minus middot

2 minus

P3 (T

3) =

1 minus

2 minus P4 (T

4) =

1 minus

3 minus middot1 minus

2 minus P5 (T

5) =

1 minus

4 minus middot1 minus

3 minus middot1 minus

2 minus

Figure 4 Two examples of computations of the probability 1198751015840120572119899 of a cladogram through its construction in Step (2) of the definition of the120572-model

1 2 3 4 5 6 7 8

Figure 5 The cladogram isin T8 used in Remark 3

and when 120572 = 0 Fordrsquos model gives rise to the Yule model[1 14] where the probability of every 119879 isin T119899 is

119875119884 (119879) = 2119899minus1119899 prodVisin119881int(119879)

1120581119879 (V) minus 1 (24)

In particular 119875128() should be equal to 1135135 and11987508() should be equal to 119845 Both values are consistentwith our formula while expression (22) yields half thesevalues

As a second reason which can be checked using asymbolic computation program let us mention that if wetake expression (22) as the probability of and hence ofall other cladograms with its shape and we assign to allother cladograms in T8 the probabilities computed withProposition 2 which agree on them with the values given by(19) (they are also provided in the aforementioned documentProblsAlphapdf) these probabilities do not add up 1

Combining Proposition 2 and (4) we obtain the followingresult

Corollary 4 For every 119879lowast isin Tlowast119899 with 119896 symmetric branchpoints

119875(lowast)120572119899 (119879lowast) = 2119899minus119896minus1Γ120572 (119899) prod(119886119887)isinNS(119879lowast)

120593120572 (119886 119887) (25)

This formula does not agree either with the one given in[4 Proposition 29] the difference lies again in the same factor

of 2 to the power of the number of internal nodes that are notsymmetric branch points but whose children have the samenumber of descendant leaves

The family of densitymappings (119875120572119899)119899 satisfies the follow-ing Markov branching recurrence

Corollary 5 For every 0 lt 119898 lt 119899 and for every 119879119898 isin T119898and 119879119899minus119898 isin T119899minus119898119875120572119899 (119879119898 ⋆ 119879119899minus119898)

= 2119902120572 (119898 119899 minus 119898)( 119899119898 ) 119875120572119898 (119879119898) 119875120572119899minus119898 (119879119899minus119898) (26)

Proof If 119879119898 isin T119898 and 119879119899minus119898 isin T119899minus119898 then119875120572119898 (119879119898) = 2119898minus1119898Γ120572 (119898) prod

(119886119887)isinNS(119879119898)120593120572 (119886 119887)

119875120572119899minus119898 (119879119899minus119898) = 2119899minus119898minus1(119899 minus 119898)Γ120572 (119899 minus 119898)sdot prod(119886119887)isinNS(119879119899minus119898)

120593120572 (119886 119887) 119875120572119899 (119879119898 ⋆ 119879119899minus119898) = 2119899minus1119899Γ120572 (119899) prod

(119886119887)isinNS(119879119898⋆119879119899minus119898)120593120572 (119886 119887)

= 2119899minus1119899Γ120572 (119899)120593120572 (119898 119899 minus 119898)( prod(119886119887)isinNS(119879119898)

120593120572 (119886 119887))sdot ( prod(119886119887)isinNS(119879119899minus119898)

120593120572 (119886 119887)) = 2119899minus1119899Γ120572 (119899)120593120572 (119898 119899 minus 119898)sdot 119898Γ120572 (119898)2119898minus1 119875120572119898 (119879119898) sdot (119899 minus 119898)Γ120572 (119899 minus 119898)2119899minus119898minus1sdot 119875120572119899minus119898 (119879119899minus119898) = 2119902120572 (119898 119899 minus 119898)( 119899119898 ) 119875120572119898 (119879119898)sdot 119875120572119899minus119898 (119879119899minus119898)

(27)

as we claimed

6 The Scientific World Journal

Figure 6 The tree shapes inTlowast6 mentioned in Remark 6

Remark 6 Against what is stated in [4] (119875(lowast)120572119899 )119899 does notsatisfy any Markov branching recurrence that is there doesnot exist any symmetric mapping 119876 Z+ times Z+ rarr R suchthat for every 119896 119897 ⩾ 1 and for every 119879119896 isin Tlowast119896 and 119879119897 isin Tlowast119897

119875(lowast)120572119896+119897

(119879119896 ⋆ 119879119897) = 119876 (119896 119897) sdot 119875(lowast)120572119896 (119879119896) sdot 119875(lowast)120572119897 (119879119897) (28)

Indeed let 119879lowast119898 lowast119898 isin Tlowast119898 be any two different tree shapesboth with 119898 leaves and 119896 symmetric branch points forinstance the tree shapes inTlowast6 depicted in Figure 6 Then

119875(lowast)120572119898 (119879lowast119898) = 2119898minus119896minus1Γ120572 (119898) prod(119886119887)isinNS(119879lowast

119898)

120593120572 (119886 119887) 119875(lowast)120572119898 (lowast119898) = 2119898minus119896minus1Γ120572 (119898) prod

(119886119887)isinNS(lowast119898)

120593120572 (119886 119887) (29)

In this case 119879lowast119898 ⋆ 119879lowast119898 isin Tlowast2119898 has 2119896 + 1 symmetric branchpoints and therefore

119875(lowast)1205722119898 (119879lowast119898 ⋆ 119879lowast119898) = 22119898minus2119896minus2Γ120572 (2119898) prod(119886119887)isinNS(119879lowast

119898⋆119879lowast119898)

120593120572 (119886 119887)

= 22119898minus2119896minus2Γ120572 (2119898) 120593120572 (119898119898)( prod(119886119887)isinNS(119879lowast

119898)

120593120572 (119886 119887))2

= 22119898minus2119896minus2Γ120572 (2119898) 120593120572 (119898119898) ( Γ120572 (119898)2119898minus119896minus1119875(lowast)120572119898 (119879lowast119898))2

= 119902120572 (119898119898) 119875(lowast)120572119898 (119879lowast119898) 119875(lowast)120572119898 (119879lowast119898)

(30)

while 119879lowast119898 ⋆ lowast119898 isin Tlowast2119898 has 2119896 symmetric branch points andtherefore

119875(lowast)1205722119898 (119879lowast119898 ⋆ lowast119898) = 22119898minus2119896minus1Γ120572 (2119898) prod(119886119887)isinNS(119879lowast

119898⋆lowast119898)

120593120572 (119886 119887)

= 22119898minus2119896minus1Γ120572 (2119898) 120593120572 (119898119898)( prod(119886119887)isinNS(119879lowast

119898)

120593120572 (119886 119887))

sdot ( prod(119886119887)isinNS(lowast

119898)

120593120572 (119886 119887)) = 22119898minus2119896minus1Γ120572 (2119898) 120593120572 (119898119898)sdot Γ120572 (119898)2119898minus119896minus1119875(lowast)120572119898 (119879lowast119898) sdot Γ120572 (119898)2119898minus119896minus1119875(lowast)120572119898 (lowast119898)= 2119902120572 (119898119898) 119875(lowast)120572119898 (119879lowast119898) 119875(lowast)120572119898 (lowast119898)

(31)

and 119902120572(119898119898) = 2119902120572(119898119898) This shows that there does notexist any well-defined single real number 119876(119898119898) such that

119875(lowast)1205722119898 (119879lowast1119898 ⋆ 119879lowast2119898) = 119876 (119898119898) sdot 119875(lowast)120572119898 (119879lowast1119898)sdot 119875(lowast)120572119898 (119879lowast2119898) (32)

for every 119879lowast1119898 119879lowast2119898 isin Tlowast119898Data Availability

The data used to support the findings of this study areavailable at the Github page that accompanies this paper

Conflicts of Interest

The authors declare that there are no conflicts of interestregarding the publication of this paper

Acknowledgments

This researchwas supported by SpanishMinistry of Economyand Competitiveness and European Regional DevelopmentFund Project DPI2015-67082-P (MINECOFEDER) Theauthors thank G Cardona and G Riera for several commentson the SageMath module that accompanies this paper

References

[1] G U Yule ldquoA Mathematical Theory of Evolution Based on theConclusions of Dr J C Willis FRSrdquo Philosophical Transac-tions of the Royal Society B Biological Sciences vol 213 no 402-410 pp 21ndash87 1925

[2] D Aldous ldquoProbability distributions on cladogramsrdquo in Ran-dom discrete structures D Aldous and R Pemantle Eds vol76 pp 1ndash18 Springer New York NY USA 1996

[3] M G B Blum and O Francois ldquoWhich random processesdescribe the tree of life A large-scale study of phylogenetic treeimbalancerdquo Systematic Biology vol 55 no 4 pp 685ndash691 2006

[4] D J Ford ldquoProbabilities on cladograms Introduction to thealpha modelrdquo PhD Thesis (Stanford University) httpsarxivorgabsmath0511246

[5] S Keller-Schmidt M Tugrul V M Eguıluz E Hernandez-Garcıa and K Klemm ldquoAnomalous scaling in an age-dependent branching modelrdquo Physical Review E StatisticalNonlinear and Soft Matter Physics vol 91 no 2 Article ID022803 2015

[6] M Kirkpatrick and M Slatkin ldquoSearching for EvolutionaryPatterns in the Shape of a Phylogenetic Treerdquo Evolution vol 47no 4 pp 1171ndash1181 1993

[7] L Popovic andM Rivas ldquoTopology and inference for Yule treeswith multiple statesrdquo Journal of Mathematical Biology vol 73no 5 pp 1251ndash1291 2016

[8] R Sainudiin and A Veber ldquoA Beta-splitting model for evolu-tionary treesrdquo Royal Society Open Science vol 3 no 5 ArticleID 160016 2016

[9] A O Mooers and S B Heard ldquoInferring evolutionary processfrom phylogenetic tree shaperdquoThe Quarterly Review of Biologyvol 72 no 1 pp 31ndash54 1997

The Scientific World Journal 7

[10] J Felsenstein Inferring Phylogenies Sinauer Associates Inc2004

[11] T M Coronado A Mir F Rossello and G Valiente ldquoA balanceindex for phylogenetic trees based on quartetsrdquo httpsarxivorgabs180105411

[12] SageMath the SageMathematics Software System (Version 76)The Sage Developers (2017) httpwwwsagemathorg

[13] L L Cavalli-Sforza and AW Edwards ldquoPhylogenetic AnalysisModels and Estimation Proceduresrdquo Evolution vol 21 no 3 pp550ndash570 1967

[14] E FHarding ldquoTheprobabilities of rooted tree-shapes generatedby random bifurcationrdquo Advances in Applied Probability vol 3pp 44ndash77 1971

Hindawiwwwhindawicom

International Journal of

Volume 2018

Zoology

Hindawiwwwhindawicom Volume 2018

Anatomy Research International

PeptidesInternational Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Journal of Parasitology Research

GenomicsInternational Journal of

Hindawiwwwhindawicom Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Hindawiwwwhindawicom Volume 2018

BioinformaticsAdvances in

Marine BiologyJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Neuroscience Journal

Hindawiwwwhindawicom Volume 2018

BioMed Research International

Cell BiologyInternational Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Biochemistry Research International

ArchaeaHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Genetics Research International

Hindawiwwwhindawicom Volume 2018

Advances in

Virolog y Stem Cells International

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Enzyme Research

Hindawiwwwhindawicom Volume 2018

International Journal of

MicrobiologyHindawiwwwhindawicom

Nucleic AcidsJournal of

Volume 2018

Submit your manuscripts atwwwhindawicom

Page 2: The Probabilities of Trees and Cladograms under Ford’sdownloads.hindawi.com/journals/tswj/2018/1916094.pdf · symbolic computation program, let us mention that if we take expression

2 The Scientific World Journal

1 2 3 4

1 2 3 4

n

lowast

lowastn

olowast

lowastn

o

n

Figure 1 An example of images under the forgetful mappings between (ordered and unordered) cladograms and tree shapes In the orderedobjects the ordering is represented by the nodesrsquo colors gray ≺ white

nodes 119906 such that (V 119906) is an arc in 119879 and they form the setchild(V) A node 119909 is a descendant of a node V when thereexists a directed path from V to 119909 in 119879 For every node V thesubtree119879V of 119879 rooted at V is the subgraph of119879 induced on theset of descendants of V

A tree 119879 is ordered when it is endowed with an ordering≺V on every set child(V) A cladogram (resp an orderedcladogram) on a set of taxa Σ is a tree (resp an ordered tree)with its leaves bijectively labeled in Σ Whenever we want tostress the fact that a tree is not a cladogram that is it is anunlabeled tree we shall use the term tree shape

It is important to point out that although ordered treeshave no practical interest from the phylogenetic point of viewbecause the orderings on the sets of children of internal nodesdo not carry any phylogenetic information they are usefulfrom the mathematical point of view because the existenceof the orderings allows one to easily prove certain extraproperties that can later be translated to the unordered setting(cf Proposition 1)

An isomorphism of ordered trees is an isomorphism ofrooted trees that moreover preserves these orderings Anisomorphism of cladograms (resp of ordered cladograms)is an isomorphism of trees (resp of ordered trees) thatpreserves the leavesrsquo labels We shall always identify a treeshape an ordered tree shape a cladogram or an orderedcladogram with its isomorphism class and in particular weshall make henceforth the abuse of language of saying thattwo of these objects 119879 1198791015840 are the same in symbols 119879 = 1198791015840when they are (only) isomorphic We shall denote byTlowast119899 andOTlowast119899 respectively the sets of tree shapes and of ordered treeshapes with 119899 leaves Given any finite set of taxa Σ we shalldenote by TΣ and OTΣ respectively the sets of cladogramsand of ordered cladograms on Σ When the specific set Σ isunrelevant and only its cardinal matters we shall write T119899and OT119899 (with 119899 = |Σ|) instead of TΣ and OTΣ and thenwe shall understand that Σ is [119899] = 1 2 119899

There exist natural isomorphism-preserving forgetfulmappings

n

o

n

lowastn

olowast

lowastn

lowast

that ldquoforgetrdquo the orderings or the labels of the trees Inparticular we shall call the image of a cladogram under 120587its shape Figure 1 depicts an example of images under theseforgetful mappings

Let us introduce some more notations For every nodeV in a tree 119879 120581119879(V) is its number of descendant leaves Forevery internal node V in an ordered tree 119879 with childrenV1≺VV2 its numerical split is the ordered pair NS119879(V) =(120581119879(V1) 120581119879(V2)) If instead 119879 is unordered and if child(V) =V1 V2 with 120581119879(V1) ⩽ 120581119879(V2) then NS119879(V) = (120581119879(V1) 120581119879(V2))In both cases themultiset of numerical splits of 119879 is NS(119879) =NS119879(V) | V isin 119881int(119879) For instance if 119879 is the cladogramdepicted in Figure 2 then

NS (119879) = (1 1) (1 1) (1 1) (2 2) (1 4) (2 5) (1)

A symmetric branch point in a tree 119879 is an internal nodeV such that if V1 and V2 are its children then the subtrees 119879V1and119879V2 of119879 rooted at themhave the same shape For instancethe symmetric branch points in the cladogram depicted inFigure 2 are those filled in black

Given two cladograms 119879 and1198791015840 on Σ and Σ1015840 respectivelywith Σ cap Σ1015840 = 0 their root join is the cladogram 119879 ⋆ 1198791015840 onΣcupΣ1015840 obtained by connecting the roots of119879 and1198791015840 to a (new)common root 119903 see Figure 3 If 119879 1198791015840 are ordered cladograms119879⋆1198791015840 is ordered by inheriting the orderings on 119879 and 1198791015840 andordering the children of the new root 119903 as 119903119879≺1199031199031198791015840 If 119879 and1198791015840 are tree shapes a similar construction yields a tree shape

The Scientific World Journal 3

119879⋆1198791015840 if they are moreover ordered then 119879⋆1198791015840 becomes anordered tree shape as explained above

22The 120572-Model Fordrsquos 120572-model [4] defines for every 119899 ⩾ 1a family of probability density functions 119875(lowast)120572119899 on Tlowast119899 thatdepends on one parameter 120572 isin [0 1] and then it translatesthis family into three other families of probability densityfunctions 119875120572119899 on T119899 119875(119900lowast)120572119899 on OTlowast119899 and 119875(119900)120572119899 on OT119899by imposing that the probability of a tree shape is equallydistributed among its preimages under 120587 120587119900lowast and 120587 ∘ 120587119900 =120587119900lowast ∘ 120587lowast respectively

It is well known [13] that every119879 isin T119899 can be obtained ina unique way by adding recurrently to a single node labeled 1

new leaves labeled 2 119899 to arcs (ie splitting an arc (119906 V)into two arcs (119906 119908) and (119908 V) and then adding a new arc fromthe inserted node119908 to a new leaf) or to a new root (ie addinga new root 119908 and new arcs from 119908 to the old root and to anew leaf) The value of 119875(lowast)120572119899 (119879lowast) for 119879lowast isin Tlowast119899 is determinedthrough all possible ways of constructing cladograms withshape 119879lowast in this way More specifically

(1) if 1198791 and 1198792 denote respectively the only cladogramsinT1 andT2 let 11987510158401205721(1198791) = 11987510158401205722(1198792) = 1

(2) for every 119898 = 3 119899 let 119879119898 isin T119898 be obtained byadding a new leaf labeled119898 to 119879119898minus1 Then

1198751015840120572119898 (119879119898) = 120572119898 minus 1 minus 120572 sdot 1198751015840120572119898minus1 (119879119898minus1) if the new leaf is added to an internal arc or to a new root1 minus 120572119898 minus 1 minus 120572 sdot 1198751015840120572119898minus1 (119879119898minus1) if the new leaf is added to a pendant arc

(2)

(3) When the desired number 119899 of leaves is reached theprobability of every tree shape 119879lowast119899 isin Tlowast119899 is defined as

119875(lowast)120572119899 (119879lowast119899 ) = sum120587(119879119899)=119879

lowast

119899

1198751015840120572119899 (119879119899) (3)

For instance Figure 4 shows the construction of twocladograms in T5 with the same shape and how theirprobability 11987510158401205725 is built using the recursion in Step (2)If we generate all cladograms in T5 with this shape wecompute their probabilities 11987510158401205725 and then we add up all theseprobabilities we obtain the probability 119875(lowast)1205725 of this shapewhich turns out to be 2(1 minus 120572)(4 minus 120572) cf [4 Figure 23]

Once119875(lowast)120572119899 is defined onTlowast119899 it is transported toT119899OTlowast119899 and OT119899 by defining the probability of an object in one ofthese sets as the probability of its image inTlowast119899 divided by thenumber of preimages of this image

(i) For every 119879 isin T119899 if 120587(119879) = 119879lowast isin Tlowast119899 and it has 119896symmetric branch points then

119875120572119899 (119879) = 2119896119899 sdot 119875(lowast)120572119899 (119879lowast) (4)

because |120587minus1(119879lowast)| = 1198992119896 (see eg [4 Lemma 31])(ii) For every 119879119900 isin OT119899 if 120587119900(119879119900) = 119879 isin T119899 then

119875(119900)120572119899 (119879119900) = 12119899minus1 sdot 119875120572119899 (119879) (5)

because |120587minus1119900 (119879)| = 2119899minus1 (119879 has 2119899minus1 differentpreimages under 120587119900 obtained by taking all possibledifferent combinations of orderings on the 119899 minus 1 setschild(V) V isin 119881int(119879lowast))

(iii) For every 119879lowast119900 isin OTlowast119899 if 120587119900lowast(119879lowast119900 ) = 119879lowast isin Tlowast119899 and ithas 119896 symmetric branch points then

119875(119900lowast)120572119899 (119879lowast119900 ) = 12119899minus119896minus1 sdot 119875(lowast)120572119899 (119879lowast) (6)

because |120587minus1119900lowast(119879lowast)| = 2119899minus1minus119896 (from the 2119899minus1 possiblepreimages of 119879lowast under 120587119900lowast defined by all possibledifferent combinations of orderings on the 119899 minus 1 setschild(V) V isin 119881int(119879lowast) those differing only on theorderings on the children of the 119896 symmetric branchpoints are actually the same ordered tree shape)

The family (119875(119900lowast)120572119899 )119899 of probabilities of ordered tree shapessatisfies the usefulMarkov branching recurrence (in the senseof [2 sect4]) given by the following proposition In it and in thesequel let for every 119886 119887 isin Z+

119902120572 (119886 119887) = Γ120572 (119886) Γ120572 (119887)Γ120572 (119886 + 119887) sdot 120593120572 (119886 119887) (7)

where

120593120572 (119886 119887) = 1205722 (119886 + 119887119886 ) + (1 minus 2120572)(119886 + 119887 minus 2119886 minus 1 ) (8)

and Γ120572 Z+ rarr R is the mapping defined by Γ120572(1) = 1 andfor every 119899 ⩾ 2 Γ120572(119899) = (119899 minus 1 minus 120572) sdot Γ120572(119899 minus 1)Proposition 1 For every 0 lt 119898 lt 119899 and for every 119879lowast119898 isin OTlowast119898and 119879lowast119899minus119898 isin OTlowast119899minus119898

119875(119900lowast)120572119899 (119879lowast119898 ⋆ 119879lowast119899minus119898)= 119902120572 (119898 119899 minus 119898)119875(119900lowast)120572119898 (119879lowast119898) 119875(119900lowast)120572119899minus119898 (119879lowast119899minus119898) (9)

This recurrence together with the fact that 119875(119900lowast)1205721 of asingle node is 1 implies that for every 119879lowast119900 isin OTlowast119899

119875(119900lowast)120572119899 (119879lowast119900 ) = prod(119886119887)isinNS(119879lowast

119900)

119902120572 (119886 119887) (10)

For proofs of Proposition 1 and (10) see Lemma 27 andProposition 28 in [4] respectively

4 The Scientific World Journal

1 2 3 4 5 6 7

Figure 2 A cladogram in T7 The black nodes are its symmetricbranch points

T

r

T

Figure 3 The root join 119879 ⋆ 11987910158403 Main Results

Our first result is an explicit formula for 119875120572119899(119879) for every119899 ⩾ 1 and 119879 isin T119899Proposition 2 For every 119879 isin T119899 its probability under the120572-model is

119875120572119899 (119879) = 2119899minus1119899 sdot Γ120572 (119899) prod(119886119887)isinNS(119879)

120593120572 (119886 119887) (11)

Proof Given 119879 isin T119899 let 119879119900 be any ordered cladogram suchthat 120587119900(119879119900) = 119879 and let 119879lowast119900 = 120587lowast(119879119900) isin OTlowast119899 and 119879lowast =120587(119879) = 120587119900lowast(119879lowast119900 ) If 119879lowast has 119896 symmetric branch points thenby (4) (6) and (10)

119875120572119899 (119879) = 2119896119899 sdot 119875(lowast)120572119899 (119879lowast) = 2119896

119899 sdot 2119899minus119896minus1 sdot 119875(119900lowast)120572119899 (119879lowast119900 )= 2119899minus1119899 prod

(119886119887)isinNS(119879lowast119900)

119902120572 (119886 119887) (12)

Now on the one hand it is easy to check that

NS (119879)= (min 119886 119887 max 119886 119887) | (119886 119887) isin NS (119879lowast0 ) (13)

and therefore since 119902120572 is symmetric

119875120572119899 (119879) = 2119899minus1119899 prod(119886119887)isinNS(119879)

119902120572 (119886 119887) (14)

It remains to simplify this product If for every V isin 119881int(119879)we denote its children by V1 and V2 then

prod(119886119887)isinNS(119879)

119902120572 (119886 119887)= prod

Visin119881int(119879)

Γ120572 (120581119879 (V1)) Γ120572 (120581119879 (V2))Γ120572 (120581119879 (V)) 120593120572 (NS (V)) (15)

For every V isin 119881int(119879) 119903119879 the term Γ120572(120581119879(V)) appearstwice in this product in the denominator of the factorcorresponding to V itself and in the numerator of the factor

corresponding to its parent Therefore all terms Γ120572(120581119879(V)) inthis product vanish except Γ120572(120581119879(119903119879)) = Γ120572(119899) (that appears inthe denominator of its factor) and every Γ120572(120581119879(V)) = Γ120572(1) =1 with V a leaf Thus

119875120572119899 (119879) = 2119899minus1119899 sdot 1Γ120572 (119899) sdot prodVisin119881int(119879)120593120572 (NS (V)) (16)

as we claimed

Remark 3 Ford states (see [4 Proposition 32 and page 30])that if 119879 isin T119899 then

119875120572119899 (119879) = 2119896119899 prod(119886119887)isinNS(119879)

119902120572 (119886 119887) (17)

where 119896 is the number of symmetric branching points in 119879and

119902120572 (119886 119887) = 2119902120572 (119886 119887) if 119886 = 119887119902120572 (119886 119887) if 119886 = 119887 (18)

If we simplifyprod(119886119887)isinNS(119879)119902120572(119886 119887) as in the proof of Proposi-tion 2 this formula for 119875120572119899(119879) becomes

119875120572119899 (119879) = 2119896+119898119899 sdot Γ120572 (119899) sdot prod(119886119887)isinNS(119879)

120593120572 (119886 119887) (19)

where119898 is the number of internal nodes whose children havedifferent numbers of descendant leaves This formula doesnot agree with the one given in Proposition 2 above because

119896 + 119898 = 119899 minus 1 minus 10038161003816100381610038161003816V isin 119881int (119879) | child (V)= V1 V2 120581119879 (V1) = 120581119879 (V2) but 120587 (119879V1)= 120587 (119879V2)10038161003816100381610038161003816

(20)

and hence it may happen that 119896 + 119898 lt 119899 minus 1 Thefirst example of a cladogram with this property (and theonly one up to relabeling with at most 8 leaves) is thecladogram isin T8 depicted in Figure 5 For it our formulagives (see (822) in the document ProblsAlphapdf inhttpsgithubcombiocom-uibprob-alpha)

1198751205728 () = (1 minus 120572)2 (2 minus 120572)126 (7 minus 120572) (6 minus 120572) (5 minus 120572) (3 minus 120572) (21)

while expression (19) assigns to a probability of half thisvalue

(1 minus 120572)2 (2 minus 120572)252 (7 minus 120572) (6 minus 120572) (5 minus 120572) (3 minus 120572) (22)

This last value cannot be right for several reasons Firstly by[4 sect312] when 120572 = 12 Fordrsquos model is equivalent to theuniform model where every cladogram in T119899 has the sameprobability

110038161003816100381610038161198611198791198991003816100381610038161003816 = 1(2119899 minus 3) (23)

The Scientific World Journal 5

1 2 1 2 3 1 2 34 1 2 345

1 2 1 3 2 1 3 24 1 5 243

P2 (T2) = 1

P2 (T2) = 1

P3 (T3) =

2 minus P4 (T4) =

1 minus

3 minus middot

2 minus P5 (T5) =

4 minus middot1 minus

3 minus middot

2 minus

P3 (T

3) =

1 minus

2 minus P4 (T

4) =

1 minus

3 minus middot1 minus

2 minus P5 (T

5) =

1 minus

4 minus middot1 minus

3 minus middot1 minus

2 minus

Figure 4 Two examples of computations of the probability 1198751015840120572119899 of a cladogram through its construction in Step (2) of the definition of the120572-model

1 2 3 4 5 6 7 8

Figure 5 The cladogram isin T8 used in Remark 3

and when 120572 = 0 Fordrsquos model gives rise to the Yule model[1 14] where the probability of every 119879 isin T119899 is

119875119884 (119879) = 2119899minus1119899 prodVisin119881int(119879)

1120581119879 (V) minus 1 (24)

In particular 119875128() should be equal to 1135135 and11987508() should be equal to 119845 Both values are consistentwith our formula while expression (22) yields half thesevalues

As a second reason which can be checked using asymbolic computation program let us mention that if wetake expression (22) as the probability of and hence ofall other cladograms with its shape and we assign to allother cladograms in T8 the probabilities computed withProposition 2 which agree on them with the values given by(19) (they are also provided in the aforementioned documentProblsAlphapdf) these probabilities do not add up 1

Combining Proposition 2 and (4) we obtain the followingresult

Corollary 4 For every 119879lowast isin Tlowast119899 with 119896 symmetric branchpoints

119875(lowast)120572119899 (119879lowast) = 2119899minus119896minus1Γ120572 (119899) prod(119886119887)isinNS(119879lowast)

120593120572 (119886 119887) (25)

This formula does not agree either with the one given in[4 Proposition 29] the difference lies again in the same factor

of 2 to the power of the number of internal nodes that are notsymmetric branch points but whose children have the samenumber of descendant leaves

The family of densitymappings (119875120572119899)119899 satisfies the follow-ing Markov branching recurrence

Corollary 5 For every 0 lt 119898 lt 119899 and for every 119879119898 isin T119898and 119879119899minus119898 isin T119899minus119898119875120572119899 (119879119898 ⋆ 119879119899minus119898)

= 2119902120572 (119898 119899 minus 119898)( 119899119898 ) 119875120572119898 (119879119898) 119875120572119899minus119898 (119879119899minus119898) (26)

Proof If 119879119898 isin T119898 and 119879119899minus119898 isin T119899minus119898 then119875120572119898 (119879119898) = 2119898minus1119898Γ120572 (119898) prod

(119886119887)isinNS(119879119898)120593120572 (119886 119887)

119875120572119899minus119898 (119879119899minus119898) = 2119899minus119898minus1(119899 minus 119898)Γ120572 (119899 minus 119898)sdot prod(119886119887)isinNS(119879119899minus119898)

120593120572 (119886 119887) 119875120572119899 (119879119898 ⋆ 119879119899minus119898) = 2119899minus1119899Γ120572 (119899) prod

(119886119887)isinNS(119879119898⋆119879119899minus119898)120593120572 (119886 119887)

= 2119899minus1119899Γ120572 (119899)120593120572 (119898 119899 minus 119898)( prod(119886119887)isinNS(119879119898)

120593120572 (119886 119887))sdot ( prod(119886119887)isinNS(119879119899minus119898)

120593120572 (119886 119887)) = 2119899minus1119899Γ120572 (119899)120593120572 (119898 119899 minus 119898)sdot 119898Γ120572 (119898)2119898minus1 119875120572119898 (119879119898) sdot (119899 minus 119898)Γ120572 (119899 minus 119898)2119899minus119898minus1sdot 119875120572119899minus119898 (119879119899minus119898) = 2119902120572 (119898 119899 minus 119898)( 119899119898 ) 119875120572119898 (119879119898)sdot 119875120572119899minus119898 (119879119899minus119898)

(27)

as we claimed

6 The Scientific World Journal

Figure 6 The tree shapes inTlowast6 mentioned in Remark 6

Remark 6 Against what is stated in [4] (119875(lowast)120572119899 )119899 does notsatisfy any Markov branching recurrence that is there doesnot exist any symmetric mapping 119876 Z+ times Z+ rarr R suchthat for every 119896 119897 ⩾ 1 and for every 119879119896 isin Tlowast119896 and 119879119897 isin Tlowast119897

119875(lowast)120572119896+119897

(119879119896 ⋆ 119879119897) = 119876 (119896 119897) sdot 119875(lowast)120572119896 (119879119896) sdot 119875(lowast)120572119897 (119879119897) (28)

Indeed let 119879lowast119898 lowast119898 isin Tlowast119898 be any two different tree shapesboth with 119898 leaves and 119896 symmetric branch points forinstance the tree shapes inTlowast6 depicted in Figure 6 Then

119875(lowast)120572119898 (119879lowast119898) = 2119898minus119896minus1Γ120572 (119898) prod(119886119887)isinNS(119879lowast

119898)

120593120572 (119886 119887) 119875(lowast)120572119898 (lowast119898) = 2119898minus119896minus1Γ120572 (119898) prod

(119886119887)isinNS(lowast119898)

120593120572 (119886 119887) (29)

In this case 119879lowast119898 ⋆ 119879lowast119898 isin Tlowast2119898 has 2119896 + 1 symmetric branchpoints and therefore

119875(lowast)1205722119898 (119879lowast119898 ⋆ 119879lowast119898) = 22119898minus2119896minus2Γ120572 (2119898) prod(119886119887)isinNS(119879lowast

119898⋆119879lowast119898)

120593120572 (119886 119887)

= 22119898minus2119896minus2Γ120572 (2119898) 120593120572 (119898119898)( prod(119886119887)isinNS(119879lowast

119898)

120593120572 (119886 119887))2

= 22119898minus2119896minus2Γ120572 (2119898) 120593120572 (119898119898) ( Γ120572 (119898)2119898minus119896minus1119875(lowast)120572119898 (119879lowast119898))2

= 119902120572 (119898119898) 119875(lowast)120572119898 (119879lowast119898) 119875(lowast)120572119898 (119879lowast119898)

(30)

while 119879lowast119898 ⋆ lowast119898 isin Tlowast2119898 has 2119896 symmetric branch points andtherefore

119875(lowast)1205722119898 (119879lowast119898 ⋆ lowast119898) = 22119898minus2119896minus1Γ120572 (2119898) prod(119886119887)isinNS(119879lowast

119898⋆lowast119898)

120593120572 (119886 119887)

= 22119898minus2119896minus1Γ120572 (2119898) 120593120572 (119898119898)( prod(119886119887)isinNS(119879lowast

119898)

120593120572 (119886 119887))

sdot ( prod(119886119887)isinNS(lowast

119898)

120593120572 (119886 119887)) = 22119898minus2119896minus1Γ120572 (2119898) 120593120572 (119898119898)sdot Γ120572 (119898)2119898minus119896minus1119875(lowast)120572119898 (119879lowast119898) sdot Γ120572 (119898)2119898minus119896minus1119875(lowast)120572119898 (lowast119898)= 2119902120572 (119898119898) 119875(lowast)120572119898 (119879lowast119898) 119875(lowast)120572119898 (lowast119898)

(31)

and 119902120572(119898119898) = 2119902120572(119898119898) This shows that there does notexist any well-defined single real number 119876(119898119898) such that

119875(lowast)1205722119898 (119879lowast1119898 ⋆ 119879lowast2119898) = 119876 (119898119898) sdot 119875(lowast)120572119898 (119879lowast1119898)sdot 119875(lowast)120572119898 (119879lowast2119898) (32)

for every 119879lowast1119898 119879lowast2119898 isin Tlowast119898Data Availability

The data used to support the findings of this study areavailable at the Github page that accompanies this paper

Conflicts of Interest

The authors declare that there are no conflicts of interestregarding the publication of this paper

Acknowledgments

This researchwas supported by SpanishMinistry of Economyand Competitiveness and European Regional DevelopmentFund Project DPI2015-67082-P (MINECOFEDER) Theauthors thank G Cardona and G Riera for several commentson the SageMath module that accompanies this paper

References

[1] G U Yule ldquoA Mathematical Theory of Evolution Based on theConclusions of Dr J C Willis FRSrdquo Philosophical Transac-tions of the Royal Society B Biological Sciences vol 213 no 402-410 pp 21ndash87 1925

[2] D Aldous ldquoProbability distributions on cladogramsrdquo in Ran-dom discrete structures D Aldous and R Pemantle Eds vol76 pp 1ndash18 Springer New York NY USA 1996

[3] M G B Blum and O Francois ldquoWhich random processesdescribe the tree of life A large-scale study of phylogenetic treeimbalancerdquo Systematic Biology vol 55 no 4 pp 685ndash691 2006

[4] D J Ford ldquoProbabilities on cladograms Introduction to thealpha modelrdquo PhD Thesis (Stanford University) httpsarxivorgabsmath0511246

[5] S Keller-Schmidt M Tugrul V M Eguıluz E Hernandez-Garcıa and K Klemm ldquoAnomalous scaling in an age-dependent branching modelrdquo Physical Review E StatisticalNonlinear and Soft Matter Physics vol 91 no 2 Article ID022803 2015

[6] M Kirkpatrick and M Slatkin ldquoSearching for EvolutionaryPatterns in the Shape of a Phylogenetic Treerdquo Evolution vol 47no 4 pp 1171ndash1181 1993

[7] L Popovic andM Rivas ldquoTopology and inference for Yule treeswith multiple statesrdquo Journal of Mathematical Biology vol 73no 5 pp 1251ndash1291 2016

[8] R Sainudiin and A Veber ldquoA Beta-splitting model for evolu-tionary treesrdquo Royal Society Open Science vol 3 no 5 ArticleID 160016 2016

[9] A O Mooers and S B Heard ldquoInferring evolutionary processfrom phylogenetic tree shaperdquoThe Quarterly Review of Biologyvol 72 no 1 pp 31ndash54 1997

The Scientific World Journal 7

[10] J Felsenstein Inferring Phylogenies Sinauer Associates Inc2004

[11] T M Coronado A Mir F Rossello and G Valiente ldquoA balanceindex for phylogenetic trees based on quartetsrdquo httpsarxivorgabs180105411

[12] SageMath the SageMathematics Software System (Version 76)The Sage Developers (2017) httpwwwsagemathorg

[13] L L Cavalli-Sforza and AW Edwards ldquoPhylogenetic AnalysisModels and Estimation Proceduresrdquo Evolution vol 21 no 3 pp550ndash570 1967

[14] E FHarding ldquoTheprobabilities of rooted tree-shapes generatedby random bifurcationrdquo Advances in Applied Probability vol 3pp 44ndash77 1971

Hindawiwwwhindawicom

International Journal of

Volume 2018

Zoology

Hindawiwwwhindawicom Volume 2018

Anatomy Research International

PeptidesInternational Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Journal of Parasitology Research

GenomicsInternational Journal of

Hindawiwwwhindawicom Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Hindawiwwwhindawicom Volume 2018

BioinformaticsAdvances in

Marine BiologyJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Neuroscience Journal

Hindawiwwwhindawicom Volume 2018

BioMed Research International

Cell BiologyInternational Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Biochemistry Research International

ArchaeaHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Genetics Research International

Hindawiwwwhindawicom Volume 2018

Advances in

Virolog y Stem Cells International

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Enzyme Research

Hindawiwwwhindawicom Volume 2018

International Journal of

MicrobiologyHindawiwwwhindawicom

Nucleic AcidsJournal of

Volume 2018

Submit your manuscripts atwwwhindawicom

Page 3: The Probabilities of Trees and Cladograms under Ford’sdownloads.hindawi.com/journals/tswj/2018/1916094.pdf · symbolic computation program, let us mention that if we take expression

The Scientific World Journal 3

119879⋆1198791015840 if they are moreover ordered then 119879⋆1198791015840 becomes anordered tree shape as explained above

22The 120572-Model Fordrsquos 120572-model [4] defines for every 119899 ⩾ 1a family of probability density functions 119875(lowast)120572119899 on Tlowast119899 thatdepends on one parameter 120572 isin [0 1] and then it translatesthis family into three other families of probability densityfunctions 119875120572119899 on T119899 119875(119900lowast)120572119899 on OTlowast119899 and 119875(119900)120572119899 on OT119899by imposing that the probability of a tree shape is equallydistributed among its preimages under 120587 120587119900lowast and 120587 ∘ 120587119900 =120587119900lowast ∘ 120587lowast respectively

It is well known [13] that every119879 isin T119899 can be obtained ina unique way by adding recurrently to a single node labeled 1

new leaves labeled 2 119899 to arcs (ie splitting an arc (119906 V)into two arcs (119906 119908) and (119908 V) and then adding a new arc fromthe inserted node119908 to a new leaf) or to a new root (ie addinga new root 119908 and new arcs from 119908 to the old root and to anew leaf) The value of 119875(lowast)120572119899 (119879lowast) for 119879lowast isin Tlowast119899 is determinedthrough all possible ways of constructing cladograms withshape 119879lowast in this way More specifically

(1) if 1198791 and 1198792 denote respectively the only cladogramsinT1 andT2 let 11987510158401205721(1198791) = 11987510158401205722(1198792) = 1

(2) for every 119898 = 3 119899 let 119879119898 isin T119898 be obtained byadding a new leaf labeled119898 to 119879119898minus1 Then

1198751015840120572119898 (119879119898) = 120572119898 minus 1 minus 120572 sdot 1198751015840120572119898minus1 (119879119898minus1) if the new leaf is added to an internal arc or to a new root1 minus 120572119898 minus 1 minus 120572 sdot 1198751015840120572119898minus1 (119879119898minus1) if the new leaf is added to a pendant arc

(2)

(3) When the desired number 119899 of leaves is reached theprobability of every tree shape 119879lowast119899 isin Tlowast119899 is defined as

119875(lowast)120572119899 (119879lowast119899 ) = sum120587(119879119899)=119879

lowast

119899

1198751015840120572119899 (119879119899) (3)

For instance Figure 4 shows the construction of twocladograms in T5 with the same shape and how theirprobability 11987510158401205725 is built using the recursion in Step (2)If we generate all cladograms in T5 with this shape wecompute their probabilities 11987510158401205725 and then we add up all theseprobabilities we obtain the probability 119875(lowast)1205725 of this shapewhich turns out to be 2(1 minus 120572)(4 minus 120572) cf [4 Figure 23]

Once119875(lowast)120572119899 is defined onTlowast119899 it is transported toT119899OTlowast119899 and OT119899 by defining the probability of an object in one ofthese sets as the probability of its image inTlowast119899 divided by thenumber of preimages of this image

(i) For every 119879 isin T119899 if 120587(119879) = 119879lowast isin Tlowast119899 and it has 119896symmetric branch points then

119875120572119899 (119879) = 2119896119899 sdot 119875(lowast)120572119899 (119879lowast) (4)

because |120587minus1(119879lowast)| = 1198992119896 (see eg [4 Lemma 31])(ii) For every 119879119900 isin OT119899 if 120587119900(119879119900) = 119879 isin T119899 then

119875(119900)120572119899 (119879119900) = 12119899minus1 sdot 119875120572119899 (119879) (5)

because |120587minus1119900 (119879)| = 2119899minus1 (119879 has 2119899minus1 differentpreimages under 120587119900 obtained by taking all possibledifferent combinations of orderings on the 119899 minus 1 setschild(V) V isin 119881int(119879lowast))

(iii) For every 119879lowast119900 isin OTlowast119899 if 120587119900lowast(119879lowast119900 ) = 119879lowast isin Tlowast119899 and ithas 119896 symmetric branch points then

119875(119900lowast)120572119899 (119879lowast119900 ) = 12119899minus119896minus1 sdot 119875(lowast)120572119899 (119879lowast) (6)

because |120587minus1119900lowast(119879lowast)| = 2119899minus1minus119896 (from the 2119899minus1 possiblepreimages of 119879lowast under 120587119900lowast defined by all possibledifferent combinations of orderings on the 119899 minus 1 setschild(V) V isin 119881int(119879lowast) those differing only on theorderings on the children of the 119896 symmetric branchpoints are actually the same ordered tree shape)

The family (119875(119900lowast)120572119899 )119899 of probabilities of ordered tree shapessatisfies the usefulMarkov branching recurrence (in the senseof [2 sect4]) given by the following proposition In it and in thesequel let for every 119886 119887 isin Z+

119902120572 (119886 119887) = Γ120572 (119886) Γ120572 (119887)Γ120572 (119886 + 119887) sdot 120593120572 (119886 119887) (7)

where

120593120572 (119886 119887) = 1205722 (119886 + 119887119886 ) + (1 minus 2120572)(119886 + 119887 minus 2119886 minus 1 ) (8)

and Γ120572 Z+ rarr R is the mapping defined by Γ120572(1) = 1 andfor every 119899 ⩾ 2 Γ120572(119899) = (119899 minus 1 minus 120572) sdot Γ120572(119899 minus 1)Proposition 1 For every 0 lt 119898 lt 119899 and for every 119879lowast119898 isin OTlowast119898and 119879lowast119899minus119898 isin OTlowast119899minus119898

119875(119900lowast)120572119899 (119879lowast119898 ⋆ 119879lowast119899minus119898)= 119902120572 (119898 119899 minus 119898)119875(119900lowast)120572119898 (119879lowast119898) 119875(119900lowast)120572119899minus119898 (119879lowast119899minus119898) (9)

This recurrence together with the fact that 119875(119900lowast)1205721 of asingle node is 1 implies that for every 119879lowast119900 isin OTlowast119899

119875(119900lowast)120572119899 (119879lowast119900 ) = prod(119886119887)isinNS(119879lowast

119900)

119902120572 (119886 119887) (10)

For proofs of Proposition 1 and (10) see Lemma 27 andProposition 28 in [4] respectively

4 The Scientific World Journal

1 2 3 4 5 6 7

Figure 2 A cladogram in T7 The black nodes are its symmetricbranch points

T

r

T

Figure 3 The root join 119879 ⋆ 11987910158403 Main Results

Our first result is an explicit formula for 119875120572119899(119879) for every119899 ⩾ 1 and 119879 isin T119899Proposition 2 For every 119879 isin T119899 its probability under the120572-model is

119875120572119899 (119879) = 2119899minus1119899 sdot Γ120572 (119899) prod(119886119887)isinNS(119879)

120593120572 (119886 119887) (11)

Proof Given 119879 isin T119899 let 119879119900 be any ordered cladogram suchthat 120587119900(119879119900) = 119879 and let 119879lowast119900 = 120587lowast(119879119900) isin OTlowast119899 and 119879lowast =120587(119879) = 120587119900lowast(119879lowast119900 ) If 119879lowast has 119896 symmetric branch points thenby (4) (6) and (10)

119875120572119899 (119879) = 2119896119899 sdot 119875(lowast)120572119899 (119879lowast) = 2119896

119899 sdot 2119899minus119896minus1 sdot 119875(119900lowast)120572119899 (119879lowast119900 )= 2119899minus1119899 prod

(119886119887)isinNS(119879lowast119900)

119902120572 (119886 119887) (12)

Now on the one hand it is easy to check that

NS (119879)= (min 119886 119887 max 119886 119887) | (119886 119887) isin NS (119879lowast0 ) (13)

and therefore since 119902120572 is symmetric

119875120572119899 (119879) = 2119899minus1119899 prod(119886119887)isinNS(119879)

119902120572 (119886 119887) (14)

It remains to simplify this product If for every V isin 119881int(119879)we denote its children by V1 and V2 then

prod(119886119887)isinNS(119879)

119902120572 (119886 119887)= prod

Visin119881int(119879)

Γ120572 (120581119879 (V1)) Γ120572 (120581119879 (V2))Γ120572 (120581119879 (V)) 120593120572 (NS (V)) (15)

For every V isin 119881int(119879) 119903119879 the term Γ120572(120581119879(V)) appearstwice in this product in the denominator of the factorcorresponding to V itself and in the numerator of the factor

corresponding to its parent Therefore all terms Γ120572(120581119879(V)) inthis product vanish except Γ120572(120581119879(119903119879)) = Γ120572(119899) (that appears inthe denominator of its factor) and every Γ120572(120581119879(V)) = Γ120572(1) =1 with V a leaf Thus

119875120572119899 (119879) = 2119899minus1119899 sdot 1Γ120572 (119899) sdot prodVisin119881int(119879)120593120572 (NS (V)) (16)

as we claimed

Remark 3 Ford states (see [4 Proposition 32 and page 30])that if 119879 isin T119899 then

119875120572119899 (119879) = 2119896119899 prod(119886119887)isinNS(119879)

119902120572 (119886 119887) (17)

where 119896 is the number of symmetric branching points in 119879and

119902120572 (119886 119887) = 2119902120572 (119886 119887) if 119886 = 119887119902120572 (119886 119887) if 119886 = 119887 (18)

If we simplifyprod(119886119887)isinNS(119879)119902120572(119886 119887) as in the proof of Proposi-tion 2 this formula for 119875120572119899(119879) becomes

119875120572119899 (119879) = 2119896+119898119899 sdot Γ120572 (119899) sdot prod(119886119887)isinNS(119879)

120593120572 (119886 119887) (19)

where119898 is the number of internal nodes whose children havedifferent numbers of descendant leaves This formula doesnot agree with the one given in Proposition 2 above because

119896 + 119898 = 119899 minus 1 minus 10038161003816100381610038161003816V isin 119881int (119879) | child (V)= V1 V2 120581119879 (V1) = 120581119879 (V2) but 120587 (119879V1)= 120587 (119879V2)10038161003816100381610038161003816

(20)

and hence it may happen that 119896 + 119898 lt 119899 minus 1 Thefirst example of a cladogram with this property (and theonly one up to relabeling with at most 8 leaves) is thecladogram isin T8 depicted in Figure 5 For it our formulagives (see (822) in the document ProblsAlphapdf inhttpsgithubcombiocom-uibprob-alpha)

1198751205728 () = (1 minus 120572)2 (2 minus 120572)126 (7 minus 120572) (6 minus 120572) (5 minus 120572) (3 minus 120572) (21)

while expression (19) assigns to a probability of half thisvalue

(1 minus 120572)2 (2 minus 120572)252 (7 minus 120572) (6 minus 120572) (5 minus 120572) (3 minus 120572) (22)

This last value cannot be right for several reasons Firstly by[4 sect312] when 120572 = 12 Fordrsquos model is equivalent to theuniform model where every cladogram in T119899 has the sameprobability

110038161003816100381610038161198611198791198991003816100381610038161003816 = 1(2119899 minus 3) (23)

The Scientific World Journal 5

1 2 1 2 3 1 2 34 1 2 345

1 2 1 3 2 1 3 24 1 5 243

P2 (T2) = 1

P2 (T2) = 1

P3 (T3) =

2 minus P4 (T4) =

1 minus

3 minus middot

2 minus P5 (T5) =

4 minus middot1 minus

3 minus middot

2 minus

P3 (T

3) =

1 minus

2 minus P4 (T

4) =

1 minus

3 minus middot1 minus

2 minus P5 (T

5) =

1 minus

4 minus middot1 minus

3 minus middot1 minus

2 minus

Figure 4 Two examples of computations of the probability 1198751015840120572119899 of a cladogram through its construction in Step (2) of the definition of the120572-model

1 2 3 4 5 6 7 8

Figure 5 The cladogram isin T8 used in Remark 3

and when 120572 = 0 Fordrsquos model gives rise to the Yule model[1 14] where the probability of every 119879 isin T119899 is

119875119884 (119879) = 2119899minus1119899 prodVisin119881int(119879)

1120581119879 (V) minus 1 (24)

In particular 119875128() should be equal to 1135135 and11987508() should be equal to 119845 Both values are consistentwith our formula while expression (22) yields half thesevalues

As a second reason which can be checked using asymbolic computation program let us mention that if wetake expression (22) as the probability of and hence ofall other cladograms with its shape and we assign to allother cladograms in T8 the probabilities computed withProposition 2 which agree on them with the values given by(19) (they are also provided in the aforementioned documentProblsAlphapdf) these probabilities do not add up 1

Combining Proposition 2 and (4) we obtain the followingresult

Corollary 4 For every 119879lowast isin Tlowast119899 with 119896 symmetric branchpoints

119875(lowast)120572119899 (119879lowast) = 2119899minus119896minus1Γ120572 (119899) prod(119886119887)isinNS(119879lowast)

120593120572 (119886 119887) (25)

This formula does not agree either with the one given in[4 Proposition 29] the difference lies again in the same factor

of 2 to the power of the number of internal nodes that are notsymmetric branch points but whose children have the samenumber of descendant leaves

The family of densitymappings (119875120572119899)119899 satisfies the follow-ing Markov branching recurrence

Corollary 5 For every 0 lt 119898 lt 119899 and for every 119879119898 isin T119898and 119879119899minus119898 isin T119899minus119898119875120572119899 (119879119898 ⋆ 119879119899minus119898)

= 2119902120572 (119898 119899 minus 119898)( 119899119898 ) 119875120572119898 (119879119898) 119875120572119899minus119898 (119879119899minus119898) (26)

Proof If 119879119898 isin T119898 and 119879119899minus119898 isin T119899minus119898 then119875120572119898 (119879119898) = 2119898minus1119898Γ120572 (119898) prod

(119886119887)isinNS(119879119898)120593120572 (119886 119887)

119875120572119899minus119898 (119879119899minus119898) = 2119899minus119898minus1(119899 minus 119898)Γ120572 (119899 minus 119898)sdot prod(119886119887)isinNS(119879119899minus119898)

120593120572 (119886 119887) 119875120572119899 (119879119898 ⋆ 119879119899minus119898) = 2119899minus1119899Γ120572 (119899) prod

(119886119887)isinNS(119879119898⋆119879119899minus119898)120593120572 (119886 119887)

= 2119899minus1119899Γ120572 (119899)120593120572 (119898 119899 minus 119898)( prod(119886119887)isinNS(119879119898)

120593120572 (119886 119887))sdot ( prod(119886119887)isinNS(119879119899minus119898)

120593120572 (119886 119887)) = 2119899minus1119899Γ120572 (119899)120593120572 (119898 119899 minus 119898)sdot 119898Γ120572 (119898)2119898minus1 119875120572119898 (119879119898) sdot (119899 minus 119898)Γ120572 (119899 minus 119898)2119899minus119898minus1sdot 119875120572119899minus119898 (119879119899minus119898) = 2119902120572 (119898 119899 minus 119898)( 119899119898 ) 119875120572119898 (119879119898)sdot 119875120572119899minus119898 (119879119899minus119898)

(27)

as we claimed

6 The Scientific World Journal

Figure 6 The tree shapes inTlowast6 mentioned in Remark 6

Remark 6 Against what is stated in [4] (119875(lowast)120572119899 )119899 does notsatisfy any Markov branching recurrence that is there doesnot exist any symmetric mapping 119876 Z+ times Z+ rarr R suchthat for every 119896 119897 ⩾ 1 and for every 119879119896 isin Tlowast119896 and 119879119897 isin Tlowast119897

119875(lowast)120572119896+119897

(119879119896 ⋆ 119879119897) = 119876 (119896 119897) sdot 119875(lowast)120572119896 (119879119896) sdot 119875(lowast)120572119897 (119879119897) (28)

Indeed let 119879lowast119898 lowast119898 isin Tlowast119898 be any two different tree shapesboth with 119898 leaves and 119896 symmetric branch points forinstance the tree shapes inTlowast6 depicted in Figure 6 Then

119875(lowast)120572119898 (119879lowast119898) = 2119898minus119896minus1Γ120572 (119898) prod(119886119887)isinNS(119879lowast

119898)

120593120572 (119886 119887) 119875(lowast)120572119898 (lowast119898) = 2119898minus119896minus1Γ120572 (119898) prod

(119886119887)isinNS(lowast119898)

120593120572 (119886 119887) (29)

In this case 119879lowast119898 ⋆ 119879lowast119898 isin Tlowast2119898 has 2119896 + 1 symmetric branchpoints and therefore

119875(lowast)1205722119898 (119879lowast119898 ⋆ 119879lowast119898) = 22119898minus2119896minus2Γ120572 (2119898) prod(119886119887)isinNS(119879lowast

119898⋆119879lowast119898)

120593120572 (119886 119887)

= 22119898minus2119896minus2Γ120572 (2119898) 120593120572 (119898119898)( prod(119886119887)isinNS(119879lowast

119898)

120593120572 (119886 119887))2

= 22119898minus2119896minus2Γ120572 (2119898) 120593120572 (119898119898) ( Γ120572 (119898)2119898minus119896minus1119875(lowast)120572119898 (119879lowast119898))2

= 119902120572 (119898119898) 119875(lowast)120572119898 (119879lowast119898) 119875(lowast)120572119898 (119879lowast119898)

(30)

while 119879lowast119898 ⋆ lowast119898 isin Tlowast2119898 has 2119896 symmetric branch points andtherefore

119875(lowast)1205722119898 (119879lowast119898 ⋆ lowast119898) = 22119898minus2119896minus1Γ120572 (2119898) prod(119886119887)isinNS(119879lowast

119898⋆lowast119898)

120593120572 (119886 119887)

= 22119898minus2119896minus1Γ120572 (2119898) 120593120572 (119898119898)( prod(119886119887)isinNS(119879lowast

119898)

120593120572 (119886 119887))

sdot ( prod(119886119887)isinNS(lowast

119898)

120593120572 (119886 119887)) = 22119898minus2119896minus1Γ120572 (2119898) 120593120572 (119898119898)sdot Γ120572 (119898)2119898minus119896minus1119875(lowast)120572119898 (119879lowast119898) sdot Γ120572 (119898)2119898minus119896minus1119875(lowast)120572119898 (lowast119898)= 2119902120572 (119898119898) 119875(lowast)120572119898 (119879lowast119898) 119875(lowast)120572119898 (lowast119898)

(31)

and 119902120572(119898119898) = 2119902120572(119898119898) This shows that there does notexist any well-defined single real number 119876(119898119898) such that

119875(lowast)1205722119898 (119879lowast1119898 ⋆ 119879lowast2119898) = 119876 (119898119898) sdot 119875(lowast)120572119898 (119879lowast1119898)sdot 119875(lowast)120572119898 (119879lowast2119898) (32)

for every 119879lowast1119898 119879lowast2119898 isin Tlowast119898Data Availability

The data used to support the findings of this study areavailable at the Github page that accompanies this paper

Conflicts of Interest

The authors declare that there are no conflicts of interestregarding the publication of this paper

Acknowledgments

This researchwas supported by SpanishMinistry of Economyand Competitiveness and European Regional DevelopmentFund Project DPI2015-67082-P (MINECOFEDER) Theauthors thank G Cardona and G Riera for several commentson the SageMath module that accompanies this paper

References

[1] G U Yule ldquoA Mathematical Theory of Evolution Based on theConclusions of Dr J C Willis FRSrdquo Philosophical Transac-tions of the Royal Society B Biological Sciences vol 213 no 402-410 pp 21ndash87 1925

[2] D Aldous ldquoProbability distributions on cladogramsrdquo in Ran-dom discrete structures D Aldous and R Pemantle Eds vol76 pp 1ndash18 Springer New York NY USA 1996

[3] M G B Blum and O Francois ldquoWhich random processesdescribe the tree of life A large-scale study of phylogenetic treeimbalancerdquo Systematic Biology vol 55 no 4 pp 685ndash691 2006

[4] D J Ford ldquoProbabilities on cladograms Introduction to thealpha modelrdquo PhD Thesis (Stanford University) httpsarxivorgabsmath0511246

[5] S Keller-Schmidt M Tugrul V M Eguıluz E Hernandez-Garcıa and K Klemm ldquoAnomalous scaling in an age-dependent branching modelrdquo Physical Review E StatisticalNonlinear and Soft Matter Physics vol 91 no 2 Article ID022803 2015

[6] M Kirkpatrick and M Slatkin ldquoSearching for EvolutionaryPatterns in the Shape of a Phylogenetic Treerdquo Evolution vol 47no 4 pp 1171ndash1181 1993

[7] L Popovic andM Rivas ldquoTopology and inference for Yule treeswith multiple statesrdquo Journal of Mathematical Biology vol 73no 5 pp 1251ndash1291 2016

[8] R Sainudiin and A Veber ldquoA Beta-splitting model for evolu-tionary treesrdquo Royal Society Open Science vol 3 no 5 ArticleID 160016 2016

[9] A O Mooers and S B Heard ldquoInferring evolutionary processfrom phylogenetic tree shaperdquoThe Quarterly Review of Biologyvol 72 no 1 pp 31ndash54 1997

The Scientific World Journal 7

[10] J Felsenstein Inferring Phylogenies Sinauer Associates Inc2004

[11] T M Coronado A Mir F Rossello and G Valiente ldquoA balanceindex for phylogenetic trees based on quartetsrdquo httpsarxivorgabs180105411

[12] SageMath the SageMathematics Software System (Version 76)The Sage Developers (2017) httpwwwsagemathorg

[13] L L Cavalli-Sforza and AW Edwards ldquoPhylogenetic AnalysisModels and Estimation Proceduresrdquo Evolution vol 21 no 3 pp550ndash570 1967

[14] E FHarding ldquoTheprobabilities of rooted tree-shapes generatedby random bifurcationrdquo Advances in Applied Probability vol 3pp 44ndash77 1971

Hindawiwwwhindawicom

International Journal of

Volume 2018

Zoology

Hindawiwwwhindawicom Volume 2018

Anatomy Research International

PeptidesInternational Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Journal of Parasitology Research

GenomicsInternational Journal of

Hindawiwwwhindawicom Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Hindawiwwwhindawicom Volume 2018

BioinformaticsAdvances in

Marine BiologyJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Neuroscience Journal

Hindawiwwwhindawicom Volume 2018

BioMed Research International

Cell BiologyInternational Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Biochemistry Research International

ArchaeaHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Genetics Research International

Hindawiwwwhindawicom Volume 2018

Advances in

Virolog y Stem Cells International

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Enzyme Research

Hindawiwwwhindawicom Volume 2018

International Journal of

MicrobiologyHindawiwwwhindawicom

Nucleic AcidsJournal of

Volume 2018

Submit your manuscripts atwwwhindawicom

Page 4: The Probabilities of Trees and Cladograms under Ford’sdownloads.hindawi.com/journals/tswj/2018/1916094.pdf · symbolic computation program, let us mention that if we take expression

4 The Scientific World Journal

1 2 3 4 5 6 7

Figure 2 A cladogram in T7 The black nodes are its symmetricbranch points

T

r

T

Figure 3 The root join 119879 ⋆ 11987910158403 Main Results

Our first result is an explicit formula for 119875120572119899(119879) for every119899 ⩾ 1 and 119879 isin T119899Proposition 2 For every 119879 isin T119899 its probability under the120572-model is

119875120572119899 (119879) = 2119899minus1119899 sdot Γ120572 (119899) prod(119886119887)isinNS(119879)

120593120572 (119886 119887) (11)

Proof Given 119879 isin T119899 let 119879119900 be any ordered cladogram suchthat 120587119900(119879119900) = 119879 and let 119879lowast119900 = 120587lowast(119879119900) isin OTlowast119899 and 119879lowast =120587(119879) = 120587119900lowast(119879lowast119900 ) If 119879lowast has 119896 symmetric branch points thenby (4) (6) and (10)

119875120572119899 (119879) = 2119896119899 sdot 119875(lowast)120572119899 (119879lowast) = 2119896

119899 sdot 2119899minus119896minus1 sdot 119875(119900lowast)120572119899 (119879lowast119900 )= 2119899minus1119899 prod

(119886119887)isinNS(119879lowast119900)

119902120572 (119886 119887) (12)

Now on the one hand it is easy to check that

NS (119879)= (min 119886 119887 max 119886 119887) | (119886 119887) isin NS (119879lowast0 ) (13)

and therefore since 119902120572 is symmetric

119875120572119899 (119879) = 2119899minus1119899 prod(119886119887)isinNS(119879)

119902120572 (119886 119887) (14)

It remains to simplify this product If for every V isin 119881int(119879)we denote its children by V1 and V2 then

prod(119886119887)isinNS(119879)

119902120572 (119886 119887)= prod

Visin119881int(119879)

Γ120572 (120581119879 (V1)) Γ120572 (120581119879 (V2))Γ120572 (120581119879 (V)) 120593120572 (NS (V)) (15)

For every V isin 119881int(119879) 119903119879 the term Γ120572(120581119879(V)) appearstwice in this product in the denominator of the factorcorresponding to V itself and in the numerator of the factor

corresponding to its parent Therefore all terms Γ120572(120581119879(V)) inthis product vanish except Γ120572(120581119879(119903119879)) = Γ120572(119899) (that appears inthe denominator of its factor) and every Γ120572(120581119879(V)) = Γ120572(1) =1 with V a leaf Thus

119875120572119899 (119879) = 2119899minus1119899 sdot 1Γ120572 (119899) sdot prodVisin119881int(119879)120593120572 (NS (V)) (16)

as we claimed

Remark 3 Ford states (see [4 Proposition 32 and page 30])that if 119879 isin T119899 then

119875120572119899 (119879) = 2119896119899 prod(119886119887)isinNS(119879)

119902120572 (119886 119887) (17)

where 119896 is the number of symmetric branching points in 119879and

119902120572 (119886 119887) = 2119902120572 (119886 119887) if 119886 = 119887119902120572 (119886 119887) if 119886 = 119887 (18)

If we simplifyprod(119886119887)isinNS(119879)119902120572(119886 119887) as in the proof of Proposi-tion 2 this formula for 119875120572119899(119879) becomes

119875120572119899 (119879) = 2119896+119898119899 sdot Γ120572 (119899) sdot prod(119886119887)isinNS(119879)

120593120572 (119886 119887) (19)

where119898 is the number of internal nodes whose children havedifferent numbers of descendant leaves This formula doesnot agree with the one given in Proposition 2 above because

119896 + 119898 = 119899 minus 1 minus 10038161003816100381610038161003816V isin 119881int (119879) | child (V)= V1 V2 120581119879 (V1) = 120581119879 (V2) but 120587 (119879V1)= 120587 (119879V2)10038161003816100381610038161003816

(20)

and hence it may happen that 119896 + 119898 lt 119899 minus 1 Thefirst example of a cladogram with this property (and theonly one up to relabeling with at most 8 leaves) is thecladogram isin T8 depicted in Figure 5 For it our formulagives (see (822) in the document ProblsAlphapdf inhttpsgithubcombiocom-uibprob-alpha)

1198751205728 () = (1 minus 120572)2 (2 minus 120572)126 (7 minus 120572) (6 minus 120572) (5 minus 120572) (3 minus 120572) (21)

while expression (19) assigns to a probability of half thisvalue

(1 minus 120572)2 (2 minus 120572)252 (7 minus 120572) (6 minus 120572) (5 minus 120572) (3 minus 120572) (22)

This last value cannot be right for several reasons Firstly by[4 sect312] when 120572 = 12 Fordrsquos model is equivalent to theuniform model where every cladogram in T119899 has the sameprobability

110038161003816100381610038161198611198791198991003816100381610038161003816 = 1(2119899 minus 3) (23)

The Scientific World Journal 5

1 2 1 2 3 1 2 34 1 2 345

1 2 1 3 2 1 3 24 1 5 243

P2 (T2) = 1

P2 (T2) = 1

P3 (T3) =

2 minus P4 (T4) =

1 minus

3 minus middot

2 minus P5 (T5) =

4 minus middot1 minus

3 minus middot

2 minus

P3 (T

3) =

1 minus

2 minus P4 (T

4) =

1 minus

3 minus middot1 minus

2 minus P5 (T

5) =

1 minus

4 minus middot1 minus

3 minus middot1 minus

2 minus

Figure 4 Two examples of computations of the probability 1198751015840120572119899 of a cladogram through its construction in Step (2) of the definition of the120572-model

1 2 3 4 5 6 7 8

Figure 5 The cladogram isin T8 used in Remark 3

and when 120572 = 0 Fordrsquos model gives rise to the Yule model[1 14] where the probability of every 119879 isin T119899 is

119875119884 (119879) = 2119899minus1119899 prodVisin119881int(119879)

1120581119879 (V) minus 1 (24)

In particular 119875128() should be equal to 1135135 and11987508() should be equal to 119845 Both values are consistentwith our formula while expression (22) yields half thesevalues

As a second reason which can be checked using asymbolic computation program let us mention that if wetake expression (22) as the probability of and hence ofall other cladograms with its shape and we assign to allother cladograms in T8 the probabilities computed withProposition 2 which agree on them with the values given by(19) (they are also provided in the aforementioned documentProblsAlphapdf) these probabilities do not add up 1

Combining Proposition 2 and (4) we obtain the followingresult

Corollary 4 For every 119879lowast isin Tlowast119899 with 119896 symmetric branchpoints

119875(lowast)120572119899 (119879lowast) = 2119899minus119896minus1Γ120572 (119899) prod(119886119887)isinNS(119879lowast)

120593120572 (119886 119887) (25)

This formula does not agree either with the one given in[4 Proposition 29] the difference lies again in the same factor

of 2 to the power of the number of internal nodes that are notsymmetric branch points but whose children have the samenumber of descendant leaves

The family of densitymappings (119875120572119899)119899 satisfies the follow-ing Markov branching recurrence

Corollary 5 For every 0 lt 119898 lt 119899 and for every 119879119898 isin T119898and 119879119899minus119898 isin T119899minus119898119875120572119899 (119879119898 ⋆ 119879119899minus119898)

= 2119902120572 (119898 119899 minus 119898)( 119899119898 ) 119875120572119898 (119879119898) 119875120572119899minus119898 (119879119899minus119898) (26)

Proof If 119879119898 isin T119898 and 119879119899minus119898 isin T119899minus119898 then119875120572119898 (119879119898) = 2119898minus1119898Γ120572 (119898) prod

(119886119887)isinNS(119879119898)120593120572 (119886 119887)

119875120572119899minus119898 (119879119899minus119898) = 2119899minus119898minus1(119899 minus 119898)Γ120572 (119899 minus 119898)sdot prod(119886119887)isinNS(119879119899minus119898)

120593120572 (119886 119887) 119875120572119899 (119879119898 ⋆ 119879119899minus119898) = 2119899minus1119899Γ120572 (119899) prod

(119886119887)isinNS(119879119898⋆119879119899minus119898)120593120572 (119886 119887)

= 2119899minus1119899Γ120572 (119899)120593120572 (119898 119899 minus 119898)( prod(119886119887)isinNS(119879119898)

120593120572 (119886 119887))sdot ( prod(119886119887)isinNS(119879119899minus119898)

120593120572 (119886 119887)) = 2119899minus1119899Γ120572 (119899)120593120572 (119898 119899 minus 119898)sdot 119898Γ120572 (119898)2119898minus1 119875120572119898 (119879119898) sdot (119899 minus 119898)Γ120572 (119899 minus 119898)2119899minus119898minus1sdot 119875120572119899minus119898 (119879119899minus119898) = 2119902120572 (119898 119899 minus 119898)( 119899119898 ) 119875120572119898 (119879119898)sdot 119875120572119899minus119898 (119879119899minus119898)

(27)

as we claimed

6 The Scientific World Journal

Figure 6 The tree shapes inTlowast6 mentioned in Remark 6

Remark 6 Against what is stated in [4] (119875(lowast)120572119899 )119899 does notsatisfy any Markov branching recurrence that is there doesnot exist any symmetric mapping 119876 Z+ times Z+ rarr R suchthat for every 119896 119897 ⩾ 1 and for every 119879119896 isin Tlowast119896 and 119879119897 isin Tlowast119897

119875(lowast)120572119896+119897

(119879119896 ⋆ 119879119897) = 119876 (119896 119897) sdot 119875(lowast)120572119896 (119879119896) sdot 119875(lowast)120572119897 (119879119897) (28)

Indeed let 119879lowast119898 lowast119898 isin Tlowast119898 be any two different tree shapesboth with 119898 leaves and 119896 symmetric branch points forinstance the tree shapes inTlowast6 depicted in Figure 6 Then

119875(lowast)120572119898 (119879lowast119898) = 2119898minus119896minus1Γ120572 (119898) prod(119886119887)isinNS(119879lowast

119898)

120593120572 (119886 119887) 119875(lowast)120572119898 (lowast119898) = 2119898minus119896minus1Γ120572 (119898) prod

(119886119887)isinNS(lowast119898)

120593120572 (119886 119887) (29)

In this case 119879lowast119898 ⋆ 119879lowast119898 isin Tlowast2119898 has 2119896 + 1 symmetric branchpoints and therefore

119875(lowast)1205722119898 (119879lowast119898 ⋆ 119879lowast119898) = 22119898minus2119896minus2Γ120572 (2119898) prod(119886119887)isinNS(119879lowast

119898⋆119879lowast119898)

120593120572 (119886 119887)

= 22119898minus2119896minus2Γ120572 (2119898) 120593120572 (119898119898)( prod(119886119887)isinNS(119879lowast

119898)

120593120572 (119886 119887))2

= 22119898minus2119896minus2Γ120572 (2119898) 120593120572 (119898119898) ( Γ120572 (119898)2119898minus119896minus1119875(lowast)120572119898 (119879lowast119898))2

= 119902120572 (119898119898) 119875(lowast)120572119898 (119879lowast119898) 119875(lowast)120572119898 (119879lowast119898)

(30)

while 119879lowast119898 ⋆ lowast119898 isin Tlowast2119898 has 2119896 symmetric branch points andtherefore

119875(lowast)1205722119898 (119879lowast119898 ⋆ lowast119898) = 22119898minus2119896minus1Γ120572 (2119898) prod(119886119887)isinNS(119879lowast

119898⋆lowast119898)

120593120572 (119886 119887)

= 22119898minus2119896minus1Γ120572 (2119898) 120593120572 (119898119898)( prod(119886119887)isinNS(119879lowast

119898)

120593120572 (119886 119887))

sdot ( prod(119886119887)isinNS(lowast

119898)

120593120572 (119886 119887)) = 22119898minus2119896minus1Γ120572 (2119898) 120593120572 (119898119898)sdot Γ120572 (119898)2119898minus119896minus1119875(lowast)120572119898 (119879lowast119898) sdot Γ120572 (119898)2119898minus119896minus1119875(lowast)120572119898 (lowast119898)= 2119902120572 (119898119898) 119875(lowast)120572119898 (119879lowast119898) 119875(lowast)120572119898 (lowast119898)

(31)

and 119902120572(119898119898) = 2119902120572(119898119898) This shows that there does notexist any well-defined single real number 119876(119898119898) such that

119875(lowast)1205722119898 (119879lowast1119898 ⋆ 119879lowast2119898) = 119876 (119898119898) sdot 119875(lowast)120572119898 (119879lowast1119898)sdot 119875(lowast)120572119898 (119879lowast2119898) (32)

for every 119879lowast1119898 119879lowast2119898 isin Tlowast119898Data Availability

The data used to support the findings of this study areavailable at the Github page that accompanies this paper

Conflicts of Interest

The authors declare that there are no conflicts of interestregarding the publication of this paper

Acknowledgments

This researchwas supported by SpanishMinistry of Economyand Competitiveness and European Regional DevelopmentFund Project DPI2015-67082-P (MINECOFEDER) Theauthors thank G Cardona and G Riera for several commentson the SageMath module that accompanies this paper

References

[1] G U Yule ldquoA Mathematical Theory of Evolution Based on theConclusions of Dr J C Willis FRSrdquo Philosophical Transac-tions of the Royal Society B Biological Sciences vol 213 no 402-410 pp 21ndash87 1925

[2] D Aldous ldquoProbability distributions on cladogramsrdquo in Ran-dom discrete structures D Aldous and R Pemantle Eds vol76 pp 1ndash18 Springer New York NY USA 1996

[3] M G B Blum and O Francois ldquoWhich random processesdescribe the tree of life A large-scale study of phylogenetic treeimbalancerdquo Systematic Biology vol 55 no 4 pp 685ndash691 2006

[4] D J Ford ldquoProbabilities on cladograms Introduction to thealpha modelrdquo PhD Thesis (Stanford University) httpsarxivorgabsmath0511246

[5] S Keller-Schmidt M Tugrul V M Eguıluz E Hernandez-Garcıa and K Klemm ldquoAnomalous scaling in an age-dependent branching modelrdquo Physical Review E StatisticalNonlinear and Soft Matter Physics vol 91 no 2 Article ID022803 2015

[6] M Kirkpatrick and M Slatkin ldquoSearching for EvolutionaryPatterns in the Shape of a Phylogenetic Treerdquo Evolution vol 47no 4 pp 1171ndash1181 1993

[7] L Popovic andM Rivas ldquoTopology and inference for Yule treeswith multiple statesrdquo Journal of Mathematical Biology vol 73no 5 pp 1251ndash1291 2016

[8] R Sainudiin and A Veber ldquoA Beta-splitting model for evolu-tionary treesrdquo Royal Society Open Science vol 3 no 5 ArticleID 160016 2016

[9] A O Mooers and S B Heard ldquoInferring evolutionary processfrom phylogenetic tree shaperdquoThe Quarterly Review of Biologyvol 72 no 1 pp 31ndash54 1997

The Scientific World Journal 7

[10] J Felsenstein Inferring Phylogenies Sinauer Associates Inc2004

[11] T M Coronado A Mir F Rossello and G Valiente ldquoA balanceindex for phylogenetic trees based on quartetsrdquo httpsarxivorgabs180105411

[12] SageMath the SageMathematics Software System (Version 76)The Sage Developers (2017) httpwwwsagemathorg

[13] L L Cavalli-Sforza and AW Edwards ldquoPhylogenetic AnalysisModels and Estimation Proceduresrdquo Evolution vol 21 no 3 pp550ndash570 1967

[14] E FHarding ldquoTheprobabilities of rooted tree-shapes generatedby random bifurcationrdquo Advances in Applied Probability vol 3pp 44ndash77 1971

Hindawiwwwhindawicom

International Journal of

Volume 2018

Zoology

Hindawiwwwhindawicom Volume 2018

Anatomy Research International

PeptidesInternational Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Journal of Parasitology Research

GenomicsInternational Journal of

Hindawiwwwhindawicom Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Hindawiwwwhindawicom Volume 2018

BioinformaticsAdvances in

Marine BiologyJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Neuroscience Journal

Hindawiwwwhindawicom Volume 2018

BioMed Research International

Cell BiologyInternational Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Biochemistry Research International

ArchaeaHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Genetics Research International

Hindawiwwwhindawicom Volume 2018

Advances in

Virolog y Stem Cells International

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Enzyme Research

Hindawiwwwhindawicom Volume 2018

International Journal of

MicrobiologyHindawiwwwhindawicom

Nucleic AcidsJournal of

Volume 2018

Submit your manuscripts atwwwhindawicom

Page 5: The Probabilities of Trees and Cladograms under Ford’sdownloads.hindawi.com/journals/tswj/2018/1916094.pdf · symbolic computation program, let us mention that if we take expression

The Scientific World Journal 5

1 2 1 2 3 1 2 34 1 2 345

1 2 1 3 2 1 3 24 1 5 243

P2 (T2) = 1

P2 (T2) = 1

P3 (T3) =

2 minus P4 (T4) =

1 minus

3 minus middot

2 minus P5 (T5) =

4 minus middot1 minus

3 minus middot

2 minus

P3 (T

3) =

1 minus

2 minus P4 (T

4) =

1 minus

3 minus middot1 minus

2 minus P5 (T

5) =

1 minus

4 minus middot1 minus

3 minus middot1 minus

2 minus

Figure 4 Two examples of computations of the probability 1198751015840120572119899 of a cladogram through its construction in Step (2) of the definition of the120572-model

1 2 3 4 5 6 7 8

Figure 5 The cladogram isin T8 used in Remark 3

and when 120572 = 0 Fordrsquos model gives rise to the Yule model[1 14] where the probability of every 119879 isin T119899 is

119875119884 (119879) = 2119899minus1119899 prodVisin119881int(119879)

1120581119879 (V) minus 1 (24)

In particular 119875128() should be equal to 1135135 and11987508() should be equal to 119845 Both values are consistentwith our formula while expression (22) yields half thesevalues

As a second reason which can be checked using asymbolic computation program let us mention that if wetake expression (22) as the probability of and hence ofall other cladograms with its shape and we assign to allother cladograms in T8 the probabilities computed withProposition 2 which agree on them with the values given by(19) (they are also provided in the aforementioned documentProblsAlphapdf) these probabilities do not add up 1

Combining Proposition 2 and (4) we obtain the followingresult

Corollary 4 For every 119879lowast isin Tlowast119899 with 119896 symmetric branchpoints

119875(lowast)120572119899 (119879lowast) = 2119899minus119896minus1Γ120572 (119899) prod(119886119887)isinNS(119879lowast)

120593120572 (119886 119887) (25)

This formula does not agree either with the one given in[4 Proposition 29] the difference lies again in the same factor

of 2 to the power of the number of internal nodes that are notsymmetric branch points but whose children have the samenumber of descendant leaves

The family of densitymappings (119875120572119899)119899 satisfies the follow-ing Markov branching recurrence

Corollary 5 For every 0 lt 119898 lt 119899 and for every 119879119898 isin T119898and 119879119899minus119898 isin T119899minus119898119875120572119899 (119879119898 ⋆ 119879119899minus119898)

= 2119902120572 (119898 119899 minus 119898)( 119899119898 ) 119875120572119898 (119879119898) 119875120572119899minus119898 (119879119899minus119898) (26)

Proof If 119879119898 isin T119898 and 119879119899minus119898 isin T119899minus119898 then119875120572119898 (119879119898) = 2119898minus1119898Γ120572 (119898) prod

(119886119887)isinNS(119879119898)120593120572 (119886 119887)

119875120572119899minus119898 (119879119899minus119898) = 2119899minus119898minus1(119899 minus 119898)Γ120572 (119899 minus 119898)sdot prod(119886119887)isinNS(119879119899minus119898)

120593120572 (119886 119887) 119875120572119899 (119879119898 ⋆ 119879119899minus119898) = 2119899minus1119899Γ120572 (119899) prod

(119886119887)isinNS(119879119898⋆119879119899minus119898)120593120572 (119886 119887)

= 2119899minus1119899Γ120572 (119899)120593120572 (119898 119899 minus 119898)( prod(119886119887)isinNS(119879119898)

120593120572 (119886 119887))sdot ( prod(119886119887)isinNS(119879119899minus119898)

120593120572 (119886 119887)) = 2119899minus1119899Γ120572 (119899)120593120572 (119898 119899 minus 119898)sdot 119898Γ120572 (119898)2119898minus1 119875120572119898 (119879119898) sdot (119899 minus 119898)Γ120572 (119899 minus 119898)2119899minus119898minus1sdot 119875120572119899minus119898 (119879119899minus119898) = 2119902120572 (119898 119899 minus 119898)( 119899119898 ) 119875120572119898 (119879119898)sdot 119875120572119899minus119898 (119879119899minus119898)

(27)

as we claimed

6 The Scientific World Journal

Figure 6 The tree shapes inTlowast6 mentioned in Remark 6

Remark 6 Against what is stated in [4] (119875(lowast)120572119899 )119899 does notsatisfy any Markov branching recurrence that is there doesnot exist any symmetric mapping 119876 Z+ times Z+ rarr R suchthat for every 119896 119897 ⩾ 1 and for every 119879119896 isin Tlowast119896 and 119879119897 isin Tlowast119897

119875(lowast)120572119896+119897

(119879119896 ⋆ 119879119897) = 119876 (119896 119897) sdot 119875(lowast)120572119896 (119879119896) sdot 119875(lowast)120572119897 (119879119897) (28)

Indeed let 119879lowast119898 lowast119898 isin Tlowast119898 be any two different tree shapesboth with 119898 leaves and 119896 symmetric branch points forinstance the tree shapes inTlowast6 depicted in Figure 6 Then

119875(lowast)120572119898 (119879lowast119898) = 2119898minus119896minus1Γ120572 (119898) prod(119886119887)isinNS(119879lowast

119898)

120593120572 (119886 119887) 119875(lowast)120572119898 (lowast119898) = 2119898minus119896minus1Γ120572 (119898) prod

(119886119887)isinNS(lowast119898)

120593120572 (119886 119887) (29)

In this case 119879lowast119898 ⋆ 119879lowast119898 isin Tlowast2119898 has 2119896 + 1 symmetric branchpoints and therefore

119875(lowast)1205722119898 (119879lowast119898 ⋆ 119879lowast119898) = 22119898minus2119896minus2Γ120572 (2119898) prod(119886119887)isinNS(119879lowast

119898⋆119879lowast119898)

120593120572 (119886 119887)

= 22119898minus2119896minus2Γ120572 (2119898) 120593120572 (119898119898)( prod(119886119887)isinNS(119879lowast

119898)

120593120572 (119886 119887))2

= 22119898minus2119896minus2Γ120572 (2119898) 120593120572 (119898119898) ( Γ120572 (119898)2119898minus119896minus1119875(lowast)120572119898 (119879lowast119898))2

= 119902120572 (119898119898) 119875(lowast)120572119898 (119879lowast119898) 119875(lowast)120572119898 (119879lowast119898)

(30)

while 119879lowast119898 ⋆ lowast119898 isin Tlowast2119898 has 2119896 symmetric branch points andtherefore

119875(lowast)1205722119898 (119879lowast119898 ⋆ lowast119898) = 22119898minus2119896minus1Γ120572 (2119898) prod(119886119887)isinNS(119879lowast

119898⋆lowast119898)

120593120572 (119886 119887)

= 22119898minus2119896minus1Γ120572 (2119898) 120593120572 (119898119898)( prod(119886119887)isinNS(119879lowast

119898)

120593120572 (119886 119887))

sdot ( prod(119886119887)isinNS(lowast

119898)

120593120572 (119886 119887)) = 22119898minus2119896minus1Γ120572 (2119898) 120593120572 (119898119898)sdot Γ120572 (119898)2119898minus119896minus1119875(lowast)120572119898 (119879lowast119898) sdot Γ120572 (119898)2119898minus119896minus1119875(lowast)120572119898 (lowast119898)= 2119902120572 (119898119898) 119875(lowast)120572119898 (119879lowast119898) 119875(lowast)120572119898 (lowast119898)

(31)

and 119902120572(119898119898) = 2119902120572(119898119898) This shows that there does notexist any well-defined single real number 119876(119898119898) such that

119875(lowast)1205722119898 (119879lowast1119898 ⋆ 119879lowast2119898) = 119876 (119898119898) sdot 119875(lowast)120572119898 (119879lowast1119898)sdot 119875(lowast)120572119898 (119879lowast2119898) (32)

for every 119879lowast1119898 119879lowast2119898 isin Tlowast119898Data Availability

The data used to support the findings of this study areavailable at the Github page that accompanies this paper

Conflicts of Interest

The authors declare that there are no conflicts of interestregarding the publication of this paper

Acknowledgments

This researchwas supported by SpanishMinistry of Economyand Competitiveness and European Regional DevelopmentFund Project DPI2015-67082-P (MINECOFEDER) Theauthors thank G Cardona and G Riera for several commentson the SageMath module that accompanies this paper

References

[1] G U Yule ldquoA Mathematical Theory of Evolution Based on theConclusions of Dr J C Willis FRSrdquo Philosophical Transac-tions of the Royal Society B Biological Sciences vol 213 no 402-410 pp 21ndash87 1925

[2] D Aldous ldquoProbability distributions on cladogramsrdquo in Ran-dom discrete structures D Aldous and R Pemantle Eds vol76 pp 1ndash18 Springer New York NY USA 1996

[3] M G B Blum and O Francois ldquoWhich random processesdescribe the tree of life A large-scale study of phylogenetic treeimbalancerdquo Systematic Biology vol 55 no 4 pp 685ndash691 2006

[4] D J Ford ldquoProbabilities on cladograms Introduction to thealpha modelrdquo PhD Thesis (Stanford University) httpsarxivorgabsmath0511246

[5] S Keller-Schmidt M Tugrul V M Eguıluz E Hernandez-Garcıa and K Klemm ldquoAnomalous scaling in an age-dependent branching modelrdquo Physical Review E StatisticalNonlinear and Soft Matter Physics vol 91 no 2 Article ID022803 2015

[6] M Kirkpatrick and M Slatkin ldquoSearching for EvolutionaryPatterns in the Shape of a Phylogenetic Treerdquo Evolution vol 47no 4 pp 1171ndash1181 1993

[7] L Popovic andM Rivas ldquoTopology and inference for Yule treeswith multiple statesrdquo Journal of Mathematical Biology vol 73no 5 pp 1251ndash1291 2016

[8] R Sainudiin and A Veber ldquoA Beta-splitting model for evolu-tionary treesrdquo Royal Society Open Science vol 3 no 5 ArticleID 160016 2016

[9] A O Mooers and S B Heard ldquoInferring evolutionary processfrom phylogenetic tree shaperdquoThe Quarterly Review of Biologyvol 72 no 1 pp 31ndash54 1997

The Scientific World Journal 7

[10] J Felsenstein Inferring Phylogenies Sinauer Associates Inc2004

[11] T M Coronado A Mir F Rossello and G Valiente ldquoA balanceindex for phylogenetic trees based on quartetsrdquo httpsarxivorgabs180105411

[12] SageMath the SageMathematics Software System (Version 76)The Sage Developers (2017) httpwwwsagemathorg

[13] L L Cavalli-Sforza and AW Edwards ldquoPhylogenetic AnalysisModels and Estimation Proceduresrdquo Evolution vol 21 no 3 pp550ndash570 1967

[14] E FHarding ldquoTheprobabilities of rooted tree-shapes generatedby random bifurcationrdquo Advances in Applied Probability vol 3pp 44ndash77 1971

Hindawiwwwhindawicom

International Journal of

Volume 2018

Zoology

Hindawiwwwhindawicom Volume 2018

Anatomy Research International

PeptidesInternational Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Journal of Parasitology Research

GenomicsInternational Journal of

Hindawiwwwhindawicom Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Hindawiwwwhindawicom Volume 2018

BioinformaticsAdvances in

Marine BiologyJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Neuroscience Journal

Hindawiwwwhindawicom Volume 2018

BioMed Research International

Cell BiologyInternational Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Biochemistry Research International

ArchaeaHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Genetics Research International

Hindawiwwwhindawicom Volume 2018

Advances in

Virolog y Stem Cells International

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Enzyme Research

Hindawiwwwhindawicom Volume 2018

International Journal of

MicrobiologyHindawiwwwhindawicom

Nucleic AcidsJournal of

Volume 2018

Submit your manuscripts atwwwhindawicom

Page 6: The Probabilities of Trees and Cladograms under Ford’sdownloads.hindawi.com/journals/tswj/2018/1916094.pdf · symbolic computation program, let us mention that if we take expression

6 The Scientific World Journal

Figure 6 The tree shapes inTlowast6 mentioned in Remark 6

Remark 6 Against what is stated in [4] (119875(lowast)120572119899 )119899 does notsatisfy any Markov branching recurrence that is there doesnot exist any symmetric mapping 119876 Z+ times Z+ rarr R suchthat for every 119896 119897 ⩾ 1 and for every 119879119896 isin Tlowast119896 and 119879119897 isin Tlowast119897

119875(lowast)120572119896+119897

(119879119896 ⋆ 119879119897) = 119876 (119896 119897) sdot 119875(lowast)120572119896 (119879119896) sdot 119875(lowast)120572119897 (119879119897) (28)

Indeed let 119879lowast119898 lowast119898 isin Tlowast119898 be any two different tree shapesboth with 119898 leaves and 119896 symmetric branch points forinstance the tree shapes inTlowast6 depicted in Figure 6 Then

119875(lowast)120572119898 (119879lowast119898) = 2119898minus119896minus1Γ120572 (119898) prod(119886119887)isinNS(119879lowast

119898)

120593120572 (119886 119887) 119875(lowast)120572119898 (lowast119898) = 2119898minus119896minus1Γ120572 (119898) prod

(119886119887)isinNS(lowast119898)

120593120572 (119886 119887) (29)

In this case 119879lowast119898 ⋆ 119879lowast119898 isin Tlowast2119898 has 2119896 + 1 symmetric branchpoints and therefore

119875(lowast)1205722119898 (119879lowast119898 ⋆ 119879lowast119898) = 22119898minus2119896minus2Γ120572 (2119898) prod(119886119887)isinNS(119879lowast

119898⋆119879lowast119898)

120593120572 (119886 119887)

= 22119898minus2119896minus2Γ120572 (2119898) 120593120572 (119898119898)( prod(119886119887)isinNS(119879lowast

119898)

120593120572 (119886 119887))2

= 22119898minus2119896minus2Γ120572 (2119898) 120593120572 (119898119898) ( Γ120572 (119898)2119898minus119896minus1119875(lowast)120572119898 (119879lowast119898))2

= 119902120572 (119898119898) 119875(lowast)120572119898 (119879lowast119898) 119875(lowast)120572119898 (119879lowast119898)

(30)

while 119879lowast119898 ⋆ lowast119898 isin Tlowast2119898 has 2119896 symmetric branch points andtherefore

119875(lowast)1205722119898 (119879lowast119898 ⋆ lowast119898) = 22119898minus2119896minus1Γ120572 (2119898) prod(119886119887)isinNS(119879lowast

119898⋆lowast119898)

120593120572 (119886 119887)

= 22119898minus2119896minus1Γ120572 (2119898) 120593120572 (119898119898)( prod(119886119887)isinNS(119879lowast

119898)

120593120572 (119886 119887))

sdot ( prod(119886119887)isinNS(lowast

119898)

120593120572 (119886 119887)) = 22119898minus2119896minus1Γ120572 (2119898) 120593120572 (119898119898)sdot Γ120572 (119898)2119898minus119896minus1119875(lowast)120572119898 (119879lowast119898) sdot Γ120572 (119898)2119898minus119896minus1119875(lowast)120572119898 (lowast119898)= 2119902120572 (119898119898) 119875(lowast)120572119898 (119879lowast119898) 119875(lowast)120572119898 (lowast119898)

(31)

and 119902120572(119898119898) = 2119902120572(119898119898) This shows that there does notexist any well-defined single real number 119876(119898119898) such that

119875(lowast)1205722119898 (119879lowast1119898 ⋆ 119879lowast2119898) = 119876 (119898119898) sdot 119875(lowast)120572119898 (119879lowast1119898)sdot 119875(lowast)120572119898 (119879lowast2119898) (32)

for every 119879lowast1119898 119879lowast2119898 isin Tlowast119898Data Availability

The data used to support the findings of this study areavailable at the Github page that accompanies this paper

Conflicts of Interest

The authors declare that there are no conflicts of interestregarding the publication of this paper

Acknowledgments

This researchwas supported by SpanishMinistry of Economyand Competitiveness and European Regional DevelopmentFund Project DPI2015-67082-P (MINECOFEDER) Theauthors thank G Cardona and G Riera for several commentson the SageMath module that accompanies this paper

References

[1] G U Yule ldquoA Mathematical Theory of Evolution Based on theConclusions of Dr J C Willis FRSrdquo Philosophical Transac-tions of the Royal Society B Biological Sciences vol 213 no 402-410 pp 21ndash87 1925

[2] D Aldous ldquoProbability distributions on cladogramsrdquo in Ran-dom discrete structures D Aldous and R Pemantle Eds vol76 pp 1ndash18 Springer New York NY USA 1996

[3] M G B Blum and O Francois ldquoWhich random processesdescribe the tree of life A large-scale study of phylogenetic treeimbalancerdquo Systematic Biology vol 55 no 4 pp 685ndash691 2006

[4] D J Ford ldquoProbabilities on cladograms Introduction to thealpha modelrdquo PhD Thesis (Stanford University) httpsarxivorgabsmath0511246

[5] S Keller-Schmidt M Tugrul V M Eguıluz E Hernandez-Garcıa and K Klemm ldquoAnomalous scaling in an age-dependent branching modelrdquo Physical Review E StatisticalNonlinear and Soft Matter Physics vol 91 no 2 Article ID022803 2015

[6] M Kirkpatrick and M Slatkin ldquoSearching for EvolutionaryPatterns in the Shape of a Phylogenetic Treerdquo Evolution vol 47no 4 pp 1171ndash1181 1993

[7] L Popovic andM Rivas ldquoTopology and inference for Yule treeswith multiple statesrdquo Journal of Mathematical Biology vol 73no 5 pp 1251ndash1291 2016

[8] R Sainudiin and A Veber ldquoA Beta-splitting model for evolu-tionary treesrdquo Royal Society Open Science vol 3 no 5 ArticleID 160016 2016

[9] A O Mooers and S B Heard ldquoInferring evolutionary processfrom phylogenetic tree shaperdquoThe Quarterly Review of Biologyvol 72 no 1 pp 31ndash54 1997

The Scientific World Journal 7

[10] J Felsenstein Inferring Phylogenies Sinauer Associates Inc2004

[11] T M Coronado A Mir F Rossello and G Valiente ldquoA balanceindex for phylogenetic trees based on quartetsrdquo httpsarxivorgabs180105411

[12] SageMath the SageMathematics Software System (Version 76)The Sage Developers (2017) httpwwwsagemathorg

[13] L L Cavalli-Sforza and AW Edwards ldquoPhylogenetic AnalysisModels and Estimation Proceduresrdquo Evolution vol 21 no 3 pp550ndash570 1967

[14] E FHarding ldquoTheprobabilities of rooted tree-shapes generatedby random bifurcationrdquo Advances in Applied Probability vol 3pp 44ndash77 1971

Hindawiwwwhindawicom

International Journal of

Volume 2018

Zoology

Hindawiwwwhindawicom Volume 2018

Anatomy Research International

PeptidesInternational Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Journal of Parasitology Research

GenomicsInternational Journal of

Hindawiwwwhindawicom Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Hindawiwwwhindawicom Volume 2018

BioinformaticsAdvances in

Marine BiologyJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Neuroscience Journal

Hindawiwwwhindawicom Volume 2018

BioMed Research International

Cell BiologyInternational Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Biochemistry Research International

ArchaeaHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Genetics Research International

Hindawiwwwhindawicom Volume 2018

Advances in

Virolog y Stem Cells International

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Enzyme Research

Hindawiwwwhindawicom Volume 2018

International Journal of

MicrobiologyHindawiwwwhindawicom

Nucleic AcidsJournal of

Volume 2018

Submit your manuscripts atwwwhindawicom

Page 7: The Probabilities of Trees and Cladograms under Ford’sdownloads.hindawi.com/journals/tswj/2018/1916094.pdf · symbolic computation program, let us mention that if we take expression

The Scientific World Journal 7

[10] J Felsenstein Inferring Phylogenies Sinauer Associates Inc2004

[11] T M Coronado A Mir F Rossello and G Valiente ldquoA balanceindex for phylogenetic trees based on quartetsrdquo httpsarxivorgabs180105411

[12] SageMath the SageMathematics Software System (Version 76)The Sage Developers (2017) httpwwwsagemathorg

[13] L L Cavalli-Sforza and AW Edwards ldquoPhylogenetic AnalysisModels and Estimation Proceduresrdquo Evolution vol 21 no 3 pp550ndash570 1967

[14] E FHarding ldquoTheprobabilities of rooted tree-shapes generatedby random bifurcationrdquo Advances in Applied Probability vol 3pp 44ndash77 1971

Hindawiwwwhindawicom

International Journal of

Volume 2018

Zoology

Hindawiwwwhindawicom Volume 2018

Anatomy Research International

PeptidesInternational Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Journal of Parasitology Research

GenomicsInternational Journal of

Hindawiwwwhindawicom Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Hindawiwwwhindawicom Volume 2018

BioinformaticsAdvances in

Marine BiologyJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Neuroscience Journal

Hindawiwwwhindawicom Volume 2018

BioMed Research International

Cell BiologyInternational Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Biochemistry Research International

ArchaeaHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Genetics Research International

Hindawiwwwhindawicom Volume 2018

Advances in

Virolog y Stem Cells International

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Enzyme Research

Hindawiwwwhindawicom Volume 2018

International Journal of

MicrobiologyHindawiwwwhindawicom

Nucleic AcidsJournal of

Volume 2018

Submit your manuscripts atwwwhindawicom

Page 8: The Probabilities of Trees and Cladograms under Ford’sdownloads.hindawi.com/journals/tswj/2018/1916094.pdf · symbolic computation program, let us mention that if we take expression

Hindawiwwwhindawicom

International Journal of

Volume 2018

Zoology

Hindawiwwwhindawicom Volume 2018

Anatomy Research International

PeptidesInternational Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Journal of Parasitology Research

GenomicsInternational Journal of

Hindawiwwwhindawicom Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Hindawiwwwhindawicom Volume 2018

BioinformaticsAdvances in

Marine BiologyJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Neuroscience Journal

Hindawiwwwhindawicom Volume 2018

BioMed Research International

Cell BiologyInternational Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Biochemistry Research International

ArchaeaHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Genetics Research International

Hindawiwwwhindawicom Volume 2018

Advances in

Virolog y Stem Cells International

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Enzyme Research

Hindawiwwwhindawicom Volume 2018

International Journal of

MicrobiologyHindawiwwwhindawicom

Nucleic AcidsJournal of

Volume 2018

Submit your manuscripts atwwwhindawicom