The Fowlkes–Mallows Statistic and the Comparison of Two Independently Determined Dendrograms


A. F. L. Nemec¹ and R. O. Brinkhurst

Ocean Ecology Division, Institute of Ocean Sciences, 9860 West Saanich Road, P.O. Box 6000, Sidney, B.C. V8L 4B2

Nemec, A. F. L., and R. O. Brinkhurst. 1988. The Fowlkes-Mallows statistic and the comparison of two independently determined dendrograms. Can. J. Fish. Aquat. Sci. 45: 971-975.

When interpreting the results of a cluster analysis, it is important to understand why specific clustering patterns arise. Comparison of a "dependent" dendrogram with a second, independently determined "covariate" dendrogram (i.e. one that is based solely on information provided by various explanatory variables) is a simple way of investigating the role played by the covariates. The Fowlkes-Mallows statistic, which is a measure of the degree of similarity between two dendrograms, can be used to test the null hypothesis that two dendrograms are unrelated. We show that the Fowlkes-Mallows test can be usefully employed in the systematic comparison of a dependent dendrogram and a covariate dendrogram. Since the test is nonparametric, it is applicable to a wide range of problems. As an illustrative example, a species abundance matrix for several benthic communities is subjected to a standard cluster analysis, and the resultant (dependent) dendrogram is compared with a clustering based on the geographic location of the communities. At least some of the clustering seen in the dependent dendrogram can be attributed to the geographic proximity of the stations.

When the results of a cluster analysis are interpreted, it is important to understand why particular cluster structures appear. Comparing a "dependent" dendrogram with a second, independently determined "covariate" dendrogram (i.e. one based solely on the information supplied by various explanatory variables) offers a simple means of studying the role played by the covariates. The Fowlkes-Mallows statistic, which measures the degree of similarity between two dendrograms, can serve to test the null hypothesis that two dendrograms show no correlation. We show that the Fowlkes-Mallows test can serve effectively in the systematic comparison of a dependent dendrogram and a covariate dendrogram. Because the test is nonparametric, it can be applied to a wide range of problems. By way of example, a species abundance matrix for several benthic communities is subjected to a standard cluster analysis, and the resulting (dependent) dendrogram is compared with a cluster structure based on the geographic location of the communities. At least part of the aggregation observed in the dependent dendrogram can be attributed to the geographic proximity of the stations.

Received August 26, 1987. Accepted February 9, 1988.

Cluster analysis is used to group objects according to similarity. In most applications, a single data set is subjected to an agglomerative, hierarchical cluster analysis and the results are presented as a dendrogram. Central to the interpretation of a dendrogram is the distinction between clusters that are real, i.e. reflect a natural division of the underlying population, and clusters that are spurious. This question is usually resolved by considering the dendrogram, and the data set from which it was derived, in isolation. However, the interpretation of a dendrogram does not end with cluster identification. It is often important to understand why specific clustering patterns arise. This leads to the consideration of factors other than those which were used to construct the dendrogram. The former factors will be referred to as the covariates, or independent variables, while the latter will be referred to as the dependent variables. In the present paper, we focus on the relationship between the covariates and the morphology of the "dependent dendrogram," and not explicitly on the problem of cluster identification.

¹Present address: International Statistics and Research Corp., P.O. Box 496, Brentwood Bay, B.C. V0S 1A0.

There are various ways of investigating the role of covariates in a cluster analysis. The approach taken here is the systematic comparison of the dependent dendrogram with a fixed "covariate dendrogram," i.e. a preconceived clustering based on a knowledge of the covariates alone. A comparison of the two dendrograms may lead to the identification of clustering patterns (in the dependent dendrogram) that can be predicted by, or attributed to, clusterings associated with the covariate(s). For example, studies of benthic communities often reveal that clusters derived from an analysis of species abundance data can be explained by environmental factors, such as substrate type, nutrient levels, water temperature, and salinity.

The resemblance between the dependent dendrogram and the covariate dendrogram must be assessed objectively. In particular, it is necessary to determine how much apparent similarity can be expected if there is no relationship between the two dendrograms. Fowlkes and Mallows (1983) defined a measure of the degree of similarity between two dendrograms. The Fowlkes-Mallows statistic is sufficiently simple that its distributional properties are relatively easy to determine, if a certain condition is imposed on the structure of the two dendrograms, and there is a "random allocation" of the objects to the clusters. Consequently, Fowlkes and Mallows were able to derive a nonparametric test of the null hypothesis that two dendrograms are unrelated.

Fowlkes and Mallows illustrated their test by comparing the results of two different types of cluster analysis, which were applied to the same data. In that case, the null hypothesis that the two clusterings are unrelated is unlikely to be true. A more realistic null hypothesis would have specified some kind of relationship between the two dendrograms (see Nemec and Brinkhurst (1988) for a related discussion). Fowlkes and Mallows' inappropriate application of the test led to the criticism that their null hypothesis is a "straw-man" and lacks relevance in many applications (see the comments (Wallace 1983) that accompany Fowlkes and Mallows 1983). Unfortunately, the criticism belies the usefulness of the Fowlkes-Mallows test when it is applied correctly, e.g. to the comparison of two independently determined dendrograms.

In the present paper, we show that the Fowlkes-Mallows statistic can be usefully employed in the systematic comparison of a dependent dendrogram and a fixed covariate dendrogram. The null distribution is derived by imposing a condition on the structure of the dendrograms that is often more appropriate than the Fowlkes-Mallows condition. An efficient procedure for generating the null distribution and an application of the test are described.

Methods

Let T1 be the dependent dendrogram, which is derived from an agglomerative, hierarchical cluster analysis of N objects, and let T2 be a preconceived clustering, which is based on a knowledge of one or more independent variables. It will be assumed that T1 is subject to random variation, due to variability in the dependent variables, while T2 is considered fixed.

The Fowlkes-Mallows Statistic

The degree of similarity between T1 and T2 can be measured by the statistic defined by Fowlkes and Mallows (1983), which compares the composition of the clusters defined by the two trees. The statistic is evaluated by cutting T1 at a level of similarity that defines k clusters, i.e. the (N - k)th linkage level, if the linkages are numbered in order of decreasing similarity, and by cutting T2 at a level that defines l clusters. For the given k and l, the Fowlkes-Mallows statistic B(k,l) is defined as:

$$B(k,l) = \frac{\sum_{i=1}^{k}\sum_{j=1}^{l} m_{ij}^{2} - N}{\sqrt{\left(\sum_{i=1}^{k} m_{i\cdot}^{2} - N\right)\left(\sum_{j=1}^{l} m_{\cdot j}^{2} - N\right)}}$$

where m_ij is the number of objects that are common to the ith cluster of T1 and the jth cluster of T2 (i = 1, 2, ..., k; j = 1, 2, ..., l); the dot notation means summation over the corresponding subscript. The choice of k and l is discussed in the next section.
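A minimal Python sketch of the statistic, assuming each cut is supplied as a list of cluster labels, one per object. The function name and the value 0.0 returned when a denominator term vanishes (the 0/0 case, e.g. k = N) are illustrative choices, not part of the paper.

```python
from collections import Counter
from math import sqrt


def fowlkes_mallows(labels1, labels2):
    """B(k,l) for two flat clusterings of the same N objects."""
    n = len(labels1)
    if n != len(labels2):
        raise ValueError("both clusterings must label the same N objects")
    m_ij = Counter(zip(labels1, labels2))  # matching counts m_ij
    m_i = Counter(labels1)                 # row totals m_i.
    m_j = Counter(labels2)                 # column totals m_.j
    t = sum(v * v for v in m_ij.values()) - n
    p = sum(v * v for v in m_i.values()) - n
    q = sum(v * v for v in m_j.values()) - n
    if p == 0 or q == 0:
        # One clustering consists of singletons only; B is 0/0.
        # Returning 0.0 here is an assumption made for convenience.
        return 0.0
    return t / sqrt(p * q)


if __name__ == "__main__":
    print(fowlkes_mallows([0, 0, 1, 1], [0, 0, 1, 1]))  # identical clusters
    print(fowlkes_mallows([0, 0, 1, 1], [0, 1, 0, 1]))  # no shared pairs
```

Only the cluster memberships enter the calculation, so the labels themselves can be arbitrary hashable values.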

The Fowlkes-Mallows statistic depends only on the sizes (m_i., i = 1, 2, ..., k; m_.j, j = 1, 2, ..., l) and the composition of the clusters, and not on the associated similarity scales. This is an advantage in the present application, since the two scales may not be directly comparable. For example, in a study of benthic communities, T1 might correspond to a single linkage cluster analysis of species presence-absence data, with the similarity scale defined by the Jaccard coefficient, while T2 might correspond to a single linkage cluster analysis of water temperatures, with the similarity scale defined by Euclidean distance.

The range of B(k,l) is 0 to 1. The minimum occurs when there is no similarity between the two sets of clusters defined by T1 and T2, or when m_ij = 0 or 1 for all i and j. The latter condition holds in the trivial case of k = N or l = N, which implies that the corresponding clusters each contain a single element. More generally, the condition holds when every pair of objects that belongs to the same cluster according to T1 belongs to two different clusters according to T2. In cases where N > kl, at least one m_ij must exceed 1, so the minimum value of zero cannot be achieved. Wallace (1983) has suggested that B(k,l) should be modified to take this into account. If k = l, it is easy to see that B(k,l) = 1 when the clusters defined by T1 and T2 are identical, i.e. when m_ij = m_i. = m_.j for i = j and m_ij = 0 for i ≠ j. In the general case, where k ≠ l, the maximum is not necessarily 1. These and other properties of B(k,l) are discussed in Fowlkes and Mallows (1983) and Wallace (1983).

Here the null hypothesis to be tested is that T1 and T2 are unrelated, in the sense that the resemblance between T1 and T2 is no greater than that which would be expected if T1 were a random clustering. To test this null hypothesis, the clusters defined at the successive levels of the T1 hierarchy are compared with the corresponding clusters defined by T2. This is done by evaluating the Fowlkes-Mallows statistic B(k,l) for k = l = N - 1, N - 2, ..., 2. Since B(k,k) is expected to be larger when T1 and T2 are related than when they are unrelated, the null hypothesis should be rejected when the observed value of B(k,k) is sufficiently large, for at least one k.
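The level-by-level comparison can be sketched as follows. The representation of a dendrogram as an ordered list of merges (pairs of object indices whose clusters are linked) and all function names are assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def fm(labels1, labels2):
    """Fowlkes-Mallows B for two label lists (0.0 if a denominator term is 0)."""
    n = len(labels1)
    t = sum(v * v for v in Counter(zip(labels1, labels2)).values()) - n
    p = sum(v * v for v in Counter(labels1).values()) - n
    q = sum(v * v for v in Counter(labels2).values()) - n
    return t / sqrt(p * q) if p and q else 0.0


def cut_levels(n, merges):
    """Map k -> cluster labels after each successive linkage of an n-object tree."""
    labels = list(range(n))
    cuts = {}
    for step, (a, b) in enumerate(merges, start=1):
        keep, drop = labels[a], labels[b]
        if keep == drop:
            raise ValueError("merge joins objects already in one cluster")
        labels = [keep if x == drop else x for x in labels]
        cuts[n - step] = labels
    return cuts


def b_profile(n, merges1, merges2):
    """B(k,k) for k = n-1, ..., 2; both merge lists must contain n-1 merges."""
    c1, c2 = cut_levels(n, merges1), cut_levels(n, merges2)
    return {k: fm(c1[k], c2[k]) for k in range(n - 1, 1, -1)}


if __name__ == "__main__":
    tree_a = [(0, 1), (2, 3), (0, 2)]
    tree_b = [(0, 2), (1, 3), (0, 1)]
    print(b_profile(4, tree_a, tree_a))
    print(b_profile(4, tree_a, tree_b))
```

Comparing a tree with itself gives B(k,k) = 1 at every level, which is a convenient sanity check before real data are used.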

One way of determining whether, for a given k and l, the observed value of B(k,l) is large enough to warrant rejection of the null hypothesis is to calculate the associated P value. The P value is the probability that B(k,l) would be at least as large as the observed value if the null hypothesis were true. A small P value (≤0.05 for a 5% level of significance) should lead to the rejection of the null hypothesis. To calculate the P value the null distribution of B(k,l) is required.
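Given a null distribution, the P value is a simple tail proportion. The sketch below assumes the null distribution is available as a list of equally likely B values; the function name and the small tolerance used to guard against floating-point ties are illustrative.

```python
def p_value(null_values, observed, tol=1e-12):
    """Upper-tail P value: share of equally likely null outcomes >= observed."""
    if not null_values:
        raise ValueError("the null distribution must not be empty")
    hits = sum(1 for b in null_values if b >= observed - tol)
    return hits / len(null_values)


if __name__ == "__main__":
    null = [0.0, 0.0, 0.5, 1.0]   # hypothetical null outcomes
    print(p_value(null, 0.5))     # two of four outcomes are >= 0.5
```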

Null Distribution of the Fowlkes-Mallows Statistic

The null distribution of B(k,l) is computed by assuming that T1 is generated by random linkages. This means that at any given stage of the cluster analysis, every pair of clusters has an equal probability of being linked at the next stage. When k is large and l is arbitrary, B(k,l) takes on only a small number of values and the derivation of the null distribution is a simple exercise in combinatorics. For example, it is easy to see that for every l:

[equation not preserved in this transcript]

and, if the binomial coefficient $\binom{i}{2}$ is defined to be zero when i < 2:

[equation not preserved in this transcript]
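As a hedged illustration of the kind of result meant here (a reconstruction from the definitions, not necessarily the expression printed in the paper), take k = N - 1 and l < N. A single pair of objects has then been linked in T1, and under random linkage each of the $\binom{N}{2}$ pairs is equally likely. B(N-1,l) is nonzero only if the linked pair lies inside one cluster of T2:

```latex
\[
P\!\left\{ B(N-1,l) = \Bigl[\,\sum_{j=1}^{l} \tbinom{m_{\cdot j}}{2}\Bigr]^{-1/2} \right\}
  = \frac{\sum_{j=1}^{l} \binom{m_{\cdot j}}{2}}{\binom{N}{2}},
\qquad
P\{ B(N-1,l) = 0 \}
  = 1 - \frac{\sum_{j=1}^{l} \binom{m_{\cdot j}}{2}}{\binom{N}{2}} .
\]
```

The nonzero value follows from the definition of B(k,l): when the linked pair shares a T2 cluster, the numerator is 2, the T1 term of the denominator is 2, and the T2 term is $2\sum_j \binom{m_{\cdot j}}{2}$.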

FIG. 1. Geographic tree (T2). The linkages are numbered in order of increasing distance. The linkage levels are not drawn to scale.

As k decreases, the derivation of the null distribution becomes increasingly complex, because B(k,l) takes on more values and the number of different cluster configurations becomes very large. Some idea of the magnitude of the problem can be gained by noting that there are $\binom{N}{2}\binom{N-1}{2}\cdots\binom{k+1}{2}$ different linkage sequences that produce k clusters at the (N - k)th linkage level of T1. The sequences are equally likely under the null hypothesis, but it is obvious that many sequences yield the same cluster configuration. Hence, the values of B(k,l) are by no means unique. Because the correspondence between the linkage sequences and the values of B(k,l) is not easily determined, the exact null distribution of B(k,l) cannot be derived analytically. The distribution could be determined exactly (or approximately) by generating every possible linkage sequence (or a large random sample of linkage sequences) on a computer and evaluating B(k,l) for each such sequence. However, this can be quite time consuming if N is large.
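The size of the problem, and the random-sampling alternative, can be sketched in a few lines of Python. The function names are hypothetical, and the sampler simply applies the random-linkage assumption (every pair of current clusters equally likely to be joined next).

```python
import random
from math import comb, prod


def n_linkage_sequences(n, k):
    """Number of linkage sequences that reduce n objects to k clusters."""
    return prod(comb(j, 2) for j in range(k + 1, n + 1))


def random_linkage_labels(n, k, rng):
    """Cluster labels of one random-linkage tree cut at k clusters."""
    clusters = [[i] for i in range(n)]
    while len(clusters) > k:
        i, j = sorted(rng.sample(range(len(clusters)), 2))
        clusters[i].extend(clusters.pop(j))  # j > i, so index i stays valid
    labels = [0] * n
    for c, members in enumerate(clusters):
        for obj in members:
            labels[obj] = c
    return labels


if __name__ == "__main__":
    print(n_linkage_sequences(12, 2))  # count for 12 objects cut at 2 clusters
    print(random_linkage_labels(12, 4, random.Random(0)))
```

Evaluating the statistic on many such random cuts gives a Monte Carlo approximation to the unconditional null distribution.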

One way to make the problem more manageable is to restrict the number of outcomes by imposing some condition on the structure of T1. Fowlkes and Mallows did this by considering only those trees (T1 and T2) with clusters (defined by the specified linkage levels k and l) that are the same size as the clusters of the observed trees, i.e. they considered m_i., i = 1, 2, ..., k, and m_.j, j = 1, 2, ..., l, to be fixed and equal to the observed cluster sizes. We will refer to this condition as "Condition 1." By imposing Condition 1, Fowlkes and Mallows were able to derive analytical expressions for the (conditional) mean and variance of B(k,k). The generalization to k ≠ l is straightforward. In the present application, T2 is assumed to be fixed so there is no loss of generality in constraining the sizes of the T2 clusters to be m_.1, m_.2, ..., m_.l. On the other hand, fixing the sizes of the T1 clusters is more difficult to justify. Wallace (1983) commented on the drawbacks of imposing Condition 1 and noted that, among other things, it can lead to a loss of information. In the remainder of this section, we discuss a practical alternative to Condition 1.
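For orientation, the Condition 1 mean can be written down directly. The shorthand P and Q below is introduced here for convenience; the variance expression is longer and is left to Fowlkes and Mallows (1983).

```latex
\[
E\bigl[B(k,l)\mid \text{Condition 1}\bigr] = \frac{\sqrt{P\,Q}}{N(N-1)},
\qquad
P = \sum_{i=1}^{k} m_{i\cdot}^{2} - N,
\quad
Q = \sum_{j=1}^{l} m_{\cdot j}^{2} - N .
\]
```

A short justification: with the cluster sizes fixed and the objects allocated at random, each of the P/2 within-cluster pairs of T1 falls inside a single T2 cluster with probability $(Q/2)/\binom{N}{2}$, so the expected numerator of B(k,l) is $PQ/[N(N-1)]$; dividing by $\sqrt{PQ}$ gives the mean.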

When B(k,l) is evaluated for k = l = N - 1, N - 2, ..., 2, it is reasonable to compute the null distribution of B(k,l) by restricting attention to those trees that (at the (N - k - 1)st linkage level) define the same k + 1 clusters as the observed T1. This restriction will be referred to as "Condition 2." Given that Condition 2 holds, it is easy to generate the null distribution of B(k,l) for the specified k and l. When k = N - 1, the distribution is the same as the unconditional (i.e. no restrictions on the structure of T1) distribution of B(k,l). Expressions for the unconditional probabilities were given in the first part of this section. Although it has not been possible to generalize the formulas to k < N - 1, the distribution can easily be determined with the aid of a computer.

Subject to Condition 2, there are $\binom{k+1}{2}$ different cluster configurations that are possible at the (N - k)th linkage level, i.e. there are $\binom{k+1}{2}$ different ways of linking two of the k + 1 clusters that are defined at the (N - k - 1)st linkage level to form k clusters at the (N - k)th level. Every configuration has the same probability under the null hypothesis. For a given k and l, let M(k,l) = [m_ij(k,l)] = [m_ij] be the k x l "matching matrix," where m_ij was defined in The Fowlkes-Mallows Statistic. Since the Fowlkes-Mallows statistic B(k,l) depends on T1 and T2 only through M(k,l), the null distribution of B(k,l) can be computed by generating the matching matrices that correspond to the $\binom{k+1}{2}$ different linkages. One can then calculate the associated values of B(k,l) and tabulate their relative frequencies (probabilities).
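The enumeration under Condition 2 can be sketched as below. For brevity this version relabels the objects for each candidate linkage rather than updating a matching matrix; the function names and the label-list inputs are assumptions, not the authors' FORTRAN implementation.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def fm(labels1, labels2):
    """Fowlkes-Mallows B for two label lists (0.0 if a denominator term is 0)."""
    n = len(labels1)
    t = sum(v * v for v in Counter(zip(labels1, labels2)).values()) - n
    p = sum(v * v for v in Counter(labels1).values()) - n
    q = sum(v * v for v in Counter(labels2).values()) - n
    return t / sqrt(p * q) if p and q else 0.0


def condition2_null(labels_k_plus_1, labels2):
    """B(k,l) for every equally likely way of linking two of the k+1 clusters."""
    clusters = sorted(set(labels_k_plus_1))
    values = []
    for a, b in combinations(clusters, 2):
        merged = [a if x == b else x for x in labels_k_plus_1]
        values.append(fm(merged, labels2))
    return values


if __name__ == "__main__":
    t1_prev = [0, 0, 1, 2]   # observed T1 clusters one level earlier (k + 1 = 3)
    t2_cut = [0, 0, 1, 1]    # fixed T2 clusters (l = 2)
    null = condition2_null(t1_prev, t2_cut)
    print(Counter(round(b, 6) for b in null))  # tabulated frequencies
```

Each value in the returned list carries probability 1 / (number of candidate linkages), so relative frequencies in the tabulation are the null probabilities.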

When the computation is repeated for a sequence of successive linkage levels k and l, i.e. k = l = N - 1, N - 2, ..., 2, it is useful to note that the ith row of M(k,l) can be obtained from M(k+1,l) according to the following recursive formula:

$$m_{ij}(k,l) = m_{i_1 j}(k+1,l) + m_{i_2 j}(k+1,l), \qquad j = 1, 2, \ldots, l,$$

if clusters i_1 and i_2 of the (N - k - 1)st linkage level are linked to form the ith cluster of the (N - k)th linkage level. The remaining k - 1 rows of M(k+1,l) are unchanged in M(k,l). A similar relationship holds for the columns of M(k,l) and M(k,l+1). Given the initial matching matrix M(N,N), which has ones on the diagonal and zeroes elsewhere, the matching matrices for successive values of k and l can be calculated efficiently by adding the appropriate rows and columns according to the above recursion.
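The row and column updates can be sketched with plain nested lists. The helper names are hypothetical, and the merged row or column is appended at the end, which is an arbitrary ordering choice that does not affect B(k,l).

```python
from math import sqrt


def identity_matrix(n):
    """Initial matching matrix M(N,N): every object is its own cluster."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def merge_rows(m, i1, i2):
    """M(k,l) from M(k+1,l): add rows i1 and i2, keep the other rows."""
    merged = [a + b for a, b in zip(m[i1], m[i2])]
    rest = [row for idx, row in enumerate(m) if idx not in (i1, i2)]
    return rest + [merged]


def merge_cols(m, j1, j2):
    """M(k,l) from M(k,l+1): the same update applied to columns."""
    transposed = [list(col) for col in zip(*m)]
    return [list(row) for row in zip(*merge_rows(transposed, j1, j2))]


def b_from_matrix(m):
    """Fowlkes-Mallows B computed from a matching matrix."""
    n = sum(map(sum, m))
    t = sum(x * x for row in m for x in row) - n
    p = sum(sum(row) ** 2 for row in m) - n
    q = sum(sum(col) ** 2 for col in zip(*m)) - n
    return t / sqrt(p * q) if p and q else 0.0


if __name__ == "__main__":
    m = identity_matrix(3)
    m = merge_rows(m, 0, 1)   # T1 links objects 0 and 1
    m = merge_cols(m, 0, 1)   # T2 links the same two objects
    print(m, b_from_matrix(m))
```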

In general, the similarity between T1 and T2 that is measured by B(k+1,l) contributes to the similarity that is measured by B(k,l). An advantage of Condition 2 over Condition 1 is that when the clusters defined at the (N - k - 1)st linkage level are considered fixed, B(k,l) and B(k+1,l) are statistically independent (uncorrelated) for all k (since B(k+1,l) is a constant in that case). In other words, Condition 2 accounts for the contribution to B(k,l) that is due to B(k+1,l). This makes it simple to correct for the fact that the probability of at least one false rejection over the sequence of tests corresponding to k = l = N - 1, N - 2, ..., 2 exceeds the significance level of an individual test. When the significance levels are evaluated subject to Condition 2, the solution to this "multiple comparisons problem" is to apply an α_{N-2}(100)% level of significance to each B(k,k), where α_{N-2} satisfies

$$1 - (1 - \alpha_{N-2})^{N-2} = \alpha$$

and α is the desired overall level of significance (see Miller 1981).
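A small sketch of the per-test level, assuming the adjustment is the standard one for independent tests, 1 - (1 - α')^m = α with m = N - 2 tests. The display equation is not legible in this transcript, so this relation is an assumption, as is the function name.

```python
def per_test_level(alpha, n_tests):
    """Per-test significance level giving overall level alpha for independent tests."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if n_tests < 1:
        raise ValueError("at least one test is required")
    return 1.0 - (1.0 - alpha) ** (1.0 / n_tests)


if __name__ == "__main__":
    n_objects = 12                    # e.g. 12 stations, so k = 11, ..., 2
    level = per_test_level(0.05, n_objects - 2)
    print(level)
```

The per-test level is always smaller than the overall level when more than one test is made, so each individual B(k,k) must clear a stricter threshold.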

TABLE 1. Comparison of Fig. 1 of this paper and fig. 2b of Nemec and Brinkhurst (1988). Column 1 is the number of clusters k, column 2 gives the Fowlkes-Mallows statistic B(k,k), column 3 gives the corresponding P value when Condition 2 holds, columns 4 and 5 give the mean (μ2) and standard deviation (σ2) of B(k,k) when Condition 2 holds, and columns 6 and 7 give the mean (μ1) and standard deviation (σ1) of B(k,k) when Condition 1 holds.

[table body not preserved in this transcript]

TABLE 2. Comparison of Fig. 1 of this paper and fig. 2b of Nemec and Brinkhurst (1988). Column 1 is the number of clusters k, column 2 gives the Fowlkes-Mallows statistic B(k,4), column 3 gives the corresponding P value when Condition 2 holds, columns 4 and 5 give the mean (μ2) and standard deviation (σ2) of B(k,4) when Condition 2 holds, and columns 6 and 7 give the mean (μ1) and standard deviation (σ1) of B(k,4) when Condition 1 holds.

(1)   (2)      (3)   (4)   (5)   (6)   (7)
k     B(k,4)   P     μ2    σ2    μ1    σ1

[table body not preserved in this transcript]

In a program to monitor the effects of mine tailings disposal on the benthic communities of the Alice [...] collated from several sites in the Alice Arm and Hastings Arm regions of British Columbia over a period of 5 yr (Kathman et al. 1983, 1984; Brinkhurst et al. 1984). The data for 12 different stations that were sampled in 1983 are considered here. The data comprise two samples collected at each of nine stations in the Alice Arm and three control stations in the Hastings Arm. The stations lie along four transects. The three Alice Arm transects are denoted CC, DD, and EE; the Hastings Arm transect is denoted ZZ. Each transect has three stations which are distinguished by E or W for east or west, N or S for north or south, or M for the middle station. Abundances were recorded for the 67 taxa that were identified in the 24 samples, and a taxa by community (station) abundance matrix was constructed by averaging the abundances over the two samples taken from each station. A cluster analysis, using the Bray-Curtis similarity coefficient (Bray and Curtis 1957) and the "unweighted pair group method using averages" (UPGMA) (Sneath and Sokal 1973), was applied to the abundance matrix as a means of investigating relationships among the 12 benthic communities (for details, see Kathman et al. 1984; Brinkhurst et al. 1987). The results are summarized in fig. 2b of Nemec and Brinkhurst (1988).
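The two data-preparation steps described above (averaging the two samples per station, then computing a Bray-Curtis similarity between stations) can be sketched as follows. The abundance values are invented for illustration, and the similarity is taken here as one minus the usual Bray-Curtis dissimilarity.

```python
def station_means(sample_a, sample_b):
    """Average the abundances of two samples taken at one station."""
    if len(sample_a) != len(sample_b):
        raise ValueError("samples must list the same taxa")
    return [(a + b) / 2 for a, b in zip(sample_a, sample_b)]


def bray_curtis_similarity(x, y):
    """1 - sum|x_i - y_i| / sum(x_i + y_i) for two abundance vectors."""
    if len(x) != len(y):
        raise ValueError("abundance vectors must list the same taxa")
    total = sum(a + b for a, b in zip(x, y))
    if total == 0:
        raise ValueError("similarity is undefined when both stations are empty")
    return 1.0 - sum(abs(a - b) for a, b in zip(x, y)) / total


if __name__ == "__main__":
    # Hypothetical counts for three taxa at two stations, two samples each.
    station_1 = station_means([4, 0, 2], [6, 2, 2])
    station_2 = station_means([5, 1, 0], [5, 1, 4])
    print(bray_curtis_similarity(station_1, station_2))
```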

One question that is of interest is whether the observed similarities in the structure of the communities (as measured by the Bray-Curtis coefficient) can be attributed to the geographic proximity of the stations. To test the null hypothesis that the clustering based on the species abundance data is unrelated to the geographic proximity of the stations, a "geographic reference tree" (T2) was constructed. The geographic tree (Fig. 1) is a UPGMA clustering of the 12 stations, using the geographic distance between the stations as the distance measure. A map showing the location of the stations can be found in Kathman et al. (1984). The four distinct geographic clusters, which correspond to the four transects (CC, DD, EE, and ZZ), are readily identified in Fig. 1, at the eighth linkage level.

The two trees T1 and T2 were compared by computing the Fowlkes-Mallows statistic, B(k,k), for k = l = 11, 10, ..., 2. Subject to Condition 2, the null distribution of B(k,k) was generated for each k and was used to compute the associated P value. The distribution was also used to compute the conditional mean and standard deviation of B(k,k). For comparison, the conditional (i.e. subject to Condition 1) mean and standard deviation were computed according to the formulas given by Fowlkes and Mallows.² To test whether or not the four geographic clusters (the CC, DD, EE, and ZZ transects) are recovered at any level of T1, the calculations were repeated with l = 4 and k = 11, 10, ..., 2. The results of the calculations are summarized in Tables 1 and 2.

Examination of the B values in Table 1 suggests that it is unlikely that a random clustering of the stations would have produced the two geographic clusters, i.e. the EE cluster and the ZZ cluster, that are apparent in T1. It is not surprising to find that the EE stations, which lie on the Alice Arm transect that is farthest from the source of effluent, and the ZZ stations in the Hastings Arm, which are even more remote, follow a geographic clustering pattern. In the absence of other forces, such as pollution, the similarity between benthic communities might well be expected to depend on the proximity of the communities. On the other hand, there is no evidence to suggest that the clustering of the six remaining stations is geographic.

In this application of the Fowlkes-Mallows test, and in other applications, the failure to reject the null hypothesis (at one or more levels of T1) does not "prove" that T1 is a random clustering. It simply means that T1 exhibits no more similarity to T2 than that which would be expected if T1 were generated by a random clustering scheme. The possibility that T1 is related to some other T2 is not ruled out. The importance of this distinction was apparent when the 1982 data were analyzed in the same fashion as the 1983 data. In that year, the effect of the mine tailings was quite pronounced. Three stations (CCN, CCM, and DDM) showed a severe depletion in the number and variety of species. The cluster analysis of the abundance matrix identified these three stations as a distinct cluster, despite their quite wide separation (see Nemec and Brinkhurst 1988). Although the cluster would not have been predicted on the basis of the geography of the stations, it is obvious that the clustering of these three stations cannot be considered random if the path of the plume of the mine tailings is taken into consideration. The three stations fell immediately in the path of the plume (as detected in the water column). It is interesting to note that sufficient recovery had taken place by 1983 that the three stations CCN, CCM, and DDM were no longer distinguishable from the rest of the CC and DD stations. Further discussion of the Alice Arm data and additional examples of the test described here can be found in Nemec and Brinkhurst (1988), Kathman et al. (1984), Brinkhurst (1987), Brinkhurst et al. (1987), Burd and Brinkhurst (1987), and Burd et al. (1987).

²A FORTRAN program for calculating the (Condition 2) P value and the conditional (Condition 1 and 2) mean and standard deviation of B(k,l) is available upon request.

Acknowledgements

We are grateful to Joe Felsenstein, Jim Nemec, Brenda Burd, Ken Denman, and Dave Mackas for helpful comments and suggestions.

References

BRAY, J. R., AND J. T. CURTIS. 1957. An ordination of the upland forest communities of southern Wisconsin. Ecol. Monogr. 27: 325-349.

BRINKHURST, R. O. 1987. Distribution and abundance of macrobenthic infauna from the continental shelf off southwestern Vancouver Island, British Columbia. Can. Tech. Rep. Hydrogr. Ocean Sci. 85: 87 p.

BRINKHURST, R. O., R. D. KATHMAN, AND B. BURD. 1987. Benthic studies in Alice Arm and Hastings Arm, B.C. in relation to mine tailings disposal, 1982-1986. Can. Tech. Rep. Hydrogr. Ocean Sci. 89: 44 p.

BURD, B., AND R. O. BRINKHURST. 1987. Macrobenthic infauna from Hecate Strait, B.C. Can. Tech. Rep. Hydrogr. Ocean Sci. 88: 123 p.

BURD, B., D. MOORE, AND R. O. BRINKHURST. 1987. Distribution and abundance of macrobenthic infauna from Boundary Bay and mud bays near the British Columbia/U.S. border. Can. Tech. Rep. Hydrogr. Ocean Sci. 84: 34 p.

FOWLKES, E. B., AND C. L. MALLOWS. 1983. A method for comparing two hierarchical clusterings. J. Am. Stat. Assoc. 78: 553-569.

KATHMAN, R. D., R. O. BRINKHURST, R. E. WOODS, AND S. F. CROSS. 1984. Benthic studies in Alice Arm, B.C., following cessation of mine tailings disposal. Can. Tech. Rep. Hydrogr. Ocean Sci. 37: 57 p.

KATHMAN, R. D., R. O. BRINKHURST, R. E. WOODS, AND D. C. JEFFRIES. 1983. Benthic studies in the Alice Arm and Hastings Arm, B.C. in relation to mine tailings. Can. Tech. Rep. Hydrogr. Ocean Sci. 22: 30 p.

MILLER, R. G. JR. 1981. Simultaneous statistical inference. 2nd ed. Springer-Verlag, New York, NY. 299 p.

NEMEC, A. F. L., AND R. O. BRINKHURST. 1988. Using the bootstrap to assess statistical significance in the cluster analysis of species abundance data. Can. J. Fish. Aquat. Sci. 45: 965-970.

SNEATH, P. H. A., AND R. R. SOKAL. 1973. Numerical taxonomy: the principles and practice of numerical classification. W. H. Freeman, San Francisco, CA. 573 p.

WALLACE, D. L. 1983. Comment: a method for comparing two hierarchical clusterings by E. B. Fowlkes and C. L. Mallows. J. Am. Stat. Assoc. 78: 569-576.

Can. J. Fish. Aquat. Sci., Vol. 45, 1988. Downloaded from www.nrcresearchpress.com by Ohio State University on 11/11/14. For personal use only.