RESEARCH ARTICLE

Neutral assembly of bacterial communities

Stephen Woodcock1, Christopher J. van der Gast2, Thomas Bell3, Mary Lunn4, Thomas P. Curtis5, Ian M. Head5 & William T. Sloan1

1Department of Civil Engineering, University of Glasgow, Glasgow, UK; 2Centre for Ecology and Hydrology, Oxford, UK; 3Department of Zoology,

University of Oxford, Oxford, UK; 4Department of Statistics, University of Oxford, Oxford, UK; and 5School of Civil Engineering and Geosciences,

University of Newcastle upon Tyne, Newcastle, UK

Correspondence: William T. Sloan,

Department of Civil Engineering, University of

Glasgow, Glasgow, G12 8LT, UK. Tel.: +44

141 330 4076; fax: +44 141 330 4557;

e-mail: [email protected]

Received 18 January 2007; revised 29 June

2007; accepted 3 July 2007.

First published online October 2007.

DOI:10.1111/j.1574-6941.2007.00379.x

Editor: Jim Prosser

Keywords

community assembly; dispersal; insular

communities; mathematical model; neutral

model.

Abstract

Two recent, independent advances in ecology have generated interest and

controversy: the development of neutral community models (NCMs) and the

extension of biogeographical relationships into the microbial world. Here these two advances are linked by using an NCM to predict an observed microbial taxa–volume relationship, which provides the strongest evidence so far for neutral community assembly in any group of organisms, macro or micro. Previously,

NCMs have only ever been fitted using species-abundance distributions of

macroorganisms at a single site or at one scale and parameter values have been

calibrated on a case-by-case basis. Because NCMs predict a malleable two-

parameter taxa-abundance distribution, this is a weak test of neutral community

assembly and, hence, of the predictive power of NCMs. Here the two parameters of

an NCM are calibrated using the taxa-abundance distribution observed in a small

waterborne bacterial community housed in a bark-lined tree-hole in a beech tree.

Using these parameters, unchanged, the taxa-abundance distributions and

taxa–volume relationship observed in 26 other beech tree communities whose

sizes span three orders of magnitude could be predicted. In doing so, a simple

quantitative ecological mechanism to explain observations in microbial ecology is

simultaneously offered and the predictive power of NCMs is demonstrated.

Introduction

Scale is a problem in microbial ecology. Even using the

most up-to-date molecular methods, one is limited to

observing and characterizing very small samples from what

are ostensibly very large naturally occurring microbial com-

munities. The considerable technical sophistication and skill

required to collect, analyse and enumerate microbial popu-

lations correctly in environmental samples can sometimes

obscure just how small, in relative terms, samples are. Take,

for example, a large clone library of say 500 clones derived

from a 1 g soil sample; the soil sample itself may contain as many as $10^9$ individual organisms, and applied microbial ecologists will generally be interested in the services provided by the communities at a scale somewhat larger than a single 1 g sample. By analogy, when there are currently $6 \times 10^9$ humans in the world, a single sample of a few

hundred individuals is unlikely to be sufficient to character-

ize the global distribution of any human traits unless it is

extremely homogeneous. Molecular methods are advancing

so quickly that in the near future it may be possible to get

close to a complete census in a sample (Sogin et al., 2006)

but, even then, a 1 g sample is small if one’s aspirations are to

characterize an entire field of soil. This disparity between

sample size and community size is enormous and far greater

than any comparable sampling issues in mainstream ecol-

ogy. Consequently, patterns are perceived through a sparse,

often distorted (Sloan et al., 2007) map of the microbial

world.

Thus, the modus operandi in microbial ecology is extrapolation from very small samples, and the fact that microbial systems are always observed at a scale much smaller than the one at which they are ultimately to be characterized amplifies some of the challenges faced in classical ecology. In scaling from a leaf to

the ecosystem to the landscape and beyond (Jarvis &

McNaughton, 1986), there must be an understanding of

how information is transferred from fine scales to broad

scales and vice versa (Levin, 1992). This problem of

FEMS Microbiol Ecol 62 (2007) 171–180 © 2007 Federation of European Microbiological Societies. Published by Blackwell Publishing Ltd. All rights reserved

cascading information and ecological process understanding

through a hierarchy of different scales is being tackled using

mathematical models by ‘landscape ecologists’ (Wu &

Hobbs, 2002). The models integrate mathematical descrip-

tions of plausible plot-scale ecological processes to form

patterns at the landscape scale. In going from rRNA genes in

a sample to the sample itself and beyond, microbial ecolo-

gists face similar technical and conceptual challenges (Sloan

et al., 2007) but with the added challenge of having an

uncertain picture of the broad-scale patterns (Woodcock

et al., 2006). This has consequences for the complexity of the models that one can aspire to use. Simon Levin, in his MacArthur Award Lecture (Levin, 1992) on ‘The problem of

pattern and scale in ecology’ described the essence of model-

ling thus ‘to facilitate the acquisition of this understanding

(scaling), by abstracting and incorporating just enough detail

to produce observed patterns. A good model does not attempt

to reproduce every detail of the biological system; the system

itself suffices for that. Rather, the objective of a model should

be to ask how much detail can be ignored without producing

results that contradict specific sets of observations . . .’ This

judicious paradigm has been embraced by theoretical ecol-

ogists and a wide variety of conceptual models of potential

ecological pattern-forming mechanisms have been encoded

into mathematical models and then shown to produce

observed patterns in the spatial distribution and relative

abundance of taxa. All are simplifications of the system

being modelled. The majority serve to, in some way, whittle

down the set of plausible mechanisms that can lead to a

particular pattern; theoretical ecologists are all too aware of

the ‘same behaviour implies same mechanism’ fallacy and,

since their representations of the ecological processes are

rarely calibrated against observations, few say emphatically

that their models are correct.

While validating the plausibility of a model through

mathematics is vital, the paucity of attempts to go beyond

this and validate the models themselves is frustrating

(Belovsky et al., 2004); indeed Schoener (1972) cautioned

against the ‘constipating accumulation of untested models’

more than 30 years ago. In microbial ecology, one is less

constrained by the broad-scale patterns because one can find

them so difficult to identify and thus theoretical microbial

ecology, if pursued in the same fashion, unconstrained by

biological reality, could significantly add to this uncomfor-

table blockage of untested and perhaps untestable models.

Thus one further proviso is required; for a model to

ultimately be of some practical use it should be predictive.

This means that a model calibrated at one site, one scale or

on the basis of one set of ecological processes should be

capable of predicting phenomena at different sites, at

different scales or that pertain to seemingly unrelated

mechanisms. Harte (2004) implicitly combines the para-

digm cited by Levin (1992), which calls for parsimony, with

the requirement for prediction by suggesting that theories

are of most interest when the ratio of the number of

predictions that they make to the number of assumptions

and adjustable parameters is large. For this reason, to investigate the roles of chance and dispersal limitation in shaping patterns in microbial community composition, the simplest possible conceptual model of community assembly that incorporated these factors was selected (Curtis et al., 2006; Sloan et al., 2006, 2007): a

simple neutral community assembly model (Hubbell, 2001)

where the composition at a local scale is shaped only by

random immigration, birth and death events.
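The three kinds of event named above can be illustrated with a minimal individual-based sketch. It assumes the replacement rule given later in Materials and methods (a dead individual is replaced by an immigrant with probability m, otherwise by the offspring of a local individual); the function name `neutral_step`, the four source taxa, their frequencies, the community size of 200 and the value m = 0.05 are all hypothetical and chosen only to keep the example small.

```python
import random

def neutral_step(local, meta_taxa, meta_freqs, m, rng):
    """Apply one death-and-replacement event to the local community, in place."""
    dead = rng.randrange(len(local))            # every individual is equally likely to die
    if rng.random() < m:
        # Immigration: the vacancy is filled from the source community.
        local[dead] = rng.choices(meta_taxa, weights=meta_freqs, k=1)[0]
    else:
        # Local birth: the parent is any one of the surviving individuals, so a
        # taxon's chance of reproducing is proportional to its local abundance.
        parent = rng.randrange(len(local) - 1)
        if parent >= dead:
            parent += 1
        local[dead] = local[parent]

rng = random.Random(1)
meta_taxa = ["A", "B", "C", "D"]                # hypothetical source taxa
meta_freqs = [0.4, 0.3, 0.2, 0.1]               # hypothetical source frequencies
local = ["A"] * 200                             # saturated local community (hypothetical size)
for _ in range(20_000):
    neutral_step(local, meta_taxa, meta_freqs, m=0.05, rng=rng)
print({taxon: local.count(taxon) for taxon in meta_taxa})
```

No taxon is given any competitive advantage in this sketch; differences in abundance arise only from chance and from the composition of the source community.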

Neutral community assembly models (NCMs) (Bell,

2000; Hubbell, 2001) have been shown to reproduce the

distribution of taxa abundances in a wide range of different

biological communities. However, in most previous applica-

tions of neutral theory the model parameters are selected to

minimize the difference between observed and predicted

taxa-abundance distributions. The merit of neutral theory,

over and above other hypotheses on the formation of

biological communities is then argued on the basis of (often

small) differences in a goodness of fit statistic for calibrated

taxa-abundance distributions (McGill, 2003; Volkov et al.,

2003, 2006; Chave et al., 2006). These arguments can seem

rather arcane when there has been little attempt to validate

the models (Harte, 2003). In addition, microbial ecologists

were until recently precluded from the debate because, for

most environments, only a small fraction of the diversity can

be experimentally defined (Curtis et al., 2002). Despite the

advances in molecular methods for characterizing naturally

occurring microbial communities in situ, the disparity in

scale between sample and community size and some in-

herent limitations of the methods conspire to make a purely

empirical definition of a taxa-abundance distribution at a

single site very difficult. Sloan et al. (2006) circumvented

this problem for microbial communities by deriving a

method for calibrating Hubbell’s neutral theory based on a

theoretical relationship between the mean relative abun-

dance of common taxa and the frequency with which they

are expected to appear in multiple similarly sized samples.

Thus, in Sloan et al. (2006, 2007), the criterion of a

parsimonious calibrated model capable of reproducing

patterns is met. However, neutral theory is controversial

and its parsimony still grates with many who seek to provide

evidence that it cannot explain all the variance in real

communities of macroorganisms (e.g. McGill et al., 2006).

Calibrating an NCM at one site or one scale is not a

convincing endorsement of the model’s underlying assump-

tions and many alternative models could potentially repro-

duce either the taxa-abundance distributions (McGill, 2003)

or the abundance-frequency relationships (He & Gaston,

2003) observed. A strong test of neutral theory has to

demonstrate its predictive power and this has never


previously been achieved (Condit et al., 2002; Gotelli &

McGill, 2006; McGill et al., 2006). The authors provide the

first demonstration of an NCM calibrated using microbial

taxa abundances at one site and one scale accurately

predicting very different taxa-abundance distributions and

the observed taxa–volume relationship across a range of

scales and sites.

Hubbell’s neutral model makes predictions about how

the richness and abundance distribution of taxa on island-

like communities will be affected by community size and

immigration. Indeed, the genesis of his NCM came about

through an attempt to present a unified theory that com-

bined the Theory of Island Biogeography (MacArthur &

Wilson, 1967), which makes predictions on species richness

within insular communities, with predictions on relative

abundance of taxa. Ascertaining whether these predictions

are borne out in reality requires taxa abundance data from a

set of insular communities for which community size or

immigration varies significantly but which are very similar

in all other respects. If the NCM presents an adequate

representation of the ecological processes that shape the

community structure then it should be possible to explain a

significant proportion of the variance in the taxa-abundance

distributions for all the communities by employing a single

set of parameters calibrated at one site. However, datasets

for insular communities of significantly different sizes or

with different degrees of isolation that are housed in very

similar ecosystems are rare. Bell et al. (2005) published just

such a dataset for water-borne bacteria living in tree holes in

beech trees in the same woodland. Samples were taken from

29 rainwater-filled, bark-lined holes, each of which housed

a small ecosystem. The range of volumes of these habitats

spanned three orders of magnitude; the smallest was a mere

50 mL, the largest 18 000 mL. Bell et al. (2005) reported that

bacterial species richness increased with tree hole volume in

a manner that could be modelled using a single power law

relationship, which hints at some consistent process of

community assembly. All the fluid was removed from the

tree holes and was homogenized by stirring. Bacterial

richness was determined from denaturing gradient gel

electrophoresis (DGGE) analysis (Bell et al., 2005). Epifluor-

escence microscopy was performed (Porter & Feig, 1980)

and the density of organisms in the tree holes was revealed

to be around $10^5$ mL$^{-1}$. The sample size analyzed for all the tree holes was 5 mL (c. $5 \times 10^5$ individuals). Physically and

chemically, the bacterial communities were similar; they

were all supported by similar nutrients (decaying leaf litter),

relatively stagnant, but subject to invasion events from either

airborne or rainwater-borne microorganisms. The greatest

geographic distance between any two trees in the study was

around two miles. The distributions of the relative abundance of taxa in the samples were not reported in Bell et al. (2005) but are used here (e.g. Figs 1 and 2). These were

determined by the relative intensity of bands on the DGGE

gels. Because of detection limitations inherent to the DGGE

analysis (e.g. Cocolin et al., 2000; Leclerc et al., 2004;

Woodcock et al., 2006) used in the initial study, only the

top few ranked taxa were observed at each site, hence the

abundances in the dataset were normalized relative only to

the total abundances of these most common taxa. Quanti-

fication of the absolute abundances of taxa in a sample using

DGGE is open to criticism (Heuer et al., 2001); however,

exactly the same technique was used for each tree hole and,

therefore, exactly the same biases applied to each sample.

Thus, for the comparative analysis of relative abundances

presented here, it is fair to say that significant shifts in

DGGE patterns reflect real shifts in the bacterial community

composition. What is immediately striking from these data

[Figure 1: plot of relative abundance (vertical axis) against bacterial taxa rank (horizontal axis).]

Fig. 1. The ranked taxa-abundance distribution (squares) observed in a 5 mL sample from the smallest tree hole, which had a volume of 50 mL. The line shows the expected rank-abundance distribution obtained by averaging 1000 realizations of ranked abundances simulated by Hubbell’s neutral model with $N_T = 5 \times 10^5$, $\theta = 15$ and $m = 1.0 \times 10^{-6}$.

[Figure 2: plot of relative abundance (vertical axis) against bacterial taxa rank (horizontal axis), with one curve for each tree hole volume: 50, 111, 640, 1 460, 3 000, 4 450 and 11 000 mL.]

Fig. 2. Ranked taxa-abundance distributions for a selection of seven of the 29 tree holes, ranging in volume from 50 to 11 000 mL. The lines represent taxa-abundance distributions predicted by the neutral model with $m = 10^{-6}$ and $\theta = 15$ calibrated using data from the 50 mL tree hole.



is just how dramatically and systematically the shape of the

taxa-abundance distributions changes between tree holes,

with large communities exhibiting a much more even

distribution than small ones. This is all the more remarkable

because the sample size at each site was exactly the same.

Given the proximity of tree holes and the similarity of

their environments, what causes the differences in the taxa-abundance distributions? If one were to assume, initially,

that the tree hole environments were identical in everything

except for their volume and that the same forces act to shape

the community composition, then can volume alone explain

the differences? It is shown that, accounting for tree hole

volume, the distributions are significantly different and this

hypothesis is rejected. Thus the distribution of taxa abun-

dances in tree hole samples does not derive from the same

underlying distribution. Therefore, there is no benefit to be

gained in testing, what many commentators believe should

be the null hypothesis in any study of taxa abundances

(McGill, 2003; Gotelli & McGill, 2006), that a particular

arbitrary parametric distribution, such as the lognormal, fits

the data. Then the hypothesis that the tree holes house

distinct, homogeneous, island-like communities that are

neutrally assembled from a single metacommunity with a

consistent rate of random immigrations into each tree hole

is tested and could not be rejected. This is achieved by

calibrating an NCM using the taxa-abundance distribution

from the smallest tree hole, predicting the taxa-abundance

distributions from all other tree holes and then testing at the

5% significance level whether the simulated and observed

taxa-abundance distributions are the same.

Materials and methods

Do the samples derive from the same distribution?

Firstly, the hypothesis that the same structuring forces

shaped the bacterial tree hole communities and that, conse-

quently, the sample taxa-abundance distributions derive

from the same underlying distribution was tested. Synthetic

populations were generated for each environmental sample

by selecting $5 \times 10^5$ individuals with replacement from the

observed taxa-abundance distributions. Species were in-

dexed by their ranked abundance, with the most abundant

being ranked 1, the second most abundant 2, etc. The

abundances reported for the observed data were relative to

the total abundance of the taxa that appeared on the DGGE

gels. Therefore the abundances of synthetic populations

were also normalized by the same number of top ranked

taxa. A Kolmogorov–Smirnov test was then applied to every

combination of two synthetic samples to determine how

likely they were to have come from the same underlying

distribution.
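The resampling and pairwise comparison described above might be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the two rank-abundance vectors are hypothetical, the helper names are invented for the example, and the P-value uses the standard large-sample Kolmogorov series, which was derived for continuous data and is therefore only approximate (and conservative) for rank-indexed counts.

```python
import math
import random
from collections import Counter

N_S = 500_000  # individuals per synthetic sample, as in the text

def resample_ranks(rel_abundance, n, rng):
    """Draw n individuals with replacement; return counts keyed by rank (1 = most abundant)."""
    ranks = range(1, len(rel_abundance) + 1)
    return Counter(rng.choices(ranks, weights=rel_abundance, k=n))

def ks_two_sample(counts_a, counts_b):
    """Two-sample KS statistic D and an asymptotic P-value for rank-indexed counts."""
    n_a, n_b = sum(counts_a.values()), sum(counts_b.values())
    cdf_a = cdf_b = d = 0.0
    for rank in sorted(set(counts_a) | set(counts_b)):
        cdf_a += counts_a[rank] / n_a
        cdf_b += counts_b[rank] / n_b
        d = max(d, abs(cdf_a - cdf_b))          # largest gap between the two empirical CDFs
    n_eff = n_a * n_b / (n_a + n_b)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.2:                               # series converges slowly; P is effectively 1
        return d, 1.0
    p = 2.0 * sum((-1) ** (j - 1) * math.exp(-2.0 * (j * lam) ** 2) for j in range(1, 101))
    return d, min(1.0, max(0.0, p))

rng = random.Random(0)
small_hole = [0.50, 0.20, 0.12, 0.08, 0.05, 0.03, 0.02]   # hypothetical, uneven
large_hole = [1 / 7] * 7                                   # hypothetical, even
a = resample_ranks(small_hole, N_S, rng)
b = resample_ranks(large_hole, N_S, rng)
d, p = ks_two_sample(a, b)
print(f"D = {d:.3f}, P = {p:.3g}")
```

In the study this comparison would be repeated for every pair of tree holes; with samples this large, even modest differences in shape give very small P-values.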

Is the community neutrally assembled?

Secondly, the hypothesis that the samples are from neutrally

assembled communities fed by immigrants from a single

source community with a constant immigration rate was

tested. To do this, Hubbell’s NCM was calibrated using the

taxa-abundance distribution in the sample drawn from the

smallest tree hole. In Hubbell’s model it is assumed that the

distribution of taxa abundances in the source metacommu-

nity is described by a log-series distribution with a single

parameter $\theta$, which Hubbell calls the fundamental biodiversity number because it indexes the overall biodiversity. In local communities, which are assumed to be saturated with individuals, when an individual organism dies it is replaced either, with probability $m$, by an immigrant drawn at random from the source community or, with probability $1 - m$, by reproduction from within the local community. Given local reproduction, the probability that any particular taxon reproduces depends on its relative abundance, which requires knowledge of the number of individuals in the local community, $N_T$. Thus the shape of the taxa-abundance distribution for a neutrally assembled community depends on the values of the three parameters: $N_T$, $\theta$ and $m$. $N_T$ was estimated using the tree hole volume and the density of organisms [$O(10^5)$ mL$^{-1}$]. $\theta$ and $m$ were

considered free parameters that were adjusted to give the

best least squares fit between the observed and simulated

expected taxa-abundance distribution for the smallest tree

hole. Least squares fitting was adopted because it is liable to

be biased towards fitting the model to the higher observed

relative taxa abundances; the authors have more confidence

in these than the lower abundances estimated from DGGE

band intensities. To simulate the sample distribution, a

realization of the relative abundance of taxa in the metacommunity, $\{p_i\}_{i=1}^{S_M}$, where $S_M$ is the number of different taxa in the metacommunity, was first generated. To do this, it is noted (after Volkov et al., 2003) that, provided $\theta/S_M$ is small, the log-series distribution can be approximated by a gamma distribution. It is shown in the appendix that this leads to a simple method for generating realizations of relative abundance of taxa in the metacommunity $\{p_i\}_{i=1}^{S_M}$ by sampling at random from gamma distributions. For any given realization of $\{p_i\}_{i=1}^{S_M}$, Sloan et al. (2007) show that the distribution of taxa abundances in the local neutrally assembled community $\{y_i\}_{i=1}^{S_M}$ is Dirichlet, $\mathrm{Dir}(N_T m p_1, \ldots, N_T m p_{S_M})$, and give a simple algorithm for generating a realization of $\{y_i\}_{i=1}^{S_M}$. Sloan et al. (2007) also show that the distribution of taxa abundances in a sample of size $N_S$ (i.e. what is observed on the DGGE gel) from that distribution is Dirichlet, $\mathrm{Dir}(N_S y_1, \ldots, N_S y_n)$. Therefore, given any pair of parameters $m$ and $\theta$, it was straightforward to simulate 1000 realizations of the taxa-abundance distribution in a sample of $N_S$ individuals from a tree hole comprising $N_T$



individuals. These were then averaged to give the expected

taxa-abundance distribution. For the purposes of a com-

parative analysis, since the abundances reported for the

observed data were relative to the total abundance of the

taxa that appeared on the DGGE gels, the abundances of

synthetic populations were also normalized by the same

number of top-ranked taxa.
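The simulation chain described above (metacommunity, then local community, then sample, then ranking and averaging) can be sketched as below. Several parts are assumptions made only for illustration. The appendix method is not reproduced here, so the metacommunity is drawn as normalised Gamma($\theta/S_M$, 1) variates, which is one common way of realizing the gamma approximation. The metacommunity richness `s_m = 300`, the seven retained ranks and the 200 realizations are hypothetical. Dirichlet draws are made by normalising independent gamma variates, which is a standard construction but not necessarily the algorithm given by Sloan et al. (2007).

```python
import random

def dirichlet(alphas, rng):
    """One Dirichlet draw via normalised gamma variates; zero shapes give zero shares."""
    if not any(a > 0.0 for a in alphas):
        raise ValueError("at least one shape parameter must be positive")
    while True:
        draws = [rng.gammavariate(a, 1.0) if a > 0.0 else 0.0 for a in alphas]
        total = sum(draws)
        if total > 0.0:           # retry in the unlikely case that every draw underflows
            return [g / total for g in draws]

def expected_ranked_abundance(theta, m, n_t, n_s, s_m, top_k, n_real, rng):
    """Average the top_k ranked sample abundances over n_real simulated realizations."""
    sums = [0.0] * top_k
    for _ in range(n_real):
        p = dirichlet([theta / s_m] * s_m, rng)              # metacommunity (ASSUMED form)
        y = dirichlet([n_t * m * p_i for p_i in p], rng)     # local community, Dir(N_T m p)
        x = dirichlet([n_s * y_i for y_i in y], rng)         # sample, Dir(N_S y)
        top = sorted(x, reverse=True)[:top_k]                # rank and keep the top taxa
        norm = sum(top)
        for i, value in enumerate(top):
            sums[i] += value / norm                          # normalise by the top-ranked total
    return [s / n_real for s in sums]

# 50 mL tree hole at 10^5 cells per mL gives N_T = 5e6; the sample holds 5e5 individuals.
curve = expected_ranked_abundance(theta=15, m=1.0e-6, n_t=5_000_000, n_s=500_000,
                                  s_m=300, top_k=7, n_real=200, rng=random.Random(0))
print([round(v, 3) for v in curve])
```

Calibration would wrap this function in a search over $\theta$ and $m$ that minimises the squared difference between `curve` and the observed ranked abundances.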

Results

Two hypotheses were tested. Firstly, the hypothesis that the taxa-abundance distributions observed in all the tree holes derive from randomly sampling the same distribution was tested. The P-values were so low that this hypothesis can be confidently rejected at the 0.05% level (i.e. all P-values < 0.0005). Secondly,

the hypothesis that the communities are neutrally assembled

from the same source community and that the taxa-abundance distributions could be reproduced by an NCM was tested. A stringent test of this was adopted in that, rather than seeking parameter values on the basis of all the data from all the tree holes, the two free parameters in Hubbell’s NCM, the immigration probability, $m$, and the index of biodiversity in the source community, $\theta$, were calibrated using data from only one tree hole: the smallest. The least-squares best fit to the relative abundance of the observable taxa in a sample (5 mL) from the smallest (50 mL) tree hole was obtained with $\theta = 15$ and $m = 1.0 \times 10^{-6}$ (Fig. 1). These parameter values were then used to predict the expected abundance distributions in all other tree holes; the only parameter that changed between tree holes was $N_T$, the total number of individuals.
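Because community size is the only quantity that varies between predictions, it may help to see how it scales with volume. The short sketch below assumes the order-of-magnitude density of $10^5$ cells per mL quoted earlier and uses the tree hole volumes shown in Fig. 2 plus the largest hole; the function and constant names are illustrative.

```python
DENSITY_PER_ML = 10**5   # order-of-magnitude cell density reported in the text
M = 1.0e-6               # immigration probability calibrated on the 50 mL tree hole

def community_size(volume_ml, density_per_ml=DENSITY_PER_ML):
    """Estimate N_T as measured density times tree hole volume."""
    return volume_ml * density_per_ml

for volume_ml in (50, 111, 640, 1460, 3000, 4450, 11000, 18000):
    n_t = community_size(volume_ml)
    print(f"{volume_ml:>6} mL   N_T = {n_t:.2e}   N_T * m = {n_t * M:.1f}")
```

The product $N_T m$ is the sum of the Dirichlet parameters quoted from Sloan et al. (2007) in Materials and methods. It rises from about 5 in the 50 mL tree hole to about 1800 in the 18 000 mL tree hole, so larger communities are expected to track the source community composition more closely.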

Figure 2 gives seven examples of the remarkably good

match between the observed and predicted taxa abundances that was obtained in the majority of tree holes. A quantitative measure of goodness of fit was obtained by simulating an additional 500 realizations of the tree hole taxa-abundance distributions. Pearson’s statistic for goodness of fit was calculated for these and for the observed distribution:

$$\sum_{i=1}^{n} \frac{(E(i) - x(i))^2}{E(i)}$$

where E(i) is the expected abundance of the ith ranked

taxon and x(i) is its abundance in the simulation or observed

dataset. A P-value was then estimated from the proportion

of these 500 trials that produced a goodness of fit statistic

greater than that calculated for the observed data (Table 1).

Hypothesis testing at the 5% significance level suggested

(Table 1) that for 27 of the 29 tree hole communities, there

was no evidence to reject the neutral model. There was no

reason to assume any anomalies in either environmental

conditions or sampling procedure for the two tree holes

where the neutral model was rejected.
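The Monte Carlo P-value described above can be sketched in a few lines. The statistic and the counting rule follow the text; everything else is a stand-in. In particular, `noisy_copy` is not the neutral model: it simply jitters a hypothetical expected curve so that the example is self-contained, and in practice the 500 simulated distributions would come from NCM realizations.

```python
import random

def pearson_statistic(expected, values):
    """Sum over ranks of (E(i) - x(i))^2 / E(i)."""
    return sum((e - x) ** 2 / e for e, x in zip(expected, values))

def monte_carlo_p_value(expected, observed, simulated):
    """Proportion of simulated statistics that exceed the observed statistic."""
    observed_stat = pearson_statistic(expected, observed)
    exceed = sum(pearson_statistic(expected, s) > observed_stat for s in simulated)
    return exceed / len(simulated)

def noisy_copy(expected, sd, rng):
    """STAND-IN for an NCM realization: jitter the expected curve and renormalise."""
    raw = [max(1e-9, e + rng.gauss(0.0, sd)) for e in expected]
    total = sum(raw)
    return [r / total for r in raw]

rng = random.Random(0)
expected = [0.45, 0.25, 0.12, 0.08, 0.05, 0.03, 0.02]       # hypothetical expected curve
simulated = [noisy_copy(expected, 0.01, rng) for _ in range(500)]
observed = noisy_copy(expected, 0.01, rng)                    # hypothetical observation
p_value = monte_carlo_p_value(expected, observed, simulated)
print(f"P = {p_value:.3f}; neutral model rejected at 5%: {p_value < 0.05}")
```

With 500 simulations the estimated P-value can only take multiples of 0.002, which matches the resolution of the values reported in Table 1.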

The success of the neutral model in predicting taxa-

abundance distributions over a range of different scales

(Fig. 2, Table 1) demonstrates its potential as a predictive

tool. Much of the excitement about Hubbell’s neutral model

stems from its ability to link the prediction on different

ecological phenomena. Indeed, Hubbell refers to his theory

as the ‘unified theory of biodiversity and biogeography’

because of its potential to link predictions on the shape of

taxa-abundance distributions to a relationship between taxa

richness and area sampled (taxa–area relationship). This

link has never previously been explicitly demonstrated. In

Fig. 3 it is shown that the predicted richness of taxa in each

of the tree hole samples closely matches the observed

richness. These predictions have again been produced using

the parameter values calibrated using data from the smallest

tree hole ($\theta = 15$, $m = 10^{-6}$); a detection threshold of 0.005 on

the relative abundance of taxa was assumed; this was the

minimum relative abundance to appear on any of the DGGE

gels. Bell et al. (2005) fitted the phenomenological model of

a power-law relationship to their observed taxa–area rela-

tionship, which is reproduced in Fig. 3a. When a similar

Table 1. Estimated P-values for the goodness of fit using an NCM calibrated against the smallest site, tree hole 21

Tree-hole number    Volume (mL)    P-value
 1                      360        0.076
 2                     3250        0.324
 3                     1700        0.500
 4                      750        0.280
 5                   18 000        0.020*
 6                      180        0.620
 7                      640        0.454
 8                     4450        0.612
 9                     3600        0.152
10                     3150        0.146
11                     2250        0.326
12                     1800        0.650
13                     1250        0.154
14                       60        0.480
15                     1950        0.122
16                     2850        0.214
17                     2225        0.956
18                      900        0.228
19                   11 000        0.158
20                     1460        0.588
21                       50        0.956
22                     3000        0.674
23                      140        0.342
24                      220        0.094
25                      111        0.808
26                      350        0.466
27                     1200        0.068
28                     3000        0.736
29                      600        0.040*

*P < 0.05: the neutral model is rejected at the 5% significance level.

The parameter pair used was ($\theta$, $m$) = (15, $10^{-6}$).



model is fitted to the predicted taxa richnesses (Fig. 3b), an

almost identical relationship is obtained. The greatest devia-

tion between observed and predicted is, perhaps unsurpris-

ingly, in the largest tree hole where the neutral model was

rejected (Fig. 3c). For the most part, however, the link

between the taxa–area relationship and taxa-abundance

distributions suggested by Hubbell is borne out in the tree

hole data set. The smallest tree hole was deliberately selected

to calibrate the model parameters because it offers the

greatest information on the rate of immigration, m. Accord-

ing to Hubbell’s model, as community sizes increase, the

systems increasingly resemble the source community and

the effects of immigration become obscured. Thus, the

smallest site offers the greatest opportunities to quantify

immigration into the systems. The value of $\theta = 15$ calibrated on the smallest tree hole is consistent with the values obtained by calibrating on any other tree hole. Independently calibrating the model using all of the other tree holes suggests that $\theta$ lies in the range $15 \le \theta \le 25$, and it transpires that the predictions on both

the taxa-abundance distributions and taxa–volume relation-

ship across all the tree holes were insensitive to changes in

this range. This insensitivity is not a property of the neutral

model itself. Rather it is an artefact of the experimental

methods available to microbial ecologists. As discussed in

the introduction, microbial ecologists are limited to viewing

a small percentage of the overall diversity in an environ-

mental sample using rapid community profiling techniques

such as DGGE. This is a generic problem in microbial

ecology that is discussed in more detail elsewhere (Curtis

et al., 2006; Woodcock et al., 2006; Sloan et al., 2007).

However, in the context of this application the reason for the

insensitivity to $\theta$ is demonstrated in Fig. 4. This shows the taxa-abundance distributions that one would expect from random samples from two log-series distributed source communities: one with $\theta = 15$ and the other with $\theta = 25$. If there were no dispersal limitation, then these are the distributions one might expect in all the tree holes. In the entire sample of $5 \times 10^5$ individuals, there are significant differences in the overall richness of taxa and in the distribution of taxa abundances as a function of $\theta$. However, using DGGE it is impossible to detect all the taxa; only those with abundance greater than some threshold can be detected. A detection limit of 0.5% relative abundance is displayed in Fig. 4a, and in Fig. 4b the taxa-abundance distributions for taxa whose abundances are greater than this limit are displayed. The distributions are quite similar and thus the abundance distribution of detectable diversity is quite insensitive to $\theta$. Sloan et al.

(2007) point out that it is difficult to determine the under-

lying taxa-abundance distribution from such small samples

and, therefore, it may be that the source community

abundances are not in fact log-series distributed; other

source distributions might produce similar results. Thus

the success of the neutral model in predicting the taxa-

abundance distributions in the tree holes should not be seen

as a validation of Hubbell’s model in its entirety. Sufficient

information is not available about rare taxa to conclude that

the log-series is the source community’s taxa-abundance

distribution, let alone verify Hubbell’s conceptual model for

the maintenance of biodiversity in the source community.

However, given that there is some underlying source com-

munity distribution, the first test showed that the tree hole

communities are not merely random samples from that

source community; the abundance distributions of detect-

able diversity all differ significantly from one another. Some

ecological process must be causing these differences. This

could be a function of the environment, but part of the

[Figure 3: three panels. Panels (a) and (b) plot Log10 (number of taxa, S) against Log10 (tree hole volume, V), with fitted lines labelled S = 2.11 V^0.26 and S = 2.19 V^0.25, respectively. Panel (c) plots the predicted number of taxa against the observed number of taxa.]

Fig. 3. (a) The observed bacterial richness in all 29 tree holes. The solid line represents the power-law relationship, $S = 2.11 V^{0.26}$, fitted using linear regression. (b) The bacterial richness predicted by the neutral model with $\theta = 15$ and $m = 10^{-6}$ calibrated using the taxa-abundance distribution of the smallest tree hole. The solid line represents the power-law relationship, $S = 2.19 V^{0.25}$, again fitted using linear regression. (c) Observed vs. predicted richness in each tree hole. The line represents perfect agreement and the two squares indicate where the neutral model was rejected.



attraction in examining the tree-hole communities is that

their environments are all so similar. Besides, the greatest

perceptible difference between the tree holes is their volume

and hence the size of the communities they house. In the

predictions presented in this paper, the source community

abundance distribution and the immigration rate are held

constant and the only parameter that changes from tree hole

to tree hole is the community size, $N_T$, which is estimated

to be the product of the measured bacterial density and

volume. Thus the ecological mechanism that effects the

difference in tree hole abundance distributions is the chan-

ging relative importance of random immigration on tree

holes of different sizes. The inability to reject the neutral

model predictions on the basis of the data in the majority of

tree holes suggests that this simple explanation cannot be

ruled out.
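The power-law lines in Fig. 3 are described as linear-regression fits, which amounts to regressing log10(S) on log10(V). The following is a minimal sketch of that procedure in Python with NumPy, not the authors' code. The function name `fit_power_law` and the volumes are hypothetical; the richness values are generated without noise from the curve reported for panel (a), so the fit should simply recover its coefficients.

```python
import numpy as np

def fit_power_law(volume, richness):
    """Fit S = c * V**z by least squares on log10-transformed data."""
    log_v = np.log10(np.asarray(volume, dtype=float))
    log_s = np.log10(np.asarray(richness, dtype=float))
    # A straight line in log-log space: log10(S) = z * log10(V) + log10(c).
    z, log_c = np.polyfit(log_v, log_s, 1)
    return 10.0 ** log_c, z

# Hypothetical volumes spanning about three orders of magnitude
# (illustrative only, not the measured tree-hole volumes).
volumes = np.array([50.0, 200.0, 1000.0, 5000.0, 20000.0])
# Noise-free richness generated from the relationship quoted in Fig. 3(a).
richness = 2.11 * volumes ** 0.26

c_hat, z_hat = fit_power_law(volumes, richness)
print(f"S = {c_hat:.2f} V^{z_hat:.2f}")
```

With real observations the points scatter about the line, and the fitted exponent is the slope of the regression in log-log space.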

Discussion

The success of the neutral model in explaining the different

taxa-abundance distributions in the detectable diversity of

tree holes, whose sizes vary over three orders of magnitude,

without the need to change any parameters, constitutes the

strongest evidence, so far, that NCMs can usefully describe

community composition. The evidence creates a compelling

case to study carefully the role that random reproduction,

death and immigration play in shaping bacterial community

structure. It suggests that at least some bacterial commu-

nities are dispersal-limited and, therefore, challenges the

perspectives held by some commentators that global dis-

persal of microorganisms prevents them from having bio-

geography (Fenchel, 2003) and that microbial population

sizes are sufficiently large to preclude local stochastic

extinctions (Fenchel & Finlay, 2005). This suggestion could

be tested by directly measuring immigration rates into

similar but differently sized bacterial communities.

Naturally occurring communities of microorganisms are

vital to life on earth and are of profound practical signifi-

cance in agriculture, medicine and engineering. Describing

patterns in microbial communities is, therefore, important

but not as important as explaining why the patterns form.

Thus the model presented here is of particular importance

because rather than fitting an arbitrary mathematical func-

tion to an observed pattern in microbial taxa abundances,

the model calibrated at one scale successfully makes predic-

tions at others. Prediction is rare in both microbial and

macrobial ecology (Harte, 2004) and is a far more convin-

cing test of an ecological theory than fitting predicted to

observed taxa-abundance distributions at a single site.

However, in presenting strong evidence in favor of Hubbell's neutral theory, one can court controversy. Indeed, when these results were presented at the joint Society for General Microbiology/British Ecological Society meeting on which this thematic issue is based, several members of the audience strongly objected. The grounds for this were that very many other models could have produced similar patterns and that processes such as niche differentiation could possibly explain the same patterns and would better represent the biological complexity that is believed to exist. This reflects a debate that has run in the ecological literature for the past five years (Hubbell, 2006; McGill et al., 2006), in which there was an initial polarization between the niche and neutral perspectives on community assembly. As the debate has matured, the hostilities have diminished and there is now a degree of conciliation and a recognition that the two are not mutually exclusive (Hubbell, 2006) and that chance, dispersal limitation and niche differentiation, or species sorting, all have a role to play. However, the parsimony of neutral theory remains controversial and many seek to provide evidence that it cannot explain all the variance in real communities of macroorganisms (e.g. McGill et al., 2006). There is no doubt that the neutral theory will fail to explain all of the variance, but we would maintain that the success, so far, of the neutral theory in explaining and, indeed, predicting the majority of the variance in microbial community composition is such that the burden of proof lies firmly with those who believe niche differentiation dominates the community assembly process. Their route to providing quantitative evidence of this will require a predictive model based on niche or species sorting concepts and none currently exists. There are highly cited examples of models that successfully meld deterministic, niche-based concepts with dispersal in a spatially distributed environment to demonstrate that a combination of these factors can promote biodiversity (e.g. Tilman, 1994; Mouquet & Loreau, 2003). However, these demonstrations tend to rely on a large number of (often invented) taxon-specific parameters. This degree of specificity is currently impossible in microbial ecology. Furthermore, there are no examples of such models being calibrated at one site or scale and then subsequently predicting phenomena at another. The rationale behind these theoretical demonstrations is commendable, in that ultimately to transfer information successfully through all scales in the landscapes of microorganisms and macroorganisms will require a theory that incorporates both demographic stochasticity and deterministic factors. However, for the foreseeable future, the representation of deterministic factors cannot rely on the experimental definition of a suite of parameters for each species, many of which cannot even be seen in microbial communities. Thus some alternative model that encapsulates the deterministic factors is required, perhaps based on energy concepts (Brown et al., 2004) or maximizing disorder (Shipley et al., 2006), and this remains an exciting challenge that transcends all the subdiscipline boundaries that appear to exist in ecology. For the moment though, it would appear that neutral dynamics are the best quantitative description of bacterial community assembly in beech tree holes in Wytham Woods, Oxfordshire, UK.

Fig. 4. (a) The ranked relative abundance of taxa in a random sample of $5 \times 10^{5}$ individuals from two different log-series distributed source communities: one with parameter $\theta = 15$, the other, more diverse, with $\theta = 25$. The dashed line is a threshold in relative abundance below which taxa will not be detected using DGGE. (b) The abundance distribution of the taxa in the random samples that can be detected by DGGE. The distribution of common taxa is less sensitive to the value of $\theta$ than that of rare taxa. Axes in both panels: taxa rank vs. relative abundance.
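The comparison described in the caption of Fig. 4 can be imitated numerically: draw a source community, sample individuals from it at random, and discard taxa whose relative abundance falls below a detection threshold. The sketch below is a rough Python illustration under stated assumptions, not the authors' procedure. The number of source taxa (`N_TAXA`) and the threshold (`THRESHOLD`) are hypothetical values chosen for illustration, and the source community is drawn with the normalized-Gamma recipe given in the Appendix.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

N_SAMPLE = 500_000   # sample size quoted in the Fig. 4 caption
N_TAXA = 5000        # assumed number of source taxa (illustrative only)
THRESHOLD = 0.01     # assumed detection threshold; not a value from the paper

def ranked_sample(theta):
    """Ranked relative abundances in a random sample from a simulated source."""
    # Source community: normalized Gamma(theta / N_TAXA, 1) draws.
    g = rng.gamma(shape=theta / N_TAXA, scale=1.0, size=N_TAXA)
    # Random sample of N_SAMPLE individuals from that source community.
    counts = rng.multinomial(N_SAMPLE, g / g.sum())
    observed = counts[counts > 0]
    return np.sort(observed)[::-1] / N_SAMPLE

results = {}
for theta in (15.0, 25.0):
    rel = ranked_sample(theta)
    detected = rel[rel >= THRESHOLD]   # taxa above the detection threshold
    results[theta] = (rel, detected)
    print(f"theta = {theta:g}: {rel.size} taxa in sample, "
          f"{detected.size} above threshold")
```

Plotting `rel` against rank on a logarithmic abundance axis for the two values of `theta` gives a picture of the kind shown in panel (a); restricting to `detected` corresponds to panel (b).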

References

Bell G (2000) The distribution of abundance in neutral

communities. Am Nat 155: 606–617.

Bell T, Ager D, Song J, Newman JA, Thompson IP, Lilley AK &

van der Gast CJ (2005) Larger islands house more bacterial

taxa. Science 308: 1884.

Belovsky GE, Botkin DB, Crowl TA et al. (2004) Ten suggestions

to strengthen the science of ecology. Bioscience 54: 345–351.

Brown JH, Gillooly JF, Allen AP, Savage VM & West GB (2004)

Toward a metabolic theory of ecology. Ecology 85: 1771–1789.

Chave J, Alonso D & Etienne RS (2006) Theoretical biology –

comparing models of species abundance. Nature 441: E1–E1.

Cocolin L, Bisson LF & Mills DA (2000) Direct profiling of the

yeast dynamics in wine fermentations. FEMS Microbiol Lett

189: 81–87.

Condit R, Pitman N, Leigh EG et al. (2002) Beta-diversity in

tropical forest trees. Science 295: 666–669.

Curtis TP, Sloan WT & Scannell JW (2002) Estimating

prokaryotic diversity and its limits. Proc Natl Acad Sci USA 99:

10494–10499.

Curtis TP, Head I, Lunn M, Sloan WT, Schloss PD & Woodcock S

(2006) What is the extent of microbial diversity. Philos Transac

Roy Soc 361: 2023–2037.

Fenchel T (2003) Biogeography for bacteria. Science 301: 925–926.

Fenchel T & Finlay BJ (2005) Bacteria and Island Biogeography.

Science 309: 1997–1999.

Gotelli NJ & McGill BJ (2006) Null versus neutral models: what’s

the difference? Ecography 29: 793–800.

Harte J (2003) Tail of death and resurrection. Nature 424:

1006–1007.

Harte J (2004) The value of null theories in ecology. Ecology 85:

1792–1794.

He HL & Gaston KJ (2003) Occupancy, spatial variance, and the

abundance of species. Am Nat 162: 366–375.

Heuer H, Wieland G, Schönfeld J, Schönwälder A, Gomes NCM &

Smalla K (2001) Bacterial community profiling using DGGE

or TGGE analysis. Environmental Molecular Microbiology

(Rochelle PA, ed), pp. 177–190. Horizon Scientific Press,

Wymondham, UK.

Hubbell SP (2001) The Unified Neutral Theory of Biodiversity and

Biogeography. Princeton University Press, Princeton, NJ.

Hubbell SP (2006) Neutral theory and the evolution of ecological

equivalence. Ecology 87: 1387–1398.

Jarvis PG & McNaughton KG (1986) Stomatal control of

transpiration – scaling up from leaf to region. Adv Ecol Res 15:

1–49.

Leclerc M, Delgenes JP & Godon JJ (2004) Diversity of the

archaeal community in 44 anaerobic digesters as determined

by single strand conformation polymorphism analysis and 16S

rDNA sequencing. Environ Microbiol 6: 809–819.

Levin SA (1992) The problem of pattern and scale in ecology.

Ecology 73: 1943–1967.

MacArthur RH & Wilson EO (1967) The Theory of Island

Biogeography. Princeton University Press, Princeton, NJ.

McGill BJ (2003) A test of the unified neutral theory of

biodiversity. Nature 422: 881–885.

McGill BJ, Maurer BA & Weiser MD (2006) Empirical evaluation

of neutral theory. Ecology 87: 1411–1423.

Mouquet N & Loreau M (2003) Community patterns in source-

sink metacommunities. Am Nat 162: 544–557.

Porter KG & Feig YS (1980) The use of DAPI for identifying and

counting aquatic microflora. Limnol Oceanogr 25: 943–948.

Schoener T (1972) Mathematical ecology and its place among the

sciences. Science 178: 389.

Shipley B, Vile D & Garnier E (2006) From plant traits to plant

communities: a statistical mechanistic approach to

biodiversity. Science 314: 812–814.



Sloan WT, Woodcock S, Lunn M, Head IM, Nee S & Curtis TP

(2006) The roles of immigration and chance in shaping

prokaryote community structure. Environ Microbiol 8: 732–740.

Sloan WT, Woodcock S, Lunn M, Head I & Curtis TP (2007)

Modeling taxa-abundance distributions in microbial

communities using environmental sequence data. Microb Ecol

53: 443–455.

Sogin ML, Morrison HG, Huber JA et al. (2006) Microbial

diversity in the deep sea and the underexplored "rare

biosphere". Proc Natl Acad Sci USA 103: 12115–12120.

Tilman D (1994) Competition and biodiversity in spatially

structured habitats. Ecology 75: 2–16.

Volkov I, Banavar JR, Hubbell SP & Maritan A (2003) Neutral

theory and relative species abundance in ecology. Nature 424:

1035–1037.

Volkov I, Banavar JR, He FL, Hubbell SP & Maritan A (2006)

Theoretical biology – comparing models of species abundance

– Reply. Nature 441: E1–E2.

Woodcock S, Curtis TP, Head IM, Lunn M & Sloan WT (2006)

Taxa-area relationships for microbes: the unsampled and the

unseen. Ecol Lett 9: 805–812.

Wu JG & Hobbs R (2002) Key issues and research priorities in

landscape ecology: an idiosyncratic synthesis. Landscape Ecol

17: 355–365.

Appendix

Sampling from a log-series taxa abundance distribution

Let $m$ be abundance and $S(m)$ be the expected number of taxa with abundance $m$; then, according to Hubbell's model of metacommunity dynamics, $S$ is described by Fisher's log-series distribution

$$S(m) = \frac{\theta x^{m}}{m} \tag{A1}$$

where

$$x = 1 - e^{-S_M/\theta} \tag{A2}$$

and $S_M$ is the total number of taxa in the source community.

It is not immediately obvious how to sample at random

from this distribution to generate realisations of the taxa

abundance distribution in the metacommunity. However, a

straightforward sampling algorithm becomes apparent if we

use an approximation suggested by Volkov et al. (2003).

They noted that as $\theta/S_M \to 0$ the log-series distribution can be approximated by

$$S(m) = \frac{S_M}{\Gamma(\theta/S_M)\left(\frac{x}{1-x}\right)^{\theta/S_M}}\, e^{-m/\left(\frac{x}{1-x}\right)}\, m^{\theta/S_M-1} \tag{A3}$$

since $\frac{S_M}{\Gamma(\theta/S_M)\left(\frac{x}{1-x}\right)^{\theta/S_M}} \to \theta$ and $e^{-m/\left(\frac{x}{1-x}\right)} \to e^{m\ln(x)}$.

The advantage of this formulation is that the species abundance distribution can be obtained by generating $S_M$ independent realisations of Gamma variables $m_i \sim \Gamma\left(\theta/S_M, \frac{1-x}{x}\right)$ for $i = 1, \ldots, S_M$, for finite $\theta$ as $\theta/S_M \to 0$.

As the variables are independent, their joint density

function is simply the product of their individual density

functions

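$$f(m_1, \ldots, m_{S_M}) = \frac{1}{\Gamma(\theta/S_M)^{S_M}\left(\frac{x}{1-x}\right)^{\theta}} \times \left[e^{-m_1/\left(\frac{x}{1-x}\right)}\, m_1^{\theta/S_M-1}\right] \cdots \left[e^{-m_{S_M}/\left(\frac{x}{1-x}\right)}\, m_{S_M}^{\theta/S_M-1}\right] \tag{A4}$$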

However, rather than using absolute abundances which

requires explicit knowledge of the number of individuals

in the metacommunity, we consider the relative abundance,

$p_i$, of each species. Setting $p_i = m_i / \sum_{j=1}^{S_M} m_j$ and $N_M = \sum_{j=1}^{S_M} m_j$, we note that only $S_M - 1$ of these $p_i$ variables are now independent.

Therefore, set

$$m_i = N_M p_i \quad \text{for } i = 1, \ldots, S_M - 1$$

and

$$m_{S_M} = N_M\left(1 - p_1 - \cdots - p_{S_M-1}\right).$$

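The joint density function of $p_1, \ldots, p_{S_M}$ is therefore

$$g(p_1, \ldots, p_{S_M-1}, N_M) = \frac{1}{\Gamma(\theta/S_M)^{S_M}\left(\frac{x}{1-x}\right)^{\theta}} \times \prod_{i=1}^{S_M}\left[e^{-N_M p_i/\left(\frac{x}{1-x}\right)}\, (N_M p_i)^{\theta/S_M-1}\right] \times |\det J| \tag{A5}$$

where $J$ is the Jacobian, given by

$$J = \begin{pmatrix} N_M & 0 & \cdots & 0 & p_1 \\ 0 & N_M & \ddots & \vdots & p_2 \\ \vdots & \ddots & \ddots & 0 & \vdots \\ 0 & \cdots & 0 & N_M & p_{S_M-1} \\ -N_M & -N_M & \cdots & -N_M & p_{S_M} \end{pmatrix} \tag{A6}$$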
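As a check on the determinant used in the next step, one possible route is a single row operation; this is an illustrative working that uses only the symbols already defined, and is not taken from the original derivation. Adding rows $1, \ldots, S_M - 1$ of the Jacobian to its last row leaves the determinant unchanged and makes the matrix upper triangular, because the entries $-N_M$ cancel and the last entry becomes $\sum_{i=1}^{S_M} p_i = 1$:

$$\det J = \det\begin{pmatrix} N_M & & & p_1 \\ & \ddots & & \vdots \\ & & N_M & p_{S_M-1} \\ 0 & \cdots & 0 & 1 \end{pmatrix} = N_M^{S_M-1}.$$

The powers of $N_M$ then combine as

$$\prod_{i=1}^{S_M} N_M^{\theta/S_M-1} \times N_M^{S_M-1} = N_M^{\theta - S_M}\, N_M^{S_M-1} = N_M^{\theta-1},$$

and the exponentials combine as $\prod_{i=1}^{S_M} e^{-N_M p_i/\left(\frac{x}{1-x}\right)} = e^{-N_M/\left(\frac{x}{1-x}\right)}$, since the $p_i$ sum to one.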

It can be seen that $|\det J| = N_M^{S_M-1}$ and therefore,

$$g(p_1, \ldots, p_{S_M-1}, N_M) = \left[\frac{e^{-N_M/\left(\frac{x}{1-x}\right)}\, N_M^{\theta-1}}{\Gamma(\theta)\left(\frac{x}{1-x}\right)^{\theta}}\right]\left[\frac{\Gamma(\theta)\, p_1^{\theta/S_M-1} \cdots p_{S_M}^{\theta/S_M-1}}{\Gamma(\theta/S_M)^{S_M}}\right]. \tag{A7}$$

The first term of this implies that $N_M \sim \Gamma\left(\theta, \frac{1-x}{x}\right)$, which gives $E(N_M) = \theta\,\frac{x}{1-x}$, as expected for the log-series distribution. The second bracket states that $p_1, \ldots, p_{S_M}$ have a Dirichlet distribution, $p_1, \ldots, p_{S_M} \sim \mathrm{Dir}(\theta/S_M, \ldots, \theta/S_M)$. Additionally, $p_1, \ldots, p_{S_M}$ are independent of $N_M$. Therefore, the distribution of relative abundances can be obtained by generating $S_M$ independent realisations of Gamma variables $p_i \sim \Gamma(\theta/S_M, 1)$ for $i = 1, \ldots, S_M$ and then normalizing by $\sum_{i=1}^{S_M} p_i$.
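The sampling recipe that closes this derivation is short enough to sketch in code. The following Python example, using NumPy, is one possible reading of it and not the authors' implementation. The function name `sample_relative_abundances` and the number of taxa are hypothetical choices, made so that $\theta/S_M$ is small. The second half checks, by simulation, that the unnormalized total of unit-scale Gamma draws has mean close to $\theta$.

```python
import numpy as np

def sample_relative_abundances(theta, n_taxa, rng):
    """Draw one realisation of source-community relative abundances.

    n_taxa independent Gamma(theta / n_taxa, 1) variables are normalized
    by their sum, which yields a Dirichlet(theta / n_taxa, ...) draw.
    Returns the relative abundances and the unnormalized total.
    """
    g = rng.gamma(shape=theta / n_taxa, scale=1.0, size=n_taxa)
    total = g.sum()
    return g / total, total

rng = np.random.default_rng(seed=1)
theta = 15.0
n_taxa = 2000   # illustrative S_M, chosen so that theta / S_M is small

p, _ = sample_relative_abundances(theta, n_taxa, rng)
ranked = np.sort(p)[::-1]
print("five most abundant taxa:", np.round(ranked[:5], 4))

# The sum of the unit-scale Gamma draws is Gamma(theta, 1) distributed,
# so its average over many realisations should be close to theta.
totals = [sample_relative_abundances(theta, n_taxa, rng)[1] for _ in range(1000)]
mean_total = float(np.mean(totals))
print("mean unnormalized total:", round(mean_total, 2))
```

Because the normalized proportions do not depend on the scale of the Gamma variables, unit scale is used here, which avoids having to evaluate $x$ at all.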


