RESEARCH ARTICLE
Neutral assembly of bacterial communities
Stephen Woodcock1, Christopher J. van der Gast2, Thomas Bell3, Mary Lunn4, Thomas P. Curtis5, Ian M. Head5 & William T. Sloan1
1Department of Civil Engineering, University of Glasgow, Glasgow, UK; 2Centre for Ecology and Hydrology, Oxford, UK; 3Department of Zoology,
University of Oxford, Oxford, UK; 4Department of Statistics, University of Oxford, Oxford, UK; and 5School of Civil Engineering and Geosciences,
University of Newcastle upon Tyne, Newcastle, UK
Correspondence: William T. Sloan,
Department of Civil Engineering, University of
Glasgow, Glasgow, G12 8LT, UK. Tel.: +44
141 330 4076; fax: +44 141 330 4557;
e-mail: [email protected]
Received 18 January 2007; revised 29 June
2007; accepted 3 July 2007.
First published online October 2007.
DOI:10.1111/j.1574-6941.2007.00379.x
Editor: Jim Prosser
Keywords
community assembly; dispersal; insular
communities; mathematical model; neutral
model.
Abstract
Two recent, independent advances in ecology have generated interest and
controversy: the development of neutral community models (NCMs) and the
extension of biogeographical relationships into the microbial world. Here these
two advances are linked by predicting an observed microbial taxa–volume
relationship using an NCM, providing the strongest evidence so far for neutral
community assembly in any group of organisms, macro or micro. Previously,
NCMs have only ever been fitted using species-abundance distributions of
macroorganisms at a single site or at one scale and parameter values have been
calibrated on a case-by-case basis. Because NCMs predict a malleable two-
parameter taxa-abundance distribution, this is a weak test of neutral community
assembly and, hence, of the predictive power of NCMs. Here the two parameters of
an NCM are calibrated using the taxa-abundance distribution observed in a small
waterborne bacterial community housed in a bark-lined tree-hole in a beech tree.
Using these parameters, unchanged, the taxa-abundance distributions and
taxa–volume relationship observed in 26 other beech tree communities whose
sizes span three orders of magnitude could be predicted. In doing so, a simple
quantitative ecological mechanism to explain observations in microbial ecology is
simultaneously offered and the predictive power of NCMs is demonstrated.
Introduction
Scale is a problem in microbial ecology. Even using the
most up-to-date molecular methods, one is limited to
observing and characterizing very small samples from what
are ostensibly very large naturally occurring microbial com-
munities. The considerable technical sophistication and skill
required to collect, analyse and enumerate microbial popu-
lations correctly in environmental samples can sometimes
obscure just how small, in relative terms, samples are. Take,
for example, a large clone library of, say, 500 clones derived
from a 1 g soil sample; the soil sample itself may contain as
many as 10^9 individual organisms, and applied microbial
ecologists will generally be interested in the services provided
by the communities at a scale somewhat larger than a
single 1 g sample. By analogy, when there are currently
6 × 10^9 humans in the world, a single sample of a few
hundred individuals is unlikely to be sufficient to character-
ize the global distribution of any human traits unless it is
extremely homogeneous. Molecular methods are advancing
so quickly that in the near future it may be possible to get
close to a complete census in a sample (Sogin et al., 2006)
but, even then, a 1 g sample is small if one’s aspirations are to
characterize an entire field of soil. This disparity between
sample size and community size is enormous and far greater
than any comparable sampling issues in mainstream ecol-
ogy. Consequently, patterns are perceived through a sparse,
often distorted (Sloan et al., 2007) map of the microbial
world.
Thus, the modus operandi in microbial ecology is extrapolation
from very small samples, and the fact that microbial
systems are always observed at a scale much smaller than
the scale one ultimately aims to characterize amplifies some of
the challenges faced in classical ecology. In scaling from a leaf to
the ecosystem to the landscape and beyond (Jarvis &
McNaughton, 1986), there must be an understanding of
how information is transferred from fine scales to broad
scales and vice versa (Levin, 1992). This problem of
FEMS Microbiol Ecol 62 (2007) 171–180 © 2007 Federation of European Microbiological Societies. Published by Blackwell Publishing Ltd. All rights reserved
cascading information and ecological process understanding
through a hierarchy of different scales is being tackled using
mathematical models by ‘landscape ecologists’ (Wu &
Hobbs, 2002). The models integrate mathematical descrip-
tions of plausible plot-scale ecological processes to form
patterns at the landscape scale. In going from rRNA genes in
a sample to the sample itself and beyond, microbial ecolo-
gists face similar technical and conceptual challenges (Sloan
et al., 2007) but with the added challenge of having an
uncertain picture of the broad-scale patterns (Woodcock
et al., 2006). This has consequences for the complexity of the
models that one can aspire to use. Simon Levin, in his
MacArthur Award Lecture (Levin, 1992) on ‘The problem of
pattern and scale in ecology’, described the essence of modelling
thus: ‘to facilitate the acquisition of this understanding
(scaling), by abstracting and incorporating just enough detail
to produce observed patterns. A good model does not attempt
to reproduce every detail of the biological system; the system
itself suffices for that. Rather, the objective of a model should
be to ask how much detail can be ignored without producing
results that contradict specific sets of observations . . .’ This
judicious paradigm has been embraced by theoretical ecol-
ogists and a wide variety of conceptual models of potential
ecological pattern-forming mechanisms have been encoded
into mathematical models and then shown to produce
observed patterns in the spatial distribution and relative
abundance of taxa. All are simplifications of the system
being modelled. The majority serve to, in some way, whittle
down the set of plausible mechanisms that can lead to a
particular pattern; theoretical ecologists are all too aware of
the ‘same behaviour implies same mechanism’ fallacy and,
since their representations of the ecological processes are
rarely calibrated against observations, few say emphatically
that their models are correct.
While validating the plausibility of a model through
mathematics is vital, the paucity of attempts to go beyond
this and validate the models themselves is frustrating
(Belovsky et al., 2004); indeed Schoener (1972) cautioned
against the ‘constipating accumulation of untested models’
more than 30 years ago. In microbial ecology, one is less
constrained by the broad-scale patterns because one can find
them so difficult to identify and thus theoretical microbial
ecology, if pursued in the same fashion, unconstrained by
biological reality, could significantly add to this uncomfor-
table blockage of untested and perhaps untestable models.
Thus one further proviso is required; for a model to
ultimately be of some practical use it should be predictive.
This means that a model calibrated at one site, one scale or
on the basis of one set of ecological processes should be
capable of predicting phenomena at different sites, at
different scales or that pertain to seemingly unrelated
mechanisms. Harte (2004) implicitly combines the para-
digm cited by Levin (1992), which calls for parsimony, with
the requirement for prediction by suggesting that theories
are of most interest when the ratio of the number of
predictions that they make to the number of assumptions
and adjustable parameters is large. It is for this reason that,
to investigate the roles of chance and
dispersal limitation in shaping patterns in microbial community
composition, the simplest possible conceptual model of
community assembly that incorporates these factors was
selected (Curtis et al., 2006; Sloan et al., 2006, 2007): a
simple neutral community assembly model (Hubbell, 2001)
in which the composition at a local scale is shaped only by
random immigration, birth and death events.
Neutral community assembly models (NCMs) (Bell,
2000; Hubbell, 2001) have been shown to reproduce the
distribution of taxa abundances in a wide range of different
biological communities. However, in most previous applica-
tions of neutral theory the model parameters are selected to
minimize the difference between observed and predicted
taxa-abundance distributions. The merit of neutral theory,
over and above other hypotheses on the formation of
biological communities is then argued on the basis of (often
small) differences in a goodness of fit statistic for calibrated
taxa-abundance distributions (McGill, 2003; Volkov et al.,
2003, 2006; Chave et al., 2006). These arguments can seem
rather arcane when there has been little attempt to validate
the models (Harte, 2003). In addition, microbial ecologists
were until recently precluded from the debate because, for
most environments, only a small fraction of the diversity can
be experimentally defined (Curtis et al., 2002). Despite the
advances in molecular methods for characterizing naturally
occurring microbial communities in situ, the disparity in
scale between sample and community size and some in-
herent limitations of the methods conspire to make a purely
empirical definition of a taxa-abundance distribution at a
single site very difficult. Sloan et al. (2006) circumvented
this problem for microbial communities by deriving a
method for calibrating Hubbell’s neutral theory based on a
theoretical relationship between the mean relative abun-
dance of common taxa and the frequency with which they
are expected to appear in multiple similarly sized samples.
Thus, in Sloan et al. (2006, 2007), the criterion of a
parsimonious calibrated model capable of reproducing
patterns is met. However, neutral theory is controversial
and its parsimony still grates with many who seek to provide
evidence that it cannot explain all the variance in real
communities of macroorganisms (e.g. McGill et al., 2006).
Calibrating an NCM at one site or one scale is not a
convincing endorsement of the model’s underlying assump-
tions and many alternative models could potentially repro-
duce either the taxa-abundance distributions (McGill, 2003)
or the abundance-frequency relationships (He & Gaston,
2003) observed. A strong test of neutral theory has to
demonstrate its predictive power and this has never
previously been achieved (Condit et al., 2002; Gotelli &
McGill, 2006; McGill et al., 2006). The authors provide the
first demonstration of an NCM calibrated using microbial
taxa abundances at one site and one scale accurately
predicting very different taxa-abundance distributions and
the observed taxa–volume relationship across a range of
scales and sites.
Hubbell’s neutral model makes predictions about how
the richness and abundance distribution of taxa on island-
like communities will be affected by community size and
immigration. Indeed, the genesis of his NCM came about
through an attempt to present a unified theory that com-
bined the Theory of Island Biogeography (MacArthur &
Wilson, 1967), which makes predictions on species richness
within insular communities, with predictions on relative
abundance of taxa. Ascertaining whether these predictions
are borne out in reality requires taxa abundance data from a
set of insular communities for which community size or
immigration varies significantly but which are very similar
in all other respects. If the NCM presents an adequate
representation of the ecological processes that shape the
community structure then it should be possible to explain a
significant proportion of the variance in the taxa-abundance
distributions for all the communities by employing a single
set of parameters calibrated at one site. However, datasets
for insular communities of significantly different sizes or
with different degrees of isolation that are housed in very
similar ecosystems are rare. Bell et al. (2005) published just
such a dataset for water-borne bacteria living in tree holes in
beech trees in the same woodland. Samples were taken from
29 rainwater-filled, bark-lined holes, each of which housed
a small ecosystem. The range of volumes of these habitats
spanned three orders of magnitude; the smallest was a mere
50 mL, the largest 18 000 mL. Bell et al. (2005) reported that
bacterial species richness increased with tree hole volume in
a manner that could be modelled using a single power law
relationship, which hints at some consistent process of
community assembly. All the fluid was removed from the
tree holes and was homogenized by stirring. Bacterial
richness was determined from denaturing gradient gel
electrophoresis (DGGE) analysis (Bell et al., 2005). Epifluor-
escence microscopy was performed (Porter & Feig, 1980)
and the density of organisms in the tree holes was revealed
to be around 10^5 mL^-1. The sample size analyzed for all the
tree holes was 5 mL (c. 5 × 10^5 individuals). Physically and
chemically, the bacterial communities were similar; they
were all supported by similar nutrients (decaying leaf litter),
relatively stagnant, but subject to invasion events from either
airborne or rainwater-borne microorganisms. The greatest
geographic distance between any two trees in the study was
around two miles. The distributions of the relative abundance
of taxa in the samples were not reported in Bell et al.
(2005) but are used here (e.g. Figs 1 and 2). These were
determined by the relative intensity of bands on the DGGE
gels. Because of detection limitations inherent to the DGGE
analysis (e.g. Cocolin et al., 2000; Leclerc et al., 2004;
Woodcock et al., 2006) used in the initial study, only the
top few ranked taxa were observed at each site, hence the
abundances in the dataset were normalized relative only to
the total abundances of these most common taxa. Quanti-
fication of the absolute abundances of taxa in a sample using
DGGE is open to criticism (Heuer et al., 2001); however,
exactly the same technique was used for each tree hole and,
therefore, exactly the same biases applied to each sample.
Thus, for the comparative analysis of relative abundances
presented here, it is fair to say that significant shifts in
DGGE patterns reflect real shifts in the bacterial community
composition. What is immediately striking from these data
[Figure 1: plot of relative abundance (y-axis, 0 to 0.5) against bacterial taxa rank (x-axis, 1 to 7).]
Fig. 1. The ranked taxa-abundance distribution (squares) observed in a
5 mL sample from the smallest tree hole, which had a volume of 50 mL.
The line shows the expected rank-abundance distribution obtained by
averaging 1000 realizations of ranked abundances simulated by Hubbell’s
neutral model with NT = 5 × 10^5, θ = 15 and m = 1.0 × 10^-6.
[Figure 2: plot of relative abundance (y-axis, 0 to 0.4) against bacterial taxa rank (x-axis); curves are labelled by tree-hole volume in mL: 11 000, 4450, 3000, 1460, 640, 111 and 50.]
Fig. 2. Ranked taxa-abundance distributions for a selection of seven of
the 29 tree holes, ranging in volume from 11 000 to 50 mL. The lines
represent taxa-abundance distributions predicted by the neutral model
with m = 10^-6 and θ = 15, calibrated using data from the 50 mL tree hole.
is just how dramatically and systematically the shape of the
taxa-abundance distributions changes between tree holes,
with large communities exhibiting a much more even
distribution than small ones. This is all the more remarkable
because the sample size at each site was exactly the same.
Given the proximity of tree holes and the similarity of
their environments, what causes the difference in the taxa-abundance
distributions? If one were to assume, initially,
that the tree hole environments were identical in everything
except for their volume and that the same forces act to shape
the community composition, then can volume alone explain
the differences? It is shown that, accounting for tree hole
volume, the distributions are significantly different and this
hypothesis is rejected. Thus the distribution of taxa abun-
dances in tree hole samples does not derive from the same
underlying distribution. Therefore, there is no benefit to be
gained in testing, what many commentators believe should
be the null hypothesis in any study of taxa abundances
(McGill, 2003; Gotelli & McGill, 2006), that a particular
arbitrary parametric distribution, such as the lognormal, fits
the data. Then the hypothesis that the tree holes house
distinct, homogeneous, island-like communities that are
neutrally assembled from a single metacommunity with a
consistent rate of random immigrations into each tree hole
is tested and could not be rejected. This is achieved by
calibrating an NCM using the taxa-abundance distribution
from the smallest tree hole, predicting the taxa-abundance
distributions from all other tree holes and then testing at the
5% significance level whether the simulated and observed
taxa-abundance distributions are the same.
Materials and methods
Do the samples derive from the same distribution?
Firstly, the hypothesis that the same structuring forces
shaped the bacterial tree hole communities and that, conse-
quently, the sample taxa-abundance distributions derive
from the same underlying distribution was tested. Synthetic
populations were generated for each environmental sample
by selecting 5 × 10^5 individuals with replacement from the
observed taxa-abundance distributions. Species were in-
dexed by their ranked abundance, with the most abundant
being ranked 1, the second most abundant 2, etc. The
abundances reported for the observed data were relative to
the total abundance of the taxa that appeared on the DGGE
gels. Therefore the abundances of synthetic populations
were also normalized by the same number of top ranked
taxa. A Kolmogorov–Smirnov test was then applied to every
combination of two synthetic samples to determine how
likely they were to have come from the same underlying
distribution.
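
The resampling and pairwise comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the study: the helper names (`resample_ranked`, `ks_statistic`, `ks_pvalue`), the two example abundance vectors and the reduced sample size are all hypothetical. It is also an assumption here that the Kolmogorov–Smirnov statistic is taken as the largest gap between the cumulative relative abundances over rank, with the standard asymptotic P-value.

```python
import math
import random


def resample_ranked(rel_abund, n_individuals, rng):
    """Draw individuals with replacement from an observed relative-abundance
    vector, then re-rank and re-normalize over the same number of taxa."""
    k = len(rel_abund)
    counts = [0] * k
    for taxon in rng.choices(range(k), weights=rel_abund, k=n_individuals):
        counts[taxon] += 1
    ranked = sorted(counts, reverse=True)  # rank 1 = most abundant
    total = sum(ranked)
    return [c / total for c in ranked]


def ks_statistic(ranked_a, ranked_b):
    """Largest gap between the two cumulative abundance-over-rank curves."""
    n = max(len(ranked_a), len(ranked_b))
    a = list(ranked_a) + [0.0] * (n - len(ranked_a))
    b = list(ranked_b) + [0.0] * (n - len(ranked_b))
    cum_a = cum_b = gap = 0.0
    for xa, xb in zip(a, b):
        cum_a += xa
        cum_b += xb
        gap = max(gap, abs(cum_a - cum_b))
    return gap


def ks_pvalue(d, n_a, n_b, terms=100):
    """Asymptotic two-sample Kolmogorov-Smirnov P-value for statistic d."""
    lam = d * math.sqrt(n_a * n_b / (n_a + n_b))
    if lam < 0.2:  # the alternating series is effectively 1 in this range
        return 1.0
    series = sum((-1) ** (j - 1) * math.exp(-2.0 * (j * lam) ** 2)
                 for j in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * series))


# Hypothetical ranked relative abundances for a small and a large tree hole.
small_hole = [0.45, 0.25, 0.15, 0.08, 0.04, 0.03]
large_hole = [0.10] * 10

rng = random.Random(1)
n = 20_000  # the study used 5 x 10^5; reduced here to keep the sketch fast
synthetic_small = resample_ranked(small_hole, n, rng)
synthetic_large = resample_ranked(large_hole, n, rng)
d = ks_statistic(synthetic_small, synthetic_large)
p = ks_pvalue(d, n, n)
print(f"D = {d:.3f}, P = {p:.2g}")
```

In a full analysis the same comparison would be repeated for every pair of synthetic samples (for example with `itertools.combinations`). Because rank data are discrete, the asymptotic P-value is only approximate.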
Is the community neutrally assembled?
Secondly, the hypothesis that the samples are from neutrally
assembled communities fed by immigrants from a single
source community with a constant immigration rate was
tested. To do this, Hubbell’s NCM was calibrated using the
taxa-abundance distribution in the sample drawn from the
smallest tree hole. In Hubbell’s model it is assumed that the
distribution of taxa abundances in the source metacommu-
nity is described by a log-series distribution with a single
parameter θ, which Hubbell calls the fundamental biodiversity
number because it indexes the overall biodiversity. In
local communities, which are assumed to be saturated with
individuals, when an individual organism dies it is replaced
either, with probability m, by an immigrant drawn at
random from the source community or, with probability 1 − m, by
reproduction from within the local community.
Given local reproduction, the probability that
any particular taxon reproduces depends on its relative
abundance, which requires knowledge of the number of
individuals in the local community, NT. Thus the shape of
the taxa-abundance distribution for a neutrally assembled
community depends on the values of the three parameters:
NT, θ and m. NT was estimated using the tree hole volume
and the density of organisms [O(10^5) mL^-1]. θ and m were
considered free parameters that were adjusted to give the
best least squares fit between the observed and simulated
expected taxa-abundance distribution for the smallest tree
hole. Least squares fitting was adopted because it is liable to
be biased towards fitting the model to the higher observed
relative taxa abundances; the authors have more confidence
in these than the lower abundances estimated from DGGE
band intensities. To simulate the sample distribution, a
realization of the relative abundance of taxa in the
metacommunity, $\{p_i\}_{i=1}^{S_M}$, where $S_M$ is the number of different
taxa in the metacommunity, was first generated. To do this,
it is noted (after Volkov et al., 2003) that, provided $\theta/S_M$ is
small, the log-series distribution can be approximated by a gamma
distribution. It is shown in the appendix that this leads to a
simple method for generating realizations of the relative abundance
of taxa in the metacommunity, $\{p_i\}_{i=1}^{S_M}$, by sampling at
random from gamma distributions. For any given realization
of $\{p_i\}_{i=1}^{S_M}$, Sloan et al. (2007) show that the
distribution of taxa abundances in the local neutrally
assembled community, $\{y_i\}_{i=1}^{S_M}$, is Dirichlet, $\mathrm{Dir}(N_T m p_1, \ldots, N_T m p_{S_M})$,
and give a simple algorithm for generating a
realization of $\{y_i\}_{i=1}^{S_M}$. Sloan et al. (2007) also show that the
distribution of taxa abundances in a sample of size $N_S$ (i.e.
what is observed on the DGGE gel) from that distribution is
Dirichlet, $\mathrm{Dir}(N_S y_1, \ldots, N_S y_n)$. Therefore, given any pair of
parameters m and θ, it was straightforward to simulate 1000
realizations of the taxa-abundance distribution in a sample
of $N_S$ individuals from a tree hole comprising $N_T$
individuals. These were then averaged to give the expected
taxa-abundance distribution. For the purposes of a com-
parative analysis, since the abundances reported for the
observed data were relative to the total abundance of the
taxa that appeared on the DGGE gels, the abundances of
synthetic populations were also normalized by the same
number of top-ranked taxa.
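
The three-stage simulation described above (metacommunity, local community, sample) might be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the appendix is not reproduced here, so the metacommunity step is assumed to be a set of normalized gamma draws with shape θ/SM; the Dirichlet draws use the standard normalized-gamma construction; and the metacommunity richness (`s_meta = 500`), the number of retained ranks (`top_k = 7`) and the function names are hypothetical choices.

```python
import random


def dirichlet(alphas, rng):
    """Sample Dir(alphas) by normalizing independent gamma(alpha, 1) draws.
    Components with alpha <= 0 are fixed at zero."""
    for _ in range(100):
        draws = [rng.gammavariate(a, 1.0) if a > 0.0 else 0.0 for a in alphas]
        total = sum(draws)
        if total > 0.0:
            return [d / total for d in draws]
    raise ValueError("all gamma draws underflowed to zero")


def expected_ranked_sample(theta, m, n_local, n_sample,
                           s_meta=500, top_k=7, realizations=100, seed=0):
    """Average the top_k ranked, re-normalized sample abundances over
    repeated realizations of metacommunity -> local community -> sample."""
    rng = random.Random(seed)
    mean = [0.0] * top_k
    for _ in range(realizations):
        # Assumed metacommunity step: normalized gamma draws, shape theta/S_M.
        p = dirichlet([theta / s_meta] * s_meta, rng)
        # Local community: Dir(N_T m p_1, ..., N_T m p_SM).
        y = dirichlet([n_local * m * p_i for p_i in p], rng)
        # Sample of N_S individuals: Dir(N_S y_1, ..., N_S y_n).
        x = dirichlet([n_sample * y_i for y_i in y], rng)
        top = sorted(x, reverse=True)[:top_k]
        total = sum(top)  # normalize by the top-ranked (detectable) taxa only
        for rank, value in enumerate(top):
            mean[rank] += value / total / realizations
    return mean


density = 1e5  # organisms per mL, the order of magnitude quoted in the text
small = expected_ranked_sample(15, 1e-6, n_local=50 * density, n_sample=5e5)
large = expected_ranked_sample(15, 1e-6, n_local=11000 * density, n_sample=5e5)
print("50 mL:    ", [round(v, 3) for v in small])
print("11 000 mL:", [round(v, 3) for v in large])
```

Only `n_local` differs between the two calls, mirroring the way the tree holes are distinguished solely by community size in the analysis.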
Results
Two hypotheses were tested. Firstly, that the taxa-abundance
distributions observed in all the tree holes derive by ran-
domly sampling from the same distribution. The P-values
were so low that the hypothesis that the samples are all from
the same underlying distribution at the 0.05% level (i.e. all
P-values o 0.0005) can be confidently rejected. Secondly,
the hypothesis that the communities are neutrally assembled
from the same source community and that the taxa-abundance
distributions could be reproduced by an NCM was
tested. A stringent test of this was adopted in that, rather
than seeking parameter values on the basis of all the data
from all the tree holes, it was decided to calibrate the two free
parameters in Hubbell’s NCM, the immigration probability, m, and the index of
biodiversity in the source community, θ,
using data from only one tree hole: the smallest. The least-squares
best fit to the relative abundance of the observable
taxa in a sample (5 mL) from the smallest (50 mL) tree hole
was obtained with θ = 15 and m = 1.0 × 10^-6 (Fig. 1). These
parameter values were then used to predict the expected
abundance distributions in all other tree holes; the only
parameter that changed between tree holes was NT, the total
number of individuals.
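
The calibration step amounts to a two-parameter least-squares search. A minimal sketch follows, assuming a simple grid search; the text does not state which search procedure was used, and the grids are illustrative. The function `toy_expected` is a purely hypothetical stand-in for the averaged neutral-model simulation (it is not the NCM) and is included only so that the example runs on its own.

```python
def sum_squared_error(observed, expected):
    """Least-squares misfit between two ranked abundance vectors."""
    return sum((o - e) ** 2 for o, e in zip(observed, expected))


def calibrate(observed, simulate_expected, theta_grid, m_grid):
    """Return (theta, m, sse) for the grid point with the smallest misfit.

    simulate_expected(theta, m) must return the expected ranked abundances
    for the calibration site (here, the smallest tree hole)."""
    best = None
    for theta in theta_grid:
        for m in m_grid:
            sse = sum_squared_error(observed, simulate_expected(theta, m))
            if best is None or sse < best[2]:
                best = (theta, m, sse)
    return best


def toy_expected(theta, m, k=7):
    """Hypothetical stand-in curve, NOT the neutral model."""
    decay = theta / (theta + 10.0)   # flatter curve for larger theta
    floor = m * 1e4                  # lifts the tail as m increases
    raw = [decay ** i + floor for i in range(k)]
    total = sum(raw)
    return [r / total for r in raw]


# Pretend the "observed" data were generated with theta = 15, m = 1e-6.
observed = toy_expected(15, 1e-6)
best = calibrate(observed, toy_expected,
                 theta_grid=[5, 10, 15, 20, 25],
                 m_grid=[1e-7, 1e-6, 1e-5])
print(best)
```

In practice `simulate_expected` would be replaced by the averaged neutral-model simulation for the calibration site, and the fitted pair would then be reused unchanged for every other tree hole with only NT varying.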
Figure 2 gives seven examples of the remarkably good
match between the observed and predicted taxa-abundances
that was obtained in the majority of tree holes. A quantitative
measure of goodness of fit was obtained by simulating an
additional 500 realizations of the tree hole taxa-abundance
distributions. Pearson’s statistic for goodness of fit was
calculated for these and for the observed distribution
$$\sum_{i=1}^{n} \frac{\left(E(i) - x(i)\right)^{2}}{E(i)}$$
where E(i) is the expected abundance of the ith ranked
taxon and x(i) is its abundance in the simulation or observed
dataset. A P-value was then estimated from the proportion
of these 500 trials that produced a goodness of fit statistic
greater than that calculated for the observed data (Table 1).
Hypothesis testing at the 5% significance level suggested
(Table 1) that for 27 of the 29 tree hole communities, there
was no evidence to reject the neutral model. There was no
reason to assume any anomalies in either environmental
conditions or sampling procedure for the two tree holes
where the neutral model was rejected.
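
The Monte Carlo goodness-of-fit test described above can be illustrated with a short sketch. The function names and the small hand-made vectors are hypothetical; in the study the simulated realizations came from the calibrated neutral model, whereas here four fixed vectors stand in for the 500 trials.

```python
def pearson_statistic(expected, values):
    """Pearson's goodness-of-fit statistic: sum over ranks of (E - x)^2 / E."""
    return sum((e - x) ** 2 / e for e, x in zip(expected, values) if e > 0)


def monte_carlo_pvalue(expected, observed, simulated):
    """Proportion of simulated realizations whose statistic exceeds the
    statistic calculated for the observed data."""
    observed_stat = pearson_statistic(expected, observed)
    exceed = sum(1 for s in simulated
                 if pearson_statistic(expected, s) > observed_stat)
    return exceed / len(simulated)


# Hypothetical expected, observed and simulated ranked abundances.
expected = [0.5, 0.3, 0.2]
observed = [0.52, 0.28, 0.2]
simulated = [
    [0.50, 0.30, 0.2],
    [0.60, 0.20, 0.2],
    [0.40, 0.40, 0.2],
    [0.55, 0.25, 0.2],
]
p_value = monte_carlo_pvalue(expected, observed, simulated)
print(p_value)
```

A P-value below 0.05 would lead to rejection of the neutral model for that tree hole at the 5% significance level.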
The success of the neutral model in predicting taxa-
abundance distributions over a range of different scales
(Fig. 2, Table 1) demonstrates its potential as a predictive
tool. Much of the excitement about Hubbell’s neutral model
stems from its ability to link predictions on different
ecological phenomena. Indeed, Hubbell refers to his theory
as the ‘unified theory of biodiversity and biogeography’
because of its potential to link predictions on the shape of
taxa-abundance distributions to a relationship between taxa
richness and area sampled (taxa–area relationship). This
link has never previously been explicitly demonstrated. In
Fig. 3 it is shown that the predicted richness of taxa in each
of the tree hole samples closely matches the observed
richness. These predictions have again been produced using
the parameter values calibrated using data from the smallest
tree hole (θ = 15, m = 10^-6); a detection threshold of 0.005 on
the relative abundance of taxa was assumed; this was the
minimum relative abundance to appear on any of the DGGE
gels. Bell et al. (2005) fitted the phenomenological model of
a power-law relationship to their observed taxa–area rela-
tionship, which is reproduced in Fig. 3a. When a similar
Table 1. Estimated P-values for the goodness of fit using an NCM
calibrated against the smallest site, tree hole 21

Tree-hole number   Volume (mL)   P-value
 1                     360        0.076
 2                    3250        0.324
 3                    1700        0.500
 4                     750        0.280
 5                  18 000        0.020*
 6                     180        0.620
 7                     640        0.454
 8                    4450        0.612
 9                    3600        0.152
10                    3150        0.146
11                    2250        0.326
12                    1800        0.650
13                    1250        0.154
14                      60        0.480
15                    1950        0.122
16                    2850        0.214
17                    2225        0.956
18                     900        0.228
19                  11 000        0.158
20                    1460        0.588
21                      50        0.956
22                    3000        0.674
23                     140        0.342
24                     220        0.094
25                     111        0.808
26                     350        0.466
27                    1200        0.068
28                    3000        0.736
29                     600        0.040*

*Not statistically significant.
The parameter pair used was (θ, m) = (15, 10^-6).
model is fitted to the predicted taxa richnesses (Fig. 3b), an
almost identical relationship is obtained. The greatest devia-
tion between observed and predicted is, perhaps unsurpris-
ingly, in the largest tree hole where the neutral model was
rejected (Fig. 3c). For the most part, however, the link
between the taxa–area relationship and taxa-abundance
distributions suggested by Hubbell is borne out in the tree
hole data set. The smallest tree hole was deliberately selected
to calibrate the model parameters because it offers the
greatest information on the rate of immigration, m. Accord-
ing to Hubbell’s model, as community sizes increase, the
systems increasingly resemble the source community and
the effects of immigration become obscured. Thus, the
smallest site offers the greatest opportunities to quantify
immigration into the systems. The value of θ = 15 calibrated
on the smallest tree hole is consistent with calibrating on any
other tree hole. Independently calibrating the model using
all of the other tree holes suggests that θ lies in the range
15 ≤ θ ≤ 25, and it transpires that the predictions on both
the taxa-abundance distributions and taxa–volume relation-
ship across all the tree holes were insensitive to changes in
this range. This insensitivity is not a property of the neutral
model itself. Rather it is an artefact of the experimental
methods available to microbial ecologists. As discussed in
the introduction, microbial ecologists are limited to viewing
a small percentage of the overall diversity in an environ-
mental sample using rapid community profiling techniques
such as DGGE. This is a generic problem in microbial
ecology that is discussed in more detail elsewhere (Curtis
et al., 2006; Woodcock et al., 2006; Sloan et al., 2007).
However, in the context of this application the reason for the
insensitivity to θ is demonstrated in Fig. 4. This shows the
taxa-abundance distributions that one would expect from
random samples from two log-series distributed source
communities: one with θ = 15 and the other with θ = 25. If
there were no dispersal limitation, then these are the
distributions one might expect in all the tree holes. In the
entire sample of 5 × 10^5 individuals, there are significant differences in
the overall richness of taxa and in the distribution of taxa
abundances as a function of θ. However, using DGGE it is
impossible to detect all the taxa; only those with abundance
greater than some threshold can be detected. A detection
limit of 0.5% relative abundance is displayed in Fig. 4a, and in Fig. 4b
the taxa-abundance distributions for taxa whose abundances
are greater than this limit are displayed. The distributions
are quite similar and thus the abundance distribution of
detectable diversity is quite insensitive to θ. Sloan et al.
(2007) point out that it is difficult to determine the under-
lying taxa-abundance distribution from such small samples
and, therefore, it may be that the source community
abundances are not in fact log-series distributed; other
source distributions might produce similar results. Thus
the success of the neutral model in predicting the taxa-
abundance distributions in the tree holes should not be seen
as a validation of Hubbell’s model in its entirety. Sufficient
information is not available about rare taxa to conclude that
the log-series is the source community’s taxa-abundance
distribution, let alone verify Hubbell’s conceptual model for
the maintenance of biodiversity in the source community.
However, given that there is some underlying source com-
munity distribution, the first test showed that the tree hole
communities are not merely random samples from that
source community; the abundance distributions of detect-
able diversity all differ significantly from one another. Some
ecological process must be causing these differences. This
could be a function of the environment, but part of the
[Figure 3: three panels. (a) and (b) plot log10(number of taxa, S) against log10(tree hole volume, V), with fitted lines S = 2.11 V^0.26 and S = 2.19 V^0.25, respectively; (c) plots predicted number of taxa against observed number of taxa.]
Fig. 3. (a) The observed bacterial richness in all 29 tree holes. The solid
line represents the power-law relationship, S = 2.11 V^0.26, fitted using
linear regression. (b) The bacterial richness predicted by the neutral
model with θ = 15 and m = 10^-6 calibrated using the taxa-abundance
distribution of the smallest tree hole. The solid line represents the
power-law relationship, S = 2.19 V^0.25, again fitted using linear regression. (c)
Observed vs. predicted richness in each tree hole. The line represents
perfect agreement and the two squares indicate where the neutral
model was rejected.
attraction in examining the tree-hole communities is that
their environments are all so similar. Besides, the greatest
perceptible difference between the tree holes is their volume
and hence the size of the communities they house. In the
predictions presented in this paper, the source community
abundance distribution and the immigration rate are held
constant and the only parameter that changes from tree hole
to tree hole is the community size, NT, which is estimated
to be the product of the measured bacterial density and
volume. Thus the ecological mechanism that effects the
difference in tree hole abundance distributions is the chan-
ging relative importance of random immigration on tree
holes of different sizes. The inability to reject the neutral
model predictions on the basis of the data in the majority of
tree holes suggests that this simple explanation cannot be
ruled out.
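
The power-law relationships quoted in the Fig. 3 caption are described as fitted by linear regression. A minimal sketch of such a fit is given below, assuming ordinary least squares on log10-transformed values. The function name `fit_power_law` and the input values are illustrative assumptions: the data are synthetic and noise-free, generated from the relationship reported for the observed richness, and are not the tree-hole measurements.

```python
import numpy as np

def fit_power_law(volume, richness):
    """Fit S = c * V**z by least squares on log10-transformed data."""
    log_v = np.log10(np.asarray(volume, dtype=float))
    log_s = np.log10(np.asarray(richness, dtype=float))
    z, log_c = np.polyfit(log_v, log_s, 1)  # slope z, intercept log10(c)
    return 10.0 ** log_c, z

# Synthetic, noise-free values (NOT the tree-hole data), generated from the
# relationship S = 2.11 V^0.26 quoted in the Fig. 3(a) caption.
volumes = np.logspace(1.5, 4.5, 29)
richness = 2.11 * volumes ** 0.26

c, z = fit_power_law(volumes, richness)
print(f"S = {c:.2f} V^{z:.2f}")
```

Fitting in log-log space treats the exponent as a slope, which is why the panels of Fig. 3 are drawn on logarithmic axes.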
Discussion
The success of the neutral model in explaining the different
taxa-abundance distributions in the detectable diversity of
tree holes, whose sizes vary over three orders of magnitude,
without the need to change any parameters, constitutes the
strongest evidence, so far, that NCMs can usefully describe
community composition. The evidence creates a compelling
case to study carefully the role that random reproduction,
death and immigration play in shaping bacterial community
structure. It suggests that at least some bacterial commu-
nities are dispersal-limited and, therefore, challenges the
perspectives held by some commentators that global dis-
persal of microorganisms prevents them from having bio-
geography (Fenchel, 2003) and that microbial population
sizes are sufficiently large to preclude local stochastic
extinctions (Fenchel & Finlay, 2005). This suggestion could
be tested by directly measuring immigration rates into
similar but differently sized bacterial communities.
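
The interplay of random reproduction, death and immigration can be illustrated with a toy simulation. The sketch below assumes a generic zero-sum neutral scheme in which each death is replaced either by an immigrant from the source community (with probability `m`) or by the offspring of a local individual. It is not the calibrated model used in this paper, and the function name, the 50-taxon source community and all parameter values are hypothetical choices for illustration only.

```python
import numpy as np

def simulate_neutral_community(n_individuals, m, source_p, n_steps, rng):
    """Zero-sum neutral dynamics: one death and one replacement per step."""
    n_taxa = len(source_p)
    # Start from a random sample of the source community.
    community = rng.choice(n_taxa, size=n_individuals, p=source_p)
    for _ in range(n_steps):
        dead = rng.integers(n_individuals)  # index of the individual that dies
        if rng.random() < m:
            # Replacement by an immigrant drawn from the source community.
            community[dead] = rng.choice(n_taxa, p=source_p)
        else:
            # Replacement by the offspring of a randomly chosen local
            # individual (simplification: the parent may be the one that died).
            community[dead] = community[rng.integers(n_individuals)]
    return community

rng = np.random.default_rng(1)
source_p = rng.dirichlet(np.ones(50))  # hypothetical source community, 50 taxa
for n in (100, 1000):
    community = simulate_neutral_community(
        n, m=0.01, source_p=source_p, n_steps=20 * n, rng=rng
    )
    print(n, "individuals:", len(np.unique(community)), "taxa")
```

Only the community size differs between the two runs, mirroring the way the predictions in the paper hold the source community and immigration rate constant.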
Naturally occurring communities of microorganisms are
vital to life on earth and are of profound practical signifi-
cance in agriculture, medicine and engineering. Describing
patterns in microbial communities is, therefore, important
but not as important as explaining why the patterns form.
Thus the model presented here is of particular importance
because rather than fitting an arbitrary mathematical func-
tion to an observed pattern in microbial taxa abundances,
the model calibrated at one scale successfully makes predic-
tions at others. Prediction is rare in both microbial and
macrobial ecology (Harte, 2004) and is a far more convin-
cing test of an ecological theory than fitting predicted to
observed taxa-abundance distributions at a single site.
However, in presenting strong evidence in favor of Hubbell’s
neutral theory, one can court controversy. Indeed, when
these results were presented at the joint Society for General
Microbiology/British Ecological Society meeting on which
this thematic issue is based, several members of the audience
strongly objected. Their grounds were that very many other
models could have produced similar patterns and that
processes such as niche differentiation could possibly
explain the same patterns and better represent the biological
complexity that is believed to exist. This reflects a debate that
has run in the ecological literature for the past five years
(Hubbell, 2006; McGill et al., 2006) where there was an
initial polarization between the niche and neutral perspec-
tives on community assembly. As the debate has matured,
the hostilities have diminished and there is now a degree of
conciliation and a recognition that the two are not mutually
exclusive (Hubbell, 2006) and that chance, dispersal limita-
tion and niche differentiation, or species sorting, all have a
role to play. However, the parsimony of neutral theory still
remains controversial and many seek to provide evidence
that it cannot explain all the variance in real communities of

[Figure 4: both panels plot relative abundance (logarithmic axis) against taxa rank for $\theta = 15$ and $\theta = 25$; panel (a) marks the detection threshold.]
Fig. 4. (a) The ranked relative abundance of taxa in a random sample of
$5 \times 10^{5}$ individuals from two different log-series distributed source
communities; one with parameter $\theta = 15$, the other, more diverse, with
$\theta = 25$. The dashed line is a threshold in relative abundance below which
taxa will not be detected using DGGE. (b) The abundance distribution of the
taxa in the random samples that can be detected by DGGE. The
distribution of common taxa is less sensitive to the value of $\theta$ than that of
rare taxa.

macroorganisms (e.g. McGill et al., 2006). There is no doubt
that the neutral theory will fail to explain all of the variance,
but we would maintain that the success, so far, of the neutral
theory in explaining and, indeed, predicting the majority of
the variance in microbial community composition is such
that the burden of proof lies firmly with those that believe
niche differentiation dominates the community assembly
process. Their route to providing quantitative evidence of
this will require a predictive model based on niche or species
sorting concepts and none currently exist. There are highly
cited examples of models that successfully meld determinis-
tic, niche-based concepts with dispersal in a spatially dis-
tributed environment to demonstrate that a combination of
these factors can promote biodiversity (e.g. Tilman, 1994;
Mouquet & Loreau, 2003). However, these demonstrations
tend to rely on a large number of (often invented) taxon-
specific parameters. This degree of specificity is currently
impossible in microbial ecology. Furthermore, there are no
examples of such models being calibrated at one site or scale
and then subsequently predicting phenomena at another.
The rationale behind these theoretical demonstrations is
commendable, in that ultimately to transfer information
successfully through all scales in the landscapes of micro-
organisms and macroorganisms will require a theory that
incorporates both demographic stochasticity and determi-
nistic factors. However, for the foreseeable future, the
representation of deterministic factors cannot rely on the
experimental definition of a suite of parameters for each
species, many of which cannot even be seen in microbial
communities. Thus some alternative model that encapsu-
lates the deterministic factors is required, perhaps based on
energy concepts (Brown et al., 2004) or maximizing disorder
(Shipley et al., 2006), and this remains an exciting challenge
that transcends all the subdiscipline boundaries that appear
to exist in ecology. For the moment though, it would appear
that neutral dynamics are the best quantitative description
of bacterial community assembly in beech tree holes in
Wytham Woods, Oxfordshire, UK.
References
Bell G (2000) The distribution of abundance in neutral
communities. Am Nat 155: 606–617.
Bell T, Ager D, Song J, Newman JA, Thompson IP, Lilley AK &
van der Gast CJ (2005) Larger islands house more bacterial
taxa. Science 308: 1884.
Belovsky GE, Botkin DB, Crowl TA et al. (2004) Ten suggestions
to strengthen the science of ecology. Bioscience 54: 345–351.
Brown JH, Gillooly JF, Allen AP, Savage VM & West GB (2004)
Toward a metabolic theory of ecology. Ecology 85: 1771–1789.
Chave J, Alonso D & Etienne RS (2006) Theoretical biology –
comparing models of species abundance. Nature 441: E1–E1.
Cocolin L, Bisson LF & Mills DA (2000) Direct profiling of the
yeast dynamics in wine fermentations. FEMS Microbiol Lett
189: 81–87.
Condit R, Pitman N, Leigh EG et al. (2002) Beta-diversity in
tropical forest trees. Science 295: 666–669.
Curtis TP, Sloan WT & Scannell JW (2002) Estimating
prokaryotic diversity and its limits. Proc Natl Acad Sci USA 99:
10494–10499.
Curtis TP, Head I, Lunn M, Sloan WT, Schloss PD & Woodcock S
(2006) What is the extent of microbial diversity. Philos Transac
Roy Soc 361: 2023–2037.
Fenchel T (2003) Biogeography for bacteria. Science 301: 925–926.
Fenchel T & Finlay BJ (2005) Bacteria and Island Biogeography.
Science 309: 1997–1999.
Gotelli NJ & McGill BJ (2006) Null versus neutral models: what’s
the difference? Ecography 29: 793–800.
Harte J (2003) Tail of death and resurrection. Nature 424:
1006–1007.
Harte J (2004) The value of null theories in ecology. Ecology 85:
1792–1794.
He HL & Gaston KJ (2003) Occupancy, spatial variance, and the
abundance of species. Am Nat 162: 366–375.
Heuer H, Wieland G, Schönfeld J, Schönwälder A, Gomes NCM &
Smalla K (2001) Bacterial community profiling using DGGE
or TGGE analysis. Environmental Molecular Microbiology
(Rochelle PA, ed), pp. 177–190. Horizon Scientific Press,
Wymondham, UK.
Hubbell SP (2001) The Unified Neutral Theory of Biodiversity and
Biogeography. Princeton University Press, Princeton, NJ.
Hubbell SP (2006) Neutral theory and the evolution of ecological
equivalence. Ecology 87: 1387–1398.
Jarvis PG & McNaughton KG (1986) Stomatal control of
transpiration – scaling up from leaf to region. Adv Ecol Res 15:
1–49.
Leclerc M, Delgenes JP & Godon JJ (2004) Diversity of the
archaeal community in 44 anaerobic digesters as determined
by single strand conformation polymorphism analysis and 16S
rDNA sequencing. Environ Microbiol 6: 809–819.
Levin SA (1992) The problem of pattern and scale in ecology.
Ecology 73: 1943–1967.
MacArthur RH & Wilson EO (1967) The Theory of Island
Biogeography. Princeton University Press, Princeton.
McGill BJ (2003) A test of the unified neutral theory of
biodiversity. Nature 422: 881–885.
McGill BJ, Maurer BA & Weiser MD (2006) Empirical evaluation
of neutral theory. Ecology 87: 1411–1423.
Mouquet N & Loreau M (2003) Community patterns in source-
sink metacommunities. Am Nat 162: 544–557.
Porter KG & Feig YS (1980) The use of DAPI for identifying and
counting aquatic microflora. Limnol Oceanogr 25: 943–948.
Schoener T (1972) Mathematical ecology and its place among the
sciences. Science 178: 389.
Shipley B, Vile D & Garnier E (2006) From plant traits to plant
communities: a statistical mechanistic approach to
biodiversity. Science 314: 812–814.
Sloan WT, Woodcock S, Lunn M, Head IM, Nee S & Curtis TP
(2006) The roles of immigration and chance in shaping
prokaryote community structure. Environ Microbiol 8: 732–740.
Sloan WT, Woodcock S, Lunn M, Head I & Curtis TP (2007)
Modeling taxa-abundance distributions in microbial
communities using environmental sequence data. Microb Ecol
53: 443–455.
Sogin ML, Morrison HG, Huber JA et al. (2006) Microbial
diversity in the deep sea and the underexplored “rare
biosphere”. Proc Natl Acad Sci USA 103: 12115–12120.
Tilman D (1994) Competition and biodiversity in spatially
structured habitats. Ecology 75: 2–16.
Volkov I, Banavar JR, Hubbell SP & Maritan A (2003) Neutral
theory and relative species abundance in ecology. Nature 424:
1035–1037.
Volkov I, Banavar JR, He FL, Hubbell SP & Maritan A (2006)
Theoretical biology – comparing models of species abundance
– Reply. Nature 441: E1–E2.
Woodcock S, Curtis TP, Head IM, Lunn M & Sloan WT (2006)
Taxa-area relationships for microbes: the unsampled and the
unseen. Ecol Lett 9: 805–812.
Wu JG & Hobbs R (2002) Key issues and research priorities in
landscape ecology: an idiosyncratic synthesis. Landscape Ecol
17: 355–365.
Appendix
Sampling from a log-series taxa abundance distribution
Let $m$ be relative abundance and $S(m)$ be the expected
number of taxa with abundance $m$; then, according to Hubbell’s
model of metacommunity dynamics, $S$ is described by
Fisher’s log-series distribution

$$S(m) = \frac{\theta x^{m}}{m} \tag{A1}$$

where

$$x = 1 - e^{-S_M/\theta} \tag{A2}$$

and $S_M$ is the total number of taxa in the source community.
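
As a numerical sanity check on (A1) and (A2), the sketch below sums the log-series over abundances and compares the totals with the closed forms: the total number of taxa should equal $S_M$, and the expected number of individuals should equal $\theta x/(1-x)$. The value $\theta = 15$ is the calibrated value reported in the figure captions; $S_M = 60$ and the truncation point of the sum are hypothetical choices made only so that the series converges quickly in floating point.

```python
import numpy as np

def log_series_x(theta, s_m):
    """x from Eq. (A2): x = 1 - exp(-S_M / theta)."""
    return 1.0 - np.exp(-s_m / theta)

def expected_taxa(theta, x, m):
    """Eq. (A1): expected number of taxa with abundance m."""
    return theta * x ** m / m

theta = 15.0  # value reported in the figure captions
s_m = 60.0    # hypothetical number of taxa, chosen for illustration only
x = log_series_x(theta, s_m)

m = np.arange(1, 20001)  # truncation point is an assumption; the tail is negligible here
s_of_m = expected_taxa(theta, x, m)

total_taxa = s_of_m.sum()               # should be close to S_M
total_individuals = (m * s_of_m).sum()  # should be close to theta * x / (1 - x)
print(total_taxa, total_individuals, theta * x / (1.0 - x))
```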
It is not immediately obvious how to sample at random
from this distribution to generate realisations of the taxa
abundance distribution in the metacommunity. However, a
straightforward sampling algorithm becomes apparent if we
use an approximation suggested by Volkov et al. (2003).
They noted that as $\theta/S_M \to 0$ the log-series distribution
can be approximated by

$$S(m) = \frac{S_M}{\Gamma(\theta/S_M)\left(\frac{x}{1-x}\right)^{\theta/S_M}}\; e^{-m/\left(\frac{x}{1-x}\right)}\, m^{\theta/S_M - 1} \tag{A3}$$

since $\frac{S_M}{\Gamma(\theta/S_M)\left(\frac{x}{1-x}\right)^{\theta/S_M}} \to \theta$ and $e^{-m/\left(\frac{x}{1-x}\right)} \to e^{m\ln(x)} = x^{m}$.
The advantage of this formulation is that the species
abundance distribution can be obtained by generating
$S_M$ independent realisations of Gamma variables
$m_i \sim \Gamma\left(\theta/S_M, \frac{1-x}{x}\right)$ for $i = 1, \ldots, S_M$, for finite $\theta$ as $\theta/S_M \to 0$.
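As the variables are independent, their joint density
function is simply the product of their individual density
functions

$$f(m_1, \ldots, m_{S_M}) = \frac{1}{\Gamma(\theta/S_M)^{S_M}\left(\frac{x}{1-x}\right)^{\theta}} \times \left[e^{-m_1/\left(\frac{x}{1-x}\right)}\, m_1^{\theta/S_M - 1}\right] \cdots \left[e^{-m_{S_M}/\left(\frac{x}{1-x}\right)}\, m_{S_M}^{\theta/S_M - 1}\right] \tag{A4}$$
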
However, rather than using absolute abundances, which
requires explicit knowledge of the number of individuals
in the metacommunity, we consider the relative abundance,
$p_i$, of each species. Setting $p_i = m_i / \sum_{j=1}^{S_M} m_j$ and $N_M = \sum_{i=1}^{S_M} m_i$,
we note that only $S_M - 1$ of these $p_i$ variables are now
independent.
Therefore, set

$$m_i = N_M\, p_i \quad \text{for } i = 1, \ldots, S_M - 1$$

and

$$m_{S_M} = N_M\left(1 - p_1 - \ldots - p_{S_M - 1}\right).$$

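The joint density function of $p_1, \ldots, p_{S_M}$ is therefore

$$g(p_1, \ldots, p_{S_M - 1}, N_M) = \frac{1}{\Gamma(\theta/S_M)^{S_M}\left(\frac{x}{1-x}\right)^{\theta}} \times \prod_{i=1}^{S_M}\left[e^{-N_M p_i/\left(\frac{x}{1-x}\right)}\left(N_M p_i\right)^{\theta/S_M - 1}\right] \times \left|\det J\right| \tag{A5}$$
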
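where $J$ is the Jacobian, given by

$$\begin{pmatrix}
N_M & 0 & \cdots & 0 & p_1 \\
0 & N_M & \ddots & \vdots & p_2 \\
\vdots & \ddots & \ddots & 0 & \vdots \\
0 & \cdots & 0 & N_M & p_{S_M - 1} \\
-N_M & -N_M & \cdots & -N_M & p_{S_M}
\end{pmatrix} \tag{A6}$$
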
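It can be seen that $\left|\det J\right| = N_M^{S_M - 1}$ and therefore,

$$g(p_1, \ldots, p_{S_M - 1}, N_M) = \left[\frac{e^{-N_M/\left(\frac{x}{1-x}\right)}\, N_M^{\theta - 1}}{\Gamma(\theta)\left(\frac{x}{1-x}\right)^{\theta}}\right] \left[\frac{\Gamma(\theta)\, p_1^{\theta/S_M - 1} \cdots p_{S_M}^{\theta/S_M - 1}}{\Gamma(\theta/S_M)^{S_M}}\right]. \tag{A7}$$
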
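
The determinant of the Jacobian can be checked numerically for a small case. The sketch below builds the matrix for the change of variables from absolute abundances to relative abundances and total size, with partial derivatives $\partial m_i/\partial p_j$ in the first $S_M - 1$ columns and $\partial m_i/\partial N_M$ in the last, then compares the absolute determinant with $N_M^{S_M-1}$. The six-taxon community and the value of `n_m` are arbitrary illustrative assumptions.

```python
import numpy as np

def jacobian(p, n_m):
    """Jacobian of (m_1..m_S) with respect to (p_1..p_{S-1}, N_M)."""
    s = len(p)
    jac = np.zeros((s, s))
    jac[:s - 1, :s - 1] = n_m * np.eye(s - 1)  # d m_i / d p_j = N_M for i == j
    jac[s - 1, :s - 1] = -n_m                  # d m_S / d p_j = -N_M
    jac[:, s - 1] = p                          # d m_i / d N_M = p_i
    return jac

rng = np.random.default_rng(0)
p = rng.dirichlet(np.ones(6))  # arbitrary relative abundances summing to 1
n_m = 7.5                      # arbitrary total size, for illustration only

det = abs(np.linalg.det(jacobian(p, n_m)))
print(det, n_m ** (len(p) - 1))
```

Adding every other row to the last row leaves a single 1 in its final column, which is why the determinant reduces to the product of the $S_M - 1$ diagonal entries $N_M$.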
The first term of this implies that $N_M \sim \Gamma\left(\theta, \frac{1-x}{x}\right)$,
which gives that $E(N_M) = \theta\,\frac{x}{1-x}$, as expected for the
log-series distribution. The second bracket states that
$p_1, \ldots, p_{S_M}$ have a Dirichlet distribution, $p_1, \ldots, p_{S_M} \sim \mathrm{Dir}(\theta/S_M, \ldots, \theta/S_M)$.
Additionally, $p_1, \ldots, p_{S_M}$ are independent of $N_M$. Therefore, the distribution of relative
abundances can be obtained by generating $S_M$
independent realisations of Gamma variables $p_i \sim \Gamma(\theta/S_M, 1)$ for $i = 1, \ldots, S_M$
and then normalizing by $\sum_{i=1}^{S_M} p_i$.
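
The sampling recipe that closes this appendix can be sketched in a few lines: draw independent Gamma variables with shape $\theta/S_M$ and unit scale, then normalize. The second function illustrates the kind of experiment described in the Fig. 4 caption, drawing a random sample of individuals and discarding taxa below a detection threshold. The function names, the number of source taxa `s_m` and the threshold value are hypothetical assumptions, not values taken from the paper; only $\theta = 15$ and the sample size of $5 \times 10^{5}$ come from the captions.

```python
import numpy as np

def sample_source_relative_abundances(theta, s_m, rng):
    """Draw S_M Gamma(theta / S_M, 1) variables and normalize them to sum to 1."""
    gamma_draws = rng.gamma(shape=theta / s_m, scale=1.0, size=s_m)
    return gamma_draws / gamma_draws.sum()

def detectable_ranked_abundances(p, n_sampled, detection_threshold, rng):
    """Sample individuals, rank taxa by relative abundance, keep detectable ones."""
    counts = rng.multinomial(n_sampled, p)
    rel = np.sort(counts / n_sampled)[::-1]  # ranked, most abundant first
    return rel[rel >= detection_threshold]

rng = np.random.default_rng(42)
theta = 15.0                # value reported in the figure captions
s_m = 5000                  # hypothetical number of source taxa
n_sampled = 500_000         # sample size quoted in the Fig. 4 caption
detection_threshold = 0.01  # hypothetical detection limit, for illustration

p = sample_source_relative_abundances(theta, s_m, rng)
detected = detectable_ranked_abundances(p, n_sampled, detection_threshold, rng)
print(len(detected), "taxa above the detection threshold")
```

Normalizing Gamma draws directly, rather than calling a Dirichlet sampler, follows the construction in the text and tolerates individual draws that underflow to zero when the shape parameter is very small.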