Drift sometimes dominates selection, and vice versa: a reply to Clatterbuck, Sober and Lewontin

Robert Brandon • Leonore Fleming

Received: 10 September 2013 / Accepted: 21 February 2014 / Published online: 15 May 2014

© Springer Science+Business Media Dordrecht 2014

Abstract Clatterbuck et al. (Biol Philos 28: 577–592, 2013) argue that there is no

fact of the matter whether selection dominates drift or vice versa in any particular

case of evolution. Their reasons are not empirically based; rather, they are purely

conceptual. We show that their conceptual presuppositions are unmotivated,

unnecessary and overly complex. We also show that their conclusion runs contrary

to current biological practice. The solution is to recognize that evolution involves a

probabilistic sampling process, and that drift is a deviation from probabilistic

expectation. We conclude that conceptually, there are no problems with

distinguishing drift from selection, and empirically, as modern science illustrates, when

drift does occur, there is a quantifiable fact of the matter to be discovered.

Keywords Drift · Selection · Probabilistic sampling · Chance · Causal relevance · Evolution

Introduction

Under what conditions will selection dominate drift and vice versa? Clatterbuck

et al. (2013) address this question and come to a very clear and surprising

conclusion—never. The answer is not based on some new empirical discovery, but

rather on a conceptual analysis that concludes that the question is ill-formed, that the

problem it raises is a pseudoproblem.

R. Brandon (✉)

Department of Philosophy, Duke University, 201 West Duke Building, Box 90743, Durham,

NC 27708, USA

L. Fleming

Department of Philosophy, Utica College, 1600 Burrstone Rd., Utica, NY 13502, USA

Biol Philos (2014) 29:577–585

DOI 10.1007/s10539-014-9437-z

Although their conclusion is clear, how they get there is far from clear. Near the

end of their paper, they summarize their argument:

The relationship of selection to drift resembles the relationship of the

probability a coin has of landing heads and the number of times the coin is

tossed … Suppose you toss the fair coin ten times and the outcome is 40 %

heads. This outcome was due to the fact that the coin was fair and to the fact

that you tossed it ten times. Do not ask which cause was stronger. You can

change the probability of that outcome by changing either the bias of the coin

or the number of times it is tossed (p. 590).

Comparing natural selection to flipping a coin can be quite useful; unfortunately,

Clatterbuck et al. do not make use of the conceptual resources probability theory

offers. This leads to their conclusion that the very question of whether or not drift

dominates selection makes no sense.

Along the way they address a number of questions (which we list, followed by

their answers in parentheses):

Are drift and selection both causes of evolution? (Yes.)

Are drift and selection distinct causes? If so, can we ascertain the relative

strength of drift versus selection by asking what would happen if we “zero-out”

one of the two quantities? (No and, consequently, No, even though much

of the paper is relevant only if they are distinct causes, e.g., the discussion of

asbestos, smoking and lung cancer.)

Is drift a result? (No—it is a process; except when it is not, see p. 587.)

Is it a mistake to say that the same causal set-up can one time yield a result we

should label selection and another time yield drift? (Yes.)

One could be forgiven if one ended up more confused after reading this paper than

before.

We think their argument is both wrong and overly complex. A simple change

in perspective will yield a clear and decisive answer to all of these questions and

will do so in a way that is consistent with and supportive of much of current

biological practice. For instance, Alcaide (2010) ends a brief perspective piece on

Miller et al. (2010) by saying of their study that it “supports the major role of

genetic drift in shaping MHC (major histocompatibility complex) variation in

small and isolated populations. Accumulating evidence suggests that natural

selection is not always strong enough to override genetic drift” (pp. 3843–3844).

And recently, when studying the genetic diversity of innate immunity toll-like

receptor (TLR) genes in a population of robins, Grueber et al. (2013) conclude that

their “results show that genetic drift can be the major determinant of the genetic

makeup of a population, even when natural selection confers a survival advantage

to a heterozygote genotype, such as those associated with the innate immune

system” (p. 4479). Is this conceptual confusion or good science? We think the

latter.

Probabilistic sampling

Take a chance set-up; say a coin and tossing device or a radioactive atom and a

Geiger counter. The chance set-up displays a stable, highly predictable outcome that

can be characterized by a probability measure. The coin, tossed by a human starting

with the head side up, yields heads with Pr = 0.51 (Diaconis et al. 2007). The

carbon-15 atom will decay in a period of 2.449 s with Pr = 0.5. Assuming that

these probabilities are objective stable features of the chance set-up allows us to

draw on the resources of probability theory to explain and predict.

For example, suppose we have a fair coin-tossing set-up [Pr (heads) = 0.5], and

we toss the coin four times. There are five possible outcomes: 4 heads; 3 heads and 1

tail; 2 heads and 2 tails; 1 head and 3 tails; and 4 tails. Using the probability calculus

we can calculate the probability of each of these. The most probable outcome is 2

heads and 2 tails. However, its probability is only 0.375, which means that, more

often than not, we will get a result that deviates from probabilistic expectation. But

with larger samples, say 1,000 tosses, and a binning of results, say getting somewhere

between 490 and 510 heads, the most probable outcome becomes highly probable.
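
The calculation just described can be sketched in a few lines. This is an illustration under stated assumptions rather than anything taken from the works discussed here: it uses only Python's standard library, and the names binomial_pmf and prob_heads_between are our own. It computes the distribution over the five outcomes of four fair tosses and the probability of landing in a chosen bin for 1,000 tosses. The wider bin (450 to 550 heads) is our addition, included to show how much the answer depends on the width of the bin.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k heads in n independent tosses with Pr(heads) = p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_heads_between(lo: int, hi: int, n: int, p: float) -> float:
    """Probability that the number of heads falls in the inclusive bin [lo, hi]."""
    return sum(binomial_pmf(k, n, p) for k in range(lo, hi + 1))


# Distribution over the five possible outcomes of four fair tosses.
four_toss = {k: binomial_pmf(k, 4, 0.5) for k in range(5)}

# Probability of a result other than the expected 2 heads and 2 tails.
prob_deviation = 1 - four_toss[2]

# Binned results for 1,000 fair tosses (the wider bin is a hypothetical addition).
narrow_bin = prob_heads_between(490, 510, 1000, 0.5)
wide_bin = prob_heads_between(450, 550, 1000, 0.5)
```

Here four_toss[2] reproduces the 0.375 quoted above, and comparing narrow_bin with wide_bin shows how quickly the binned outcome gains probability as the bin widens.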

All of this simple probabilistic reasoning transfers over to more complex cases.

Suppose now we have two different coins: coin A, with Pr (heads) = 0.7, and coin B,

with Pr (heads) = 0.3. We toss each coin 4 times. Again we can

calculate the probability distribution over the five possible results for each coin,

based on its characteristic probability. For coin A they are:

4 heads; 0 tails: Pr = 0.2401

3 heads; 1 tail: Pr = 0.4116

2 heads; 2 tails: Pr = 0.2646

1 head; 3 tails: Pr = 0.0756

0 heads; 4 tails: Pr = 0.0081

For coin B the probabilities are reversed in the obvious way. Now we can

compare the two trials both qualitatively and quantitatively. What is the likelihood

that coin A will yield more heads than coin B? We can calculate the answer in a

tedious but conceptually straightforward way. Take all the ways coin A can yield

more heads than coin B, e.g., coin A yields 4 heads and coin B yields 0 heads. That

happens with probability = 0.2401 × 0.2401 (the product of the probabilities of the

two component events) = 0.0576. Do this for all of the nine other mutually exclusive ways

in which coin A outperforms (in terms of yielding heads) coin B. Then add those ten

numbers. The result is the probability that coin A yields more heads than coin B, and

its numerical value is approximately 0.8059. The single most likely way in which coin

A outperforms coin B is for coin A to yield 3 heads and 1 tail while coin

B yields 1 head and 3 tails. The probability of this conjoined event is 0.1694. So the

most likely specific result is not very likely, but when we group together all of the

ways in which A outperforms B, the likelihood of that is quite high. You would see

that in a little more than 8 times in 10.
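
The tedious enumeration described above is easy to hand to a machine. The following is a minimal sketch, assuming Python's standard library only; the helper names binomial_pmf and winning_outcomes are ours and purely illustrative. It lists the ten outcomes in which coin A yields more heads than coin B, multiplies the component probabilities, and adds them up.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k heads in n independent tosses with Pr(heads) = p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def winning_outcomes(n: int, p_a: float, p_b: float) -> dict:
    """Joint probability of every outcome (a, b) in which A yields more heads than B."""
    return {
        (a, b): binomial_pmf(a, n, p_a) * binomial_pmf(b, n, p_b)
        for a in range(n + 1)
        for b in range(a)  # b < a, so coin A outperforms coin B
    }


wins = winning_outcomes(4, 0.7, 0.3)

# The outcomes are mutually exclusive, so their probabilities simply add.
prob_a_beats_b = sum(wins.values())

# The single most probable way for A to outperform B, as (heads for A, heads for B).
best_outcome = max(wins, key=wins.get)
```

The sum comes to roughly 0.806, and best_outcome picks out the case of 3 heads for coin A and 1 head for coin B, in line with the figures given in the text.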

Take a chance set-up and run a number of trials. That is probabilistic sampling.

With probabilistic sampling there will be a well-defined expected result. Actual trials

will yield results that either hit the expected value exactly or deviate from the

expectation. In cases of deviation from expectation the deviation can be big or small

and these qualitative categories can be quantified. All that is part and parcel of the

process of probabilistic sampling. Obviously the process stays the same whether or

not the result is in accord with expectation. This is true whether we are dealing with a

simple trial of one chance set-up or a compound trial with two or more chance set-ups.

We can certainly distinguish trials that yield the expected result

from those that deviate from expectations. But notice that this can only be done post

hoc, and is a results-based (in contrast to a process-based) distinction. Is there

anything else to be said about this distinction? For instance, should we say that trials

that yield results in agreement with expectation are one sort of process (not-chance)

while those that deviate from expectation are another sort (chance)? We doubt

anyone would find this move appealing. Why did one series of four tosses of our fair

coin yield 2 heads and 2 tails? The answer: chance. More fully we can show the

overall probability distribution of possible results and where the given result falls on

this distribution (and, of course, we can quantify its probability). Why did another

series of four tosses yield 0 heads and 4 tails? The answer is the same. Chance.

Natural selection and drift

The analogy between our compound chance set-up described above and natural

selection is obvious. Natural selection is comparative. Selection cannot occur if

there is only one type (genotype, phenotype, allele) in the population. Our two-coin

case is like natural selection in that we have two entities with differing probabilities

of the outcome. Organisms are chance set-ups with respect to reproduction

(Brandon 1978). That is, they have a probability distribution of offspring numbers

(the outcome) that is stable for a given environment. This is the fundamental

assumption of the theory of evolution by natural selection (see Endler 1986,

Fig. 1.4). Is it true? How would we know?

Consider a biased coin. How would we know that it had a stable propensity for

landing on heads? An engineering analysis is possible, but ultimately we would

want to toss it a large number of times. Here’s a true story: one of us (RB) modified

a US quarter to be biased for heads. The modification was based on a simple

engineering analysis that proved to be a good first approximation. The coin was bent

inward on the head side, thus increasing the portion of a 360° rotation which would

result in head side up. A student tossed the coin 1,000 times starting with heads up

and letting the coin land on a carpeted floor. Results were recorded and they were

637 heads and 363 tails. That data is good evidence for positing that the probability

of heads in that set-up is 0.64. With that posit one is able to predict future data,

which was done for a class the following year. A public prediction was made and a

student went home to test it. Her results were in agreement with the prediction to the

second decimal place. This was repeated for a 3rd year with equally impressive

results. So the coin has a stable propensity to land on heads approximately 64 % of

the time. This posited propensity is predictive and it is explanatory after the fact.

That is good (sufficient for us) reason to believe it is real.
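
One rough way to see why 637 heads in 1,000 tosses supports a posit of about 0.64 is to attach a standard error to the estimate. The sketch below is our illustration, not part of the classroom exercise: it assumes independent tosses and uses the ordinary normal-approximation interval, and the function name estimate_propensity is hypothetical.

```python
from math import sqrt


def estimate_propensity(heads: int, tosses: int, z: float = 1.96):
    """Return the observed frequency and an approximate 95% interval around it.

    Assumes independent tosses and a normal approximation to the binomial.
    """
    p_hat = heads / tosses
    std_err = sqrt(p_hat * (1 - p_hat) / tosses)
    return p_hat, (p_hat - z * std_err, p_hat + z * std_err)


# The counts reported for the bent quarter.
p_hat, (low, high) = estimate_propensity(637, 1000)
```

On these assumptions the interval comfortably contains 0.64 and excludes 0.5, which is the sense in which the data favour a biased coin over a fair one.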


Whether or not some part of the world is a chance set-up with respect to some set

of outcomes is a matter to be decided empirically, not a priori. It turns out, for

instance, that radioactive isotopes are chance set-ups with respect to decay. They

have characteristic half-lives that allow for prediction, explanation and ultimately

practical usage, e.g., in medicine. Empirical studies of natural selection, both in the

wild and in the lab, provide ample evidence that organisms are indeed chance set-ups

with respect to reproduction. To give just one major sort of example: every study

showing long term directional selection shows a stable difference in fitness, i.e., the

probability distribution of offspring number for a given type, between types.

Now we can bring in the analogy to the two-coin example. Given a trial, say four

tosses of both coins, we can define the probabilistic expectation and then after the

fact observe whether our result accords with expectation or deviates from it.

This fit, or lack thereof, between result and expectation can be described

qualitatively or quantitatively. There is, without question, a fact of the matter to

be ascertained here. And recall that there is no temptation to describe one sort of

result as due to one sort of process and the other as due to another process. It is one

process—probabilistic sampling. It is all chance.

Just as we can run a trial with our two coins, we can run a trial with two (or more)

organisms in a common environment. Given that they are chance set-ups with

respect to reproduction, there will be well-defined probability distributions of all

possible reproductive outcomes. Thus we can define probabilistic expectations.

Actual results will either accord with, or deviate from, expectations. When they

accord with expectations and the probability distributions differ, we label the result

selection. Drift is when results deviate from expectations (Brandon 2005).

A conceptual problem arises when one takes this results-based distinction and tries

to read into it some difference in process (see, e.g., Millstein 2002, 2005). But just

as in the coin case, there is no difference in process; there is just probabilistic sampling,

which, by its very nature, tends to yield both results that agree with expectations and

ones that deviate from them. Contrary to Clatterbuck et al., drift is not a process. It is,

however, a predictable result of a process, namely probabilistic sampling.
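
The point that one process yields both kinds of result can be illustrated with a toy simulation. What follows is a sketch under assumptions of our own choosing, not a model taken from the works cited: a single generation of a haploid Wright-Fisher population in which one type has relative fitness 1 + s, with hypothetical parameter values (initial frequency 0.5, s = 0.1, 50 offspring) and a fixed random seed. Every run uses the same set-up; only the realized result differs.

```python
import random


def expected_frequency(p: float, s: float) -> float:
    """Expected next-generation frequency of a type with relative fitness 1 + s."""
    return p * (1 + s) / (1 + p * s)


def sample_generation(p: float, s: float, n: int, rng: random.Random) -> float:
    """One run of the chance set-up: draw n offspring and return the realized frequency."""
    p_exp = expected_frequency(p, s)
    count = sum(1 for _ in range(n) if rng.random() < p_exp)
    return count / n


rng = random.Random(2014)   # fixed seed so the sketch is repeatable
p0, s, n = 0.5, 0.1, 50     # hypothetical values chosen for illustration
p_exp = expected_frequency(p0, s)

realized = [sample_generation(p0, s, n, rng) for _ in range(1000)]

# Deviation of each run from probabilistic expectation (drift, on the view defended here).
deviations = [f - p_exp for f in realized]

# Runs in which the fitter type became rarer despite its advantage.
declines = sum(1 for f in realized if f < p0)
```

The deviations scatter around zero, and a sizeable minority of runs end with the fitter type less common than it began, although nothing about the set-up changes from run to run.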

Ns

The product of N, effective population size, and s, the selection coefficient, is a

predictive and explanatory tool in population genetics. When Ns is small, drift

dominates selection; when it is large, selection dominates drift. Or at least that is the

standard view from population genetics. It is just this view that Clatterbuck et al.

argue is wrong. We agree with one point they make—namely that it is arbitrary to

specify some exact value of Ns that marks the transition point between one regime

and the other. But, as they acknowledge, this is not news: “Most biologists now

recognize a gray zone” (p. 581).¹ They also acknowledge that Ns is a perfectly good

predictive tool (p. 589). We presume they would say the same thing about Ns as an

explanatory tool.

¹ Brandon and Nijhout (2006) discuss and visually represent this gray zone in a treatment of the

conditions under which we expect selection to dominate drift and vice versa. See especially Fig. 2, p. 285.

We also agree with their main critical point concerning Ns, but are baffled as to

why they make it. Their primary conclusion is that the value of Ns cannot be

definitive of selection, nor of drift. That is, they think that the “conventional view”

(p. 589) is that the value of Ns settles the question of the relative strength of

selection and drift. But, they argue, this is wrong. Why? Their argument boils down

to this: The relative strength issue needs to be settled by the nature of the “causal

set-up”,² but the value of Ns fails to “deliver the goods” (p. 589), because the same

value of Ns can result in different outcomes.

Our bafflement has two sources. First, we are very surprised to see that

Clatterbuck et al. think that it is the conventional view among evolutionary biologists

that the value of Ns settles the question of whether selection dominates drift or vice

versa. Without doing some sort of sociological survey we hesitate to pronounce on

what most biologists think about this. However, we will point out that such a view is

inconsistent with the idea that drift is a sort of “sampling error” (Roughgarden 1979,

p. 57). It is our impression that the sampling-error view is the conventional one among biologists.

Conventional or not, it has been our contention that such a view of drift is correct. If

drift is sampling error, or deviation from probabilistic expectation, then it is a

result—one that can occur in one run of a chance set-up, and not occur in another run

of the exact same set-up (contrary to Clatterbuck et al.).³

So we think their main critical point is directed against a straw-position. If we are

wrong about that, then the state of conceptual confusion regarding the relation

between selection and drift is worse than we thought. But our second source of

bafflement is more fundamental. They say:

Let the populations run for as many generations as you please, and still, given

enough trials, the fitter allele will increase in some trials and decrease in

others. However, by hypothesis, the causal strengths of drift and selection are

identical in all these populations … The assumption is that the strengths of drift

and selection should be understood in terms of features of the causal set-up,

not in terms of which outcomes happen to ensue (p. 589).

But that assumption is unnecessary and unmotivated. Yes, one needs a certain sort of

causal set-up to get drift, namely a chance set-up. But that sort of set-up doesn’t

guarantee drift; it merely makes it more or less probable. With few exceptions,⁴ drift

is never made necessary by the causal set-up.

² Chance set-ups are, in our view, causal set-ups. We are unsure of whether Clatterbuck et al. mean

something else by this term.

³ One reason one might think that there is a difference in the processes of drift and selection is that the

demographic facts that determine N operate on the entire genome while the ecological facts that

determine s at a particular genetic locus operate (to a first approximation at least) locally. However, if it is

Ns that is the critical parameter for drift, it acts locally just like s. That is, the genome-wide effect of N is

filtered through the selective forces acting at a particular locus to produce the value of Ns that applies at

that locus.

⁴ See the hypothetical case on pp. 332–33 in Brandon and Carson (1996). Also see Figure 1 in Brandon

(2005), which sets out the modalities of selection and drift along a line of all possible probability

distributions.


Their argument seems to be based on the following assumption: same set-up,

same result. This is precisely the assumption one would make in a deterministic

world, and precisely the assumption you shouldn’t make in an indeterministic

world. Are they tacitly presuming that evolutionary drift is a deterministic

phenomenon? That would be bizarre, and without justification. But what then lies

behind their argument?

We will conclude this section by making three brief points about Ns. First, there

is nothing particularly biological about Ns. One can easily derive the analogous

quantity from our two-coin example. Everything else being equal, the larger the

sample size, N, the more likely it is that the actual result will be close to the

expected result. Similarly, everything else being equal, the greater the difference

between the biases of the two coins, s, the more likely the result will be close to the

expectation. To use a common metaphor: if the difference in probabilities is the

signal, and sampling error the noise, then the larger the value of Ns the more likely it

is that the signal will dominate the noise, and vice versa. But notice that the

inference that is justified here is one about the likelihood of certain results. It is a

probabilistic inference, not a deductive one. It can be used to predict and explain

results, not to classify results.
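
The coin analogue of Ns can be made concrete. The sketch below is illustrative only and rests on a parametrisation of our own: the two coins are given Pr(heads) = 0.5 + s/2 and 0.5 − s/2, so that s is the difference between their biases, and the function name prob_signal_wins is hypothetical. It computes the probability that the more heads-prone coin actually yields more heads in N tosses of each.

```python
from math import comb


def binomial_pmf_list(n: int, p: float) -> list:
    """Probabilities of 0..n heads in n tosses with Pr(heads) = p."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


def prob_signal_wins(n: int, s: float) -> float:
    """Pr(the coin favoured by a bias difference s yields more heads in n tosses each)."""
    pmf_a = binomial_pmf_list(n, 0.5 + s / 2)
    pmf_b = binomial_pmf_list(n, 0.5 - s / 2)
    total, below = 0.0, 0.0          # below = Pr(B yields fewer than a heads)
    for a in range(n + 1):
        total += pmf_a[a] * below
        below += pmf_b[a]
    return total


# Hypothetical grid of sample sizes and bias differences.
table = {(n, s): prob_signal_wins(n, s) for n in (4, 20, 100) for s in (0.05, 0.2, 0.4)}
```

Reading across table, the probability rises with either N or s, which is the signal-and-noise point made above. The output is still only a probability for a kind of result, not a classification of any particular run.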

Second, although the Ns criterion provides a useful tool for predicting and

explaining the relative strengths of selection and drift, it is not a perfect tool. In

particular, when s is zero, Ns is zero, but it would be misleading to clump all such

cases together. Even when s is zero, the value of N still matters. Imagine a two-coin

set-up where both coins have the same probability of heads. The expected result is

that they should yield the same number of heads when tossed the same number of

times. But obviously, the larger the number of tosses, the more likely it is that the

expectation will be matched by the result.
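
How N matters when s is zero depends on what counts as matching the expectation. The sketch below assumes one reading, which is ours: the two coins match when their observed head frequencies differ by no more than a fixed tolerance (10 percentage points here, a hypothetical choice). For contrast it also computes the probability of an exact tie in head counts. The function names are illustrative and only Python's standard library is used.

```python
from math import comb


def binomial_pmf_list(n: int, p: float) -> list:
    """Probabilities of 0..n heads in n tosses with Pr(heads) = p."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


def prob_exact_tie(n: int, p: float) -> float:
    """Pr(both coins yield exactly the same number of heads)."""
    return sum(q * q for q in binomial_pmf_list(n, p))


def prob_frequencies_close(n: int, p: float, tol_percent: int) -> float:
    """Pr(the two head frequencies differ by at most tol_percent percentage points)."""
    pmf = binomial_pmf_list(n, p)
    return sum(
        pmf[a] * pmf[b]
        for a in range(n + 1)
        for b in range(n + 1)
        if 100 * abs(a - b) <= tol_percent * n  # integer test avoids rounding issues
    )


# Both coins fair (s = 0); only the number of tosses changes.
close = {n: prob_frequencies_close(n, 0.5, 10) for n in (10, 100, 400)}
ties = {n: prob_exact_tie(n, 0.5) for n in (10, 100, 400)}
```

On the tolerance reading, agreement becomes more probable as the number of tosses grows. An exact tie in head counts becomes less probable, so the claim in the text is best understood in terms of frequencies.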

Our third and final point is that Ns is not the only method to test the relative

strengths of selection and drift. There are various comparative methods in wide

usage.⁵ For example, the basic idea behind the McDonald-Kreitman (MK) test

(McDonald and Kreitman 1991) is to compare the behavior of synonymous

nucleotide substitutions with non-synonymous substitutions in the target species

(within the background of an outgroup sequence from a closely related species).

This method is not without its flaws, and improvements have been made to it since

its introduction (see e.g., Eyre-Walker 2006; Messer and Petrov 2013).⁶ Moreover,

it was introduced to test the hypothesis of directional selection against a null model

of neutral evolution and so is not a test that gives a yes–no answer to the question of

whether selection dominates drift at the target site.⁷ Nonetheless, the fundamental

logic behind the MK approach is sound. Find something that behaves in a drift-like

way and compare the behavior of the object of interest to that.

⁵ There are also some non-comparative methods. For example, Wallace et al. (2013) estimate the strength

of selection on codon usage based on experimental data and the genome sequence in Saccharomyces

cerevisiae. Another (somewhat controversial) example is based on codon “volatility” and applicable to a

single genome (see Plotkin et al. 2004).

⁶ The main difficulties have to do with linkage. For instance, if substitutions in the third position of a

codon are synonymous (and so, to a first approximation, neutral) while mutations in the first two positions

are not, this sets up the comparison at the heart of the MK method. However, obviously the fate of the

third position is not independent of the first two. If directional selection is acting, then the nucleotide that

happens to be in the third position is tugged along with the selected nucleotide combination at positions

one and two. This is called hitchhiking. But what this means is that when directional selection is acting,

the third position is not in fact acting in a way that instantiates the proper null model.
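
The comparative logic can be sketched as a two-by-two table. The example below is a hedged illustration, not the procedure of any study cited here: the counts are hypothetical, the split into fixed differences between species and polymorphisms within the target species follows the usual presentation of the MK test, and Fisher's exact test is our choice of statistic for the sketch. The function names are our own.

```python
from math import comb


def substitution_ratios(dn: int, ds: int, pn: int, ps: int):
    """Non-synonymous to synonymous ratios among fixed differences and polymorphisms."""
    return dn / ds, pn / ps


def fisher_exact_two_sided(dn: int, ds: int, pn: int, ps: int) -> float:
    """Two-sided Fisher exact p-value for the table [[dn, ds], [pn, ps]]."""
    row1, row2, col1 = dn + ds, pn + ps, dn + pn
    total = row1 + row2

    def hypergeom(x: int) -> float:
        # Probability of x non-synonymous fixed differences given the table margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = hypergeom(dn)
    # Sum every table that is no more probable than the observed one.
    return sum(
        hypergeom(x) for x in range(lo, hi + 1) if hypergeom(x) <= p_obs * (1 + 1e-9)
    )


# Hypothetical counts, chosen only to illustrate the comparison.
fixed_ratio, poly_ratio = substitution_ratios(dn=12, ds=20, pn=3, ps=40)
p_value = fisher_exact_two_sided(dn=12, ds=20, pn=3, ps=40)
```

If the non-synonymous sites behaved like the synonymous ones, the two ratios would be expected to agree. A small p_value says only that they differ by more than sampling would readily produce under that null, which fits the caution above about what the test does and does not show.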

The study mentioned in this article’s introduction, Miller et al. (2010), does just

that. It looked at allelic variation in MHC genes in a number of populations of

tuataras on small islands off of New Zealand. There is a reasonable presumption that

MHC alleles are under selection since they mediate pathogen resistance. The pattern

of MHC allelic variation in numerous island populations of tuatara was compared to

that of neutral microsatellite markers.⁸ The concurrence of patterns of variation

between MHC alleles and neutral alleles was taken as evidence of the role of drift in

producing the observed pattern of MHC allelic diversity. The authors concluded that

drift was indeed responsible for some of the observed variation.

In the other study we mentioned in this article’s introduction, Grueber et al.

(2013) investigated genetic diversity at toll-like receptor (TLR) genes in a

re-introduced population of the Stewart Island robin. After determining that there was

evidence of selection using generalized linear mixed effects modeling (GLMM),

they then tested the magnitude of drift versus selection using a variety of Monte

Carlo simulations. They compared their data of TLR genes (specifically, TLR4BE) in

juvenile robins to the distribution of expected proportions of TLR genes from 5,000

Monte Carlo simulation iterations. The authors concluded that drift overwhelmed

selection and that “genetic drift is therefore a significant concern during the

establishment phase of colonialization” (p. 4479).
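
The general shape of a comparison with simulated expectations can be sketched as follows. This is not a reconstruction of the analysis in Grueber et al. (2013). It is a generic Monte Carlo null model under our own assumptions: neutral binomial sampling of a fixed number of gene copies over a few generations, with hypothetical parameter values, a hypothetical observed frequency, and a fixed seed. Only the figure of 5,000 iterations echoes the text.

```python
import random


def simulate_neutral_frequencies(p0, n_copies, generations, iterations, rng):
    """Final allele frequencies from repeated runs of neutral binomial sampling."""
    results = []
    for _ in range(iterations):
        p = p0
        for _ in range(generations):
            count = sum(1 for _ in range(n_copies) if rng.random() < p)
            p = count / n_copies
        results.append(p)
    return results


def empirical_tail_fraction(simulated, observed, p0):
    """Fraction of simulated runs ending at least as far from p0 as the observed value."""
    obs_dev = abs(observed - p0)
    return sum(1 for p in simulated if abs(p - p0) >= obs_dev) / len(simulated)


rng = random.Random(4479)  # fixed seed so the sketch is repeatable
simulated = simulate_neutral_frequencies(
    p0=0.5, n_copies=40, generations=3, iterations=5000, rng=rng
)

observed = 0.70  # hypothetical observed frequency, not taken from any study
tail = empirical_tail_fraction(simulated, observed, p0=0.5)
```

A large value of tail means the observed frequency is the sort of result that neutral sampling alone often produces. A very small value would point to something beyond sampling. Either way the judgment comes from comparing a result with a distribution of simulated results, not from estimating Ns.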

Both of these empirical examples involve comparative methods. In the first case

it is a comparison of a site presumably under selection with one not under selection,

and in the second case a comparison of the site of interest with a computer

simulation that has known statistical properties. In neither case was there a direct

measurement of Ns nor an attempt to estimate its value. The conclusions do not

depend on the value of Ns. Note that this is a methodological/epistemological point.

We are not claiming that the value of Ns is irrelevant to the outcomes; rather, we are

pointing out that one can know that drift has swamped selection in certain cases

without any direct knowledge of the value of Ns.

Conclusions

Clatterbuck et al. argue that there is no fact of the matter whether selection

dominates drift or vice versa in any particular case of evolution. On their view, for example,

Miller et al. (2010) are not just wrong when they claim to have demonstrated that

drift has governed the evolution of some cases of diversification in MHC alleles in

New Zealand tuataras; they have made a conceptual error. But we have shown that

if one thinks of evolution as involving a probabilistic sampling process, and of

drift as deviation from probabilistic expectation, then it follows in

a straightforward way that when drift does occur there is a quantifiable fact of the

matter to be discovered.

⁷ The “alternative” to directional selection is not drift. There are multiple alternatives in addition to drift,

most importantly, stabilizing selection.

⁸ Note that hitchhiking is not a problem for this comparison.

Furthermore, answers to the other questions raised follow in an equally straightforward

manner. Are drift and selection both causes of evolution? Yes. Are they separate or

distinct causes? No, they are both products of the same process, namely probabilistic

sampling. Is drift a result? Yes, because different iterations of exactly the same process

can yield drift in one run and no drift in another. Thus, contrary to what Clatterbuck et al. assume, it is perfectly

sensible to claim that different runs of the same set-up can yield qualitatively different

results. To assume otherwise is to assume that drift is a deterministic phenomenon.

In contrast, to assume that evolution involves genuine chance, that there are

objective probabilities that govern the lives and deaths of organisms (and biological

entities at both higher and lower levels of organization), is to provide a foundation

for the objectivity of drift, selection and their relative strengths. It would be nice,

therefore, if this assumption were correct. We take its great utility to be evidence for

its truth. But our aim here has not been to support that grand conclusion; rather it has

been to demonstrate the virtues of coherence and simplicity that attach to our view.

Acknowledgments We wish to thank the philosophy of biology reading group at Duke University and

an anonymous reviewer for helpful comments. Special thanks go to David McCandlish for help with

some final tweaks.

References

Alcaide M (2010) On the relative roles of selection and genetic drift in shaping MHC variation. Mol Ecol

19:3842–3844

Brandon RN (1978) Adaptation and evolutionary theory. Stud Hist Philos Sci Part A 9(3):181–206

Brandon RN (2005) The difference between selection and drift: a reply to Millstein. Biol Philos 20:153–170

Brandon RN, Carson S (1996) The indeterministic character of evolutionary theory: no “no hidden

variables proof” but no room for determinism either. Philos Sci 63(3):315–337

Brandon RN, Nijhout F (2006) The empirical nonequivalence of genic and genotypic models of selection: a

(decisive) refutation of genic selectionism and pluralistic genic selectionism. Philos Sci 73:277–297

Clatterbuck H, Sober E, Lewontin RC (2013) Selection never dominates drift (nor vice versa). Biol Philos

28:577–592

Diaconis P, Holmes S, Montgomery R (2007) Dynamical bias in the coin toss. SIAM Rev 49(2):211–235

Endler JA (1986) Natural selection in the wild. Princeton University Press, Princeton

Eyre-Walker A (2006) The genomic rate of adaptive evolution. Trends Ecol Evol 21(10):569–575

Grueber CE, Wallis GP, Jamieson IG (2013) Genetic drift outweighs natural selection at toll-like receptor

(TLR) immunity loci in a re-introduced population of a threatened species. Mol Ecol 22:4470–4482

McDonald JH, Kreitman M (1991) Adaptive protein evolution at the Adh locus in Drosophila. Nature

351(6328):652–654

Messer PW, Petrov DA (2013) Frequent adaptation and the McDonald-Kreitman test. Proc Natl Acad Sci

USA 110(21):8615–8620. doi:10.1073/pnas.1220835110

Miller H, Allendorf F, Daugherty C (2010) Genetic diversity and differentiation at MHC genes in island

populations of tuatara (Sphenodon spp.). Mol Ecol 19:3894–3908

Millstein RL (2002) Are random drift and natural selection conceptually distinct? Biol Philos 17:33–53

Millstein RL (2005) Selection versus drift: a response to Brandon’s reply. Biol Philos 20:171–175

Plotkin JB, Dushoff J, Fraser HB (2004) Detecting selection using a single genome sequence of M.

tuberculosis and P. falciparum. Nature 428:942–945

Roughgarden J (1979) Theory of population genetics and evolutionary ecology: an introduction.

Macmillan, New York

Wallace EW, Airoldi EM, Drummond DA (2013) Estimating selection on synonymous codon usage from

noisy experimental data. Mol Biol Evol 30(6):1438–1453

