Can the past predict the future? Experimental tests of historically based population models

PETER B. ADLER*, KERRY M. BYRNE†§ and JAMES LEIKER‡

*Department of Wildland Resources and the Ecology Center, Utah State University, Logan, UT 84322, USA

†Graduate Degree Program in Ecology, Colorado State University, Fort Collins, CO 80523, USA

‡Sternberg Museum of Natural History, Fort Hays State University, Hays, KS 67601, USA

Abstract

A frequently advocated approach for forecasting the population-level impacts of climate change is to project models

based on historical, observational relationships between climate and demographic rates. Despite the potential pitfalls

of this approach, few historically based population models have been experimentally validated. We conducted a

precipitation manipulation experiment to test population models fit to observational data collected from the 1930s to

the 1970s for six prairie forb species. We used the historical population models to predict experimental responses to

the precipitation manipulations, and compared these predictions to ones generated by a statistical model fit directly

to the experimental data. For three species, a sensitivity analysis of the effects of precipitation and grass cover on forb

population growth showed consistent results for the historical population models and the contemporary statistical

models. Furthermore, the historical population models predicted population growth rates in the experimental plots

as well as or better than the statistical models, ignoring variation explained by spatial random effects and local density-dependence. However, for the remaining three species, the sensitivity analyses showed that the historical and statistical models predicted opposite effects of precipitation on population growth, and the historical models were very poor

predictors of experimental responses. For these species, historical observations were not well replicated in space, and

for two of them the historical precipitation-demography correlations were weak. Our results highlight the strengths

and weaknesses of observational and experimental approaches, and increase our confidence in extrapolating histori-

cal relationships to predict population responses to climate change, at least when the historical correlations are strong

and based on well-replicated observations.

Keywords: climate change, competition, demography, ecological forecasting, mixed-grass prairie, plant ecology

Received 19 September 2012; revised version received 28 January 2013 and accepted 4 February 2013

Introduction

A common approach for predicting the population

consequences of climate change relies on long-term

observational data to describe quantitative relation-

ships between climate variables and demographic rates

(Post & Stenseth, 1999; Coulson et al., 2001; Botkin et al.,

2007; Solhoy et al., 2008; Doak & Morris, 2010; Dalgleish

et al., 2011; Luo et al., 2011). Population models contain-

ing these climate-demography relationships can then

be simulated to explore the potential impacts of altered

precipitation or temperature (e.g. Adler et al., 2012).

The key assumption of this approach is that historical

climate-demography correlations will hold under

future conditions. This assumption appears sensible. In

many ecosystems, climatic variation is the primary dri-

ver of interannual variation in population growth rates

(Andrewartha & Birch, 1954) and climate-demography

correlations that are strong enough to emerge in noisy,

observational data should be reliable.

On the other hand, there are good reasons to be sus-

picious of phenomenological models based on histori-

cal correlations. The obvious problem is that correlation

does not imply causation: The true driving variable

may be unmeasured and not linked mechanistically

with climate. As a result, a climate-demography rela-

tionship, which phenomenological models assume to

be stationary, could change under future conditions.

For example, a historically negative correlation between

precipitation and plant performance might reflect high

herbivory in wet years, rather than a direct effect of

water availability. If the relationship between precipita-

tion and herbivore densities changes in the future, then

the relationship between precipitation and plant perfor-

mance will also change. A separate problem could

occur if future conditions fall outside the range of his-

torical climate variation, leading to inaccurate linear

extrapolations of potentially nonlinear processes. Statistical power may also be an issue. Although a two or three decade time series is long by demographic standards (R. Salguero-Gómez, unpublished results), it may not be long enough to reliably estimate climatic influences on population growth, especially given the importance of decadal-scale climatic variation (Biondi et al., 2001; Hessl et al., 2004).

§Current address: Department of Plant Sciences, University of California Davis, Davis, CA 95616, USA

Correspondence: Peter Adler, tel. +435 797 1021, fax +435 797 3796, e-mail: [email protected]

© 2013 Blackwell Publishing Ltd

Global Change Biology (2013) 19, 1793–1803, doi: 10.1111/gcb.12168

Experimental validation could greatly increase our

confidence in predictions made from models fit to

observational data. Because temperature or precipita-

tion manipulations break many of the correlations,

measured and unmeasured, inherent in observational

time series, experimental confirmation of historical

climate-demography relationships would indicate that

those relationships are in fact causal or at least reflect a

tight correlation between the true driver and the

manipulated climate variable. When experimental and

observational results are inconsistent, as in a recent

meta-analysis of temperature effects on phenology

(Wolkovich et al., 2012), the mismatch can help identify

the important underlying mechanisms (Dunne et al.,

2004; Rutishauser et al., 2012).

We conducted a 4-year precipitation manipulation

experiment in a southern mixed prairie to validate pre-

dictions of plant population models built from long-

term observational data. Adler & HilleRisLambers

(2008) used demographic data spanning the period

from 1936 to 1972 to model the survival and recruitment of ten forb species as a function of precipitation,

temperature, conspecific density, and perennial grass

cover. That analysis produced some surprising results,

such as negative rather than positive effects of precipi-

tation on some species, and positive rather than nega-

tive effects of grass cover on forb populations. From

2008 to 2011, at the same location where the long-term

data were collected, we used drought shelters and irri-

gation to alter precipitation. We monitored the densities

and population growth rates of six of the 10 species

analyzed in Adler & HilleRisLambers (2008) (hereafter

A&HRL) to address the following questions: (1) How

consistent are the historical and experimental relation-

ships between precipitation and population growth? (2)

How well can a population model fit to historical,

observational data predict population changes in con-

temporary, experimental plots? (3) What are the

strengths and weaknesses of observational and experi-

mental approaches to ecological forecasting?

We addressed the first two research questions by

comparing sensitivities to covariates (question 1) and

predictions (question 2) from the historical population

models to sensitivities and predictions of statistical

models fit directly to the experimental data. In princi-

ple, it is possible to address our question about the

accuracy of the historical model’s predictions without

comparison with a contemporary, statistical model.

However, given the high proportion of unexplained

variance typical of ecological data sets, interpretation

could be challenging: A weak correlation between pre-

dictions and observations could indicate either a bad

model or an inherently noisy response variable. Predic-

tions of the contemporary statistical model provide a

baseline for comparison. We should consider the histor-

ical population model a failure if its predictions are

much worse than predictions from the contemporary

statistical model, and a success if its predictions are

nearly as good or better, which is possible if the histori-

cal data set contains more information than the

experimental data set about climate-demography rela-

tionships. To our knowledge, this experimental valida-

tion of population models fit to historical observations

is unique, especially given the four-decade interval

between the end of the historical time series and initia-

tion of the experiment.

Materials and methods

Site description

The study site is located two miles west of Hays, Kansas

(38.8°N, 99.3°W) in native southern mixed-grass prairie. Mean

annual precipitation is 580 mm, with 80% falling April

through September. Mean annual temperature is 12 °C. Gradi-

ents in soil type produce distinct plant communities (Albert-

son & Tomanek, 1965), ranging from a shortgrass community

on level uplands to communities dominated by taller bluestem

species on hillslopes and in swales. While the historical data

set includes permanent quadrats in all of these communities,

we conducted the contemporary experiment on the shallow

limestone hillslope sites dominated by the C4 perennial

grasses Schizachyrium scoparium, Andropogon gerardii, and Bout-

eloua curtipendula. This community also hosts a high diversity

of perennial forb species.

Experimental design

Our experiment consisted of three treatments: drought, ambi-

ent precipitation, and irrigation. We replicated these treat-

ments three times in two blocks separated by 0.5 km. Each of

the 18 total plots is 8 m long, oriented with the slope, and 2 m

wide. All plots were protected from moderate intensity sum-

mer livestock grazing by electrical fences erected in 2007.

Before the 2008 growing season, we randomly assigned

each plot to one of the three precipitation treatments. The pur-

pose of the treatments was to create large differences in grow-

ing season precipitation, rather than to simulate a particular

future precipitation scenario. We imposed drought using passive 10 m long × 4 m wide rainfall shelters that intercepted

approximately 50% of incoming rainfall (Adler et al., 2009)

beginning in late March 2008. The pitched roofs of the shelters

were made of 15 cm wide strips of corrugated polycarbonate

with >90% PAR transmittance (Dynaglass brand) which


channelled rainfall into gutters leading away from the plots.

Rain falling between the roofing strips reached the plots.

Water for the irrigation treatment was pumped from a 5680 L

holding tank into a network of soaker hoses (2008 and 2009) or

drip lines (2010 and 2011). We used municipal water low in

nitrates. Each week from May through September we applied

the long-term average weekly precipitation. This ‘ambient

+ normal’ approach ensured a wetter than normal treatment,

even if ambient precipitation was well below normal. We used

precipitation data from a National Climatic Data Center

(NCDC) weather station (HAYS 1S) located approximately

5 km southeast of the field site.

We maintained the drought, ambient, and irrigation treat-

ments from 2008 through the 2010 growing season. For the

2011 growing season, we flipped the drought and irrigation

treatments to manipulate the sequence of precipitation years

and test the impact of lag precipitation on our target forb populations. We split each drought and irrigation plot in half and switched the treatment on one randomly selected half-plot. Thus, we turned half of each drought plot into an irrigation plot, and half of each irrigation plot into a drought plot. The other half of each plot continued to receive the same treatment it had received for the first 3 years. Splitting the plots reduced their area, turning one 8 × 2 m plot into two 3 × 2 m plots, with a 2 m buffer between the halves.

Forb censuses

In each experimental plot, we monitored the density of six of

the ten forb species analyzed by A&HRL (the other four species

analyzed by A&HRL were not common enough in the experi-

mental plots to include in this study): Cirsium undulatum, Echin-

acea angustifolia, Lesquerella ovalifolia, Paronychia jamesii, Psoralea

tenuiflora, and Thelesperma megapotamicum. All six species are

common herbaceous forbs in southern mixed prairie, and espe-

cially on the shallow limestone soils where we conducted the

experiment. We censused population densities in late July of

each year, searching exhaustively by subdividing each plot into 1 m² sections. (Density was also the metric of abundance in the historical data.) In 2010, we recorded the locations of these subsections in anticipation of the 2011 swap of the drought and irrigation treatments. We also monitored the canopy cover of the perennial grasses using visual estimates in ten 50 × 20 cm Daubenmire frames per plot (five 50 × 20 cm Daubenmire frames per swapped half-plot in 2011).

Statistical analysis of experimental results

We began by analyzing changes in the density of each species

in each treatment. For each species, we used a generalized lin-

ear mixed-effects model [function glmmPQL of package

MASS (Venables & Ripley, 1994) in R 2.15.0], assuming a nega-

tive binomial distribution for density, after estimating the dis-

persion parameter for the negative binomial using function

glm.nb (also in package MASS). We modeled forb counts in

each 16 m² plot as a function of the following fixed effects:

treatment, year since initiation of treatments (a continuous

variable), a treatment-by-year interaction, and tallgrass cover

(cover of shortgrass and annual grass species was low in the

experimental plots). We included block and plot as random

effects. We only analyzed years 2008–2010 in this analysis,

excluding 2011 and the treatment swap which reduced plot

sizes and thus reduced counts per plot. Although the density

data provide a convenient description of gross treatment

effects on the forb populations, the analysis is complicated by

temporal autocorrelation, spatial variability in pre-treatment

densities, and problematic distributions.

Our key analysis focuses on log per capita growth rates,

defined as $\ln(N_{i,t+1}/N_{i,t})$, where $N_{i,t}$ is the density of a focal species in plot $i$ at time $t$. This normally distributed response

variable highlights treatment effects on changes in population

size while removing problems of temporal autocorrelation

and spatial variability in densities. Furthermore, we included

data from growing seasons 2008–2009, 2009–2010, and 2010–

2011 all in one analysis, using the full plot densities for the

first two transitions and the corresponding half-plot densities

for the swapped treatments in the final transition.
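The response variable defined above can be illustrated with a minimal sketch. The densities below are invented for illustration, and `log_growth_rates` is a hypothetical helper rather than code from the study. Transitions involving a zero density are skipped, since the log ratio is undefined for them.

```python
import math

def log_growth_rates(densities):
    """Return ln(N[t+1] / N[t]) for each consecutive pair of censuses.

    Transitions where either density is zero are returned as None,
    because the log ratio is undefined for plot-level extinction or
    colonization events.
    """
    rates = []
    for n_now, n_next in zip(densities, densities[1:]):
        if n_now > 0 and n_next > 0:
            rates.append(math.log(n_next / n_now))
        else:
            rates.append(None)
    return rates

# Hypothetical densities (plants per m^2) for one plot, 2008-2011.
example = [2.0, 1.0, 1.0, 0.0]
rates = log_growth_rates(example)
# The first transition halves the density, the second leaves it unchanged,
# and the third is a local extinction, so it is excluded.
```

A negative value indicates a declining population, zero indicates no change, and a positive value indicates growth.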

We used a linear mixed-effects modeling approach (function lmer in package lme4 of R 2.15.0) for each species, with

log per capita growth rate of each species as the response vari-

able, block and plot as the random effects, and conspecific

density, tallgrass cover and the following three treatment-

specific precipitation variables as fixed effects: (1) annual

precipitation in the previous year (October–September), (2)

dormant season (October–March) precipitation in the current

year, and (3) growing season (April–September) precipitation

in the current year. We chose this model structure to match

the compositional and precipitation covariates used in

A&HRL. However, while the A&HRL model included tem-

perature covariates, we did not include them in this statistical

model because they did not improve model fit and led to very

high variance inflation factors (>20); variance inflation factors

for the covariates we did include were <4 in all cases, ruling

out serious multicollinearity. Focusing on precipitation

received in each treatment, rather than on treatment per se,

allowed us to include in one analysis the 2011 swapped treat-

ments with the constant treatments and to directly compare

the experimental results with the historically based models.

Applying standard significance tests to mixed-effects models

requires strong assumptions because the degrees of freedom

are not known. Because our analysis is primarily focused on

model predictions, not hypothesis tests, we took a conserva-

tive approach, and assessed the significance of covariate

effects in these models using the t-statistic. Reasoning that the

degrees of freedom in our models must be larger than a mini-

mum of 12 (ignoring the 3 years observed for each plot, we

have 18 plots and 6 fixed effects to estimate), we set our significance threshold at |t| = 2.1, which corresponds to α = 0.05 for 18 degrees of freedom.

One limitation of the per capita growth rate analysis is that

it includes only non-zero densities and excludes plot-level

extinction and colonization events. We considered separate

analyses of precipitation effects on plot-level colonization and

extinction probabilities, but the sample sizes were too small.

Therefore, we simply report the raw data on extinction and

colonization events.



Comparison of historical and experimental responses

Before explaining how we compared the historical and experi-

mental precipitation responses of our target species, we need

to describe differences between the historical data and models

and the experimental analysis described above. The historical

data come from permanent quadrats in which all individual

plants were mapped annually from the 1930s into the early

1970s. Using an algorithm to track individual plants (Lauen-

roth & Adler, 2008), A&HRL extracted data on forb species’

individual survival and age, and quadrat-level recruitment.

A&HRL then built Bayesian hierarchical models to analyze

survival and recruitment. Survival was a function of spatial

(quadrat) random effects, age-class (two classes: 1 year old

plants and older plants), precipitation and temperature effects

(the three precipitation covariates described above, along with

dormant and growing season temperature), and the density of

conspecifics and the cover of short and tall perennial grasses

within a 10 cm radius of the focal individual. Recruitment, the

number of new individuals of the focal species appearing in a

quadrat in a year, was modeled as a Ricker-type function, with

the number of new recruits depending on the estimated den-

sity of parent plants in the quadrat the previous year and on

the fecundity of those parent plants in each year. The esti-

mated density of parent plants was a latent variable intended

to accommodate uncertainty about the seed bank as well as

the number of plants that might disperse seed to the focal

quadrat or influence seedling establishment in the quadrat.

Fecundity was a function of random quadrat effects and the

same climate and vegetation covariates used in the survival

model. Once the survival and recruitment models were fitted,

they were combined to simulate the population dynamics of

the focal forb species (Adler & HilleRisLambers, 2008; Dalgle-

ish et al., 2010). These models fit the data reasonably well: For

all six species, correlations between observed and predicted

per capita growth rates ranged from 0.57 (for C. undulatum) to

0.78 (for L. ovalifolia).

We used the following procedure to generate predictions

for the experimental plots using the historical model: For each

experimental plot in each year, we used the observed year-

specific temperature covariates, year and treatment-specific

precipitation covariates, and year and plot-specific conspecific

density and tallgrass cover covariates to drive the historically

based survival and recruitment models. Predicted survival

plus predicted recruitment divided by density in the previous

year equals the predicted per capita growth rate for each plot,

which we can compare directly to the experimental observa-

tions. To make these comparisons meaningful, we focused on

variability in population growth rates explained by the climate

and composition covariates, rather than on variation

explained by random effects. For example, because the experi-

mental plots are not in exactly the same locations as the histor-

ical plots (which are twenty to hundreds of meters away), the

historical random quadrat effects cannot improve prediction.

To keep the comparison ‘fair’, we also ignored (averaged

across) the random plot effects fit in the mixed-effects analysis

of the experimental data. Also, the historical survival model is

age-structured, but we had no information about the ages of

the plants in our experiment. Therefore, we used a weighted

average of the age-class survival rates, with weights given by

the proportion of plants in each age class in the historical data

set. Because we assumed that plant age structure was constant

across all experimental plots, age cannot affect the correlation

between observations and the predictions of the historical

model.
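The prediction procedure described above can be sketched as follows. This is only an illustration under stated assumptions: the function names and all numbers are hypothetical, and the growth rate is read here as (predicted survivors + predicted recruits) divided by the previous year's density, which is one plausible interpretation of the text.

```python
def weighted_survival(survival_by_age, age_proportions):
    """Average age-class survival rates, weighted by age-class proportions."""
    total = sum(age_proportions)
    weighted = sum(s * p for s, p in zip(survival_by_age, age_proportions))
    return weighted / total

def predicted_growth_rate(density_prev, survival, recruits):
    """Per capita growth: (survivors + new recruits) / previous density."""
    survivors = survival * density_prev
    return (survivors + recruits) / density_prev

# Hypothetical survival of 1-year-old plants vs. older plants, and the
# hypothetical share of each age class in the historical data set.
s_bar = weighted_survival([0.4, 0.8], [0.25, 0.75])

# Hypothetical plot with 10 plants last year and 2 predicted recruits.
growth = predicted_growth_rate(density_prev=10.0, survival=s_bar, recruits=2.0)
```

Because the same age-class weights are applied to every plot, the weighting shifts all predictions in the same way and cannot by itself create differences among plots.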

A second issue in applying the historical model to the

experimental data involves density-dependence. In the historical analysis, density-dependence was modeled at the neighborhood scale for survival and the 1 m² scale for recruitment, while the experimental analysis incorporates density-dependence at a coarser scale (16 m² in most plots but 6 m² for the switched-treatment plots in 2010–2011). To apply our historical models to the experimental data, we converted our tallgrass canopy cover estimates into basal cover, the currency

used in the historical dataset, by dividing by 2 (our qualitative

results were insensitive to exact value of the conversion fac-

tor). Conspecific density-dependence presented a more diffi-

cult problem. To directly apply the density-dependence

parameters estimated by the historical models, we scaled

plot-level densities of the focal species to the 10 cm radius

neighborhood scale (for survival) and to the 1 m² scale (for recruitment). However, this scaling assumes that the individuals of the target species are distributed uniformly across the

plot. Moreover, A&HRL’s recruitment model used a latent

‘parent plants’ variable to account for propagules arriving

from outside the plot or from the seed bank. We had no way

to link our experimental data to this latent variable, and thus

relied only on observed densities. Because our density esti-

mates from the experimental plots did not perfectly match the

density estimates required by the historical model, we calcu-

lated a second set of predictions in which we simply held den-

sities constant across all experimental plots at the site average

density. We applied the same approach to predictions from

the experimental model, first using observed plot-specific

densities and then conducting a second set of calculations that

averaged across plot-level variation in density.
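The rescaling steps in this paragraph amount to simple arithmetic, sketched below. The helper names and the example counts are hypothetical. The uniform-distribution assumption is the one stated in the text, and the default conversion factor of 2 is the value the authors report using for canopy-to-basal cover.

```python
import math

def density_per_m2(count, plot_area_m2):
    """Plot-level count converted to plants per square metre."""
    return count / plot_area_m2

def expected_neighbors(density_m2, radius_m=0.10):
    """Expected plants inside a circle of the given radius,
    assuming individuals are spread uniformly across the plot."""
    return density_m2 * math.pi * radius_m ** 2

def basal_cover(canopy_cover, conversion=2.0):
    """Convert canopy cover to basal cover by a fixed divisor."""
    return canopy_cover / conversion

# Hypothetical plot: 32 plants counted in a 16 m^2 plot, 40% canopy cover.
d = density_per_m2(32, 16.0)       # plants per m^2, the recruitment scale
nb = expected_neighbors(d)         # plants per 10 cm radius neighborhood
bc = basal_cover(40.0)             # basal cover in percent
```

If plants are clumped rather than uniform, the true neighborhood density experienced by a typical individual would be higher than `nb`, which is why the authors also report predictions with density held constant.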

Our first approach for comparing the historical and experi-

mental predictions focused on the sensitivity of the population

growth rate to the three precipitation covariates and perennial

grass cover. This comparison addresses differences in the

magnitude and direction of covariate effects in the historical

population models and the statistical models fit directly to the

experimental data. For each species, and for both models, we

calculated the change in the predicted per capita growth

caused by a 10% increase in each covariate (leaving the

remaining covariates unperturbed), averaging across all

experimental plot-by-year combinations.
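The sensitivity calculation can be sketched in a few lines. Everything here is illustrative: `linear_model`, its coefficients, and the covariate rows are hypothetical stand-ins for either the historical population model or the contemporary statistical model, not values from the study.

```python
def sensitivity(predict, rows, covariate, increase=0.10):
    """Mean change in predicted growth rate when one covariate is raised
    by `increase` (10% by default) and all others are left unchanged."""
    changes = []
    for row in rows:
        perturbed = dict(row)
        perturbed[covariate] = row[covariate] * (1.0 + increase)
        changes.append(predict(perturbed) - predict(row))
    return sum(changes) / len(changes)

# Hypothetical linear predictor of log per capita growth rate.
COEF = {"lag_ppt": -0.002, "grow_ppt": 0.001, "grass_cover": -0.01}

def linear_model(row):
    return 0.5 + sum(COEF[name] * row[name] for name in COEF)

# Hypothetical plot-by-year covariate combinations.
rows = [
    {"lag_ppt": 600.0, "grow_ppt": 400.0, "grass_cover": 30.0},
    {"lag_ppt": 400.0, "grow_ppt": 500.0, "grass_cover": 20.0},
]

sens = {name: sensitivity(linear_model, rows, name) for name in COEF}
```

Passing the model in as a function means the same routine can be applied to two structurally different models, so their sensitivities are computed over an identical set of plot-by-year combinations.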

Our second approach for comparing the historical and

experimental responses addressed the accuracy of the predic-

tions. Given the challenges of using an existing model to make

predictions about a novel data set with a different structure

than the original data, we evaluated model accuracy by using

correlations between predictions and observations, rather than

a likelihood-based metric that would account for absolute differences between predictions and observations. We calculated the correlation between the observed experimental responses and the predictions of the historical population models, and then calculated the correlation between the experimental responses and predictions made by the statistical models fit directly to the experimental data. The comparison of

these observed-predicted correlations provides an intuitive

way to evaluate the performance of the historical model. We

present results for predictions that include density-depen-

dence and for predictions that ignore density-dependence by

averaging densities across all plots.
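A minimal sketch of this accuracy comparison follows. The `pearson` helper is hand-written for the example, and the observed and predicted growth rates are invented. The point is only to show how the two observed-predicted correlations are obtained and compared.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical log per capita growth rates for four plot-by-year cases.
observed = [-0.5, 0.1, 0.4, -0.2]
pred_historical = [-0.3, 0.0, 0.2, -0.1]   # from the historical model
pred_statistical = [-0.4, 0.2, 0.3, -0.3]  # from the model fit to the experiment

r_hist = pearson(observed, pred_historical)
r_stat = pearson(observed, pred_statistical)

# Adding a constant to every prediction leaves the correlation unchanged,
# which is why this metric ignores absolute differences.
r_shifted = pearson(observed, [p + 1.0 for p in pred_historical])
```

The last line illustrates the trade-off noted in the text: a correlation rewards getting the pattern among plots right but does not penalize a consistent offset between predictions and observations.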

Results

Precipitation manipulations

Ambient precipitation from 2008 to 2011 was 681, 637,

695, and 377 mm, respectively, with 3 years above the

580 mm mean. Assuming 50% interception by the rain-

fall shelters, the drought plots received half of these

totals in 2009–2011 and a little more than half in the

2008 water year as the shelters were not constructed

until March, 2008. The irrigation treatment increased

the annual totals to 1062, 1018, 1075, and 757 mm, all

well above the mean. The experimental manipulations

resulted in sequences of precipitation years outside the

range of variation experienced during the 1936–1972 period of historical data collection (Fig. 1). The combination of high ambient precipitation and irrigation led

to particularly unusual precipitation sequences. The

precipitation treatments led to clear differences in soil

moisture (Fig. S1 in the Supplementary Information).
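The treatment totals reported above can be checked with a short calculation. The ambient and irrigated totals are the values given in the text, and the 50% interception is the authors' stated assumption. The variable names are chosen for this example, and 2008 is left out of the drought calculation because the shelters were not in place for the whole water year.

```python
# Annual precipitation totals (mm) as reported in the text.
ambient = {2008: 681, 2009: 637, 2010: 695, 2011: 377}
irrigated = {2008: 1062, 2009: 1018, 2010: 1075, 2011: 757}

# Assumed fraction of rainfall intercepted by the drought shelters.
INTERCEPTION = 0.5

# Drought totals for the years with shelters in place all year.
drought = {year: ambient[year] * (1 - INTERCEPTION) for year in (2009, 2010, 2011)}

# Water added by irrigation in each year.
added = {year: irrigated[year] - ambient[year] for year in ambient}
```

Under these assumptions the drought plots received roughly 319, 348, and 189 mm in 2009 to 2011, and irrigation added about 380 mm in every year, so even the dry 2011 season stayed above the 580 mm long-term mean in the irrigated plots.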

Experimental effects on forb populations

Fig. 1 Comparison of historical precipitation from 1936 to 1972 and precipitation in our experimental treatments from 2008 to 2011. [x-axis: lag annual precipitation (mm); y-axis: growing season precipitation (mm); symbols: historical observations and the drought, ambient, irrigation, drought→irrigation, and irrigation→drought treatments.]

Fig. 2 Changes in mean densities by treatment for each species. In 2011, drought and irrigation plots were split in half, with one half receiving the same treatment as before and the other half receiving the opposite treatment. Bars show standard errors of the experimental observations. [Panels: C. undulatum, E. angustifolia, L. ovalifolia, Pa. jamesii, Ps. tenuiflora, T. megapotamicum; x-axis: year (2008–2011); y-axis: mean density (m⁻²); symbols: ambient, drought, irrigation, drought→irrigation, and irrigation→drought treatments.]

Although the main effects of treatment on density were not significant for any species (Fig. 2, Table S1), year-by-treatment interactions indicated that our precipitation manipulation caused significant changes in density by 2010 for all species except C. undulatum and Ps. tenuiflora (the treatment swap in 2011 was not included in this analysis). Irrigation decreased the



density of all four responding species (a significant,

negative irrigation-by-year interaction effect; Table

S1). Drought had a positive effect over time on

the density of Pa. jamesii. Tallgrass cover, which

increased with precipitation and with time (Fig. S2),

had no significant effects on forb densities. Livestock

exclusion may have contributed to the increases in

grass cover.

The per capita growth rates of three species showed

significant responses to variation in precipitation

among years and treatments (Fig. 3, Table 1). E. angust-

ifolia, L. ovalifolia, and Pa. jamesii responded negatively

to lag annual precipitation, and the first two species

responded positively to dormant season precipitation.

Growing season precipitation did not have a significant

effect on any species. Tallgrass cover had a significant

effect on one species, T. megapotamicum, and the effect

was negative (Fig. 4, Table 1). Conspecific density-

dependence had significant effects on two species,

C. undulatum and Ps. tenuiflora (Table 1) and in both

cases the effect was negative.

The per capita growth rate analysis excludes coloni-

zation and extinction events. Although the low number

of such events for most species limits statistical analy-

sis, the raw data show more extinctions and fewer colo-

nizations in the irrigation treatment than in the ambient

or drought treatments (Table S2).

Comparison of historical and experimental precipitation effects

We compared the historical and experimental effects of

the precipitation covariates and grass cover by using

both the historical population model and the statistical

model fit directly to the experimental data to calculate

the sensitivity of the mean per capita growth rates to a

10% increase in each of these factors. For C. undulatum,

Ps. tenuiflora, and T. megapotamicum, the historical

population models and the contemporary statistical

models generated consistent results. Sensitivities to

each covariate were always consistent in direction,

though the magnitude of the sensitivities varied

between models (Fig. 4). For E. angustifolia, L. ovalifolia,

and Pa. jamesii, the historical model predicted strong

positive effects of lag precipitation and growing season

precipitation while the contemporary statistical model

showed negative effects of these covariates. Sensitivities

of the two models to dormant season precipitation were

consistent in direction for two of these three species,

and sensitivities to grass cover were consistent in direc-

tion, if variable in magnitude, for all three species.

Fig. 3 Mean (log) per capita growth rate by species and treatment. In 2011, drought and irrigation plots were split in half, with one half receiving the same treatment as before and the other half receiving the opposite treatment. Bars show standard errors of the experimental observations. ‘Year’ refers to the change between the year shown and the previous year. The dotted line indicates a log per capita growth rate of zero; points above this line indicate an increasing population and points below the line indicate a decreasing population. [Panels: C. undulatum, E. angustifolia, L. ovalifolia, Pa. jamesii, Ps. tenuiflora, T. megapotamicum; x-axis: year (2009–2011); y-axis: mean per capita growth rate; symbols: ambient, drought, irrigation, drought→irrigation, and irrigation→drought treatments.]

We evaluated the accuracy of predictions from the historically based population models by comparing correlations between experimental observations and predictions from the historical population models with



correlations between experimental observations and

predictions of the statistical models fit directly to the

experimental data. These comparisons, which ignored

variation explained by quadrat or plot random effects

(Table 2), showed that the historically based models

performed surprisingly well for three species and

poorly for three (Table 2, Fig. 5). We also compared

correlations between models with and without conspe-

cific density-dependence (Table 2). After removing plot

and density effects, the historically based population

model actually performed as well as or better than the

statistical model fit directly to the experimental data for

C. undulatum, Ps. tenuiflora, and T. megapotamicum

(Table 2, Fig. 5). The historical model predicted growth

rates in the ambient precipitation treatments especially

well (black symbols in Fig. 5). For these three species,
correlations between historical predictions and
experimental observations increased from values of 0.39,
0.36, and 0.59, respectively, when considering all treatments,
to values of 0.47, 0.51, and 0.77 when considering
only the ambient treatment plots.

For E. angustifolia, L. ovalifolia, and Pa. jamesii, predic-

tions from the historically based model were either

weakly or negatively correlated with the experimental

observations (Table 2, Fig. 5). However, historically

based predictions for the ambient plots were better,

especially for L. ovalifolia. For these three species,
correlations between historical predictions and experimental
observations increased from values of −0.13, −0.36,
and 0.07, respectively, when considering all treatments,
to correlations of 0.11, 0.49, and 0.22 when we considered
only the ambient treatments.
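
As a rough illustration of the comparison described above, the following Python sketch computes a Pearson correlation between observed and predicted growth rates, first across all plots and then for the ambient plots only. The record layout, the function names, and every number are hypothetical; they are not data from this study, and the original analysis may have been implemented differently.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


def correlation_by_subset(records, treatment=None):
    """Correlate observed and predicted growth rates.

    If `treatment` is given, only records from that treatment are used.
    """
    rows = [r for r in records if treatment is None or r["treatment"] == treatment]
    observed = [r["observed"] for r in rows]
    predicted = [r["predicted"] for r in rows]
    return pearson_r(observed, predicted)


# Hypothetical records (invented for illustration, not study data).
records = [
    {"treatment": "ambient", "observed": 0.1, "predicted": 0.0},
    {"treatment": "ambient", "observed": 0.3, "predicted": 0.2},
    {"treatment": "ambient", "observed": -0.2, "predicted": -0.3},
    {"treatment": "drought", "observed": 0.5, "predicted": -0.4},
    {"treatment": "drought", "observed": -0.6, "predicted": 0.3},
]

r_all = correlation_by_subset(records)
r_ambient = correlation_by_subset(records, treatment="ambient")
print(f"all treatments: {r_all:.2f}; ambient only: {r_ambient:.2f}")
```

In this invented example the ambient-only correlation is higher than the all-treatment correlation, which is the qualitative pattern reported in the text.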

Discussion

Although historically based population models are

often recommended as a promising approach for eco-

logical forecasting (Botkin et al., 2007), experimental

validation is rare. The ability of population models fit

with historical, observational data to successfully

predict the experimental responses of three of our six

study species provides some confidence in using this

approach to forecast climate change impacts. On the

other hand, the historically based models performed

poorly for the remaining species. These successes and

failures offer lessons to guide future research and

ecological forecasting approaches.

How consistent are the historical and experimental relationships between precipitation and population performance?

Our sensitivity analysis showed that experimental

responses to precipitation were consistent with histori-

cal responses in direction, if not magnitude, for three

species, C. undulatum, Ps. tenuiflora, and T. megapotami-

cum (Fig. 4). However, for E. angustifolia, L. ovalifolia,

and Pa. jamesii, the sensitivity analyses revealed large

differences in historical and experimental responses.

For these three species, historical responses to lag

annual and growing season precipitation were weak

Table 1 Linear mixed-effects models for forb per capita growth rates. Shown are estimates of the fixed effects; random effects (block and plot) are not shown. |t-values| > 2.1 are shown in bold as a conservative estimate of statistical significance (α = 0.05 for df = 18)

Coefficient             Value     Standard error    t-value
Cirsium undulatum
  Intercept            −0.3613    0.2892           −1.2491
  Lag ppt               0.0001    0.0004            0.3100
  Growing season ppt    0.0002    0.0006            0.4315
  Dormant season ppt    0.0006    0.0022            0.2935
  Grass Cover           0.0075    0.0061            1.2375
  Lag density          −0.5291    0.1340           −3.9495
Echinacea angustifolia
  Intercept             0.2695    0.2117            1.2734
  Lag ppt              −0.0011    0.0003           −3.2520
  Growing season ppt   −0.0004    0.0003           −1.3190
  Grass Cover          −0.0001    0.0048           −0.0207
  Lag density           0.0148    0.0429            0.3441
Lesquerella ovalifolia
  Intercept             0.9882    0.3641            2.7143
  Lag ppt              −0.0021    0.0005           −3.8475
  Grass Cover           0.0010    0.0067            0.1484
  Lag density          −0.0303    0.0214           −1.4135
Paronychia jamesii
  Intercept             0.5451    0.3586            1.5201
  Lag ppt              −0.0019    0.0008           −2.4278
  Grass Cover           0.0070    0.0094            0.7398
  Lag density          −0.2780    0.1438           −1.9330
Psoralea tenuiflora
  Intercept             0.5021    0.3070            1.6353
  Lag ppt              −0.0004    0.0006           −0.7318
  Dormant season ppt   −0.0014    0.0020           −0.6951
  Grass Cover           0.0071    0.0075            0.9466
  Lag density          −0.8542    0.3001           −2.8458
Thelesperma megapotamicum
  Intercept             1.2401    0.5306            2.3371
  Lag ppt               0.0012    0.0009            1.2401
  Growing season ppt    0.0002    0.0008            0.1852
  Dormant season ppt   −0.0001    0.0030           −0.0223
  Grass Cover          −0.0319    0.0106           −3.0084
  Lag density          −0.2904    0.1527           −1.9021



and positive, while the experimental responses to these

precipitation covariates were strong and negative. The

obvious question is why the historical models matched

experimental responses so well for some species and so

poorly for others. Although we found no obvious

differences in the traits of the two groups of species, we

did notice one dramatic difference in historical

sampling. The three species for which the historical

[Figure 4: six panels, one per species (C. undulatum, E. angustifolia, L. ovalifolia, Pa. jamesii, Ps. tenuiflora, T. megapotamicum). x-axis: LagPPT, GroPPT, DormPPT, Grass; y-axis: Δ growth rate (−0.15 to 0.15). Legend: Experimental, Historical.]

Fig. 4 Sensitivity of population growth rate to covariates included in both the historical and experimental models. The y-axis shows

the change in the predicted (log) population growth rate, averaged across all experimental plots, caused by a 10% increase in the values

of the observed covariates. Parameter labels on the x-axis: ‘LagP’ is annual precipitation in the previous year, ‘GrowP’ is growing

season precipitation, ‘DormP’ is dormant season precipitation, and ‘Grass’ is perennial grass cover.

Table 2 Comparison of historical and experimental predictions of population growth rates

Species                      Full           Fixed          Fixed experimental,   Fixed        Fixed historical,
                             experimental   experimental   constant density      historical   constant density
Cirsium undulatum             0.63           0.58           0.27                  0.02         0.39
Echinacea angustifolia        0.65           0.63           0.62                 −0.23        −0.13
Lesquerella ovalifolia        0.68           0.66           0.64                 −0.40        −0.36
Paronychia jamesii            0.50           0.50           0.43                 −0.28         0.07
Psoralea tenuiflora           0.45           0.45           0.21                  0.07         0.36
Thelesperma megapotamicum     0.64           0.56           0.47                  0.48         0.59

Values are correlation coefficients for the observed population growth rates in the experimental plots against population growth

rates predicted by the following models: The ‘Full experimental’ model is a generalized linear mixed-effects model fit directly to the

experimental data, and includes both fixed and random effects. The ‘Fixed experimental’ model uses predictions from the full

model after averaging across plot random effects. The ‘Fixed experimental, constant density’ model also removes conspecific den-

sity-dependent effects from the predictions by replacing the observed plot-level density with the average density across all plots.

The ‘Fixed historical’ model makes predictions using models fit to long-term observational data, averaging across spatial (plot) ran-

dom effects. For the ‘Fixed historical, constant density’ predictions, we replace the plot-level conspecific density covariate with the

average conspecific density across all plots.



responses failed to match the experimental responses

occurred in only six to eight of the permanent, historical

quadrats located on shallow limestone soils, whereas

the species whose experimental and historical responses

were consistent occurred in 14–44 quadrats, across a

range of soil types. We conducted our experiment on

the shallow soils where all six species occur and initially

worried that models for the widespread species might

be less accurate for that specific soil type. Our results

show the opposite problem; the historical models were

more reliable for the generalist species. For the shallow

soil specialists, low spatial replication may have

decreased the power to detect climate-demography cor-

relations or, conversely, increased the chances of spuri-

ous correlations.
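
The sensitivity measure used in this comparison (Fig. 4) can be sketched in a few lines of Python: raise one covariate by 10% in every plot, recompute the predicted log growth rate, and average the change across plots. The sketch assumes a purely linear predictor and ignores random effects, which may not match the fitted models. The coefficients are the T. megapotamicum fixed effects from Table 1, but the covariate values, their units, and all function and key names are hypothetical.

```python
def predict_log_growth(coefs, covariates):
    """Linear predictor: intercept plus the sum of coefficient * covariate."""
    return coefs["intercept"] + sum(
        coefs[name] * value for name, value in covariates.items()
    )


def sensitivity(coefs, plots, name, increase=0.10):
    """Mean change in predicted log growth rate when covariate `name`
    is raised by the fraction `increase` in every plot."""
    deltas = []
    for covariates in plots:
        perturbed = dict(covariates)
        perturbed[name] = covariates[name] * (1.0 + increase)
        deltas.append(
            predict_log_growth(coefs, perturbed)
            - predict_log_growth(coefs, covariates)
        )
    return sum(deltas) / len(deltas)


# Fixed effects for T. megapotamicum as listed in Table 1.
coefs = {
    "intercept": 1.2401,
    "lag_ppt": 0.0012,
    "grow_ppt": 0.0002,
    "dorm_ppt": -0.0001,
    "grass": -0.0319,
    "lag_density": -0.2904,
}

# Hypothetical plot-level covariates (invented values and units).
plots = [
    {"lag_ppt": 500.0, "grow_ppt": 350.0, "dorm_ppt": 100.0, "grass": 20.0, "lag_density": 1.0},
    {"lag_ppt": 600.0, "grow_ppt": 450.0, "dorm_ppt": 120.0, "grass": 40.0, "lag_density": 2.0},
]

for name in ("lag_ppt", "grow_ppt", "dorm_ppt", "grass"):
    print(name, round(sensitivity(coefs, plots, name), 4))
```

Under the linearity assumption, each sensitivity reduces to the coefficient times 10% of the mean covariate value, so its sign always matches the sign of the coefficient.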

The experiment helped confirm historical relation-

ships that we originally found counter-intuitive.

Because water is a limiting resource for plant growth in

our prairie study site, we were surprised when our his-

torical analysis suggested that some forb populations

increase in dry years and decrease in wet years (Adler

& HilleRisLambers, 2008). Our experimental results

confirmed these responses: Population densities were

lower in the irrigation treatment for four of the six

species and four species responded negatively to lag

annual and growing season precipitation, though the

strength of these negative responses varied.

Competition could explain these negative precipita-

tion effects if rapid growth by the dominant grasses

during wet years limits the amount of light or nutrients

available to the forb species. However, our historical

analysis suggested that this was not the case. Grass

cover had positive effects on the survival and recruit-

ment of many forb species (Adler & HilleRisLambers,

2008). Results from the precipitation experiment con-

firm that grass is not driving the negative effects of pre-

cipitation (or irrigation). Grass cover had no significant

effects on forb density in our experimental plots, and it

significantly affected the per capita growth rate of only

one species, T. megapotamicum. The negative effect of

grass cover on this species was consistent with the
historical effect. For the three species that responded
significantly, and positively, to historical variation in grass
cover (Adler & HilleRisLambers, 2008), the experimental
effect was positive as well, though not statistically
significant (Table 1, Fig. 4).

If competition from grasses is not the mechanism

explaining the negative effects of high precipitation on

some of our forbs, what is? We can only speculate that

high soil moisture promotes disease (Burdon, 1987) or

limits the growth of plants adapted to more aerated soil

conditions (Silvertown et al., 1999; Araya et al., 2011).

How well can a population model fit to historical, observational data predict population changes in contemporary, experimental plots?

For three species, C. undulatum, Ps. tenuiflora, and

T. megapotamicum, the historically based models

generated surprisingly good predictions of population

[Figure 5: six rows of panels, one per species (C. undulatum, E. angustifolia, L. ovalifolia, Pa. jamesii, Ps. tenuiflora, T. megapotamicum), in two columns. x-axes: Experimental predictions (left column), Historical predictions (right column); y-axis: Experimental observations.]

Fig. 5 Comparison of observed (log) per capita growth rates

from the experimental plots with predictions from statistical

models fit directly to the experimental data (left column) and

from simulations of population models fit to historical, observa-

tional data (right column) for all six species (rows). Both sets of

predictions exclude variation explained by random effects and

density-dependence. Lines show the 1 : 1 relationship. Colors

and symbols: ambient treatment = solid black squares, drought

= solid red circles, irrigation = solid blue triangles, drought

switched to irrigation in 2011 = hollow blue circles, irrigation

switched to drought in 2011 = hollow red triangles.



growth rates in our precipitation manipulation experi-

ment. In fact, ignoring quadrat random effects as well as

the effects of local density-dependence, the historically

based population models outperformed statistical mod-

els fit directly to the experimental data for these species.

Two of these species, C. undulatum and Ps. tenuiflora,

had strong historical correlations between precipitation

and vital rates but very weak responses to our experi-

mental precipitation manipulations. Most of the varia-

tion in the experimental responses of these two species

was explained by quadrat and local conspecific density-

dependence. Perhaps we should not be surprised that

35 years of historical, observational data may better

describe climate-vegetation relationships than a rela-

tively short experimental manipulation. For example,

the historically based models include temperature

effects while our experimental analysis did not. How-

ever, holding temperature constant when generating

predictions from the historical model had little effect on

the correlation between observations and predictions.

Thus, the success of the historical model relative to the

contemporary statistical model does not directly reflect

the inclusion of temperature covariates in the former.

The third species whose dynamics were successfully

predicted by the historically based model, T. megapotam-

icum, responded in a strong, consistent way to both his-

torical and experimental precipitation. These three

success stories demonstrate that, for some species, we

may be able to use the past to predict the future.

The failure of the historical model to predict the

experimental responses of the remaining three species

is also instructive. For E. angustifolia and Pa. jamesii, the

explanation seems fairly clear: These species occurred

on a very limited number of historical quadrats, limit-

ing our power to detect climate-demography relation-

ships. Given these weak relationships, we should not

have expected the historical models for these two spe-

cies to perform well. However, the results for L. ovalifo-

lia are harder to explain. Although this species was also

restricted to a small number of historical quadrats, it

responded significantly, and positively, to dormant sea-

son precipitation in both the historical and experimen-

tal analyses. However, its weak positive sensitivity to

historical lag and growing season precipitation contrasts

with strong negative sensitivity to experimental lag and

growing season precipitation (Fig. 4). Moreover, the

historical model includes a very strong, positive effect

of tallgrass cover on L. ovalifolia, in contrast to the

weak, but also positive, experimental response we

observed. The poor predictions of the historical model

reflect the combined effects of very high growing

season precipitation and very high grass cover in the

irrigated plots, conditions that fall outside the historic

range of variability. For the ambient treatment plots,

where precipitation and grass cover fell within the

historical range of variability, predictions from the

historical model were actually quite good (observed-

predicted correlation of 0.49; Fig. 5 black squares).

Experimental artifacts might also play a role. For exam-

ple, our drought shelters did cause subtle decreases in

radiation and day time temperatures, and the regular

intervals of our irrigation treatments led to consistently

high soil water availability. However, it is not clear

why these artifacts would have stronger effects on

L. ovalifolia than on the other species. Our results for

L. ovalifolia offer a cautionary tale about the risks of

using historically based models to make forecasts about

conditions far outside the historical range of variability.
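
One simple safeguard suggested by this result is to check, before projecting a historical model, whether the target conditions lie inside the range of the historical covariates. The Python sketch below flags covariates that fall outside the historical minimum and maximum for each plot. It is only an illustration: the function name, covariate names, and all values are hypothetical, and a univariate range check does not detect novel combinations of covariates.

```python
def flag_extrapolation(historical, plots):
    """For each plot, list the covariates whose values fall outside
    the [min, max] range observed in the historical record."""
    bounds = {name: (min(values), max(values)) for name, values in historical.items()}
    flagged = {}
    for plot_id, covariates in plots.items():
        flagged[plot_id] = [
            name
            for name, value in covariates.items()
            if name in bounds and not (bounds[name][0] <= value <= bounds[name][1])
        ]
    return flagged


# Hypothetical historical covariate values (invented for illustration).
historical = {
    "grow_ppt": [200.0, 350.0, 500.0],
    "grass": [5.0, 30.0, 45.0],
}

# Hypothetical experimental plots (invented for illustration).
plots = {
    "ambient_1": {"grow_ppt": 320.0, "grass": 25.0},
    "irrig_1": {"grow_ppt": 640.0, "grass": 60.0},
}

for plot_id, names in flag_extrapolation(historical, plots).items():
    status = ", ".join(names) if names else "within historical range"
    print(plot_id, "->", status)
```

Predictions for flagged plots would rest on extrapolation and deserve more caution than predictions for plots within the historical range.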

Another lesson from our comparison of historical

and experimental analyses concerns the role of conspe-

cific density-dependence and the spatial scale at which

density-dependent effects are estimated. Even after

scaling experimental densities from the full plot scale to

the smaller scales at which density was incorporated in

the historical analyses, simulations of the historically

based models that incorporated local variation per-

formed poorly for all species except T. megapotamicum

(Table 2). In fact, for many species, the simulated pre-

dictions improved when we ignored local density vari-

ation. One explanation for this result is that our study

species are not distributed randomly, as our scaling of

density assumed. Another alternative is that density-

dependent processes, such as natural enemies, may

vary in strength from year to year, so that the long-term

average of such effects (as in our historical models)

may be a poor predictor of density-dependence during

a short period. Finally, density-dependence may oper-

ate differently in experiments, where small manipu-

lated plots are surrounded by large areas with ambient

densities, than in observational studies, where densities

are likely to be similar in study plots and the surround-

ing matrix. In many cases, ecological forecasts may not

need to include density-dependent interactions, but

when they do, care should be taken to apply histori-

cally based parameter estimates appropriately.
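
The "constant density" predictions in Table 2 replace each plot's conspecific density with the average density across all plots. A minimal Python sketch of that substitution follows. It assumes a linear predictor with made-up coefficients and covariates; the names `predict`, `with_constant_density`, and `lag_density` are hypothetical and do not come from the original analysis.

```python
def predict(coefs, covariates):
    """Linear predictor: intercept plus the sum of coefficient * covariate."""
    return coefs["intercept"] + sum(coefs[name] * covariates[name] for name in covariates)


def with_constant_density(plots, density_key="lag_density"):
    """Return copies of the plots with density set to the across-plot mean."""
    mean_density = sum(p[density_key] for p in plots) / len(plots)
    return [{**p, density_key: mean_density} for p in plots]


# Hypothetical coefficients and plot covariates (invented for illustration).
coefs = {"intercept": 0.5, "grass": 0.007, "lag_density": -0.85}
plots = [
    {"grass": 20.0, "lag_density": 0.5},
    {"grass": 30.0, "lag_density": 1.5},
]

# Predictions using observed plot-level density.
preds_observed = [predict(coefs, p) for p in plots]
# Predictions after removing local density variation.
preds_constant = [predict(coefs, p) for p in with_constant_density(plots)]

print("observed density:", [round(v, 3) for v in preds_observed])
print("constant density:", [round(v, 3) for v in preds_constant])
```

With density held constant, differences among predictions reflect only the remaining covariates, which is why this version isolates the climate and grass cover effects.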

Observations vs. experiments

Our analysis highlights the strengths and weaknesses

of forecasting approaches based on either long-term

observational data or manipulative experiments. The

primary advantage of the long-term observational

approach is its power for detecting relationships

between demographic responses and many potentially

interacting climate variables and biotic covariates. In

contrast, experiments can only manipulate a limited

number of variables and often those manipulations

introduce unrealistic artifacts. On the other hand, the



great disadvantage of the observational approach is its

reliance on extrapolation to address conditions outside

the historical range of variability, the dangers of which

are illustrated by the poor predictions of our historical

model for L. ovalifolia in the irrigated treatment. The

obvious advantage of the experimental approach is its

ability to isolate and test individual mechanisms. How-

ever, strong inference about underlying mechanisms is

not essential for successful prediction; empirical models

based on pattern-recognition and machine learning

often generate better predictions than mechanistic

models (Breiman, 2001). Where an understanding of

mechanism may be critical is for predicting nonlinear

responses to conditions far outside the historical range

of variability. The most powerful strategy is to combine

both approaches, relying on long-term observational

data to detect climate-demography relationships and

experiments to test underlying mechanisms and explore

changes in these relationships under novel conditions.

Our results increase our confidence in extrapolating

historical, observational relationships to predict popu-

lation responses to climate change, especially when

those historical correlations have strong statistical sup-

port from a spatially well-replicated population. On the

other hand, the failure of the historical models to pre-

dict the experimental response of L. ovalifolia suggests

caution in applying this approach, especially for condi-

tions far outside the historic range of variability. Our

study demonstrates the power of combining long-term

observational data sets with contemporary experi-

ments. Although we cannot hope to apply this inte-

grated approach to all systems, an expanding collection

of case studies would help establish the general poten-

tial for basing ecological forecasts on projections from

historically based models.

Acknowledgements

We thank the Department of Biology, Fort Hays State University, for access to the field site and for use of laboratory space. Janneke HilleRisLambers and two anonymous reviewers made suggestions that greatly improved earlier versions of the manuscript. The research was supported by grants to PBA from NSF (DEB-1054040), Utah State University, and the Utah Agriculture Experiment Station (UAES), Utah State University, and is approved as UAES journal paper number 8472. KMB was supported by the Shortgrass Steppe Long Term Ecological Research (SGS LTER) site (NSF DEB 1027319) and by The Nature Conservancy Nebraska Chapter’s J.E. Weaver Competitive Grants Program.

References

Adler PB, HilleRisLambers J (2008) The influence of climate and species composition

on the population dynamics of 10 prairie forbs. Ecology, 89, 3049–3060.

Adler PB, Leiker J, Levine JM (2009) Direct and indirect effects of climate change on a

prairie plant community. PLoS ONE, 4, e6887.

Adler PB, Dalgleish HJ, Ellner SP (2012) Forecasting plant community impacts of

climate variability and change: when do competitive interactions matter? Journal of

Ecology, 100, 478–487.

Albertson FW, Tomanek GW (1965) Vegetation changes during a 30-year period in

grassland communities near Hays, Kansas. Ecology, 46, 714–720.

Andrewartha HG, Birch LC (1954) The Distribution and Abundance of Animals. Univer-

sity of Chicago Press, Chicago, IL.

Araya YN, Silvertown J, Gowing DJ, McConway KJ, Linder HP, Midgley G (2011)

A fundamental, eco-physiological basis for niche segregation in plant communi-

ties. New Phytologist, 189, 253–258.

Biondi F, Gershunov A, Cayan DR (2001) North Pacific decadal climate variability

since 1661. Journal of Climate, 14, 5–10.

Botkin D, Saxe H, Araujo M et al. (2007) Forecasting the effects of global warming on

biodiversity. BioScience, 57, 227–236.

Breiman L (2001) Statistical modeling: the two cultures (with comments and a rejoin-

der by the author). Statistical Science, 16, 199–231.

Burdon JJ (1987) Diseases and Plant Population Biology. Cambridge University Press,

Cambridge.

Coulson T, Catchpole E, Albon S et al. (2001) Age, sex, density, winter weather, and

population crashes in Soay sheep. Science, 292, 1528–1531.

Dalgleish HJ, Koons DN, Adler PB (2010) Can life-history traits predict the response of

forb populations to changes in climate variability? Journal of Ecology, 98, 209–217.

Dalgleish HJ, Koons DN, Hooten MB, Moffet CA, Adler PB (2011) Climate influences

the demography of three dominant sagebrush steppe plants. Ecology, 92, 75–85.

Doak DF, Morris WF (2010) Demographic compensation and tipping points in

climate-induced range shifts. Nature, 467, 959–962.

Dunne JA, Saleska SR, Fischer ML, Harte J (2004) Integrating experimental and

gradient methods in ecological climate change research. Ecology, 85, 904–916.

Hessl AE, McKenzie D, Schellhaas R (2004) Drought and pacific decadal oscillation

linked to fire occurrence in the inland Pacific Northwest. Ecological Applications, 14,

425–442.

Lauenroth WK, Adler PB (2008) Demography of perennial grassland plants: survival,

life expectancy and life span. Journal of Ecology, 96, 1023–1032.

Luo Y, Ogle K, Tucker C et al. (2011) Ecological forecasting and data assimilation in a

data-rich era. Ecological Applications, 21, 1429–1442.

Post E, Stenseth N (1999) Climatic variability, plant phenology, and northern ungu-

lates. Ecology, 80, 1322–1339.

Rutishauser T, Stöckli R, Harte J, Kueppers L (2012) Climate change: flowering in the

greenhouse. Nature, 485, 448–449.

Silvertown J, Dodd M, Gowing D, Mountford J (1999) Hydrologically defined niches

reveal a basis for species richness in plant communities. Nature, 400, 61–63.

Solhoy T, Stenseth N, Kausrud K et al. (2008) Linking climate change to lemming

cycles. Nature, 456, 93–U3.

Venables WN, Ripley BD (1994) Modern Applied Statistics with S-Plus. Springer-Verlag,

New York.

Wolkovich EM, Cook BI, Allen JM et al. (2012) Warming experiments underpredict

plant phenological responses to climate change. Nature, 485, 494–497.

Supporting Information

Additional Supporting Information may be found in the online version of this article:

Figure S1. Mean daily volumetric soil water content in the top 5 cm of the soil profile during the 2008 (a), 2009 (b), 2010 (c), and 2011 (d) growing seasons for each treatment (n = 5 for drought, n = 2 for ambient, and n = 6 for irrigation in 2008–2010, and n = 2 for each treatment in 2011). Soil water was measured every 4 h and averaged to one daily value.

Figure S2. Tallgrass cover by treatment.

Table S1. Models for forb density. Shown are estimates of the fixed effects from generalized linear mixed-effects models that assume a negative binomial distribution for density.

Table S2. Colonization and extinction events by species and treatment. Numbers indicate the number of colonization/extinction events. “N” is the total number of observations for each species in each treatment (number of plots multiplied by the number of year-to-year transitions observed).


