Introduction to statistical estimation methods

Finse Alpine Research Center, 10-11 September 2010

OUTLINE

TODAY: Mostly maximum likelihood

Focus of the course:
• Introduce essential methods for statistical modelling in ecology
• Construction of biologically sound models
• Estimation of parameter values and associated uncertainties
• Interpretation of results
• Introduce concepts that are important for the course next week

TOMORROW: Mostly Bayesian statistics

SUNDAY: Day off / Glacier hike

MONDAY TO FRIDAY: Occupancy modelling workshop (3 new lecturers – joining on the glacier hike)

– Some lectures
– Many exercises
– Tutoring
– Help each other
– Ask questions
– Be active!

MATERIAL ON: http://www.finse.uio.no/

Most studies in ecology require quantification in some way:

Quantification of …
… relationships between variables
… differences between groups of individuals
… the effect of experimental treatments
… predictions for the future (effects of climate change)
… the effect of management strategies
and not least: Quantification of uncertainty!

Quantification of anything requires:
… some sort of model
… ways to estimate parameters / distributions of random variables

Claim:
• In ecology, the main question is seldom IF something has an effect
• The questions are more about HOW and HOW MUCH

[Figure: diagram of factors affecting energy expenditure (field metabolic rate): body mass, HABITAT, season, sex, reproductive state, temperature, weather, activity / behaviour, and OTHER THINGS (biological things + measurement error).]

Example: How does habitat quality affect energy expenditure?

The question should not be IF these variables have an effect – from biological theory we can be almost certain that all these variables have an effect.

Relationships in ecology are almost infinitely complex (there is no true model)

“All models are wrong, but some are useful” (Box)

“Typical approach”:

1. Put everything into a linear model (multiple regression)

2. Remove non-significant effects

3. Report p-values

Without thinking about HOW the various predictor variables can affect the response variables

Without thinking about what you are really interested in

Without quantifying HOW MUCH the predictor variables affect the response variable, and without thinking about BIOLOGICAL SIGNIFICANCE

Statistical significance vs. biological relevance: 5 different confidence intervals

[Figure: five confidence intervals for effect size (axis from small to large), labelled p<0.001, NS, p<0.05, NS, p<0.05, and classified as "biologically significant", "not biologically significant", or "could be important – more data needed".]

Null-hypothesis tests are often used erroneously to make a classification of “no effect” (not significant) and “significant effect” with no consideration of the potential biological significance (a somewhat thoughtless process).

E.g. statements like “Predator density did not affect prey survival” with no further detail on effect size.
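One way to make this distinction concrete is to compare a confidence interval for an effect size with the smallest effect that would matter biologically. The following is only a minimal sketch and not a method taken from these slides: the function name `classify_interval`, the threshold value and all interval values are made up for illustration, and effects are assumed to be positive in the direction of interest.

```python
def classify_interval(lower, upper, threshold):
    """Classify a confidence interval (lower, upper) for an effect size.

    threshold is a hypothetical smallest effect regarded as biologically
    important; it must come from biological reasoning, not from the data.
    Returns (statistically_significant, biological_label).
    """
    # Statistically significant if the interval excludes zero.
    significant = lower > 0 or upper < 0
    if lower >= threshold:
        # The whole interval lies at or above the threshold.
        label = "biologically significant"
    elif upper < threshold:
        # The whole interval lies below the threshold.
        label = "not biologically significant"
    else:
        # The interval straddles the threshold.
        label = "could be important - more data needed"
    return significant, label


# Made-up intervals (illustrative only), judged against threshold = 1.0.
examples = {
    "narrow, far from zero": (1.5, 2.5),
    "narrow, close to zero": (0.1, 0.4),
    "wide, includes zero": (-0.5, 2.0),
}
for name, (lo, hi) in examples.items():
    print(name, classify_interval(lo, hi, threshold=1.0))
```

In this sketch the second interval is statistically significant but biologically negligible, while the third is not significant yet cannot rule out a large effect. A bare "significant / not significant" label hides both cases.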

[Figure: number of papers questioning the utility of null hypothesis testing in scientific research, by decade (1940s to 1990s), for Ecology, Bus./Econ., Statistics, Medicine, Social Sci. and All; vertical axis 0 to 180. Source: Anderson et al. 2000.]

Null-hypothesis testing in ecological science:

Yoccoz, N. G. 1991. Use, overuse, and misuse of significance tests in evolutionary biology and ecology. Bulletin of the Ecological Society of America 72:106-111.

Anderson, D. R., K. P. Burnham, and W. L. Thompson. 2000. Null hypothesis testing: Problems, prevalence, and an alternative. Journal of Wildlife Management 64:912-923.

Web-page:

http://www.cnr.colostate.edu/~anderson/

$$\mathrm{AIC}_c = -2\ln\bigl(\mathcal{L}(\hat{\theta})\bigr) + 2K + \frac{2K(K+1)}{n-K-1}$$

= deviance + 2 × no. parameters + small sample correction

In a set of models, the model with the lowest AIC will, on average, be the model with the lowest K-L distance (i.e., give predictions closest to the truth).
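As a rough illustration of how such a comparison could be computed, the sketch below evaluates AICc as deviance + 2K + small sample correction and ranks a few candidate models. The function name `aicc`, the model names, the log-likelihood values and the sample size are all hypothetical, not results from the course material.

```python
def aicc(log_likelihood, k, n):
    """AICc from the maximised log-likelihood, K parameters and n observations."""
    if n - k - 1 <= 0:
        raise ValueError("AICc needs n > K + 1")
    deviance = -2.0 * log_likelihood              # -2 ln(L) at the estimates
    correction = 2.0 * k * (k + 1) / (n - k - 1)  # small sample correction
    return deviance + 2.0 * k + correction


# Hypothetical candidate models: name -> (maximised log-likelihood, K).
candidates = {
    "model_A": (-100.0, 3),
    "model_B": (-98.5, 5),
    "model_C": (-104.0, 2),
}
n = 60  # hypothetical sample size

scores = {name: aicc(ll, k, n) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)  # lowest AICc is preferred

for name in sorted(scores, key=scores.get):
    print(f"{name}: AICc = {scores[name]:.2f}")
print("lowest AICc:", best)
```

With these made-up numbers, model_B has the highest likelihood but its two extra parameters cost more than the fit they buy, so model_A has the lowest AICc.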

Using p-values for model selection is a different thing

[Figure: bias² and variance (uncertainty) plotted against the number of parameters (K); frequency distributions of predictions relative to the truth for a too simple model and a too complex model.]

Example: Influence of testosterone on size of home-range in voles.

[Figure: study design with 60 sites…; legend: testosterone-treated male, control male.]

Do testosterone-treated males have larger home-ranges at high densities? What are the effects at low densities?

Response variable: Home range size measured by radio-telemetry

Predictor variables: Treatment, Density, Body mass

Think about HOW things are related

Full model:

                          F Value    Pr(F)
Body mass                   15.79   <0.001
Treatment                    0.99     0.32
Density                    104.90   <0.001
Body mass × Treatment       0.003     0.96
Density × Treatment          2.66     0.11

Step 1:

                          F Value    Pr(F)
Body mass                   15.95   <0.001
Treatment                    1.00     0.32
Density                    105.95   <0.001
Density × Treatment          2.63     0.11

Step 2:

                          F Value    Pr(F)
Body mass                   15.68   <0.001
Treatment                    0.99     0.32
Density                    104.18   <0.001

Step 3:

                          F Value    Pr(F)
Body mass                   15.06   <0.001
Density                     96.15   <0.001

Conclusion:

There is no significant effect of ‘Treatment’ (p = 0.32) or a ‘Density × Treatment’ interaction (p = 0.11).

Response variable: home range size

[Figure: home range size against density, and log(home range size) against log(density).]

$D < c$: $y = \text{constant}$

$D \ge c$: $y = c\,(D - c)^{b}$

On the log scale:

$D < c$: $\log(y) = \text{constant}$

$D \ge c$: $\log(y) = \log(c) + b\,\log(D - c)$

[Figure: home range size against body mass (M), and log(home range size) against log(body mass).]

$y = aM^{b}$

$\log(y) = \log(a) + b\,\log(M)$
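Taking logarithms turns the power law y = aM^b into a straight line with slope b and intercept log(a), so a and b can be estimated by ordinary least squares on the log scale. The sketch below illustrates this under assumptions of my own: the function name `fit_power_law` and the data are hypothetical, the data are generated without noise from a = 2 and b = 0.75, and fitting on the log scale implicitly assumes multiplicative errors.

```python
import math


def fit_power_law(masses, sizes):
    """Least-squares fit of log(y) = log(a) + b*log(M); returns (a, b)."""
    xs = [math.log(m) for m in masses]
    ys = [math.log(s) for s in sizes]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Slope = covariance of (x, y) divided by variance of x.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    b = sxy / sxx
    log_a = y_mean - b * x_mean  # the intercept estimates log(a)
    return math.exp(log_a), b


# Made-up, noise-free data generated from y = 2 * M**0.75.
masses = [20.0, 25.0, 30.0, 40.0, 50.0]
sizes = [2.0 * m ** 0.75 for m in masses]

a_hat, b_hat = fit_power_law(masses, sizes)
print(f"a = {a_hat:.3f}, b = {b_hat:.3f}")
```

Note that the intercept of the fitted line has to be back-transformed with exp() to recover a.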

Candidate models

[Figure: three panels of log(home range size) against log(population density), one per candidate model: Treatment + Treatment × Density; Treatment + Density; Treatment × Density.]
