Introduction to Statistical Estimation Methods, Finse Alpine Research Center, 10-11 September 2010
Post on 21-Dec-2015
OUTLINE
TODAY: Mostly maximum likelihood
Focus of the course:
• Introduce essential methods for statistical modelling in ecology
• Construction of biologically sound models
• Estimation of parameter values and associated uncertainties
• Interpretation of results
• Introduce concepts that are important for the course next week
TOMORROW: Mostly Bayesian statistics
SUNDAY: Day off / Glacier hike
MONDAY TO FRIDAY: Occupancy modelling workshop (3 new lecturers – joining on the glacier hike)
– Some lectures – Many exercises – Tutoring – Help each other – Ask questions – Be active!
MATERIAL ON: http://www.finse.uio.no/
Most studies in ecology require quantification in some way. Quantification of …
… relationships between variables
… differences between groups of individuals
… the effect of experimental treatments
… predictions for the future (effects of climate change)
… the effect of management strategies
and not least: quantification of uncertainty!
Quantification of anything requires:
… some sort of model
… ways to estimate parameters / distributions of random variables
Claim:
• In ecology, the main question is seldom IF something has an effect
• The questions are more about HOW and HOW MUCH
Example: How does habitat quality affect energy expenditure?
[Diagram: possible influences on energy expenditure (field metabolic rate): body mass, HABITAT, season, sex, reproductive state, temperature, weather, activity / behaviour, and OTHER THINGS (biological things + measurement error)]
The question should not be IF these variables have an effect – from biological theory we can be almost certain that all these variables have an effect.
Relationships in ecology are almost infinitely complex (there is no true model)
“All models are wrong, but some are useful” (Box)
“Typical approach”:
1. Put everything into a linear model (multiple regression)
2. Remove non-significant effects
3. Report p-values
Without thinking about HOW the various predictor variables can affect the response variable
Without thinking about what you are really interested in
Without quantifying HOW MUCH the predictor variables affect the response variable, and without thinking about BIOLOGICAL SIGNIFICANCE
Statistical significance vs. biological relevance: 5 different confidence intervals
[Figure: five confidence intervals for effect size (axis from small to large), labelled p<0.001, NS, p<0.05, NS and p<0.05, with annotations "Biologically significant", "Not biologically significant" and "Could be important, more data needed"]
Null-hypothesis tests are often used erroneously to make a classification of “no effect” (not significant) and “significant effect” with no consideration of the potential biological significance (a somewhat thoughtless process).
E.g. statements like “Predator density did not affect prey survival” with no further detail on effect size.
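The distinction between statistical significance and biological relevance can be sketched in code. This is a minimal illustration, not a method from the course: the function name `classify_effect`, the `threshold` parameter (a hypothetical smallest effect that would matter biologically) and the five intervals are all assumptions invented for the example.

```python
def classify_effect(ci_low, ci_high, threshold):
    """Classify a confidence interval for an effect size.

    threshold is a hypothetical smallest absolute effect that would
    matter biologically; it has to come from subject knowledge,
    not from the data.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    if ci_low > ci_high:
        raise ValueError("ci_low must not exceed ci_high")

    # Statistically significant: the interval excludes zero.
    significant = ci_low > 0 or ci_high < 0

    if ci_low >= threshold or ci_high <= -threshold:
        # The whole interval lies beyond the relevance threshold.
        relevance = "biologically significant"
    elif -threshold < ci_low and ci_high < threshold:
        # The whole interval lies inside the band of negligible effects.
        relevance = "not biologically significant"
    else:
        # The interval overlaps the threshold: the data cannot decide.
        relevance = "could be important, more data needed"
    return significant, relevance


# Invented intervals, for illustration only.
intervals = {
    "A": (0.8, 1.6),
    "B": (-0.1, 0.2),
    "C": (0.05, 0.3),
    "D": (-0.4, 1.5),
    "E": (0.2, 1.4),
}
for name, (low, high) in intervals.items():
    print(name, classify_effect(low, high, threshold=0.5))
```

Intervals C and D show why a p-value alone is not enough: C is statistically significant but too small to matter under this threshold, while D is not significant yet cannot rule out a large effect.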
[Figure: Number of papers questioning the utility of null hypothesis testing in scientific research, by decade (1940s to 1990s), for Ecology, Bus./Econ., Statistics, Medicine, Social Sci. and All. Source: Anderson et al. 2000]
Null-hypothesis testing in ecological science:
Yoccoz, N. G. 1991. Use, overuse, and misuse of significance tests in evolutionary biology and ecology. Bulletin of the Ecological Society of America 72:106-111.
Anderson, D. R., K. P. Burnham, and W. L. Thompson. 2000. Null hypothesis testing: Problems, prevalence, and an alternative. Journal of Wildlife Management 64:912-923.
Web-page:
http://www.cnr.colostate.edu/~anderson/
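$$\mathrm{AIC_c} = -2\ln\big(\mathcal{L}(\hat{\theta})\big) + 2K + \frac{2K(K+1)}{n-K-1}$$
= deviance + 2 × no. parameters + small sample correction
(K = number of parameters, n = sample size)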
In a set of models, the model with the lowest AIC will, on average, be the model with the lowest K-L distance (i.e., give predictions closest to the truth).
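As a rough sketch of how this is used in practice, the small-sample corrected criterion, AICc = -2 ln L + 2K + 2K(K+1)/(n - K - 1), can be computed for each candidate model and the lowest value picked. The function name `aicc`, the model names and the log-likelihood values below are hypothetical, chosen only to illustrate the arithmetic.

```python
def aicc(log_likelihood, k, n):
    """Small-sample corrected AIC.

    log_likelihood: maximised log-likelihood of the model
    k: number of estimated parameters
    n: sample size
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    deviance = -2.0 * log_likelihood
    correction = (2.0 * k * (k + 1)) / (n - k - 1)
    return deviance + 2.0 * k + correction


# Hypothetical candidate models: (maximised log-likelihood, K).
candidates = {"simple": (-120.0, 3), "complex": (-118.5, 6)}
n = 60

scores = {name: aicc(ll, k, n) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)

for name, score in sorted(scores.items(), key=lambda item: item[1]):
    # Report each model's AICc and its difference from the best model.
    print(f"{name}: AICc = {score:.2f}, delta = {score - scores[best]:.2f}")
```

With these invented numbers the complex model fits slightly better (higher log-likelihood), but the penalty for its three extra parameters outweighs the gain, so the simple model has the lower AICc.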
Using p-values for model selection is a different thing.
[Figure: Bias² and variance (uncertainty) plotted against the number of parameters (K); frequency distributions of predictions relative to the truth for a too simple model and a too complex model]
Example: Influence of testosterone on size of home-range in voles.
[Figure: 60 sites…, each with a testosterone treated male and a control male]
Do testosterone treated males have larger home-ranges at high densities? What are the effects at low densities?
Response variable: Home range size measured by radio-telemetry
Predictor variables: Treatment, Density, Body mass
Think about HOW things are related
Full model:
                        F Value   Pr(F)
Body mass               15.79     <0.001
Treatment               0.99      0.32
Density                 104.90    <0.001
Body mass × Treatment   0.003     0.96
Density × Treatment     2.66      0.11

Step 1:
                        F Value   Pr(F)
Body mass               15.95     <0.001
Treatment               1.00      0.32
Density                 105.95    <0.001
Density × Treatment     2.63      0.11

Step 2:
                        F Value   Pr(F)
Body mass               15.68     <0.001
Treatment               0.99      0.32
Density                 104.18    <0.001

Step 3:
                        F Value   Pr(F)
Body mass               15.06     <0.001
Density                 96.15     <0.001
Conclusion:
There is no significant effect of ‘Treatment’ (p = 0.32) or a ‘Density × Treatment’ interaction (p = 0.11).
Response variable: home range size
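[Figure: two panels, home range size vs. density and log(home range size) vs. log(density)]
Original scale:
$D < c$: $y = \text{constant}$
$D \ge c$: $y = c\,(D-c)^{b}$
Log scale:
$D < c$: $\log(y) = \text{constant}$
$D \ge c$: $\log(y) = \log(c) + b\log(D-c)$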
[Figure: two panels, home range size vs. body mass (M) and log(home range size) vs. log(body mass)]
Original scale: $y = aM^{b}$
Log scale: $\log(y) = \log(a) + b\log(M)$
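The power-law relationship between body mass and home range size becomes a straight line on the log-log scale, with slope b and intercept log(a), so a and b can be estimated by ordinary least squares on the logged values. The sketch below assumes multiplicative (log-scale) errors; the function name `fit_power_law` and the data are invented for illustration and are not taken from the vole study.

```python
import math


def fit_power_law(masses, sizes):
    """Estimate a and b in y = a * M**b by least squares on the log-log scale."""
    if len(masses) != len(sizes) or len(masses) < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    xs = [math.log(m) for m in masses]
    ys = [math.log(y) for y in sizes]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all masses are identical; slope is undefined")
    b = sxy / sxx                # slope on the log-log scale
    log_a = y_mean - b * x_mean  # intercept on the log-log scale is log(a)
    return math.exp(log_a), b


# Invented, noise-free data generated from y = 2 * M**0.75.
masses = [20.0, 25.0, 30.0, 40.0, 50.0]
sizes = [2.0 * m ** 0.75 for m in masses]

a_hat, b_hat = fit_power_law(masses, sizes)
print(f"a = {a_hat:.3f}, b = {b_hat:.3f}")
```

Because the example data contain no noise, the fit recovers the generating values of a and b up to rounding error; with real data the estimates would come with uncertainty that should be reported alongside them.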