BRIEF REVIEW OF STATISTICAL CONCEPTS AND METHODS

Upload: doyle

Post on 09-Feb-2016


TRANSCRIPT

Page 1: BRIEF REVIEW OF STATISTICAL  CONCEPTS AND METHODS

BRIEF REVIEW OF STATISTICAL CONCEPTS AND METHODS

Page 2: BRIEF REVIEW OF STATISTICAL  CONCEPTS AND METHODS

Mathematical expectation

The mean (x̄) of random variable x is:

x̄ = (Σ xᵢ) / n

where n is the number of observations. The variance (s²) is:

s² = Σ (xᵢ − x̄)² / (n − 1)

Page 3: BRIEF REVIEW OF STATISTICAL  CONCEPTS AND METHODS

Mathematical expectation

The standard deviation (s) is:

s = √s²

The coefficient of variation is:

CV = s / x̄
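The quantities above can be computed directly from a sample. A minimal sketch in Python, using made-up fish lengths (the values are illustrative, not from the slides):

```python
import math

# Hypothetical sample of 5 fish lengths (mm), for illustration only
x = [120.0, 135.0, 128.0, 142.0, 130.0]
n = len(x)

mean = sum(x) / n                                        # x-bar = (sum of x_i) / n
variance = sum((xi - mean) ** 2 for xi in x) / (n - 1)   # s^2, sample variance
sd = math.sqrt(variance)                                 # s = sqrt(s^2)
cv = sd / mean                                           # coefficient of variation

print(mean, variance, round(sd, 2), round(cv, 3))
```

Note the n − 1 divisor gives the sample variance; dividing by n would give the population variance.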

Page 4: BRIEF REVIEW OF STATISTICAL  CONCEPTS AND METHODS

Precision, bias, and accuracy

Page 5: BRIEF REVIEW OF STATISTICAL  CONCEPTS AND METHODS

Basic probability

The probability of an event occurring is expressed as: P(event)

The probability of the event not occurring is 1 − P(event), or P(~event).

If events A and B are independent, the probability of both A and B occurring is estimated as: P(A) * P(B).

Page 6: BRIEF REVIEW OF STATISTICAL  CONCEPTS AND METHODS

The probability of capturing a single fish given 1 is present: p(capture)

The probability of capturing 2 fish given 2 are present: p(capture)*p(capture) = p(capture)². The probability of catching at least 1 given 2 present: p(capture)*(1 − p(capture)) + (1 − p(capture))*p(capture) + p(capture)²

or: 1 − (1 − p(capture))^N

where N = number of fish present

Detection and capture probabilities
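The complement shortcut 1 − (1 − p)^N and the full enumeration for N = 2 give the same answer. A quick check, assuming a hypothetical capture probability of 0.6:

```python
p_capture = 0.6   # assumed single-fish capture probability (illustrative)
N = 2

# P(capture at least 1 of N fish) = 1 - (1 - p)^N
p_at_least_one = 1 - (1 - p_capture) ** N

# Enumerate the N = 2 case from the slide:
# first-only + second-only + both
enumerated = (p_capture * (1 - p_capture)
              + (1 - p_capture) * p_capture
              + p_capture ** 2)

print(round(p_at_least_one, 2), round(enumerated, 2))  # both 0.84
```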

Page 7: BRIEF REVIEW OF STATISTICAL  CONCEPTS AND METHODS

The probability of detecting a fish during a single event: p(detect)

The probability of detecting it on all three sampling occasions is: p(detect)*p(detect)*p(detect) = p(detect)³. The probability of not catching it during any of the 3 occasions is: (1 − p(detect))*(1 − p(detect))*(1 − p(detect)) = (1 − p(detect))³,

and the probability of catching it on at least 1 occasion is the complement of not catching it during any of the occasions: 1 − (1 − p(detect))³.

Probability example

Page 8: BRIEF REVIEW OF STATISTICAL  CONCEPTS AND METHODS

The probability a fish is present: p(present)

The probability of detecting a fish, given it is present: p(detect | present)

What is the probability of detecting a fish given it is not present?

The probability a fish is present and detected: p(detect | present) * p(present)

Conditional probability

Page 9: BRIEF REVIEW OF STATISTICAL  CONCEPTS AND METHODS

The probability N fish are present: p(N)

The probability of detecting at least 1 fish, given N are present: p(detect | N)

The probability N fish are present and at least 1 is detected:  p(detect | N) * p(N)

Conditional probability

Page 10: BRIEF REVIEW OF STATISTICAL  CONCEPTS AND METHODS

p(present | not detected)

The probability the fish species is present: p(present); not present: 1 − p(present)

The probability of detection, given present: p(detect | present). The probability of detection, given not present, is 0, so p(not detected | not present) = 1.

Total probability of the event of not detecting the species:

Two possibilities: (1) present but not detected and (2) not present

P(not detected | present)*P(present) + P(not detected | not present)*P(not present)

Question: if we sampled but did not detect a fish species, what are the chances it was present?

Page 11: BRIEF REVIEW OF STATISTICAL  CONCEPTS AND METHODS

p(present | not detected) =

p(not detected | present)*p(present) / [p(not detected | present)*p(present) + p(not detected | not present)*p(not present)]

Assume 80% probability of detection: p(not detected | present) = 1 − 0.80 = 0.20

Assume 40% probability of bull trout present: p(present) = 0.40, p(not present) = 0.60, p(not detected | not present) = 1

Now calculate: (0.20*0.40) / (0.20*0.40 + 1*0.60) = 0.118, or 11.8%

Bayes rule
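The bull trout calculation above can be sketched directly in code, using the slide's numbers (0.80 detection, 0.40 prior presence):

```python
p_detect = 0.80                        # detection probability, given present
p_present = 0.40                       # prior probability bull trout is present

p_nd_given_present = 1 - p_detect      # 0.20
p_nd_given_absent = 1.0                # cannot detect a fish that is not there
p_absent = 1 - p_present               # 0.60

# Bayes' rule: P(present | not detected)
numerator = p_nd_given_present * p_present
denominator = numerator + p_nd_given_absent * p_absent
p_present_given_nd = numerator / denominator

print(round(p_present_given_nd, 3))    # 0.118, i.e. 11.8%
```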

Page 12: BRIEF REVIEW OF STATISTICAL  CONCEPTS AND METHODS

Models and fisheries management

“True” Models

• Fundamental assumption: there is no “true” model that generates biological data

• Truth in biological sciences has essentially infinite dimension; hence, full reality cannot be revealed with finite samples.

• Biological systems are complex with many small effects, interactions, individual heterogeneity, and environmental covariates.

• Thus all models are approximations of reality

• Greater amounts of data are required to model smaller effects.

Page 13: BRIEF REVIEW OF STATISTICAL  CONCEPTS AND METHODS

• Several models can represent a single hypothesis

Models = hypotheses

• Models are tools for evaluating hypotheses

• Models are very explicit representations of hypotheses

• Hypotheses are unproven theories, suppositions that are tentatively accepted to explain facts or as the basis for further investigation

Models and hypotheses

Page 14: BRIEF REVIEW OF STATISTICAL  CONCEPTS AND METHODS

Hypothesis: shoal bass reproduction success is greater when there are more reproductively active adults

Y = aN

Y = aN/(1+bN)

Number of young is proportional to the number of adults

Number of young increases with the number of adults until nesting areas are saturated

Y = aN·e^(−bN): number of young increases until the carrying capacity of nesting and rearing areas is reached

Models and hypotheses: example
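The three candidate models make different predictions as adult numbers grow. A minimal sketch with hypothetical parameter values (a = 2.0 and b = 0.05 are illustrative, not fitted to shoal bass data):

```python
import math

a, b = 2.0, 0.05           # hypothetical parameters, for illustration only
adults = [10, 50, 100, 200]

proportional = [a * N for N in adults]                      # Y = aN
saturating   = [a * N / (1 + b * N) for N in adults]        # Y = aN/(1+bN)
dome_shaped  = [a * N * math.exp(-b * N) for N in adults]   # Y = aN*e^(-bN)

print([round(y, 1) for y in saturating])
```

The proportional model keeps rising, the saturating model levels off as nesting areas fill, and the dome-shaped model peaks (here at N = 1/b) and then declines.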

Page 15: BRIEF REVIEW OF STATISTICAL  CONCEPTS AND METHODS

[Figure: number of YOY plotted against number of shoal bass, showing the three candidate models Y = aN, Y = aN/(1+bN), and Y = aN·e^(−bN)]

Page 16: BRIEF REVIEW OF STATISTICAL  CONCEPTS AND METHODS

Tapering Effect Sizes

• In biological systems there are often large, important effects, followed by smaller effects, and then yet smaller effects.

• These effects might be sequentially revealed as sample size increases, because information content increases.

• Rare events are yet more difficult to study (e.g., fire, flood, volcanism).

[Figure: spectrum of effect sizes, from big effects to small effects]

Page 17: BRIEF REVIEW OF STATISTICAL  CONCEPTS AND METHODS

Model selection

• Determine what is the best explanation given the data

• Determine what is the best model for predicting the response

• Two approaches in fisheries/ecology

Null hypothesis testing

Information theoretic approaches

Page 18: BRIEF REVIEW OF STATISTICAL  CONCEPTS AND METHODS

Null hypothesis testing

Develop an a priori hypothesis

Deduce testable predictions (i.e., models)

Carry out suitable test (experiment)

Compare test results with predictions

Retain or reject hypothesis

Page 19: BRIEF REVIEW OF STATISTICAL  CONCEPTS AND METHODS

Hypothesis testing example: Density independence for lake sturgeon populations

Hypothesis: lake sturgeon reproduction is density independent

Prediction: there is no relation between adult density and age 0 density

Test: measure age 0 density for various adult densities over time

Compare: linear regression between age-0 and adult sturgeon densities, P value = 0.1839

Result: retain the hypothesis that lake sturgeon reproduction is density independent

Using a critical α-level = 0.05, we conclude there is no significant relationship

Model: Y = β₀

Page 20: BRIEF REVIEW OF STATISTICAL  CONCEPTS AND METHODS

Model selection based on p-values

• No theoretical basis for model selection

• P-values ~ precision of estimate

• P-values strongly dependent on sample size

P(the data (or more extreme data)| Model) vs. L(model | the data)

JUST SAY NO TO STATISTICAL SIGNIFICANCE TESTING FOR MODEL SELECTION

Page 21: BRIEF REVIEW OF STATISTICAL  CONCEPTS AND METHODS

Information theory

If full reality cannot be included in a model, how do we tell how close we are to truth?

Entropy is synonymous with uncertainty.

The Kullback-Leibler distance, based on information theory, measures how much information is not accounted for in a model.

Page 22: BRIEF REVIEW OF STATISTICAL  CONCEPTS AND METHODS

K-L distance (information) is represented by: I(truth | model)

AIC is based on the concept of minimizing K-L distance.

It represents the information lost when the candidate model is used to approximate truth; thus SMALL values mean better fit.

Akaike noticed that the maximum log likelihood, log(L(model or parameter estimate | the data)), was related to K-L distance.

Information theory

Page 23: BRIEF REVIEW OF STATISTICAL  CONCEPTS AND METHODS

The sum of squares in regression is also a measure of the relative fit of a model.

What is a maximum likelihood estimate?

It is the set of parameter values that maximizes the value of the likelihood, given the data.

[Figure: regression scatterplot with fitted line; SSE = Σ(deviations)²]

Page 24: BRIEF REVIEW OF STATISTICAL  CONCEPTS AND METHODS

The maximum log likelihood (and SSE) is a biased estimate of K-L distance

Akaike’s contribution was that he showed that:

AIC = −2ln(L(model | the data)) + 2K

The −2ln(likelihood) term measures model lack of fit; the 2K term is a penalty for increasing model size (it enforces parsimony).

It is based on the principle of parsimony.

[Figure: heuristic interpretation. As the number of parameters increases from few to many, bias² decreases while variance increases]

Page 25: BRIEF REVIEW OF STATISTICAL  CONCEPTS AND METHODS

AIC: Small sample bias adjustment

If the ratio n/K is < 40, then use AICc:

AICc = −2*ln(likelihood | data) + 2*K + (2*K*(K+1))/(n−K−1)

As n gets big, (2*K*(K+1))/(n−K−1) approaches 0, so AICc = AIC.
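The two formulas above can be sketched as small functions; the inputs here (log-likelihood −45.2, K = 3, n = 30) are hypothetical:

```python
def aic(log_lik, K):
    """AIC = -2*ln(likelihood) + 2*K"""
    return -2 * log_lik + 2 * K

def aicc(log_lik, K, n):
    """Small-sample corrected AIC; use when n/K < 40."""
    return aic(log_lik, K) + (2 * K * (K + 1)) / (n - K - 1)

# Hypothetical fit: log-likelihood -45.2, 3 parameters, 30 observations
print(round(aic(-45.2, 3), 2))        # 96.4
print(round(aicc(-45.2, 3, 30), 2))   # 96.4 + 24/26 = 97.32
```

With n = 30 and K = 3 the correction matters; as n grows very large the correction term shrinks toward 0 and AICc converges to AIC.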

Page 26: BRIEF REVIEW OF STATISTICAL  CONCEPTS AND METHODS

Model selection with AIC

What is model selection? AIC by itself is relatively meaningless; recall that we find the best model by comparing various models and examining their relative distance to the “truth”.

We do this by calculating the difference between the best-fitting model (lowest AIC) and the other models.

Model selection uncertainty

Which model is the best? What if you collect data at the same spot next year, next week, or next door?

AIC weights: long-run interpretation vs. Bayesian.

Confidence set of models analogous to confidence intervals
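The ΔAIC values and Akaike weights described above can be computed as follows; the three AICc values are hypothetical, just to show the mechanics (Δ = AICc − min AICc, weight = exp(−Δ/2) normalized over the model set):

```python
import math

# Hypothetical AICc values for a candidate model set (illustrative only)
aicc_values = {"model_A": 210.4, "model_B": 214.2, "model_C": 221.7}

best = min(aicc_values.values())
deltas = {m: v - best for m, v in aicc_values.items()}        # delta AICc
rel_lik = {m: math.exp(-0.5 * d) for m, d in deltas.items()}  # relative likelihoods
total = sum(rel_lik.values())
weights = {m: r / total for m, r in rel_lik.items()}          # Akaike weights sum to 1

# Ratio of two weights = strength of evidence for one model over another
evidence_ratio = weights["model_A"] / weights["model_B"]
print({m: round(w, 3) for m, w in weights.items()}, round(evidence_ratio, 2))
```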

Page 27: BRIEF REVIEW OF STATISTICAL  CONCEPTS AND METHODS

Where do we get AIC?

[Figure: program output showing K (number of parameters) and −2ln(L(model | the data))]

Page 28: BRIEF REVIEW OF STATISTICAL  CONCEPTS AND METHODS

Interpreting AIC

Best model (lowest AICc)

Difference between the lowest AIC and each model’s AIC (relative distance from truth)

Page 29: BRIEF REVIEW OF STATISTICAL  CONCEPTS AND METHODS

Interpreting AIC

AICc weight ranges from 0 to 1, with 1 = best model.

It is interpreted as the relative likelihood that a model is the best, given the data and the other models in the set.

Page 30: BRIEF REVIEW OF STATISTICAL  CONCEPTS AND METHODS

Interpreting AIC

The ratio of 2 weights is interpreted as the strength of evidence for one model over another.

Here the best model is 0.86748/0.13056 = 6.64 times more likely to be the best model for estimating striped bass population size.

Page 31: BRIEF REVIEW OF STATISTICAL  CONCEPTS AND METHODS

Confidence model set

Using a 1/8 (0.12) rule for weight of evidence, my confidence set includes the top two models (both model likelihoods > 0.12).

Analogous to a confidence interval for a parameter estimate

Page 32: BRIEF REVIEW OF STATISTICAL  CONCEPTS AND METHODS

Linear models review

Y: response variable (dependent variable)

X: predictor variable (independent variable)

Y = β0 + β1*X + e

β0 is the intercept, β1 is the slope (parameter) associated with X, and e is the residual error.

Page 33: BRIEF REVIEW OF STATISTICAL  CONCEPTS AND METHODS

Linear models review

When Y is a probability, it is bounded by 0 and 1.

Y = β0 + β1*X can produce values < 0 and > 1, so we need to transform or use a link function.

For probabilities, the logit link is the most useful

Page 34: BRIEF REVIEW OF STATISTICAL  CONCEPTS AND METHODS

Logit link

logit(p) = ln(p / (1 − p))

logit(p) is the log odds; p is the probability of an event.

Page 35: BRIEF REVIEW OF STATISTICAL  CONCEPTS AND METHODS

Log linear models (logistic regression)

logit(p) = β0 + β1*X, where logit(p) is the log odds.

β0 is the intercept; β1 is the slope (parameter) associated with X.

Betas are on a logit scale, and the log odds needs to be back-transformed.

Page 36: BRIEF REVIEW OF STATISTICAL  CONCEPTS AND METHODS

Back transformation: Inverse logit link

p = 1 / (1 + exp(−logit(p)))

logit(p) is the log odds; p is the probability of an event.

Page 37: BRIEF REVIEW OF STATISTICAL  CONCEPTS AND METHODS

Back transformation example

logit(p) = β0 + β1*X

β0 = −2.5, β1 = 0.5, X = 2

Page 38: BRIEF REVIEW OF STATISTICAL  CONCEPTS AND METHODS

Back transformation example

logit(p) = −2.5 + 0.5*2 = −1.5

p = 1 / (1 + exp(1.5)) = 0.18, or 18%

Page 39: BRIEF REVIEW OF STATISTICAL  CONCEPTS AND METHODS

Interpreting beta estimates

Betas are on a logit scale; to interpret them, calculate odds ratios using the exponential function.

β1 = 0.5

exp(0.5) = 1.65

Interpretation: for each 1-unit increase in X, the event is 1.65 times more likely to occur.

For example, for each 1-inch increase in length, a fish is 1.65 times more likely to be caught.
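The back-transformation and odds-ratio steps above can be sketched in a few lines, using the slide's values (β0 = −2.5, β1 = 0.5, X = 2):

```python
import math

def inv_logit(x):
    """Inverse logit link: p = 1 / (1 + exp(-x))"""
    return 1.0 / (1.0 + math.exp(-x))

b0, b1, X = -2.5, 0.5, 2.0      # values from the slide example
log_odds = b0 + b1 * X          # -1.5
p = inv_logit(log_odds)         # 1 / (1 + exp(1.5))

# Odds ratio: exp(beta). Each 1-unit increase in X multiplies the odds by this.
odds_ratio = math.exp(b1)

print(round(p, 2), round(odds_ratio, 2))  # 0.18 1.65
```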