1
Maximum Likelihood
• L[hypothesis | data] = Pr[data | hypothesis]
• Allows you to solve problems and test
hypotheses that would be extremely difficult
in any other way
2
Example
• What proportion of this class has shoplifted
an item worth more than $10?
• Flip a coin
• Don’t tell ANYONE the result
– If “heads,” answer “heads”
– If “tails,” answer “heads” if you’ve shoplifted
something, “tails” otherwise
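The coin-flip survey can be analysed by maximum likelihood. A minimal sketch, assuming a fair coin and honest answers: each person answers "heads" with probability 1/2 + p/2, where p is the proportion who have shoplifted, so the likelihood of p given the count of "heads" answers is binomial. The function names and the example counts (70 "heads" out of 100) are hypothetical, chosen only for illustration.

```python
import math

def log_likelihood(p, heads, n):
    """Log-likelihood of shoplifting proportion p given `heads` 'heads' answers out of n."""
    q = 0.5 + 0.5 * p              # Pr[answer is "heads"], assuming a fair coin
    tails = n - heads
    if q >= 1.0 and tails > 0:     # p = 1 is impossible if anyone answered "tails"
        return float("-inf")
    ll = heads * math.log(q)
    if tails > 0:
        ll += tails * math.log(1.0 - q)
    return ll

def mle_proportion(heads, n):
    """Closed-form estimate: solve 0.5 + 0.5*p = heads/n, constrained to [0, 1]."""
    return min(1.0, max(0.0, 2.0 * heads / n - 1.0))

def grid_mle(heads, n, steps=1000):
    """Numerical check: pick the p on a grid that maximises the log-likelihood."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda p: log_likelihood(p, heads, n))

# Hypothetical data: 70 of 100 respondents answered "heads"
p_hat = mle_proportion(70, 100)
p_grid = grid_mle(70, 100)
print(p_hat, p_grid)
```

Both routes should agree: the closed form and the grid search each land on the value of p that makes the observed answers most probable.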
3
Pseudoreplication
• The error that occurs when samples are not
independent, but are treated as though they
are
4
Example: “The Transylvania effect”
A study of 130,000 calls for police assistance in 1980 found that they were more likely than chance to occur during a full moon.
Problem: There may have been 130,000 calls in the data set, but there were only 13 full moons in 1980. These data are not independent.
6
Pseudoreplication
• We are making a false claim about the
number of independent samples in our data
• Very common mistake in biology
• Easiest solution: use the average of all the
pseudoreplicates
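Averaging the pseudoreplicates can be sketched as follows. This is an illustration only: the data (leaf measurements nested within plants) and the function name are hypothetical, and the plant is assumed to be the independent unit.

```python
from collections import defaultdict
from statistics import mean

def average_pseudoreplicates(records):
    """Collapse (unit, value) records to one mean per independent unit."""
    by_unit = defaultdict(list)
    for unit, value in records:
        by_unit[unit].append(value)
    return {unit: mean(values) for unit, values in by_unit.items()}

# Hypothetical data: three plants (independent units), several leaves each
records = [
    ("plant_A", 4.0), ("plant_A", 6.0),
    ("plant_B", 10.0), ("plant_B", 12.0), ("plant_B", 14.0),
    ("plant_C", 7.0),
]
unit_means = average_pseudoreplicates(records)
n_independent = len(unit_means)   # the sample size to use in any later test
print(unit_means, n_independent)
```

Any later test would then use the three plant means, not the six leaf measurements.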
7
Very small samples and
assumptions
• Question from the class: “Say there's a test which you desire to carry out which is expensive and therefore you can afford only 2 treatments, each with two replicates. How would we go about analysing any difference, because our sample would be so small that we wouldn't be able to know if our data followed a normal distribution, right? And would these tests be worth carrying out since they would have pretty low power?”
• Answer: most scientists will just proceed with the test
• Interpret the results as “if our assumptions are true (and we have no idea), then…”
8
Very small samples and
assumptions
• Example: does the Earth have more species of living things than other planets in the solar system?
• Data: Earth = 10,000,000 to 100,000,000
• Mercury, Venus, Mars, Jupiter, Saturn, Uranus, Neptune = 0 (as far as we know)
9
Hypothesis testing
• Null hypotheses are usually very simple, and
often known beforehand to be false
• You will eventually reject them if you have
a big enough sample size
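A rough illustration of why a big enough sample rejects almost any simple null hypothesis. This sketch assumes a two-sided z-test with known standard deviation, and it assumes the observed difference equals the true (tiny) effect exactly, ignoring sampling noise. The effect size of 0.01 standard deviations is hypothetical.

```python
import math

def two_sided_p(effect, sd, n):
    """Two-sided P-value of a z-test when the observed mean difference equals `effect`."""
    z = abs(effect) * math.sqrt(n) / sd
    return math.erfc(z / math.sqrt(2))   # equals 2 * (1 - Phi(z))

# Hypothetical tiny effect: 0.01 standard deviations
for n in (100, 10_000, 1_000_000):
    print(n, two_sided_p(0.01, 1.0, n))
```

The same effect goes from clearly nonsignificant to overwhelmingly significant as n grows, even though its size never changes.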
10
Example
• Study on logging
• H0: The density of large trees is greater in
unlogged versus logged areas
11
Fewer trees
12
Statistical significance ≠
Biological importance
• “Statistically significant” means P < 0.05
• But it does not necessarily mean important!
• Likewise, nonsignificant results can be
biologically important
• It’s always useful to estimate a parameter or
effect size, with a confidence interval
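One way to report an effect size with a confidence interval, sketched for a difference between two group means. This assumes a normal critical value of 1.96; for small samples a t critical value would give a wider interval. The data and function name are hypothetical.

```python
import math
from statistics import mean, variance

def diff_ci95(a, b):
    """Approximate 95% CI for mean(a) - mean(b), using a normal critical value (1.96)."""
    diff = mean(a) - mean(b)
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return diff, diff - 1.96 * se, diff + 1.96 * se

# Hypothetical measurements for two groups
group_a = [10.0, 12.0, 14.0, 16.0, 18.0]
group_b = [9.0, 11.0, 13.0, 15.0, 17.0]
effect, lower, upper = diff_ci95(group_a, group_b)
print(effect, lower, upper)
```

The interval shows the range of effect sizes compatible with the data, which says more about importance than the P-value alone.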
13
Examples
• Some studies of thousands of children have found statistically significant associations of IQ with birth order
• These differences are on the order of 1-2 IQ points
• Such differences are not biologically important for individuals, and can’t explain why your sister is smarter than you!
14
Examples
• Large study of hormone replacement
therapy showed no significant benefit of
HRT to post-menopausal women
• Confidence interval for the effect size
suggested that any possible undetected
effect is likely to be extremely small
15
Correlation does not require
causation
16
Correlation and Causation
[Diagram: hot weather, ice cream, violent crime]
17
Data for many countries:
18
Confounding variables
• Variables that mask or distort the
association between measured variables in a
study
• Two approaches:
– Try to measure them all
– Do an experiment
19
Make a Plan
• Develop a clear statement of the question
• List possible outcomes
• Develop an experimental plan
• Keep the design as simple as possible
• Check for common design problems
• Is sample size big enough?
• Discuss with other people!
20
The importance of controls
• Placebo effect - an improvement in a
medical condition that results from the
psychological effects of medical treatment
– Most people get better over time
– Humans like to please others, including their
doctors
– Benefits of doctors beyond drugs
– Direct psychological effects on health
21
The importance of controls
• Well-documented for pain relief
• Up to 40% of people report improvement in
pain when given sugar pills
• Drugs and treatments must be analyzed in
this context
22
Head On = stick of wax
23
“I’m addicted to placebos. I
could quit but it wouldn’t
matter.”
Steven Wright
24
Mistakes
• Two types of mistakes:
– Experimental mistakes
– Statistical mistakes (“Type III error”)
26
Experimental Mistakes
28
Statistical Mistakes
• 1/3 to 1/2 of scientific papers that use
statistics make at least minor mistakes
• ~8% make major mistakes - enough to alter the
conclusions of the paper
• Be careful when reading papers
• Be careful with your own work!
29
Data dredging
• The process of carrying out statistical tests
on your data until you come up with a
statistically significant result.
30
P = 0.05
+ second digit
31 32
Beware multiple comparisons
Probability of at least one Type I error in N tests = 1 - (1 - α)^N
For 20 tests, the
probability of at least
one Type I error is
~65%.
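The multiple-comparisons probability can be checked directly. A minimal sketch, assuming the tests are independent and each uses the same significance level; the function name is hypothetical.

```python
def familywise_error(alpha, n_tests):
    """Probability of at least one Type I error across independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests

# 20 independent tests, each at alpha = 0.05
prob_20 = familywise_error(0.05, 20)
print(round(prob_20, 4))
```

With alpha = 0.05 and 20 tests this comes out at roughly 0.64, close to the rounded figure on the slide.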
33
Example - ESP
34
Six or more correct answers: you have ESP!
35
Bonferroni correction
α* = α / (number of tests)
Anyone in the class have 8 or more correct?
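The Bonferroni correction can be sketched in a few lines. The P-values below are hypothetical, and the function names are illustrative only.

```python
def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests

def significant_after_bonferroni(p_values, alpha=0.05):
    """Flag which P-values remain significant at the corrected level."""
    threshold = bonferroni_alpha(alpha, len(p_values))
    return [p <= threshold for p in p_values]

# Hypothetical P-values from four tests; the corrected threshold is 0.05 / 4 = 0.0125
p_values = [0.001, 0.02, 0.04, 0.30]
flags = significant_after_bonferroni(p_values)
print(flags)
```

Two of these P-values fall below 0.05 but above the corrected threshold, so only the first survives the correction.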
36
Garbage-in, garbage-out
• Small P-values do not rescue a poor
measurement
• Example: IQ test bias
37
Aboriginal-based IQ Test
1. What number comes next in the sequence, one, two, three, __________?
MANY
38
Aboriginal-based IQ Test
2. As wallaby is to animal so cigarette is to __________
TREE
39
Aboriginal-based IQ Test
3. Three of the following items may be classified with salt-water crocodile. Which are they?
marine turtle brolga
frilled lizard black snake
40
Fraud happens
Original Haeckel's copy
(echidna embryos)
41
Recent Fraud Example
• Woo Suk Hwang, human cloning
• Much of the data suspected to be fabricated
42
Regression to the mean
• When repeated measurements are taken
over time…
• Individuals with extreme values for the first
measurement tend to be nearer to the mean
for the second measurement
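Regression to the mean can be seen in a small simulation. This is a sketch under stated assumptions: each individual has a stable true value, and each measurement adds independent noise. The means, standard deviations, sample size, and seed are all hypothetical choices.

```python
import random
from statistics import mean

def simulate_regression_to_mean(n=10_000, seed=1):
    """Return mean first and second measurements for the top 10% on the first."""
    rng = random.Random(seed)
    true_values = [rng.gauss(100, 10) for _ in range(n)]
    first = [t + rng.gauss(0, 10) for t in true_values]    # measurement 1 = truth + noise
    second = [t + rng.gauss(0, 10) for t in true_values]   # measurement 2 = truth + new noise
    cutoff = sorted(first)[int(0.9 * n)]
    top = [i for i in range(n) if first[i] >= cutoff]      # extreme on the first measurement
    first_mean = mean([first[i] for i in top])
    second_mean = mean([second[i] for i in top])
    return first_mean, second_mean

first_mean, second_mean = simulate_regression_to_mean()
print(first_mean, second_mean)
```

The group selected for extreme first scores still scores above the population mean of 100 the second time, but noticeably closer to it, with no real change in anyone's underlying value.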
43
Regression to the Mean
44
Regression to the Mean
The “sophomore slump”
45
Publication bias
Papers are more likely to be published if P<0.05
This causes a bias in the science reported in the literature.
46
Meta-analysis
• Compiles all known scientific studies
testing the same null hypothesis and
quantitatively combines them to give an
overall estimate of the effect and its
statistical properties
• This is a GREAT honours project…
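One common way to combine studies quantitatively is fixed-effect inverse-variance weighting. The sketch below uses that method as an assumption; the slides do not say which method is intended, and the effect estimates and standard errors are hypothetical.

```python
import math

def fixed_effect_meta(estimates, std_errors):
    """Pool effect estimates, weighting each study by 1 / SE^2."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se

# Hypothetical effect estimates and standard errors from two studies
pooled, pooled_se = fixed_effect_meta([0.2, 0.4], [0.1, 0.1])
print(pooled, pooled_se)
```

More precise studies get more weight, and the pooled standard error is smaller than that of any single study.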