Psychology 202a: Advanced Psychological Statistics
December 1, 2015
The plan for today
• Continuing discussion of power
• Power for t tests
• Power for ANOVA
• G*power
Consider that in tabular form:

              H0 True                H0 False
H0 Rejected   Type I error (p = α)   Great!
H0 Retained   No problem             Type II error (p = β)
What is power?
• In that scenario, power = 1 – β.
• In other words, power is the probability that we will avoid a Type II error, given that the null hypothesis is actually false.
What affects power?
• To understand what affects power, consider the simplest possible situation for testing a hypothesis about means:
  – Hypothesis about a single mean;
  – The population standard deviation is known to be 15;
  – The sample size is 25;
  – The null hypothesis is that μ = 100;
  – The truth is that μ = 105.
When will we reject the null?
• We’ll be doing a Z test.
• We’ll reject the null if Z < -1.96 or if Z > 1.96.
[Figure: distribution of the Z statistic under the null hypothesis; horizontal axis: Z Statistic, from -4 to 4]
But the null hypothesis isn’t true!
• We stipulated that μ is really 105.
• In that case, the expected value of the Z statistic is really (105-100)/3 = 5/3, not 0. (The 3 is the standard error of the mean, 15/√25.)
[Figure: distribution of the Z statistic when μ is really 105; horizontal axis: Z Statistic, from -4 to 5]
What is the power?
• What is the probability that a single draw from a normal distribution with mean 5/3 and standard deviation 1 will be < -1.96 or > 1.96?
• pnorm(-1.96,5/3,1) + (1-pnorm(1.96,5/3,1))
• So the power is about 0.38.
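The calculation above can be wrapped in a small function and checked by simulation. This is only a sketch: `z_power` is a hypothetical helper name, and the seed and number of replications are arbitrary choices, not part of the lecture.

```r
# Power of a two-sided Z test when the true mean differs from the null value.
# z_power() is a hypothetical helper, not something defined in the lecture.
z_power <- function(mu_true, mu_null, sigma, n, alpha = 0.05) {
  se    <- sigma / sqrt(n)              # standard error of the sample mean
  shift <- (mu_true - mu_null) / se     # expected value of Z under the truth
  zcrit <- qnorm(1 - alpha / 2)         # two-sided critical value
  pnorm(-zcrit, mean = shift) + (1 - pnorm(zcrit, mean = shift))
}

analytic <- z_power(mu_true = 105, mu_null = 100, sigma = 15, n = 25)

# Monte Carlo check: draw many samples of size 25 and count rejections.
set.seed(202)                           # arbitrary seed
reps <- 20000                           # arbitrary number of replications
zs <- replicate(reps, (mean(rnorm(25, mean = 105, sd = 15)) - 100) / 3)
simulated <- mean(abs(zs) > qnorm(0.975))

round(c(analytic = analytic, simulated = simulated), 3)
```

The two numbers should agree up to simulation error, which is one way to convince yourself that power really is just a rejection rate under the true state of affairs.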
What affects power?
• Power will be increased by:
  – anything that tends to make the test statistic large;
  – anything that tends to make the critical value small.
Things that make the statistic large:
• Big effect
• Small variability
• Big sample size
Things that make the critical value small:
• Less stringent alpha level
• One-tailed tests
• In most cases, bigger sample size (because for more complicated statistics, the critical value depends on degrees of freedom)
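Each of these factors can be varied one at a time in the single-mean Z-test scenario (effect of 5, standard deviation 15, n = 25). The sketch below assumes that scenario; `z_power` is a hypothetical helper and the alternative values (effect 10, sd 10, n 100, alpha .10) are illustrative only.

```r
# Hypothetical helper: power of a Z test about a single mean.
z_power <- function(effect, sigma, n, alpha = 0.05, two_sided = TRUE) {
  shift <- effect / (sigma / sqrt(n))   # expected value of the Z statistic
  if (two_sided) {
    zcrit <- qnorm(1 - alpha / 2)
    pnorm(-zcrit, mean = shift) + (1 - pnorm(zcrit, mean = shift))
  } else {
    zcrit <- qnorm(1 - alpha)           # upper-tailed test only
    1 - pnorm(zcrit, mean = shift)
  }
}

# Change one factor at a time relative to the baseline scenario.
comparison <- c(
  baseline         = z_power(effect = 5,  sigma = 15, n = 25),
  bigger_effect    = z_power(effect = 10, sigma = 15, n = 25),
  less_variability = z_power(effect = 5,  sigma = 10, n = 25),
  bigger_sample    = z_power(effect = 5,  sigma = 15, n = 100),
  alpha_10         = z_power(effect = 5,  sigma = 15, n = 25, alpha = 0.10),
  one_tailed       = z_power(effect = 5,  sigma = 15, n = 25, two_sided = FALSE)
)
round(comparison, 3)
```

Every entry after the baseline should be larger than the baseline. Because this is a Z test, the sample size acts only through the statistic; the degrees-of-freedom effect on the critical value appears only with t and F tests.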
But when will we ever do a Z test?
• Noncentral distributions.
• Noncentrality parameter expresses exactly how the null hypothesis is false (with a bit of sample size thrown in).
    delta <- (105-100)/3
    tcrit <- qt(.975,24)
    pt(-tcrit,24,delta) + (1-pt(tcrit,24,delta))
• So power would really be a bit lower: .36 rather than .38.
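As a cross-check, base R's `power.t.test` should reproduce the noncentral t calculation for the one-sample case. This is a sketch rather than part of the lecture; note that `strict = TRUE` is needed so that rejections in both tails are counted, as in the manual version.

```r
# Manual calculation: noncentral t with 24 degrees of freedom.
delta  <- (105 - 100) / (15 / sqrt(25))   # noncentrality parameter, 5/3
tcrit  <- qt(0.975, df = 24)
manual <- pt(-tcrit, df = 24, ncp = delta) +
  (1 - pt(tcrit, df = 24, ncp = delta))

# Built-in check; strict = TRUE includes the lower rejection region too.
builtin <- power.t.test(n = 25, delta = 5, sd = 15, sig.level = 0.05,
                        type = "one.sample", alternative = "two.sided",
                        strict = TRUE)$power

round(c(manual = manual, builtin = builtin), 4)
```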
Power in more complex testing situations
• Two-sample t test
$$\delta = \frac{\mu_1 - \mu_2}{\sigma}\sqrt{\frac{n_1 n_2}{n_1 + n_2}}$$
• Calculation in R:
    delta <- (105-100)/15 * sqrt(12*13/25)
    tcrit <- qt(.975,23)
    pt(-tcrit,23,delta) + (1-pt(tcrit,23,delta))
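The two-sample calculation generalizes to any pair of group sizes. The sketch below assumes a two-sided, pooled-variance t test with noncentrality parameter (mu1 - mu2)/sigma * sqrt(n1*n2/(n1 + n2)), which is what the R code above computes for groups of 12 and 13; `two_sample_power` is a hypothetical helper, and the larger group sizes are illustrative.

```r
# Hypothetical helper for a two-sided, pooled-variance two-sample t test.
two_sample_power <- function(mu1, mu2, sigma, n1, n2, alpha = 0.05) {
  df    <- n1 + n2 - 2
  delta <- (mu1 - mu2) / sigma * sqrt(n1 * n2 / (n1 + n2))  # noncentrality
  tcrit <- qt(1 - alpha / 2, df)
  pt(-tcrit, df, ncp = delta) + (1 - pt(tcrit, df, ncp = delta))
}

# Scenario from the calculation above: means 105 and 100, sigma 15, n = 12 and 13.
p_small <- two_sample_power(105, 100, 15, n1 = 12, n2 = 13)
p_small

# How power grows as both groups get larger (equal n per group).
n_per_group <- c(13, 25, 50, 100, 150)
cbind(n_per_group,
      power = sapply(n_per_group,
                     function(n) two_sample_power(105, 100, 15, n, n)))
```

With equal group sizes the helper should agree with `power.t.test(n, delta = 5, sd = 15, strict = TRUE)`.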
Using G*power
• Obtaining and installing
• Power calculations in G*power
  – power for a given situation
  – required n for a given power
  – minimum detectable effect size
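The same three kinds of question can also be asked in R. The sketch below uses base R's `power.t.test` (two-sample, two-sided, alpha = .05 by default) as a rough analogue; it is not G*power itself, and the specific numbers (30 per group, power of .80, a 5-point difference with sd 15) are assumptions chosen for illustration.

```r
# 1. Power for a given situation: two groups of 30, true difference 5, sd 15.
given_power <- power.t.test(n = 30, delta = 5, sd = 15)$power

# 2. Required n per group for power of .80 with the same effect.
n_needed <- power.t.test(delta = 5, sd = 15, power = 0.80)$n

# 3. Minimum detectable difference with 30 per group and power of .80.
min_delta <- power.t.test(n = 30, sd = 15, power = 0.80)$delta

c(power = given_power, n_per_group = ceiling(n_needed), min_delta = min_delta)
```

In each call exactly one of `n`, `delta`, and `power` is left unspecified, and the function solves for that one.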
Power analysis for ANOVA
• If your main interest is in a contrast, do the power analysis for that contrast (as if it were a t test using MSe in place of the pooled variance estimate).
• Power analysis for the omnibus F test:
$$\lambda = \frac{n \sum_j \alpha_j^2}{\sigma_e^2}$$
ANOVA power example
• Suppose we are planning an experiment with five groups, and we expect the means to be spread over a ± one standard deviation range.
• Pick an arbitrary standard deviation (say, 10). So the means might be (40, 45, 50, 55, 60).
ANOVA power example
• So how large does n need to be to give power of, say, .9?
$$\lambda = \frac{n\left(10^2 + 5^2 + 0^2 + 5^2 + 10^2\right)}{100} = 2.5n$$
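The multiplier of n can be computed directly from the hypothesized means. A minimal sketch, assuming the means (40, 45, 50, 55, 60) and standard deviation 10 from the example; the variable names are illustrative.

```r
means   <- c(40, 45, 50, 55, 60)   # hypothesized group means from the example
sigma   <- 10                      # assumed within-group standard deviation
alpha_j <- means - mean(means)     # treatment effects: -10, -5, 0, 5, 10

# Noncentrality parameter per observation in each group;
# multiply by the per-group n to get lambda.
lambda_per_n <- sum(alpha_j^2) / sigma^2
lambda_per_n
```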
ANOVA power example
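• In R:
    n <- seq(2,15,1)                     # candidate per-group sample sizes
    dfn <- 4                             # numerator df: 5 groups - 1
    dfd <- 5*(n-1)                       # denominator df: 5 groups * (n - 1)
    fcrit <- qf(.95, dfn, dfd)           # critical F at alpha = .05
    lambda <- 2.5*n                      # noncentrality parameter
    power <- 1-pf(fcrit,dfn,dfd,lambda)  # power at each n
    cbind(n,power)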
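To read the answer off that table programmatically, one can pick the smallest n whose power reaches .9 and compare it with base R's `power.anova.test`. This is a sketch, not part of the lecture; it relies on `power.anova.test` taking the variance of the group means as `between.var`, which gives the same noncentrality parameter, 2.5n, for this example.

```r
n     <- 2:15                             # candidate per-group sample sizes
dfn   <- 4                                # 5 groups - 1
dfd   <- 5 * (n - 1)
fcrit <- qf(0.95, dfn, dfd)
power <- 1 - pf(fcrit, dfn, dfd, ncp = 2.5 * n)

# Smallest n in the grid with power of at least .90.
n_min <- min(n[power >= 0.90])

# Cross-check: solve for n directly with the built-in function.
builtin <- power.anova.test(groups = 5,
                            between.var = var(c(40, 45, 50, 55, 60)),
                            within.var = 100,
                            power = 0.90)$n

c(n_min = n_min, builtin = ceiling(builtin))
```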
ANOVA power example
• Illustration in G*power
Next time
• Two-way ANOVA