quant 2 - lecture 7 (t-tests w error)personal.kent.edu/~marmey/quant2spring05/quant 2... · concept...
Student’s t-Distribution
The t-Distribution, t-Tests, Measures of Effect Size, & Managing Violations of Assumptions
Sampling Distributions Redux
• Chapter 7 opens with a return to the concept of sampling distributions from chapter 4
– Sampling distributions of the mean
Sampling Distribution of the Mean
• Because the sampling distribution of the mean (SDotM) is so important in statistics, you should understand it
• The SDotM is governed by the Central Limit Theorem
Given a population with a mean μ and a variance σ², the sampling distribution of the mean (the distribution of sample means) will have a mean equal to μ, a variance equal to σ²/n, and a standard deviation equal to σ/√n. The distribution will approach the normal distribution as n, the sample size, increases. (p. 178)
Sampling Distribution of the Mean
Translation:
1. For any population with a given mean and variance, the sampling distribution of the mean will have:
• $\mu_{\bar{x}} = \mu$
• $\sigma_{\bar{x}}^2 = \sigma^2 / n$
• $\sigma_{\bar{x}} = \sigma / \sqrt{n}$
2. As n increases, the sampling distribution of the mean approaches a normal curve
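A quick simulation can make the translation concrete. This is a minimal sketch, not part of the original lecture: the exponential population (mean 1, variance 1), the seed, and the helper name `sample_means` are all assumptions chosen for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: exponential with mean 1 and variance 1 (clearly non-normal)
MU, SIGMA2 = 1.0, 1.0

def sample_means(n, reps=20000):
    """Draw `reps` samples of size n and return the list of their means."""
    return [statistics.fmean(random.expovariate(1.0) for _ in range(n))
            for _ in range(reps)]

for n in (1, 4, 25):
    means = sample_means(n)
    # The mean of the sample means should sit near MU,
    # and their variance near SIGMA2 / n.
    print(n,
          round(statistics.fmean(means), 3),
          round(statistics.pvariance(means), 3),
          "expected variance:", SIGMA2 / n)
```

The mean of the sample means stays near µ for every n, while their variance shrinks roughly as σ²/n.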
Sampling Distribution of the Mean
• Analysis:
– Although $\mu_{\bar{x}}$ and µ will tend to be similar to one another…
– The relationships between…
• $\sigma_{\bar{x}}^2$ and σ²
• $\sigma_{\bar{x}}$ and σ
– …will differ as a function of the sample size
• We saw this in our sampling distribution of the mean example from chapter 4…
So, you wanna test a hypothesis, do ya?
• Our understanding of sampling and sampling distributions now allows us to test hypotheses
• How we test a hypothesis depends on the information we have available
Choosing a Test
• µ?
– σ?
– s?
• Number of data sets:
– 1
– 2
• Number of groups:
– 1
– 2
1. Which variables are available?
2. How many data sets are you presented with?
3. Do your data sets come from 1 or 2 groups?
Testing Hypotheses about Means: The Rare Case of Knowing σ
• So far, to test the probability of finding a particular score, we’ve used the Standard Normal Distribution
– IQ = 122
– µ = 100
– σ = 15
$z = \frac{x - \mu}{\sigma} = \frac{122 - 100}{15} = \frac{22}{15} = 1.47$
-1.96 < z < 1.96, so we fail to reject H0
How the z-Test Works
• How does our test change when we test group means, not just individual scores?
– We use the central limit theorem
How the z-Test Works
n = 100: $z = \frac{122 - 100}{15/\sqrt{100}} = \frac{22}{15/10} = \frac{22}{1.5} = 14.67$
n = 2: $z = \frac{122 - 100}{15/\sqrt{2}} = \frac{22}{15/1.41} = \frac{22}{10.64} = 2.07$
n = 1: $z = \frac{122 - 100}{15/\sqrt{1}} = \frac{22}{15/1} = \frac{22}{15} = 1.47$
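The three calculations above follow one formula with only n changing. A minimal sketch, where the function name `z_for_mean` is a hypothetical helper and the numbers are the slide's IQ example:

```python
import math

def z_for_mean(x_bar, mu, sigma, n):
    """z = (x_bar - mu) / (sigma / sqrt(n)); with n = 1 this is the single-score z."""
    return (x_bar - mu) / (sigma / math.sqrt(n))

# Same observed mean (122), population mean (100) and sigma (15); only n changes
for n in (100, 2, 1):
    print(n, round(z_for_mean(122, 100, 15, n), 2))
```

Holding the 22-point difference fixed, the standard error shrinks as n grows, so z grows.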
How the z-Test Works
• Large samples reduce the amount of random variance (sampling error)
– More confidence that the sample mean = population mean
• Larger samples improve our ability to detect differences between samples and populations
• For n = 1: $z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{x - \mu}{\sigma}$
Testing Hypotheses: When σ Is Unknown
• Generally, the population standard deviation, σ, is unknown to us
• Occasionally, we will know the population mean, µ, when we don’t know σ
• In these situations, the standard normal distribution no longer meets our needs
Testing Hypotheses: When σ Is Unknown
• Knowing µ…
– We can produce an estimate of σ from s
– Using s changes the nature of the test we are conducting, as s is not distributed in the same fashion as σ
• Sampling distribution of the sample standard deviation is NOT normally distributed
– Strong positive skew
Testing Hypotheses: When σ Is Unknown
[Figure: sampling distribution of s compared with sampling distribution of σ]
So How Does s Estimate σ?
• Given the differences in distribution shape, it is easy to conclude that s ≠ σ
– s² is an unbiased estimator of σ² over repeated samplings
– However, a SINGLE value of s is likely to underestimate σ
• Because of this fact, small samples will tend to underestimate σ when it is estimated by s
– Because s sits in the denominator, any given statistic calculated with s is likely to be > a comparable value of z
– We cannot use z any longer; we use t instead
t and the t-Distribution
• Developed by “Student” (the pen name of William Sealy Gosset) while he was working for the Guinness Brewing Co.
1. The shape of the t-distribution is a direct function of the size of the sample we are examining
2. For small samples, the t-distribution is somewhat flatter than the standard normal distribution, with a lower peak and fatter tails
t and the t-Distribution
3. As sample size increases:
• The t-distribution approaches a normal distribution
• Theoretically, we mean that the closer our sample size comes to infinity, the more the t-distribution looks like a normal distribution
• Practically, this happens when n ≈ 100 – 120
t and the t-Distribution
4. Identifying values of t associated with a given rejection region depends on:
– α
– the number of tails associated with the test
– the degrees of freedom available in the analysis
– For this one-sample test, df = n-1, because we used one degree of freedom calculating s using the sample mean and not the population mean.
One-Sample t-Test
$t = \frac{\bar{x} - \mu}{s_{\bar{x}}}$ or $t = \frac{\bar{x} - \mu}{s_x / \sqrt{n}}$ or $t = \frac{\bar{x} - \mu}{\sqrt{s_x^2 / n}}$
z-Test vs. One-Sample t-Test
$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
$t = \frac{\bar{x} - \mu}{s_x / \sqrt{n}}$
Note the similarities between these tests: ONLY the source of “variance” and the distribution you test against have changed!
Using the One-Sample t-Test
• You are on the admissions board for a graduate school of Psychology.
• You are attempting to determine if the GRE scores for the students applying to your program are competitive with the national average.
– µVerbal = 569
• SPSS output from your data
Descriptive Statistics
                     N    Range    Mean      Std. Deviation
GRE                  24   310.00   659.7917  86.43267
Valid N (listwise)   24
Using the One-Sample t-Test
• Research Hypothesis:
– The GRE scores from your applicants differ from the population norms
• H1: µa ≠ µp or ES > 0
• Null Hypothesis
– The GRE scores from your applicants do not differ from the population norms
• H0: µa = µp or ES = 0
• Evaluate the students’ GRE-V scores
Using the One-Sample t-Test
• Select:
• Rejection region
– α = .05
• “Tail” or directionality
– We don’t know exactly how the students will score: we just expect them to show scores differing from the population values
– Might predict higher scores…
Using the One-Sample t-Test
• Generate sampling distribution of the mean assuming H0 is true
• One-Sample t-test
• Given our sampling distribution:
• Conduct the statistical test
Using the One-Sample t-Test
$t = \frac{\bar{x} - \mu}{s_x / \sqrt{n}}$
µVerbal = 569, x-bar = 659.79, s = 86.43, n = 24
$t = \frac{659.79 - 569}{86.43 / \sqrt{24}} = \frac{90.79}{86.43 / 4.90} = \frac{90.79}{17.64} = 5.15$
This numerical value is called tobt
tobt(23) = 5.15
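The hand calculation can be checked from the summary statistics in the SPSS Descriptive Statistics output. A minimal sketch; `one_sample_t` is a hypothetical helper name, not an SPSS or library function:

```python
import math

def one_sample_t(x_bar, mu, s, n):
    """Return (t, df) for a one-sample t-test computed from summary statistics."""
    standard_error = s / math.sqrt(n)   # estimated standard error of the mean
    return (x_bar - mu) / standard_error, n - 1

# Mean, SD and N from the GRE example; mu is the national GRE-V mean
t_obt, df = one_sample_t(x_bar=659.7917, mu=569, s=86.43267, n=24)
print(round(t_obt, 3), df)
```

Using the unrounded mean and standard deviation gives the value SPSS reports for t; the slide's 5.15 is the same result rounded to two decimals.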
Using the One-Sample t-Test
• SPSS Output
One-Sample Test (Test Value = 569, i.e. µ)
                                                           95% Confidence Interval of the Difference
       t      df   Sig. (2-tailed)   Mean Difference   Lower     Upper
GRE    5.146  23   .000              90.79167          54.2943   127.2890
tobt(23) = 5.15
Evaluating Statistical Significance of the t-Test
• First note:
– α = .05
– Tail or directionality: two-tailed
– t-Value = 5.15
– Degrees of freedom (df)
• For the One-Sample t-Test, df = n-1 (24-1 = 23)
• Estimating s from x-bar (not σ from µ)
Evaluating Statistical Significance of the t-Test
• In the past you…
– Identified a tabled value of tcrit
– Compared tcrit to your tobt value
– If tobt falls into the rejection region identified by tcrit, then we reject H0
– If tobt does not fall into the rejection region identified by tcrit, then we fail to reject H0
• SPSS simplifies matters by exactly calculating p for us
Using the One-Sample t-Test
• SPSS Output
One-Sample Test (Test Value = 569, i.e. µ)
                                                           95% Confidence Interval of the Difference
       t      df   Sig. (2-tailed)   Mean Difference   Lower     Upper
GRE    5.146  23   .000              90.79167          54.2943   127.2890
tobt(23) = 5.15, p < .05
Exact probability ≈ .00003
Evaluating Statistical Significance of the t-Test
[Figure: t-distribution with two-tailed rejection regions beyond tcrit = -2.069 and tcrit = 2.069; tobt = 5.15]
Because tobt falls within the rejection region identified by tcrit we reject H0
Testing Hypotheses: Two Matched (Repeated) Samples
• Sometimes, we’re interested in how a single set of scores changes over time
– Psychotherapy tx influences depression
– Patients respond to medication
– Consumer attitudes before and after an advertisement
– Changes in citizen attitudes following the State of the Union address
• When we look at two sets of scores collected from a single sample at different time points, we need to use a matched samples test
Matched Samples
• Matched samples
– Use the same participants at two or more different time points to collect similar data
• MUST BE THE SAME SAMPLE!
[Diagram: BDI-II at Time 1 → wait 30 days → BDI-II at Time 2]
Matched Samples Test
• With a matched samples test, you are testing the change in scores between the two administrations of the test
– H0: µ1 = µ2
– H0: µ1 - µ2 = 0 or ES = 0
• This is truly the null hypothesis for the matched samples test
Matched Samples Test
• Essentially, the group means at each time point mean little to us
– Change in scores is the key
– Conduct this test by obtaining the average difference score between the two time points
Matched Samples Test
$t = \frac{\bar{D} - 0}{s_D / \sqrt{n}}$
D-bar represents average difference scores between time points
$s_D$ is the standard deviation of the difference scores
-0 may seem redundant, but isn’t!
Calculating the Matched Samples t-Test
• You are a researcher examining the impact of a new therapy intervention on the incidence of self-injurious behavior (SIB)
• You collect a measure of the frequency of self-injurious acts when clients enter your treatment (time 1)
• You collect a measure of the frequency of self-injurious acts two weeks later (time 2)
Calculating the Matched Samples t-Test
• Research Hypothesis:
– The new treatment will change SIB scores
• H1: µ1 ≠ µ2 or ES > 0
• Null Hypothesis
– The SIB scores at time 2 will be the same as the scores at time 1 (no change)
• H0: µ1 = µ2
• H0: µ1 - µ2 = 0 or ES = 0
• Evaluate SIB at time 1 & time 2
Calculating the Matched Samples t-Test
• Select:
• Rejection region
– α = .05
• “Tail” or directionality
– We don’t know exactly how the treatment will work, so we’d better use a two-tailed test
Calculating the Matched Samples t-Test
• Generate sampling distribution of the mean assuming H0 is true
• Matched Samples t-test
• Given our sampling distribution:
• Conduct the statistical test
Calculating the Matched Samples t-Test
Time 1:   7  10  19  16  15  13  11  10   8  14  13
Time 2:   2   6  17   9  11   9  10   7   4  10   8
D:        5   4   2   7   4   4   1   3   4   4   5
D²:      25  16   4  49  16  16   1   9  16  16  25
∑D = 43
∑D² = 193
(∑D)² = 1849
D-bar = 3.91
Descriptive Statistics
                     N    Minimum  Maximum  Mean     Std. Deviation
time1                11   7.00     19.00    12.3636  3.58532
time2                11   2.00     17.00    8.4545   3.93354
Valid N (listwise)   11
Calculating the Matched Samples t-Test
$s_D^2 = \frac{\sum D^2 - \frac{(\sum D)^2}{n}}{n - 1}$
$s_D^2 = \frac{193 - \frac{43^2}{11}}{11 - 1} = \frac{193 - \frac{1849}{11}}{10} = \frac{193 - 168.09}{10} = \frac{24.91}{10} = 2.49$
$s_D = \sqrt{2.49} = 1.58$
Calculating the Matched Samples t-Test
$t = \frac{\bar{D} - 0}{s_D / \sqrt{n}}$
$t = \frac{3.91 - 0}{1.58 / \sqrt{11}} = \frac{3.91}{1.58 / 3.32} = \frac{3.91}{.48} = 8.15$
tobt = 8.15
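The same matched samples calculation can be run directly on the raw scores. A minimal sketch: the two lists hold the Time 1 and Time 2 SIB scores as read from the slide's data table, and the variable names are illustrative.

```python
import math
import statistics

# Time 1 and Time 2 SIB scores for the same 11 clients, paired by position
time1 = [7, 10, 19, 16, 15, 13, 11, 10, 8, 14, 13]
time2 = [2, 6, 17, 9, 11, 9, 10, 7, 4, 10, 8]

diffs = [a - b for a, b in zip(time1, time2)]   # D = Time 1 - Time 2
n = len(diffs)
d_bar = statistics.fmean(diffs)                 # D-bar
s_d = statistics.stdev(diffs)                   # s_D, with n - 1 in the denominator
t_obt = (d_bar - 0) / (s_d / math.sqrt(n))
df = n - 1                                      # number of PAIRS minus 1

print(round(d_bar, 2), round(s_d, 2), round(t_obt, 3), df)
```

Carrying full precision through the calculation gives a t slightly above the hand-computed 8.15; the gap comes from rounding s_D and √n at intermediate steps.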
Evaluating Statistical Significance of the t-Test
• First note:
– α = .05
– Tail or directionality: two-tailed
– t-Value = 8.15
– Degrees of freedom (df)
• For the Matched Samples t-Test:
– df = number of PAIRS of scores - 1
– df = 11 - 1 = 10
– Again, we can calculate p exactly with SPSS
Calculating the Matched Samples t-Test
• SPSS Output
Paired Samples Correlations
                         N    Correlation   Sig.
Pair 1  time1 & time2    11   .916          .000
Paired Samples Test
                         Paired Differences
                                                                95% Confidence Interval of the Difference
                         Mean     Std. Deviation  Std. Error Mean  Lower    Upper    t      df   Sig. (2-tailed)
Pair 1  time1 - time2    3.90909  1.57826         .47586           2.84880  4.96938  8.215  10   .000
tobt(10) = 8.15, p < .05
p ≈ .00001
Evaluating Statistical Significance of the t-Test
[Figure: t-distribution with two-tailed rejection regions beyond tcrit = -2.228 and tcrit = 2.228; tobt = 8.15]
Because tobt falls within the rejection region identified by tcrit we reject H0
Testing Hypotheses: Two Independent Samples
• Probably the most common use of the t-Test and the t-distribution
• Compare the mean scores of two groups on a single variable
– IV: Groups
– DV: Variable of interest
• Groups must be independent of one another
– Scores in 1 group cannot influence scores in the other group
Independent Samples t-Test
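$t = \frac{\bar{X}_1 - \bar{X}_2}{s_{\bar{x}_1 - \bar{x}_2}}$ or $t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
This test is calculated by dividing the mean difference between two groups by the “dispersion” or “variation” observed between the two groups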
Independent Samples t-Test:Degrees of Freedom
• 1 df lost for each σ estimated by s using x-bar
• Since there are two independent groups in this analysis, we must estimate σ twice
• df = (n1 + n2) - 2
Independent Samples t-Test: Example
• Let’s return to the example used for the matched samples test
• As a competent researcher, you realize that simply showing a change over time is not enough to prove the efficacy of your treatment
– People spontaneously change over time
• Show that an untreated control group does not change over the same period of time that your treatment group does change
Independent Samples t-Test: Example
[Diagram: SIB scores for the Tx group and the Ctrl group at Time 1, Time 2, and Time 3, with Tx administered between time points; “= ?” marks the comparison of the two groups’ scores]
Independent Samples t-Test: Example
• At time 1, the control and treatment SIB groups have equal SIB scores
• Administer the treatment for 2 weeks to Tx group
– The Control group receives no intervention during these two weeks
• Compare SIB scores of Tx and Control group after 2 weeks
• Provide Control group w/ intervention if desired
Independent Samples t-Test: Example
• Research Hypothesis:
– Your treatment for SIB will reduce SIB scores in the Tx group after 2 weeks
• H1: µt < µc
• Null Hypothesis
– Your treatment for SIB will have no effect
• H0: µt = µc
• Evaluate the efficacy of your treatment
Independent Samples t-Test: Example
Time 2 Data
Tx:        2   6  17   9  11   9  10   7   4  10   8
Control:  12  16  15  13  16   8  11   9  10  13  12

           Ctrl Group   Tx Group
∑x         135          93
∑x²        1729         941
(∑x)²      18225        8649
x-bar      12.27        8.45
s²         7.29         15.47
s          2.69         3.93
n          11           11

Descriptive Statistics
                     N    Minimum  Maximum  Mean     Std. Deviation
ctrl                 11   8.00     16.00    12.2727  2.68667
tx                   11   2.00     17.00    8.4545   3.93354
Valid N (listwise)   11
Independent Samples t-Test: Example
• Select:
• Rejection region
– α = .05
• “Tail” or directionality
– We have evidence that the treatment probably works, so we make a one-tailed hypothesis here (scores for the Tx group will be lower than the Control group at time 2)
Independent Samples t-Test: Example
• Generate sampling distribution of the mean assuming H0 is true
• Independent Samples t-Test
• Given our sampling distribution:
• Conduct the statistical test
Independent Samples t-Test: Example
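$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
$t = \frac{8.45 - 12.27}{\sqrt{\frac{15.47}{11} + \frac{7.29}{11}}} = \frac{-3.82}{\sqrt{1.41 + .66}} = \frac{-3.82}{\sqrt{2.07}} = \frac{-3.82}{1.44} = -2.65$
tobt(20) = -2.65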
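For a check on the arithmetic, the equal-n form of the test can be computed from the Time 2 scores. A minimal sketch: the lists are the Tx and Control scores as read from the slide's data table, and `independent_t_equal_n` is a hypothetical helper name.

```python
import math
import statistics

# Time 2 SIB scores for the two independent groups (n = 11 each)
tx = [2, 6, 17, 9, 11, 9, 10, 7, 4, 10, 8]
ctrl = [12, 16, 15, 13, 16, 8, 11, 9, 10, 13, 12]

def independent_t_equal_n(a, b):
    """t = (mean_a - mean_b) / sqrt(s_a^2 / n_a + s_b^2 / n_b), df = n_a + n_b - 2."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.fmean(a) - statistics.fmean(b)) / se, len(a) + len(b) - 2

t_obt, df = independent_t_equal_n(tx, ctrl)
print(round(t_obt, 3), df)
```

The sign is negative because the Tx mean is subtracted first; SPSS reports the same magnitude with the opposite sign because it subtracts in the other order.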
Evaluating Statistical Significance of the t-Test
• First note:
– α = .05
– Tail or directionality: one-tailed
– t-Value = -2.65
– Degrees of freedom (df)
• For the Independent Samples t-Test
– (n1 + n2) - 2
– (11+11)-2
– 22 - 2 = 20
Evaluating Statistical Significance of the t-Test
• SPSS Output
Independent Samples Test: Self-Injurious Behavior
                              Levene's Test for
                              Equality of Variances   t-test for Equality of Means
                              F      Sig.             t      df      Sig. (2-tailed)  Mean Difference  Std. Error Difference
Equal variances assumed       .518   .480             2.658  20      .015             3.81818          1.43625
Equal variances not assumed                           2.658  17.663  .016             3.81818          1.43625
tobt(20) = -2.65, p < .05
p ≈ .015 (two-tailed, as SPSS reports it; halve it for the one-tailed test)
Evaluating Statistical Significance of the t-Test
[Figure: t-distribution with a one-tailed rejection region below tcrit = -1.725; tobt = -2.65]
Because tobt falls within the rejection region identified by tcrit we reject H0
Independent Samples t-Test: One Complication
• There is a slight problem with the form of the equation we used…
– ONLY can be applied to groups with equal sample sizes
– A major limitation in real-world research
$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
Pooled Variance Estimate
• This equation permits tests with different sample sizes
• Generates an estimate of the total variance between groups weighted by the size of each group
– Therefore, larger samples have a greater impact on the variance
– Vice-versa for small samples
Pooled Variance Estimate
$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
Using the Pooled Variance Estimate
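$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$ becomes $t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}}$, which simplifies to $t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$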
Using the Pooled Variance Estimate: Example
Time 2 Data
Tx:        2   6  17   9  11   9  10   7   4  10   8
Control:  12  16  15  13  16  11  (No Data for the remaining cases)

           Ctrl Group   Tx Group
∑x         83           93
∑x²        1171         941
(∑x)²      6889         8649
x-bar      13.83        8.45
s²         4.57         15.47
s          2.14         3.93
n          6            11

Descriptive Statistics
                     N    Minimum  Maximum  Mean     Std. Deviation
ctrl                 6    11.00    16.00    13.8333  2.13698
tx                   11   2.00     17.00    8.4545   3.93354
Valid N (listwise)   6
Using the Pooled Variance Estimate: Example
$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
$s_p^2 = \frac{(11 - 1)15.47 + (6 - 1)4.57}{11 + 6 - 2} = \frac{(10)15.47 + (5)4.57}{15} = \frac{154.7 + 22.85}{15} = \frac{177.55}{15} = 11.84$
Using the Pooled Variance Estimate: Example
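$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$
$t = \frac{8.45 - 13.83}{\sqrt{11.84\left(\frac{1}{11} + \frac{1}{6}\right)}} = \frac{-5.38}{\sqrt{11.84(.0909 + .1667)}} = \frac{-5.38}{\sqrt{11.84(.2576)}} = \frac{-5.38}{\sqrt{3.05}} = \frac{-5.38}{1.75} = -3.07$
tobt(15) = -3.07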
Evaluating Statistical Significance of the t-Test
• First note:
– α = .05
– Tail or directionality: one-tailed
– t-Value = -3.07
– Degrees of freedom (df)
• For the Independent Samples t-Test
– (n1 + n2) - 2
– (11+6)-2
– 17 - 2 = 15
Evaluating Statistical Significance of the t-Test
• SPSS Output
Independent Samples Test: Self-Injurious Behavior
                              Levene's Test for
                              Equality of Variances   t-test for Equality of Means
                              F      Sig.             t      df      Sig. (2-tailed)  Mean Difference  Std. Error Difference
Equal variances assumed       .714   .411             3.080  15      .008             5.37879          1.74614
Equal variances not assumed                           3.653  14.979  .002             5.37879          1.47232
tobt(15) = -3.07, p < .05
p ≈ .0076
Evaluating Statistical Significance of the t-Test
[Figure: t-distribution with a one-tailed rejection region below tcrit = -1.753; tobt = -3.07]
Because tobt falls within the rejection region identified by tcrit we reject H0
Effect Size of The Independent Samples t-Test
$d = \frac{\mu_1 - \mu_2}{\sigma}$ or $d = \frac{\bar{X}_1 - \bar{X}_2}{s_p}$
We use the same effect size conventions we identified for the Matched Samples test
Effect Size of The Independent Samples t-Test
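$d = \frac{\bar{X}_1 - \bar{X}_2}{s_p}$
$d = \frac{8.45 - 13.83}{\sqrt{11.84}} = \frac{-5.38}{3.44} = -1.56$
The denominator is the pooled standard deviation, $s_p = \sqrt{11.84} = 3.44$, not the pooled variance $s_p^2 = 11.84$
An effect size well beyond the convention for a large effect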
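A short sketch of the effect size calculation from the summary values in the unequal-n example. `cohens_d` is a hypothetical helper name; note that the formula divides by s_p, the square root of the pooled variance s_p² = 11.84, so the variance has to be square-rooted before it is used.

```python
import math

def cohens_d(mean1, mean2, pooled_variance):
    """d = (mean1 - mean2) / s_p, where s_p is the square root of the pooled variance."""
    return (mean1 - mean2) / math.sqrt(pooled_variance)

# Tx mean, Control mean and pooled variance from the unequal-n example
d = cohens_d(8.45, 13.83, 11.84)
print(round(d, 2))
```

The sign only reflects which mean is subtracted first; the size of the effect is read from the absolute value.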
t-test Assumptions
• Although the t-test is generally a robust test, it can be affected by violations of underlying test assumptions
– Normality – sampling distribution is normally distributed
– Sample size – samples for each group should be of roughly equal size
– Homogeneity of variance – σ1 = σ2
t-test Assumptions
• One sample t-test
– Normality: ✓
– Sample size: ✗
– Homogeneity of variance: ✗
• Matched & Independent samples t-test(s)
– Normality: ✓
– Sample size: ✓
– Homogeneity of variance: ✓
Impact of Violated Assumptions
• For equal sample sizes…
– …violating homogeneity of variance…
• Minimal impact (α = .05 ± .02)
– …with minor normality violations…
• Similar results as above
– …with major normality violations…
• Severe skew (particularly in opposite directions) can lead to significant problems unless variances are fairly equal
Impact of Violated Assumptions
• Unequal sample sizes…
– Much more difficult to interpret
– Unequal sample sizes + heterogeneity of variance = distortions in p
• Possibly increased risk of Type I error
• Risk of error increases as more assumptions are violated
Coping with Violated Assumptions
• What can we do to prevent or cope with violated assumptions?
1. Maintain equal sample sizes
2. Use trimmed samples…
3. Use a distribution free (i.e. non-parametric) test
4. Apply a statistical correction to t
Coping with Violated Assumptions
• SPSS Output
Independent Samples Test: Self-Injurious Behavior
                              Levene's Test for
                              Equality of Variances   t-test for Equality of Means
                              F      Sig.             t      df      Sig. (2-tailed)  Mean Difference  Std. Error Difference
Equal variances assumed       .714   .411             3.080  15      .008             5.37879          1.74614
Equal variances not assumed                           3.653  14.979  .002             5.37879          1.47232
If the Sig. value for Levene's F is < .05, use the “Equal variances not assumed” row
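The “Equal variances not assumed” row corresponds to what is usually called Welch's t-test with the Welch–Satterthwaite degrees of freedom. The slides do not name the correction, so treat that label as an assumption. A minimal sketch using the unequal-n scores as read from the data table; `welch_t` is a hypothetical helper name.

```python
import math
import statistics

tx = [2, 6, 17, 9, 11, 9, 10, 7, 4, 10, 8]
ctrl = [12, 16, 15, 13, 16, 11]

def welch_t(a, b):
    """Unpooled t with Welch-Satterthwaite df (no equal-variance assumption)."""
    va = statistics.variance(a) / len(a)   # s_a^2 / n_a
    vb = statistics.variance(b) / len(b)   # s_b^2 / n_b
    t = (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(va + vb)
    # df shrinks below n_a + n_b - 2 as the variances and sample sizes diverge
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

t_welch, df_welch = welch_t(tx, ctrl)
print(round(t_welch, 3), round(df_welch, 3))
```

The fractional df is why SPSS shows a non-integer value in that row.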
Statistical Tests We Have Learned
1. z-Test
• 1 group
• 1 set of data
• µ & σ known
2. One-Sample t-Test
• 1 group
• 1 set of data
• µ known
• Estimate σ with s using x-bar
3. Matched Samples t-Test
• 1 group
• 2 sets of data
• µ & σ unknown
• Estimate σD with sD using D-bar
4. Independent Samples t-Test
• 2 groups
• 2 sets of data
• µ & σ unknown
• Estimate σ twice with s using x-bar
Choosing the Best Test
• Flow-chart available on the website:
– http://www.personal.kent.edu/~marmey
• Also refer to the diagram on p. 11 of your Howell text
• Try the review problems on the website for an example of the types of questions I might ask on an exam!