Testing Differences between Means, continued
Statistics for Political Science. Levin and Fox, Chapter Seven
Testing Differences between Means
To test the significance of a mean difference we need to find the standard deviation for any obtained mean difference.
However, we rarely know the standard deviation of the distribution of mean differences since we rarely have population data. Fortunately, it can be estimated based on two samples that we draw from the same population.
Step 2b: Translate our sample mean difference into units of standard deviation.
$$z = \frac{(\bar{X}_1 - \bar{X}_2) - 0}{\sigma_{\bar{X}_1 - \bar{X}_2}}$$
Where $\bar{X}_1$ = mean of the first sample
$\bar{X}_2$ = mean of the second sample
0 = zero, the value of the mean of the sampling distribution of differences between means (we assume that $\mu_1 - \mu_2 = 0$)
$\sigma_{\bar{X}_1 - \bar{X}_2}$ = standard error of the difference between means (standard deviation of the distribution of the difference between means)
We can reduce this equation down to the following:
$$z = \frac{\bar{X}_1 - \bar{X}_2}{\sigma_{\bar{X}_1 - \bar{X}_2}}$$
Remember, this formula requires the standard deviation of the distribution of mean differences.
Child Rearing: Comparing Males and Females
Result (assuming $\sigma_{\bar{X}_1 - \bar{X}_2}$ equals 2):
$$z = \frac{45 - 40}{2} = +2.5$$
Thus, a difference of 5 between the means of the two samples (women and men) falls 2.5 standard deviations from a mean of zero.
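As a quick check of the arithmetic, the z conversion can be sketched in Python. The function name `z_for_mean_difference` and its parameter names are illustrative assumptions, not part of the slides; the numbers (45, 40, and a standard error of 2) are the ones given in the example.

```python
def z_for_mean_difference(mean_1, mean_2, sigma_diff, null_diff=0.0):
    """Express an obtained mean difference in standard-error units.

    null_diff is the mean of the sampling distribution of differences,
    assumed to be zero under the null hypothesis (mu1 - mu2 = 0).
    """
    return ((mean_1 - mean_2) - null_diff) / sigma_diff


# Women's mean 45, men's mean 40, standard error assumed to equal 2
z = z_for_mean_difference(45, 40, 2)
print(f"z = {z:+.1f}")
```

The sign of z only reflects which sample mean was entered first.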
Standard Error of the Difference between Means
Here is how the standard error of the difference between means can be calculated.
$$s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\left(\frac{N_1 s_1^2 + N_2 s_2^2}{N_1 + N_2 - 2}\right)\left(\frac{N_1 + N_2}{N_1 N_2}\right)}$$
The formula for $s_{\bar{X}_1 - \bar{X}_2}$ combines the information from the two samples.
A large difference between $\bar{X}_1$ and $\bar{X}_2$ can result if (1) one mean is very small, (2) one mean is very large, or (3) one mean is moderately small and the other is moderately large.
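The sample variances $s_1^2$ and $s_2^2$ used in $s_{\bar{X}_1 - \bar{X}_2}$ are calculated as:
$$s_1^2 = \frac{\sum X_1^2}{N_1} - \bar{X}_1^2 \qquad s_2^2 = \frac{\sum X_2^2}{N_2} - \bar{X}_2^2$$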
Variance: Weeks on Unemployment (N = 6)
X (weeks)    Deviation (raw score from the mean)    Squared deviation (weeks squared)
9            9 - 5 = 4                              4^2 = 16
8            8 - 5 = 3                              3^2 = 9
6            6 - 5 = 1                              1^2 = 1
4            4 - 5 = -1                             (-1)^2 = 1
2            2 - 5 = -3                             (-3)^2 = 9
1            1 - 5 = -4                             (-4)^2 = 16
ΣX = 30                                             Σ(X - X̄)^2 = 52
Step 1: Calculate the mean
$$\bar{X} = \frac{\sum X}{N} = \frac{30}{6} = 5$$
Step 2: Calculate the deviation of each raw score from the mean, $(X - \bar{X})$
Step 3: Calculate the sum of squared deviations
$$\sum (X - \bar{X})^2 = 52$$
Step 4: Calculate the mean of the squared deviations (the variance)
$$s^2 = \frac{\sum (X - \bar{X})^2}{N} = \frac{52}{6} = 8.67$$
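The four steps can be traced in a short Python sketch. The variable names are illustrative assumptions; the data are the six values from the table. Note that the variance here divides by N, which matches `statistics.pvariance` (not `statistics.variance`, which divides by N - 1).

```python
import statistics

weeks = [9, 8, 6, 4, 2, 1]  # weeks on unemployment, N = 6
n = len(weeks)

# Step 1: calculate the mean
mean = sum(weeks) / n

# Step 2: deviation of each raw score from the mean
deviations = [x - mean for x in weeks]

# Step 3: sum of squared deviations
sum_sq_dev = sum(d ** 2 for d in deviations)

# Step 4: mean of the squared deviations (variance, dividing by N)
variance = sum_sq_dev / n

print(f"mean = {mean}, sum of squares = {sum_sq_dev}, variance = {variance:.2f}")

# Cross-check against the standard library's divide-by-N variance
assert abs(variance - statistics.pvariance(weeks)) < 1e-12
```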
Testing the Difference between Means
Let’s say that we have the following information about two samples, one of liberals and one of conservatives, on the progressive scale:
Liberals Conservatives
$N_1 = 25$ $N_2 = 35$
$\bar{X}_1 = 60$ $\bar{X}_2 = 49$
$s_1 = 12$ $s_2 = 14$
We can use this information to calculate the estimate of the standard
error of the difference between means:
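We start with our formula:
$$\begin{aligned}
s_{\bar{X}_1 - \bar{X}_2} &= \sqrt{\left(\frac{N_1 s_1^2 + N_2 s_2^2}{N_1 + N_2 - 2}\right)\left(\frac{N_1 + N_2}{N_1 N_2}\right)} \\
&= \sqrt{\left(\frac{(25)(12)^2 + (35)(14)^2}{25 + 35 - 2}\right)\left(\frac{25 + 35}{(25)(35)}\right)} \\
&= \sqrt{\left(\frac{3{,}600 + 6{,}860}{58}\right)\left(\frac{60}{875}\right)} \\
&= \sqrt{(180.3448)(.0686)} \\
&= \sqrt{12.3717} \\
&= 3.52
\end{aligned}$$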
The standard error of the difference between means is 3.52.
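The calculation can be reproduced with a small Python sketch. The function name `se_mean_difference` is an assumption made for illustration; the formula and the inputs (N1 = 25, s1 = 12, N2 = 35, s2 = 14) come from the slides. Without rounding the intermediate terms the result is about 3.517, which rounds to the same 3.52.

```python
import math


def se_mean_difference(n1, s1, n2, s2):
    """Estimated standard error of the difference between two sample means."""
    # First factor: combined sums of squares over N1 + N2 - 2
    combined_variance = (n1 * s1 ** 2 + n2 * s2 ** 2) / (n1 + n2 - 2)
    # Second factor: (N1 + N2) / (N1 * N2)
    size_term = (n1 + n2) / (n1 * n2)
    return math.sqrt(combined_variance * size_term)


# Liberals: N1 = 25, s1 = 12; Conservatives: N2 = 35, s2 = 14
se = se_mean_difference(25, 12, 35, 14)
print(f"standard error = {se:.2f}")
```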
We can now use our standard error to translate the difference between sample means into a t ratio:
$$t = \frac{\bar{X}_1 - \bar{X}_2}{s_{\bar{X}_1 - \bar{X}_2}} = \frac{60 - 49}{3.52} = \frac{11}{3.52} = 3.13$$
REMEMBER: We use t
instead of z because we do
not know the true population
standard deviation.
We aren’t finished yet!
Turn to Table C.
1) Because we are estimating both σ1 and σ2 from s1 and s2, we use a wider t distribution, with degrees of freedom N1 + N2 – 2.
2) For each standard deviation that we estimate, we lose 1 degree of freedom from the total number of cases.
N = 60
df = (25 + 35 – 2) = 58
In Table C, use the row for df = 40, the next lower value listed, since 58 is not given.
We see that our t-value of 3.13 exceeds all the standard critical points except for the .001 level.
Therefore, depending on the significance level we established BEFORE our study, we reject the null hypothesis at the .10, .05, or .01 level.
df .20 .10 .05 .02 .01 .001
40 1.303 1.684 2.021 2.423 2.704 3.551
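The table lookup can be sketched in Python as a simple comparison. The dictionary below copies only the df = 40 row shown above; the names `TABLE_C_DF40` and `t_ratio` are assumptions made for this illustration, and the standard error of 3.52 is the rounded value from the slides.

```python
# Critical t values for df = 40, copied from the Table C row above
TABLE_C_DF40 = {0.20: 1.303, 0.10: 1.684, 0.05: 2.021,
                0.02: 2.423, 0.01: 2.704, 0.001: 3.551}


def t_ratio(mean_1, mean_2, se_diff):
    """Difference between sample means in estimated standard-error units."""
    return (mean_1 - mean_2) / se_diff


t = t_ratio(60, 49, 3.52)

# Levels at which the obtained t exceeds the critical value
rejected_at = [alpha for alpha, critical in TABLE_C_DF40.items()
               if abs(t) > critical]

print(f"t = {t:.3f}")
print("null hypothesis rejected at:", rejected_at)
```

The list only reports which critical values are exceeded; the level that counts is still the one chosen before the study.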
Comparing the Same Sample Measured Twice
Some research employs a panel design or before and after test (testing the same sample at two points in time).
In these types of studies, the same sample is tested twice. It is not two samples from the same population; it is the same group of people measured twice.
CRITICAL POINTS TO NOTE:
1. The same sample measured twice uses the t-test of difference between means for the same sample measured twice.
2. Different samples from the same population selected at two points in time use the t-test of difference between means for independent groups.
Example Problem of Test of Difference Between Means for Same Sample Measured Twice
Null Hypothesis (µ1 = µ2): The degree of neighborliness does not differ before and after relocation.
Research Hypothesis (µ1 ≠ µ2): The degree of neighborliness differs before and after relocation.
Where µ1 is the mean score of neighborliness at time 1
Where µ2 is the mean score of neighborliness at time 2
Respondent    Before (X1)    After (X2)    Difference (D = X1 – X2)    Difference squared (D^2)
Johnson       2              1             1                           1
Robinson      1              2             -1                          1
Brown         3              1             2                           4
Thomas        3              1             2                           4
Smith         1              2             -1                          1
Holmes        4              1             3                           9
              ∑X1 = 14       ∑X2 = 8                                   ∑D^2 = 20
The formula for obtaining the standard deviation for the distribution of before-after difference scores:
$$s_D = \sqrt{\frac{\sum D^2}{N} - (\bar{X}_1 - \bar{X}_2)^2}$$
sD = standard deviation of the distribution of before-after difference scores
D = before-move raw score minus after-move raw score (X1 – X2)
N = number of cases or respondents in sample
From this, we get the formula for the standard error of the difference between the means:
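$$s_{\bar{D}} = \frac{s_D}{\sqrt{N - 1}}$$
Step 1: Find the mean for each point in time
$$\bar{X}_1 = \frac{\sum X_1}{N} = \frac{14}{6} = 2.33 \qquad \bar{X}_2 = \frac{\sum X_2}{N} = \frac{8}{6} = 1.33$$
Step 2: Find the standard deviation for the difference between the times
$$s_D = \sqrt{\frac{20}{6} - (2.33 - 1.33)^2} = 1.53$$
Step 3: Find the standard error for the difference between the times
$$s_{\bar{D}} = \frac{1.53}{\sqrt{6 - 1}} = .68$$
Step 4: Translate the mean difference into a t score
$$t = \frac{\bar{X}_1 - \bar{X}_2}{s_{\bar{D}}} = \frac{2.33 - 1.33}{.68} = 1.47$$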
Step 5: Calculate the degrees of freedom
df = (N – 1)
= 6 – 1
= 5
Step 6: Compare the obtained t ratio with the t ratio in Table C
Obtained t = 1.47
Table t = 2.571
df = 5
α = .05
In order to reject the null hypothesis at the .05 level of significance with five degrees of freedom, we must obtain a calculated t ratio of at least 2.571. Because our t ratio is only 1.47, we retain the null hypothesis.
df .20 .10 .05 .02 .01 .001
5 1.476 2.015 2.571 3.365 4.032 6.859
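The six steps for the same sample measured twice can be followed in a Python sketch using the neighborliness scores from the table. Variable names are illustrative assumptions. One caveat: the slides round intermediate values (2.33, 1.33, 1.53, .68) and arrive at t = 1.47, whereas carrying full precision gives roughly 1.46; the conclusion is the same either way.

```python
import math

# Neighborliness scores for the six respondents, before and after relocation
before = [2, 1, 3, 3, 1, 4]
after = [1, 2, 1, 1, 2, 1]
n = len(before)

# Step 1: mean for each point in time
mean_before = sum(before) / n
mean_after = sum(after) / n

# Step 2: standard deviation of the before-after difference scores
diffs = [b - a for b, a in zip(before, after)]
sum_d_sq = sum(d ** 2 for d in diffs)
s_d = math.sqrt(sum_d_sq / n - (mean_before - mean_after) ** 2)

# Step 3: standard error of the difference
se_d = s_d / math.sqrt(n - 1)

# Step 4: translate the mean difference into a t ratio
t = (mean_before - mean_after) / se_d

# Step 5: degrees of freedom
df = n - 1

# Step 6: compare with the Table C value for df = 5, alpha = .05
critical_t = 2.571
reject_null = abs(t) >= critical_t

print(f"s_D = {s_d:.2f}, SE = {se_d:.2f}, t = {t:.3f}, df = {df}")
print("reject null hypothesis:", reject_null)
```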
Two Sample Test of Proportions
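$$z = \frac{P_1 - P_2}{s_{P_1 - P_2}}$$
Where $P_1$ and $P_2$ are the respective sample proportions.
The standard error of the difference in proportions is:
$$s_{P_1 - P_2} = \sqrt{P^*(1 - P^*)\left(\frac{N_1 + N_2}{N_1 N_2}\right)}$$
Where $P^*$ is the combined sample proportion:
$$P^* = \frac{N_1 P_1 + N_2 P_2}{N_1 + N_2}$$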
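The slides give no worked example for proportions, so the sketch below uses hypothetical values (two samples of 100 with proportions .60 and .50) purely to show how the three formulas fit together. The function name `two_proportion_z` is likewise an assumption for illustration.

```python
import math


def two_proportion_z(p1, n1, p2, n2):
    """z ratio for the difference between two sample proportions."""
    # Combined sample proportion P*
    p_star = (n1 * p1 + n2 * p2) / (n1 + n2)
    # Standard error of the difference in proportions
    se = math.sqrt(p_star * (1 - p_star) * ((n1 + n2) / (n1 * n2)))
    return (p1 - p2) / se


# Hypothetical data, not from the slides: P1 = .60 (N1 = 100), P2 = .50 (N2 = 100)
z = two_proportion_z(0.60, 100, 0.50, 100)
print(f"z = {z:.2f}")
```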
Requirements to consider when judging the appropriateness of the t ratio as a test of significance (for testing the difference between means):
1. The t ratio is used to make comparisons between two means.
2. The assumption is that we are working with interval level data.
3. We used a random sampling process.
4. The sample characteristic is normally distributed.
5. The t ratio for independent samples assumes that the population variances are equal.
So how do you interpret the results and state them for inclusion in your research?
“Since the observed value of t (state the test statistic) exceeds the critical value (state the critical value), the null hypothesis is rejected in favor of the directional alternative hypothesis. The probability that the observed difference (state the difference between means) would have occurred by chance, if in fact the null hypothesis is true, is less than .05.”