
Testing Differences between Means, continued

Statistics for Political Science, Levin and Fox, Chapter Seven

Testing Differences between Means

To test the significance of a mean difference we need to find the standard deviation for any obtained mean difference.

However, we rarely know the standard deviation of the distribution of mean differences since we rarely have population data. Fortunately, it can be estimated based on two samples that we draw from the same population.

Step 2b: Translate our sample mean difference into units of standard deviation.

$$z = \frac{(\bar{X}_1 - \bar{X}_2) - 0}{\sigma_{\bar{X}_1 - \bar{X}_2}}$$

Where $\bar{X}_1$ = mean of the first sample

$\bar{X}_2$ = mean of the second sample

0 = zero, the value of the mean of the sampling distribution of differences between means (we assume that $\mu_1 - \mu_2 = 0$)

$\sigma_{\bar{X}_1 - \bar{X}_2}$ = standard error of the difference between means (standard deviation of the distribution of the difference between means)

We can reduce this equation down to the following:

$$z = \frac{\bar{X}_1 - \bar{X}_2}{\sigma_{\bar{X}_1 - \bar{X}_2}}$$

Remember: this formula requires the standard deviation of the distribution of mean differences.

Result (assuming $\sigma_{\bar{X}_1 - \bar{X}_2}$ equals 2):

$$z = \frac{45 - 40}{2} = +2.5$$

Thus, a difference of 5 between the means of the two samples (women and men) falls 2.5 standard deviations from a mean of zero.
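As a quick check of the arithmetic above, here is a minimal Python sketch. The function name `z_score` is only illustrative, and the value 2 for the standard deviation of the distribution of mean differences is the assumed value from the slide, not an estimate from data.

```python
def z_score(mean_1, mean_2, sigma_diff):
    """Translate a difference between sample means into standard-deviation units.

    The 0 is the assumed mean of the sampling distribution of differences
    (mu_1 - mu_2 = 0 under the null hypothesis).
    """
    return ((mean_1 - mean_2) - 0) / sigma_diff


# Sample means of 45 and 40, with an assumed sigma of 2.
z = z_score(45, 40, 2)
print(z)  # 2.5
```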

Child Rearing: Comparing Males and Females

Standard Error of the Difference between Means

Here is how the standard error of the difference between means can be calculated.

$$s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\left(\frac{N_1 s_1^2 + N_2 s_2^2}{N_1 + N_2 - 2}\right)\left(\frac{N_1 + N_2}{N_1 N_2}\right)}$$

The formula for $s_{\bar{X}_1 - \bar{X}_2}$ combines the information from the two samples.

Where

$$s_1^2 = \frac{\sum X_1^2}{N_1} - \bar{X}_1^2 \qquad s_2^2 = \frac{\sum X_2^2}{N_2} - \bar{X}_2^2$$

A large difference between $\bar{X}_1$ and $\bar{X}_2$ can result if (1) one mean is very small, (2) one mean is very large, or (3) one mean is moderately small and the other is moderately large.

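Variance: Weeks on Unemployment (N = 6)

X (weeks)    Deviation: raw score from the mean, $X - \bar{X}$    Deviation squared, $(X - \bar{X})^2$
9            9 - 5 = 4                                            4^2 = 16
8            8 - 5 = 3                                            3^2 = 9
6            6 - 5 = 1                                            1^2 = 1
4            4 - 5 = -1                                           (-1)^2 = 1
2            2 - 5 = -3                                           (-3)^2 = 9
1            1 - 5 = -4                                           (-4)^2 = 16
ΣX = 30                                                           Σ(X - X̄)^2 = 52 (weeks squared)

Step 1: Calculate the mean

$$\bar{X} = \frac{\sum X}{N} = \frac{30}{6} = 5$$

Step 2: Calculate the deviation of each raw score from the mean

Step 3: Calculate the sum of squared deviations

$$\sum (X - \bar{X})^2 = 52$$

Step 4: Calculate the mean of the squared deviations (the variance)

$$s^2 = \frac{\sum (X - \bar{X})^2}{N} = \frac{52}{6} = 8.67$$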
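The four steps above can be sketched in a few lines of Python. This is only an illustration using the six weeks-on-unemployment scores from the slide; the variable names are assumptions, and the sum of squares is divided by N (not N - 1), following the slide.

```python
# Weeks on unemployment for the six cases in the example.
weeks = [9, 8, 6, 4, 2, 1]
n = len(weeks)

# Step 1: the mean.
mean = sum(weeks) / n

# Step 2: deviation of each raw score from the mean.
deviations = [x - mean for x in weeks]

# Step 3: sum of squared deviations.
sum_of_squares = sum(d ** 2 for d in deviations)

# Step 4: mean of the squared deviations (the variance), dividing by N.
variance = sum_of_squares / n

print(mean)                # 5.0
print(sum_of_squares)      # 52.0
print(round(variance, 2))  # 8.67
```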

Testing the Difference between Means

Let’s say that we have the following information about two samples, one of liberals and one of conservatives, on the progressive scale:

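Liberals                Conservatives
$N_1 = 25$              $N_2 = 35$
$\bar{X}_1 = 60$        $\bar{X}_2 = 49$
$s_1 = 12$              $s_2 = 14$

We can use this information to calculate the estimate of the standard error of the difference between means. We start with our formula:

$$\begin{aligned}
s_{\bar{X}_1 - \bar{X}_2} &= \sqrt{\left(\frac{N_1 s_1^2 + N_2 s_2^2}{N_1 + N_2 - 2}\right)\left(\frac{N_1 + N_2}{N_1 N_2}\right)} \\
&= \sqrt{\left(\frac{(25)(12)^2 + (35)(14)^2}{25 + 35 - 2}\right)\left(\frac{25 + 35}{(25)(35)}\right)} \\
&= \sqrt{\left(\frac{3{,}600 + 6{,}860}{58}\right)\left(\frac{60}{875}\right)} \\
&= \sqrt{(180.3448)(.0686)} \\
&= \sqrt{12.3717} \\
&= 3.52
\end{aligned}$$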

The standard error of the difference between means is 3.52.
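For readers who want to verify the figure, the calculation can be sketched in Python as follows. The helper name `se_diff_means` is an illustrative assumption; the formula and the inputs are the ones given in the example.

```python
import math


def se_diff_means(n1, s1, n2, s2):
    """Estimated standard error of the difference between two sample means."""
    # Combine the two sample variances, weighted by sample size.
    pooled = (n1 * s1 ** 2 + n2 * s2 ** 2) / (n1 + n2 - 2)
    # Scale by (N1 + N2) / (N1 * N2) and take the square root.
    return math.sqrt(pooled * (n1 + n2) / (n1 * n2))


# Liberals: N1 = 25, s1 = 12. Conservatives: N2 = 35, s2 = 14.
se = se_diff_means(25, 12, 35, 14)
print(round(se, 2))  # 3.52
```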

We can now use our standard error result to translate the difference between sample means into a t ratio:

$$t = \frac{\bar{X}_1 - \bar{X}_2}{s_{\bar{X}_1 - \bar{X}_2}} = \frac{60 - 49}{3.52} = \frac{11}{3.52} = 3.13$$

REMEMBER: We use t instead of z because we do not know the true population standard deviation.

We aren’t finished yet!

Turn to Table C.

1) Because we are estimating both σ1 and σ2 from s1 and s2, we use a wider t distribution, with degrees of freedom N1 + N2 – 2.

2) For each standard deviation that we estimate, we lose 1 degree of freedom from the total number of cases.

N = 60

df = 25 + 35 – 2 = 58

In Table C, use the critical values for df = 40, since 58 is not given.

We see that our t-value of 3.13 exceeds all the standard critical points except for the .001 level.

Therefore, based on what we established BEFORE our study, we reject the null hypothesis at the .10, .05, or .01 level.

df .20 .10 .05 .02 .01 .001

40 1.303 1.684 2.021 2.423 2.704 3.551
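The comparison against Table C can be sketched in Python. This is a hedged illustration: the dictionary `table_c_df40` simply copies the df = 40 row shown above, and the variable names are assumptions. The standard error is recomputed from the summary statistics rather than taken as the rounded 3.52.

```python
import math

# Summary statistics from the liberals/conservatives example.
n1, mean1, s1 = 25, 60, 12
n2, mean2, s2 = 35, 49, 14

# Standard error of the difference between means.
pooled = (n1 * s1 ** 2 + n2 * s2 ** 2) / (n1 + n2 - 2)
se = math.sqrt(pooled * (n1 + n2) / (n1 * n2))

# t ratio and degrees of freedom.
t = (mean1 - mean2) / se
df = n1 + n2 - 2

# Table C row for df = 40 (used because df = 58 is not listed),
# keyed by significance level.
table_c_df40 = {0.20: 1.303, 0.10: 1.684, 0.05: 2.021,
                0.02: 2.423, 0.01: 2.704, 0.001: 3.551}

# Levels at which the obtained t exceeds the critical value.
rejected_at = [alpha for alpha, critical in table_c_df40.items()
               if abs(t) > critical]

print(round(t, 2), df)  # 3.13 58
print(rejected_at)      # [0.2, 0.1, 0.05, 0.02, 0.01]
```

The significance level should still be chosen before the study; the list only shows which tabled levels the obtained t ratio clears.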

Comparing the Same Sample Measured Twice

Some research employs a panel design or before and after test (testing the same sample at two points in time).

In these types of studies, the same sample is tested twice. These are not two samples from the same population; the same group of people is measured twice.

CRITICAL POINTS TO NOTE:

1. The same sample measured twice uses the t-test of difference between means for the same sample measured twice.

2. Different samples from the same population selected at two points in time use the t-test of difference between means for independent groups.

Example Problem of Test of Difference Between Means for Same Sample Measured Twice

Null Hypothesis (µ1 = µ2): The degree of neighborliness does not differ before and after relocation.

Research Hypothesis (µ1 ≠ µ2): The degree of neighborliness differs before and after relocation.

Where µ1 is the mean score of neighborliness at time 1

Where µ2 is the mean score of neighborliness at time 2

Respondent    Before ($X_1$)    After ($X_2$)    Difference ($D = X_1 - X_2$)    Difference squared ($D^2$)
Johnson       2                 1                1                               1
Robinson      1                 2                -1                              1
Brown         3                 1                2                               4
Thomas        3                 1                2                               4
Smith         1                 2                -1                              1
Holmes        4                 1                3                               9
              ∑ X1 = 14         ∑ X2 = 8                                         ∑ D2 = 20

The formula for obtaining the standard deviation for the distribution of before-after difference scores:

$$s_D = \sqrt{\frac{\sum D^2}{N} - (\bar{X}_1 - \bar{X}_2)^2}$$

$s_D$ = standard deviation of the distribution of before-after difference scores

D = after-move raw score subtracted from before-move raw score

N = number of cases or respondents in sample

From this, we get the formula for the standard error of the difference between the means:

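$$s_{\bar{D}} = \frac{s_D}{\sqrt{N - 1}}$$

Step 1: Find the mean for each point in time

$$\bar{X}_1 = \frac{\sum X_1}{N} = \frac{14}{6} = 2.33 \qquad \bar{X}_2 = \frac{\sum X_2}{N} = \frac{8}{6} = 1.33$$

Step 2: Find the standard deviation for the difference between the times

$$s_D = \sqrt{\frac{20}{6} - (2.33 - 1.33)^2} = 1.53$$

Step 3: Find the standard error for the difference between the times

$$s_{\bar{D}} = \frac{1.53}{\sqrt{6 - 1}} = .68$$

Step 4: Translate the mean difference into a t score

$$t = \frac{\bar{X}_1 - \bar{X}_2}{s_{\bar{D}}} = \frac{2.33 - 1.33}{.68} = 1.47$$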

Comparing the Same Sample Measured Twice

Step 5: Calculate the degrees of freedom

df = N – 1 = 6 – 1 = 5

Step 6: Compare the obtained t ratio with the t ratio in Table C

Obtained t = 1.47

Table t = 2.571

df = 5

α = .05

In order to reject the null hypothesis at the .05 significance level with five degrees of freedom, we must obtain a calculated t ratio of at least 2.571. Because our t ratio is only 1.47, we retain the null hypothesis.

df .20 .10 .05 .02 .01 .001

5 1.476 2.015 2.571 3.365 4.032 6.859
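The six steps for the same sample measured twice can be sketched in Python using the neighborliness scores from the table. The variable names are illustrative assumptions. Note that carrying full precision through the steps gives a t ratio of about 1.46; the slides round the intermediate values (1.53 and .68) and so arrive at 1.47. The conclusion is the same either way.

```python
import math

# Neighborliness scores before and after relocation (six respondents).
before = [2, 1, 3, 3, 1, 4]
after = [1, 2, 1, 1, 2, 1]
n = len(before)

# Step 1: mean for each point in time.
mean_before = sum(before) / n
mean_after = sum(after) / n

# Step 2: standard deviation of the before-after difference scores.
diffs = [b - a for b, a in zip(before, after)]
sum_d_squared = sum(d ** 2 for d in diffs)
s_d = math.sqrt(sum_d_squared / n - (mean_before - mean_after) ** 2)

# Step 3: standard error of the difference.
se_d = s_d / math.sqrt(n - 1)

# Step 4: t ratio.
t = (mean_before - mean_after) / se_d

# Step 5: degrees of freedom.
df = n - 1

# Step 6: compare with the Table C value for df = 5 at alpha = .05.
critical_t = 2.571
reject_null = abs(t) > critical_t

print(sum_d_squared, round(s_d, 2), round(se_d, 2))  # 20 1.53 0.68
print(df, reject_null)                               # 5 False
```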

Two Sample Test of Proportions

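$$z = \frac{P_1 - P_2}{s_{P_1 - P_2}}$$

Where $P_1$ and $P_2$ are the respective sample proportions.

The standard error of the difference in proportions is:

$$s_{P_1 - P_2} = \sqrt{P^*(1 - P^*)\left(\frac{N_1 + N_2}{N_1 N_2}\right)}$$

Where $P^*$ is the combined sample proportion:

$$P^* = \frac{N_1 P_1 + N_2 P_2}{N_1 + N_2}$$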
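A minimal Python sketch of these three formulas follows. The slides give no worked data for this test, so the sample sizes and proportions below are hypothetical values chosen only to show the mechanics, and the function name `two_proportion_z` is likewise an assumption.

```python
import math


def two_proportion_z(n1, p1, n2, p2):
    """z ratio for the difference between two sample proportions."""
    # Combined sample proportion P*.
    p_star = (n1 * p1 + n2 * p2) / (n1 + n2)
    # Standard error of the difference in proportions.
    se = math.sqrt(p_star * (1 - p_star) * (n1 + n2) / (n1 * n2))
    return (p1 - p2) / se


# Hypothetical data (not from the slides): two samples of 100 cases,
# with sample proportions of .60 and .50.
z = two_proportion_z(100, 0.60, 100, 0.50)
print(round(z, 2))
```

With these made-up inputs the combined proportion is .55, and the resulting z can be compared with the usual critical value (for example, 1.96 for a two-tailed test at the .05 level).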

Requirements to consider when judging the appropriateness of the t ratio as a test of significance (for testing the difference between means):

1. The t ratio is used to make comparisons between two means.

2. The assumption is that we are working with interval level data.

3. We used a random sampling process.

4. The sample characteristic is normally distributed.

5. The t ratio for independent samples assumes that the population variances are equal.

So how do you interpret the results and state them for inclusion in your research?

“Since the observed value of t (state the test statistic) exceeds the critical value (state the critical value), the null hypothesis is rejected in favor of the directional alternative hypothesis. The probability that the observed difference (state the difference between means) would have occurred by chance, if in fact the null hypothesis is true, is less than .05.”
