quant 2 - lecture 7 (t-tests w error)personal.kent.edu/~marmey/quant2spring05/quant 2... · concept...

Student’s t-Distribution

The t-Distribution, t-Tests, Measures of Effect Size, & Managing Violations of Assumptions

Sampling Distributions Redux

• Chapter 7 opens with a return to the concept of sampling distributions from chapter 4
– Sampling distributions of the mean

Sampling Distribution of the Mean

• Because the SDotM is so important in statistics, you should understand it

• The SDotM is governed by the Central Limit Theorem

Given a population with a mean μ and a variance σ², the sampling distribution of the mean (the distribution of sample means) will have a mean equal to μ, a variance equal to σ²/n, and a standard deviation equal to √(σ²/n). The distribution will approach the normal distribution as n, the sample size, increases. (p. 178)


Translation:

1. For any population with a given mean and variance, the sampling distribution of the mean will have:
• µx̄ = μ
• σx̄² = σ²/n
• σx̄ = σ/√n

2. As n increases, the sampling distribution of the mean approaches a normal curve
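The two claims above can be checked with a small simulation. This is only a sketch under stated assumptions: the exponential population (µ = 1, σ = 1), the sample sizes, and the function name `simulate_sample_means` are illustrative choices, not part of the lecture.

```python
import math
import random
import statistics

def simulate_sample_means(n, reps=5000, seed=0):
    """Draw `reps` samples of size n from a skewed population; return the sample means."""
    rng = random.Random(seed)
    # Assumed population: exponential with rate 1, so mu = 1 and sigma = 1.
    return [statistics.fmean(rng.expovariate(1.0) for _ in range(n))
            for _ in range(reps)]

SIGMA = 1.0
for n in (2, 10, 50):
    means = simulate_sample_means(n)
    # The mean of the sample means should sit near mu = 1, and their
    # standard deviation should sit near sigma / sqrt(n).
    print(n,
          round(statistics.fmean(means), 3),
          round(statistics.stdev(means), 3),
          "expected SD:", round(SIGMA / math.sqrt(n), 3))
```

Plotting a histogram of `means` for each n would also show the skew fading as n grows, which is the second half of the theorem.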


• Analysis:
– Although µx̄ and µ will tend to be similar to one another…
– The relationships between…
• σx̄² and σ²
• σx̄ and σ
– …will differ as a function of the sample size

• We saw this in our sampling distribution of the mean example from chapter 4…

So, you wanna test a hypothesis, do ya?

• Our understanding of sampling and sampling distributions now allows us to test hypotheses

• How we test a hypothesis depends on the information we have available

Choosing a Test

• µ?
– σ?
– s?
• Number of data sets:
– 1
– 2
• Number of Groups
– 1
– 2

1. Which variables are available?

2. How many data sets are you presented with?

3. Do your data sets come from 1 or 2 groups?

Testing Hypotheses about Means: The Rare Case of Knowing σ

• So far, to test the probability of finding a particular score, we’ve used the Standard Normal Distribution
– IQ = 122
– µ = 100
– σ = 15

$$z = \frac{x - \mu}{\sigma} = \frac{122 - 100}{15} = \frac{22}{15} = 1.47$$

-1.96 < z < 1.96: Fail to reject H0

How the z-Test Works

• How does our test change when we test group means, not just individual scores?
– We use the central limit theorem

n = 100:
$$z = \frac{122 - 100}{15/\sqrt{100}} = \frac{22}{15/10} = \frac{22}{1.5} = 14.67$$

n = 2:
$$z = \frac{122 - 100}{15/\sqrt{2}} = \frac{22}{15/1.41} = \frac{22}{10.64} = 2.07$$

n = 1:
$$z = \frac{122 - 100}{15/\sqrt{1}} = \frac{22}{15/1} = \frac{22}{15} = 1.47$$
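The three calculations above differ only in n, so they can be reproduced with one small function. A minimal sketch; the name `z_for_mean` is an illustrative choice, not something from the lecture.

```python
import math

def z_for_mean(xbar, mu, sigma, n):
    """z statistic for a sample mean: (xbar - mu) / (sigma / sqrt(n))."""
    standard_error = sigma / math.sqrt(n)
    return (xbar - mu) / standard_error

# Same mean (122), population mean (100) and sigma (15); only n changes.
for n in (1, 2, 100):
    print(n, round(z_for_mean(122, 100, 15, n), 2))
```

The same 22-point difference moves from well inside ±1.96 to far outside it purely because the standard error shrinks with √n.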


• Large samples reduce the amount of random variance (sampling error)

– More confidence that the sample mean = population mean

• Larger samples improve our ability to detect differences between samples and populations

• For n = 1:

$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{x - \mu}{\sigma}$$

Testing Hypotheses: When σ Is Unknown

• Generally, the population standard deviation, σ, is unknown to us

• Occasionally, we will know the population mean, µ, when we don’t know σ

• In these situations, the standard normal distribution no longer meets our needs


• Knowing µ…
– We can produce an estimate of σ from s
– Using s changes the nature of the test we are conducting, as s is not distributed in the same fashion as σ

• Sampling distribution of the sample standard deviation is NOT normally distributed

– Strong positive skew


[Figure: sampling distribution of s compared with the sampling distribution of σ]

So How Does s Estimate σ?

• Given the differences in distribution shape, it is easy to conclude that s ≠ σ
– s² is an unbiased estimator of σ² over repeated samplings
– However, because of the positive skew, a SINGLE value of s is more likely to underestimate σ than to overestimate it
• Because of this fact, small samples will systematically underestimate σ as a function of s
– Dividing by an underestimate of σ tends to make any statistic calculated this way larger in magnitude than a comparable value of z
– We cannot use z any longer; we use t instead
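A short simulation can make the "a single s tends to fall below σ" point concrete. This is a sketch under assumptions: the normal population (µ = 100, σ = 15), the sample size n = 5, and the name `sample_sds` are illustrative and not taken from the lecture.

```python
import random
import statistics

def sample_sds(n, sigma=15.0, reps=20000, seed=0):
    """Return the sample standard deviation (n - 1 denominator) of `reps` samples of size n."""
    rng = random.Random(seed)
    return [statistics.stdev([rng.gauss(100.0, sigma) for _ in range(n)])
            for _ in range(reps)]

sds = sample_sds(n=5)

# Share of samples whose s falls below the true sigma of 15.
share_below = sum(s < 15.0 for s in sds) / len(sds)

# Average of s squared across samples; should sit close to sigma squared = 225.
mean_variance = statistics.fmean(s * s for s in sds)

print(round(share_below, 3), round(mean_variance, 1))
```

With small n, clearly more than half of the samples give s below σ even though s² averages out to about σ², which is why statistics built on s need the t-distribution rather than z.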

t and the t-Distribution

• Developed by Student while he was working for the Guinness Brewing Co.

1. The shape of the t-distribution is a direct function of the size of the sample we are examining

2. For small samples, the t-distribution is somewhat flatter than the standard normal distribution, with a lower peak and fatter tails


3. As sample size increases:
• The t-distribution approaches a normal distribution
• Theoretically, we mean that the closer our sample size comes to infinity, the more the t-distribution looks like a normal distribution
• Practically, the two are close when n ≈ 100 – 120


4. Identifying values of t associated with a given rejection region depends on:
– α
– the number of tails associated with the test
– the degrees of freedom available in the analysis
– For this one-sample test, df = n - 1, because we used one degree of freedom calculating s using the sample mean and not the population mean.

One-Sample t-Test

$$t = \frac{\bar{x} - \mu}{s_{\bar{x}}}$$

or

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$

or

$$t = \frac{\bar{x} - \mu}{\sqrt{s^2/n}}$$

z-Test vs. One-Sample t-Test

$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \qquad\qquad t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$

Note the similarities between these tests: ONLY the source of “variance” and the distribution you test against have changed!

Using the One-Sample t-Test

• You are on the admissions board for a graduate school of Psychology.
• You are attempting to determine if the GRE scores for the students applying to your program are competitive with the national average.
– µVerbal = 569
• SPSS output from your data

Descriptive Statistics
                     N    Range    Mean       Std. Deviation
GRE                  24   310.00   659.7917   86.43267
Valid N (listwise)   24

Using the One-Sample t-Test

• Research Hypothesis:
– The GRE scores from your applicants differ from the population norms
• H1: µa ≠ µp or ES > 0
• Null Hypothesis
– The GRE scores from your applicants do not differ from the population norms
• H0: µa = µp or ES = 0
• Evaluate the students’ GRE-V scores


• Select:
– Rejection region
• α = .05
– “Tail” or directionality
• We don’t know exactly how the students will score: we just expect them to show scores differing from the population values
• Might predict higher scores…


• Generate sampling distribution of the mean assuming H0 is true

• One-Sample t-test
• Given our sampling distribution:

• Conduct the statistical test


µVerbal = 569, x-bar = 659.79, s = 86.43, n = 24

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{659.79 - 569}{86.43/\sqrt{24}} = \frac{90.79}{86.43/4.90} = \frac{90.79}{17.64} = 5.15$$

This numerical value is called tobt

tobt(23) = 5.15
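The hand calculation above can be reproduced from the summary statistics in the SPSS Descriptive Statistics table. A minimal sketch; the function name `one_sample_t` is an illustrative choice.

```python
import math

def one_sample_t(xbar, mu, s, n):
    """Return (t, df) for a one-sample t-test computed from summary statistics."""
    standard_error = s / math.sqrt(n)   # estimated standard error of the mean
    t = (xbar - mu) / standard_error
    return t, n - 1                     # one df is used up estimating s from x-bar

# Values from the GRE example: mean 659.7917, SD 86.43267, n = 24, mu = 569.
t_obt, df = one_sample_t(xbar=659.7917, mu=569, s=86.43267, n=24)
print(round(t_obt, 3), df)
```

Using the unrounded mean and standard deviation gives the t that SPSS reports to three decimals; the slide's 5.15 is the same value rounded.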


• SPSS Output

One-Sample Test (Test Value = 569)
                                                         95% Confidence Interval of the Difference
       t       df   Sig. (2-tailed)   Mean Difference    Lower      Upper
GRE    5.146   23   .000              90.79167           54.2943    127.2890

The Test Value is µ

tobt(23) = 5.15

Evaluating Statistical Significance of the t-Test

• First note:
– α = .05
– Tail or directionality: two-tailed
– t-Value = 5.15
– Degrees of freedom (df)
• For the One-Sample t-Test, df = n-1 (24-1 = 23)
• Estimating s from x-bar (not σ from µ)


• In the past you…
– Identified a tabled value of tcrit
– Compared tcrit to your tobt value
– If tobt falls into the rejection region identified by tcrit, then we reject H0
– If tobt does not fall into the rejection region identified by tcrit, then we fail to reject H0

• SPSS simplifies matters by exactly calculating p for us


• SPSS Output

One-Sample Test (Test Value = 569)
                                                         95% Confidence Interval of the Difference
       t       df   Sig. (2-tailed)   Mean Difference    Lower      Upper
GRE    5.146   23   .000              90.79167           54.2943    127.2890

The Test Value is µ

tobt(23) = 5.15, p < .05
Exact probability ≈ .00003


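[Figure: t-distribution centered on 0 with rejection regions beyond tcrit = -2.069 and tcrit = 2.069; tobt = 5.15 lies in the upper rejection region]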

Because tobt falls within the rejection region identified by tcrit we reject H0
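The decision rule, and the confidence interval SPSS prints alongside it, can be sketched as follows. The critical value 2.069 is the tabled two-tailed value for df = 23 used on the slide; the function names are illustrative assumptions.

```python
import math

def two_tailed_decision(t_obt, t_crit):
    """Reject H0 when t_obt lies beyond +/- t_crit."""
    return "reject H0" if abs(t_obt) > t_crit else "fail to reject H0"

def confidence_interval(mean_diff, s, n, t_crit):
    """Interval for the mean difference: mean_diff +/- t_crit * s / sqrt(n)."""
    margin = t_crit * s / math.sqrt(n)
    return mean_diff - margin, mean_diff + margin

print(two_tailed_decision(t_obt=5.15, t_crit=2.069))

lower, upper = confidence_interval(mean_diff=90.79167, s=86.43267, n=24, t_crit=2.069)
print(round(lower, 2), round(upper, 2))
```

The interval agrees with the SPSS Lower and Upper columns to within rounding of the tabled critical value, and it excludes 0, which is the same conclusion as rejecting H0.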

Testing Hypotheses: Two Matched (Repeated) Samples

• Sometimes, we’re interested in how a single set of scores changes over time
– Psychotherapy tx influences depression
– Patients respond to medication
– Consumer attitudes before and after an advertisement
– Changes in citizen attitudes following the State of the Union address
• When we look at two sets of scores collected from a single sample at different time points, we need to use a matched samples test

Matched Samples

• Matched samples
– Use the same participants at two or more different time points to collect similar data
• MUST BE THE SAME SAMPLE!

[Figure: BDI-II at Time 1, wait 30 days, BDI-II at Time 2]

Matched Samples Test

• With a matched samples test, you are testing the change in scores between the two administrations of the test
– H0: µ1 = µ2
– H0: µ1 - µ2 = 0 or ES = 0
• This is truly the null hypothesis for the matched samples test


• Essentially, the group means at each time point mean little to us
– Change in scores is the key
– Conduct this test by obtaining the average difference score between the two time points


$$t = \frac{\bar{D} - 0}{s_D/\sqrt{n}}$$

• D-bar represents average difference scores between time points
• sD is the standard deviation of the difference scores
• The “- 0” may seem redundant, but isn’t! It is the mean difference expected under H0.

Calculating the Matched Samples t-Test

• You are a researcher examining the impact of a new therapy intervention on the incidence of self-injurious behavior (SIB)

• You collect a measure of the frequency of self-injurious acts when clients enter your treatment (time 1)

• You collect a measure of the frequency of self-injurious acts two weeks later (time 2)


• Research Hypothesis:
– The new treatment will change SIB scores
• H1: µ1 ≠ µ2 or ES > 0
• Null Hypothesis
– The SIB scores at time 2 will be the same as the scores at time 1 (no change)
• H0: µ1 = µ2
• H0: µ1 - µ2 = 0 or ES = 0
• Evaluate SIB at time 1 & time 2




• We don’t know exactly how the treatment will work, so we’d better use a two-tailed test



• Matched Samples t-test
• Given our sampling distribution:



Time 1    7   10   19   16   15   13   11   10    8   14   13
Time 2    2    6   17    9   11    9   10    7    4   10    8
D         5    4    2    7    4    4    1    3    4    4    5
D²       25   16    4   49   16   16    1    9   16   16   25

∑D = 43

∑D² = 193

(∑D)² = 1849

Descriptive Statistics
                     N    Minimum   Maximum   Mean      Std. Deviation
time1                11   7.00      19.00     12.3636   3.58532
time2                11   2.00      17.00     8.4545    3.93354
Valid N (listwise)   11

D-bar = 3.91


$$s_D^2 = \frac{\sum D^2 - \frac{(\sum D)^2}{n}}{n - 1} = \frac{193 - \frac{43^2}{11}}{11 - 1} = \frac{193 - \frac{1849}{11}}{10} = \frac{193 - 168.09}{10} = \frac{24.91}{10} = 2.49$$

$$s_D = \sqrt{2.49} = 1.58$$


$$t = \frac{\bar{D} - 0}{s_D/\sqrt{n}} = \frac{3.91 - 0}{1.58/\sqrt{11}} = \frac{3.91}{1.58/3.32} = \frac{3.91}{.48} = 8.15$$

tobt = 8.15
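The same calculation can be run directly from the raw scores. A minimal sketch: the scores are listed in the order they appear in the data table (the pairing is what matters), and the function name `matched_samples_t` is an illustrative choice.

```python
import math
import statistics

# SIB scores for the same 11 clients at intake (time 1) and two weeks later (time 2).
time1 = [7, 10, 19, 16, 15, 13, 11, 10, 8, 14, 13]
time2 = [2, 6, 17, 9, 11, 9, 10, 7, 4, 10, 8]

def matched_samples_t(before, after):
    """Return (D-bar, s_D, t, df) for a matched samples t-test."""
    diffs = [b - a for b, a in zip(before, after)]   # one difference score per pair
    n = len(diffs)
    d_bar = statistics.fmean(diffs)
    s_d = statistics.stdev(diffs)                    # n - 1 denominator
    t = (d_bar - 0) / (s_d / math.sqrt(n))           # 0 is the difference expected under H0
    return d_bar, s_d, t, n - 1

d_bar, s_d, t_obt, df = matched_samples_t(time1, time2)
print(round(d_bar, 5), round(s_d, 5), round(t_obt, 3), df)
```

Carrying full precision gives a slightly larger t than the hand calculation, because the slide rounds the standard error to .48 before dividing.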


• First note:
– α = .05
– Tail or directionality: two-tailed
– t-Value = 8.15
– Degrees of freedom (df)
• For the Matched Samples t-Test:
– df = number of PAIRS of scores - 1
– df = 11 - 1 = 10
– Again, we can calculate p exactly with SPSS


• SPSS Output

Paired Samples Correlations
                         N    Correlation   Sig.
Pair 1   time1 & time2   11   .916          .000

Paired Samples Test
                         Paired Differences
                         Mean      Std. Deviation   Std. Error Mean   Lower     Upper     t       df   Sig. (2-tailed)
Pair 1   time1 - time2   3.90909   1.57826          .47586            2.84880   4.96938   8.215   10   .000

tobt(10) = 8.15, p < .05 (SPSS reports 8.215 because it does not round intermediate values)
p ≈ .000009


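[Figure: t-distribution centered on 0 with rejection regions beyond tcrit = -2.228 and tcrit = 2.228; tobt = 8.15 lies in the upper rejection region]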


Testing Hypotheses: Two Independent Samples

• Probably the most common use of the t-Test and the t-distribution

• Compare the mean scores of two groups on a single variable
– IV: Groups
– DV: Variable of interest
• Groups must be independent of one another
– Scores in 1 group cannot influence scores in the other group

Independent Samples t-Test

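$$t = \frac{\bar{X}_1 - \bar{X}_2}{s_{\bar{x}_1 - \bar{x}_2}}$$

or

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$

This test is calculated by dividing the mean difference between two groups by the “dispersion” or “variation” observed between the two groups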

Independent Samples t-Test:Degrees of Freedom

• 1 df lost for each σ estimated by s using x-bar

• Since there are two independent groups in this analysis, we must estimate σ twice

• df = (n1 + n2) - 2

Independent Samples t-Test: Example

• Let’s return to the example used for the matched samples test

• As a competent researcher, you realize that simply showing a change over time is not enough to prove the efficacy of your treatment– People spontaneously change over time

• Show that an untreated control group does not change over the same period of time that your treatment group does change


[Figure: study design. SIB scores are measured for the Tx group and the Ctrl group at Time 1, Time 2, and Time 3, with Tx administered between measurements; the comparison of interest is marked “= ?”]


• At time 1, the control and treatment SIB groups have equal SIB scores
• Administer the treatment for 2 weeks to Tx group
– The Control group receives no intervention during these two weeks
• Compare SIB scores of Tx and Control group after 2 weeks
• Provide Control group w/ intervention if desired


• Research Hypothesis:
– Your treatment for SIB will reduce SIB scores in the Tx group after 2 weeks
• H1: µt < µc
• Null Hypothesis
– Your treatment for SIB will have no effect
• H0: µt = µc
• Evaluate the efficacy of your treatment


Time 2 Data
Tx         2    6   17    9   11    9   10    7    4   10    8
Control   12   16   15   13   16    8   11    9   10   13   12

          Ctrl Group   Tx Group
∑x        135          93
∑x²       1729         941
(∑x)²     18225        8649
x-bar     12.27        8.45
s²        7.22         15.47
s         2.69         3.93
n         11           11


Descriptive Statistics
                     N    Minimum   Maximum   Mean      Std. Deviation
ctrl                 11   8.00      16.00     12.2727   2.68667
tx                   11   2.00      17.00     8.4545    3.93354
Valid N (listwise)   11




• We have evidence that the treatment probably works, so we make a one-tailed hypothesis here (scores for the Tx group will be lower than the Control group at time 2)



• Independent Samples t-Test
• Given our sampling distribution:



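$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{8.45 - 12.27}{\sqrt{\frac{15.47}{11} + \frac{7.22}{11}}}$$

$$t = \frac{-3.82}{\sqrt{1.41 + .66}} = \frac{-3.82}{\sqrt{2.07}} = \frac{-3.82}{1.44} = -2.65$$

tobt(20) = -2.65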
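The calculation above can be checked from the raw Time 2 scores. A minimal sketch using the separate-variance form of the standard error shown on the slide; the function name `independent_t` is an illustrative choice.

```python
import math
import statistics

# Time 2 SIB scores for the treatment and control groups (n = 11 each).
tx = [2, 6, 17, 9, 11, 9, 10, 7, 4, 10, 8]
ctrl = [12, 16, 15, 13, 16, 8, 11, 9, 10, 13, 12]

def independent_t(group1, group2):
    """Return (t, df) using sqrt(s1^2/n1 + s2^2/n2) as the standard error."""
    n1, n2 = len(group1), len(group2)
    mean_diff = statistics.fmean(group1) - statistics.fmean(group2)
    standard_error = math.sqrt(statistics.variance(group1) / n1 +
                               statistics.variance(group2) / n2)
    return mean_diff / standard_error, n1 + n2 - 2

t_obt, df = independent_t(tx, ctrl)
print(round(t_obt, 3), df)
```

The sign is negative because the Tx mean is entered first and is the lower of the two; SPSS subtracts in the other order and therefore shows the same magnitude with a positive sign.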


• First note:
– α = .05
– Tail or directionality: one-tailed
– t-Value = -2.65
– Degrees of freedom (df)
• For the Independent Samples t-Test
– (n1 + n2) - 2
– (11 + 11) - 2
– 22 - 2 = 20


tobt(20) = -2.65, p < .05

• SPSS Output

Independent Samples Test: Self-Injurious Behavior
                              Levene's Test for
                              Equality of Variances   t-test for Equality of Means
                              F      Sig.             t       df       Sig. (2-tailed)   Mean Difference   Std. Error Difference
Equal variances assumed       .518   .480             2.658   20       .015              3.81818           1.43625
Equal variances not assumed                           2.658   17.663   .016              3.81818           1.43625

p ≈ .015 (two-tailed, as SPSS reports it)


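[Figure: t-distribution centered on 0 with a one-tailed rejection region below tcrit = -1.725; tobt = -2.65 lies in the rejection region]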

Independent Samples t-Test: One Complication

• There is a slight problem with the form of the equation we used…
– It can ONLY be applied to groups with equal sample sizes
– A major limitation in real-world research

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$

Pooled Variance Estimate

• This equation permits tests with different sample sizes

• Generates an estimate of the total variance between groups weighted by the size of each group
– Therefore, larger samples have a greater impact on the variance
– Vice-versa for small samples

Pooled Variance Estimate

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

Using the Pooled Variance Estimate

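$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}}$$

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$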

Using the Pooled Variance Estimate: Example

Time 2 Data
Tx         2    6   17    9   11    9   10    7    4   10    8
Control   12   16   15   13   16   11   (No Data for the remaining cases)

          Ctrl Group   Tx Group
∑x        83           93
∑x²       1171         941
(∑x)²     6889         8649
x-bar     13.83        8.45
s²        4.57         15.47
s         2.14         3.93
n         6            11


Descriptive Statistics
                     N    Minimum   Maximum   Mean      Std. Deviation
ctrl                 6    11.00     16.00     13.8333   2.13698
tx                   11   2.00      17.00     8.4545    3.93354
Valid N (listwise)   6


$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(11 - 1)15.47 + (6 - 1)4.57}{11 + 6 - 2} = \frac{(10)15.47 + (5)4.57}{15}$$

$$s_p^2 = \frac{154.7 + 22.85}{15} = \frac{177.55}{15} = 11.84$$


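$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{8.45 - 13.83}{\sqrt{11.84\left(\frac{1}{11} + \frac{1}{6}\right)}} = \frac{-5.38}{\sqrt{11.84(.0909 + .1667)}}$$

$$t = \frac{-5.38}{\sqrt{11.84(.2576)}} = \frac{-5.38}{\sqrt{3.05}} = \frac{-5.38}{1.75} = -3.07$$

tobt(15) = -3.07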
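The pooled-variance calculation can be reproduced from the raw scores for the unequal-n example. A minimal sketch; the function names `pooled_variance` and `pooled_t` are illustrative choices.

```python
import math
import statistics

# Time 2 SIB scores: 11 treated clients, 6 controls with data.
tx = [2, 6, 17, 9, 11, 9, 10, 7, 4, 10, 8]
ctrl = [12, 16, 15, 13, 16, 11]

def pooled_variance(group1, group2):
    """Weight each group's variance by its degrees of freedom (n - 1)."""
    n1, n2 = len(group1), len(group2)
    weighted_sum = ((n1 - 1) * statistics.variance(group1) +
                    (n2 - 1) * statistics.variance(group2))
    return weighted_sum / (n1 + n2 - 2)

def pooled_t(group1, group2):
    """Return (t, df) for the independent samples t-test with pooled variance."""
    n1, n2 = len(group1), len(group2)
    sp2 = pooled_variance(group1, group2)
    standard_error = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    mean_diff = statistics.fmean(group1) - statistics.fmean(group2)
    return mean_diff / standard_error, n1 + n2 - 2

sp2 = pooled_variance(tx, ctrl)
t_obt, df = pooled_t(tx, ctrl)
print(round(sp2, 2), round(t_obt, 2), df)
```

Because the Tx group has 10 of the 15 degrees of freedom, the pooled variance sits closer to the Tx variance than to the control variance.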


• First note:
– α = .05
– Tail or directionality: one-tailed
– t-Value = -3.07
– Degrees of freedom (df)
• For the Independent Samples t-Test
– (n1 + n2) - 2
– (11 + 6) - 2
– 17 - 2 = 15



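Independent Samples Test
                              Levene's Test for
                              Equality of Variances   t-test for Equality of Means
                              F      Sig.             t       df       Sig. (2-tailed)   Mean Difference   Std. Error Difference
Equal variances assumed       .714   .411             3.080   15       .008              5.37879           1.74614
Equal variances not assumed                           3.653   14.979   .002              5.37879           1.47232

tobt(15) = -3.07, p < .05
p ≈ .0076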


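[Figure: t-distribution centered on 0 with a one-tailed rejection region below tcrit = -1.753; tobt = -3.07 lies in the rejection region]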

Effect Size of The Independent Samples t-Test

$$d = \frac{\mu_1 - \mu_2}{\sigma}$$

or

$$d = \frac{\bar{X}_1 - \bar{X}_2}{s_p}$$

We use the same effect size conventions we identified for the Matched Samples test

Effect Size of The Independent Samples t-Test

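$$d = \frac{\bar{X}_1 - \bar{X}_2}{s_p} = \frac{8.45 - 13.83}{\sqrt{11.84}} = \frac{-5.38}{3.44} = -1.56$$

Note that the denominator is the pooled standard deviation, sp = √11.84 = 3.44, not the pooled variance, sp² = 11.84.

An effect size well beyond the conventional threshold for a large effect (|d| ≥ .80)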
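The effect size can be computed from the raw scores by dividing the mean difference by the pooled standard deviation, that is, the square root of the pooled variance, as the formula with sp in the denominator requires. A minimal sketch; the function name `cohens_d` is an illustrative choice.

```python
import math
import statistics

tx = [2, 6, 17, 9, 11, 9, 10, 7, 4, 10, 8]
ctrl = [12, 16, 15, 13, 16, 11]

def cohens_d(group1, group2):
    """Mean difference divided by the pooled standard deviation s_p."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * statistics.variance(group1) +
                  (n2 - 1) * statistics.variance(group2)) / (n1 + n2 - 2)
    pooled_sd = math.sqrt(pooled_var)   # s_p, not s_p squared
    return (statistics.fmean(group1) - statistics.fmean(group2)) / pooled_sd

d = cohens_d(tx, ctrl)
print(round(d, 2))
```

Dividing by the pooled variance (11.84) instead of the pooled standard deviation (about 3.44) would shrink the result to roughly -.45, so it is worth checking which quantity is in the denominator.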

t-test Assumptions

• Although the t-test is generally a robust test, it can be affected by violations of underlying test assumptions
– Normality – sampling distribution is normally distributed
– Sample size – samples for each group should be of roughly equal size
– Homogeneity of variance – σ1 = σ2

t-test Assumptions

• One sample t-test
– Normality - ✓
– Sample size - X
– Homogeneity of variance – X

• Matched & Independent samples t-test(s)
– Normality - ✓
– Sample size - ✓
– Homogeneity of variance – ✓

Impact of Violated Assumptions

• For equal sample sizes…
– …violating homogeneity of variance…
• Minimal impact (α = .05 ± .02)
– …with minor normality violations…
• Similar results as above
– …with major normality violations…
• Severe skew (particularly in opposite directions) can lead to significant problems unless variances are fairly equal

Impact of Violated Assumptions

• Unequal sample sizes…
– Much more difficult to interpret
– Unequal sample sizes + heterogeneity of variance = distortions in p
• Possibly increased risk of Type I error
• Risk of error increases as more assumptions are violated

Coping with Violated Assumptions

• What can we do to prevent or cope with violated assumptions?

1. Maintain equal sample sizes
2. Use trimmed samples…
3. Use a distribution free (i.e. non-parametric) test
4. Apply a statistical correction to t

Coping with Violated Assumptions


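Independent Samples Test
                              Levene's Test for
                              Equality of Variances   t-test for Equality of Means
                              F      Sig.             t       df       Sig. (2-tailed)   Mean Difference   Std. Error Difference
Equal variances assumed       .714   .411             3.080   15       .008              5.37879           1.74614
Equal variances not assumed                           3.653   14.979   .002              5.37879           1.47232

If pF < .05 (the Sig. value for Levene's F), use the “Equal variances not assumed” row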
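The lecture does not name the correction behind the “Equal variances not assumed” row. The sketch below assumes it is the Welch-Satterthwaite adjustment, which keeps the two variances separate and lowers the degrees of freedom; the function name `welch_t` is an illustrative choice.

```python
import math
import statistics

tx = [2, 6, 17, 9, 11, 9, 10, 7, 4, 10, 8]
ctrl = [12, 16, 15, 13, 16, 11]

def welch_t(group1, group2):
    """Return (t, df) without pooling variances (Welch-Satterthwaite df)."""
    n1, n2 = len(group1), len(group2)
    a = statistics.variance(group1) / n1   # s1^2 / n1
    b = statistics.variance(group2) / n2   # s2^2 / n2
    mean_diff = statistics.fmean(group1) - statistics.fmean(group2)
    t = mean_diff / math.sqrt(a + b)
    # Adjusted df; it cannot exceed the pooled df of n1 + n2 - 2.
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, df

t_obt, df = welch_t(tx, ctrl)
print(round(t_obt, 3), round(df, 3))
```

For these data the result agrees in magnitude with the second row of the SPSS table, which supports the assumption about which correction SPSS applies.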

Statistical Tests We Have Learned

1. z-Test
• 1 group
• 1 set of data
• µ & σ known

2. One-Sample t-Test
• 1 group
• 1 set of data
• µ known
• Estimate σ with s using x-bar

3. Matched Samples t-Test
• 1 group
• 2 sets of data
• µ & σ unknown
• Estimate σD with sD using D-bar

4. Independent Samples t-Test
• 2 groups
• 2 sets of data
• µ & σ unknown
• Estimate σ twice with s using x-bar
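The four-way summary above amounts to a small decision rule, sketched here as a function. The function name `choose_test` and its parameter names are illustrative assumptions, and the rule only covers the four cases listed in this lecture.

```python
def choose_test(sigma_known, data_sets, groups):
    """Pick one of the four tests from the lecture's summary list."""
    if groups == 1 and data_sets == 1:
        # One group measured once: z if sigma is known, otherwise estimate it with s.
        return "z-test" if sigma_known else "one-sample t-test"
    if groups == 1 and data_sets == 2:
        # Same participants measured twice.
        return "matched samples t-test"
    if groups == 2 and data_sets == 2:
        # Two separate groups, one set of scores each.
        return "independent samples t-test"
    raise ValueError("combination not covered by the four tests in this lecture")

print(choose_test(sigma_known=True, data_sets=1, groups=1))
print(choose_test(sigma_known=False, data_sets=1, groups=1))
print(choose_test(sigma_known=False, data_sets=2, groups=1))
print(choose_test(sigma_known=False, data_sets=2, groups=2))
```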

Choosing the Best Test

• Flow-chart available on the website:
– http://www.personal.kent.edu/~marmey

• Also refer to the diagram on p. 11 of your Howell text

• Try the review problems on the website for an example of the types of questions I might ask on an exam!
