t-tests part 1 ps1006 lecture 2

T-tests Part 1 PS1006 Lecture 2 Sam Cromie

Upload: tanaya

Post on 04-Jan-2016



Page 1: T-tests Part 1 PS1006 Lecture 2

T-tests Part 1

PS1006 Lecture 2

Sam Cromie


What has this got to do with statistics?


Overview

• Review: hypothesis testing
• The need for the t-test
• The logic of the t-test
• Application to:
  – Single sample designs
  – Two group designs
    • Within group designs (Related samples)
    • Between group designs (Independent samples)
• Assumptions
• Advantages and disadvantages


Generic Hypothesis testing steps

1. State hypotheses – Null and alternative

2. State alpha value

3. Calculate the statistic (z score, t score, etc.)

4. Look the statistic up on the appropriate probability table

5. Reject or fail to reject the null hypothesis


Generic form of a statistic

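$\text{Statistic} = \dfrac{\text{Data} - \text{Hypothesis}}{\text{Error}}$

$\text{Statistic} = \dfrac{\text{What you got} - \text{what you expected (null)}}{\text{The unreliability of your data}}$

$z = \dfrac{\text{Individual score} - \text{Population mean}}{\text{Population standard deviation}}$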


Hypothesis testing with an individual data point…

1. State null hypothesis

2. State α value

3. Convert the score to a z score

4. Look z score up on z score tables

$z = \dfrac{x - \mu}{\sigma}$
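The steps for a single score can be sketched in Python. The numbers used here (a score of 130 against a population mean of 100 and standard deviation of 15, with α = .05) are hypothetical and chosen only for illustration, and `statistics.NormalDist` from the standard library stands in for the printed z table.

```python
from statistics import NormalDist

def z_test_individual(x, mu, sigma, alpha=0.05):
    """Two-tailed z test for one score against a known population."""
    z = (x - mu) / sigma                      # convert the score to a z score
    p = 2 * (1 - NormalDist().cdf(abs(z)))    # "look up" z (two-tailed)
    return z, p, p < alpha                    # True means reject the null

# Hypothetical values, not taken from the lecture.
z, p, reject = z_test_individual(x=130, mu=100, sigma=15)
print(f"z = {z:.2f}, p = {p:.4f}, reject H0: {reject}")
```

Comparing p with α is equivalent to comparing |z| with the critical value from the table.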


In this case we have…

                   Sample                   Population
Data of interest   Individual score, n=1    Mean known
Error                                       Standard deviation known


Hypothesis testing with one sample (n>1) …

• 100 participants saw a video containing violence

• They then free-associated to 26 homonyms with aggressive & non-aggressive forms, e.g., pound, mug

• Mean number of aggressive free associates = 7.10

• Suppose we know that without an aggressive video the mean (μ) = 5.65 and the standard deviation (σ) = 4.5

• Is 7.10 significantly larger than 5.65?


In this case we have…

                   Sample           Population
Data of interest   Mean, n=100      Mean known
Error                               Standard deviation known


Hypothesis testing with one sample (n>1) …

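• Use the sample mean instead of x in the z score formula
• Use the standard error of the sample mean instead of the population standard deviation

$z = \dfrac{x - \mu}{\sigma}$ becomes $z = \dfrac{\bar{X} - \mu}{\sigma_{\bar{X}}}$, where $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$

n = the number of scores in the sample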


Standard error:

• If we know σ then $\sigma_{\bar{X}}$ can be calculated using the formula

$\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$

• $\sigma_{\bar{X}}$ is the standard deviation of a sampling distribution, but is known as a standard error for purposes of differentiation
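As a small illustration of the standard error formula, the sketch below divides σ by the square root of n for a few sample sizes. The value σ = 4.5 comes from the video example; the sample sizes other than 100 are picked arbitrarily.

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean: sigma divided by the square root of n."""
    return sigma / math.sqrt(n)

# sigma = 4.5 is the population standard deviation from the video example.
for n in (1, 25, 100, 400):
    print(f"n = {n:3d}  standard error = {standard_error(4.5, n):.3f}")
```

Quadrupling the sample size halves the standard error, because n enters through its square root.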


Sampling distribution

• http://www.ruf.rice.edu/~lane/stat_sim/sampling_dist/index.html

• The sampling distribution of the mean will always be narrower than the parent population (for n > 1)

• The larger the sample size, the closer the sampling distribution of the mean comes to normal

• As sample size increases standard error decreases
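These properties can be checked with a rough simulation. The sketch below assumes a normally distributed parent population with the video example's μ = 5.65 and σ = 4.5 (the choice of population shape, the number of samples and the seed are all assumptions made for illustration), draws many samples of size n, and compares the standard deviation of their means with σ/√n.

```python
import random
import statistics

def simulate_standard_error(n, mu=5.65, sigma=4.5, n_samples=5000, seed=0):
    """SD of the means of n_samples random samples, each of size n."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(n_samples)
    ]
    return statistics.stdev(means)

for n in (4, 25, 100):
    observed = simulate_standard_error(n)
    expected = 4.5 / n ** 0.5
    print(f"n = {n:3d}  simulated SE = {observed:.3f}  sigma/sqrt(n) = {expected:.3f}")
```

The simulated values should sit close to σ/√n and well below the parent standard deviation of 4.5, shrinking as n grows.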


Back to video violence

• H0: μ = 5.65
• H1: μ ≠ 5.65 (two-tailed)
• Calculate p for a sample mean of 7.10 assuming μ = 5.65
• Use z from the normal distribution, as the sampling distribution can be assumed to be normal
• Calculate z

$z = \dfrac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \dfrac{7.1 - 5.65}{4.5/\sqrt{100}} = \dfrac{1.45}{.45} = 3.22$

• If |z| > 1.96, reject H0
• 3.22 > 1.96, so the difference is significant
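The same calculation can be reproduced in a few lines of Python. This is a minimal sketch using the numbers from the example; the two-tailed p value from `statistics.NormalDist` is an addition that the slide's table lookup does not report.

```python
import math
from statistics import NormalDist

sample_mean, mu, sigma, n = 7.10, 5.65, 4.5, 100   # values from the example

standard_error = sigma / math.sqrt(n)               # 4.5 / 10
z = (sample_mean - mu) / standard_error             # 1.45 / 0.45
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.2f}, two-tailed p = {p_two_tailed:.4f}")
print("reject H0" if abs(z) > 1.96 else "do not reject H0")
```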


But mostly we do not know σ

• E.g. do penalty-takers show a preference for right or left?

• 16 penalty takers; 60 penalties each; null hypothesis = 50% or 30 each way

• Result: a mean of 39 penalties to the left; is this significantly different from 30?

• µ = 30, but how do we calculate the standard error without σ?


In this case we have…

                   Sample          Population
Data of interest   Mean, n=16      Hypothesis: μ = 30
Error              Variance        ?


Using s to estimate σ

• Can’t substitute s for σ in a z score because s is likely to be too small
• So we need:
  – a different type of score: a t-score
  – a different type of distribution: Student’s t distribution


T distribution

• First published in 1908 by William Sealy Gosset

• He worked at the Guinness brewery in Dublin on the best-yielding varieties of barley

• He was prohibited from publishing under his own name, so the paper was written under the pseudonym Student


T-test in a nut-shell…

Allows us to calculate precisely what small samples tell us

Uses three critical bits of information: mean, standard deviation and sample size


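t test for one mean

• Calculated the same way as z except σ is replaced by s
• For the video example we gave before, s = 4.40

$z = \dfrac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \dfrac{7.1 - 5.65}{4.5/\sqrt{100}} = \dfrac{1.45}{.45} = 3.22$

$t = \dfrac{\bar{X} - \mu}{s_{\bar{X}}} = \dfrac{\bar{X} - \mu}{s/\sqrt{n}} = \dfrac{7.1 - 5.65}{4.40/\sqrt{100}} = \dfrac{1.45}{.44} = 3.30$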
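A minimal Python sketch of the one-sample t calculation from summary statistics follows. The function name `one_sample_t` is made up for this illustration; the inputs are the values from the video example.

```python
import math

def one_sample_t(sample_mean, mu, s, n):
    """t statistic for one sample mean, using s in place of sigma."""
    standard_error = s / math.sqrt(n)       # estimated from the sample SD
    return (sample_mean - mu) / standard_error

t = one_sample_t(sample_mean=7.10, mu=5.65, s=4.40, n=100)
print(f"t = {t:.2f}")
```

The only change from the z calculation is that the standard error is built from the sample standard deviation s rather than the population σ.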


Degrees of freedom

• t distribution is dependent on the sample size and this must be taken into account when calculating p

• Skewness of the sampling distribution of the variance decreases as n increases

• t will differ from z less as sample size increases

• t based on df where df = n - 1


t table

Two-Tailed Significance Level

df     .10     .05     .02     .01
10     1.812   2.228   2.764   3.169
15     1.753   2.131   2.602   2.947
20     1.725   2.086   2.528   2.845
25     1.708   2.060   2.485   2.787
30     1.697   2.042   2.457   2.750
100    1.660   1.984   2.364   2.626


Statistical inference made

• With n = 100, $t_{.025}(99) = 1.98$

• Because t = 3.30 > 1.98, reject H0

• Conclude that viewing violent video leads to more aggressive free associates than normal
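The table lookup and decision can be sketched as below. The critical values are the two-tailed .05 column of the lecture's t table. The rule for a df that falls between rows (use the largest tabled df that does not exceed it) is a common conservative convention assumed here, not something the lecture states; the lecture itself uses 1.98 for df = 99.

```python
# Two-tailed .05 critical values copied from the t table in the lecture.
CRITICAL_T_05 = {10: 2.228, 15: 2.131, 20: 2.086, 25: 2.060, 30: 2.042, 100: 1.984}

def critical_t(df, table=CRITICAL_T_05):
    """Critical value for the largest tabled df that does not exceed df.

    Rounding df down gives a slightly larger, more conservative cutoff.
    """
    usable = [row for row in table if row <= df]
    if not usable:
        raise ValueError("df is below the smallest row in the table")
    return table[max(usable)]

t, df = 3.30, 99
cutoff = critical_t(df)           # falls back to the df = 30 row
print(f"|t| = {abs(t):.2f}, cutoff = {cutoff}, reject H0: {abs(t) > cutoff}")
```

Even with the more conservative cutoff the conclusion for the video example is unchanged.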


Factors affecting t

• Difference between sample & population means
  – As the difference increases, t increases
• Magnitude of sample variance
  – As sample variance decreases, t increases
• Sample size: as it increases
  – The value of t required to be significant decreases
  – The distribution becomes more like a normal distribution
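A short sketch shows how each factor moves t. The baseline uses the video example (difference 1.45, s = 4.40, n = 100); the altered values (2.90, 2.20, 400) are hypothetical and chosen so that each change doubles t. Note that for a fixed difference and s, a larger n also raises t itself, because the standard error shrinks.

```python
import math

def t_statistic(mean_difference, s, n):
    """t for a sample mean that differs from mu by mean_difference."""
    return mean_difference / (s / math.sqrt(n))

baseline = t_statistic(1.45, 4.40, 100)            # the video example
bigger_difference = t_statistic(2.90, 4.40, 100)   # difference doubled
smaller_spread = t_statistic(1.45, 2.20, 100)      # standard deviation halved
larger_sample = t_statistic(1.45, 4.40, 400)       # sample size quadrupled

for label, value in [
    ("baseline", baseline),
    ("bigger difference", bigger_difference),
    ("smaller spread", smaller_spread),
    ("larger sample", larger_sample),
]:
    print(f"{label:18s} t = {value:.2f}")
```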


Application of t-test to Within group designs


t for repeated measures scores

• Same participants give data on two measures

• Someone high on one measure probably high on other

• Calculate difference between first and second score

• Base subsequent analysis on these difference scores. Before and after data are ignored


Example - Therapy for PTSD

• Therapy for victims of psychological trauma (Foa et al., 1991)
  – 9 individuals received Supportive Counselling
  – Measured post-traumatic stress disorder symptoms before and after therapy

            Before   After   Diff.
            21       15      6
            24       15      9
            21       17      4
            26       20      6
            32       17      15
            27       20      7
            21       8       13
            25       19      6
            18       10      8
Mean        23.89    15.67   8.22
St. Dev.    4.20     4.24    3.60


In this case we have…

                   Sample                    Population
Data of interest   Mean difference (n=9)     Hypothesis: μ = 0
Error              s of the differences      ?


Results

• The Supportive Counselling group decreased number of symptoms - was difference significant?

• If no change, mean of differences should be zero

• So, test the obtained mean of difference scores against μ = 0.

• We don’t know σ, so use s and solve for t


Repeated measures t test

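• $\bar{D}$ and $s_D$ = mean and standard deviation of the differences, respectively

$t = \dfrac{\bar{D}}{s_{\bar{D}}} = \dfrac{\bar{D}}{s_D/\sqrt{n}} = \dfrac{8.22}{3.6/\sqrt{9}} = \dfrac{8.22}{1.2} = 6.85$

df = n - 1 = 9 - 1 = 8

• Compare the one-sample test: $t = \dfrac{\bar{X} - \mu}{s_{\bar{X}}} = \dfrac{\bar{X} - \mu}{s/\sqrt{n}} = \dfrac{7.1 - 5.65}{4.40/\sqrt{100}} = \dfrac{1.45}{.44} = 3.30$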
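The repeated measures calculation can be sketched from the raw scores. The before and after lists below are the nine pairs read from the PTSD example table; everything else is standard library. Working from unrounded values gives t of about 6.86, slightly different from the 6.85 obtained with the rounded 8.22 and 3.6.

```python
import math
import statistics

before = [21, 24, 21, 26, 32, 27, 21, 25, 18]
after = [15, 15, 17, 20, 17, 20, 8, 19, 10]

# All further analysis is based on the difference scores only.
differences = [b - a for b, a in zip(before, after)]
mean_d = statistics.mean(differences)
sd_d = statistics.stdev(differences)            # sample SD (n - 1 denominator)
n = len(differences)

t = mean_d / (sd_d / math.sqrt(n))
print(f"mean D = {mean_d:.2f}, s_D = {sd_d:.2f}, t({n - 1}) = {t:.2f}")
```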


Inference made

• With 8 df, $t_{.025} = \pm 2.306$

• We calculated t = 6.85

• Since 6.85 > 2.306, reject H0

• Conclude that the mean number of symptoms after therapy was less than mean number before therapy.

• Infer that supportive counselling seems to work


+ & - of Repeated measures design

• Advantages
  – Eliminate subject-to-subject variability
  – Control for extraneous variables
  – Need fewer subjects
• Disadvantages
  – Order effects
  – Carry-over effects
  – Subjects no longer naïve
  – Change may just be a function of time


t test is robust

• Test assumes that variances are the same
  – Even if the variances are not the same, the test still works pretty well

• Test assumes data are drawn from a normally distributed population
  – Even if the population is not normally distributed, the test still works pretty well
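One way to probe the normality claim is a small simulation. The sketch below is an illustration under stated assumptions, not part of the lecture: it draws samples of n = 11 from a uniform(0, 1) population, which is flat rather than normal, tests the true null μ = 0.5 with the df = 10 two-tailed .05 cutoff of 2.228 from the t table, and counts how often the null is wrongly rejected. The seed and number of simulations are arbitrary. A uniform population is only one mild departure from normality; strongly skewed populations with very small samples may behave worse.

```python
import math
import random

def type_one_error_rate(n=11, critical_t=2.228, n_sims=20000, seed=1):
    """Share of true-null samples from uniform(0, 1) rejected at the .05 level."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        sample = [rng.random() for _ in range(n)]
        mean = sum(sample) / n
        variance = sum((x - mean) ** 2 for x in sample) / (n - 1)
        t = (mean - 0.5) / math.sqrt(variance / n)   # true mu is 0.5
        if abs(t) > critical_t:
            rejections += 1
    return rejections / n_sims

print(f"rejection rate under a true null: {type_one_error_rate():.3f}")
```

If the test is robust here, the printed rate should land near the nominal .05.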