
7/22/2019 Statistical Techniques to Compare Groups

1/134

Statistical techniques to compare groups


2/134

Techniques covered in this Part

One-sample t-test;

Independent-samples t-test;

Paired-samples t-test;

One-way analysis of variance (between groups);

Two-way analysis of variance (between groups); and

Non-parametric techniques.


3/134

The different Statistical techniques to

compare groups in SPSS are:


4/134

Assumptions

Each of the tests in this section has a number of assumptions underlying its use. There are some general assumptions that apply to all of the parametric techniques discussed here (e.g. t-tests, analysis of variance), and additional assumptions associated with specific techniques.

The general assumptions are presented in this section and the more specific assumptions are presented in the following topics, as appropriate.


5/134

Level of measurement

Each of these approaches assumes that the

dependent variable is measured at the interval or

ratio level, that is, using a continuous scale rather

than discrete categories.

Wherever possible when designing your study, try to make use of continuous, rather than categorical, measures of your dependent variable. This gives you a wider range of possible techniques to use when analysing your data.


6/134

Random sampling

The techniques covered in this lecture assume that

the scores are obtained using a random sample

from the population. This is often not the case in

real-life research.


7/134

Independence of observations

The observations that make up your data must be independent of one another. That is, each observation or measurement must not be influenced by any other observation or measurement. Violation of this assumption is very serious.

There are a number of research situations that may violate this assumption of independence. For example:

Studying the performance of students working in pairs or small groups. The behaviour of each member of the group influences all other group


8/134

Any situation where the observations or measurements are collected in a group setting, or subjects are involved in some form of interaction with one another, should be considered suspect.

In designing your study you should try to ensure that all observations are independent.

If you suspect some violation of this assumption, Stevens (1996, p. 241) recommends that you set a more stringent alpha value (e.g. p


9/134

Normal distribution

What are the characteristics of a normal distribution curve?

It is assumed that the populations from which the samples are taken are normally distributed.

In a lot of research (particularly in the social sciences), scores on the dependent variable are not nicely normally distributed.

Fortunately, most of the techniques are reasonably robust or tolerant of violations of this assumption.

With large enough sample sizes (e.g. 30+), the violation of this assumption should not cause any major problems.

The distribution of scores for each of your groups can be checked using histograms obtained as part


10/134

Homogeneity of variance

Techniques in this section make the assumption that

samples are obtained from populations of equal

variances.

To test this, SPSS performs the Levene test for equality of variances as part of the t-test and analysis of variance analyses.

If you obtain a significance value of less than .05, this suggests that variances for the two groups are not equal, and you have therefore violated the assumption of homogeneity of variance.

Analysis of variance is reasonably robust to violations of this assumption, provided the size of your groups is reasonably similar (e.g. largest/smallest=1.5).


11/134

Type 1 error, Type 2 error and power

The purpose of t-tests and analysis of variance is to test hypotheses. With these types of analyses there is always the possibility of reaching the wrong conclusion.

There are two different errors that we can make:

1. Type 1 error occurs when we think there is a difference between our groups, but there really isn't. In other words, we may reject the null hypothesis when it is, in fact, true. We can minimise this possibility by selecting an appropriate alpha level.


12/134

2. Type 2 error occurs when we fail to reject a null hypothesis when it is, in fact, false (i.e. believing that the groups do not differ, when in fact they do).

Unfortunately these two errors are inversely related. As we try to control for a Type 1 error, we actually increase the likelihood that we will commit a Type 2 error.

Ideally we would like the tests that we use to correctly identify whether in fact there is a difference between our groups. This is called the power of a test.

Tests vary in terms of their power (e.g. parametric tests such as t-tests, analysis of variance etc. are more powerful than non-parametric tests).

Other factors that can influence the power of a test are:


13/134

1. Sample size:

When the sample size is large (e.g. 100 or more subjects), then power is not an issue. However, when you have a study where the group size is small (e.g. n=20), then you need to be aware of the possibility that a non-significant result may be due to insufficient power. When small group sizes are involved it may be necessary to adjust the alpha level to compensate.

There are tables available that will tell you how large your sample size needs to be to achieve sufficient power, given the effect size you wish to detect.

If you obtain a non-significant result, the higher the power of the test, the more confident you can be that there is no real difference between the groups.
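As a rough alternative to looking up such tables, the group size needed to compare two means can be approximated in a few lines. The sketch below assumes the normal approximation n = 2((z for alpha + z for power) / d)^2 per group, where d is Cohen's d. The function name and the default alpha and power are illustrative choices, not part of SPSS, and exact t-based tables or software will give slightly larger numbers.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size_d, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-tailed comparison of two means.

    Uses the normal approximation, so it slightly underestimates the
    sample size that an exact t-based calculation would give.
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-tailed critical value
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    n = 2 * ((z_alpha + z_power) / effect_size_d) ** 2
    return math.ceil(n)


if __name__ == "__main__":
    for d in (0.2, 0.5, 0.8):
        print(f"d = {d}: about {n_per_group(d)} subjects per group")
```

Smaller effects need much larger groups, which is why a non-significant result from a small study says little about whether a real difference exists.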


14/134

Effect size

With large samples, even very small differences between groups can become statistically significant. This does not mean that the difference has any practical or theoretical significance.

One way that you can assess the importance of your finding is to calculate the effect size (also known as strength of association).

Effect size is a set of statistics which indicates the relative magnitude of the differences between means. In other words, it describes the amount of the total variance in the dependent variable that is predictable from knowledge of the levels of the independent variable.


15/134

There are a number of effect size statistics, the most common of which are:

eta squared,

Cohen's d and Cohen's f (see the formula)

Eta squared represents the proportion of variance of the dependent variable that is explained by the independent variable. Values for eta squared can range from 0 to 1.

To interpret the strength of eta squared values the

following guidelines can be used:

.01=small effect;

.06=moderate effect; and

.14=large effect.
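These guidelines can be written as a small lookup. The sketch below is only an illustration: the function name is hypothetical, and treating each guideline value as the lower bound of its band (with anything under .01 labelled very small) is an assumption, since the guidelines give reference points rather than strict cut-offs.

```python
def interpret_eta_squared(eta_sq):
    """Label an eta squared value using the .01 / .06 / .14 guidelines."""
    if not 0 <= eta_sq <= 1:
        raise ValueError("eta squared must lie between 0 and 1")
    if eta_sq >= 0.14:
        return "large"
    if eta_sq >= 0.06:
        return "moderate"
    if eta_sq >= 0.01:
        return "small"
    return "very small"  # assumption: below the smallest guideline value


if __name__ == "__main__":
    for value in (0.006, 0.03, 0.08, 0.50):
        print(value, "->", interpret_eta_squared(value))
```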

A number of criticisms have been levelled at eta


16/134

Missing data

It is important that you inspect your data file for missing data. Run Descriptives and find out what percentage of values is missing for each of your variables.

If you find a variable with a lot of unexpected missing data you need to ask yourself why.

You should also consider whether your missing values are happening randomly, or whether there is some systematic pattern (e.g. lots of women failing to answer the question about their age).


17/134

The Options button in many of the SPSS statistical procedures offers you choices for how you want SPSS to deal with missing data.

The Exclude cases listwise option will include a case in the analysis only if it has full data on all of the variables listed in your variables box. A case will be totally excluded from all the analyses if it is missing even one piece of information. This can severely, and unnecessarily, limit your sample size.

The Exclude cases pairwise (sometimes shown as Exclude cases analysis by analysis) option, however, excludes the cases (persons) only if they are missing the data required for the specific analysis. They will still be included in any of the analyses for which they have the necessary information.
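The difference between the two options can be seen on a tiny example. The sketch below uses a made-up data file (the ids, variable names and scores are all hypothetical, with None standing for a missing value) and plain Python rather than SPSS, so it only mimics the logic of the two options.

```python
# Hypothetical data file: None marks a missing value.
cases = [
    {"id": 1, "age": 23, "optimism": 20, "self_esteem": 34},
    {"id": 2, "age": None, "optimism": 18, "self_esteem": 30},
    {"id": 3, "age": 41, "optimism": None, "self_esteem": 36},
    {"id": 4, "age": 35, "optimism": 25, "self_esteem": None},
]
variables = ["age", "optimism", "self_esteem"]


def listwise(cases, variables):
    """Keep only cases with complete data on every listed variable."""
    return [c for c in cases if all(c[v] is not None for v in variables)]


def pairwise(cases, needed):
    """Keep cases with complete data on the variables one analysis needs."""
    return [c for c in cases if all(c[v] is not None for v in needed)]


listwise_ids = [c["id"] for c in listwise(cases, variables)]
pairwise_ids = [c["id"] for c in pairwise(cases, ["optimism", "self_esteem"])]
print("listwise:", listwise_ids)
print("pairwise (optimism, self_esteem):", pairwise_ids)
```

With listwise exclusion only one case survives, whereas an analysis of optimism and self-esteem under pairwise exclusion keeps every case that has those two scores.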

The Replace with mean option, which is available in some SPSS statistical procedures, calculates the mean


18/134

I would strongly recommend that you use pairwise exclusion of missing data, unless you have a pressing reason to do otherwise.

The only situation where you might need to use listwise exclusion is when you want to refer only to a subset of cases that provided a full set of results.


19/134

Hypothesis testing

What is a hypothesis test

A hypothesis test uses sample data to test a

hypothesis about the population from which the

sample was taken.

When to use a hypothesis test

Use a hypothesis test to make inferences about one or more populations when sample data are available.


20/134

Why use a hypothesis test

Hypothesis testing can help answer questions such

as:

Are students' achievement in science meeting or exceeding his achievement in science?

Is the performance of teacher A better than the performance of teacher B?


21/134

Hypothesis Testing with One-Sample t-test


22/134

1. One-sample t-test

What is a one-sample t-test

A one-sample t-test helps determine whether μ (the population mean) is equal to a hypothesized value (the test mean).

The test uses the standard deviation of the sample to estimate σ (the population standard deviation).

If the difference between the sample mean and the test mean is large relative to the variability of the sample mean, then μ is unlikely to be equal to the test mean.


23/134

Cont.

When to use a one-sample t-test

Use a one-sample t-test when continuous data

are available from a single random sample.

The test assumes the population is normally distributed. However, it is fairly robust to violations of this assumption for sample sizes equal to or greater than 30, provided the observations are collected randomly and the data are continuous, unimodal, and reasonably symmetric.


24/134

Z-test or t-test?

The shortcoming of the z-test is that it requires more information than is usually available.

To do a z-test, we need to know the value of the population standard deviation to be able to compute the standard error. But it is rarely known.


25/134

When the population variance is unknown, we use the one-sample t-test

What if σ is unknown?

Can't compute the z test statistic (z score):

$z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$ (population standard deviation must be known)

Can compute the t statistic:

$t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}}$ (sample standard deviation must be known)


26/134

Hypothesis testing with a one-sample t-test

State the hypotheses

Ho: μ = hypothesized value

H1: μ ≠ hypothesized value

Set the criteria for rejecting Ho

Alpha level

Critical t value


27/134

Determining the criteria for rejecting the Ho

If the value of t exceeds some threshold or critical value of t, then an effect is detected (i.e. the null hypothesis of no difference is rejected).


28/134


Table C.3 (p. 638 in text)


29/134

Degrees of freedom for One Sample t-test

Degrees of freedom is the number of values in the final calculation of a statistic that are free to vary.

Degrees of freedom (d.f.) is computed as one less than the sample size (the denominator of the standard deviation): df = n - 1


30/134

Hypothesis testing with a one-sample t-test

Compute the test statistic (t-statistic)

$t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}}$

Make statistical decision and draw conclusion

|t| ≥ t critical value: reject null hypothesis

|t| < t critical value: fail to reject null hypothesis


31/134

One Sample t-test Example

You are conducting an experiment to see if a given therapy works to reduce test anxiety in a sample of college students.

A standard measure of test anxiety is known to produce a μ = 20. In the sample you draw of 81 the mean = 18 with s = 9.

Use an alpha level of .05


32/134

Write hypotheses

Ho: The average test anxiety in the sample of college students will not be statistically significantly different than 20.

Ho: μ = 20

H1: The average test anxiety in the sample of college students will be statistically significantly lower than 20.

H1: μ < 20


33/134


34/134

Compute test statistic (t statistic)

$t = \dfrac{18 - 20}{9 / \sqrt{81}} = \dfrac{-2}{1} = -2$


35/134

Compare to criteria and make decision

The t-statistic of -2 exceeds, in absolute value, your critical value of -1.671.

Reject the null hypothesis and conclude that average test anxiety in the sample of college students is statistically significantly lower than 20, t = -2.0, p < .05.
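The same calculation can be checked outside SPSS. The sketch below is a minimal illustration working from the summary statistics of the test anxiety example (mean 18, test value 20, s = 9, n = 81). The function name is hypothetical, and the critical value is simply the tabled figure quoted in the example rather than something the code derives.

```python
import math


def one_sample_t(sample_mean, test_mean, sample_sd, n):
    """Return (t, df) for a one-sample t-test from summary statistics."""
    standard_error = sample_sd / math.sqrt(n)
    t = (sample_mean - test_mean) / standard_error
    return t, n - 1


t, df = one_sample_t(sample_mean=18, test_mean=20, sample_sd=9, n=81)
critical_t = 1.671  # tabled one-tailed value quoted in the example
reject_null = t <= -critical_t  # lower-tailed test, because H1 is mu < 20
print(f"t({df}) = {t:.2f}, reject Ho: {reject_null}")
```

Because the alternative hypothesis is directional, the comparison is made in the lower tail only.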


36/134

Procedures

Select Analyze/Compare Means/One-Sample t-test

Notice, SPSS allows us to specify what Confidence

Interval to calculate. Leave it at 95%. Click Continue

and then Ok. The output follows.


37/134

Notice that descriptive statistics are automatically calculated in the one-sample t-test. Does our t-value agree with the one in the textbook? Look at the Confidence Interval. Notice that it is not the confidence interval of the mean, but the confidence interval for the difference between the sample mean and the test value we


38/134

Hypothesis Testing with two-Sample t-test


39/134

Independent-samples T-test

An independent-samples t-test is used when you want to compare the mean score, on some continuous variable, for two different groups of subjects.


40/134

Summary for independent-samples t-test

Example of research question:

Is there a significant difference in the mean self-esteem scores for males and females?

What you need:

Two variables:

one categorical, independent variable (e.g. males/females); and

one continuous, dependent variable (e.g. self-esteem scores).

Assumptions:

The assumptions for this test are (continuous scale,

normality, independence of observation, homogeneity,

Cont


41/134

Cont.

What it does:

An independent-samples t-test will tell you whether there is a statistically significant difference in the mean scores for the two groups (that is, whether males and females differ significantly in terms of their self-esteem levels).

In statistical terms, you are testing the probability

that the two sets of scores (for males and

females) came from the same population.

Non-parametric alternative:

Mann-Whitney Test
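To make the comparison of two group means concrete, the sketch below computes the t statistic by hand in two ways: with a pooled variance (the equal variances assumed case) and with Welch's approximation (the usual basis for the equal variances not assumed case). The scores are hypothetical and the function names are illustrative. It is not a reproduction of SPSS output, and it does not compute the p-value.

```python
import math
from statistics import mean, variance


def independent_t(group1, group2):
    """Pooled-variance t statistic and df (equal variances assumed)."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * variance(group1)
                  + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(group1) - mean(group2)) / se, n1 + n2 - 2


def welch_t(group1, group2):
    """Welch t statistic and approximate df (equal variances not assumed)."""
    n1, n2 = len(group1), len(group2)
    v1, v2 = variance(group1) / n1, variance(group2) / n2
    t = (mean(group1) - mean(group2)) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical self-esteem scores, for illustration only.
males = [31, 32, 33, 34, 35]
females = [32, 33, 34, 35, 36]

t_pooled, df_pooled = independent_t(males, females)
t_welch, df_welch = welch_t(males, females)
print(f"pooled: t({df_pooled}) = {t_pooled:.2f}")
print(f"Welch:  t({df_welch:.1f}) = {t_welch:.2f}")
```

When the two groups have equal sizes and equal variances, as in this made-up data, the two versions agree; they diverge as the variances or group sizes become unequal.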



42/134

Procedure for independent-samples t-test

1. From the menu at the top of the screen click on: Analyze, then click on Compare means, then on Independent Samples T-test.

2. Move the dependent (continuous) variable (e.g. total self-esteem) into the area labelled Test variable.

3. Move the independent (categorical) variable (e.g. sex) into the section labelled Grouping variable.

4. Click on Define groups and type in the numbers used in the data set to code each group. In the current data file 1=males, 2=females; therefore, in the Group 1 box, type 1; and in the Group 2 box, type 2.

5. Click on Continue and then OK.

The output generated from this procedure is shown below.


43/134



44/134

Interpretation of output from independent-samples t-test

Step 1: Checking the information about the groups

In the Group Statistics box SPSS gives you the mean and standard deviation for each of your groups (in this case: male/female). It also gives you the number of people in each group (N). Always check these values first. Do they seem right? Are the N values for males and females correct? Or are there a lot of missing data? If so, find out why. Perhaps you have entered the wrong code for males and females (0 and 1, rather than 1 and 2). Check with your codebook.


45/134

Cont.

Step 2: Checking assumptions

Levene's test for equality of variances:

This tests whether the variance (variation) of scores for the two groups (males and females) is the same.

The outcome of this test determines which of the t-values that SPSS provides is the correct one for you to use.

If your Sig. value is larger than .05 (e.g. .07, .10), you should use the first line in the table, which refers to Equal variances assumed.

If the significance level of Levene's test is p=.05 or less (e.g. .01, .001), this means that the variances for the two groups (males/females) are not the same.

Therefore your data violate the assumption of equal variance. Don't panic: SPSS provides you with an alternative t-value. You should use the information in the second line of the t-test table, which refers to Equal

Cont


46/134

Cont.

Step 3: Assessing differences between the groups

If the value in the Sig. (2-tailed) column is equal to or less than .05 (e.g. .03, .01, .001), then there is a significant difference in the mean scores on your dependent variable for each of the two groups.

If the value is above .05 (e.g. .06, .10), there is no significant difference between the two groups.

Having established that there is a significant difference, the next step is to find out which set of scores is higher.


47/134

Calculating the effect size for independent-samples t-test

Effect size statistics provide an indication of the magnitude of the differences between your groups (not just whether the difference could have occurred by chance).

Eta squared is the most commonly used.

Eta squared can range from 0 to 1 and

represents the proportion of variance in the

dependent variable that is explained by the

independent (group) variable.

SPSS does not provide eta squared values for t-tests.


48/134

The effect size of .006 is very small. Expressed as a percentage (multiply your eta squared value by 100), only .6 per cent of the variance in self-esteem is explained by sex.
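Since eta squared has to be worked out by hand for a t-test, a short calculation helps. The sketch below assumes the commonly used formula eta squared = t² / (t² + N1 + N2 - 2), written here in terms of the degrees of freedom. The function name is hypothetical, and the inputs are the t value and degrees of freedom reported for this example (t = 1.62 with 434 degrees of freedom).

```python
def eta_squared_from_t(t, df):
    """Eta squared for an independent-samples t-test.

    Assumes eta^2 = t^2 / (t^2 + df), where df = N1 + N2 - 2.
    """
    return t ** 2 / (t ** 2 + df)


eta_sq = eta_squared_from_t(t=1.62, df=434)
print(f"eta squared = {eta_sq:.3f} ({eta_sq * 100:.1f} per cent of variance)")
```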



49/134

Presenting the results for independent-samples t-test

The results of the analysis could be presented as follows:

An independent-samples t-test was conducted to compare the self-esteem scores for males and females. There was no significant difference in scores for males (M=34.02, SD=4.91) and females [M=33.17, SD=5.71; t(434)=1.62, p=.11]. The magnitude of the differences in the means was very small (eta squared=.006).


50/134

Paired-samples T-test

Paired-samples t-test (also referred to as repeated measures) is used when you have only one group of people (or companies, or machines etc.) and you collect data from them on two different occasions, or under two different conditions.

Pre-test/post-test experimental designs are an example of the type of situation where this technique is appropriate.

It can also be used when you measure the same person in terms of his/her response to two different questions.

In this case, both dimensions should be rated on the same scale (e.g. from 1=not at all important to 5=very


51/134

Summary for paired-samples t-test

Example of research question:

Is there a significant change in participants' fear of statistics scores following participation in an intervention designed to increase students' confidence in their ability to successfully complete a statistics course?

Does the intervention have an impact on participants' fear of statistics scores?


52/134

Cont.

What you need:

One set of subjects (or matched pairs). Each person (or pair) must provide both sets of scores.

Two variables:

one categorical independent variable (in this case it is Time: with two different levels Time 1, Time 2); and

one continuous, dependent variable (e.g. Fear of Statistics Test scores) measured on two different occasions, or under different conditions.


53/134

Cont.

What it does:

A paired-samples t-test will tell you whether there is a statistically significant difference in the mean scores for Time 1 and Time 2.

Assumptions:

The basic assumptions for t-tests.

Additional assumption: The difference between the two scores obtained for each subject should be normally distributed. With sample sizes of 30+, violation of this assumption is unlikely to cause any serious problems.

Non-parametric alternative: Wilcoxon Signed Rank Test.
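The paired-samples t-test works on the difference between each subject's two scores, which is why the normality assumption applies to those differences. The sketch below illustrates the calculation on hypothetical scores for four subjects; the variable and function names are illustrative, and in practice you would need far more than four subjects.

```python
import math
from statistics import mean, stdev


def paired_t(time1, time2):
    """Return (t, df) for a paired-samples t-test on matched scores."""
    if len(time1) != len(time2):
        raise ValueError("each subject must provide both scores")
    diffs = [a - b for a, b in zip(time1, time2)]  # one difference per subject
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Hypothetical fear of statistics scores for four subjects.
fost1 = [40, 42, 44, 46]
fost2 = [39, 40, 43, 43]
t, df = paired_t(fost1, fost2)
print(f"t({df}) = {t:.2f}")
```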



54/134

Procedure for paired-samples t-test

1. From the menu at the top of the screen click on: Analyze, then click on Compare Means, then on Paired Samples T-test.

2. Click on the two variables that you are interested in comparing for each subject (e.g. fost1: fear of stats time1, fost2: fear of stats time2).

3. With both of the variables highlighted, move them into the box labelled Paired Variables by clicking on the arrow button. Click on OK.

The output generated from this procedure is shown below.


55/134



56/134

Interpretation of output from paired-samples t-test

Step 1: Determining overall significance

In the table labelled Paired Samples Test you need to look in the final column, labelled Sig. (2-tailed); this is your probability value. If this value is less than .05 (e.g. .04, .01, .001), then you can conclude that there is a significant difference between your two scores.


57/134

Cont.

Step 2: Comparing mean values

Having established that there is a significant difference, the next step is to find out which set of scores is higher (Time 1 or Time 2). To do this, look in the first printout box, labelled Paired Samples Statistics. This box gives you the Mean scores for each of the two sets of scores.


58/134

Calculating the effect size for paired-samples t-test

Given our eta squared value of .50, we can

conclude that there was a large effect, with a

substantial difference in the Fear of Statistics scores
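The eta squared value of .50 can be recovered from the reported t statistic. The sketch below assumes the commonly used paired-samples formula eta squared = t² / (t² + N - 1), with t = 5.39 and N = 30 subjects (29 degrees of freedom) taken from the results for this example; the function name is hypothetical.

```python
def eta_squared_paired(t, n):
    """Eta squared for a paired-samples t-test: t^2 / (t^2 + n - 1)."""
    return t ** 2 / (t ** 2 + (n - 1))


eta_sq_paired = eta_squared_paired(t=5.39, n=30)
print(f"eta squared = {eta_sq_paired:.2f}")
```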


59/134

Caution

Although we obtained a significant difference in the scores before/after the intervention, we cannot say that the intervention caused the drop in Fear of Statistics Test scores. Research is never that simple, unfortunately! There are many other factors that may have also influenced the decrease in fear scores.

Wherever possible, the researcher should try to anticipate these confounding factors and either control for them or incorporate them into the research design.


60/134

Presenting the results for paired-samples t-test

A paired-samples t-test was conducted to evaluate the impact of the intervention on students' scores on the Fear of Statistics Test (FOST). There was a statistically significant decrease in FOST scores from Time 1 (M=40.17, SD=5.16) to Time 2 [M=37.5, SD=5.15, t(29)=5.39, p


61/134

Hypothesis Testing

Analysis Of Variance



62/134

One-way analysis of variance

In many research situations, however, we are interested in comparing the mean scores of more than two groups. In this situation we would use analysis of variance (ANOVA).

One-way analysis of variance involves one independent variable (referred to as a factor), which has a number of different levels. These levels correspond to the different groups or conditions.

For example, in comparing the effectiveness of three different teaching styles on students' Maths scores, you would have one factor (teaching style) with three levels (e.g. whole class, small group activities, self-paced computer activities).

The dependent variable is a continuous variable (in

Cont.


63/134

Analysis of variance is so called because it compares the variance (variability in scores) between the different groups (believed to be due to the independent variable) with the variability within each of the groups (believed to be due to chance).

An F ratio is calculated which represents the variance

between the groups, divided by the variance within the

groups.

A large F ratio indicates that there is more variability

between the groups (caused by the independent

variable) than there is within each group (referred to as

the error term).

A significant F test indicates that we can reject the null

hypothesis, which states that the population means are

Cont


64/134

Cont.

There are two different types of one-way ANOVA :

between-groups analysis of variance, which is used when you have different subjects or cases in each of your groups (this is referred to as an independent groups design); and

repeated-measures analysis of variance, which is used when you are measuring the same subjects under different conditions (or measured at different points in time) (this is also referred to as a within-subjects design).


65/134

Planned comparisons and Post-hoc comparisons


66/134

Planned comparisons

Planned comparisons (also known as a priori) are used when you wish to test specific hypotheses (usually drawn from theory or past research) concerning the differences between a subset of your groups (e.g. do Groups 1 and 3 differ significantly?).

Planned comparisons do not control for the increased risks of Type 1 errors.

If there are a large number of differences that you wish to explore, it may be safer to use the alternative approach (post-hoc comparisons), which is designed to protect against Type 1 errors.

The other alternative is to apply what is known as a Bonferroni adjustment to the alpha level that you will use to judge statistical significance. This involves setting a


67/134

Post-hoc comparisons

Post-hoc comparisons (also known as a posteriori) are used when you want to conduct a whole set of comparisons, exploring the differences between each of the groups or conditions in your study.

Post-hoc comparisons are designed to guard against the possibility of an increased Type 1 error due to the large number of different comparisons being made.

With small samples this can be a problem, as it can be very hard to find a significant result, even when the apparent difference in scores between the groups is quite large.

There are a number of different post-hoc tests that you can use, and these vary in terms of their nature and strictness. The assumptions underlying the post-hoc tests


68/134

Post Hoc tests that assume equal variance

Multiple Comparison Tests AND Range Tests:

Tukey's HSD (honestly significant difference) test

Hochberg's GT2

Gabriel

Scheffe (confidence intervals that are fairly wide)

Range Tests Only:

Tukey's b (AKA, Tukey's WSD (Wholly Significant Difference))

S-N-K (Student-Newman-Keuls)

Duncan

R-E-G-W F (Ryan-Einot-Gabriel-Welsch F test)

R-E-G-W Q (Ryan-Einot-Gabriel-Welsch range test)

Multiple Comparison Tests Only:

Bonferroni (don't use with 5 groups or greater)

Sidak

Dunnett (compares a control group to the other groups without comparing the other groups to each other)

LSD (least significant difference)


69/134

Post Hoc tests

Fisher's LSD (Least Significant Difference)

This test is the most liberal of all Post Hoc tests and its critical t for significance is not affected by the number of groups. This test is appropriate when you have 3 means to compare. It is not appropriate for additional means.

Bonferroni (AKA, Dunn's Bonferroni)

This test does not require the overall ANOVA to be significant. It is appropriate when the number of comparisons (c = number of comparisons = k(k-1)/2) exceeds the number of degrees of freedom (df) between groups (df = k-1). This test is very conservative and its power quickly declines as c increases. A good rule of thumb is that the number of comparisons (c) be no larger than the degrees of freedom (df).
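The loss of power is easy to see once the numbers are written out. The sketch below assumes the usual form of the Bonferroni adjustment, in which the alpha level is divided by the number of comparisons, and counts all pairwise comparisons among k groups as k(k-1)/2. The function names are illustrative.

```python
def n_pairwise_comparisons(k):
    """Number of pairwise comparisons among k groups: k(k-1)/2."""
    return k * (k - 1) // 2


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison alpha after a Bonferroni adjustment."""
    return alpha / n_comparisons


if __name__ == "__main__":
    for k in (3, 4, 5, 6):
        c = n_pairwise_comparisons(k)
        print(f"{k} groups: {c} comparisons, alpha per comparison = "
              f"{bonferroni_alpha(0.05, c):.4f}")
```

With six groups there are 15 pairwise comparisons, so each one has to reach roughly p < .0033, which is a demanding criterion for small samples.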

Newman-Keuls

If there is more than one true null hypothesis in a set of means, this test will overestimate the familywise error rate. It is appropriate to use this test when the number of comparisons exceeds the number of degrees of freedom (df) between groups (df = k-1) and

Cont


70/134

Cont.

Tukey's HSD (Honestly Significant Difference)

This test is perhaps the most popular post hoc. It reduces Type I error at the expense of Power. It is appropriate to use this test when one desires all the possible comparisons between a large set of means (6 or more means).

Tukey's b (AKA, Tukey's WSD (Wholly Significant Difference))

This test strikes a balance between the Newman-Keuls and Tukey's more conservative HSD regarding Type I error and Power. Tukey's b is appropriate to use when one is making more than k-1 comparisons, yet fewer than k(k-1)/2 comparisons, and needs more control of Type I error than Newman-Keuls.

Scheffe

This test is the most conservative of all post hoc tests. Compared to Tukey's HSD, Scheffe has less Power when making pairwise (simple) comparisons, but more Power when making complex comparisons. It is appropriate to use


71/134

One-way between-groups ANOVA with post-hoc tests

One-way between-groups analysis of variance is used when you have one independent (grouping) variable with three or more levels (groups) and one dependent continuous variable.

The one-way part of the title indicates there is only one independent variable, and between-groups means that you have different subjects or cases in each of the groups.


72/134

Summary for one-way between-groups ANOVA with post-hoc tests

What you need:

Two variables:

one categorical independent variable with three or more distinct categories. This can also be a continuous variable that has been recoded to give three equal groups (e.g. age group: subjects divided into 3 age categories, 29 and younger, between 30 and 44, 45 or above). For instructions on how to do this see Chapter 8; and

one continuous dependent variable (e.g. optimism).


73/134

Cont.

What it does:

One-way ANOVA will tell you whether there are significant differences in the mean scores on the dependent variable across the three groups.

Post-hoc tests can then be used to find out where these differences lie.

Non-parametric alternative: Kruskal-Wallis Test


74/134

Assumptions

Population distribution of response variable in each group is normal

Standard deviations of population distributions for the groups are equal

Independent random samples

(In practice, in a lot of cases, these aren't strictly met, but we do ANOVA anyway)


75/134

Hypotheses

Null hypothesis:

H0: μ1 = μ2 = μ3

Alternative or research hypothesis:

Ha: μ1 ≠ μ2 or μ1 ≠ μ3 or μ2 ≠ μ3


76/134

Level of significance

Probability of making error in decision to reject null hypothesis

For this test choose α = 0.05


77/134

Test statistic

$F = \dfrac{\text{Between estimate}}{\text{Within estimate}} = \dfrac{BSS / (g - 1)}{WSS / (N - g)}$

$df_1 = g - 1$

$df_2 = N - g$


78/134

Between estimate of variance

$s_B^2 = \dfrac{\sum n_i (\bar{y}_i - \bar{y})^2}{g - 1} = \dfrac{BSS}{g - 1}$

Between estimate calculations

Group   N     Mean    Group mean - grand mean   Difference squared   Times N
Walk    46    10.20   -3.900                    15.210               699.660
Drive   228   14.43   0.330                     0.109                24.829
Bus     17    20.35   6.250                     39.063               664.063
Total   291   14.10 (grand mean)                                     1388.552 (BSS)

Divided by g-1 (between estimate): 694.276


79/134

Within estimate of variance

$s_W^2 = \dfrac{\sum (n_i - 1) s_i^2}{N - g} = \dfrac{WSS}{N - g}$

Within estimate calculations

Group   N     Variance   Variance times (n_i - 1)
Walk    46    46.608     2143.965
Drive   228   68.857     15699.351
Bus     17    170.877    2904.912
Total   291              20748.228 (WSS)

Divided by N-g (within estimate): 72.042


80/134

Calculating the F statistic

F statistic calculation

$F = \dfrac{\text{Between estimate}}{\text{Within estimate}} = \dfrac{694.276}{72.042} = 9.637$


81/134

Degrees of freedom

df1 (degrees of freedom in numerator)

Number of samples/groups - 1 = 3 - 1 = 2

df2 (degrees of freedom in denominator)

Total number of cases - number of groups = 291 - 3 = 288


82/134

Table of F distribution

Separate tables for each probability

df1 (degrees of freedom in numerator) across top

df2 (degrees of freedom in denominator) down side

Values of F in table

For degrees of freedom not given, use next lower value


83/134

p-value

Find F values for degrees of freedom (2, 288)

α = 0.05, F = 3.07 (2, 120)

α = 0.01, F = 4.79 (2, 120)

α = 0.001, F = 7.31 (2, 120)

F = 9.637 > F = 7.31 for α = 0.001

p-value < 0.001


84/134

Conclusion

p-value < 0.001 is less than α = 0.05

Reject null hypothesis that all means are equal

Conclude that at least one of the means is different from the others


85/134


86/134

Example

Treatment 1   Treatment 2   Treatment 3   Treatment 4
60 inches     50            48            47
67            52            49            67
42            43            50            54
67            67            55            67
56            67            56            68
62            59            61            65
64            67            61            65
59            64            60            56
72            63            59            60
71            65            64            65


87/134

Example



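Step 1) Calculate the sum of squares between groups:

Mean for group 1 = 62.0
Mean for group 2 = 59.7
Mean for group 3 = 56.3
Mean for group 4 = 61.4

Grand mean = 59.85

$$SSB = \left[(62 - 59.85)^2 + (59.7 - 59.85)^2 + (56.3 - 59.85)^2 + (61.4 - 59.85)^2\right] \times n_{\text{per group}} = 19.65 \times 10 = 196.5$$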


88/134

Example



Step 2) Calculate the sum of squares within groups:

$$SSW = (60-62)^2 + (67-62)^2 + (42-62)^2 + (67-62)^2 + (56-62)^2 + (62-62)^2 + (64-62)^2 + (59-62)^2 + (72-62)^2 + (71-62)^2 + (50-59.7)^2 + (52-59.7)^2 + (43-59.7)^2 + (67-59.7)^2 + (67-59.7)^2 + (59-59.7)^2 + \dots = 2060.6$$

(sum of 40 squared deviations)


89/134

Step 3) Fill in the ANOVA table

Source of variation   d.f.   Sum of squares   Mean Sum of Squares   F-statistic   p-value
Between               3      196.5            65.5                  1.14          .344
Within                36     2060.6           57.2
Total                 39     2257.1
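Steps 1 to 3 can be checked from the raw scores. The following is a minimal Python sketch, assuming the four columns of the example table are entered as lists; `treatments` and `one_way_anova` are illustrative names rather than anything defined in the slides.

```python
# Heights (inches) for the four treatment groups, copied from the example table.
treatments = [
    [60, 67, 42, 67, 56, 62, 64, 59, 72, 71],
    [50, 52, 43, 67, 67, 59, 67, 64, 63, 65],
    [48, 49, 50, 55, 56, 61, 61, 60, 59, 64],
    [47, 67, 54, 67, 68, 65, 65, 56, 60, 65],
]


def one_way_anova(samples):
    """Return (SSB, SSW, F) for a one-way between-groups ANOVA."""
    all_values = [x for sample in samples for x in sample]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(s) / len(s) for s in samples]
    # Step 1: squared deviations of group means from the grand mean, weighted by n.
    ssb = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, means))
    # Step 2: squared deviations of each observation from its own group mean.
    ssw = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)
    # Step 3: mean squares and their ratio.
    df_between = len(samples) - 1
    df_within = len(all_values) - len(samples)
    f_stat = (ssb / df_between) / (ssw / df_within)
    return ssb, ssw, f_stat


ssb, ssw, f_stat = one_way_anova(treatments)
print(f"SSB = {ssb:.1f}, SSW = {ssw:.1f}, F = {f_stat:.2f}")
```

The p-value column of the table is not reproduced here because it requires the F distribution with (3, 36) degrees of freedom, which the standard library does not provide.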


90/134

Procedure for one-way between-groups ANOVA with post-hoc tests

1. From the menu at the top of the screen click on Analyze, then click on Compare Means, then on One-way ANOVA.

2. Click on your dependent (continuous) variable (e.g. Total optimism). Move this into the box marked Dependent List by clicking on the arrow button.

3. Click on your independent, categorical variable (e.g. agegp3). Move this into the box labelled Factor.

4. Click the Options button and click on Descriptive, Homogeneity of variance test, Brown-Forsythe, Welch and Means Plot.

5. For Missing values, make sure there is a dot in the option marked Exclude cases analysis by analysis. If not, click on this option once. Click on Continue.

6. Click on the button marked Post Hoc. Click on Tukey.

7. Click on Continue and then OK.

The output is shown below.


91/134


92/134


93/134


94/134

between-groups ANOVA with post-hoc tests

Descriptives
This table gives you information about each group (number in each group, means, standard deviation, minimum and maximum, etc.). Always check this table first. Are the Ns for each group correct?

Test of homogeneity of variances
o The homogeneity of variance option gives you Levene's test for homogeneity of variances, which tests whether the variance in scores is the same for each of the three groups.

o Check the significance value (Sig.) for Levene's test. If this number is greater than .05 (e.g. .08, .12, .28), then you have not violated the assumption of homogeneity of variance.

o If you have found that you violated this assumption you will need to consult the table in the output headed Robust Tests of Equality of Means. The two tests shown there (Welch and Brown-Forsythe) are preferable when the assumption of the

Cont.


95/134

ANOVA

This table gives both between-groups and within-groups sums of squares, degrees of freedom etc. The main thing you are interested in is the column marked Sig. If the Sig. value is less than or equal to .05 (e.g. .03, .01, .001), then there is a significant difference somewhere among the mean scores on your dependent variable for the three groups.

Multiple comparisons

You should look at this table only if you found a significant difference in your overall ANOVA, that is, if the Sig. value was equal to or less than .05. The post-hoc tests in this table will tell you exactly where the differences among the groups

Cont.


96/134

Means plots

This plot provides an easy way to compare the mean scores for the different groups.

Warning: these plots can be misleading. Depending on the scale used on the Y axis (in this case representing Optimism scores), even small differences can look dramatic.


97/134

Calculating effect size

The information you need to calculate eta squared, one of the most common effect size statistics, is provided in the ANOVA table (a calculator would be useful here). The formula is:

$$\eta^2 = \frac{\text{Sum of squares between groups}}{\text{Total sum of squares}}$$

Cohen classifies .01 as a small effect, .06 as a medium effect and .14 as a large effect.
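Eta squared is the between-groups sum of squares divided by the total sum of squares. As a rough illustration, the sketch below applies that ratio to the sums of squares from the four-treatment ANOVA table earlier in the deck and labels the result with Cohen's cut-offs. The function names and the "negligible" label for values below .01 are assumptions, not part of the slides.

```python
def eta_squared(ss_between: float, ss_total: float) -> float:
    """Eta squared: share of total variation associated with group membership."""
    return ss_between / ss_total


def cohen_label(eta_sq: float) -> str:
    """Label an eta squared value using the cut-offs .01, .06 and .14."""
    if eta_sq >= 0.14:
        return "large"
    if eta_sq >= 0.06:
        return "medium"
    if eta_sq >= 0.01:
        return "small"
    return "negligible"  # label chosen for this sketch; Cohen gives no name here


# Sums of squares from the four-treatment ANOVA table (196.5 between, 2257.1 total).
eta_sq = eta_squared(196.5, 2257.1)
print(f"eta squared = {eta_sq:.3f} ({cohen_label(eta_sq)})")
```

Treating the cut-offs as lower bounds of each band is one reasonable reading of Cohen's guidelines; they are conventions rather than sharp thresholds.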


98/134

Warning

In this example we obtained a statistically significant result, but the actual difference in the mean scores of the groups was very small (21.36, 22.10, 22.96). This is evident in the small effect size obtained (eta squared = .02). With a large enough sample (in this case N = 435), quite small differences can become statistically significant, even if the difference between the groups is of little practical importance. Always interpret your results carefully, taking into account all the information you have available. Don't rely too heavily on statistical significance; many other


99/134

Presenting the results

A one-way between-groups analysis of variance was conducted to explore the impact of age on levels of optimism, as measured by the Life Orientation test (LOT). Subjects were divided into three groups according to their age (Group 1: 29 or less; Group 2: 30 to 44; Group 3: 45 and above). There was a statistically significant difference at the p


100/134

Two-way between-groups ANOVA


101/134

Two-way means that there are two independent variables, and between-groups indicates that different people are in each of the groups. This technique allows us to look at the individual and joint effect of two independent variables on one dependent variable.

The advantage of using a two-way design is that we can test the main effect for each independent variable and also explore the possibility of an interaction effect.

An interaction effect occurs when the effect of one independent variable on the dependent variable


102/134

Summary for two-way ANOVA

Example of research question: What is the impact of age and gender on optimism? Does gender moderate the relationship between age and optimism?

What you need: Three variables:
two categorical independent variables (e.g. Sex: males/females; Age group: young, middle, old); and
one continuous dependent variable (e.g. total optimism).

Assumptions: the assumptions underlying ANOVA.


103/134

Cont.

What it does: Two-way ANOVA allows you to simultaneously test for the effect of each of your independent variables on the dependent variable and also identifies any interaction effect.

For example, it allows you to test for:
sex differences in optimism;
differences in optimism for young, middle and old subjects; and
the interaction of these two variables: is there a difference in the effect of age on optimism for males and females?
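To make the idea of two main effects plus an interaction concrete, the sketch below splits the total sum of squares of a small balanced 2 x 3 design into sex, age, interaction, and within-cell parts. The scores are hypothetical numbers invented for this illustration (they are not the optimism data used in the slides), the function names are assumptions, and the decomposition shown holds only when every cell has the same number of observations.

```python
from itertools import product

# Hypothetical optimism scores, 3 per cell, for a balanced 2 x 3 design.
# These numbers are made up for illustration; they are not from the slides.
scores = {
    ("male", "young"): [20, 22, 21],
    ("male", "middle"): [22, 23, 21],
    ("male", "old"): [24, 23, 25],
    ("female", "young"): [21, 20, 22],
    ("female", "middle"): [22, 24, 23],
    ("female", "old"): [23, 22, 24],
}


def mean(values):
    return sum(values) / len(values)


def two_way_sums_of_squares(cells):
    """Split the total sum of squares of a balanced two-way design."""
    levels_a = sorted({a for a, _ in cells})
    levels_b = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))  # observations per cell (balanced design)
    everything = [x for v in cells.values() for x in v]
    grand = mean(everything)
    mean_a = {a: mean([x for b in levels_b for x in cells[(a, b)]]) for a in levels_a}
    mean_b = {b: mean([x for a in levels_a for x in cells[(a, b)]]) for b in levels_b}
    # Main effects: deviations of marginal means from the grand mean.
    ss_a = n * len(levels_b) * sum((mean_a[a] - grand) ** 2 for a in levels_a)
    ss_b = n * len(levels_a) * sum((mean_b[b] - grand) ** 2 for b in levels_b)
    # Interaction: what is left in the cell means after removing both main effects.
    ss_ab = n * sum(
        (mean(cells[(a, b)]) - mean_a[a] - mean_b[b] + grand) ** 2
        for a, b in product(levels_a, levels_b)
    )
    ss_within = sum((x - mean(v)) ** 2 for v in cells.values() for x in v)
    ss_total = sum((x - grand) ** 2 for x in everything)
    return {"sex": ss_a, "age": ss_b, "sex*age": ss_ab,
            "within": ss_within, "total": ss_total}


ss = two_way_sums_of_squares(scores)
for source, value in ss.items():
    print(f"{source:8s} {value:8.3f}")
```

Each effect's F ratio would then be its mean square divided by the within-cell mean square, as in the one-way case.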



104/134



105/134

Procedure for two-way ANOVA

1. From the menu at the top of the screen click on Analyze, then click on General Linear Model, then on Univariate.

2. Click on your dependent, continuous variable (e.g. total optimism) and move it into the box labelled Dependent variable.

3. Click on your two independent, categorical variables (sex, agegp3: this is age grouped into three categories) and move these into the box labelled Fixed Factors.

4. Click on the Options button.
Click on Descriptive Statistics, Estimates of effect size and Homogeneity tests.
Click on Continue.

Cont.


106/134

5. Click on the Post Hoc button.
From the Factors listed on the left-hand side choose the independent variable(s) you are interested in (this variable should have three or more levels or groups: e.g. agegp3).
Click on the arrow button to move it into the Post Hoc Tests for section.
Choose the test you wish to use (in this case Tukey).
Click on Continue.

6. Click on the Plots button.
In the Horizontal box put the independent variable that has the most groups (e.g. agegp3).
In the box labelled Separate Lines put the other independent variable (e.g. sex).
Click on Add.
In the section labelled Plots you should now see your two variables listed (e.g. agegp3*sex).

The output


107/134


108/134


109/134


110/134

Interpretation of output from two-way ANOVA


111/134

Descriptive statistics.
These provide the mean scores, standard deviations and N for each subgroup. Check that these values are correct.

Levene's Test of Equality of Error Variances.
This test provides a test of one of the assumptions underlying analysis of variance. The value you are most interested in is the Sig. level. You want this to be greater than .05, and therefore not significant. A significant result (Sig. value less than .05) suggests that the variance of your dependent variable across the groups is not equal.

If you find this to be the case in your study it is recommended that you set a more stringent significance level (e.g. .01) for evaluating the results of your two-way

Cont.


112/134

Tests Of Between-Subjects Effects.

This gives you a number of pieces of information, not necessarily in the order in which you need to check them.

Interaction effects

The first thing you need to do is to check for the possibility of an interaction effect (e.g. that the influence of age on optimism levels depends on whether you are a male or a female). In the SPSS output the line we need to look at is labelled AGEGP3*SEX. To find out whether the interaction is significant, check the Sig. column for that line. If the value is less than or equal to .05

Cont.


113/134

Main effects

In the left-hand column, find the variable you are interested in (e.g. AGEGP3). To determine whether there is a main effect for each independent variable, check in the column marked Sig. next to each variable.

Effect size

The effect size for the agegp3 variable is provided in the column labelled Partial Eta Squared (.018). Using Cohen's (1988) criterion, this can be classified as small (see introduction to Part Five). So, although this effect reaches statistical significance, the actual difference in the mean values is very small. From the Descriptives table we can see that the mean scores for the three age groups (collapsed for sex) are 21.36, 22.10, 22.96. The difference between the groups appears to be of little practical

Cont


114/134

Cont.

Post-hoc tests

Post-hoc tests are relevant only if you have more than two levels (groups) to your independent variable. These tests systematically compare each of your pairs of groups, and indicate whether there is a significant difference in the means of each. SPSS provides these post-hoc tests as part of the ANOVA output.

Cont


115/134

Cont.

Multiple comparisons

The results of the post-hoc tests are provided in the table labelled Multiple Comparisons. We have requested the Tukey Honestly Significant Difference test, as this is one of the more commonly used tests. Look down the column labelled Sig. for any values less than .05. Significant results are also indicated by a little asterisk in the column labelled Mean Difference.

Cont


116/134

Cont.

Plots

You will see at the end of your SPSS output a plot of the optimism scores for males and females, across the three age groups. This plot is very useful for allowing you to visually inspect the relationship among your variables.

Additional analyses if you obtain a significant interaction effect


117/134

Conduct an analysis of simple effects. This means that you will look at the results for each of the subgroups separately. This involves splitting the sample into groups according to one of your independent variables and running separate one-way ANOVAs to explore the effect of the other variable.

Use the SPSS Split File option. This option allows you to split your sample according to one categorical variable and to repeat analyses separately for each group.


118/134

Procedure for splitting the sample

1. From the menu at the top of the screen click on Data, then click on Split File.

2. Click on Organize output by groups.

3. Move the grouping variable (sex) into the box marked Groups based on.

4. This will split the sample by sex and repeat any analyses that follow for these two groups separately.

5. Click on OK.

After splitting the file you then perform a one-way ANOVA.

Important: When you have finished these analyses for the separate groups you must turn the Split File option off.

Presenting the results from two-way ANOVA


119/134

A two-way between-groups analysis of variance was conducted to explore the impact of sex and age on levels of optimism, as measured by the Life Orientation test (LOT). Subjects were divided into three groups according to their age (Group 1: 18–29 years; Group 2: 30–44 years; Group 3: 45 years and above). There was a statistically significant main effect for age [F(2, 429)=3.91, p=.02]; however, the effect size was small (partial eta squared=.02). Post-hoc comparisons using the Tukey HSD test indicated that the mean score for the 18–29 age group (M=21.36, SD=4.55) was significantly different from the 45+ group (M=22.96, SD=4.49). The 30–44 age group (M=22.10, SD=4.15) did not differ significantly from either of the other groups. The main effect for sex [F(1, 429)=.30, p=.59] and the


120/134


Analysis of covariance (ANCOVA)


121/134

Analysis of covariance is an extension of analysis of variance that allows you to explore differences between groups while statistically controlling for an additional (continuous) variable. This additional variable (called a covariate) is a variable that you suspect may be influencing scores on the dependent variable.

SPSS uses regression procedures to remove the variation in the dependent variable that is due to the covariate/s, and then performs the normal analysis of variance techniques on the corrected or adjusted scores.

By removing the influence of these additional variables ANCOVA can increase the power or sensitivity of the F-test.

Uses of ANCOVA


122/134

ANCOVA can be used when you have a two-group pre-test/post-test design (e.g. comparing the impact of two different interventions, taking before and after measures for each group). The scores on the pre-test are treated as a covariate to control for pre-existing differences between the groups. This makes ANCOVA very useful in situations when you have quite small sample sizes, and only small or medium effect sizes (see discussion on effect sizes in the introduction to Part Five). Under these circumstances (which are very common in social science research), Stevens (1996) recommends the use of two or three carefully chosen covariates to reduce the error variance and increase your chances of detecting a significant difference.

CONT.


123/134

ANCOVA is also handy when you have been unable to randomly assign your subjects to the different groups, but instead have had to use existing groups (e.g. classes of students). As these groups may differ on a number of different attributes (not just the one you are interested in), ANCOVA can be used in an attempt to reduce some of these differences. The use of well-chosen covariates can help reduce the confounding influence of group differences. This is certainly not an ideal situation, as it is not possible to control for all possible differences; however, it does help reduce this systematic bias. The use of ANCOVA with intact or existing groups is somewhat contentious among writers in the field. It would be a good idea to read more widely if you find yourself in this situation. Some of these issues are summarised in Stevens

Comparing means


124/134

Groups             Design        Sample / distribution              Test
1 group                          N ≥ 30                             One-sample t-test
                                 N < 30, normally distributed       One-sample t-test
                                 Not normal                         Sign test
2 groups           Independent   N ≥ 30                             t-test
                                 N < 30, normally distributed       t-test
                                 Not normal                         Mann–Whitney U or Wilcoxon rank-sum test
                   Paired        N ≥ 30                             Paired t-test
                                 N < 30, normally distributed       Paired t-test
                                 Not normal                         Wilcoxon signed-rank test
3 or more groups   Independent   Normally distributed, 1 factor     One-way ANOVA
                                 Normally distributed, 2 factors    Two-way or other ANOVA
                                 Not normal                         Kruskal–Wallis one-way analysis of variance by ranks
                   Dependent     Normally distributed               Repeated measures ANOVA
                                 Not normal                         Friedman two-way analysis of variance by ranks
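The "Comparing means" table is essentially a decision tree, which can be sketched as a small lookup function. This is only one possible encoding: the function name and parameters are assumptions made for illustration, and the N ≥ 30 threshold follows the table's rule of thumb.

```python
def choose_test(groups: int, paired: bool = False, n: int = 30,
                normal: bool = True, factors: int = 1) -> str:
    """Return the test suggested by the "Comparing means" table.

    groups  -- number of groups being compared (1, 2, or 3 and more)
    paired  -- True for paired/dependent samples (ignored for one group)
    n       -- sample size (only matters for one or two groups)
    normal  -- whether the data can be treated as normally distributed
    factors -- number of factors (only for 3+ independent, normal groups)
    """
    if groups < 1:
        raise ValueError("groups must be at least 1")
    if groups <= 2:
        # Large samples, or small normally distributed ones, use a t-test.
        parametric_ok = n >= 30 or normal
        if groups == 1:
            return "One-sample t-test" if parametric_ok else "Sign test"
        if paired:
            return "Paired t-test" if parametric_ok else "Wilcoxon signed-rank test"
        return "t-test" if parametric_ok else "Mann-Whitney U test"
    if paired:
        return ("Repeated measures ANOVA" if normal
                else "Friedman two-way analysis of variance by ranks")
    if not normal:
        return "Kruskal-Wallis one-way analysis of variance by ranks"
    return "One-way ANOVA" if factors == 1 else "Two-way or other ANOVA"


print(choose_test(groups=1, n=12, normal=False))
print(choose_test(groups=2, paired=True, n=50, normal=False))
print(choose_test(groups=3, normal=True, factors=2))
```

A lookup like this only restates the table; checking the assumptions behind each test still has to be done on the data.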

THE END


125/134

Test Statistic for Testing a Single Population Mean ($\mu$)


126/134

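$$t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} \quad \text{or} \quad t = \frac{\bar{X} - \mu_0}{SE(\bar{X})} \;\sim\; t\text{-distribution with } df = n - 1.$$

In general the basic form of a test statistic is given by:

$$t = \frac{(\text{estimate}) - (\text{hypothesized value})}{SE(\text{estimate})}$$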
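The one-sample statistic is the general form with the sample mean as the estimate and s divided by the square root of n as its standard error. A minimal Python sketch follows, using only the standard library; the data are hypothetical values made up for illustration, and `one_sample_t` is an assumed helper name.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (t, df) for testing whether the population mean equals mu0."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (divisor n - 1)
    standard_error = s / math.sqrt(n)
    return (x_bar - mu0) / standard_error, n - 1


# Hypothetical data, made up for illustration only.
t_stat, df = one_sample_t([2, 4, 4, 4, 5, 5, 7, 9], mu0=4)
print(f"t = {t_stat:.3f}, df = {df}")
```

The resulting t would then be compared with the t-distribution on n - 1 degrees of freedom, from a table or a statistics package.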



127/134

post hoc

Once you have determined that differences exist among the means, post hoc range tests and pairwise multiple comparisons can determine which means differ. Range tests identify homogeneous subsets of means that are not different from each other. Pairwise multiple comparisons test the difference between each pair of means, and yield a matrix where asterisks indicate significantly different group means at an alpha level of 0.05 (SPSS, Inc.).

Single-step tests


128/134

Tukey-Kramer: The Tukey-Kramer test is an extension of the Tukey test to unbalanced designs. Unlike the Tukey test for balanced designs, it is not exact. The FWE (familywise error rate) of the Tukey-Kramer test may be less than ALPHA. It is less conservative for only slightly unbalanced designs and more conservative when differences among sample sizes are bigger.

Hochberg's GF2: The GF2 test is similar to Tukey, but the critical values are based on the studentized maximum modulus distribution instead of the studentized range. For balanced or unbalanced one-way anova, its FWE does not exceed ALPHA. It is usually more conservative than the Tukey-Kramer test for unbalanced designs and it is always more conservative than the Tukey test for balanced designs.

Gabriel: Like the GF2 test, the Gabriel test is based on studentized maximum modulus. It is equivalent to the GF2 test for balanced one-way anova. For unbalanced one-way anova, it is less conservative than GF2, but its FWE may exceed ALPHA in highly unbalanced designs.

Dunnett: Dunnett's test is a test to use when the only pairwise comparisons of interest are comparisons with a control. It is an exact test, that is, its FWE is exactly equal to ALPHA, for balanced as well as

Single-step tests


129/134

Below are short descriptions of these tests.

LSD: The LSD (Least Significant Difference) test is a two-step test. First the ANOVA F test is performed. If it is significant at level ALPHA, then all pairwise t-tests are carried out, each at level ALPHA. If the F test is not significant, then the procedure terminates. The LSD test does not control the FWE.

Bonferroni: The Bonferroni multiple comparison test is a conservative test, that is, the FWE is not exactly equal to ALPHA, but is less than ALPHA in most situations. It is easy to apply and can be used for any set of comparisons. Even though the Bonferroni test controls the FWE, in many situations it may be too conservative and not have enough power to detect significant differences.

Single-step tests


130/134

Sidak: Sidak adjusted p-values are also easy to compute. The Sidak test gives slightly smaller adjusted p-values than Bonferroni, but it guarantees the strict control of FWE only when the comparisons are independent.

Scheffe: The Scheffe test is used in ANOVA analysis (balanced, unbalanced, with covariates). It controls the FWE for all possible contrasts, not only pairwise comparisons, and is too conservative in cases when pairwise comparisons are the only comparisons of interest.

Tukey: The Tukey test is based on the studentized range distribution (standardized maximum difference between the means). For one-way balanced anova, the FWE of the Tukey test is exactly equal to the assumed value of ALPHA. The Tukey test is also exact for one-way balanced anova with correlated errors when the type of correlation structure is compound symmetry.
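The Bonferroni and Sidak adjustments described in these notes are simple enough to compute by hand. The sketch below shows the standard formulas for m comparisons (Bonferroni: m times p, capped at 1; Sidak: 1 - (1 - p)^m). The function names and the raw p-values are hypothetical, chosen only for illustration.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of comparisons, cap at 1."""
    m = len(p_values)
    return [min(1.0, m * p) for p in p_values]


def sidak(p_values):
    """Sidak-adjusted p-values: 1 - (1 - p)^m for m comparisons."""
    m = len(p_values)
    return [1.0 - (1.0 - p) ** m for p in p_values]


# Hypothetical unadjusted p-values for three pairwise comparisons.
raw = [0.01, 0.04, 0.30]
for p, b, s in zip(raw, bonferroni(raw), sidak(raw)):
    print(f"raw = {p:.3f}  Bonferroni = {b:.4f}  Sidak = {s:.4f}")
```

The Sidak value is never larger than the Bonferroni value for the same raw p, which is the "slightly smaller adjusted p-values" property mentioned in the Sidak description.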



131/134

Appendix

Effect size statistics, the most common of which are:

eta squared,


132/134

Cohen's d and Cohen's f



133/134

ANOVA Table

Source of variation                d.f.    Sum of squares   Mean Sum of Squares   F-statistic   p-value
Between (k groups)                 k-1     SSB              SSB/(k-1)             F (below)     Go to F(k-1, nk-k) chart
Within (n individuals per group)   nk-k    SSW              s^2 = SSW/(nk-k)
Total variation                    nk-1    TSS

$$F = \frac{SSB/(k-1)}{SSW/(nk-k)}$$

SSB: sum of squared deviations of group means from grand mean
SSW: sum of squared deviations of observations from their group mean
TSS: sum of squared deviations of observations from grand mean

TSS = SSB + SSW

Characteristics of a Normal Distribution

1) Continuous Random Variable.

2) Bell-shaped curve.


134/134


3) The normal curve extends indefinitely in both directions, approaching, but never touching, the horizontal axis as it does so.

4) Unimodal

5) Mean = Median = Mode

6) Symmetrical with respect to the mean. That is, 50% of the area (data) under the curve lies to the left of the mean and 50% of the area (data) under the curve lies to the right of the mean.

7) (a) 68% of the area (data) under the curve is within one standard deviation of the mean
(b) 95% of the area (data) under the curve is within two standard deviations of the mean
(c) 99.7% of the area (data) under the curve is within three standard deviations of the mean
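The 68 / 95 / 99.7 percentages can be checked numerically. For a normal distribution, the area within k standard deviations of the mean equals erf(k / sqrt(2)); the short sketch below evaluates that with Python's standard library, and the helper name `area_within` is an assumption for illustration.

```python
import math


def area_within(k: float) -> float:
    """Area under the normal curve within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2.0))


for k in (1, 2, 3):
    print(f"within {k} SD: {100 * area_within(k):.1f}%")
```

The figures quoted in the list are these values rounded to the familiar rule-of-thumb percentages.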
