Chi Square: A Nonparametric Test PSYC 230 June 3rd, 2004 Shaun Cook, ABD University of Arizona


Page 1: Chi Square: A Nonparametric Test PSYC 230 June 3rd, 2004 Shaun Cook, ABD University of Arizona

Chi Square: A Nonparametric Test

PSYC 230
June 3rd, 2004

Shaun Cook, ABD
University of Arizona

Page 2

Nonparametric (a.k.a. Distribution-Free)

• Nonparametric refers to tests that:

– Make no estimates about parameters

– Make few or no assumptions

– Can be run with ordinal or nominal data

– Are usually less powerful than parametric tests

• They are significance tests

Page 3

Chi Square Distribution

• A distribution with one parameter, k

• Mathematically defined by:

$$f(\chi^2) = \frac{1}{2^{k/2}\,\Gamma(k/2)}\,(\chi^2)^{(k/2)-1}\,e^{-\chi^2/2}$$

• All values set except k

– k is the only value that can vary

– k is statistically equal to df

– Distribution changes for different values of k
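The density above can be evaluated directly to see how its shape changes with k. The following is a minimal sketch using only the Python standard library; the function name `chi_square_pdf` and the sample points are illustrative choices, not part of the original slides.

```python
import math

def chi_square_pdf(x, k):
    """Chi square density with k degrees of freedom, evaluated at x.

    For simplicity this sketch returns 0.0 for x <= 0.
    """
    if x <= 0:
        return 0.0
    # Normalizing constant: 2^(k/2) * Gamma(k/2)
    norm = 2 ** (k / 2) * math.gamma(k / 2)
    return x ** (k / 2 - 1) * math.exp(-x / 2) / norm

# Print the density at a few points for several values of k
for k in (1, 2, 4, 8):
    values = [round(chi_square_pdf(x, k), 4) for x in (1, 2, 4, 8)]
    print(f"k = {k}: {values}")
```

For small k the density is highest near zero; as k grows, the peak moves to the right and the curve flattens.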

Page 4

Chi Square Distribution

[Figure: chi square distribution (Howell, 1997)]

Page 5

Chi Square Test

• Based on the chi square distribution

• This is a nonparametric test

• It can be used with nominal data

– Therefore, it can also be used with data from more complex scales (ordinal, interval, ratio)

> Such data must first be converted to nominal form

• Tests whether frequency differences are due to chance

Page 6

Transforming Data

• Set of reaction time (RT) data, in ms:

{778, 921, 1148, 1675, 1721, 782, 1549, 846, 1313, 1947, 1498, 885, 1211}

• How can this be transformed into nominal data?
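One possible answer is to sort each RT into a labeled category and count the cases per category. The sketch below assumes a hypothetical cutoff of 1200 ms and the labels "fast" and "slow"; neither comes from the slides, and any other defensible cutoff would work the same way.

```python
from collections import Counter

rts_ms = [778, 921, 1148, 1675, 1721, 782, 1549, 846,
          1313, 1947, 1498, 885, 1211]

# Hypothetical cutoff chosen only for illustration
CUTOFF_MS = 1200

def to_category(rt):
    """Replace a reaction time with a nominal label."""
    return "fast" if rt < CUTOFF_MS else "slow"

# Frequencies per category are what a chi square test works with
counts = Counter(to_category(rt) for rt in rts_ms)
print(dict(counts))
```

After this step only the category frequencies remain; the information about how fast each response was is discarded.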

Page 7

The Nominal Scale

• Could be called labeling

• Numbers are assigned to define a category

– Therefore, all cases in the same category receive the same designation, the same number

• Categories are independent or mutually exclusive

• e.g., political party affiliation

Page 8

Nominal Data

• These data tell us whether a particular case possesses a particular trait; cases are categorized along these traits

– We do not know how much of the trait

• All categories must share one trait

• All observations within any category are equal

Page 9

Terminology

• χ² - chi square

• C - number of categories

• fo - frequency observed

• fe - frequency expected

Page 10

Chi Square and the H0

• Like all significance tests, the chi square test tests the H0

• The H0 of a chi square test states that the frequencies in your sample are equivalent to those that are expected

– H0: fo = fe

> How do you obtain the value of fe?

Page 11

Observed and Expected Frequencies

• Observed frequencies (fo): frequencies you observe in your sample

• Expected frequencies (fe): frequencies you would expect given H0

Page 12

Goodness of Fit (1 x C) Chi Square

• Applies when one group is assigned to C categories

• A good χ² for comparing a sample to a population

• Testing how well our observed frequencies (fo) fit with the expected frequencies (fe), given H0

Page 13

Ho & GOF Chi Square

• Ho can be stated in two ways:

– No Preference: the population is evenly divided among the categories

– No Difference: the expected frequencies (fe) are the same as those of a known population

Page 14

fe & GOF Chi Square

• fe can be calculated in two ways, corresponding to the Ho:

– By chance (no preference): $f_e = N / C$

– A priori (no difference): prior knowledge has informed your hypothesis, and your expected frequency is based on this prior knowledge

Page 15

Calculating Chi Square

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$

This formula generalizes to multiple category variables

Page 16

Practical Problem

A professor surveys her students to find out if they favor elimination of final exams. She determines that 160 favor elimination, 115 do not, and 80 are undecided. Are the students equally divided?

Page 17

Calculating 1 x C χ²

                     Categories
                     A     B     C
fo                   #     #     #
fe                   #     #     #
fo – fe              #     #     #
(fo – fe)²           #     #     #
(fo – fe)² / fe      #     #     #

∑ of values in the bottom row = χ² value
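The table can be filled in row by row for the professor's survey (160 favor, 115 oppose, 80 undecided), taking fe = N / C under the "no preference" H0. This is a minimal sketch; the category labels and variable names are illustrative.

```python
observed = {"favor": 160, "oppose": 115, "undecided": 80}

n = sum(observed.values())   # N = total number of students
c = len(observed)            # C = number of categories
fe = n / c                   # expected frequency under "no preference"

chi_square = 0.0
print(f"{'category':<10} {'fo':>5} {'fe':>8} {'fo-fe':>8} {'(fo-fe)^2':>10} {'/fe':>8}")
for category, fo in observed.items():
    diff = fo - fe
    contribution = diff ** 2 / fe      # one cell of the bottom row
    chi_square += contribution
    print(f"{category:<10} {fo:>5} {fe:>8.2f} {diff:>8.2f} {diff ** 2:>10.2f} {contribution:>8.2f}")

# Sum of the bottom row is the chi square value
print(f"chi square = {chi_square:.2f}, df = {c - 1}")
```

The sum of the bottom row comes to about 27.18 with df = 2.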

Page 18

Evaluating χ²

• df = C - 1

• Once you have calculated a χ² value, you compare it to a table value (p. 699)

• Find the table value by looking up the df & α level

• If calculated χ² ≥ table value, reject Ho

• Ho: fo = fe

• Ha: fo ≠ fe

Page 19

χ² Table

• Treat just like t table

• Note that, unlike t, as you increase df, the table or critical value also increases

– Making it harder to find a significant result at these higher df

Page 20

Class Problem

A professor surveys her students to find out if they favor elimination of final exams. She determines that 160 favor elimination, 115 do not, and 80 are undecided. Can she reject the H0 that states the students are divided equally?

$\chi^2_{.05}(2) = 5.99$

χ² = 27.18; reject H0

Page 21

Class Problem

Consumer psychologists tell us that red is a powerful color for merchandising. According to the numbers, products whose packaging contains red sell 2/3 more often than equivalent products whose packaging lacks red. Packaging companies know this & therefore charge more for red packaging. We test a new product in two packages: R+ & R-. We find that 49 people prefer the R+ & 38 prefer the R-. Does our sample prefer red to the same degree?

$\chi^2_{.05}(1) = 3.84$

χ² = 4.49; reject H0
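This problem uses a priori expected frequencies rather than an even split. The sketch below rests on an assumption: it reads the claim as "R+ is preferred 2/3 of the time" and rounds the proportions to .67 and .33, which appears to reproduce the value on the slide. With the unrounded 2/3 and 1/3 the statistic is slightly smaller, but the decision is the same.

```python
def gof_chi_square(observed, proportions):
    """Goodness-of-fit chi square with a priori expected proportions."""
    n = sum(observed)
    expected = [n * p for p in proportions]
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

observed = [49, 38]          # R+ preferred, R- preferred
CRITICAL_05_DF1 = 3.84       # table value for alpha = .05, df = 1

# Assumption: proportions rounded to two decimals
chi_rounded = gof_chi_square(observed, [0.67, 0.33])
# Same test with the unrounded proportions
chi_exact = gof_chi_square(observed, [2 / 3, 1 / 3])

for label, value in (("rounded", chi_rounded), ("exact", chi_exact)):
    decision = "reject H0" if value >= CRITICAL_05_DF1 else "fail to reject H0"
    print(f"{label}: chi square = {value:.2f}; {decision}")
```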

Page 22

Independence (r x C) Chi Square

• Analysis of contingency

• Applies when more than one group is assigned to C categories

• A good χ² for comparing one sample to another sample

• Uses contingency tables

• Tests H0: the observed frequencies for one category are independent of the observed frequencies for any other; they occurred by chance

Page 23

Contingency Tables

• Show the distribution of one variable at each level of another variable

• Also known as crosstabs

• Rows are defined by the groups

• Columns are defined by the categories

• Identifies marginal totals

Page 24

Marginal Totals

• These are the totals of the frequencies in all cells of a row or column

– For rows, they are placed to the right

– For columns, they are placed at the bottom

∑ row totals = ∑ column totals = N

Page 25

Expected Frequencies, df, & Independence χ²

• fe = (row total * column total) / N

– Follows from multiplicative law of probability

• df = (# of rows - 1) (# of columns - 1)

– Refers to the number of cell values that are free to vary once the marginal totals are set

– Check by crossing out 1 row & 1 column

Page 26

Class Problem

A 1993 survey of men in CA looked at marital and employment status. It found the following breakdown:

              Married   Not Married   Never Married
Employed        679         103            114
Unemployed       63          10             20

Do men of different marital statuses have different distributions of employment status? Or, are these differences just chance variation?

$\chi^2_{.05}(2) = 5.99$

χ² = 5.56; fail to reject H0
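The calculation for this problem can be sketched as follows: expected frequencies come from the marginal totals, fe = (row total * column total) / N, and df = (rows - 1)(columns - 1). The function name is illustrative and the code uses only the standard library.

```python
def independence_chi_square(table):
    """Chi square test of independence for an r x C table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi_square = 0.0
    for i, row in enumerate(table):
        for j, fo in enumerate(row):
            # Expected frequency from the marginal totals
            fe = row_totals[i] * col_totals[j] / n
            chi_square += (fo - fe) ** 2 / fe

    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi_square, df

# Rows: employed, unemployed; columns: married, not married, never married
survey = [
    [679, 103, 114],
    [63, 10, 20],
]

chi_square, df = independence_chi_square(survey)
CRITICAL_05_DF2 = 5.99   # table value for alpha = .05, df = 2
decision = "reject H0" if chi_square >= CRITICAL_05_DF2 else "fail to reject H0"
print(f"chi square = {chi_square:.2f}, df = {df}; {decision}")
```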

Page 27

χ² & Percentage

• χ² can be calculated with percentages

• The formula stays the same

• Treat the percentages just as you would frequencies

• Remember, a key factor in χ² is sample size

• Percentage-based χ² must account for N

• It does so after the χ², based on the percentages, has been calculated

$$\chi^2 = \frac{\chi^2_{\%}\,(N)}{100}$$

Page 28

Class Problem

You have classified a sample of 24 people into 5 categories based on ethnicity, using percents. You surveyed these people on their attitudes toward increasing taxes. To see if their attitudes were related to ethnicity, you have calculated χ² and obtained a value of 14.28. What is your conclusion?

$\chi^2_{.05}(4) = 9.49$

χ² = 3.43; fail to reject H0
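The correction for sample size in this problem is a single rescaling step. A minimal sketch, with an illustrative function name:

```python
def chi_square_from_percent(chi_square_pct, n):
    """Rescale a chi square computed on percentages to the actual N."""
    return chi_square_pct * n / 100

chi_square = chi_square_from_percent(14.28, 24)
CRITICAL_05_DF4 = 9.49   # table value for alpha = .05, df = 4
decision = "reject H0" if chi_square >= CRITICAL_05_DF4 else "fail to reject H0"
print(f"chi square = {chi_square:.2f}; {decision}")
```

The percentage-based value of 14.28 would have exceeded the table value, but with only 24 people the corrected value does not.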

Page 29

Assumptions of χ²

• Inclusion of non-occurrences

• Normality - expected cell frequencies large enough

• Independence

Page 30

Inclusion of Non-Occurrences

• Every possible value of a variable needs to be included

– Some slippage is OK with very rare events

Page 31

Violation Example

Are Catholics more likely to vote pro-abortion than Non-Catholics?

              Catholics   Non-Catholics
Pro votes:       400           100

• Surprisingly, it looks like the answer is yes

• We have not considered the non-occurrences

Page 32

Violation Example

Are Catholics more likely to vote pro-abortion than Non-Catholics?

              Catholics   Non-Catholics
Pro votes:       400           100
Con votes:      1200           100

• Catholics are much more likely to vote con
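Once the con votes are included, the comparison can be made on rates rather than raw counts. A short sketch of that arithmetic, using the counts from the table (the variable names are illustrative):

```python
votes = {
    "Catholics": {"pro": 400, "con": 1200},
    "Non-Catholics": {"pro": 100, "con": 100},
}

# Proportion of each group voting pro, using occurrences AND non-occurrences
pro_rate = {
    group: counts["pro"] / (counts["pro"] + counts["con"])
    for group, counts in votes.items()
}

for group, rate in pro_rate.items():
    print(f"{group}: {rate:.0%} pro")
```

The raw pro counts (400 vs. 100) point one way, while the rates (25% vs. 50%) point the other, which is why leaving out non-occurrences is misleading.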

Page 33

Assumption of Normality

• Refers to having large enough frequencies for the normal approximation to the multinomial to be valid - make sure to check

• Different opinions on this:

– Some say that all cells need fe > 5

– Some say that no more than 20% of cells can have fe < 5

• Biggest problem is lack of power

• Fisher’s exact test is an alternative for 2x2 tables
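Both rules of thumb can be checked mechanically before running the test. The sketch below uses a hypothetical 2 x 2 table with small counts, made up purely for illustration; the function names are also illustrative.

```python
def expected_frequencies(table):
    """Expected cell frequencies from the marginal totals."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def check_expected(table):
    """Apply the two rules of thumb for expected cell frequencies."""
    cells = [fe for row in expected_frequencies(table) for fe in row]
    share_below_5 = sum(fe < 5 for fe in cells) / len(cells)
    return {
        "smallest_fe": min(cells),
        "share_below_5": share_below_5,
        "all_cells_above_5": all(fe > 5 for fe in cells),   # stricter rule
        "at_most_20_percent_below_5": share_below_5 <= 0.20,  # looser rule
    }

# Hypothetical small-sample table, not from the slides
small_table = [
    [12, 3],
    [4, 1],
]
result = check_expected(small_table)
print(result)
```

For this made-up table both rules fail, so an alternative such as Fisher's exact test would be worth considering.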

Page 34

Assumption of Independence

• Each subject falls into one and only one cell

– Check: do the totals of your cell counts = N?

• If you have repeated measurements, you do not have independence

• Alternative if you don’t have independence

– McNemar test

Page 35

McNemar’s Test

• In some cases, you can account for a lack of independence by using McNemar’s test

• Can only be computed with a 2 x 2 contingency table

• Within the table, we do not have the observed frequencies

– We have change scores

• We compute χ² on these change scores

Page 36

Ho McNemar’s Test

• The Ho in this case states that the distributions of original & changed scores are the same

Page 37

McNemar’s Test

$$\chi^2 = \frac{(a - d)^2}{a + d}$$

• The contingency table must be set up as:

               Post
              -     +
Pre     +     a     b
        -     c     d

Page 38

Class Problem

You have classified a sample of 100 Texans into 2 categories: 77 pro death penalty & 23 con. You surveyed these Texans again after they had watched an execution. Of the original 77 pro opinions, 61 remain. Of the original 23 con, 18 remain. Did viewing an execution change Texans’ attitudes?

$\chi^2_{.05}(1) = 3.84$

χ² = 5.76; reject H0
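The two change cells for this problem follow from the counts given: 77 - 61 people moved from pro to con, and 23 - 18 moved from con to pro. A minimal sketch of the calculation, with illustrative names:

```python
def mcnemar_chi_square(a, d):
    """McNemar's chi square from the two change cells a and d."""
    return (a - d) ** 2 / (a + d)

a = 77 - 61   # pro before, con after
d = 23 - 18   # con before, pro after

chi_square = mcnemar_chi_square(a, d)
CRITICAL_05_DF1 = 3.84   # table value for alpha = .05, df = 1
decision = "reject H0" if chi_square >= CRITICAL_05_DF1 else "fail to reject H0"
print(f"a = {a}, d = {d}, chi square = {chi_square:.2f}; {decision}")
```

Only the people who changed their opinion enter the statistic; the 61 + 18 who did not change are ignored.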

Page 39

χ² & Effect Size

$$\text{Effect size} = \sqrt{\frac{\chi^2}{N + \chi^2}}$$

• This measure of effect size is also called the contingency coefficient

• This measure of effect size for χ² has different conventions than those for parametric tests

– .10 (small effect size)

– .25 (medium effect size)

– .40 (large effect size)

Page 40

Class Problem

A professor surveys her students to find out if they favor elimination of final exams. She determines that 160 favor elimination, 115 do not, and 80 are undecided. Can she reject the H0 that states the students are divided equally?

$\chi^2_{.05}(2) = 5.99$

χ² = 27.18; reject H0

ES = .27, medium effect
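The effect size for this problem is the contingency coefficient, the square root of χ² / (N + χ²), with N = 160 + 115 + 80. A short sketch of the arithmetic (the function name is illustrative):

```python
import math

def contingency_coefficient(chi_square, n):
    """Effect size for chi square: sqrt(chi2 / (N + chi2))."""
    return math.sqrt(chi_square / (n + chi_square))

n = 160 + 115 + 80            # total number of students surveyed
effect_size = contingency_coefficient(27.18, n)
print(f"ES = {effect_size:.2f}")
```

The result is closest to the .25 convention, hence "medium effect".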

Page 41

Questions / Comments?

Thank You

The end

Page 42

Homework

• Chapter 17

– 1, 2, 3, 4, 9, 13, 17, 19, 21