13 statistical fdn nominal ordinal data analysis (1)



Chapter 13 Statistical Foundations: Nominal and Ordinal Data Analysis

I. Analyzing Nominal Data
  A. Introduction
    1. Nominal Data
      a. Nominal data consist of names, labels, or categories only. The data can’t be arranged in an ordering scheme, and the categories must be mutually exclusive. Examples: Boy/Girl; European, African, Asian.
      b. What is actually analyzed are raw frequencies, i.e., the number of boys and girls, or the number of Europeans, Africans, and Asians.
    2. Three statistical procedures are applied to analyze nominal data.
      a. Chi-Square Test for Goodness of Fit
      b. Chi-Square Test of Independence (also called Association)
      c. Chi-Square Test of Homogeneity
  B. Chi-Square Introduction
    1. Chi-square has a distribution but is classified as a nonparametric statistical test by tradition (Daniel, 1990, pp. 178-179). Nonparametric tests are distribution-free statistics which are used when a population's (or a sample's) distribution is substantially non-normal and parametric tests are not suitable.
    2. Chi-square can be used to test whether
      a. Observed nominal data conform to (or are statistically different from) some theoretical or expected distribution (Daniel, 1990, p. 306; Morehouse and Stull, 1975, p. 311).
      b. Two variables within a sample are related (Daniel, 1990, p. 181).
      c. Two or more samples, drawn from different populations, are homogeneous on some characteristic of interest (Daniel, 1990, pp. 192-93).
    3. The chi-square distribution approaches the normal distribution as n increases (Daniel, 1990, p. 180). When the sample size is N > 30, the chi-square is approximately normal (Morehouse and Stull, 1975, p. 313).
    4. The null (or statistical) hypothesis (Ho) states that there are “no differences between observed or expected frequencies” (Goodness of Fit), that “the variables or samples are not related, i.e., are independent” (Test for Independence), or that the samples are homogeneous (Test of Homogeneity).
      a. As normally applied, the χ² is not directed at any specific alternative hypothesis (Reynolds, 1984, p. 22).


      b. Degrees of freedom, alpha, and power must be specified or determined. The applicable degrees of freedom (df) depend on the test applied. Alpha is specified a priori, usually at the .05 or .01 level.
    5. While a statistically significant χ² establishes a relationship between two variables, it reveals little else (Reynolds, 1984, p. 20).
      a. A statistically significant chi-square is not a measure of the strength of an association (Daniel, 1990, p. 400; Reynolds, 1984, p. 30; Welkowitz, Ewen, & Cohen, 1991, p. 298; Udinsky, Osterlind, & Lynch, 1981, p. 214).
      b. Hence, when a relationship has been established, additional analysis must be conducted.
      c. Subsequent analysis may be accomplished by partitioning χ², which is a detailed, time-intensive task (Reynolds, 1984, pp. 23-30), or by applying measures of association (Reynolds, 1984, pp. 35-44; Welkowitz, Ewen, & Cohen, 1991, pp. 298-301; Siegel, 1956, pp. 196-202).

    6. Assumptions
      a. The distribution is not symmetric.
      b. Chi-square values can be zero or positive but never negative.
      c. The chi-square distribution is different for each degree of freedom. As the number of degrees of freedom increases, the distribution approaches the SNC.
      d. Degrees of freedom (df) vary depending on the chi-square test being used.
      e. Measurements are independent of each other. Before-and-after frequency counts of the same subjects cannot be analyzed using χ².

    7. Chi-Square Issues
      a. Expected frequency cell size
        (1) The greater the numerical difference between observed and expected frequencies within cells, the more likely is a statistically significant χ². Cells with small expected frequencies (< 5) may yield an inflated χ².
        (2) There is considerable debate concerning small cell expected frequencies (Daniel, 1990, p. 185). Various authors have proposed guidelines depending on the specific test.
        (3) Welkowitz, Ewen, & Cohen (1991, p. 292) offer general advice:
          (a) for df = 1, all expected frequencies should be at least 5;
          (b) for df = 2, all expected frequencies should be at least 3; and
          (c) for df = 3, all but one expected frequency value should be at least 5.
        (4) If cells are to be combined, there should be a logical reason for any combination effected; otherwise interpretation is adversely affected. Morehouse and Stull (1975, pp. 320-321) advocate the use of Yates' Correction for Continuity when expected cell sizes are less than 5. However, Daniel (1990, p. 187) reports that, based on research, there is a trend away from applying Yates' Correction. Spatz (2001, p. 293) advises against using the Yates correction.


      b. Percentages
        (1) There is some disagreement among authors as to whether or not percentages may be included in chi-square computations.
        (2) Reynolds (1984, p. 36) argues for 2 X 2 contingency tables that “percentages permit one to detect patterns of departure from independence...and [that] percentages are particularly useful in 2 X 2 tables.” However, Morehouse and Stull (1975, p. 320) disagree: “the direct numerical counts or raw data should always be used as a basis for the calculation of chi-square. Percentages and ratios are not independent, and consequently, their use will result in errors in conclusions.”
        (3) Daniel (1990), Siegel (1956), Welkowitz, Ewen, & Cohen (1991), and Udinsky, Osterlind, & Lynch (1981) are silent on the subject, but none of their examples contain percentages.

  C. Chi-Square Goodness of Fit Test
    1. Theory and Formula
      a. The Chi-Square Goodness of Fit Test is a one variable procedure. There is only one variable, often with several levels. Siegel (1956, pp. 42-47) refers to this type of test as “The χ² One Sample Test.”
      b. In a goodness of fit test, there are observed and expected frequencies (Table 13.1).
        (1) Expected frequencies are derived from a theory, hypothesis, or commonly accepted standard.
        (2) The null hypothesis is that the collected data “fit the model.” If the null is rejected, then the conclusion is that the data didn’t fit the model, i.e., the expected frequencies (Welkowitz, Ewen, & Cohen, 1991, pp. 289-292). The alternative hypothesis, H1, states the opposite.
        (3) In other words, if one is drawing a random sample from a population, it is expected that there will be a reasonable "fit" between the sample frequencies and those found in the population across some characteristic of interest (Daniel, 1990, p. 307; Reynolds, 1984, p. 17; Siegel, 1956, p. 43; Welkowitz, Ewen, & Cohen, 1991, pp. 289-291).
      c. Formula 13.1 Chi-Square Goodness of Fit (Spatz, 2001, p. 280)

        \[ \chi^2 = \sum \left[ \frac{(O - E)^2}{E} \right] \]

        where: χ² = chi-square value; O = observed frequencies; E = expected frequencies
      d. For the Goodness of Fit test, there are no measures of association, as it’s a one variable test.


        (1) We are concerned only with whether or not the observed data “fit” the expected.
        (2) One reason the χ² Goodness of Fit test is not used as a measure of association is that its magnitude depends on sample size (i.e., a large sample yields a large computed χ² value, and the larger the computed χ², the greater the chance of rejecting the null hypothesis).
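Formula 13.1 can be checked with a few lines of code. The following is a minimal Python sketch and not part of the original chapter: the function name `chi_square_gof` is hypothetical, and the counts are those of the oatmeal case in Table 13.1 (88 customers, equal expected frequencies).

```python
def chi_square_gof(observed, expected):
    """Return the chi-square goodness-of-fit statistic (Formula 13.1)."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Counts from Table 13.1: Apple, Peach, Cherry, Prune.
observed = [36, 21, 12, 19]
# Under the null hypothesis each product is purchased equally often.
expected = [sum(observed) / len(observed)] * len(observed)

chi2 = chi_square_gof(observed, expected)
df = len(observed) - 1  # k - 1 categories
print(f"chi-square = {chi2:.3f}, df = {df}")
```

The unrounded result differs from the 13.908 reported in Table 13.1 only in the third decimal place, because the table sums cell values that were already rounded.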

    2. Index of Effect Size
      a. For the Goodness of Fit Test, we’re comparing how an observed distribution “fits” an “expected” distribution; when w = 0, there is a “perfect fit.” As w increases, the degree of departure from “a perfect fit” increases. Thus, when w = .1, there is a small effect, or small departure from “fit.”
      b. Formula 13.2 Index of Effect Size (Cohen, 1988, pp. 216-218)

        \[ w = \sqrt{ \sum_{i=1}^{m} \frac{(P_{1i} - P_{0i})^2}{P_{0i}} } \]

        where: P0i = proportion in a cell posited by the null hypothesis; P1i = proportion in a cell posited by the alternative hypothesis; and m = number of cells.
      c. Formula 13.2 is tedious and time consuming to compute by hand; use a statistical computer program, e.g., SPSS or SAS. For the Chi-Square Test of Association or Homogeneity, there are computationally convenient alternative effect size indices.
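Formula 13.2 is easy to script. The sketch below is an illustration only: the function name `cohens_w` is hypothetical, and substituting the observed sample proportions from the oatmeal case (Table 13.1) for the alternative-hypothesis proportions P1i is an assumption made here to obtain a sample estimate of w.

```python
import math


def cohens_w(p_null, p_alt):
    """Cohen's w (Formula 13.2) from null and alternative cell proportions."""
    if len(p_null) != len(p_alt):
        raise ValueError("both proportion lists must have the same length")
    return math.sqrt(sum((p1 - p0) ** 2 / p0 for p0, p1 in zip(p_null, p_alt)))


# Assumption: observed proportions from Table 13.1 stand in for P1i.
counts = [36, 21, 12, 19]
n = sum(counts)
p_alt = [c / n for c in counts]
p_null = [0.25] * 4  # equal purchase proportions under the null hypothesis

w = cohens_w(p_null, p_alt)
print(f"w = {w:.3f}")
```

With these inputs w is about .40, which is the same value obtained from the square root of χ²/N for that case.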

      d. For chi-square, Cohen (1988, pp. 224-225) recommends w = .10 (small effect), w = .30 (medium effect), and w = .50 (large effect).
    3. Case 1: Oatmeal and Taste Preference
      a. Your company produces four types of microwave oatmeal products: Apple, Peach, Cherry, and Prune. The vice-president for sales has received marketing data that suggests the purchasing decision is not based on fruit taste preference. You have been asked to verify these data. At the α = .05 level, test the claim that product purchase is unrelated to taste preference. You have randomly identified 88 customers and learned which oatmeal product they last purchased.
      b. Apply Classical Hypothesis Testing Method
        (1) Are product purchases related to taste preference?
        (2) ρ1 = ρ2 = ρ3 = ρ4
        (3) ρ1 ≠ ρ2 ≠ ρ3 ≠ ρ4
        (4) Ho: ρ1 = ρ2 = ρ3 = ρ4
        (5) H1: ρ1 ≠ ρ2 ≠ ρ3 ≠ ρ4
        (6) α = 0.05


        (7) Chi-Square Goodness of Fit Test
        (8) Compute the test statistic and select the relevant critical value(s).
          (a) Data Table

          Table 13.1 Taste Preference Case Data
          Taste     O     E     O-E    (O-E)²   (O-E)²/E
          Apple     36    22     14     196      8.909
          Peach     21    22     -1       1      0.045
          Cherry    12    22    -10     100      4.545
          Prune     19    22     -3       9      0.409
                                            χ² = 13.908

          (b) Compute Degrees of Freedom: df = 3 (k-1 or 4-1)
          (c) Substitute into Formula 13.1: See Data Table, where χ² = 13.908
          (d) Critical Value: 7.815 at α = .05 & df = 3 (Triola 1998, p. 716)
        (9) Apply Decision Rule: Since χ² = 13.908 ≥ 7.815, reject Ho: ρ1 = ρ2 = ρ3 = ρ4, p < .05.
        (10) There appears to be a relationship between purchase decision and taste preference. Customers purchased the Apple product more often, and the Cherry product less often, than expected.
        (11) Effect Size Estimate: In this case an effect size could be computed and then interpreted using Cohen’s (1988, pp. 224-225) criteria.
  D. Chi-Square Test of Independence (Also called Association)
    1. Theory and Formulae
      a. Welkowitz, Ewen, & Cohen (1991, pp. 293-297) and Daniel (1990, pp. 181-185) report that this test is applied when two variables from the same sample are to be tested.

        (1) The Test of Independence can be computed in the same manner as the Goodness of Fit.
        (2) Formula 13.3 Chi-Square Test of Independence (Spatz, 2001, p. 280)

          \[ \chi^2 = \sum \left[ \frac{(O - E)^2}{E} \right] \]

      b. Measure of Association: The phi (φ) Coefficient
        (1) The phi coefficient is a symmetric index with a range between 0 and 1, where zero equals statistical independence.
          (a) Maximum value is attained only under strict perfect association.


          (b) The phi coefficient is depressed by skewed (i.e., imbalanced) marginal totals. The greater the skewness, the greater the depression.
        (2) Before the phi coefficient is applied, there should be a statistically significant chi-square value from a 2 x 2 contingency table. Apply the phi coefficient only to a 2 x 2 Contingency Table.
        (3) Formula 13.4 phi Coefficient (Spatz, 2001, p. 284)

          \[ \phi = \sqrt{\frac{\chi^2}{N}} \]

          where: φ = phi coefficient; χ² = χ² test statistic; N = total number of subjects
      c. Measure of Association: Cramer’s C or V Statistic
        (1) Cramer’s C or V statistic (Daniel, 1990, pp. 403-404; Spatz 2001, p. 291) is applied as a measure of association when “r x c” or “i x j” tables larger than the 2 x 2 Contingency Table are tested. The “r x c” and “i x j” terminology is used interchangeably.
        (2) Characteristics
          (a) Value varies between “0” and “1.”
          (b) Where no association is present, C = 0.
          (c) When r = c, a perfect correlation is indicated by C = 1.
          (d) When r ≠ c, a perfect correlation may not be indicated even when C = 1.
        (3) Formula 13.5 Cramer’s C or V Statistic (Daniel, 1990, p. 403)

          \[ C = \sqrt{\frac{\chi^2}{n(t - 1)}} \]

          where: χ² = computed χ² test statistic; n = sample size; t = the number of rows or columns, whichever is less
        (4) If χ² is significant, then Cramer’s C is significant (Ewen, 1991, p. 178).
      d. Measure of Association: Contingency Coefficient (CC)
        (1) Siegel (1956, p. 196) describes the contingency coefficient as “a measure of the extent of association or relation between two sets of attributes [variables].” It is applied as a measure of association when “r x c” or “i x j” tables larger than the 2 x 2 Contingency Table are tested.


        (2) CC will have the same value regardless of how the categories are arranged in the columns or rows.
        (3) Formula 13.6 The Contingency Coefficient (Siegel, 1956, p. 196)

          \[ CC = \sqrt{\frac{\chi^2}{N + \chi^2}} \]

          where: CC = contingency coefficient; χ² = χ² test statistic; N = total number of subjects
      e. Effect Size Index
        (1) For a 2 x 2 Contingency Table, the phi coefficient does double duty as a measure of association and effect size (Spatz, 2001, p. 284).
        (2) For “r x c” tables with more than one degree of freedom, where df = (r-1)(c-1), use Formula 13.5 (Spatz, 2001, p. 291).
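The three measures of association in Formulas 13.4 to 13.6 are each a one-line computation once χ² is known. The Python sketch below is illustrative only; the function names are hypothetical, and the χ² values passed in are the ones reported later in this chapter for Table 13.5 (χ² = 6.895, N = 100) and Table 13.7 (χ² = 8.0, N = 100).

```python
import math


def phi_coefficient(chi2, n):
    """Formula 13.4: phi for a 2 x 2 table."""
    return math.sqrt(chi2 / n)


def cramers_v(chi2, n, rows, cols):
    """Formula 13.5: Cramer's C or V; t is the smaller of rows and cols."""
    t = min(rows, cols)
    return math.sqrt(chi2 / (n * (t - 1)))


def contingency_coefficient(chi2, n):
    """Formula 13.6: contingency coefficient CC."""
    return math.sqrt(chi2 / (n + chi2))


# 2 x 3 table with chi-square = 8.0 and N = 100.
print(round(cramers_v(8.0, 100, 2, 3), 3))          # 0.283
print(round(contingency_coefficient(8.0, 100), 3))  # 0.272
# 2 x 2 table with chi-square = 6.895 and N = 100.
print(round(phi_coefficient(6.895, 100), 3))        # 0.263
```

For a 2 x 2 table t - 1 = 1, so Formula 13.5 reduces to Formula 13.4; this is why phi can serve as the effect size index in that case.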

    2. Case 1: Training Product Sales Approach
      a. Example: A product manager within your training organization has recommended that your company’s historically most successful product needs a new sales approach to boost sales. You randomly selected 305 former trainees and, through the mail, asked them to evaluate the current product sales approach. One demographic variable you assessed was gender. This is a two variable analysis.
      b. Apply the Hypothesis Testing Model
        (1) Is there a difference in sales approach preference based on gender?
        (2) ρ1 = ρ2
        (3) ρ1 ≠ ρ2
        (4) Ho: ρ1 = ρ2
        (5) H1: ρ1 ≠ ρ2
        (6) α = 0.05
        (7) Chi-Square Test of Association
        (8) Compute the test statistic and select the relevant critical value(s).
          (a) Data Table

          Table 13.2 Training Sales Approach Case Data (expected frequencies in parentheses)
          Variable                   Men             Women           Row Total
          Like Sales Approach        43 (38.87)      35 (39.12)      78
          Disliked Sales Approach    109 (113.13)    118 (113.87)    227
          Column Total               152             153             305

          (b) Compute Degrees of Freedom: df = 1 (r-1)(c-1)
          (c) Substitute into Formula 13.3 Chi-Square Test of Association

          To compute expected frequency for the Test of Independence:
          Expected = (row total)(column total) / (grand total)


          Table 13.3 Training Sales Approach χ² Data
          Cell         O      E        O-E     (O-E)²    (O-E)²/E
          M/Like       43     38.87    4.13    17.057    0.439
          W/Like       35     39.12   -4.12    16.974    0.434
          M/Dislike    109    113.13  -4.13    17.057    0.151
          W/Dislike    118    113.87   4.13    17.057    0.150
                                                    χ² = 1.174

          (d) Critical Value: 3.841 at α = .05 & df = 1 (Triola 1998, p. 716)
        (9) Apply Decision Rule: Since χ² = 1.174 is < 3.841, retain Ho: ρ1 = ρ2, as p > .05.
        (10) There is insufficient evidence to conclude that there is a relationship between liking the sales approach and gender. It appears that both men and women disliked the training product’s sales approach.
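The expected frequencies and the test statistic for this case can be reproduced from the observed counts alone. The following Python sketch is an illustration under stated assumptions: the function names `expected_frequencies` and `chi_square_independence` are hypothetical, and the input is the 2 x 2 table of observed counts from Table 13.2.

```python
def expected_frequencies(table):
    """Expected cell counts: (row total)(column total) / (grand total)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_independence(table):
    """Chi-square statistic (Formula 13.3) and df = (r - 1)(c - 1)."""
    expected = expected_frequencies(table)
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Observed counts from Table 13.2: rows = liked / disliked, columns = men / women.
observed = [[43, 35], [109, 118]]
chi2, df = chi_square_independence(observed)
print(f"chi-square = {chi2:.3f}, df = {df}")
```

The same two functions work unchanged for larger r x c tables, since nothing in them assumes two rows or two columns.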

        (11) Effect Size Estimate: Doesn’t apply.
    3. Case 2: Education Attainment and Training Preference
      a. The 2 x 2 or Contingency Table
        (1) The Contingency Table can be used when two categorical variables with two levels each are being tested for an association.
        (2) Formula 13.7 2 x 2 Contingency Table (Spatz, 2001, p. 284)

          \[ \chi^2 = \frac{N(AD - BC)^2}{(A + B)(C + D)(A + C)(B + D)} \]

      b. Apply the Hypothesis Testing Model
        (1) Are education (high school or college graduate) and training preference (active or traditional learning) related?
        (2) ρ1 = ρ2
        (3) ρ1 ≠ ρ2
        (4) Ho: ρ1 = ρ2
        (5) H1: ρ1 ≠ ρ2
        (6) α = 0.05
        (7) Chi-Square Test of Association (2 x 2 Contingency Table)
        (8) Compute the test statistic and select the relevant critical value(s).
          (a) Data Table


          Table 13.4 Training Preference Case Data
          Preference         High Sch    College    Row Total
          Active Learning    45 (A)      55 (B)     100
          Traditional        27 (C)      53 (D)      80
          Column Total       72          108        180

          (b) Compute Degrees of Freedom: df = 1 (always)
          (c) Substitute into Formula 13.7 2 x 2 Contingency Table

          \[ \chi^2 = \frac{N(AD - BC)^2}{(A + B)(C + D)(A + C)(B + D)} \]
          \[ \chi^2 = \frac{180 \cdot 810{,}000}{62{,}208{,}000} \]
          \[ \chi^2 = \frac{145{,}800{,}000}{62{,}208{,}000} \]
          \[ \chi^2 = 2.34 \]

          (d) Critical Value: 3.841 at α = .05 & df = 1 (Triola 1998, p. 716)
        (9) Apply Decision Rule: Since χ² = 2.34 is < 3.841, retain Ho: ρ1 = ρ2, as p > .05.
        (10) There is insufficient evidence to conclude that there is a relationship between education attainment and training preference.
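The shortcut in Formula 13.7 and the decision rule in step (9) can be sketched in Python as follows. The function name `chi_square_2x2` and the constant name `CRITICAL_VALUE` are hypothetical; the cell counts A to D are those of Table 13.4, and the critical value is the one cited in the text.

```python
def chi_square_2x2(a, b, c, d):
    """Formula 13.7 for a 2 x 2 table with cells A, B (top row) and C, D (bottom row)."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


CRITICAL_VALUE = 3.841  # alpha = .05, df = 1, as cited in the text

chi2 = chi_square_2x2(45, 55, 27, 53)  # cells A-D from Table 13.4
decision = "reject Ho" if chi2 >= CRITICAL_VALUE else "retain Ho"
print(f"chi-square = {chi2:.2f}: {decision}")
```

Because the formula uses only the four cell counts, no expected frequencies need to be computed for a 2 x 2 table.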

        (11) Effect Size Estimate: Does not apply.
    4. Case 3: Age and Environmental Preference (urban vs. suburban)
      a. Apply the Hypothesis Testing Model
        (1) Are age and environmental preference related?
        (2) ρ1 = ρ2
        (3) ρ1 ≠ ρ2
        (4) Ho: ρ1 = ρ2
        (5) H1: ρ1 ≠ ρ2
        (6) α = 0.05
        (7) Chi-Square Test of Association (2 x 2 Contingency Table)


        (8) Compute the test statistic and select the relevant critical value(s).
          (a) Data Table

          Table 13.5 Age and Environment Case Data
          Preference      ≤ 40    ≥ 41    Row Total
          Urban           15      28       43
          Suburban        35      22       57
          Column Total    50      50      100

          (b) Compute Degrees of Freedom: df = 1 (always)
          (c) Substitute into Formula 13.7 2 x 2 Contingency Table

          \[ \chi^2 = \frac{N(AD - BC)^2}{(A + B)(C + D)(A + C)(B + D)} \]
          \[ \chi^2 = \frac{100(15 \cdot 22 - 28 \cdot 35)^2}{(43)(57)(50)(50)} \]
          \[ \chi^2 = \frac{42{,}250{,}000}{6{,}127{,}500} \]
          \[ \chi^2 = 6.895 \]

          (d) Critical Value: 3.841 at α = .05 & df = 1 (Triola 1998, p. 716)
        (9) Apply Decision Rule: Since χ² = 6.895 is > 3.841, reject Ho: ρ1 = ρ2, p < .05.
        (10) There is sufficient evidence to conclude that there is a relationship between age and environmental preference. People 40 and younger prefer the suburban environment.
        (11) Apply Measure of Association: Since this is a 2 x 2 contingency table, we will use the phi-coefficient.
          (a) Substitute into Formula 13.4

          \[ \phi = \sqrt{\frac{\chi^2}{N}} = \sqrt{\frac{6.895}{100}} = \sqrt{.06895} = .26 \]

          (b) This appears to be a moderate association, as φ = .26.


        (12) Effect Size Estimate: The φ coefficient is also an index of effect size for the contingency table. Using the interpretation guidelines below (Spatz, 2001, p. 284), the effect appears to be “medium.”
          (a) Small Effect, φ = .10
          (b) Medium Effect, φ = .30
          (c) Large Effect, φ = .50
    5. Case 4: Mastery Status and Teaching Style
      a. Apply the Hypothesis Testing Model
        (1) Is there a relationship between type of mastery (e.g., passing a test) and teaching style?
        (2) ρ1 = ρ2 = ρ3
        (3) ρ1 ≠ ρ2 ≠ ρ3
        (4) Ho: ρ1 = ρ2 = ρ3
        (5) H1: ρ1 ≠ ρ2 ≠ ρ3
        (6) α = 0.05
        (7) Chi-Square Test of Association (2 x 3 Contingency Table)
        (8) Compute the test statistic and select the relevant critical value(s).
          (a) Data Table

          Table 13.6 Mastery and Teaching Style Case Data (expected frequencies in parentheses)
          Mastery Status    Style (A)    Style (B)    Style (C)    Row Total
          Master (L)        36 (30)      12 (12)      12 (18)       60
          Non-Master (R)    14 (20)       8 (8)       18 (12)       40
          Column Total      50           20           30           100 (N)

          (b) Compute Degrees of Freedom: df = (r-1)(c-1) = (2-1)(3-1) = 2
          (c) Substitute into Formula 13.3 Chi-Square Test of Association

          Table 13.7 Mastery and Teaching Style Case Data
          Category    O     E     O-E    (O-E)²    (O-E)²/E
          L/A         36    30     6      36        1.2
          L/B         12    12     0       0        0.0
          L/C         12    18    -6      36        2.0
          R/A         14    20    -6      36        1.8
          R/B          8     8     0       0        0.0
          R/C         18    12     6      36        3.0
                                              χ² = 8.0

          (d) Critical Value: 5.99 at α = .05 & df = 2 (Triola 1998, p. 716)
        (9) Apply Decision Rule: Since χ² = 8.0 is > 5.99, reject Ho: ρ1 = ρ2 = ρ3, as p < .05.


        (10) There is sufficient evidence to conclude that type of teaching style is related to student/trainee mastery. Masters seemed to prefer Style A, while Non-masters preferred Style C. (Hint: Compare observed to expected.)
        (11) Apply Measure of Association: Since more than one degree of freedom is involved, we will apply Cramer’s C or V Statistic.
          (a) Substitute into Formula 13.5.

          \[ C = \sqrt{\frac{\chi^2}{n(t - 1)}} = \sqrt{\frac{8.0}{100(2 - 1)}} = \sqrt{\frac{8.0}{100}} = \sqrt{.08} = .283 \]

          The strength of the association is moderate.
          (b) Effect Size Estimate: Spatz (2001, p. 291) offers Cramer’s C as an effect size index. Applying Cohen’s (1988, pp. 224-225) criteria, the effect of teaching style on trainee mastery is medium.
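Steps (8) through (11) of this case can be traced in a short script. This is a minimal Python sketch rather than the author's procedure: the dictionary `CRITICAL_05` is a hypothetical helper holding the α = .05 critical values quoted in this chapter (the text rounds 5.991 to 5.99), and the observed and expected counts are copied from Table 13.7.

```python
import math

# Critical chi-square values at alpha = .05 for df = 1, 2, 3.
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815}

# (observed, expected) pairs from Table 13.7: L/A, L/B, L/C, R/A, R/B, R/C.
cells = [(36, 30), (12, 12), (12, 18), (14, 20), (8, 8), (18, 12)]
rows, cols, n = 2, 3, 100

chi2 = sum((o - e) ** 2 / e for o, e in cells)
df = (rows - 1) * (cols - 1)
reject = chi2 >= CRITICAL_05[df]

# Cramer's statistic (Formula 13.5); t is the smaller of rows and columns.
t = min(rows, cols)
cramers_c = math.sqrt(chi2 / (n * (t - 1)))

print(f"chi-square = {chi2:.1f}, df = {df}, reject Ho: {reject}")
print(f"Cramer's C = {cramers_c:.3f}")
```

Only the cells where observed and expected counts differ (Styles A and C) contribute to χ², which is the pattern the hint in step (10) points to.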

      b. See any introductory statistics book for more computational exercises.
  E. Chi-Square Test of Homogeneity
    1. This version of the chi-square test is applied when data are drawn from two or more independent (i.e., not the same) samples (Daniel, 1990, pp. 192-194; Siegel, 1956, pp. 104-111, 175-179). The usual χ² assumptions hold.
    2. The computation process is the same as for the Test for Independence. Two sample comparisons can be displayed and tested in a 2 X 2 Contingency Table as well.
    3. Daniel (1990, pp. 195-196) asserts that three or more populations can be tested in an r X 2 Contingency Table, where r = the number of populations from which the r samples have been drawn and c = the two mutually exclusive categories of interest.
    4. Further, three or more r populations can be tested across three or more c categories of classification in an r X c table (Daniel, 1990, pp. 196-198), where r and c have the same definitions as above.
    5. The df are computed by (r-1)(c-1) [r = # rows & c = # columns], and each expected frequency is e = (row total)(column total)/N.

II. Analyzing Ordinal Data
  A. Introduction
    1. Ordinal data are ordered categories, but distances between categories can’t be determined. The ordering has significance. We know an “A” is higher than a “C”, but we don’t know by exactly how many points. Strongly disagree is an opinion which is very different from strongly agree, but exactly how different is unknown.


      Income    Grades    Likert Scale
      High      A         Strongly Disagree
      Middle    B         Disagree
      Low       C         No Opinion
                D         Agree
                F         Strongly Agree

    2. In many social science disciplines, it is common practice to “convert” ordinal Likert or Likert-style scale data into interval data by assigning numbers, such as “1” for “Strongly Disagree” or “5” for “Strongly Agree.” Among researchers, statisticians, and evaluators, this practice is controversial. However, it is widely used.
    3. For nonparametric statistics, the null hypothesis is that the population distributions being tested are equal in shape and variability, but the populations’ distributions are asymmetrical (i.e., not SNC).
    4. Three nonparametric tests will be examined which test instances where the dependent variable is measured in ranks. These are:
      a. For two dependent samples, the Wilcoxon Matched Pairs Test will be presented.
      b. For two independent samples, the Mann-Whitney U Test is profiled.
      c. A nonparametric correlation test (Spearman Rank Order Correlation Test) has been previously presented.
    5. Many nonparametric procedures do not have corresponding effect size indices.
  B. Two Dependent Samples: Wilcoxon Matched Pairs Signed Ranks “T” Test
    1. The Wilcoxon Matched Pairs Test is the nonparametric equivalent to the dependent samples t-test and is applied to ordinal data (Spatz, 2001, pp. 309-313). Recall that there are three dependent designs: natural pairs (e.g., twins), matched pairs, and repeated measures (before and after). Scores from the two groups must be logically paired.
    2. The test statistic is “T.” The critical value is drawn from the critical values for the Wilcoxon matched pairs signed rank T table. It is the absolute values of the differences that are ranked, not the scores themselves. The rank of “1” always goes to the smallest difference.
    3. Computational Procedure (See Table 13.8 below.)
      a. For each pair of scores, find D, the difference between each pair. Subtraction order is not important, as long as it is the same for every pair.
      b. Rank each difference based on the absolute value of each. The rank of “1” goes to the smallest difference; the rank of “2” goes to the next highest difference, and so on.


        (1) When a “D” value equals zero, it is not assigned a rank and N is reduced by one. If two “D” values equal zero, then one is given a +1.5 ranking and the second is given a -1.5 ranking. If three “D” values equal zero, then one is dropped (reducing N by one) and the other two are assigned -1.5 and +1.5 rankings.
        (2) When “D” values are tied, the mean of the ranks which would have been received is given to each of the tied values. See how pairs 5 and 6 are treated in Table 13.8.
      c. To each rank attach the sign of its difference, negative (-) or positive (+); the positive sign is usually left off, as it is understood.
      d. Sum the positive and negative ranks separately. T is the absolute value of the smaller of the two sums.
      e. If the test statistic “T” is less than or equal to (≤) the critical value “T”, the null hypothesis is rejected.
      f. Computation and Null Hypothesis Decision-Making

        Table 13.8 Wilcoxon Matched Pairs Test
        Pair    Variable A    Variable B    D     Rank    Signed Rank
        1       18            26            -8    7       -7
        2       16            19            -3    4       -4
        3       25            20             5    5        5
        4       25            24             1    1        1
        5       24            22             2    2.5      2.5
        6       23            21             2    2.5      2.5
        7       25            18             7    6        6

        Σ (+ ranks) = 17    Σ (- ranks) = -11    T = 11 (smallest absolute value)

        (1) Since the test statistic T = 11 is > the critical value T = 2 for a 2-tail test where α = .05 (Triola 1998, p. 726), we retain the null hypothesis that there is no difference between the rankings.
        (2) Remember that the dependent samples t-test could have been applied to these Table 13.8 data, but the decision was made to test the rankings of the scores and not the scores themselves. In cases where there is reason to believe that the populations are not normally distributed and do not have equal score variances, one would use this test.
        (3) The rationale for the Wilcoxon Matched Pairs Test is that if the populations are truly equal (i.e., there are no real differences), the absolute values of the positive and negative sums will be approximately equal, and any differences are due to sampling fluctuations.
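The ranking procedure in steps a through d, including mean ranks for ties, can be sketched in Python as follows. The function name `wilcoxon_t` is hypothetical, and the handling of zero differences is a simplifying assumption: zeros are dropped, which matches the single-zero rule above but does not implement the +1.5 / -1.5 rule for two or more zeros.

```python
def wilcoxon_t(pairs):
    """Return (T, sum of positive ranks, sum of negative ranks, N) for paired scores.

    Rank sums are returned as magnitudes. Zero differences are dropped.
    """
    diffs = [a - b for a, b in pairs if a != b]
    ordered = sorted(diffs, key=abs)

    # Assign ranks by absolute difference, giving tied values their mean rank.
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        mean_rank = (i + 1 + j + 1) / 2
        for k in range(i, j + 1):
            ranks[k] = mean_rank
        i = j + 1

    positive = sum(r for d, r in zip(ordered, ranks) if d > 0)
    negative = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return min(positive, negative), positive, negative, len(ordered)


# (Variable A, Variable B) pairs from Table 13.8.
table_13_8 = [(18, 26), (16, 19), (25, 20), (25, 24), (24, 22), (23, 21), (25, 18)]
t_stat, pos, neg, n = wilcoxon_t(table_13_8)
print(f"T = {t_stat}, N = {n}")
```

A useful check on any hand computation is that the positive and negative rank sums together equal N(N + 1)/2, the sum of the ranks 1 through N.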


      g. When the sample size is greater than 50, the T statistic is approximated using the z-score and the SNC.
        (1) Formula 13.8a The Wilcoxon Matched Pairs Test N > 50 (Spatz, 2001, pp. 312-313)

          \[ z = \frac{(T + c) - \mu_T}{\sigma_T} \]

        (2) Formula 13.8b The Wilcoxon Matched Pairs Test N > 50 (Spatz, 2001, pp. 312-313)

          \[ \mu_T = \frac{N(N + 1)}{4} \]

        (3) Formula 13.8c The Wilcoxon Matched Pairs Test N > 50 (Spatz, 2001, pp. 312-313)

          \[ \sigma_T = \sqrt{\frac{N(N + 1)(2N + 1)}{24}} \]

          where: T = smaller sum of the signed ranks; c = 0.5; N = number of pairs
        (4) Decision Rules
          (a) For a 2-tail test, reject the null hypothesis (Ho) if the computed test statistic “z” falls outside the interval -1.96 to +1.96 at α = .05.
          (b) For a 2-tail test, reject the null hypothesis (Ho) if the computed test statistic “z” falls outside the interval -2.58 to +2.58 at α = .01.
    4. Case: Eleven training assistants completed a teaching methods workshop. Each was pre-tested before and post-tested after the workshop. A higher score on the tests indicates better teaching skills. When examining dependent samples t-test assumptions, it was noted that the scores were not normally distributed. So the score differences were converted to ranks as found in Table 13.9.

      a. Applying the Adapted Hypothesis Testing Model
        (1) Did the workshop improve teaching skills?
        (2) Ho: Population distributions are equal.
        (3) H1: Population distributions are not equal.
        (4) α = 0.05 for a two-tail test
        (5) Wilcoxon Matched Pairs Signed Ranks Test


        (6) Compute the Test Statistic
          (a) Construct Data Table:

          Table 13.9 Teaching Workshop Case Data
          Subject    Pretest    Posttest    D     Rank       Signed Rank
          A          10         14          -4    5          -5
          B          11         19          -8    9.5        -9.5
          C           9         13          -4    5          -5
          D          11         12          -1    1          -1
          E          14         22          -8    9.5        -9.5
          F           6         11          -5    7          -7
          G          13         17          -4    5          -5
          H          12         18          -6    8          -8
          I          19         19           0    Deleted    Deleted
          J          18         16           2    2           2
          K          17         14           3    3           3

          (b) Compute Number of Pairs: N = (# of pairs) - (# of deleted) = 11 - 1 = 10
          (c) Compute the test statistic “T”

          Σ (+ ranks) = 5    Σ (- ranks) = -50    T = 5 (smallest absolute value)

          (d) Critical “T” Value: 8 at α = .05 in a 2-tail test with N = 10 (Triola 1998, p. 726)
        (7) Apply Decision Rule: Since the test statistic of T = 5 is less than the critical T value of T = 8, the null hypothesis is rejected.
        (8) It appears that the workshop does improve teaching skills, as posttest scores were generally higher than pretest scores.
        (9) Effect Size Estimation: Does not apply.
  C. Independent Samples: The Mann-Whitney U Test

    1. Introduction
      a. The Mann-Whitney test produces the U statistic, which is based upon the U distribution (Spatz, 2001, pp. 303-308). Like all distributions for small samples (N1 ≤ 20 and N2 ≤ 20), the distribution shape depends on the sample size.
      b. As an error check, the two U values in the Mann-Whitney U Test must sum to the product (N1)(N2).
      c. In the Mann-Whitney U Test it makes no difference whether the highest or lowest score (should you have to rank scores) is given the rank of 1.
      d. Most researchers using the Mann-Whitney test assume that the two distributions have the same form (shape) but most likely differ in central tendency. A significant “U” is typically attributed to a difference in central tendency between the two groups, although “U” actually compares distributions.


e. Ties between ranks are handled in the same manner as in the Wilcoxon Matched Pairs Signed Ranks test. If the ties are in the same group, then the U value is not affected. If there are several ties across both groups, the correction suggested by Kirk (1999, p. 594) should be applied.

f. Hint: The smaller the U value, the more different the two groups are. 2. Computational Process for Small Sample a. Formula 13.9a The Mann-Whitney U Test Small Sample for R1 (Spatz,

2001, p. 304)

1 1

1 2( 1)( )( )

2N NU N N R1

+= + −∑

Where
N1 = Number comprising group one
N2 = Number comprising group two
ΣR1 = Sum of ranks for group one

b. Formula 13.9b The Mann-Whitney U Test (Small Sample) for R2 (Spatz, 2001, p. 304)

$$U = (N_1)(N_2) + \frac{N_2(N_2 + 1)}{2} - \sum R_2$$

Where
N1 = Number comprising group one
N2 = Number comprising group two
ΣR2 = Sum of ranks for group two

3. Case: Eleven freshman students were collectively ranked by a team of three professors as to their academic competence. Unknown to the professors, some freshmen had completed an academic competence course where they were exposed to study habits, time management, academic culture, and managing productive academic relationships.

(a) Applying the Adapted Hypothesis Testing Model
(1) Does the freshman workshop improve academic competence?
(2) Ho: Population distributions are equal.
(3) H1: Population distributions are not equal.
(4) α = 0.05 for a two-tail test
(5) Mann-Whitney U Test
(6) Compute the Test Statistic
(a) Construct Data Table:

Data Table 13.10 Academic Competence Case Data

Student   Ranking   Course
A         4         Yes
B         9         No
C         7         No
D         6         Yes
E         5         No
F         3         Yes
G         2         No
H         10        No
I         11        No
J         1         Yes
K         8         Yes
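The group sizes and rank sums used in the following steps can be tallied directly from Data Table 13.10. The Python sketch below is illustrative only and is not part of the original text; the variable names and the helper `group_summary` are hypothetical.

```python
# Data Table 13.10 as (student, ranking, completed the course?)
data = [
    ("A", 4, True), ("B", 9, False), ("C", 7, False), ("D", 6, True),
    ("E", 5, False), ("F", 3, True), ("G", 2, False), ("H", 10, False),
    ("I", 11, False), ("J", 1, True), ("K", 8, True),
]


def group_summary(rows, took_course):
    """Return (group size, sum of ranks) for the chosen group."""
    ranks = [rank for _, rank, course in rows if course == took_course]
    return len(ranks), sum(ranks)


n1, sum_r1 = group_summary(data, True)    # "Yes" group
n2, sum_r2 = group_summary(data, False)   # "No" group
print(n1, sum_r1)  # 5 22
print(n2, sum_r2)  # 6 44

# Sanity check: ranks 1..N must total N(N + 1)/2
total_n = n1 + n2
assert sum_r1 + sum_r2 == total_n * (total_n + 1) // 2
```

The final assertion is an extra arithmetic check on the ranking itself: eleven ranks must total 66, which matches 22 + 44.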

(b) Determine N’s: N1 = 5 (yes group), N2 = 6 (no group)
(c) Compute the test statistic “U”
ΣR1 (yes) = 22
ΣR2 (no) = 44
U = 7 (the smaller of the two computed U values)

[1] Apply Formula 13.9a for the yes group (Spatz, 2001, p. 304)

$$
\begin{aligned}
U &= (N_1)(N_2) + \frac{N_1(N_1 + 1)}{2} - \sum R_1 \\
U &= (5)(6) + \frac{5(5 + 1)}{2} - 22 \\
U &= 30 + \frac{30}{2} - 22 \\
U &= 45 - 22 \\
U &= 23
\end{aligned}
$$

[2] Apply Formula 13.9b for the no group (Spatz, 2001, p. 304)

$$
\begin{aligned}
U &= (N_1)(N_2) + \frac{N_2(N_2 + 1)}{2} - \sum R_2 \\
U &= (5)(6) + \frac{6(6 + 1)}{2} - 44 \\
U &= 30 + \frac{42}{2} - 44 \\
U &= 30 + 21 - 44 \\
U &= 7
\end{aligned}
$$
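The two applications of Formula 13.9 and the (N1)(N2) error check can be sketched in Python. This is a minimal illustration rather than part of the original text; the function name `mann_whitney_u` is hypothetical, and the rank sums are those of the academic competence case. The sketch also re-ranks the case from the opposite end (each rank r becomes 12 − r) to illustrate the earlier remark that the direction of ranking makes no difference: the two U values simply trade places.

```python
def mann_whitney_u(n_own, n_other, rank_sum_own):
    """U = (N_own)(N_other) + N_own(N_own + 1)/2 - (sum of the group's ranks)."""
    return n_own * n_other + n_own * (n_own + 1) / 2 - rank_sum_own


n1, n2 = 5, 6            # yes group, no group
sum_r1, sum_r2 = 22, 44  # rank sums from the case

u1 = mann_whitney_u(n1, n2, sum_r1)  # Formula 13.9a -> 23.0
u2 = mann_whitney_u(n2, n1, sum_r2)  # Formula 13.9b -> 7.0
u = min(u1, u2)                      # test statistic -> 7.0

# Error check: the two U values must sum to (N1)(N2)
assert u1 + u2 == n1 * n2

# Reverse the ranking: rank r becomes (N + 1) - r, so each rank sum
# becomes n(N + 1) minus the original sum.
total_n = n1 + n2
sum_r1_rev = n1 * (total_n + 1) - sum_r1  # 38
sum_r2_rev = n2 * (total_n + 1) - sum_r2  # 28
u1_rev = mann_whitney_u(n1, n2, sum_r1_rev)  # 7.0
u2_rev = mann_whitney_u(n2, n1, sum_r2_rev)  # 23.0
assert min(u1_rev, u2_rev) == u  # the smaller U is unchanged

print(u1, u2, u)  # 23.0 7.0 7.0
```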

(d) Critical “U” Value: = 3 at α = .05 in a 2-tail test at the intersection of column N1 = 5 and row N2 = 6 in the Mann-Whitney Critical Value Table (Spatz, 2001, p. 377)

(7) Apply Decision Rule: Since the test statistic of U = 7 is greater than the critical U value of U = 3, the null hypothesis is retained.
(8) It appears that the workshop does not improve academic competency, as the distributions of the two groups are statistically equal.
(9) Effect Size Estimation: Does not apply.

4. When the sample size is greater than 20 in either N1 or N2, the U test statistic is approximated using the z-score and the SNC.
a. Formula 13.10 The Mann-Whitney U Test for N1 or N2 > 20 (Spatz, 2001, p. 306). In Formula 13.10, “c” is a 0.5 correction factor.

$$z = \frac{(U + c) - \mu_U}{\sigma_U}$$

$$\mu_U = \frac{(N_1)(N_2)}{2}$$

$$\sigma_U = \sqrt{\frac{(N_1)(N_2)(N_1 + N_2 + 1)}{12}}$$

b. Decision Rules (Spatz, 2001, p. 306)
(1) For a 2-tail test, reject the null hypothesis (Ho) if the computed test statistic |z| ≥ 1.96 at α = .05.
(2) For a 2-tail test, reject the null hypothesis (Ho) if the computed test statistic |z| ≥ 2.58 at α = .01.
(3) For a 1-tail test, reject the null hypothesis (Ho) if the computed test statistic z ≥ 1.65 at α = .05.
(4) For a 1-tail test, reject the null hypothesis (Ho) if the computed test statistic z ≥ 2.33 at α = .01.
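The large-sample approximation and its decision rules can be sketched as follows. This is a hedged illustration, not the author's procedure: the function names are hypothetical, the sample values (N1 = 24, N2 = 22, U = 150) are invented purely for demonstration, and c is taken as the 0.5 continuity correction. The one-tail branch compares z itself with the critical value, exactly as the rule is written above; which direction counts as the predicted one is an assumption the analyst must settle.

```python
import math


def mann_whitney_z(u, n1, n2, c=0.5):
    """Normal approximation: z = (U + c - mu_U) / sigma_U."""
    mu_u = n1 * n2 / 2
    sigma_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u + c - mu_u) / sigma_u


# Critical z values from the decision rules, keyed by (alpha, two_tailed)
CRITICAL_Z = {
    (0.05, True): 1.96,
    (0.01, True): 2.58,
    (0.05, False): 1.65,
    (0.01, False): 2.33,
}


def reject_null(z, alpha=0.05, two_tailed=True):
    """Return True when the decision rule says to reject Ho."""
    critical = CRITICAL_Z[(alpha, two_tailed)]
    if two_tailed:
        return abs(z) >= critical
    return z >= critical


# Invented example values, for illustration only
z = mann_whitney_z(u=150, n1=24, n2=22)
print(round(z, 2))           # about -2.5
print(reject_null(z, 0.05))  # True:  |z| >= 1.96
print(reject_null(z, 0.01))  # False: |z| <  2.58
```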

Review Questions

1. Which statistical test would be applied to assess differences in frequencies based on gender?
a. Chi-square Goodness of Fit
b. Mann-Whitney U Test Big Samples
c. Mann-Whitney U Test Small Samples
d. Wilcoxon Matched Pairs Test

2. Which one of the following statements concerning Chi-square is not accurate?
a. Chi-square can be used to test whether observed nominal data conforms to some theoretical or expected distribution.
b. When the sample size is N > 30, the chi-square is approximately normal.
c. While a statistically significant X2 establishes a relationship between two variables, it reveals little else.
d. A statistically significant chi-square is a measure of the strength of an association.

3. Which one of the following statements concerning Chi-square is not accurate?
a. The null (or statistical) hypothesis (Ho) states that there is either “no differences between observed or expected frequencies" (Goodness of Fit) or "the variables or samples are not related, i.e., are independent" (Test for Independence) or are homogeneous (Test of Homogeneity).
b. As normally applied, the X2 is not directed at any specific alternative hypothesis.
c. Degrees of freedom, alpha, and power must be specified or determined. The applicable degrees of freedom (df) depends on the test applied. Alpha is specified, a priori, usually at the .05 or .01 level.
d. Each statement is correct.

4. Chi-square assumptions include all but….
a. The distribution is symmetric.
b. Chi-square values can be zero or positive but never negative.
c. Chi-square distribution is different for each degree of freedom. As the number of degrees of freedom increases, the distribution approaches the SNC.
d. Degree of freedom (df) varies depending on the chi-square test being used.

5. Which index of effect size is not correct?
a. ω = .10 (small effect)
b. ω = .30 (medium effect)
c. ω = .50 (large effect)
d. ω = .80 (very large effect)

6. Which one of the following statements about the phi (φ) is not accurate?
a. The phi coefficient is a symmetric index with a range between 0 and 1, where zero equals statistical independence.
b. Maximum value is attained only under strict perfect association.
c. Before the phi coefficient is applied, there should be a statistically significant chi-square value.
d. Apply the phi coefficient only to any chi-square problem.

7. Which one of the following tests is appropriate for the “before and after” design?
a. Chi-square Goodness of Fit
b. Mann-Whitney U Test Big Samples
c. Mann-Whitney U Test Small Samples
d. Wilcoxon Matched Pairs Test

8. When the sample size is > ______, the T statistic is approximated by the z-score and the normal curve.
a. 40
b. 50
c. 60
d. 70

9. When the sample size is > _______ for either N1 or N2, the U statistic is approximated by the z-score and the normal curve.
a. 20
b. 30
c. 40
d. 50

10. The statistical test where the null hypothesis is rejected when the test statistic is less than the critical value is _______.
a. Chi-square Goodness of Fit
b. Mann-Whitney U Test Big Samples
c. Mann-Whitney U Test Small Samples
d. Wilcoxon Matched Pairs Test

Answers: 1. a, 2. d, 3. d, 4. a, 5. d, 6. d, 7. d, 8. b, 9. a, 10. d.

References

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates, Publishers.
Daniel, W. W. (1990). Applied nonparametric statistics (2nd ed.). Boston: PWS-Kent.
Morehouse, C. A., & Stull, G. A. (1975). Statistical principles and procedures with applications for physical education. Philadelphia: Lea & Febiger.
Reynolds, H. T. (1984). Analysis of nominal data (2nd ed.). Beverly Hills, CA: Sage Publications.
Siegel, S. (1956). Nonparametric statistics for the behavioral sciences. New York: McGraw-Hill.
Spatz, C. (2001). Basic statistics: Tales of distributions (7th ed.). Belmont, CA: Wadsworth.
Triola, M. F. (1998). Elementary statistics (7th ed.). Reading, MA: Addison-Wesley.
Udinsky, B. F., Osterlind, S. J., & Lynch, S. W. (1981). Evaluation resource handbook. San Diego, CA: Edits Publishers.
Welkowitz, J., Ewen, R. B., & Cohen, J. (1991). Introductory statistics for the behavioral sciences (4th ed.). New York: Harcourt Brace Jovanovich Publishers.