Nonparametric analysis of ordinal categorical response data with factorial structure

D. J. Best†

Commonwealth Scientific and Industrial Research Organisation, North Ryde, Australia

and J. C. W. Rayner

University of Wollongong, Australia

[Received May 1996. Final revision July 1997]

Summary. A simple nonparametric method of analysis for contingency tables with an ordinal response and factorial treatment structure is described. The method involves a partition of Pearson's $X_P^2$-statistic by using orthogonal polynomials so that location and dispersion effects are estimated for each level of the explanatory variable. Analyses of variance are then performed on these effects to determine the important factors. The methods are applied to two examples, where consumers rate their liking for a product on an ordered categorical scale, one of which highlights the need to look at dispersion as well as location effects.

Keywords: Contingency tables; Dispersion (quadratic) effect; Goodness of fit; Location (linear) effect; Orthogonal polynomials; Partition of Pearson's statistic

1. Introduction

In many sensory evaluation experiments, clinical trials and market research surveys, data are recorded on an ordinal scale. An example is the data set quoted in Agresti (1990) and shown in Table 1. The data, gathered from defence force staff, are concerned with preferences for black olives and are presented as counts on a categorized liking scale. There is thus an ordinal response variable, liking, which is categorized into six ordered classes.

We have a factorial structure with two factors, urbanization and region; urbanization has two levels and region has three levels. Suppose that we ignore the factorial structure for the moment, so that we have a 6 × 6 table where the rows are levels of an explanatory variable and the columns are ordinal responses. The usual homogeneity $\chi^2$-statistic, $X_P^2$, for this table is 50.05 on 25 degrees of freedom ($\chi^2$ p-value, 0.002), and so the response distributions for the six levels of the explanatory variable are not homogeneous. However, $X_P^2$ does not take into account that the responses are ordered. If we assign the integer scores 1, 2, 3, 4, 5 and 6 to the columns, then a nonparametric test of the homogeneity of the row mean scores is given by Yates's Q-statistic (Yates (1948), p. 179). For these data Q = 33.61. As this statistic has an approximate $\chi^2$-distribution on 5 degrees of freedom, there is clearly a difference between row mean scores ($\chi^2$ p-value < 0.001). A formula and details of calculation for Q are given in Section 2.

†Address for correspondence: Mathematical and Information Sciences, Commonwealth Scientific and Industrial Research Organisation, PO Box 52, North Ryde, NSW 2113, Australia. E-mail: [email protected]

© 1998 Royal Statistical Society 0035-9254/98/47439

Appl. Statist. (1998) 47, Part 3, pp. 439-446

Yates's Q-statistic is largely unknown but is a special case of the one-way analysis-of-variance type of statistic given in Agresti (1990), equation 8.19. If mid-rank scores had been used instead of the integers 1-6, then the well-known Kruskal-Wallis test would have been obtained.

The 6 × 6 contingency table data of Table 1 can also be analysed by using several parametric procedures. Perhaps the most popular of these is fitting log-linear models; see, for example, Agresti (1990), chapters 5-9. Of course, in using these models an assumption is made that the model is correct. Also, as the estimated parameters may have different standard errors and different covariances, a simple interpretation of the analysis is not always possible. Another problem with using log-linear models is that, for sparse tables, finding Monte Carlo p-values may be quite time consuming because of numerical convergence problems.

Given this background, we develop an analysis that uses the factorial structure in Table 1. Chuang-Stein and Francom (1988) showed how to do this by using log-linear models. However, we avoid problems such as those noted in the previous paragraph by developing alternative nonparametric techniques using both the location and the dispersion components of $X_P^2$. The nonparametric procedure that we propose applies not only to the factorial treatment structures discussed here but also to any orthogonal contrasts among the treatments, such as linear and quadratic contrasts when the treatments are levels of a continuous variable.

The importance of the dispersion components is illustrated by using the data shown in

Table 2. These data are similar to those from a Japan-Australia cross-cultural study carried out by the Commonwealth Scientific and Industrial Research Organisation's Division of Food Science and Technology. Japanese and Australian consumers rated a Japanese chocolate on a seven-point categorical scale with anchors 'dislike extremely' and 'like extremely'. Judging by eye, the opinions of the Australian consumers are more spread across the scale than the opinions of the Japanese consumers. The dispersion component of $X_P^2$ quantifies and hence enables testing of these types of difference.

Table 1. Observed counts and the linear and quadratic effects in the various regions for the olive data

Urbanization  Region      i    Response†                        Effect
                               1     2     3     4     5     6   Linear (v_1i)   Quadratic (v_2i)
Urban         Mid-west    1    20    15    12    17    16    28       1.1              1.6
              North-east  2    18    17    18    18     6    25      -0.2              0.6
              South-west  3    12     9    23    21    19    30       2.9             -0.7
Rural         Mid-west    4    30    22    21    17     8    12      -3.9              0.3
              North-east  5    23    18    20    18    10    15      -2.1             -0.2
              South-west  6    11     9    26    19    17    24       2.1             -1.5
Sum                           114    90   120   110    76   134

†Response is measured on a categorical scale, shown here by its integer score: 1, dislike extremely; 2, dislike moderately; 3, neither like nor dislike; 4, like slightly; 5, like moderately; 6, like extremely.

Table 2. Observed counts in different countries and cities for the Japanese chocolate data

Country     City        Counts on the following scale of liking sweetness:
                         1    2    3    4    5    6    7
Australia   Sydney       2    1    6    1    8    9    6
            Melbourne    1    6    2    2   10    5    5
Japan       Tokyo        0    1    3    4   15    7    1
            Osaka        1    1    2    3   16    6    2

2. Yates's statistic

2.1. Multinomial components

Rayner and Best (1989), chapter 5, discussed the components $V_1, \ldots, V_{c-1}$ of the usual Pearson $X_P^2$ goodness-of-fit statistic for a multinomial distribution in which n data values are categorized into c classes with known class probabilities $p_1, p_2, \ldots, p_c$ specified by the null hypothesis. These components are defined in the next paragraph. If the categories are ordered, then the first two components are based on orthogonal polynomials and identify linear and quadratic trends in the multinomial counts that correspond to location and dispersion effects. The linear and the quadratic components are asymptotically independent, and both have asymptotically the standard normal distribution.

We adopt the usual convention of denoting random variables by upper case letters and particular values of random variables by the corresponding lower case letters. Suppose that the numbers of observations in the c classes are $N_1, N_2, \ldots, N_c$, where $n = N_1 + N_2 + \ldots + N_c$, and that scores $x_1, x_2, \ldots, x_c$ are assigned to these classes. The linear and quadratic components are defined in terms of orthogonal polynomials $g_1(x_j)$ and $g_2(x_j)$, $j = 1, \ldots, c$:

$$g_1(x_j) = (x_j - \mu)/\sqrt{\mu_2}$$

and

$$g_2(x_j) = a\left\{(x_j - \mu)^2 - \frac{\mu_3 (x_j - \mu)}{\mu_2} - \mu_2\right\},$$

in which

$$\mu = \sum_{j=1}^{c} x_j p_j,$$

$$\mu_r = \sum_{j=1}^{c} (x_j - \mu)^r p_j$$

and

$$a = (\mu_4 - \mu_3^2/\mu_2 - \mu_2^2)^{-0.5}.$$

Further orthogonal polynomials $g_3(x_j), \ldots, g_{c-1}(x_j)$ can be derived by using the recurrence relations in Emerson (1968) or by using the determinant formula of Lancaster (1969), p. 49. The components of $X_P^2$ can then be defined as

$$V_u = \sum_{j=1}^{c} N_j\, g_u(x_j)/\sqrt{n}, \qquad u = 1, \ldots, c-1.$$

Rayner and Best (1989), chapter 5, showed that

$$X_P^2 = \sum_{j=1}^{c} (N_j - n p_j)^2/(n p_j) = V_1^2 + V_2^2 + \ldots + V_{c-1}^2.$$

These components are the basis for score tests which are weakly optimal directional tests (asymptotically equivalent to likelihood ratio tests) and hence supplement the omnibus nature of Pearson's test. In general an omnibus test has some power for detecting many different alternatives to the null hypothesis but, unlike a directional test, does not necessarily have good power for detecting specific alternatives. So the first two components, which suggest location and dispersion effects, may be significant when the omnibus Pearson test is not.
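The definitions in this subsection can be checked numerically. The following is a minimal sketch, not the authors' implementation: it builds the orthonormal polynomials by weighted Gram-Schmidt on the powers of the centred scores (a substitute for the Emerson recurrence or the Lancaster determinant formula, which are not reproduced here) and confirms that the squared components add up to Pearson's statistic. The function names `orthonormal_polynomials` and `components`, and the five-class counts, are hypothetical and chosen only for illustration.

```python
import numpy as np


def orthonormal_polynomials(x, p):
    """Rows g_0, ..., g_{c-1} evaluated at the scores x, orthonormal under weights p."""
    x = np.asarray(x, dtype=float)
    p = np.asarray(p, dtype=float)
    c = len(x)
    d = x - np.sum(x * p)                    # centred scores x_j - mu
    G = np.empty((c, c))
    for u in range(c):
        g = d ** u                           # start from the monomial of degree u
        for v in range(u):                   # Gram-Schmidt: remove lower-degree parts
            g = g - np.sum(g * G[v] * p) * G[v]
        G[u] = g / np.sqrt(np.sum(g * g * p))
    return G


def components(counts, p, x=None):
    """Components V_1, ..., V_{c-1} of Pearson's statistic for one multinomial sample."""
    counts = np.asarray(counts, dtype=float)
    c = len(counts)
    x = np.arange(1, c + 1) if x is None else x   # integer scores by default
    G = orthonormal_polynomials(x, p)
    return G[1:] @ counts / np.sqrt(counts.sum())


# Hypothetical counts in five ordered classes; null probabilities all equal to 0.2.
counts = np.array([10, 20, 30, 25, 15])
p = np.full(5, 0.2)
V = components(counts, p)
n = counts.sum()
pearson = np.sum((counts - n * p) ** 2 / (n * p))
print("V1, V2:", V[:2])
print("Pearson X2:", pearson, "sum of squared components:", np.sum(V ** 2))
```

The first two rows after the constant agree with the closed forms for $g_1$ and $g_2$ given above, so in practice only those two formulas are needed when interest is confined to location and dispersion.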

2.2. Two-way tables

Now consider a two-way table with counts $N_{ij}$, $i = 1, \ldots, r$, $j = 1, \ldots, c$, known row totals $n_{i.} = N_{i1} + \ldots + N_{ic}$, $i = 1, \ldots, r$, column totals $N_{.j} = N_{1j} + \ldots + N_{rj}$, $j = 1, \ldots, c$, and total count $N_{.1} + \ldots + N_{.c} = n_{1.} + \ldots + n_{r.} = n_{..}$. Suppose that the columns are the ordered categories and that it is of interest to compare rows. Lancaster (1969), p. 214, called such a table a 'comparative trial contingency table' but the name 'product multinomial model' is now more commonly used. The usual Pearson statistic is

$$X_P^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} (N_{ij} - E_{ij})^2/E_{ij},$$

where $E_{ij} = n_{i.} N_{.j}/n_{..}$, $i = 1, \ldots, r$, $j = 1, \ldots, c$, is the cell expectation under the null hypothesis of row homogeneity: that each row has a multinomial distribution with the same cell probabilities. Again, Pearson's test is an omnibus test, whereas the tests based on the components given subsequently are directional.

We obtain a decomposition of $X_P^2$ for the two-way table by calculating values $v_{1i}$ of $V_1$, $v_{2i}$ of $V_2$ etc. for the ith row of the table, $i = 1, \ldots, r$. To do this, for each row take $p_j$ and $N_j$ to be $N_{.j}/n_{..}$ and $N_{ij}$ for $j = 1, \ldots, c$ respectively, and take n to be $n_{i.}$. A measure of the consistency of the linear or location effects across the rows is

$$Q = V_{11}^2 + \ldots + V_{1r}^2,$$

which is just the statistic of Yates (1948) when $x_j = j$ for all c classes. Under the null hypothesis, the $V_{1i}$ have asymptotically the standard normal distribution with a linear constraint, $V_{11}\sqrt{n_{1.}} + \ldots + V_{1r}\sqrt{n_{r.}} = 0$, so that Q is approximately distributed as $\chi^2_{r-1}$. Large values of a particular $V_{1i}$ suggest a linear trend or mean shift for the row i multinomial distribution, compared with the overall multinomial distribution with parameters $n_{..}$ and probabilities $\{N_{.j}/n_{..}\}$. Large values of Q indicate an overall linear trend or mean shift compared with this same overall multinomial distribution.

Again regarding the data in Table 1 as a 6 × 6 contingency table, we find $\{N_{.j}/n_{..}\} = \{0.174, 0.138, 0.184, 0.184, 0.116, 0.205\}$. Thus with $x_j = j$ we have $\mu = 3.544$, $\mu_2 = 3.031$, $\mu_3 = 0.002$ and $\mu_4 = 16.084$, giving $a = 0.381$. Hence

$$g_1(j) = (j - 3.544)/1.741$$

and

$$g_2(j) = 0.381\left\{(j - 3.544)^2 - \frac{0.002\,(j - 3.544)}{3.031} - 3.031\right\}.$$

It follows that

$$\{v_{1i}\} = \{1.096, -0.158, 2.924, -3.920, -2.062, 2.052\}$$

and

$$\{v_{2i}\} = \{1.573, 0.604, -0.694, 0.293, -0.238, -1.523\}.$$

Hence $Q = 1.096^2 + (-0.158)^2 + \ldots + 2.052^2 = 33.61$ as reported above. A further analysis of the data in Table 1 is given in Section 4.

Nair (1986), section 5, defined a statistic similar to Q but used mid-rank scores. This statistic is just the Kruskal-Wallis statistic adjusted for ties. Best (1990) gave a taste test application of Yates's Q-statistic and extended the analysis to consider dispersion or quadratic effects in addition to location or linear effects. This dispersion statistic is defined by

$$D = V_{21}^2 + \ldots + V_{2r}^2.$$

Nair (1986), section 5, gave a rank test for dispersion effects. Best (1994) compared Nair's statistics with other nonparametric tests when r = 2. In the following we use $x_j = j$ as Yates (1948) did, but sometimes other definitions of $x_j$, such as mid-ranks, may be appropriate, depending on what is known about the data.
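As an illustration of the two-way calculation, the sketch below applies the formulas for $g_1$ and $g_2$ to the counts printed in Table 1, using pooled column proportions and integer scores. The function name `linear_quadratic_effects` and its optional `scores` argument are assumptions made for this example only. Under those assumptions the row effects and the totals Q and D should agree with the values reported in the text to the printed precision; intermediate moments computed directly from the counts need not agree to three decimals with the rounded values quoted above.

```python
import numpy as np

# Counts from Table 1; rows are the six urbanization-by-region groups, i = 1, ..., 6.
olives = np.array([
    [20, 15, 12, 17, 16, 28],
    [18, 17, 18, 18,  6, 25],
    [12,  9, 23, 21, 19, 30],
    [30, 22, 21, 17,  8, 12],
    [23, 18, 20, 18, 10, 15],
    [11,  9, 26, 19, 17, 24],
])


def linear_quadratic_effects(table, scores=None):
    """Return (v1, v2): linear and quadratic effects for each row of an r x c table."""
    table = np.asarray(table, dtype=float)
    c = table.shape[1]
    x = np.arange(1, c + 1, dtype=float) if scores is None else np.asarray(scores, dtype=float)
    p = table.sum(axis=0) / table.sum()          # pooled proportions N.j / n..
    n_i = table.sum(axis=1)                      # row totals n_i.
    d = x - np.sum(x * p)                        # centred scores x_j - mu
    mu2, mu3, mu4 = (np.sum(d ** k * p) for k in (2, 3, 4))
    a = (mu4 - mu3 ** 2 / mu2 - mu2 ** 2) ** -0.5
    g1 = d / np.sqrt(mu2)
    g2 = a * (d ** 2 - mu3 * d / mu2 - mu2)
    return table @ g1 / np.sqrt(n_i), table @ g2 / np.sqrt(n_i)


v1, v2 = linear_quadratic_effects(olives)
Q = np.sum(v1 ** 2)     # Yates's location statistic
D = np.sum(v2 ** 2)     # dispersion statistic
print(np.round(v1, 3), np.round(v2, 3))
print(round(Q, 2), round(D, 2))
```

Passing mid-ranks through `scores` instead of the default integers would give the mid-rank variant mentioned above, again only as a sketch.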

3. Ordinal categorical response data

Suppose that we have ordinal responses observed within factorial-structured groups. Treat the data as a two-way contingency table with counts $N_{ij}$ arranged in r rows (the groups) and c columns (the ordinal response categories). Calculate $v_{11}, \ldots, v_{1r}$ as defined in the previous section.

Under these circumstances, for the ith row we can calculate a single $v_{1i}$ from the $n_{i.}$ observations. The $V_{1i}$ are approximately independent normal random variables with unit variance. If there is a factorial structure for the row categories, then Q can be partitioned as in an analysis of variance to see which factors, if any, are important. However, as each $V_{1i}$ has approximately unit variance, the importance of each sum of squares in the partition of Q can be assessed via a $\chi^2$-test rather than by the F-test used in the analysis of variance.

Another difference from the analysis of variance is that Q is a sum of squares about 0 rather than about the average $V_{1i}$-value. If all the row totals are the same then this average will be 0, but in general this will not be the case. Thus if an analysis-of-variance computer routine is used to calculate the partition of Q then $(v_{11} + v_{12} + \ldots + v_{1r})^2/r$ should be added to the analysis-of-variance total sum of squares, and to the sum of squares associated with each factor.

Probabilities to assess significance can be obtained by using the usual $\chi^2$-distributions, or, when counts are small, by using Monte Carlo methods. If we carried out a permutation test on the data, the column totals $N_{.j}$ would remain constant. Thus Monte Carlo p-values may be obtained by using the algorithm of Patefield (1981) which generates random contingency tables with fixed margins. In contrast, obtaining Monte Carlo p-values for log-linear models is not always routine, as some of the random tables may be sparse and lead to numerical convergence problems.
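A Monte Carlo p-value of this kind can be sketched as follows. Note the substitution: instead of Patefield's algorithm AS 159, the sketch randomly permutes the response labels of the individual observations, which also keeps both margins fixed and is intended to sample the same null distribution of tables, though less efficiently. The function names, the number of simulations and the seed are assumptions for illustration, and the counts are those printed in Table 2.

```python
import numpy as np


def yates_q(table):
    """Yates's Q with integer scores 1, ..., c for an r x c table of counts."""
    table = np.asarray(table, dtype=float)
    x = np.arange(1, table.shape[1] + 1)
    p = table.sum(axis=0) / table.sum()
    mu = np.sum(x * p)
    mu2 = np.sum((x - mu) ** 2 * p)
    v1 = table @ ((x - mu) / np.sqrt(mu2)) / np.sqrt(table.sum(axis=1))
    return float(np.sum(v1 ** 2))


def random_table(row_totals, col_totals, rng):
    """Random table with the given margins, by permuting observation-level labels."""
    r, c = len(row_totals), len(col_totals)
    rows = np.repeat(np.arange(r), row_totals)                     # row label per observation
    cols = rng.permutation(np.repeat(np.arange(c), col_totals))   # shuffled column labels
    return np.bincount(rows * c + cols, minlength=r * c).reshape(r, c)


def monte_carlo_p_value(table, statistic=yates_q, n_sim=2000, seed=0):
    """Proportion of simulated tables whose statistic is at least the observed one."""
    table = np.asarray(table)
    rng = np.random.default_rng(seed)
    observed = statistic(table)
    row_totals, col_totals = table.sum(axis=1), table.sum(axis=0)
    hits = sum(
        statistic(random_table(row_totals, col_totals, rng)) >= observed - 1e-12
        for _ in range(n_sim)
    )
    return (hits + 1) / (n_sim + 1)      # add-one form keeps the estimate above zero


# Counts as printed in Table 2 (rows: Sydney, Melbourne, Tokyo, Osaka).
chocolate = np.array([
    [2, 1, 6, 1,  8, 9, 6],
    [1, 6, 2, 2, 10, 5, 5],
    [0, 1, 3, 4, 15, 7, 1],
    [1, 1, 2, 3, 16, 6, 2],
])
p_value = monte_carlo_p_value(chocolate)
print(p_value)
```

Any other statistic of the table, such as D or one term of a factorial partition, could be passed through the `statistic` argument in the same way.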

4. Olives data example

For each of the six rows of Table 1, values of $V_{1i}$ and $V_{2i}$ were calculated as above by using integer scores. For these data $X_P^2 = 50.05$ on 25 degrees of freedom ($\chi^2$ p-value, 0.002) and $X_P^2 - Q = 16.44$ on 20 degrees of freedom ($\chi^2$ p-value, 0.689). In addition, $D = 5.781$ on 5 degrees of freedom ($\chi^2$ p-value, 0.328) and $X_P^2 - Q - D = 10.66$ on 15 degrees of freedom ($\chi^2$ p-value, 0.776). Although the linear effects are the important effects here, decomposing the test statistic and its residual can still be illuminating. For example, a low $v_{2i}$-value often indicates a concentration of counts in the middle categories: see $v_{26}$. A high $v_{2i}$-value often indicates counts concentrated at one or both ends of the ordered categories. If the counts are concentrated at both ends or there are two peaks, there is market segmentation: the market is largely polarized into those who favour the product and those who do not.

The statistic Q was calculated and partitioned by using a computer routine for a two-way analysis of variance without replication, where the two factors are urbanization and region. This led to the analysis summarized in Table 3. Although quadratic effects are not important, Fig. 1 shows a plot of ($v_{1i}$, $v_{2i}$) values. There are both urbanization and region effects as judged by the linear effects $v_{1i}$. Olives are liked more in urban than in rural areas (compare 1, 2 and 3 in Fig. 1 with 4, 5 and 6) and more in the south-west (SW) than in the other regions (compare 1 and 4 and 2 and 5 with 3 and 6).

Table 3. Partition of Yates's Q-statistic for the olives data

Source                 χ²       Degrees of freedom   p-value
Urbanization           10.12    1                    0.001
Region                 18.83    2                    0.001
Interaction             4.66    2                    0.097
Yates's statistic Q    33.61    5                    <0.001

Fig. 1. Linear ($v_{1i}$) versus quadratic ($v_{2i}$) effects for the olives data

Fig. 1 suggests that there is not much difference in response between the north-east (NE) and mid-west (MW) regions. If U denotes urban and R denotes rural then this comparison can be examined by partitioning the sum of squares for region into two contrasts:

$$\sqrt{3}\,[\{MW(U) + MW(R) + NE(U) + NE(R)\}/2 - \{SW(U) + SW(R)\}]$$

and

$$\sqrt{2}\,\{MW(U) + MW(R) - NE(U) - NE(R)\},$$

each on 1 degree of freedom. For these data the associated sums of squares are 18.09 and 0.73 respectively. Clearly there is little difference in mean response between the MW and NE regions but the SW region differs significantly from the average of these ($\chi^2$ p-value < 0.001).

We defined a row factor S with six levels, a column factor C with six levels and a covariate or predictor LIN, taking values equal to those of C, and we used the GLIM software (Francis et al., 1993) to fit to the data in Table 1 the log-linear model 1 + S + C + S.LIN, where Poisson errors were assumed and 1 denotes an overall effect. We previously found $X_P^2 - Q = 16.44$ on 20 degrees of freedom, which compares with the $\chi^2$-value for the log-linear model of 17.67, also on 20 degrees of freedom. The following six parameter estimates (with standard errors in parentheses) related to S.LIN were obtained:

-0.055 (0.079), -0.126 (0.080), 0.045 (0.079), -0.342 (0.082), -0.236 (0.081), 0.000 ( ).

This last parameter is set to 0 by GLIM and there is no standard error. These are the log-linear model analogues of the $v_{1i}$s of Table 1. However, it is more difficult to compare them as they are correlated and have different standard errors, although the differences are slight for this data set. Note that all $V_{ui}$ are asymptotically standard normal random variables under the null hypothesis, so all the standard errors are approximately 1. An informal 'by eye' inspection indicates trends in the GLIM parameter estimates that are similar to those exhibited by the $v_{1i}$s. A log-linear analysis similar to that given in Table 3 can be given if we define two factors, urbanization and region, instead of S.
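The factorial partition of Q summarized in Table 3 can be sketched directly from the six $v_{1i}$-values quoted in Section 2.2. This is an illustrative reconstruction, not the analysis-of-variance routine the authors used: the function names are hypothetical, the correction term $(\sum_i v_{1i})^2/r$ is reported separately instead of being added to each line, and the small `chi2_sf` helper only covers the 1 and 2 degrees of freedom needed here.

```python
import math
import numpy as np

# v_1i from Section 2.2, arranged urbanization (rows) by region (columns MW, NE, SW).
v1 = np.array([[ 1.096, -0.158, 2.924],     # urban
               [-3.920, -2.062, 2.052]])    # rural


def partition_q(v):
    """Two-way partition of sum(v**2) into row, column and interaction parts."""
    a, b = v.shape
    total = float(np.sum(v ** 2))                       # Q, a sum of squares about 0
    correction = float(v.sum() ** 2 / (a * b))          # (sum of v)^2 / r
    ss_rows = float(np.sum(v.sum(axis=1) ** 2) / b) - correction
    ss_cols = float(np.sum(v.sum(axis=0) ** 2) / a) - correction
    ss_inter = total - correction - ss_rows - ss_cols
    return {
        "rows": (ss_rows, a - 1),
        "columns": (ss_cols, b - 1),
        "interaction": (ss_inter, (a - 1) * (b - 1)),
        "correction": correction,
        "total": total,
    }


def chi2_sf(x, df):
    """Upper-tail chi-squared probability for df = 1 or 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch handles 1 or 2 degrees of freedom only")


parts = partition_q(v1)
for name in ("rows", "columns", "interaction"):
    ss, df = parts[name]
    print(f"{name:12s} {ss:6.2f} on {df} df, p = {chi2_sf(ss, df):.4f}")
```

For these data the correction term is negligible because the $v_{1i}$ nearly sum to 0, so the three lines should land close to the urbanization, region and interaction entries of Table 3.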

5. Cross-cultural study example

Table 4 confirms the by-eye impression that there are differences between countries in dispersion or quadratic effects, but not in linear or location effects. There is no significant difference between Australian and Japanese consumers in their average liking of the sweetness of Japanese chocolate. However, there is significantly more spread in the opinions of the Australian consumers than in the opinions of the Japanese consumers. This conclusion would not have been reached if only location tests had been employed. The spread of opinions of the Australian consumers indicates market segmentation, which has commercial implications. The partitions of Q and D were calculated by using a one-way analysis-of-variance routine with the $v_{1i}$- and $v_{2i}$-values as data. The p-values in Table 4 are based on $\chi^2$-approximations. As all the row totals are equal in this example, no changes are needed in the analysis-of-variance output.

Table 4. Partition of the linear (Q) and dispersion (D) statistics for the Japanese chocolate data

Source                    Degrees of freedom   Linear        p-value   Quadratic      p-value
Countries                 1                    0.2122        0.65      11.1042        <0.01
Cities within countries   2                    0.7524        0.69       0.4544        0.80
Total                     3                    Q = 0.9646    0.81      D = 11.5586    <0.05
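The nested partition behind Table 4 can be sketched in the same spirit: compute $v_{1i}$ and $v_{2i}$ for the four cities, then split each sum of squares into a between-countries part and a cities-within-countries part. The function names and the `country` grouping vector are assumptions for this example. No output is quoted, because the figures depend on the counts as transcribed in Table 2; the correction term is returned separately so that the three parts always add up to Q or D.

```python
import numpy as np

# Counts as printed in Table 2 (rows: Sydney, Melbourne, Tokyo, Osaka).
chocolate = np.array([
    [2, 1, 6, 1,  8, 9, 6],
    [1, 6, 2, 2, 10, 5, 5],
    [0, 1, 3, 4, 15, 7, 1],
    [1, 1, 2, 3, 16, 6, 2],
], dtype=float)
country = np.array([0, 0, 1, 1])        # 0 = Australia, 1 = Japan


def effects(table):
    """Linear and quadratic row effects with integer scores and pooled proportions."""
    x = np.arange(1, table.shape[1] + 1, dtype=float)
    p = table.sum(axis=0) / table.sum()
    d = x - np.sum(x * p)
    mu2, mu3, mu4 = (np.sum(d ** k * p) for k in (2, 3, 4))
    a = (mu4 - mu3 ** 2 / mu2 - mu2 ** 2) ** -0.5
    root_n = np.sqrt(table.sum(axis=1))
    v1 = table @ (d / np.sqrt(mu2)) / root_n
    v2 = table @ (a * (d ** 2 - mu3 * d / mu2 - mu2)) / root_n
    return v1, v2


def nested_partition(v, group):
    """Split sum(v**2) into between-group, within-group and correction terms."""
    v = np.asarray(v, dtype=float)
    correction = v.sum() ** 2 / len(v)
    between = 0.0
    within = 0.0
    for g in np.unique(group):
        vg = v[group == g]
        between += len(vg) * (vg.mean() - v.mean()) ** 2
        within += np.sum((vg - vg.mean()) ** 2)
    return float(between), float(within), float(correction)


v1, v2 = effects(chocolate)
for label, v in (("linear (Q)", v1), ("quadratic (D)", v2)):
    between, within, correction = nested_partition(v, country)
    print(label, "countries:", round(between, 4),
          "cities within countries:", round(within, 4),
          "correction:", round(correction, 4))
```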

Acknowledgements

We are much indebted to two referees and the Joint Editor who suggested substantial revisions that have much improved the presentation. We also wish to thank Kathy Haskard for assistance with the calculations.

References

Agresti, A. (1990) Categorical Data Analysis. New York: Wiley.
Best, D. J. (1990) A new technique for statistical analysis of consumer sensory evaluation and market research data. CSIRO Food Res. Q., 50, 85-88.
Best, D. J. (1994) Nonparametric comparison of two histograms. Biometrics, 50, 538-541.
Chuang-Stein, C. and Francom, S. F. (1988) Some simple log-linear models for the analysis of ordinal response data. J. Appl. Statist., 15, 285-294.
Emerson, P. L. (1968) Numerical construction of orthogonal polynomials from a general recurrence formula. Biometrics, 24, 645-701.
Francis, B., Green, M. and Payne, C. (eds) (1993) The GLIM System, Release 4. Oxford: Clarendon.
Lancaster, H. O. (1969) The Chi-squared Distribution. New York: Wiley.
Nair, V. (1986) Testing in industrial experiments with ordered categorical data. Technometrics, 28, 283-311.
Patefield, W. M. (1981) Algorithm AS 159: An efficient method of generating random R × C tables with given row and column totals. Appl. Statist., 30, 91-97.
Rayner, J. C. W. and Best, D. J. (1989) Smooth Tests of Goodness of Fit. New York: Oxford University Press.
Yates, F. (1948) The analysis of contingency tables with grouping based on quantitative characters. Biometrika, 35, 176-181.

