[wiley series in probability and statistics] correspondence analysis || simple correspondence...

4

Simple correspondence analysis

4.1 Introduction

The graphical depiction of the association between categorical variables can be undertaken in many different ways. Chapter 1 provides a description of some common methods for visualising univariate and bivariate categorical variables. Our focus is on the correspondence analysis of a contingency table, whether it be a cross-classification of only two variables, or of many. The simplest case in which correspondence analysis is applied is the graphical depiction of the association between the variables of a two-way contingency table. Such a variant of correspondence analysis is referred to as simple correspondence analysis. The use of the word simple does not imply that the method is necessarily easy. Instead, it refers to the simplest type of contingency table the technique analyses, that is, a two-way contingency table.

What is now well understood as correspondence analysis has European origins. It has long been recognised that the quantification of association between two (or more) categorical variables has its roots in England. Some of the pioneering work of R.A. Fisher, Karl Pearson, Frank Yates and George U. Yule formed the foundation of the numerical aspects of correspondence analysis. However, it was not until the 1960s that Jean-Paul Benzecri and his team at the University of Paris, France, added a heavily graphical component to this work and so proposed the technique we now recognise as correspondence analysis. It has since spread across much of the European statistical, and allied, disciplines and has developed through the contributions of researchers in Great Britain, Japan and, to a lesser extent, the United States. In fact, much of the work concerning the correspondence analysis of multiple categorical variables stems from research undertaken in the Netherlands. Chapters 10 and 11 explore some of the key features and contributions of this aspect of correspondence analysis. However, despite the international growth of correspondence analysis, its impact on the Australasian statistical community has been minimal. With a few exceptions, including its sporadic application by researchers in Australia and New Zealand, the mathematical evolution of correspondence analysis remains largely unknown in this part of the world.

Correspondence Analysis: Theory, Practice and New Strategies, First Edition. Eric J. Beh and Rosaria Lombardo. © 2014 John Wiley & Sons, Ltd. Published 2014 by John Wiley & Sons, Ltd. Companion website: http://www.wiley.com/go/correspondence_analysis


The internationalisation of correspondence analysis has been relatively slow, especially when compared with the development of statistical techniques such as generalised linear models, Bayesian analysis, time series analysis and experimental design (just to name a few). The work in these areas of statistical study is, unlike correspondence analysis, largely model based, which probably helps to explain the absence of correspondence analysis in some parts of the statistical world. Where it is seen, it is largely viewed as a descriptive statistical technique, devoid of the inferential work that dominates much of statistical modelling.

So, in this chapter we will describe some of the key concepts of simple correspondence analysis and do so by keeping in mind long established ideas of association for contingency tables. In particular, our attention will focus on nominal categorical variables that are ‘symmetrically’ structured. Asymmetrically structured categorical variables that are nominally and ordinally structured will be considered in the next few chapters.

We shall begin our discussion by saying that the term correspondence analysis is a direct translation of l’analyse de correspondances, literally meaning the analysis of correspondences or relationships/associations that exist in the data. It is therefore often viewed as a graphical, and simple, means of analysis.

It is important to note that there are many introductory discussions on some of the key aspects of correspondence analysis. Some include (but are not limited to) the statistical descriptions of Beh (2004a), Kroonenberg and Greenacre (2006), Greenacre (2010a) and Abdi and Williams (2010). Further introductions, of varying technical levels, can be found in a number of different disciplines. For example, Hoffman and Franke (1986) and Whitlark and Smith (2001, 2004) provide excellent descriptions in the market research literature. Introductions to correspondence analysis can also be found in other disciplines, including psychology (Doey and Kurta, 2011), various areas of medicine (Greenacre, 1992; Moussa and Ouda, 1988; Yelland, 2010), nursing (Watts, 1997), the biological sciences (Moser, 1985) and legal studies (Harcourt, 2002). Historical accounts of the development of correspondence analysis can also be found. For example, Benzecri (1977), Van Meter et al. (1994) and Holmes (2008) provide a very good account of this history from a French perspective. Further historical descriptions may be found in Armatte (2008), De Falguerolles (2008) and Lebart (2008). Beh and Lombardo (2012) briefly describe the growth of correspondence analysis from an international perspective. Increasingly, we now find that many books that discuss issues concerning categorical data analysis dedicate a section, or chapters, to simple correspondence analysis, although many (not all) of them are rather elementary. See, for example, Wickens (1989, Section 11.5), Everitt (1992, Section 3.5), Husson et al. (2011, Chapter 2) and Agresti (2002, Section 9.6.4). Wickens (1998) also provides an introduction to some of the key issues concerning categorical data analysis.

4.2 Notation

Consider an $I \times J$ two-way contingency table, $\mathbf{N}$, where the $(i, j)$th cell entry is denoted by $n_{ij}$ for $i = 1, 2, \ldots, I$ and $j = 1, 2, \ldots, J$. Let the grand total of $\mathbf{N}$ be $n$ and the correspondence matrix, or matrix of relative frequencies, be $\mathbf{P}$ so that the $(i, j)$th cell entry is $p_{ij} = n_{ij}/n$ and $\sum_{i=1}^{I}\sum_{j=1}^{J} p_{ij} = 1$. Define the $i$th row marginal proportion by $p_{i\cdot} = \sum_{j=1}^{J} p_{ij}$ and define the $j$th column marginal proportion as $p_{\cdot j} = \sum_{i=1}^{I} p_{ij}$. In the correspondence analysis literature, the row marginal proportions are commonly referred to as row masses and the column marginal proportions are called column masses.

Much of our discussion in this chapter will explore the characteristics of correspondence analysis using matrix notation. Therefore, we shall denote the vector of row marginal proportions by $\mathbf{r} = (p_{1\cdot}, \ldots, p_{I\cdot})^T$. Similarly, the vector of column marginal proportions will be denoted as $\mathbf{c} = (p_{\cdot 1}, \ldots, p_{\cdot J})^T$. The diagonal matrix of the row marginal proportions is denoted by $\mathbf{D}_I = \mathrm{diag}(\mathbf{r})$ while the diagonal matrix of column marginal proportions is denoted by $\mathbf{D}_J = \mathrm{diag}(\mathbf{c})$.

[Photograph: Jean-Paul Benzecri]

4.3 Measuring departures from complete independence

The aim of correspondence analysis, like many multivariate data analytic techniques, is to determine scores which describe how similar or different responses from two or more variables are. For a two-way contingency table, the strength of association between the row scores and column scores should also be considered. This is traditionally carried out by determining those cell proportions which deviate from what is expected under the hypothesis of independence,

\[ \mathbf{P} = \mathbf{r}\mathbf{c}^T . \tag{4.1} \]

However, there are a variety of ways in which departures from independence can be measured. Here we discuss two such ways.


4.3.1 The ‘duplication constant’

One of the simplest ways to determine the strength of association between two categorical variables is to consider the duplication constant, $\alpha$, such that

\[ \mathbf{P} = \alpha\,\mathbf{r}\mathbf{c}^T . \]

Thus, complete independence between the row and column variables occurs when $\alpha = 1$. Such an approach has been commonly used in the past in the marketing research literature and is a topic of consideration by Headen et al. (1979), Danaher (1991) and Gokhale and Johnson (1978). In fact, Danaher (1991) considered the duplication constant in the context of popular association models for two- and three-way contingency tables. The estimation of $\alpha$ may be achieved using least-squares estimation and yields

\[ \hat{\alpha} = \frac{\sum_{i=1}^{I}\sum_{j=1}^{J} p_{ij}\,p_{i\cdot}\,p_{\cdot j}}{\left(\sum_{i=1}^{I} p_{i\cdot}^{2}\right)\left(\sum_{j=1}^{J} p_{\cdot j}^{2}\right)} . \]

Despite its simplicity, considering the duplication constant assumes that the departure from independence, measured by the quantity $\alpha$, is homogeneous across each and every cell of the contingency table.

Since it is generally considered inappropriate to assume a homogeneous independence structure across the cells, this constant is very rarely considered in categorical data analysis. Alternatively, one may consider an equally simple measure that reflects a heterogeneous structure and leads to Pearson’s chi-squared statistic.
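As a minimal sketch of the least-squares estimate of the duplication constant, the following Python function computes it from a matrix of relative frequencies. The function name `duplication_constant` and both example tables are hypothetical and chosen only for illustration.

```python
def duplication_constant(P):
    """Least-squares estimate of alpha in P = alpha * r c^T."""
    r = [sum(row) for row in P]            # row masses p_i.
    c = [sum(col) for col in zip(*P)]      # column masses p_.j
    numerator = sum(P[i][j] * r[i] * c[j]
                    for i in range(len(r)) for j in range(len(c)))
    denominator = sum(x * x for x in r) * sum(x * x for x in c)
    return numerator / denominator


# Hypothetical independent table: P = r c^T with r = (0.2, 0.8), c = (0.5, 0.3, 0.2).
independent = [[0.2 * cj for cj in (0.5, 0.3, 0.2)],
               [0.8 * cj for cj in (0.5, 0.3, 0.2)]]

# Hypothetical 2 x 2 table with a strong association but uniform margins.
associated = [[0.4, 0.1],
              [0.1, 0.4]]

alpha_independent = duplication_constant(independent)
alpha_associated = duplication_constant(associated)
```

Both tables give an estimate of 1 (up to floating-point error), even though the second has cell proportions of 0.4 and 0.1 against an independence value of 0.25 in every cell. A single constant cannot reflect such cell-by-cell departures, which is the limitation noted above.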

4.3.2 Pearson ratios

Rather than determining the duplication constant for a contingency table, we may consider calculating the Pearson ratio of each cell. In matrix form, we may therefore consider

\[ \mathbf{P} = \mathbf{D}_I \boldsymbol{\Delta} \mathbf{D}_J , \tag{4.2} \]

where the $(i, j)$th cell value of $\boldsymbol{\Delta}$ is the Pearson ratio of that cell, $p_{ij}/(p_{i\cdot}\,p_{\cdot j})$. If the row and column variables are completely independent, then all cells in $\boldsymbol{\Delta}$ are 1. However, complete independence will seldom be observed. Therefore, one can determine which of the Pearson ratios in $\boldsymbol{\Delta}$ are not equal to 1. These elements can easily be observed by calculating the matrix of Pearson ratios:

\[ \boldsymbol{\Delta} = \mathbf{D}_I^{-1} \mathbf{P} \mathbf{D}_J^{-1} . \tag{4.3} \]

See, for example, Goodman (1996) and Beh (2004a).

Table 4.1 shows that, for Selikoff’s (1981) asbestos data summarised in Table 1.2, there are a number of cells that have a Pearson ratio that is vastly different from unity; consider the more extreme values in the (5, 1)th, (5, 3)th and (5, 4)th cells. Pearson ratios that are greater than 1 occur when we have a low (or no) grade of asbestosis and fewer years of exposure to asbestos, or when we record a high grade of asbestosis and many years of occupational exposure to the fibre. On the other hand, the remaining cell values are all less than 1. This suggests that there may be evidence of a strong linear-by-linear association (or positive correlation) between the two variables; more formal tests are generally not available for the Pearson ratio, although the analyst may consider Monte Carlo $p$-values. Due to the presence of three zero cell frequencies in the original contingency table, there are three zero Pearson ratios. Section 4.10 provides a comprehensive description of the R code required to calculate the Pearson ratio for each cell and many other numerical summaries described in this chapter.

Table 4.1 The Pearson ratios of Table 1.2.

Exposure (years)   None    Grade 1   Grade 2   Grade 3
0-9                1.740   0.318     0.000     0.000
10-19              1.087   1.272     0.211     0.000
20-29              0.530   1.387     1.957     1.161
30-39              0.250   1.605     2.239     2.073
40+                0.112   0.883     3.737     5.170
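The Pearson ratios of Equation 4.3 can be sketched in a few lines of Python (the chapter itself points to R code in Section 4.10; Python is used here only for illustration). The counts below are an assumption: they are taken to be those of Table 1.2, which is not reproduced in this section, and the function name `pearson_ratios` is hypothetical.

```python
# Counts assumed to be those of Table 1.2 (rows: years of exposure,
# columns: None, Grade 1, Grade 2, Grade 3); the table is not shown here.
N = [
    [310, 36, 0, 0],     # 0-9
    [212, 158, 9, 0],    # 10-19
    [21, 35, 17, 4],     # 20-29
    [25, 102, 49, 18],   # 30-39
    [7, 35, 51, 28],     # 40+
]


def pearson_ratios(counts):
    """Return the matrix of Pearson ratios p_ij / (p_i. p_.j)."""
    n = sum(sum(row) for row in counts)
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]
    # p_ij / (p_i. p_.j) simplifies to n * n_ij / (n_i. n_.j)
    return [[n * x / (ri * cj) for x, cj in zip(row, col_totals)]
            for row, ri in zip(counts, row_totals)]


delta = pearson_ratios(N)
for label, row in zip(("0-9", "10-19", "20-29", "30-39", "40+"), delta):
    print(f"{label:6s}", " ".join(f"{x:6.3f}" for x in row))
```

Under the stated assumption about the counts, the printed values can be compared cell by cell with Table 4.1.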

4.4 Decomposing the Pearson ratio

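The complete association structure of the original contingency table, $\mathbf{N}$, is preserved within the matrix of Pearson ratios, $\boldsymbol{\Delta}$. However, this matrix is of dimension $I \times J$ which, in many situations, makes visualisation very difficult. Therefore, in order to represent as much of the association in the contingency table as possible using as few dimensions as possible, we shall consider the generalised singular value decomposition:

\[ \boldsymbol{\Delta} = \tilde{\mathbf{A}} \tilde{\boldsymbol{\Lambda}}_\lambda \tilde{\mathbf{B}}^T . \tag{4.4} \]

For Equation 4.4, $\tilde{\mathbf{A}}$ is an $I \times M$ column matrix consisting of the $M$ left singular vectors of $\boldsymbol{\Delta}$, where the $m$th vector is denoted by $\mathbf{a}_m = (a_{1m}, a_{2m}, \ldots, a_{Im})^T$. Similarly, $\tilde{\mathbf{B}}$ is a $J \times M$ column matrix consisting of the $M$ right singular vectors of $\boldsymbol{\Delta}$, where the $m$th vector is $\mathbf{b}_m = (b_{1m}, b_{2m}, \ldots, b_{Jm})^T$. In both cases, $M = \min(I, J)$ and the first (trivial) singular vector of both matrices has all values equal to 1. The matrices $\tilde{\mathbf{A}}$ and $\tilde{\mathbf{B}}$ have the property

\[ \tilde{\mathbf{A}}^T \mathbf{D}_I \tilde{\mathbf{A}} = \mathbf{I}_M , \qquad \tilde{\mathbf{B}}^T \mathbf{D}_J \tilde{\mathbf{B}} = \mathbf{I}_M , \tag{4.5} \]

respectively, where $\mathbf{I}_M$ is an identity matrix of dimension $M \times M$. The matrix $\tilde{\boldsymbol{\Lambda}}_\lambda = \mathrm{diag}(\lambda_0, \lambda_1, \ldots, \lambda_{M-1})$ is an $M \times M$ diagonal matrix consisting of the singular values of $\boldsymbol{\Delta}$ arranged in descending order such that

\[ 1 = \lambda_0 > \lambda_1 > \lambda_2 > \cdots > \lambda_{M-1} > 0 . \]

See Van de Velden and Neudecker (2000) and Benasseni (2002) for proofs of this result.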
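The trivial solution can be checked directly from the definition of the Pearson ratios. The short derivation below is a sketch using only the notation of Section 4.2, with $\mathbf{1}_I$ and $\mathbf{1}_J$ denoting vectors of ones; it shows that the unit vectors form a singular pair of $\boldsymbol{\Delta}$, in the metrics $\mathbf{D}_I$ and $\mathbf{D}_J$, with singular value 1.

```latex
\begin{align*}
\boldsymbol{\Delta}\,\mathbf{D}_J \mathbf{1}_J
  &= \mathbf{D}_I^{-1}\mathbf{P}\,\mathbf{D}_J^{-1}\mathbf{D}_J \mathbf{1}_J
   = \mathbf{D}_I^{-1}\mathbf{P}\,\mathbf{1}_J
   = \mathbf{D}_I^{-1}\mathbf{r}
   = 1 \cdot \mathbf{1}_I , \\
\boldsymbol{\Delta}^T \mathbf{D}_I \mathbf{1}_I
  &= \mathbf{D}_J^{-1}\mathbf{P}^T \mathbf{1}_I
   = \mathbf{D}_J^{-1}\mathbf{c}
   = 1 \cdot \mathbf{1}_J , \\
\mathbf{1}_I^T \mathbf{D}_I \mathbf{1}_I
  &= \sum_{i=1}^{I} p_{i\cdot} = 1 ,
\qquad
\mathbf{1}_J^T \mathbf{D}_J \mathbf{1}_J
   = \sum_{j=1}^{J} p_{\cdot j} = 1 .
\end{align*}
```

The last line confirms that the trivial vectors also satisfy the normalisation of Equation 4.5.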


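We may redefine the generalised singular value decomposition of the Pearson ratios by removing the trivial values $\lambda_0 = 1$, $a_{i0} = 1$ and $b_{j0} = 1$ from the analysis. In doing so, let $\mathbf{A}$ and $\mathbf{B}$ be the matrices $\tilde{\mathbf{A}}$ and $\tilde{\mathbf{B}}$, respectively, with the trivial singular vector from each removed. Thus, $\tilde{\mathbf{A}} = [\mathbf{1}_I \;\; \mathbf{A}]$ and $\tilde{\mathbf{B}} = [\mathbf{1}_J \;\; \mathbf{B}]$. Also, for the $M \times M$ matrix $\tilde{\boldsymbol{\Lambda}}_\lambda$, remove the first row and column, yielding the matrix $\boldsymbol{\Lambda}_\lambda$ so that

\[ \tilde{\boldsymbol{\Lambda}}_\lambda = \begin{pmatrix} 1 & \mathbf{0}_{M^*}^T \\ \mathbf{0}_{M^*} & \boldsymbol{\Lambda}_\lambda \end{pmatrix} . \]

Therefore, denoting $M^* = \min(I, J) - 1$, $\mathbf{A}$ is an $I \times M^*$ matrix, $\mathbf{B}$ is a $J \times M^*$ matrix and $\boldsymbol{\Lambda}_\lambda$ is an $M^* \times M^*$ matrix. By omitting the trivial values, the generalised singular value decomposition of the Pearson ratios becomes the generalised singular value decomposition of

\[ \boldsymbol{\Delta} - \mathbf{U} = \mathbf{A} \boldsymbol{\Lambda}_\lambda \mathbf{B}^T , \]

where $\mathbf{U}$ is a unity matrix of size $I \times J$ whose elements are all 1.

The orthogonality constraints currently imposed on $\mathbf{A}$ and $\mathbf{B}$ prevent us from directly using R; this is because the singular value decomposition function in R, svd, does not impose the generalised singular value decomposition constraints given by Equation 4.5. However, by considering an appropriate rescaling of $\mathbf{A}$ and $\mathbf{B}$, R may be used. The scaling we consider here is to let $\bar{\mathbf{A}} = \mathbf{D}_I^{1/2}\mathbf{A}$ and $\bar{\mathbf{B}} = \mathbf{D}_J^{1/2}\mathbf{B}$. Therefore, $\boldsymbol{\Delta} - \mathbf{U} = \mathbf{A}\boldsymbol{\Lambda}_\lambda\mathbf{B}^T$ may be alternatively rewritten so that

\[ \boldsymbol{\Delta} - \mathbf{U} = \mathbf{D}_I^{-1}\left(\mathbf{P} - \mathbf{r}\mathbf{c}^T\right)\mathbf{D}_J^{-1} = \left(\mathbf{D}_I^{-1/2}\bar{\mathbf{A}}\right)\boldsymbol{\Lambda}_\lambda\left(\mathbf{D}_J^{-1/2}\bar{\mathbf{B}}\right)^T = \mathbf{D}_I^{-1/2}\left(\bar{\mathbf{A}}\boldsymbol{\Lambda}_\lambda\bar{\mathbf{B}}^T\right)\mathbf{D}_J^{-1/2} . \]

By pre-multiplying both sides of this expression by $\mathbf{D}_I^{1/2}$ and post-multiplying by $\mathbf{D}_J^{1/2}$, we obtain the matrix of standardised, or Pearson, residuals, $\mathbf{S}$:

\[ \mathbf{S} = \mathbf{D}_I^{-1/2}\left(\mathbf{P} - \mathbf{r}\mathbf{c}^T\right)\mathbf{D}_J^{-1/2} = \bar{\mathbf{A}}\boldsymbol{\Lambda}_\lambda\bar{\mathbf{B}}^T . \]

By considering the rescaling of $\mathbf{A}$ to obtain $\bar{\mathbf{A}}$ and the rescaling of $\mathbf{B}$ to obtain $\bar{\mathbf{B}}$, the orthogonality constraints under generalised singular value decomposition of $\boldsymbol{\Delta}$,

\[ \mathbf{A}^T \mathbf{D}_I \mathbf{A} = \mathbf{I}_{M^*} , \qquad \mathbf{B}^T \mathbf{D}_J \mathbf{B} = \mathbf{I}_{M^*} , \]

become the orthogonality constraints for the singular value decomposition of $\mathbf{S}$:

\[ \bar{\mathbf{A}}^T \bar{\mathbf{A}} = \mathbf{I}_{M^*} , \qquad \bar{\mathbf{B}}^T \bar{\mathbf{B}} = \mathbf{I}_{M^*} , \]

where $M^* = \min(I, J) - 1$.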
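To illustrate why the matrix of standardised residuals is convenient, the sketch below builds it in plain Python and extracts its largest singular value. Two things are assumptions here: the counts are taken to be those of Table 1.2 (not reproduced in this section), and power iteration is swapped in for a full singular value decomposition such as R's svd, so only the first singular value is obtained. The function names are hypothetical.

```python
from math import sqrt

# Counts assumed to be those of Table 1.2 (not reproduced in this section).
N = [
    [310, 36, 0, 0],
    [212, 158, 9, 0],
    [21, 35, 17, 4],
    [25, 102, 49, 18],
    [7, 35, 51, 28],
]


def standardised_residuals(counts):
    """S = D_I^{-1/2} (P - r c^T) D_J^{-1/2}, returned as a list of rows."""
    n = sum(sum(row) for row in counts)
    r = [sum(row) / n for row in counts]          # row masses
    c = [sum(col) / n for col in zip(*counts)]    # column masses
    return [[(x / n - ri * cj) / sqrt(ri * cj) for x, cj in zip(row, c)]
            for row, ri in zip(counts, r)]


def leading_singular_value(S, iterations=200):
    """Largest singular value of S, by power iteration on S^T S."""
    v = [1.0 / (j + 1) for j in range(len(S[0]))]   # arbitrary start vector
    for _ in range(iterations):
        u = [sum(s * vj for s, vj in zip(row, v)) for row in S]   # u = S v
        w = [sum(row[j] * ui for row, ui in zip(S, u))
             for j in range(len(v))]                              # w = S^T u
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    u = [sum(s * vj for s, vj in zip(row, v)) for row in S]
    return sqrt(sum(x * x for x in u))                            # ||S v||


S = standardised_residuals(N)
total_inertia = sum(x * x for row in S for x in row)   # trace(S^T S) = X^2 / n
lambda_1 = leading_singular_value(S)
```

Because the singular vectors of $\mathbf{S}$ are orthonormal in the ordinary sense, no weighting by the masses is needed inside the iteration; the masses enter only through the construction of $\mathbf{S}$.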


Beh (2004a) considered the generalised singular value decomposition of the Pearson ratios as the starting point for his discussion of correspondence analysis. Gower and Hand (1996) and Gower et al. (2011) also described their derivation of biplots for correspondence analysis from this perspective. They referred to the elements of $\boldsymbol{\Delta}$ as contingency ratios, as did Greenacre (2009) in his discussion of a variant of simple correspondence analysis called log-ratio correspondence analysis; see also Section 9.6.9. The singular value decomposition of the standardised residuals is considered by many, including Greenacre (1984).

4.5 Coordinate systems

4.5.1 Standard coordinates

When considering the generalised singular value decomposition of the matrix of Pearson ratios, $\boldsymbol{\Delta}$, one may visualise the associations between the row categories and the column categories by considering the set of singular vectors, $\{a_{im}\}$ and $\{b_{jm}\}$, for $i = 1, 2, \ldots, I$ and $j = 1, 2, \ldots, J$. These are referred to as standard coordinates and may be plotted as coordinates for the $i$th row and $j$th column onto the $m$th dimension of a correspondence plot; see, for example, Greenacre (1984, p. 93). The standard coordinates of the rows and columns of Selikoff’s data (Table 1.2) are summarised in Tables 4.2 and 4.3, respectively. Chapter 3 calculates these coordinates using R; see Atilde and Btilde. Figure 4.1 shows the two-dimensional plot of the standard coordinates.

[Figure 4.1: plot of the row categories (0-9, 10-19, 20-29, 30-39, 40+) and column categories (None, Grade 1, Grade 2, Grade 3). Axes: Principal Axis 1 (84.22%) and Principal Axis 2 (15.35%).]

Figure 4.1 Two-dimensional plot of the standard coordinates of Selikoff’s asbestos data in Table 1.2.


Table 4.2 The row standard coordinates of Table 1.2.

Exposure (years)   Axis 1   Axis 2   Axis 3   Axis 4
0-9                −1.023    0.949   −0.390   −0.817
10-19              −0.368   −0.917    0.866   −0.100
20-29               0.668   −0.582   −2.730    1.476
30-39               1.093   −0.739   −0.584   −1.917
40+                 1.901    1.714    1.075    0.119

Table 4.3 The column standard coordinates of Table 1.2.

Asbestosis grade   Axis 1   Axis 2   Axis 3   Axis 4
None               −0.847    0.472   −0.056    1.000
Grade 1             0.416   −1.340    0.291    1.000
Grade 2             1.799    0.879   −1.963    1.000
Grade 3             2.161    2.167    3.460    1.000

One may be tempted to interpret the plot in the following manner: workers with less than 10 years of exposure are likely to not be diagnosed with asbestosis, while workers with at least 40 years of exposure will be diagnosed with the two most severe grades. However, one disadvantage of considering standard coordinates is that they give equal weighting to each of the $M^*$ dimensions. That is, for the matrix of row standard coordinates, $\mathbf{A}$,

\[ \mathrm{Var}(\mathbf{A}) = \mathbf{A}^T \mathbf{D}_I \mathbf{A} = \mathbf{I}_{M^*} \]

so that the weight associated with each axis of plots of the type given by Figure 4.1 is 1. Therefore, these axes have associated with them unit principal inertia values.
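The unit weighting can be checked numerically for a single axis. The sketch below uses the Axis 1 row standard coordinates reported in Table 4.2 together with row masses computed from row totals that are assumed to be those of Table 1.2 (the table is not reproduced in this section). Since the reported coordinates are rounded to three decimals, the results are only approximately 0 and 1.

```python
# Row totals assumed to be those of Table 1.2 (0-9, 10-19, 20-29, 30-39, 40+).
row_totals = [346, 379, 77, 194, 121]
n = sum(row_totals)
masses = [t / n for t in row_totals]               # row masses p_i.

# Row standard coordinates on Axis 1, as reported in Table 4.2.
a1 = [-1.023, -0.368, 0.668, 1.093, 1.901]

# Weighted mean: zero because a_1 is D_I-orthogonal to the trivial vector of ones.
weighted_mean = sum(p * a for p, a in zip(masses, a1))

# Weighted sum of squares: the (1, 1) element of A^T D_I A, which should be 1.
weighted_variance = sum(p * a * a for p, a in zip(masses, a1))

print(round(weighted_mean, 3), round(weighted_variance, 3))
```

The same check applied to any other non-trivial axis should again return a weighted sum of squares close to 1, which is what is meant by each axis carrying unit inertia.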

4.5.2 Principal coordinates

Rather than defining the row and column coordinates to be the column vectors of $\mathbf{A}$ and $\mathbf{B}$, respectively, consider instead defining the position of the row and column categories so that the strength of the association that exists between the variables is reflected. One such scaling of $\mathbf{A}$ and $\mathbf{B}$ is to consider

\[ \mathbf{F} = \mathbf{A}\boldsymbol{\Lambda}_\lambda , \tag{4.6} \]

\[ \mathbf{G} = \mathbf{B}\boldsymbol{\Lambda}_\lambda , \tag{4.7} \]

which define the row and column principal coordinates, respectively. For Table 1.2, the row and column principal coordinates are summarised in Tables 4.4 and 4.5, respectively, and are graphically depicted using the two-dimensional correspondence plot of Figure 4.2.

By considering the row and column principal coordinates of Equations 4.6 and 4.7, the orthogonality constraints become

\[ \mathbf{F}^T \mathbf{D}_I \mathbf{F} = \boldsymbol{\Lambda}_\lambda^2 , \qquad \mathbf{G}^T \mathbf{D}_J \mathbf{G} = \boldsymbol{\Lambda}_\lambda^2 . \tag{4.8} \]


Table 4.4 The row principal coordinates of Selikoff’s asbestos data summarised in Table 1.2.

Exposure (years)   Axis 1   Axis 2   Axis 3
0-9                −0.715    0.283   −0.020
10-19              −0.258   −0.274    0.043
20-29               0.468   −0.174   −0.137
30-39               0.764   −0.221   −0.029
40+                 1.330    0.512    0.054

Table 4.5 The column principal coordinates of Selikoff’s asbestos data summarised in Table 1.2.

Asbestosis grade   Axis 1   Axis 2   Axis 3
None               −0.592    0.141   −0.003
Grade 1             0.291   −0.400    0.015
Grade 2             1.258    0.262   −0.098
Grade 3             1.512    0.647    0.173

[Figure 4.2: plot of the row categories (0-9, 10-19, 20-29, 30-39, 40+) and column categories (None, Grade 1, Grade 2, Grade 3). Recovered axis label: Principal Axis 2 (15.35%).]

Figure 4.2 Two-dimensional plot of the principal coordinates of Selikoff’s asbestos data in Table 1.2.


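Instead of each axis of Figure 4.2 having unit principal inertia values, the $m$th principal axis has an inertia value equal to $\lambda_m^2$.

Note that, from Equations 4.5 and 4.8,

\[ \frac{X^2}{n} = \mathrm{trace}\left(\mathbf{F}^T \mathbf{D}_I \mathbf{F}\right) . \tag{4.9} \]

Similarly,

\[ \frac{X^2}{n} = \mathrm{trace}\left(\mathbf{G}^T \mathbf{D}_J \mathbf{G}\right) . \tag{4.10} \]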

Equations 4.9 and 4.10 show that principal coordinates close to the origin contribute very little (relatively speaking) to the association between the variables since they only make a relatively small contribution to the total inertia. On the other hand, principal coordinates far from the origin do make such a contribution. Of course, their position in the correspondence plot is also influenced by the row and column masses.

By considering Equations 4.8 and 4.9, the total inertia of the contingency table is

\[ \frac{X^2}{n} = \mathrm{trace}\left(\boldsymbol{\Lambda}_\lambda^2\right) = \sum_{m=1}^{M^*} \lambda_m^2 , \]

which confirms the total inertia expression in Section 3.7.1 and is the sum of the principal inertias (the squared singular values) along each of the $M^*$ dimensions of the optimal correspondence plot. Thus, the $m$th principal inertia, which is associated with the $m$th principal axis of a correspondence plot, for $m = 1, 2, \ldots, M^*$, may be expressed as the weighted sum of squares of the $I$ row principal coordinates by

\[ \lambda_m^2 = \sum_{i=1}^{I} p_{i\cdot}\, f_{im}^2 . \]

Similarly, the $m$th principal inertia may also be expressed as the weighted sum of squares of the $J$ column principal coordinates along the $m$th principal axis:

\[ \lambda_m^2 = \sum_{j=1}^{J} p_{\cdot j}\, g_{jm}^2 . \]

Therefore, we can quantify the contribution that the principal coordinates for each row and column category make to the general configuration of the correspondence plot, in particular, to each of the axes used to construct the plot.
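As a numerical illustration of these weighted sums of squares, the sketch below recomputes the principal inertias from the row principal coordinates reported in Table 4.4. The row masses are computed from row totals that are assumed to be those of Table 1.2 (not reproduced in this section), and the coordinates are rounded to three decimals, so the results are approximate.

```python
# Row totals assumed to be those of Table 1.2 (0-9, 10-19, 20-29, 30-39, 40+).
row_totals = [346, 379, 77, 194, 121]
n = sum(row_totals)
masses = [t / n for t in row_totals]               # row masses p_i.

# Row principal coordinates, as reported in Table 4.4 (Axis 1, Axis 2, Axis 3).
F = [
    [-0.715, 0.283, -0.020],
    [-0.258, -0.274, 0.043],
    [0.468, -0.174, -0.137],
    [0.764, -0.221, -0.029],
    [1.330, 0.512, 0.054],
]

# Principal inertia of axis m: sum_i p_i. * f_im^2
principal_inertias = [
    sum(p * row[m] ** 2 for p, row in zip(masses, F))
    for m in range(3)
]
total_inertia = sum(principal_inertias)            # approximates X^2 / n
percentages = [100 * x / total_inertia for x in principal_inertias]

# Contribution of each row category to the first principal inertia.
axis1_contributions = [p * row[0] ** 2 / principal_inertias[0]
                       for p, row in zip(masses, F)]
```

The entries of `axis1_contributions` sum to 1 and give one way of quantifying how much each row category contributes to the first axis.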

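The row principal coordinates may be alternatively expressed by post-multiplying Equation 4.3 by $\mathbf{D}_J\mathbf{B}$ and using the column orthogonality property of Equation 4.5, which gives

\[ \mathbf{F} = \mathbf{D}_I^{-1} \mathbf{P} \mathbf{B} . \tag{4.11} \]

Similarly, the column principal coordinates can be alternatively expressed by

\[ \mathbf{G} = \mathbf{D}_J^{-1} \mathbf{P}^T \mathbf{A} . \tag{4.12} \]

Consider the $(i, m)$th element of (4.11):

\[ f_{im} = \sum_{j=1}^{J} \frac{p_{ij}}{p_{i\cdot}}\, b_{jm} = \left(\frac{p_{i1}}{p_{i\cdot}}\right) b_{1m} + \left(\frac{p_{i2}}{p_{i\cdot}}\right) b_{2m} + \cdots + \left(\frac{p_{iJ}}{p_{i\cdot}}\right) b_{Jm} . \tag{4.13} \]

This is just a weighted sum of the column standard coordinates, where the weights are the elements of the $i$th row profile, and is akin to the reciprocal averaging technique described in detail by Hill (1974) and outlined in Chapter 3; see Equation 3.4 where, for $m = 1$, $f_{i1} \equiv \rho_1 a_{i1}$. Similarly, the $(j, m)$th element of (4.12),

\[ g_{jm} = \sum_{i=1}^{I} \frac{p_{ij}}{p_{\cdot j}}\, a_{im} = \left(\frac{p_{1j}}{p_{\cdot j}}\right) a_{1m} + \left(\frac{p_{2j}}{p_{\cdot j}}\right) a_{2m} + \cdots + \left(\frac{p_{Ij}}{p_{\cdot j}}\right) a_{Im} , \]

is a weighted sum of the row standard coordinates, where the weights are the elements of the $j$th column profile, and is analogous to Equation 3.5. These equations may therefore be considered formulae to transition between the standard coordinates of one variable and the principal coordinates of the other variable. We shall discuss transition formulae for the row and column principal coordinates in Section 4.7.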
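Equation 4.13 can be illustrated with a short sketch: each row principal coordinate on the first axis is obtained as the average of the column standard coordinates of Table 4.3, weighted by the row profile. The counts are assumed to be those of Table 1.2 (not reproduced in this section), and the function name `row_principal_coordinates` is hypothetical.

```python
# Counts assumed to be those of Table 1.2 (not reproduced in this section).
N = [
    [310, 36, 0, 0],     # 0-9
    [212, 158, 9, 0],    # 10-19
    [21, 35, 17, 4],     # 20-29
    [25, 102, 49, 18],   # 30-39
    [7, 35, 51, 28],     # 40+
]

# Column standard coordinates on Axis 1, as reported in Table 4.3.
b1 = [-0.847, 0.416, 1.799, 2.161]


def row_principal_coordinates(counts, b):
    """f_im = sum_j (p_ij / p_i.) b_jm for a single axis m."""
    coords = []
    for row in counts:
        total = sum(row)
        profile = [x / total for x in row]          # row profile p_ij / p_i.
        coords.append(sum(w * bj for w, bj in zip(profile, b)))
    return coords


f1 = row_principal_coordinates(N, b1)
print([round(x, 3) for x in f1])
```

Under the stated assumption about the counts, the result can be compared with the Axis 1 column of Table 4.4; small differences in the third decimal are due to the rounding of the standard coordinates.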

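An equivalent formulation of the row and column principal coordinates may be obtained if one considers the singular value decomposition of $\mathbf{S}$ rather than the generalised singular value decomposition of $\boldsymbol{\Delta}$. In this case, the coordinates may be alternatively, but equivalently, expressed as

\[ \mathbf{F} = \mathbf{A}\boldsymbol{\Lambda}_\lambda = \mathbf{D}_I^{-1/2}\bar{\mathbf{A}}\boldsymbol{\Lambda}_\lambda , \]

\[ \mathbf{G} = \mathbf{B}\boldsymbol{\Lambda}_\lambda = \mathbf{D}_J^{-1/2}\bar{\mathbf{B}}\boldsymbol{\Lambda}_\lambda , \]

where $\bar{\mathbf{A}} = \mathbf{D}_I^{1/2}\mathbf{A}$ and $\bar{\mathbf{B}} = \mathbf{D}_J^{1/2}\mathbf{B}$ are the matrices of left and right singular vectors of $\mathbf{S}$.

It is worth noting that, in their discussion of simple correspondence analysis, Gower and Digby (1981, p. 93) consider the following set of principal coordinates:

\[ \mathbf{F} = \mathbf{A}\boldsymbol{\Lambda}_\lambda^{\beta} , \tag{4.14} \]

\[ \mathbf{G} = \mathbf{B}\boldsymbol{\Lambda}_\lambda^{\beta} , \tag{4.15} \]

where $\beta$ depends on the coordinate system being considered; these may therefore be referred to as Gower-Digby coordinates. For example, when $\beta = 0$, the resulting coordinates are identical to the standard coordinates described above. When $\beta = 1$, the coordinates are the principal coordinates of the row and column categories. As we shall see in the next section, when $\beta = 1/2$, the resulting plots are akin to a symmetric biplot.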

Principal coordinates have traditionally been the default coordinate system for graphically depicting the association between the row and column variables. However, the interpretation of the proximity between a row principal coordinate and a column principal coordinate has long been questioned. This is because the Euclidean inner product of $\mathbf{F}$ and $\mathbf{G}$ is

\[ \mathbf{F}\mathbf{G}^T = \mathbf{A}\boldsymbol{\Lambda}_\lambda^2\mathbf{B}^T \neq \boldsymbol{\Delta} - \mathbf{U} , \]

so that it does not recover the departures of the Pearson ratios from 1.

[Photograph: Ludovic Lebart]

Lebart et al. (1984, p. 46) also described that it was ‘extremely dangerous’ to interpret the proximity of principal coordinates from different variables. More recently, Bock (2011a) described this issue from a marketing research perspective. Interestingly, Collins (2011, p. 583) remarks of Bock’s discussion:

    . . . not only do we not need CA but that its opacity blinds us to the results that emerge from a fuller and better directed analysis.

Such a harsh and critical appraisal will no doubt create consternation among some whose fluency in correspondence analysis is well established. It certainly did with Bock (2011b, pp. 587-588) who replied

    I suspect that Martin’s real concern is not with correspondence analysis. Rather I think his concern is with the plots.

However, in response to the article on correspondence analysis by Whitlark and Smith (2001), Collins (2002) further heavily criticised correspondence analysis, saying

    Correspondence analysis can make you blind . . . The technique is complex and obscure; all it offers is superficial gloss. A CA ‘map’ can be a crutch for those who feel unable to stand in the face of data . . . The main problem with CA . . . is that it usually blinds us to simple and important findings, concentrating instead on minor deviations from the main patterns.

The intention then is to deal with the inability of the principal coordinates to provide a meaningful interpretation of the distance between a row and column point in these plots. One approach is to consider a slightly different rescaling of the standard coordinates. One way in which this can be done is to consider one of a growing variety of graphical summaries that belong to the family of biplots. The next section discusses the biplot and its role in correspondence analysis.

4.5.3 Biplot coordinates

To provide a meaningful interpretation of the distance between a row coordinate and a column coordinate in a low-dimensional space, we can consider a rescaling of Equations 4.6 and 4.7 such that

\[ \tilde{\mathbf{F}} = \mathbf{A}\boldsymbol{\Lambda}_\lambda^{\gamma} , \qquad \tilde{\mathbf{G}} = \mathbf{B}\boldsymbol{\Lambda}_\lambda^{1-\gamma} \]

for $0 \leq \gamma \leq 1$. Such an approach, yielding a biplot display, was proposed in a principal component analysis setting by Gabriel (1971). For this scaling of the standard coordinates, we shall refer to the $(i, m)$th element of $\tilde{\mathbf{F}}$, $\tilde{f}_{im}$, as the $i$th row biplot coordinate along the $m$th axis of the display. Similarly, the $(j, m)$th element of $\tilde{\mathbf{G}}$, $\tilde{g}_{jm}$, is the $j$th column biplot coordinate along this axis. Therefore, with such scaling implemented, the departure of the matrix of Pearson ratios from unity may be decomposed such that

\[ \tilde{\mathbf{F}}\tilde{\mathbf{G}}^T = \mathbf{A}\boldsymbol{\Lambda}_\lambda\mathbf{B}^T = \boldsymbol{\Delta} - \mathbf{U} . \]

The left-hand side is just the Euclidean inner product of the row and column biplot coordinates. Note that, by considering $\tilde{\mathbf{F}}$ and $\tilde{\mathbf{G}}$, points in the correspondence plot that are close to the origin are consistent with Pearson ratios of 1.

Suppose we consider the construction of a biplot for Table 1.2, where $\gamma = 1/2$. The row and column biplot coordinates for this value are summarised in Tables 4.6 and 4.7, respectively. The inertia values for the three axes of the optimal biplot are 0.699, 0.299 and 0.050, representing 66.7%, 28.5% and 4.8%, respectively, of the total association between the categorical variables. The two-dimensional biplot, using the first two columns of Tables 4.6 and 4.7, is given by Figure 4.3. The configuration of points looks very similar to the configuration obtained using the standard coordinates (Figure 4.1) and the principal coordinates (Figure 4.2). However, the advantage of using this biplot is that the distance between a row and a column point now makes some sense, unlike in the correspondence plot obtained using principal coordinates. For this reason, the use of biplots in a correspondence analysis context has increased greatly over the past 10 years.
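The symmetric scaling can be sketched numerically for the first axis. The values below are taken from the text (the first-axis inertia value 0.699, which for $\gamma = 1/2$ equals the first singular value) and from the Axis 1 columns of Tables 4.2 and 4.3; the variable names are hypothetical.

```python
lambda_1 = 0.699                                  # first singular value (from the text)
gamma = 0.5                                       # symmetric biplot

a1 = [-1.023, -0.368, 0.668, 1.093, 1.901]        # Table 4.2, Axis 1 (rows)
b1 = [-0.847, 0.416, 1.799, 2.161]                # Table 4.3, Axis 1 (columns)

# Row and column biplot coordinates on Axis 1: a * lambda^gamma, b * lambda^(1 - gamma)
f_tilde_1 = [a * lambda_1 ** gamma for a in a1]
g_tilde_1 = [b * lambda_1 ** (1 - gamma) for b in b1]

# Axis 1 contribution to the inner product: f~_i1 g~_j1 = a_i1 * lambda_1 * b_j1
axis1_inner = [[f * g for g in g_tilde_1] for f in f_tilde_1]

print([round(x, 3) for x in f_tilde_1])
print([round(x, 3) for x in g_tilde_1])
```

Whatever value of $\gamma$ is chosen, the product of the two scalings is always $\lambda_1$, so `axis1_inner` does not depend on `gamma`; only the relative spread of the row and column points changes.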

This literature commenced with Gabriel’s (1971) proposition of the biplot. He considered the construction of a biplot with $\gamma = 1$, $\gamma = 1/2$ and $\gamma = 0$ (p. 458). When considering such specifications of $\gamma$ from a correspondence analysis perspective, Lombardo et al. (1996) referred to such plots as ‘row isometric’, ‘symmetric’ and ‘column isometric’ factorisations, respectively; see also Section 5.4.3 which describes these factorisations for non-symmetrical correspondence analysis. Gifi’s (1990, p. 268) Equations 8.25 and 8.26 point to a row isometric plot.

Table 4.6 The row (symmetric) biplot coordinates of Selikoff’s asbestos data in Table 1.2.

Exposure (years)   Axis 1   Axis 2   Axis 3
0-9                −0.855    0.518   −0.087
10-19              −0.308   −0.501    0.194
20-29               0.559   −0.318   −0.611
30-39               0.914   −0.404   −0.131
40+                 1.590    0.936    0.241

Table 4.7 The column (symmetric) biplot coordinates of Selikoff’s asbestos data in Table 1.2.

Asbestosis grade   Axis 1   Axis 2   Axis 3
None               −0.708    0.258   −0.012
Grade 1             0.348   −0.732    0.065
Grade 2             1.505    0.480   −0.440
Grade 3             1.808    1.184    0.775

[Figure 4.3: plot of the row categories (0-9, 10-19, 20-29, 30-39, 40+) and column categories (None, Grade 1, Grade 2, Grade 3). Recovered axis label: Principal Axis 2 (15.35%).]

Figure 4.3 Symmetric biplot coordinates for Selikoff’s asbestos data in Table 1.2.

Due to the different ways in which $\boldsymbol{\Lambda}_\lambda$ is scaled, the plot obtained from performing a simple correspondence analysis of $\mathbf{N}$ may not be considered a special type of biplot display, a point made by Greenacre (1993, p. 252) and Greenacre and Hastie (1987). Greenacre (1993) refers to the displays generated by considering $\gamma = 0$ and $\gamma = 1$ as an ‘asymmetric map’ since the rows and columns have not been weighted equivalently. However, the link between the principal coordinates of simple correspondence analysis and the biplot coordinates can be established by considering the following two cases. The ‘row isometric’ plot, obtained by setting $\gamma = 1$, can be obtained by jointly visualising the row principal coordinates and column standard coordinates. Similarly, the ‘column isometric’ plot, obtained by setting $\gamma = 0$, can be obtained by jointly plotting the row standard coordinates with the column principal coordinates.

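Using the same formulations as simple correspondence analysis, the variances of the row and column biplot coordinates are

\[ \mathrm{Var}(\tilde{\mathbf{F}}) = \mathrm{trace}\left(\tilde{\mathbf{F}}^T \mathbf{D}_I \tilde{\mathbf{F}}\right) = \mathrm{trace}\left(\boldsymbol{\Lambda}_\lambda^{2\gamma}\right) , \]

\[ \mathrm{Var}(\tilde{\mathbf{G}}) = \mathrm{trace}\left(\tilde{\mathbf{G}}^T \mathbf{D}_J \tilde{\mathbf{G}}\right) = \mathrm{trace}\left(\boldsymbol{\Lambda}_\lambda^{2(1-\gamma)}\right) , \]

respectively. Therefore, if a ‘symmetric map’ is chosen such that $\gamma = 1/2$, then the inertia associated with the $m$th axis of the biplot is

\[ \sum_{i=1}^{I} p_{i\cdot}\, \tilde{f}_{im}^{2} = \sum_{j=1}^{J} p_{\cdot j}\, \tilde{g}_{jm}^{2} = \lambda_m . \]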
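These variances follow in one step from the definition of the biplot coordinates and the constraints on the standard coordinates. The derivation below is a sketch in which $\tilde{\mathbf{F}} = \mathbf{A}\boldsymbol{\Lambda}_\lambda^{\gamma}$ and $\tilde{\mathbf{G}} = \mathbf{B}\boldsymbol{\Lambda}_\lambda^{1-\gamma}$ denote the row and column biplot coordinates.

```latex
\begin{align*}
\tilde{\mathbf{F}}^T \mathbf{D}_I \tilde{\mathbf{F}}
  &= \boldsymbol{\Lambda}_\lambda^{\gamma}
     \left(\mathbf{A}^T \mathbf{D}_I \mathbf{A}\right)
     \boldsymbol{\Lambda}_\lambda^{\gamma}
   = \boldsymbol{\Lambda}_\lambda^{\gamma}\,\mathbf{I}_{M^*}\,
     \boldsymbol{\Lambda}_\lambda^{\gamma}
   = \boldsymbol{\Lambda}_\lambda^{2\gamma} , \\
\tilde{\mathbf{G}}^T \mathbf{D}_J \tilde{\mathbf{G}}
  &= \boldsymbol{\Lambda}_\lambda^{1-\gamma}
     \left(\mathbf{B}^T \mathbf{D}_J \mathbf{B}\right)
     \boldsymbol{\Lambda}_\lambda^{1-\gamma}
   = \boldsymbol{\Lambda}_\lambda^{2(1-\gamma)} .
\end{align*}
```

Setting $\gamma = 1/2$ makes both right-hand sides equal to $\boldsymbol{\Lambda}_\lambda$, whose $m$th diagonal element is $\lambda_m$.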

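The relationship between $\tilde{\mathbf{F}}$ and $\mathbf{F}$ is easily obtained. By noting that $\tilde{\mathbf{F}} = \mathbf{A}\boldsymbol{\Lambda}_\lambda\boldsymbol{\Lambda}_\lambda^{\gamma-1}$, the principal and biplot row coordinates are related by

\[ \tilde{\mathbf{F}} = \mathbf{F}\boldsymbol{\Lambda}_\lambda^{\gamma-1} , \qquad \mathbf{F} = \tilde{\mathbf{F}}\boldsymbol{\Lambda}_\lambda^{1-\gamma} . \]

Similarly, the relationship between the principal and biplot column coordinates is

\[ \tilde{\mathbf{G}} = \mathbf{G}\boldsymbol{\Lambda}_\lambda^{-\gamma} , \qquad \mathbf{G} = \tilde{\mathbf{G}}\boldsymbol{\Lambda}_\lambda^{\gamma} . \]

Based on these results, the total inertia, $X^2/n$, of the contingency table may be expressed in terms of the row biplot coordinates such that

\[ \frac{X^2}{n} = \mathrm{trace}\left(\boldsymbol{\Lambda}_\lambda^{1-\gamma}\, \tilde{\mathbf{F}}^T \mathbf{D}_I \tilde{\mathbf{F}}\, \boldsymbol{\Lambda}_\lambda^{1-\gamma}\right) . \]

Similarly, the relationship between the total inertia and the column biplot coordinates is

\[ \frac{X^2}{n} = \mathrm{trace}\left(\boldsymbol{\Lambda}_\lambda^{\gamma}\, \tilde{\mathbf{G}}^T \mathbf{D}_J \tilde{\mathbf{G}}\, \boldsymbol{\Lambda}_\lambda^{\gamma}\right) . \]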


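Therefore, when $\gamma = 1$, producing a row isometric biplot in which $\tilde{\mathbf{F}} = \mathbf{F}$ and $\tilde{\mathbf{G}} = \mathbf{B}$, the total inertia is

\[ \frac{X^2}{n} = \mathrm{trace}\left(\mathbf{F}^T \mathbf{D}_I \mathbf{F}\right) = \mathrm{trace}\left(\boldsymbol{\Lambda}_\lambda\, \tilde{\mathbf{G}}^T \mathbf{D}_J \tilde{\mathbf{G}}\, \boldsymbol{\Lambda}_\lambda\right) \]

and therefore preserves the total inertia as derived from the row principal coordinates and the column standard coordinates.

For the column isometric biplot, where $\gamma = 0$ so that $\tilde{\mathbf{F}} = \mathbf{A}$ and $\tilde{\mathbf{G}} = \mathbf{G}$, the total inertia may be expressed in terms of the row standard and column principal coordinates by

\[ \frac{X^2}{n} = \mathrm{trace}\left(\boldsymbol{\Lambda}_\lambda\, \tilde{\mathbf{F}}^T \mathbf{D}_I \tilde{\mathbf{F}}\, \boldsymbol{\Lambda}_\lambda\right) = \mathrm{trace}\left(\mathbf{G}^T \mathbf{D}_J \mathbf{G}\right) . \tag{4.16} \]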

The net effect of these results is that the variation in the contingency table, as measured by the total inertia $X^2/n$, is calculated by the weighted sum of squares of the row and column coordinates, whether they are principal coordinates or biplot coordinates. As is the case for a correspondence plot constructed using row and column principal coordinates, for biplots, a point situated close to the origin indicates that the particular category is only a relatively small contributor to the association structure. A point in a biplot that is situated far from the origin reflects that its category is deemed an important contributor to the association.

There is an ever-increasing number of contributions in the literature that provide various adaptations and advances of the biplot. Many of them are for graphically depicting the association using simple correspondence analysis. For example, suppose we define, for $\beta \neq 0$, the following set of row and column coordinates:

\[ \tilde{\mathbf{F}} = \mathbf{A}\boldsymbol{\Lambda}_\lambda^{\gamma} , \]

\[ \beta\tilde{\mathbf{G}} = \beta\,\mathbf{B}\boldsymbol{\Lambda}_\lambda^{1-\gamma} . \]

These coordinates are referred to as the beta scaling of the original principal coordinates. Gower et al. (2011, Section 2.3.1) consider the characteristics of this scaling in detail. Note that, due to the notation used in their book, they referred to this method of constructing biplot coordinates as lambda scaling. The interested reader is also directed to Lipkovich and Smith (2002), Greenacre (1993, 2010b, 2012), Carlier and Kroonenberg (1996), Gower et al. (2010), Gower (1992, 2004), Graffelman and Aluja-Banet (2003) and Vicente-Villardon et al. (2006) for further variations, and characteristics, of the traditional biplot in simple correspondence analysis. Of special note is Greenacre’s (1993, Section 4) description of the interpretation of the biplot for correspondence analysis. Discussions of the biplot for contingency table analysis can be found in Blasius et al. (2009), Bradu and Gabriel (1978), Dossou-Gbete and Grorud (2002) and Gabriel (1995a, 2002). More general discussions of the biplot can be found in Daigle and Rivest (1992), Gabriel (2006), Gower and Harding (1988), Gower (1990, 1993) and Kroonenberg (1997). Aitchison and Greenacre (2002), Gabriel (1981, 1995b), Gabriel and Odoroff (1990), Osmond (1985) and Smith and Cornell (1993) provide general discussions of biplots and their link to other aspects of graphical statistics.

Gabriel (1971) pointed out that his biplots were not unique to the data being analysed. Biplots in the correspondence analysis context are therefore also not unique, a feature described by Greenacre (1993) and Van de Velden and Kiers (2005), so that different biplot configurations can be obtained from the same contingency table. This aspect was exploited by Van de Velden and Kiers (2005) who rotated the points in such a way that they lie close to an axis, thereby providing additional interpretability of the axis. Consider, for example, the rotation of the matrix of row biplot coordinates

$$ \tilde{\mathbf{F}}^{*} = \tilde{\mathbf{F}}\mathbf{T}, $$

where the orthogonal $M \times M$ matrix $\mathbf{T}$ defines the rotation structure of the rows in the biplot such that $\mathbf{T}^{T}\mathbf{T} = \mathbf{I}_{M}$. Kiers (1991) and Adachi (2004) also considered the issue of rotation but for multiple correspondence analysis. One may also consider the work of Lorenzo-Seva et al. (2009) on this topic.
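A rotation of this kind can be illustrated with a small R sketch. It is only a sketch under stated assumptions: the matrix `Tm` below is a randomly generated orthogonal matrix (which may include a reflection), not the interpretability-optimising rotation of Van de Velden and Kiers (2005), and `gamma = 0.5` is a hypothetical choice. The data are those of Table 1.2 as printed in Section 4.10.

```r
# Table 1.2 (asbestos data) as printed in the R output of Section 4.10
N <- matrix(c(310,  36,  0,  0,
              212, 158,  9,  0,
               21,  35, 17,  4,
               25, 102, 49, 18,
                7,  35, 51, 28),
            nrow = 5, byrow = TRUE)
P   <- N / sum(N)
r   <- rowSums(P)
cm  <- colSums(P)
dec <- svd(diag(1 / sqrt(r)) %*% (P - r %o% cm) %*% diag(1 / sqrt(cm)))
K   <- min(dim(N)) - 1
lam <- dec$d[1:K]

gamma <- 0.5                                         # illustrative (assumption)
Ft <- diag(1 / sqrt(r)) %*% dec$u[, 1:K] %*% diag(lam^gamma)

set.seed(1)
Tm <- qr.Q(qr(matrix(rnorm(K * K), K, K)))           # random orthogonal K x K matrix
Ft_rot <- Ft %*% Tm                                  # rotated row biplot coordinates

orth_err <- max(abs(crossprod(Tm) - diag(K)))        # checks T'T = I
dist_err <- max(abs(as.vector(dist(Ft)) - as.vector(dist(Ft_rot))))
c(orth_err = orth_err, dist_err = dist_err)
```

Both quantities are zero up to rounding error, which shows why the configuration is not unique: an orthogonal transformation moves the points relative to the axes while leaving every distance between row points unchanged.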

4.6 Distances

Now that we have described various ways in which to define coordinates for graphically depicting the association between categorical variables, we turn our attention to the interpretation of their distances from each other and from the origin. There is, however, one word of warning: while it may be appealing to interpret the distance between a row point and a column point, it is generally considered that no such interpretation exists (except when biplot coordinates are used). As we shall describe, there is much debate about the interpretability of these inter-variable distances. By contrast, it is well understood that one can interpret intra-variable distances, that is, distances between two row (or two column) points.

We shall commence our discussion of distances of points in a correspondence plot by examining their distance from the origin.

4.6.1 Distance from the origin

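The squared distance of the $i$th row profile from the origin is

$$ d_I^2(i, 0) = \sum_{j=1}^{J} \frac{1}{p_{\cdot j}} \left( \frac{p_{ij}}{p_{i\cdot}} - p_{\cdot j} \right)^2 = \sum_{m=1}^{M^*} \left( \sum_{j=1}^{J} p_{\cdot j} b_{jm}^2 \right) a_{im}^2 \lambda_m^2 , $$

which simplifies to

$$ d_I^2(i, 0) = \sum_{m=1}^{M^*} f_{im}^2 , \tag{4.17} $$

and is the squared Euclidean distance of the $i$th row principal coordinate from the origin. Therefore, Equation 4.9 becomes

$$ \frac{X^2}{n} = \sum_{i=1}^{I} p_{i\cdot}\, d_I^2(i, 0) . \tag{4.18} $$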

Hence, the larger the distance of the $i$th row profile in the $M^*$-dimensional correspondence plot from the origin, the larger the weighted discrepancy between the profile of the $i$th row category and the average profile of the column categories. It follows that points far from the origin indicate a clear deviation from what we would expect under complete independence, while a point near the origin indicates that the frequencies in row $i$ of the contingency table fit the independence hypothesis well. In fact, Lebart et al. (1984) showed by using confidence circles that the analyst is able to test graphically whether the position of a particular row or column category contributes to the hypothesis of independence for the contingency table. Beh (2001) demonstrated that these circles can be usefully applied for the correspondence analysis of ordinal two-way contingency tables. Generally, if the origin lies outside of the confidence circle, then that category can be said to contribute to the dependency between the row and column categories of the contingency table. If the origin lies within the circle, then that category does not make such a contribution. See Chapter 8 for more details on this issue.

Similar statements can be made for the Euclidean distance of the column principal coordinates to the origin.
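Equations 4.17 and 4.18 can be checked numerically. The R sketch below is illustrative only (the object names are not from the text): it uses the asbestos data of Table 1.2 as printed in Section 4.10, computes the squared chi-squared distance of each row profile from the average profile directly, and compares it with the sum of squared row principal coordinates.

```r
# Table 1.2 (asbestos data) as printed in the R output of Section 4.10
N <- matrix(c(310,  36,  0,  0,
              212, 158,  9,  0,
               21,  35, 17,  4,
               25, 102, 49, 18,
                7,  35, 51, 28),
            nrow = 5, byrow = TRUE,
            dimnames = list(c("0-9", "10-19", "20-29", "30-39", "40+"),
                            c("None", "Grade 1", "Grade 2", "Grade 3")))
P   <- N / sum(N)
r   <- rowSums(P)                              # row masses p_i.
cm  <- colSums(P)                              # column masses p_.j
dec <- svd(diag(1 / sqrt(r)) %*% (P - r %o% cm) %*% diag(1 / sqrt(cm)))
K   <- min(dim(N)) - 1
Fp  <- diag(1 / sqrt(r)) %*% dec$u[, 1:K] %*% diag(dec$d[1:K])  # row principal coords

profiles <- P / r                              # row profiles p_ij / p_i.

# Left-hand side of the first display: distance of each profile from the centroid
d2_profile <- apply(profiles, 1, function(p) sum((p - cm)^2 / cm))

# Equation 4.17: sum of squared principal coordinates
d2_coord <- rowSums(Fp^2)

# Equation 4.18: the weighted sum of these distances is the total inertia
total_inertia <- sum(r * d2_profile)
chi_squared   <- sum(N) * total_inertia

round(cbind(d2_profile, d2_coord), 4)
round(c(total_inertia = total_inertia, chi_squared = chi_squared), 3)
```

The two columns agree, and the weighted sum reproduces the total inertia and chi-squared statistic reported for these data in Section 4.10.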

4.6.2 Intra-variable distances and the $L_p$ metric

One of the advantages of using a correspondence plot is that the analyst is able to graphically establish those profiles that are similarly distributed and those profiles that are distributed differently using a low-dimensional subspace. Identifying profiles that are similarly, or differently, distributed in this space is commonly achieved by considering the squared Euclidean distance between two principal coordinates. For example, the squared Euclidean distance between the $i$th and $i'$th row profiles in an optimal correspondence plot is given by

$$ d_I^2(i, i') = \sum_{j=1}^{J} \frac{1}{p_{\cdot j}} \left( \frac{p_{ij}}{p_{i\cdot}} - \frac{p_{i'j}}{p_{i'\cdot}} \right)^2 \tag{4.19} $$

and we term this the intra-variable distance between the two row categories since they are from the same variable. The presence of $1/p_{\cdot j}$ ensures that the property of distributional equivalence is maintained (Benzecri, 1973; Lebart et al., 1984, p. 35). Such a chi-squared distance measure is also referred to as a weighted metric of the $L_2$ type. Equation 4.19 can be written in terms of the Euclidean distance between the principal coordinates of the $i$th row and the $i'$th row categories by

$$ d_I^2(i, i') = \sum_{m=1}^{M^*} \left( f_{im} - f_{i'm} \right)^2 \tag{4.20} $$

or equivalently

$$ d_I^2(i, i') = d_I^2(i, 0) + d_I^2(i', 0) - 2 \sum_{m=1}^{M^*} f_{im} f_{i'm} . $$


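Similar Euclidean distance measures can be derived between the $j$th and $j'$th column categories:

$$ d_J^2(j, j') = \sum_{i=1}^{I} \frac{1}{p_{i\cdot}} \left( \frac{p_{ij}}{p_{\cdot j}} - \frac{p_{ij'}}{p_{\cdot j'}} \right)^2 = \sum_{m=1}^{M^*} \left( g_{jm} - g_{j'm} \right)^2 . $$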

These results lead to the conclusion that when two row profiles, or two column profiles, are similar, then they will be positioned closely to one another in the correspondence plot. If two profiles are different, then they will be positioned at a distance from one another. Therefore, we can see that these distance measures preserve the property of distributional equivalence.
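The equivalence of Equations 4.19 and 4.20 can be verified with a short R sketch. It is an illustration using the asbestos data of Table 1.2 (as printed in Section 4.10) and illustrative object names: the chi-squared distances between row profiles are computed directly and compared with the ordinary Euclidean distances between the rows of the matrix of principal coordinates in the full $M^*$-dimensional space.

```r
# Table 1.2 (asbestos data) as printed in the R output of Section 4.10
N <- matrix(c(310,  36,  0,  0,
              212, 158,  9,  0,
               21,  35, 17,  4,
               25, 102, 49, 18,
                7,  35, 51, 28),
            nrow = 5, byrow = TRUE,
            dimnames = list(c("0-9", "10-19", "20-29", "30-39", "40+"),
                            c("None", "Grade 1", "Grade 2", "Grade 3")))
P   <- N / sum(N)
r   <- rowSums(P)
cm  <- colSums(P)
dec <- svd(diag(1 / sqrt(r)) %*% (P - r %o% cm) %*% diag(1 / sqrt(cm)))
K   <- min(dim(N)) - 1
Fp  <- diag(1 / sqrt(r)) %*% dec$u[, 1:K] %*% diag(dec$d[1:K])
rownames(Fp) <- rownames(N)

profiles <- P / r                                # row profiles

# Equation 4.19: dividing column j by sqrt(p_.j) turns the chi-squared
# distance into an ordinary Euclidean distance
d_chi <- dist(sweep(profiles, 2, sqrt(cm), "/"))

# Equation 4.20: Euclidean distance between row principal coordinates
d_euc <- dist(Fp)

round(as.matrix(d_chi), 3)
max(abs(as.vector(d_chi) - as.vector(d_euc)))
```

The maximum difference is zero up to rounding error. The agreement is exact only when all $M^*$ axes are used; in a two-dimensional plot the displayed distances approximate the chi-squared distances.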

The discussion of Fichet (2009) considers other distances of $L_p$ type that preserve the property of distributional equivalence. If we consider those metrics for the row profiles, Fichet (2009) discusses, for example, the following distances:

∙ The $L_1$-type distance discussed by Benzecri (1982):

$$ d_I(i, i') = \sum_{j=1}^{J} \left| \frac{p_{ij}}{p_{i\cdot}} - \frac{p_{i'j}}{p_{i'\cdot}} \right| . $$

∙ Hellinger’s distance described by Escofier (1978) and Domenges and Volle (1979):

$$ d_I^2(i, i') = \sum_{j=1}^{J} \left( \sqrt{\frac{p_{ij}}{p_{i\cdot}}} - \sqrt{\frac{p_{i'j}}{p_{i'\cdot}}} \right)^2 . $$

Rao (1995), Cuadras and Cuadras (2006) and Beh and Lombardo (2014) also considered the use of these distances for correspondence analysis. More will be said on the role of Hellinger’s distance in correspondence analysis in Chapter 9.
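The two alternative distances listed above are simple to compute from the row profiles. The following R sketch is illustrative only, using the asbestos data of Table 1.2 as printed in Section 4.10; the object names are not from the text.

```r
# Table 1.2 (asbestos data) as printed in the R output of Section 4.10
N <- matrix(c(310,  36,  0,  0,
              212, 158,  9,  0,
               21,  35, 17,  4,
               25, 102, 49, 18,
                7,  35, 51, 28),
            nrow = 5, byrow = TRUE,
            dimnames = list(c("0-9", "10-19", "20-29", "30-39", "40+"),
                            c("None", "Grade 1", "Grade 2", "Grade 3")))
profiles <- N / rowSums(N)                       # row profiles p_ij / p_i.

# L1-type distance: sum of absolute differences between two profiles
d_L1 <- dist(profiles, method = "manhattan")

# Squared Hellinger distance: squared Euclidean distance between
# the square roots of the profiles
d2_hellinger <- dist(sqrt(profiles))^2

round(as.matrix(d_L1), 3)
round(as.matrix(d2_hellinger), 3)
```

Because each profile sums to one, both measures lie between 0 and 2, and neither requires the column masses as weights.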

Yamakawa et al. (1998, 1999a, 1999b) considered the development and application of a technique that involves, for the analysis of two-way tables, minimising the distance between two row principal coordinates:

$$ d_I^2(i, i') = \sum_{m=1}^{M^*} \left( f_{im} - f_{i'm} \right)^S $$

for any positive value of $S$, where $S$ is determined using the interior point method for the neural solution algorithm. This is used as an alternative to Equation 4.20 and shifts the correspondence analysis solution from the $L_2$-norm planar space to the more general $L_S$-norm space.

4.6.3 Inter-variable distances

As the row and column coordinates can be simultaneously represented on the same correspondence plot, it seems reasonable to assume that one is able to interpret the distance between a row and a column profile. Such distances can be referred to as ‘inter-point’ distances or ‘inter-variable’ distances.

In the correspondence analysis literature, the topic of inter-point distances has rarely been discussed because it is accepted that such distances do not have a clear, or valid, interpretation. However, Carroll et al. (1986) proposed (and further clarified in 1987) the following row and column coordinates

$$ \mathbf{F} = \mathbf{D}_I^{-1/2}\tilde{\mathbf{A}}\left(\boldsymbol{\Lambda}_{\lambda} + \mathbf{I}_{M^*}\right)^{1/2}, \tag{4.21} $$

$$ \mathbf{G} = \mathbf{D}_J^{-1/2}\tilde{\mathbf{B}}\left(\boldsymbol{\Lambda}_{\lambda} + \mathbf{I}_{M^*}\right)^{1/2}, \tag{4.22} $$

where $\tilde{\mathbf{A}}^{T}\tilde{\mathbf{A}} = \mathbf{I}_{M^*}$ and $\tilde{\mathbf{B}}^{T}\tilde{\mathbf{B}} = \mathbf{I}_{M^*}$. Such coordinates, referred to here as CGS coordinates, are derived from first considering a particular approach to performing multiple correspondence analysis that involves transforming the original contingency table into its indicator matrix form (this transformation will be discussed in Chapter 10). When treating the row and column coordinates symmetrically, the graphical display of these coordinates is, strictly speaking, a special type of correspondence plot constructed using principal coordinates. In contrast with the simple correspondence analysis of a contingency table, the row space for the CGS coordinates exists in the space for the individuals in the study, while the column space encompasses all of the categories for the two variables. So, it does seem reasonable to consider the distance between two categories from different variables using CGS coordinates. However, this space is not one that is analogous to a space derived using the classical approach to simple correspondence analysis. It belongs to the space constructed for a special type of multiple correspondence analysis. Interestingly, while the focus of the argument of Carroll et al. (1986) is on a procedure for preserving the distance between a row and a column point, no measure of the inter-point distance (individuals and categories) was stated.

However, Greenacre (1989) argued, and reiterated his concerns in 1990, that the claims made by these authors are flawed for two reasons:

1. The measurable distance (whether it offers any clear interpretation or not) between a row and a column point in a low-dimensional space depends on the association structure between the variables. Since the CGS coordinates are motivated from a multiple correspondence analysis perspective, Greenacre (1990) stated that this association is ‘difficult to separate out’.

2. Again, since the CGS coordinates are motivated from a multiple correspondence analysis perspective, the principal inertias will be quite small for many practical situations (something we shall discuss in more detail in Chapter 10). Therefore, the low-dimensional space (which is a biplot, not a correspondence plot) will be quite poor at visualising the underlying association structure between the row and column categories.

Carroll et al. (1989, p. 367) responded to Greenacre’s remarks:

Perhaps even more disturbing are Greenacre’s current questions about interpreting the category distances in MCA. If current practice is ‘wrong’ here as well, we see relatively little practical value in interpreting MCA plots.

Interestingly, the issue of inter-point distances has since gained no traction in the correspondence analysis literature, although Hoffman et al. (1995) discussed the debate over the CGS coordinates and Greenacre’s refutation of them. They state that the difference in opinion between Carroll et al. (1986) and Greenacre can be described as follows:

Greenacre was trained in the ‘French’ school, which appears to correspond nicely with the fact that he takes simple correspondence analysis to be a more fundamental and satisfactory technique than MCA. It also means that he tends to emphasise the so called ‘chi-squared distance’ interpretation of within-set distances. CGS have their starting point in multidimensional scaling and unfolding theory, which naturally leads them to emphasise between-set distance relations.

Despite this apparent reconciliation of the differing philosophies, and to the best of our knowledge, inter-point distances have not since been a topic of discussion or consideration.

4.7 Transition formulae

The transition formulae (Hill, 1973), or barycentric formulae (Benzecri, 1992, p. 111), are equations for obtaining principal coordinates for one variable from the coordinates of the other variable. They may also be interpreted as a means of transitioning from one subspace to another. Here, we shall derive these formulae for principal coordinates and for biplot coordinates.

Suppose we consider first the derivation of transition formulae for the principal coordinates. By considering Equations 4.11 and 4.7, Equation 4.6 can be alternatively expressed by

$$ \mathbf{F} = \mathbf{D}_I^{-1}\mathbf{P}\mathbf{G}\boldsymbol{\Lambda}_{\lambda}^{-1} . \tag{4.23} $$

This formula enables us to derive the row principal coordinates, given the column principal coordinates. Similarly, we can obtain the column principal coordinates when the row principal coordinates are given by considering the transition formula:

$$ \mathbf{G} = \mathbf{D}_J^{-1}\mathbf{P}^{T}\mathbf{F}\boldsymbol{\Lambda}_{\lambda}^{-1} . \tag{4.24} $$

Equations 4.23 and 4.24 are referred to as transition formulae. To further aid in the interpretation of these formulae, suppose we consider the $(i, m)$th element of $\mathbf{F}$ from Equation 4.23. It can be expressed as

$$ f_{im} = \left(\frac{p_{i1}}{p_{i\cdot}}\right)\left(\frac{g_{1m}}{\lambda_m}\right) + \left(\frac{p_{i2}}{p_{i\cdot}}\right)\left(\frac{g_{2m}}{\lambda_m}\right) + \cdots + \left(\frac{p_{iJ}}{p_{i\cdot}}\right)\left(\frac{g_{Jm}}{\lambda_m}\right) . \tag{4.25} $$

We can see here that the row principal coordinates can be expressed as the weighted average of the row profiles defined in Chapter 3 and thus lead to an identical solution obtained using reciprocal averaging, the weights here being $g_{jm}/\lambda_m$ or, equivalently, $b_{jm}$. Thus, if $p_{ij}$ is relatively large, $g_{jm}$ will be heavily weighted and so will influence the magnitude of $f_{im}$. Similar comments can be made by considering the transition formulae describing the derivation of $\mathbf{G}$ from $\mathbf{F}$. While this may be the case, the general consensus among researchers of correspondence analysis is that no direct measurement of the distance between a row and a column profile is possible using the scaling approach of Equations 4.6 and 4.7. See, again, our discussion of this point in Section 4.6.3.
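The transition formulae for the principal coordinates can be checked with a brief R sketch. It is illustrative only, using the asbestos data of Table 1.2 as printed in Section 4.10 and illustrative object names: the coordinates are first obtained from the singular value decomposition of the standardised residuals, and Equations 4.23 to 4.25 are then used to recover one set from the other.

```r
# Table 1.2 (asbestos data) as printed in the R output of Section 4.10
N <- matrix(c(310,  36,  0,  0,
              212, 158,  9,  0,
               21,  35, 17,  4,
               25, 102, 49, 18,
                7,  35, 51, 28),
            nrow = 5, byrow = TRUE)
P   <- N / sum(N)
r   <- rowSums(P)
cm  <- colSums(P)
dec <- svd(diag(1 / sqrt(r)) %*% (P - r %o% cm) %*% diag(1 / sqrt(cm)))
K   <- min(dim(N)) - 1                 # M* = 3 non-trivial axes
lam <- dec$d[1:K]
Fp  <- diag(1 / sqrt(r))  %*% dec$u[, 1:K] %*% diag(lam)   # row principal coords
Gp  <- diag(1 / sqrt(cm)) %*% dec$v[, 1:K] %*% diag(lam)   # column principal coords

# Equation 4.23: rows from columns
F_from_G <- diag(1 / r) %*% P %*% Gp %*% diag(1 / lam)

# Equation 4.24: columns from rows
G_from_F <- diag(1 / cm) %*% t(P) %*% Fp %*% diag(1 / lam)

# Equation 4.25 written out for i = 1, m = 1: the row profile
# weights the rescaled column coordinates g_j1 / lambda_1
f11 <- sum((P[1, ] / r[1]) * (Gp[, 1] / lam[1]))

max(abs(F_from_G - Fp))
max(abs(G_from_F - Gp))
c(f11 = f11, direct = Fp[1, 1])
```

All differences are zero up to rounding error. The sign of each axis returned by `svd` is arbitrary, but the transition formulae hold whichever orientation is obtained because both sets of coordinates come from the same decomposition.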

In the case where the row and column biplot coordinates are considered, the transition formulae are

$$ \tilde{f}_{im} = \frac{1}{\lambda_m^{2\gamma}} \sum_{j=1}^{J} \frac{p_{ij}}{p_{i\cdot}}\, \tilde{g}_{jm} , \tag{4.26} $$

$$ \tilde{g}_{jm} = \frac{1}{\lambda_m^{2(1-\gamma)}} \sum_{i=1}^{I} \frac{p_{ij}}{p_{\cdot j}}\, \tilde{f}_{im} . \tag{4.27} $$

As we saw for the transition formulae of the principal coordinates, the biplot transition formulae of Equations 4.26 and 4.27 are weighted means of the row and column profiles.

4.8 Moments of the principal coordinates

The traditional view of moments is that they numerically describe particular characteristics of the probability distribution for a random variable. While the probability distribution of principal coordinates has largely not been of interest in the correspondence analysis literature, we can still treat them as a random variable and thus calculate, and interpret, their moments. In fact, identifying the location (or centre) and spread (or variation) of a set of principal coordinates in correspondence analysis is fundamental to understanding the general properties of the configuration of points and has traditionally been viewed from a geometric perspective rather than from a statistical one. This is, in part, due to the multi-dimensional nature of the technique. Here we shall introduce these well-understood quantities from a more statistical framework, and demonstrate how they can be expanded upon to provide a meaningful interpretation of the skewness coefficient and kurtosis coefficient of a configuration of points. We shall briefly describe the first four moments by adopting the notation of Larsen and Marx (1986, Section 3.11).

Our discussion is applicable to both the row and column principal coordinates, as well as biplot coordinates. Here we shall focus our attention on the moments of the row coordinate $f_{im}$, which is motivated by the discussion of Beh and Simonetti (2010, 2011) who considered them from a non-symmetrical correspondence analysis perspective. However, they were first introduced for use in simple correspondence analysis by Beh (2009). We start by defining the $r$th central moment of the row coordinates about its mean ($\mu$) as

$$ \mu'_r = E\left(f_{im}^{r}\right) = \sum_{i=1}^{I} \sum_{m=1}^{M^*} p_{i\cdot} \left( f_{im} - \mu \right)^{r} . $$


4.8.1 The mean of $f_{im}$

Define $\mu = \sum_{i=1}^{I} \sum_{m=1}^{M^*} p_{i\cdot} f_{im}$ to be the weighted mean of the row principal coordinates. The first moment, or mean, of these coordinates can be shown to be zero since

$$ \mu'_1 = E\left(f_{im}\right) = \sum_{i=1}^{I} \sum_{m=1}^{M^*} p_{i\cdot} \left( f_{im} - \mu \right) = 0 . $$

Therefore, the expected value of a row principal coordinate is zero and this property implies that the configuration of points from the simple correspondence analysis of a contingency table is centred at the origin of the correspondence plot. Similarly, it can also be shown that the column principal coordinates are centred around the origin so that

$$ \sum_{j=1}^{J} p_{\cdot j}\, g_{jm} = 0 $$

for $m = 1, 2, \ldots, M^*$.

4.8.2 The variance of $f_{im}$

The second moment, or variance, of the row coordinates in the optimal correspondence plot is

$$ \mu'_2 = \mathrm{Var}\left(f_{im}\right) = E\left(f_{im}^2\right) - \left[E\left(f_{im}\right)\right]^2 = \sum_{i=1}^{I} \sum_{m=1}^{M^*} p_{i\cdot} f_{im}^2 . $$

One may see that the right-hand side of this expression is just the total inertia of the contingency table, $X^2/n$; see Equation 4.16. Therefore, the second moment associated with the $m$th axis, $\mathrm{Var}_m(f_{im}) = \sum_{i=1}^{I} p_{i\cdot} f_{im}^2 = \lambda_m^2$, is the principal inertia of the $m$th axis. It may therefore be used to quantify the proportion of the total inertia that the $m$th principal axis makes in graphically depicting the association, so that $\mathrm{Var}(f_{im}) = \sum_{m=1}^{M^*} \mathrm{Var}_m(f_{im})$. Similarly, the contribution that the $i$th row category makes to the total inertia may be calculated as $\mathrm{Var}_i(f_{im}) = \sum_{m=1}^{M^*} p_{i\cdot} f_{im}^2$ so that $\mathrm{Var}(f_{im}) = \sum_{i=1}^{I} \mathrm{Var}_i(f_{im})$. The cumulative nature of these variances implies that the row principal coordinates may be considered as independent of each other. In fact, due to the singular value decomposition of the Pearson residuals, the total inertia (variance) calculated from the row principal coordinates will be exactly the same as the total inertia calculated from the column principal coordinates.
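The first two moments can be computed directly from the row principal coordinates. The R sketch below is illustrative, using the asbestos data of Table 1.2 as printed in Section 4.10 and illustrative object names; it checks that each axis is centred at zero and that the axis and row variances both sum to the total inertia.

```r
# Table 1.2 (asbestos data) as printed in the R output of Section 4.10
N <- matrix(c(310,  36,  0,  0,
              212, 158,  9,  0,
               21,  35, 17,  4,
               25, 102, 49, 18,
                7,  35, 51, 28),
            nrow = 5, byrow = TRUE)
P   <- N / sum(N)
r   <- rowSums(P)                       # row masses p_i.
cm  <- colSums(P)
dec <- svd(diag(1 / sqrt(r)) %*% (P - r %o% cm) %*% diag(1 / sqrt(cm)))
K   <- min(dim(N)) - 1
lam <- dec$d[1:K]
Fp  <- diag(1 / sqrt(r)) %*% dec$u[, 1:K] %*% diag(lam)

# r * Fp multiplies row i of Fp by the mass p_i.
mean_axis <- colSums(r * Fp)            # weighted mean on each axis
var_axis  <- colSums(r * Fp^2)          # Var_m: principal inertia of axis m
var_row   <- rowSums(r * Fp^2)          # Var_i: contribution of row i
total_inertia <- sum(var_axis)

round(mean_axis, 10)
round(rbind(var_axis, lambda_sq = lam^2), 3)
round(c(from_axes = total_inertia, from_rows = sum(var_row)), 3)
```

The axis variances reproduce the principal inertias of the asbestos data, and either decomposition (by axis or by row category) returns the same total inertia.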


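4.8.3 The skewness of $f_{im}$

The coefficient of skewness for the row coordinates, $\gamma_1 = \mathrm{Skew}(f_{im})$, in the optimal correspondence plot can be calculated by considering the third central moment:

$$ \gamma_1 = \frac{\mu'_3}{\sigma^3} = \frac{\sum_{i=1}^{I} \sum_{m=1}^{M^*} p_{i\cdot} f_{im}^3}{\left( \sqrt{\sum_{i=1}^{I} \sum_{m=1}^{M^*} p_{i\cdot} f_{im}^2} \right)^{3}} = \left( \frac{n}{X^2} \right)^{3/2} \sum_{i=1}^{I} \sum_{m=1}^{M^*} p_{i\cdot} f_{im}^3 . $$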

Except for the discussion of Beh and Simonetti (2011), this quantity has not been considered in the correspondence analysis literature. It may be used to quantify how symmetrically orientated the configuration of points is within the optimal correspondence plot. For example, a skewness coefficient of $\gamma_1 = 0$ will indicate that the points are evenly spread around the origin of the optimal correspondence plot.

The skewness of the configuration of points for each of the $M^*$ axes may also be determined and is generally of more practical use than calculating the skewness coefficient of the optimal plot. For example,

$$ \mathrm{Skew}_m\left(f_{im}\right) = \left( \frac{n}{X^2} \right)^{3/2} \sum_{i=1}^{I} p_{i\cdot} f_{im}^3 $$

is the skewness coefficient of the configuration of points along the $m$th axis so that the skewness of the configuration in the optimal correspondence plot is $\mathrm{Skew}(f_{im}) = \sum_{m=1}^{M^*} \mathrm{Skew}_m(f_{im})$. If, for the $m$th axis, the skewness coefficient $\mathrm{Skew}_m(f_{im})$ is negative, the axis is dominated by points whose $m$th coordinate is positive. Note that if the magnitude of the skewness of two axes is equivalent but of different sign, then the skewness coefficient for a two-dimensional correspondence plot can be zero. However, if the skewness coefficient of these two dimensions is positive, the configuration will be dominated by points lying in the bottom-left quadrant of the correspondence plot. Similarly, if there is a negative skewness coefficient for each of the first two principal axes, then the two-dimensional plot will also have a negative skewness coefficient and the configuration of points will be dominated in the top-right quadrant.
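The axis-wise skewness coefficients can be computed as follows. This R sketch is illustrative only, using the asbestos data of Table 1.2 as printed in Section 4.10 and illustrative object names. One point to keep in mind is an assumption about the software rather than the method: the orientation of each axis returned by `svd` is arbitrary, so the sign of an axis-wise skewness coefficient may be reversed on another platform, while its magnitude is unaffected.

```r
# Table 1.2 (asbestos data) as printed in the R output of Section 4.10
N <- matrix(c(310,  36,  0,  0,
              212, 158,  9,  0,
               21,  35, 17,  4,
               25, 102, 49, 18,
                7,  35, 51, 28),
            nrow = 5, byrow = TRUE)
P   <- N / sum(N)
r   <- rowSums(P)
cm  <- colSums(P)
dec <- svd(diag(1 / sqrt(r)) %*% (P - r %o% cm) %*% diag(1 / sqrt(cm)))
K   <- min(dim(N)) - 1
Fp  <- diag(1 / sqrt(r)) %*% dec$u[, 1:K] %*% diag(dec$d[1:K])

sigma2 <- sum(r * Fp^2)                           # total inertia X^2 / n

# Skew_m = (n / X^2)^(3/2) * sum_i p_i. f_im^3, one value per axis
skew_axis  <- colSums(r * Fp^3) / sigma2^(3 / 2)
skew_total <- sum(skew_axis)                      # skewness of the optimal plot

round(skew_axis, 3)
round(skew_total, 3)
```

In absolute value the coefficient for the first axis agrees with the entry for ‘Axis 1’ reported for the row coordinates in Section 4.8.5.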

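4.8.4 The kurtosis of $f_{im}$

The coefficient of kurtosis, measured using the fourth central moment, for the row coordinates, $\gamma_2 = \mathrm{Kurt}(f_{im}) = \mu'_4/\sigma^4$, in the optimal correspondence plot can be calculated as

$$ \gamma_2 = \frac{\sum_{i=1}^{I} \sum_{m=1}^{M^*} p_{i\cdot} f_{im}^4}{\left( \sum_{i=1}^{I} \sum_{m=1}^{M^*} p_{i\cdot} f_{im}^2 \right)^{2}} $$

or more simply, since the denominator is the square of the total inertia, as

$$ \gamma_2 = \left( \frac{n}{X^2} \right)^{2} \sum_{i=1}^{I} \sum_{m=1}^{M^*} p_{i\cdot} f_{im}^4 . $$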

While there has been much discussion on the interpretation of the kurtosis coefficient (see, for example, Bickel and Lehmann (1975)), we shall consider its interpretation as meaning the extent to which the configuration of points is ‘peaked’ or ‘flattened’ relative to the origin. A detailed discussion on the various measures, issues and interpretations of the kurtosis coefficient can be found in MacGillivray (1986), Ruppert (1987) and Balanda and MacGillivray (1988).

In the context of correspondence analysis, we shall deem a kurtosis coefficient greater than 3 to mean that the row coordinates are clustered towards the origin of the display, indicating that there may be a number of row categories that do not contribute to the association structure of the variables. Such a configuration of points is indicative of a leptokurtic display. Similarly, a kurtosis coefficient less than 3 will suggest that the configuration is not clustered near the origin, thus indicating that many of the categories contribute to the association. Thus, such a display may be referred to as a platykurtic correspondence plot. Identification of which points produce such a display may be made by considering the variance of that point. More formal procedures, including the 95% (say) confidence circles of Lebart et al. (1984), the elliptical regions of Beh (2010) or Ringrose’s (1992, 1996, 2012) bootstrap confidence regions may also be considered for a low-dimensional plot. See Chapter 8 for a detailed discussion on these regions.

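The kurtosis coefficient for each of the $M^*$ axes of the optimal correspondence plot may also be calculated. For the $m$th axis, it has a kurtosis coefficient of

$$ \mathrm{Kurt}_m\left(f_{im}\right) = \left( \frac{n}{X^2} \right)^{2} \sum_{i=1}^{I} p_{i\cdot} f_{im}^4 $$

so that $\mathrm{Kurt}_1(f_{i1}) > \mathrm{Kurt}_2(f_{i2}) > \cdots > \mathrm{Kurt}_{M^*}(f_{iM^*}) > 0$. Similarly, the kurtosis coefficient for the $i$th row principal coordinate is

$$ \mathrm{Kurt}_i\left(f_{im}\right) = \left( \frac{n}{X^2} \right)^{2} \sum_{m=1}^{M^*} p_{i\cdot} f_{im}^4 . $$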
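The kurtosis coefficients can be computed in the same way as the lower moments. The following R sketch is illustrative only, using the asbestos data of Table 1.2 as printed in Section 4.10 and illustrative object names. It normalises the fourth moment by $\sigma^4$, the square of the total inertia, in line with the ratio $\mu'_4/\sigma^4$ that defines $\gamma_2$.

```r
# Table 1.2 (asbestos data) as printed in the R output of Section 4.10
N <- matrix(c(310,  36,  0,  0,
              212, 158,  9,  0,
               21,  35, 17,  4,
               25, 102, 49, 18,
                7,  35, 51, 28),
            nrow = 5, byrow = TRUE,
            dimnames = list(c("0-9", "10-19", "20-29", "30-39", "40+"),
                            c("None", "Grade 1", "Grade 2", "Grade 3")))
P   <- N / sum(N)
r   <- rowSums(P)
cm  <- colSums(P)
dec <- svd(diag(1 / sqrt(r)) %*% (P - r %o% cm) %*% diag(1 / sqrt(cm)))
K   <- min(dim(N)) - 1
Fp  <- diag(1 / sqrt(r)) %*% dec$u[, 1:K] %*% diag(dec$d[1:K])

sigma2 <- sum(r * Fp^2)                       # total inertia X^2 / n

kurt_axis  <- colSums(r * Fp^4) / sigma2^2    # Kurt_m, one value per axis
kurt_row   <- rowSums(r * Fp^4) / sigma2^2    # Kurt_i, one value per row category
kurt_total <- sum(kurt_axis)                  # kurtosis of the optimal plot

round(kurt_axis, 3)
round(kurt_row, 3)
round(kurt_total, 3)
```

Because only even powers of the coordinates are involved, these values do not depend on the arbitrary orientation of the axes returned by `svd`. The axis-wise values agree with the kurtosis row reported for the row coordinates in Section 4.8.5.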

4.8.5 Moments of the asbestos data

Consider again the principal coordinates of the rows and columns of Table 1.2. They are summarised in Tables 4.4 and 4.5, respectively. Their moments are summarised in Tables 4.8 and 4.9, where the columns labelled ‘Axis 1’, ‘Axis 2’, ‘2-D plot’ and ‘Optimal plot’ refer to the moments associated with the first principal axis, second principal axis, the two-dimensional correspondence plot and the optimal correspondence plot, respectively. We can see that the first row of Tables 4.8 and 4.9, the moment reflecting spread, consists of just the principal inertias of the first two axes of a correspondence plot, their sum, and the total inertia of the contingency table.

Table 4.8 The moments of the row principal coordinates of Table 1.2.

Moment      Axis 1   Axis 2   2-D plot   Optimal plot
Spread       0.489    0.089      0.578          0.581
Skewness     0.497    0.028      0.525          0.525
Kurtosis     1.434    0.035      1.469          1.469

Table 4.9 The moments of the column principal coordinates of Table 1.2.

Moment      Axis 1   Axis 2   2-D plot   Optimal plot
Spread       0.489    0.089      0.578          0.581
Skewness     0.634   −0.012      0.622          0.622
Kurtosis     1.726    0.050      1.776          1.777

Consider now the skewness of the row and column principal coordinates. Since the skewness of the row coordinates along the first principal axis is not zero (0.497), the configuration of row points along this axis is not evenly spread around the origin. In fact, the positive coefficient tells us that the coordinates are predominantly positive rather than negative; Figure 4.2 indeed shows this.

4.9 How many dimensions to use?

Once the coordinates have been derived, an important issue is to determine how many dimensions are appropriate for visualising the association in the contingency table. There are a variety of objective and subjective approaches that one may use. However, the choice of approach comes down to whether the association can be viewed using two or, at most, three dimensions. Therefore, the most commonly adopted approach is to just accept the two, or three, dimensions that are constructed. One may argue that since the classical approach to correspondence analysis ensures that the percentage contribution of the axes to the total inertia is of decreasing order as higher dimensions are considered, the analyst is guaranteed to have a correspondence plot that maximises the association structure between the variables.

However, it sometimes happens that the quality of a two-, or three-, dimensional plot is poor. One may consider a plot that accounts for at least 70% of the association to be adequate, but the analyst may need to consider alternative graphical strategies if the correspondence plot is of poorer quality. Even with this in mind, there are many practical applications of correspondence analysis that go so far as to completely ignore the quality of their correspondence plot, and just produce a graphical display consisting of two dimensions, irrespective of whether it accounts for 50% or 99% of the association in their contingency table (we shall not highlight examples of where this has occurred). Where the analyst feels that their correspondence plot is of poor quality, there are techniques available to help visualise the association that would typically require multiple dimensions. In the correspondence analysis literature, some popular choices include the production of a dendrogram and Andrews curves.


If the analyst wishes to formally test how many dimensions are required to provide a sufficient graphical display of the association in their contingency table, there are a number of strategies that can be considered:

∙ The permutation test proposed by Takane and Jung (2009) which effectively calculates Monte Carlo $p$-values for each of the singular values. Ciampi et al. (2005) also consider a similar type of simulation study.

∙ Lorenzo-Seva (2011) proposed a parallel analysis approach based on the methodology given in Horn (1965).

∙ Blasius (1994, pp. 28-29) suggested choosing those dimensions whose percentage contribution to the total inertia exceeds the average percentage of total inertia:

$$ \text{average} = \frac{100}{\min(I, J) - 1} . $$

However, such an approach, which is analogous to Kaiser’s rule (Kaiser, 1960) sometimes adopted in principal component analysis, will tend to overestimate the number of dimensions required to adequately visualise the association.

∙ One may reconstitute the cell values of the contingency table using an appropriate model that incorporates the singular values (or principal inertia values) of the contingency table and perform a goodness-of-fit test. For a test of the quality of a two-dimensional correspondence plot, an appropriate model is

$$ \hat{p}_{ij}(2) = p_{i\cdot}\, p_{\cdot j} \left( 1 + \lambda_1 a_{i1} b_{j1} + \lambda_2 a_{i2} b_{j2} \right) $$

and can be assessed using the chi-squared statistic

$$ X^2(M) = n \sum_{i=1}^{I} \sum_{j=1}^{J} \frac{\left( p_{ij} - \hat{p}_{ij}(2) \right)^2}{\hat{p}_{ij}(2)} , $$

where $M = 2$ here. The statistical significance of this statistic may be determined by comparing it with the chi-squared distribution with $(I - M - 1)(J - M - 1)$ degrees of freedom.

∙ Lebart et al. (1984) argue that, when performing correspondence analysis, $n\lambda_m^2$ has a distribution that is approximately that of the $m$th eigenvalue of a Wishart matrix with parameters $I - 1$ and $J - 1$. This is referred to as the Fisher-Hsu law (Fisher, 1939; Hsu, 1939). Lebart (1976) provides additional discussion on this particular approach.
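Two of the simpler criteria mentioned in this section, the 70% rule of thumb and the average-percentage rule of Blasius (1994), can be sketched in a few lines of R. The fragment is illustrative only: it uses the asbestos data of Table 1.2 as printed in Section 4.10, the object names are not from the text, and it does not attempt the permutation, parallel analysis or goodness-of-fit approaches.

```r
# Table 1.2 (asbestos data) as printed in the R output of Section 4.10
N <- matrix(c(310,  36,  0,  0,
              212, 158,  9,  0,
               21,  35, 17,  4,
               25, 102, 49, 18,
                7,  35, 51, 28),
            nrow = 5, byrow = TRUE)
P   <- N / sum(N)
r   <- rowSums(P)
cm  <- colSums(P)
dec <- svd(diag(1 / sqrt(r)) %*% (P - r %o% cm) %*% diag(1 / sqrt(cm)))
K   <- min(dim(N)) - 1

inertia <- dec$d[1:K]^2                      # principal inertias
pct     <- 100 * inertia / sum(inertia)      # percentage of total inertia
cum_pct <- cumsum(pct)

# Rule of thumb: smallest number of axes explaining at least 70%
n_axes_70 <- which(cum_pct >= 70)[1]

# Blasius (1994): keep axes whose percentage exceeds the average
average      <- 100 / (min(dim(N)) - 1)
keep_blasius <- which(pct > average)

round(rbind(pct, cum_pct), 3)
c(n_axes_70 = n_axes_70, average = average)
keep_blasius
```

For these data the average percentage is 100/3, and only the first axis exceeds it. A two-dimensional plot nevertheless remains the natural display, since the second axis carries almost all of the remaining inertia.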

Despite the increasing number of approaches that one may adopt for formally determining how many dimensions are required for adequately visualising the association between the variables of a contingency table, one may adopt the philosophy that the simple things in life are often the best. Others are also in agreement with this view. As pointed out by Van Rijckevorsel and De Leeuw (1988, p. 89), Jolliffe (1986, Chapter 6), who considered the same issue but for principal component analysis, says

the rules which have more sound statistical foundations seem, at present, to offer little advantage over the simpler.

Benzecri (1992, p. 398) believes that the decision should be made based on the researcher’s personal judgement rather than by any mathematical procedure. However, one must be aware that basing the decision purely on a subjective rationale may result in important, or valuable, information at higher dimensions being neglected because of the failure to investigate these higher dimensions. As we have stated above, this commonly arises in many practical applications of correspondence analysis.

We highly recommend that the analyst, especially one whose experience with multivariate data analysis is limited, construct a scree plot of the principal inertias. Originally proposed by Cattell (1966), the scree plot is a type of bar chart of the singular values, or the principal inertia values, and allows the analyst to quickly view those dimensions that, relative to the other dimensions, make a strong or weak contribution to the total inertia. Some of the statistical programs and R functions that perform correspondence analysis include the scree plot as part of their output (see Chapter 12 for an example of this feature). Variations of the scree plot include the lev plot of Craddock and Flood (1969) which, from a correspondence analysis perspective, considers the logarithm of the principal inertias; see, for example, Van Rijckevorsel and De Leeuw (1988, p. 89) who discuss the use of the scree plot and the lev plot.
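A scree plot and a lev plot take only a few lines of base R. The sketch below is one possible construction rather than the output of any particular package: it uses the asbestos data of Table 1.2 as printed in Section 4.10, and the object names and plotting choices are illustrative.

```r
# Table 1.2 (asbestos data) as printed in the R output of Section 4.10
N <- matrix(c(310,  36,  0,  0,
              212, 158,  9,  0,
               21,  35, 17,  4,
               25, 102, 49, 18,
                7,  35, 51, 28),
            nrow = 5, byrow = TRUE)
P   <- N / sum(N)
r   <- rowSums(P)
cm  <- colSums(P)
dec <- svd(diag(1 / sqrt(r)) %*% (P - r %o% cm) %*% diag(1 / sqrt(cm)))
K   <- min(dim(N)) - 1

inertia <- dec$d[1:K]^2                          # principal inertias
names(inertia) <- paste("Axis", 1:K)

op <- par(mfrow = c(1, 2))
# Scree plot: bar chart of the principal inertias
barplot(inertia, ylab = "Principal inertia", main = "Scree plot")
# Lev plot: the same values on a logarithmic scale
plot(1:K, log(inertia), type = "b", xaxt = "n",
     xlab = "", ylab = "log(principal inertia)", main = "Lev plot")
axis(1, at = 1:K, labels = names(inertia))
par(op)
```

For these data the bars drop sharply after the first axis and are negligible by the third, which is the visual counterpart of the percentages reported in Section 4.10.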

4.10 R code

The computation of correspondence analysis has a relatively long history; Chapter 12 provides an extensive discussion on this issue. More recent contributions have focused on the development of code for use with R. A comparison of some of these approaches, with other computing tools, will also be discussed in Chapter 12. Here we shall provide some simple R code that performs a correspondence analysis, including the two-dimensional correspondence plot. By considering the rgl library in R, we may also plot three-dimensional correspondence plots.

One thing to keep in mind is that because of the svd function in R, the coding of correspondence analysis is performed by applying a singular value decomposition on the standardised residuals, as described in Section 4.4, rather than via Pearson ratios or his contingencies.

What follows is a very simple R function, simpleca.exe, that performs a correspondence analysis on a specified two-way contingency table, $\mathbf{N}$. The input parameters are as follows:

∙ The contingency table, N.

∙ scaleplot which increases (or decreases) the viewing area of the plot. By default scaleplot = 1.2 so that the scale of the axes is 20% greater than the bounds of the coordinates.

∙ The dimensions used to construct the two-dimensional correspondence plot, dim1 and dim2. By default the first two axes are used so that dim1 = 1 and dim2 = 2.

When using the code to perform a correspondence analysis on a two-way contingency table, it produces the following as its output:

∙ The contingency table analysed, N.

∙ The row principal coordinates, f.

∙ The column principal coordinates, g.

∙ Pearson’s chi-squared statistic, Chi.Squared, and its $p$-value, P.Value.

∙ The total inertia, $X^2/n$, Total.Inertia.

∙ A two-dimensional correspondence plot.

∙ For each of the principal axes, the principal inertia, its percentage contribution to the total inertia and the cumulative principal inertia summarised in Inertia.

The following is the R function, simpleca.exe:

simpleca.exe <- function(N, scaleplot = 1.2, dim1 = 1, dim2 = 2){

  I <- nrow(N)   # Number of rows of table
  J <- ncol(N)   # Number of columns of table

  Inames <- dimnames(N)[1]   # Row category names
  Jnames <- dimnames(N)[2]   # Column category names

  n <- sum(N)      # Total number of classifications in the table
  p <- N * (1/n)   # Matrix of joint relative proportions

  Imass <- as.matrix(apply(p, 1, sum))   # Row masses
  Jmass <- as.matrix(apply(p, 2, sum))   # Column masses

  ItJ <- Imass %*% t(Jmass)   # Proportions expected under independence
  y <- p - ItJ
  dI <- diag(c(Imass), nrow = I, ncol = I)
  dJ <- diag(c(Jmass), nrow = J, ncol = J)
  Ih <- Imass^(-0.5)
  Jh <- Jmass^(-0.5)
  dIh <- diag(c(Ih), nrow = I, ncol = I)
  dJh <- diag(c(Jh), nrow = J, ncol = J)

  x <- dIh %*% y %*% dJh   # Matrix of Pearson residuals

  sva <- svd(x)            # SVD of the Pearson residuals

  dmu <- diag(sva$d)

  ##########################################################################
  #                                                                        #
  #                        Principal Coordinates                           #
  #                                                                        #
  ##########################################################################

  f <- dIh %*% sva$u %*% dmu   # Row Principal Coordinates for Classical CA
  g <- dJh %*% sva$v %*% dmu   # Column Principal Coordinates for Classical CA

  # Operator precedence makes 1:min(I,J) - 1 the labels 0, 1, ..., min(I,J) - 1
  dimnames(f) <- list(paste(Inames[[1]]), paste(1:min(I,J) - 1 ))
  dimnames(g) <- list(paste(Jnames[[1]]), paste(1:min(I,J) - 1 ))

  ##########################################################################
  #                                                                        #
  #   Calculating the total inertia, Pearson chi-squared statistic, its    #
  #   p-value and the percentage contribution of the axes to the inertia   #
  #                                                                        #
  ##########################################################################

  Principal.Inertia <- diag(t(f[, 1:min(I - 1, J - 1)]) %*% dI %*%
                            f[, 1:min(I - 1, J - 1)])

  Total.Inertia <- sum(Principal.Inertia)
  Chi.Squared <- n * Total.Inertia   # Pearson's chi-squared statistic
  Percentage.Inertia <- (Principal.Inertia/Total.Inertia) * 100
  Cumm.Inertia <- cumsum(Percentage.Inertia)
  Inertia <- cbind(Principal.Inertia, Percentage.Inertia, Cumm.Inertia)
  dimnames(Inertia)[1] <- list(paste("Axis", 1:min(I - 1, J - 1), sep = " "))
  p.value <- 1 - pchisq(Chi.Squared, (I - 1) * (J - 1))

  ##########################################################################
  #                                                                        #
  #      Here we construct the two-dimensional correspondence plot         #
  #                                                                        #
  ##########################################################################

  par(pty = "s")
  plot(0, 0, pch = " ",
       xlim = scaleplot*range(f[, dim1], f[, dim2], g[, dim1], g[, dim2]),
       ylim = scaleplot*range(f[, dim1], f[, dim2], g[, dim1], g[, dim2]),
       xlab = paste("Principal Axis", dim1, "(",
                    round(Percentage.Inertia[dim1], digits = 2), "%)"),
       ylab = paste("Principal Axis", dim2, "(",
                    round(Percentage.Inertia[dim2], digits = 2), "%)"))

  # Plot the points on the requested axes, dim1 and dim2
  points(f[, dim1], f[, dim2], pch = "+", col = "blue")
  text(f[, dim1], f[, dim2], labels = Inames[[1]], pos = 4, col = "blue")
  points(g[, dim1], g[, dim2], pch = "*", col = "red")
  text(g[, dim1], g[, dim2], labels = Jnames[[1]], pos = 4, col = "red")
  abline(h = 0, v = 0)

  list(N = N, f = round(f, digits = 3), g = round(g, digits = 3),
       Chi.Squared = round(Chi.Squared, digits = 3),
       Total.Inertia = round(Total.Inertia, digits = 3),
       P.Value = round(p.value, digits = 3),
       Inertia = round(Inertia, digits = 3))

}

For example, the application of this code to Table 1.2, which is specified as the R object selikoff.dat, gives the following numerical output:

> simpleca.exe(selikoff.dat)
$N
      None Grade 1 Grade 2 Grade 3
0-9    310      36       0       0
10-19  212     158       9       0
20-29   21      35      17       4
30-39   25     102      49      18
40+      7      35      51      28

$f
           0      1      2 3
0-9   -0.715  0.283 -0.020 0
10-19 -0.258 -0.274  0.043 0
20-29  0.468 -0.174 -0.137 0
30-39  0.764 -0.221 -0.029 0
40+    1.330  0.512  0.054 0

$g
             0      1      2 3
None    -0.592  0.141 -0.003 0
Grade 1  0.291 -0.400  0.015 0
Grade 2  1.258  0.262 -0.098 0
Grade 3  1.512  0.647  0.173 0

$Chi.Squared
[1] 648.812

$Total.Inertia
[1] 0.581

$P.Value
[1] 0

$Inertia
       Principal.Inertia Percentage.Inertia Cumm.Inertia
Axis 1             0.489             84.215       84.215
Axis 2             0.089             15.352       99.567
Axis 3             0.003              0.433      100.000

From this output, we can see that the chi-squared statistic of the contingency table is 648.812. With a 𝑝-value that is less than 0.0001, this indicates that there is a statistically significant association between the two categorical variables. This same conclusion can also be obtained using the chisq.test function discussed in Chapter 2. The correspondence analysis yields a total inertia of 0.581, of which the first principal axis accounts for over 84% of the association. The inclusion of the second axis accounts for a further 15.35%, which means that a two-dimensional correspondence plot visually describes 99.567% of the association that exists in Table 1.2. Figure 4.2 gives this two-dimensional display of the row and column principal coordinates, and the axes reflect these percentages. We can see that long-term exposure to asbestos leads to a more severe case of asbestosis. Specifically, Figure 4.2 indicates that a New York worker who is exposed to asbestos for no more than 9 years is not likely to be diagnosed with asbestosis. However, the more severe grades are most likely to arise in workers who have been exposed for at least 20 years. This helps to confirm Selikoff’s original findings and aids in the validation of his ‘20 year’ rule discussed in Chapter 1.
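As a rough cross-check of these figures, the following sketch rebuilds the table from the $N output above and recomputes the chi-squared statistic, the total inertia and the cumulative percentages. It is not part of the original code; the object names (selikoff, P, X, mu and so on) are chosen here purely for illustration.

```r
# Selikoff's asbestos data, typed in from the $N output above
# (rows: years of exposure; columns: grade of asbestosis).
selikoff <- matrix(c(310,  36,  0,  0,
                     212, 158,  9,  0,
                      21,  35, 17,  4,
                      25, 102, 49, 18,
                       7,  35, 51, 28),
                   nrow = 5, byrow = TRUE,
                   dimnames = list(c("0-9", "10-19", "20-29", "30-39", "40+"),
                                   c("None", "Grade 1", "Grade 2", "Grade 3")))

n <- sum(selikoff)

# Pearson chi-squared statistic. Warnings about small expected counts are
# suppressed because only the value of the statistic is needed here.
chi2 <- unname(suppressWarnings(chisq.test(selikoff)$statistic))

# The total inertia is the chi-squared statistic divided by the sample size.
total.inertia <- chi2 / n

# Pearson residuals of the relative frequencies and their singular values.
P  <- selikoff / n
r  <- rowSums(P)    # row masses
cc <- colSums(P)    # column masses
X  <- (P - outer(r, cc)) / sqrt(outer(r, cc))
mu <- svd(X)$d

# The squared singular values are the principal inertias.
principal.inertia <- mu^2
percent <- 100 * principal.inertia / sum(principal.inertia)

print(round(c(chi2 = chi2, total.inertia = total.inertia), 3))
print(round(cumsum(percent), 3))
```

The last singular value is zero up to rounding error, which is why only min(5, 4) - 1 = 3 axes carry any inertia.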

If one were to consider a third dimension, then it adds a further 0.433% to the visual display of association between the two categorical variables. Hence, a three-dimensional correspondence plot (given here by Figure 4.4) graphically depicts all of the association that exists in Table 1.2. This should be of no surprise since the optimal correspondence plot consists of 𝑀∗ = min(𝐼, 𝐽) − 1 = min(5, 4) − 1 = 3 dimensions. The three-dimensional plot of Figure 4.4 is constructed in R using the plot3d and related functions contained in the rgl library. The code that produces such a graphical display is as follows:

> library(rgl)
> plot3d(0, 0, 0, type = "n", box = FALSE,
+   xlim = range(f[, 1], f[, 2], f[, 3], g[, 1], g[, 2], g[, 3]),
+   ylim = range(f[, 1], f[, 2], f[, 3], g[, 1], g[, 2], g[, 3]),
+   zlim = range(f[, 1], f[, 2], f[, 3], g[, 1], g[, 2], g[, 3]),
+   xlab = paste("Principal Axis 1", "(",
+     round(Percentage.Inertia[1], digits = 2), "%", ")"),
+   ylab = paste("Principal Axis 2", "(",
+     round(Percentage.Inertia[2], digits = 2), "%", ")"),
+   zlab = paste("Principal Axis 3", "(",
+     round(Percentage.Inertia[3], digits = 2), "%", ")"))
> points3d(f[, 1], f[, 2], f[, 3], size = 10, col = "blue")
> text3d(f[, 1], f[, 2], f[, 3], text = dimnames(f)[[1]], adj = 1.5,
+   col = "blue")
> points3d(g[, 1], g[, 2], g[, 3], size = 10, col = "red")
> text3d(g[, 1], g[, 2], g[, 3], text = dimnames(g)[[1]], adj = 1.5,
+   col = "red")
> i <- c(1, 2, 1, 3, 1, 4)
> x1 <- c(0, max(abs(f), abs(g)), 0, 0)
> y1 <- c(0, 0, max(abs(f), abs(g)), 0)
> z1 <- c(0, 0, 0, max(abs(f), abs(g)))
> x2 <- c(0, - max(abs(f), abs(g)), 0, 0)
> y2 <- c(0, 0, - max(abs(f), abs(g)), 0)
> z2 <- c(0, 0, 0, - max(abs(f), abs(g)))
> segments3d(x1[i], y1[i], z1[i], alpha = 0.3)
> segments3d(x2[i], y2[i], z2[i], alpha = 0.3)

Figure 4.4 Three-dimensional correspondence plot of Selikoff’s asbestos data in Table 1.2, constructed in R. Axes: Principal Axis 1 (84.22%), Principal Axis 2 (15.35%), Principal Axis 3 (0.43%).

One may also calculate the CGS coordinates of Equations 4.21 and 4.22, which are depicted in Figure 4.5, by considering the following R code:

> F.CGS <- dIh %*% sva$u %*% (dmu + diag(rep(1, dim(dmu)[1])))^(1/2)
> G.CGS <- dJh %*% sva$v %*% (dmu + diag(rep(1, dim(dmu)[1])))^(1/2)
> dimnames(F.CGS) <- list(paste(Inames[[1]]), NULL)
> dimnames(G.CGS) <- list(paste(Jnames[[1]]), NULL)
> F.CGS
            [,1]       [,2]       [,3]       [,4]
0-9   -1.3334646  1.0808967 -0.3995030 -0.6323087
10-19 -0.4802648 -1.0447537  0.8877720  0.1634590
20-29  0.8713723 -0.6628536 -2.7973664  1.9129891
30-39  1.4247469 -0.8415919 -0.5988753 -1.8538198
40+    2.4785332  1.9527268  1.1019955  0.4070098
> G.CGS
              [,1]       [,2]        [,3] [,4]
None    -1.1038527  0.5375482 -0.05692846    1
Grade 1  0.5417734 -1.5266708  0.29774492    1
Grade 2  2.3456310  1.0014364 -2.01208157    1
Grade 3  2.8175347  2.4698058  3.54563004    1

From these coordinates, the principal inertia values associated with each of the axes are

> diag(t(F.CGS) %*% dI %*% F.CGS)
[1] 1.699405 1.298617 1.050127 1.000000

so that the total inertia is 0.9497, an increase from 0.5809 using simple correspondence analysis. Similarly,

> diag(t(G.CGS) %*% dJ %*% G.CGS)
[1] 1.699405 1.298617 1.050127 1.000000

Therefore, the percentage contribution each axis makes to the total inertia is

> 100 * diag(t(F.CGS) %*% dI %*% F.CGS)/sum(t(F.CGS) %*% dI %*% F.CGS)
[1] 33.66392 25.72461 20.80223 19.80924

That is, the first principal axis reflects 33.66% of the total inertia while for the second axis it is 25.72%. In total, the two-dimensional correspondence plot obtained using the CGS coordinates accounts for about 59.39% of this inertia and so provides a much poorer graphical display than the two-dimensional correspondence plot using the principal coordinates.
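The values printed above follow a simple pattern: each is one plus the corresponding singular value of the Pearson residual matrix. A minimal sketch of this check is given below, assuming the CGS scaling used in the R code above (standard coordinates multiplied by the square root of one plus the singular value). The object names F.cgs, G.cgs and dec are illustrative and do not appear in the original code.

```r
# Selikoff's asbestos data, typed in from the $N output earlier in the section.
selikoff <- matrix(c(310,  36,  0,  0,
                     212, 158,  9,  0,
                      21,  35, 17,  4,
                      25, 102, 49, 18,
                       7,  35, 51, 28),
                   nrow = 5, byrow = TRUE)

P   <- selikoff / sum(selikoff)
r   <- rowSums(P)                               # row masses
cc  <- colSums(P)                               # column masses
X   <- (P - outer(r, cc)) / sqrt(outer(r, cc))  # Pearson residuals
dec <- svd(X)

# CGS coordinates: standard coordinates scaled by sqrt(1 + singular value).
F.cgs <- sweep(dec$u / sqrt(r),  2, sqrt(1 + dec$d), "*")
G.cgs <- sweep(dec$v / sqrt(cc), 2, sqrt(1 + dec$d), "*")

# Weighted sums of squares along each axis, i.e. diag(t(F) %*% dI %*% F).
cgs.inertia.rows <- colSums(r  * F.cgs^2)
cgs.inertia.cols <- colSums(cc * G.cgs^2)

# Percentage contribution of each axis.
cgs.percent <- 100 * cgs.inertia.rows / sum(cgs.inertia.rows)

print(round(cgs.inertia.rows, 6))
print(round(cgs.percent, 2))
```

Because the trivial fourth axis (singular value zero) still receives a value of one under this scaling, it takes a sizeable share of the percentages, which is one way to see why the two-dimensional CGS display accounts for less than the display based on principal coordinates.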


Figure 4.5 Plot of the CGS coordinates for Selikoff’s asbestos data in Table 1.2. Axes: Principal Axis 1 (33.66%) and Principal Axis 2 (25.72%); row categories are marked + and column categories are marked *.

Except for the scaling along the axes, we can see that the general configuration of points using the CGS coordinates is the same as the configuration of points using standard coordinates and symmetric biplot coordinates. Since the total inertia has nearly doubled compared with the traditional approach to correspondence analysis, there is a stretching of the points along both axes.

The second, third and fourth moments of the row and column principal coordinates for Table 1.2 can be calculated by considering the following R code:

> Varfdum <- matrix(0, nrow = I, ncol = min(I,J) - 1)
> Skewfdum <- matrix(0, nrow = I, ncol = min(I,J) - 1)
> Kurtfdum <- matrix(0, nrow = I, ncol = min(I,J) - 1)

> Vargdum <- matrix(0, nrow = J, ncol = min(I,J) - 1)
> Skewgdum <- matrix(0, nrow = J, ncol = min(I,J) - 1)
> Kurtgdum <- matrix(0, nrow = J, ncol = min(I,J) - 1)

> for (m in 1:(min(I,J) - 1)){
+   for (i in 1:I){
+     Varfdum[i,m] <- Imass[i] * f[i,m]^2
+     Skewfdum[i,m] <- Imass[i] * f[i,m]^3
+     Kurtfdum[i,m] <- Imass[i] * f[i,m]^4
+   }
+   for (j in 1:J){
+     Vargdum[j,m] <- Jmass[j] * g[j,m]^2
+     Skewgdum[j,m] <- Jmass[j] * g[j,m]^3
+     Kurtgdum[j,m] <- Jmass[j] * g[j,m]^4
+   }
+ }

> # Total.Inertia is the total inertia as computed in simpleca.exe
> Varf <- apply(Varfdum, 2, sum)
> Skewf <- (1/Total.Inertia)^(3/2) * apply(Skewfdum, 2, sum)
> Kurtf <- (1/Total.Inertia)^(2) * apply(Kurtfdum, 2, sum)

> Varg <- apply(Vargdum, 2, sum)
> Skewg <- (1/Total.Inertia)^(3/2) * apply(Skewgdum, 2, sum)
> Kurtg <- (1/Total.Inertia)^2 * apply(Kurtgdum, 2, sum)

> momentsf <- c(Varf[1], Skewf[1], Kurtf[1], Varf[2], Skewf[2], Kurtf[2],
+   Varf[1] + Varf[2], Skewf[1] + Skewf[2], Kurtf[1] + Kurtf[2], sum(Varf),
+   sum(Skewf), sum(Kurtf))
> momentsf <- matrix(momentsf, nrow = 3)

> momentsg <- c(Varg[1], Skewg[1], Kurtg[1], Varg[2], Skewg[2], Kurtg[2],
+   Varg[1] + Varg[2], Skewg[1] + Skewg[2], Kurtg[1] + Kurtg[2], sum(Varg),
+   sum(Skewg), sum(Kurtg))
> momentsg <- matrix(momentsg, nrow = 3)

> dimnames(momentsf) <- list(paste(c("Spread", "Skewness", "Kurtosis")),
+   paste(c("Axis 1", "Axis 2", "2-D Plot", "Optimal Plot")))
> dimnames(momentsg) <- list(paste(c("Spread", "Skewness", "Kurtosis")),
+   paste(c("Axis 1", "Axis 2", "2-D Plot", "Optimal Plot")))

> momentsf
         Axis 1 Axis 2 2-D Plot Optimal Plot
Spread   0.4892 0.0892   0.5783       0.5809
Skewness 0.4972 0.0279   0.5251       0.5248
Kurtosis 1.4344 0.0350   1.4694       1.4694

> momentsg
         Axis 1  Axis 2 2-D Plot Optimal Plot
Spread   0.4892  0.0892   0.5783       0.5809
Skewness 0.6338 -0.0121   0.6216       0.6219
Kurtosis 1.7260  0.0503   1.7763       1.7765

The output momentsf and momentsg are summarised in Tables 4.8 and 4.9, respectively.
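The loops above compute mass-weighted sums of powers of the principal coordinates, so the same quantities can be sketched more compactly with column sums. The version below is an illustrative alternative rather than the original code; it rebuilds the principal coordinates from the table so that it runs on its own, and the names selikoff, dec, M and total.inertia are introduced here. Note that the sign of a singular vector is arbitrary, so the sign of an odd moment such as the skewness may differ between machines.

```r
# Selikoff's asbestos data, typed in from the $N output earlier in the section.
selikoff <- matrix(c(310,  36,  0,  0,
                     212, 158,  9,  0,
                      21,  35, 17,  4,
                      25, 102, 49, 18,
                       7,  35, 51, 28),
                   nrow = 5, byrow = TRUE)

P   <- selikoff / sum(selikoff)
r   <- rowSums(P)                               # row masses
cc  <- colSums(P)                               # column masses
X   <- (P - outer(r, cc)) / sqrt(outer(r, cc))  # Pearson residuals
dec <- svd(X)
M   <- min(dim(P)) - 1                          # number of non-trivial axes

# Row and column principal coordinates on the M non-trivial axes.
f <- sweep(dec$u / sqrt(r),  2, dec$d, "*")[, 1:M]
g <- sweep(dec$v / sqrt(cc), 2, dec$d, "*")[, 1:M]

total.inertia <- sum(dec$d^2)

# Mass-weighted second, third and fourth moments along each axis.
Varf  <- colSums(r * f^2)
Skewf <- colSums(r * f^3) / total.inertia^(3/2)
Kurtf <- colSums(r * f^4) / total.inertia^2

Varg  <- colSums(cc * g^2)
Skewg <- colSums(cc * g^3) / total.inertia^(3/2)
Kurtg <- colSums(cc * g^4) / total.inertia^2

print(round(rbind(Spread = Varf, Skewness = Skewf, Kurtosis = Kurtf), 4))
print(round(rbind(Spread = Varg, Skewness = Skewg, Kurtosis = Kurtg), 4))
```

The multiplication r * f^2 relies on R recycling the mass vector down each column, which weights every row of the coordinate matrix by its mass.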

4.11 Other theoretical issues

There are a great many other issues concerned with the simple correspondence analysis of a contingency table. Various sensitivity aspects of the solution have been examined. Benasseni (1993) studied the sensitivity of the output of a correspondence analysis to various modifications of the contingency table. The focus was the impact of aggregating row and column categories, permuting observations from one cell to another and adding/deleting observations from the table. Bar-Hen and Mortier (2004) considered measuring the sensitivity of the solution to outlying row and column categories.


The link between the graphical summary of association using correspondence analysis and various types of statistical models has long been discussed. Log-linear and association model links have been established by Van der Heijden et al. (1989), Choulakian (1988), Cavedon et al. (1982), De Falguerolles et al. (1995), Van der Heijden (1992) and D’Ambra and Kiers (1991). Links with regression analysis have been established by De Tibeiro and D’Ambra (2010), and with time series analysis by Deville and Saporta (1983), where the time variable is treated not as an ordered variable but as nominal. Higgs (1991) also considers the correspondence analysis of a table where one variable consists of four time periods, although these categories are treated as strictly independent. Perhaps one of the better attempts at analysing time-dependent categories for correspondence analysis is that of Silic et al. (2012), who described their CatViz (Temporally Sliced Correspondence Analysis Visualization) program. One can also consider Baccini et al. (1993) for other considerations.

Somewhat interrelated approaches for performing a correspondence analysis on a square contingency table were considered by Bove (1992) and Greenacre (2000), the foundations of which lie in the methodology considered by Gower (1977) and Constantine and Gower (1978). De Leeuw and Van der Heijden (1988) also point out that analysing square contingency tables using correspondence analysis has been given considerable attention in the French literature. They point to Burtchy (1984) and Foucart (1985), who reviewed various strategies. De Leeuw and Van der Heijden (1988) studied data from Foucart (1985) by iteratively imputing missing cell frequencies from a square contingency table where the row and column categories are the same, but the two variables relate to the place of home and work of a Parisian. With the huge explosion of Bayesian approaches to many aspects of statistics, correspondence analysis has not been immune to its influence. De Tibeiro and Murdoch (2010) treated missing cells using a Bayesian approach. A more practical consideration of correspondence analysis and a Bayesian analysis was made in Braga et al. (2005).

One aspect of correspondence analysis that we have so far not considered, but has received a great deal of attention, is that of supplementary points and supplementary variables. While the association between the variables of the contingency table and the interrelationships that exist between their categories are of primary interest to the analyst, sometimes there is a need to incorporate the information of a category, or variable, that is considered to be of secondary interest. Greenacre (1984) and Lebart et al. (1984) describe this issue at some length and provide some interesting examples. The feature of secondary points and variables is that they may be displayed in the correspondence plot using the transition formulae to determine their position. Doing so, one may compare the relative distribution of its profile (in the case of a secondary category) or profiles (for secondary variables) with those of the primary information. However, they play no part in defining the total inertia or the contribution of the axes to the total inertia. As a result, they have zero mass, and their presence affects neither the configuration of the points for the primary information originally considered nor the analysis. One may consider a number of interesting contributions in the 1998 book Visualization of Categorical Data, edited by Jorg Blasius and Michael Greenacre. For example, Le Roux and Rouanet (1998) and Lebart (1998) discuss this issue. One may also consider Tarnai and Wuggenig (1998) for a detailed description of this issue and its application to the fine arts world of Europe. See also Thiessen and Blasius (1998) and Graffelman and Aluja-Banet (2003) for more on this issue.
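To illustrate how a transition formula can position a supplementary row, the sketch below projects a row profile onto the column principal coordinates and rescales by the singular values. It assumes the usual form of the row transition formula and uses Selikoff's data as printed earlier. The helper supp.row and the counts in new.row are hypothetical and introduced only for this example. As a sanity check, projecting an existing row in this way should return that row's own principal coordinates.

```r
# Selikoff's asbestos data, typed in from the $N output earlier in the chapter.
selikoff <- matrix(c(310,  36,  0,  0,
                     212, 158,  9,  0,
                      21,  35, 17,  4,
                      25, 102, 49, 18,
                       7,  35, 51, 28),
                   nrow = 5, byrow = TRUE,
                   dimnames = list(c("0-9", "10-19", "20-29", "30-39", "40+"),
                                   c("None", "Grade 1", "Grade 2", "Grade 3")))

P   <- selikoff / sum(selikoff)
r   <- rowSums(P)                               # row masses
cc  <- colSums(P)                               # column masses
X   <- (P - outer(r, cc)) / sqrt(outer(r, cc))  # Pearson residuals
dec <- svd(X)
M   <- min(dim(P)) - 1                          # non-trivial axes only
mu  <- dec$d[1:M]

# Principal coordinates of the primary rows and columns.
f <- sweep(dec$u[, 1:M] / sqrt(r),  2, mu, "*")
g <- sweep(dec$v[, 1:M] / sqrt(cc), 2, mu, "*")

# Hypothetical helper: principal coordinates of a supplementary row,
# obtained from its profile and the column principal coordinates.
supp.row <- function(counts, g, mu) {
  profile <- counts / sum(counts)
  as.vector(profile %*% g) / mu
}

# Sanity check: an existing row projects onto its own coordinates.
check <- supp.row(selikoff["0-9", ], g, mu)

# A made-up supplementary row (illustrative counts, not real data).
new.row <- c(50, 30, 15, 5)
print(round(supp.row(new.row, g, mu), 3))
```

The supplementary row is never included in P, so the masses, singular values and total inertia are exactly those of the primary table.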

Another aspect of correspondence analysis that has gained some attention over the years is the identification of clusters of points in a plot. Greenacre (1988) provides a means of clustering specific to correspondence analysis, while Goodman (1981) and Gilula (1986) speak more specifically about the impact of combining categories of a contingency table. Lebart and Mirkin (1993) note that this issue has a relatively long history, especially amongst the French. In addition to some of the contributions just mentioned, they cite Jambu (1978), Cazes (1986) and Escoufier (1988) as early contributors to this aspect of correspondence analysis. Ciampi et al. (2005, 2012) also considered this issue. The latter approach uses chi-squared distances to ‘inspire’ two new algorithms that determine the number of clusters to consider in a subspace of reduced dimension.

Assessing the influence of row and column categories, or even of adding an individual observation, on a correspondence analysis has received considerable attention since the early 1990s. Pack and Jolliffe (1992) considered the impact that adding one unit to, or deleting one unit from, a cell value has on the eigenvalues, and hence on the general configuration of the correspondence plot. They comment that such work commenced earlier, but it seems not within the English-speaking statistical community. They point to Escofier and Le Roux (1976) and Tanaka and Tarumi (1985) as early contributors to the study of the influence of a particular row and/or column on the analysis. Where a correspondence plot identifies distinctive clusters, or clear groupings, of row categories (say), Krzanowski (1993) provides a strategy for identifying those column categories that contribute the most towards these groupings. On a similar issue, Nakayama (2001) developed a procedure for testing the redundancy of categories to a correspondence analysis, while Nowak and Bar-Hen (2005) considered the identification of categories deemed to be influential, based on the statistical significance of each $\lambda_m^2$, on each principal axis of a correspondence plot.

Other theoretical issues were considered by Hubert and Arabie (1992), Takane et al. (1991) and Ter Braak (1985). Yehia (1993) considers the correspondence analysis of a two-way contingency table where the row and column categories come from an underlying continuous Dirichlet distribution. Ter Braak (1987, p. 103) suggested that when a contingency table consists of a lot of very small cell frequencies (none of which are zero) and a few very large cell frequencies, then correspondence analysis be performed on the table after replacing the cell frequency $n_{ij}$ with $\ln(n_{ij} + 1)$.

4.12 Some applications of correspondence analysis

Perhaps the biggest impact correspondence analysis has made has been as a tool for the understanding of a multitude of issues in a huge variety of disciplines. There are thousands of applications that have been made. Bibliographies, such as those by Birks et al. (1996) and Beh (2004b), provide a wealth of articles specifically focused on disciplines such as ecology and biology as well as across a broad spectrum of other disciplines. Therefore, it would not seem sensible to present here an equally detailed list, although one may be obtained from any number of online bibliometric databases, including Google Scholar, Scopus and the Web of Knowledge. Here we provide only a glimpse of some of the contributions that have been made in the past 10 to 15 years that highlight the applicability of correspondence analysis, with the aim of demonstrating the continued diversity of its use.

As we described in Chapter 1, one area where correspondence analysis has been used extensively is in archaeology. One only needs to consider the list of references given in Baxter’s online bibliographies to get a glimpse of its extensive use. More recently, Baxter and Cool (2010) described R code that is now available and demonstrated its use; see Chapter 12 for more details on the use of R to perform correspondence analysis. They also point out the contributions of Shennan (1997, pp. 308--341) and Baxter (1994, pp. 100--139, 2003, pp. 136--146) as good expositions written for the archaeological readership. In fact, Baxter and Cool (2010) point to the paper of Bolviken et al. (1982) as the reason for the popularity of correspondence analysis in archaeology.

In other areas of science, the application of correspondence analysis has been widespread. The study of Lang (1978) looks at the distribution of worm communities in Lake Geneva, Switzerland and is one of the earliest known English articles on correspondence analysis. Karadjov and Simeonov (1990) use the analysis to study 24 chemical elements from 10 rock samples obtained from the Rhodopa mountains in Bulgaria. Hammer and Harper (2006, Section 6.12) describe correspondence analysis in their book on palaeontological data analysis. More recently, Freudenthal et al. (2009) used the technique to highlight various technical characteristics associated with analysing palaeontological data. In one of the earliest ecological applications of correspondence analysis, Greenacre and Vrba (1984) studied the vegetation characteristics for the grazing of 17 species of antelope in 16 sub-Saharan wildlife areas spanning Eastern and Southern Africa. More recently, Cakir et al. (2006) used the technique to study vegetation changes based on Landsat images of Raleigh, North Carolina.

In chemistry, Pacheco (1998) studied groundwater chemistry using correspondence analysis. More recently, Parsons et al. (2010a, 2010b) and Beh and Holdsworth (2012) made contributions. The latter used correspondence analysis to study the physicochemical properties of wine using the data of Cortez et al. (2009). Beh and Farver (2012) studied the same data but examined the association structure using ordinal log-linear models. Further studies of wine using correspondence analysis have also recently been made by Esti et al. (2009), Marques et al. (2012) and Saenz-Navajas et al. (2013). Within the related area of chemometrics, Mellinger (1987a, 1987b) provides some early discussions on correspondence analysis. One may also consider the varying topical contributions of Greenacre (1987), Avila and Myers (1991), Avila et al. (1991), Rhodes and Myers (1991) and Devaux et al. (1992). Within the area of bibliometrics, Anuradha and Urs (2007) used correspondence analysis to study the international research collaborative partnerships of Indian publications, including those across disciplines. Recently, there has been increasing attention given to the use of the classical approach to correspondence analysis for analysing various aspects of food perception across Europe. For example, one may consider Guerrero et al. (2010). Their data were subsequently analysed using the classical and non-symmetrical versions of correspondence analysis by Beh et al. (2011). Their data were also used as a motivating example for the proposal of moments for principal coordinates by Beh and Simonetti (2011); we have already discussed the derivation and interpretation of these moments in Section 4.8. Guerrero et al. (2012) followed on from their 2010 analysis of the data. Most recently, Schifferstein et al. (2013) used correspondence analysis to study the emotional characteristics that exist at various stages of the production of food products.

There are now many microarray databases that act as repositories for storing microarray gene expression data. Data of this kind are often extremely large in the context of categorical data. However, Fellenberg et al. (2001) and Busold et al. (2005) demonstrated how correspondence analysis could be successfully used to analyse the often complex association patterns that exist in these data. Bonnet (1998) gave an overview of the application of multivariate data analytic techniques in microscopy, with a focus on correspondence analysis, and discussed some of the issues concerned with their implementation. In oncology, Li et al. (2010) used the analysis to study the association between male dietary habits in some Chinese cities and their gastric carcinoma mortality.


In an early study of correspondence analysis with an educational flavour, Murtagh (1982) used it to assess the accuracy of marks of examination papers and final grades. The data were obtained from 34 students in their final year of a Bachelor of Science (Honours) degree in Computer Science at University College, Dublin in 1982. Beishuizen et al. (2001) used it to identify what students think about good teachers and showed that they felt personality and ability to teach were the best indicators of good teaching practice. One educational application from Australia is that of Askell-Williams and Lawson (2004). Another Australian study by Shanka et al. (2006) investigated the primary reasons why international students choose to study at Curtin University in Perth, Western Australia. Part of their study revealed that international students from the South-East Asian region are attracted to the Australian university for a variety of reasons (dominated largely by the proximity to their home country). Mazzarol and Soutar (2008) investigated a similar issue but with a national focus. More recently, Polackova and Jindrova (2010) demonstrated the effectiveness of new learning strategies for students studying statistics at the Faculty of Economics and Management of the Czech University of Life Sciences.

Tourism studies have also benefited from the use of correspondence analysis. For example, the adventure industry in New Zealand is a major draw-card for tourists to the country. With that in mind, tourism injury problems are a real concern and Bentley et al. (2001) investigated the issue using correspondence analysis. Fontana (2008) assessed the daily tourism flow at the end of 2007 in the Piemonte region of Italy from the region’s web-based service, TUAP (Turismo Arrivi e Presenze), which was originally set up in 2000. Further research into tourism issues using correspondence analysis can be found in Calantone et al. (1989), Gursoy and Chen (2000) and Chen (2001).

Some further diverse demonstrations of the application of correspondence analysis include neural networks (Lebart, 1997), open pit mining of lignite in Turkey (Elevli et al., 2008) and epidemiology (Sourial et al., 2010). Further considerations from a health perspective include Greenacre’s (2002) study of data from the Spanish National Health service, and the comparison by Corbellini et al. (2008) of simple correspondence analysis with non-symmetrical correspondence analysis and Beh’s (1997) ordered correspondence analysis technique to detect the most important indicators of elderly frailty in Parma, Italy. The interested reader is also directed to consider Colin et al. (1992), who used correspondence analysis to study the attitude of a sample of 1148 nurses in Montreal, Canada from a questionnaire consisting of 177 questions. Further studies of questionnaire data can be found by considering, for example, Li and Yamanishi (2001). Most notably, Greenacre and Pardo’s (2006) subset correspondence analysis is specifically designed to cater to the analysis of questionnaire data. An early contribution to the analysis of survey data is that of Daudin et al. (1985), who studied data from a 1969 survey of agricultural holdings by considering variables concerned with workforce from 21 French regions. See also Lebart (1985) and Israels et al. (1985).

In the marketing research literature, one may consider the contributions of Fox (1998), Bendixen (1996), Berthon et al. (1997), Hsieh (2004), Bendixen and Yurova (2012), Hoffman and Franke (1986) and Collins (2002).

4.13 Analysis of a mother’s attachment to her child

The area of parent-child attachment theory in psychology has gained considerable attention for many decades. One study undertaken by Van IJzendoorn (1995) considered various issues on attachment and analysed an extensive array of studies on the topic. As Van IJzendoorn (1995, p. 388) describes, the ‘revolutionary shift’ in the study of attachment theory was the development of the Adult Attachment Interview, or AAI, by George et al. (1985); see also Hesse (1999). The AAI is an interview that asks questions about the adult participant’s recollection of their relationship with their parents (and others close to them) as children and also asks about their current relationship with their parents. The classification of the data from the AAI is based not just on the answers to the questions that are asked during the interview (listed in Hesse, 1999) but also on how the participant responded to the questions. This classification, which reflects an adult’s attachment to their parents, is described as follows:

∙ Autonomous: The adult’s recollection of their childhood was coherent and consistent and their responses were clear, relevant and succinct.

∙ Dismissing: The adult’s recollection of their childhood was positive but involved responses that were unsupported during the interview, or contradicted information they provided during the interview.

∙ Preoccupied: The adult’s responses to the questions demonstrated an attachment to their parents that was confused or angry, or a passive preoccupation with them.

∙ Disorganised: The adult’s responses revealed traumatic experiences involving loss, or abuse as a child.

The attachment of the child of each of these adults to their parent was then assessed. The assessment was made using the Ainsworth Strange Situation (Ainsworth et al., 1978; Ainsworth and Wittig, 1969). This takes place in a structured laboratory playroom where the infants and their parent are involved in two brief separations and reunions. The infant’s response to the separation/reunion with their parent is categorised as follows:

∙ Secure: The infant is eager to explore the laboratory playroom in the presence of their parent, but shows signs of missing their presence during separation.

∙ Avoidant: The infant explores the playroom at once but shows little or no response to the separation from their parent.

∙ Resistant: The infant becomes apprehensive on entering the laboratory playroom and is uninterested in exploring it. When separated from their parent, the child shows signs of anxiety and great distress and combines contact seeking with contact resistance on the reunion.

∙ Disorganised: The infant displays disorganised or disorientated behaviour in their parent’s presence.

As a result of this study, Table 4.10 summarises the cross-classification of 548 mothers by their attachment with their parents and their child’s attachment to them.

The Pearson chi-squared statistic for Table 4.10 is 252.40, and has a 𝑝-value that is less than 0.0001. Therefore, there is a statistically significant association between a parent’s attachment classification and their child’s attachment to them. A simple correspondence analysis can be performed on the contingency table to visualise the association. Figure 4.6 is the two-dimensional correspondence plot of Table 4.10. It shows that an avoidant infant and dismissing mother are associated. Similarly, a secure infant is associated with an autonomous parent. A mother with unresolved parental issues is closely associated with an infant who is disorganised.

Table 4.10 Cross-classification of the attachment classification of a mother and her infant.

                          Mother’s attachment classification
Infant response   Dismissing  Autonomous  Preoccupied  Unresolved  Total
Avoidant                  62          29           14          11    116
Secure                    24         210           14          39    287
Resistant                  3           9           10           6     28
Disorganised              19          26           10          62    117
Total                    108         274           48         118    548

Figure 4.6 Two-dimensional plot using the principal coordinates of the mother/child attachment data in Table 4.10. The vertical axis is Principal Axis 2 (35.84%); infant categories are marked + and mother categories are marked *.


The numerical output obtained using the simpleca.exe R function is as follows:

> simpleca.exe(mother.dat, scaleplot = 1.4)
$N
             DISMISSING AUTONOMOUS PREOCCUPIED UNRESOLVED
avoidant             62         29          14         11
secure               24        210          14         39
resistant             3          9          10          6
disorganised         19         26          10         62

$f
                  0      1      2 3
Avoidant     -0.725 -0.503  0.061 0
Secure        0.457 -0.099  0.023 0
Resistant    -0.246  0.169 -0.919 0
Disorganised -0.343  0.701  0.104 0

$g
                 0      1      2 3
Dismissing  -0.766 -0.449  0.152 0
Autonomous   0.467 -0.141  0.016 0
Preoccupied -0.403  0.014 -0.674 0
Unresolved  -0.219  0.732  0.098 0

$Chi.Squared
[1] 252.398

$Total.Inertia
[1] 0.461

$P.Value
[1] 0

$Inertia
       Principal.Inertia Percentage.Inertia Cumm.Inertia
Axis 1             0.249             54.056       54.056
Axis 2             0.165             35.835       89.892
Axis 3             0.047             10.108      100.000

This output shows that Figure 4.6 graphically depicts 89.89% of the association that exists between the variables. Adding a third dimension will lead to the optimal correspondence plot for Table 4.10 since 𝑀∗ = min(4, 4) − 1 = 3.

Despite the debate concerning the appropriateness of the CGS coordinates, the configuration of the resulting correspondence plot (Figure 4.7) is virtually identical to the configuration obtained using the principal coordinates. The CGS coordinates, obtained using R, are as follows and give the plot of Figure 4.7.

> F.CGS
                   [,1]       [,2]       [,3]
avoidant     -1.7798564 -1.4672975  0.3114452
secure        1.1213687 -0.2893198  0.1156271
resistant    -0.6040476  0.4923274 -4.6982055
disorganised -0.8415055  2.0466335  0.5319414

> G.CGS
                  [,1]        [,2]        [,3]
DISMISSING  -1.8786617 -1.31170490  0.77657863
AUTONOMOUS   1.1451008 -0.41023009  0.08220732
PREOCCUPIED -0.9884655  0.04086865 -3.44534197
UNRESOLVED  -0.5374222  2.13648712  0.49983998

Despite the comparable configurations of Figures 4.6 and 4.7, using CGS coordinates has led to a reduction in the quality of the display: using the principal coordinates, the two-dimensional correspondence plot of Figure 4.6 graphically depicts 89.89% of the association between the variables, while Figure 4.7 accounts for 77.39% of the association. However, in Figure 4.7, the intra-point distance can be interpreted.

Figure 4.7 Two-dimensional correspondence plot of the mother/child attachment data in Table 4.10 using the CGS coordinates. (Vertical axis: Principal Axis 2, 35%.)

Furthermore, in Figure 4.8 we depict the association using the row isometric biplot, where the rows are represented using their principal coordinates and the columns are represented using their standard coordinates. Each row point is highlighted with an arrow from the origin to the row coordinate, while the columns are depicted using just points. The length of an arrow (its distance from the origin) has a clear interpretation: it indicates the importance of that category in describing the association between the variables. The shortest projections of the columns onto the row arrows highlight the strength of the category association. The biplot is asymmetric, so from the row isometric biplot we can see how the mother (column) categories depend on (are associated with) the infant attachment (row) categories and not vice versa. For example, the infant categories ‘Disorganised’ and ‘Avoidant’ affect the mothers that are ‘Unresolved’ and ‘Dismissing’, respectively.

Figure 4.8 Two-dimensional row isometric biplot of the mother/child attachment data in Table 4.10. (Vertical axis: Principal Axis 2, 35.84%.)
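The coordinates behind a row isometric biplot can be sketched in base R as follows. This is a minimal illustration, assuming the usual singular value decomposition of the standardised residuals; the variable names are hypothetical. It also checks the property that makes the display a biplot: over all three dimensions, the inner product of a row's principal coordinates with a column's standard coordinates equals p_ij / (p_i. p_.j) − 1.

```r
# Mother/child attachment counts, copied from the $N output
N <- matrix(c(62,  29, 14, 11,
              24, 210, 14, 39,
               3,   9, 10,  6,
              19,  26, 10, 62),
            nrow = 4, byrow = TRUE,
            dimnames = list(c("Avoidant", "Secure", "Resistant", "Disorganised"),
                            c("Dismissing", "Autonomous", "Preoccupied", "Unresolved")))

P  <- N / sum(N)
r  <- rowSums(P)
cc <- colSums(P)
S  <- (P - outer(r, cc)) / sqrt(outer(r, cc))
sv <- svd(S)
lambda <- sv$d[1:3]

# Row isometric biplot: rows in principal, columns in standard coordinates
F.prin <- (sv$u[, 1:3] / sqrt(r)) %*% diag(lambda)
G.std  <- sv$v[, 1:3] / sqrt(cc)

# Inner products reproduce the departures from independence
recon  <- F.prin %*% t(G.std)
target <- P / outer(r, cc) - 1
print(max(abs(recon - target)))    # numerically zero

# Two-dimensional display, drawn only in an interactive session
if (interactive()) {
  plot(rbind(F.prin[, 1:2], G.std[, 1:2]), type = "n", asp = 1,
       xlab = "Principal Axis 1", ylab = "Principal Axis 2")
  arrows(0, 0, F.prin[, 1], F.prin[, 2], length = 0.1)   # row arrows
  text(F.prin[, 1], F.prin[, 2], rownames(N), pos = 3)
  points(G.std[, 1], G.std[, 2], pch = 8)                # column points
  text(G.std[, 1], G.std[, 2], colnames(N), pos = 3)
}
```

Swapping the roles of the two sets (columns in principal coordinates, rows in standard coordinates) gives the column isometric counterpart.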

In contrast, Figure 4.9 is a column isometric biplot, so that the columns are depicted using their principal coordinates and the rows are depicted using their standard coordinates. This time, there are arrows from the origin to each of the column points, while the row categories are represented using only points. In this case the infant attachment (row) categories depend on the mother behaviour (column) categories, and the lengths of the arrows indicate which mother categories best describe this relationship. Here ‘Unresolved’ and ‘Dismissing’ are the mother categories that have the greatest influence on their child’s response. In particular, a mother’s ‘Unresolved’ attachment to her child is strongly associated with a child’s response that is ‘Disorganised’, while a mother classified as having a ‘Dismissing’ attachment is associated with a child that has an ‘Avoidant’ response.

Finally, looking at the length of projection of a category on an arrow, it is evident that the projection of the mother category ‘Unresolved’ on the arrow of the child category ‘Disorganised’ in Figure 4.8 is shorter than the projection of the infant category ‘Disorganised’ on the arrow of the mother category ‘Unresolved’ in Figure 4.9. Therefore, studying the symmetric association, we know that a mother’s behaviour affects her child’s behaviour and vice versa; but from the two asymmetric biplot representations, we deduce that the mother category ‘Unresolved’ depends on the child category ‘Disorganised’ (Figure 4.8), while the child category ‘Avoidant’ depends on the mother category ‘Dismissing’ (rather than ‘Dismissing’ on ‘Avoidant’); see Figure 4.9.
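The comparison of the two projections can be sketched numerically. The code below is a hedged illustration in base R: it assumes that "length of projection" means the scalar projection of a point onto the arrow (the component of the point along the arrow's direction) in the two-dimensional display, and the helper proj.length is a hypothetical name introduced only for this example.

```r
# Mother/child attachment counts, copied from the $N output
N <- matrix(c(62,  29, 14, 11,
              24, 210, 14, 39,
               3,   9, 10,  6,
              19,  26, 10, 62),
            nrow = 4, byrow = TRUE,
            dimnames = list(c("Avoidant", "Secure", "Resistant", "Disorganised"),
                            c("Dismissing", "Autonomous", "Preoccupied", "Unresolved")))

P  <- N / sum(N)
r  <- rowSums(P)
cc <- colSums(P)
S  <- (P - outer(r, cc)) / sqrt(outer(r, cc))
sv <- svd(S)
lambda <- sv$d[1:2]                      # first two singular values

# Standard and principal coordinates on the first two axes
F.std  <- sv$u[, 1:2] / sqrt(r)
G.std  <- sv$v[, 1:2] / sqrt(cc)
F.prin <- sweep(F.std, 2, lambda, "*")
G.prin <- sweep(G.std, 2, lambda, "*")
rownames(F.std)  <- rownames(N)
rownames(F.prin) <- rownames(N)
rownames(G.std)  <- colnames(N)
rownames(G.prin) <- colnames(N)

# Scalar projection of point x onto the arrow from the origin to a
# (hypothetical helper; assumed meaning of "length of projection")
proj.length <- function(x, a) sum(x * a) / sqrt(sum(a^2))

# Row isometric biplot: 'Unresolved' point on the 'Disorganised' arrow
proj.row.iso <- proj.length(G.std["Unresolved", ], F.prin["Disorganised", ])
# Column isometric biplot: 'Disorganised' point on the 'Unresolved' arrow
proj.col.iso <- proj.length(F.std["Disorganised", ], G.prin["Unresolved", ])

print(c(row.isometric = proj.row.iso, column.isometric = proj.col.iso))
```

Under this reading, both projections share the same inner product in the numerator, so their ordering is decided entirely by the lengths of the two arrows.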

Figure 4.10 represents the third kind of biplot: a symmetric biplot. One may note that for this plot, the scaling on the axes is different compared with the scaling for the isometric biplots.


Figure 4.9 Two-dimensional column isometric biplot of the mother/child attachment data in Table 4.10. (Vertical axis: Principal Axis 2, 35.84%.)

Figure 4.10 Two-dimensional symmetric biplot of the mother/child attachment data in Table 4.10. (Vertical axis: Principal Axis 2, 35.84%.)


Unfortunately, the intra-distances of row and column points cannot be properly assessed, as the rows and columns are no longer in principal coordinates. In contrast, the inter-point distances can be evaluated, confirming the previous findings.

References

Abdi, H. andWilliams, L.J. (2010) Correspondence analysis, in Encyclopedia of Research Design (eds.N.J. Salkind, D.M. Dougherty and B. Frey), Sage, pp. 267--278.

Adachi, K. (2004) Oblique promax rotation applied to the solutions in multiple correspondence analysis.Behaviormetrika, 31, 1--12.

Agresti, A. (2002) Categorical Data Analysis, 2nd edn, John Wiley & Sons, Inc., New York.

Ainsworth,M.D., Blehar,M.C.,Waters, E., andWall, S. (1978)Patterns of Attachment: APsychologicalStudy of the Strange Situation, Erlbaum.

Ainsworth, M.D.S. and Wittig, B.A. (1969) Attachment and exploratory behavior of one yearolds in a strange sitation, in Determinants of Infant Behavior (ed. B.M. Foss), Methuen,pp. 113--136.

Aitchison, J. and Greenacre, M.J. (2002) Biplots in compositional data. Applied Statistics, 51,375--392.

Anuradha, K.T. and Urs, S.R. (2007) Bibliometric indicators of Indian research collaboration patterns:a correspondence analysis. Scientometrics, 17, 179--189.

Armatte, M. (2008) Histoire et prehistoire de l’analyse des donnees par J. P. Benzecri: un casde genealogie retrospective. Electronic Journal for History of Probability and Statistics, 4 (2),24 page.

Askell-Williams, H. and Lawson, M.J. (2004) A correspondence analysis of child-care students’ andmedical students’ knowledge about teaching and learning. International Education Journal, 5,176--205.

Avila, F. and Myers, D.E. (1991) Correspondence analysis applied to environmental data sets:a study of Chautauqua Lake sediments. Chemometrics and Intelligent Laboratory Systems, 11,229--249.

Avila, F., Myers, D.E., and Palmer, C. (1991) Correspondence analysis and adsorbate selection forchemical sensor arrays. Journal of Chemometrics, 5, 455--465.

Baccini, A., Caussinus, H., and De Falguerolles, A. (1993) Analysing dependence in large contingencytables: dimensionality and patterns in scatter-plots, in Multivariate Analysis: Future Directions 2(eds. C.M. Cuadras and C.R. Rao), Elsevier, pp. 245--263.

Balanda, K.P. and MacGillivray, H.L. (1988) Kurtosis: a critical review. The American Statistician, 42,111--119.

Bar-Hen, A. and Mortier, F. (2004) Influence and sensitivity measures in correspondence analysis.Statistics, 38, 207--215.

Baxter, M.J. (1994) Exploratory Multivariate Analysis in Archaeology, Edinburgh UniversityPress.

Baxter, M.J. (2003) Statistics in Archaeology, Arnold.

Baxter, M.J. and Cool, H.E.M. (2010) Correspondence analysis in R for archaeologists: an educationalaccount. Archeologia e Calcolatori, 21, 211--228.

Beh, E.J. (1997) Simple correspondence analysis of ordinal cross-classifications using orthogonalpolynomials. Biometrical Journal, 39, 589--613.


Beh, E.J. (2001) Confidence circles for correspondence analysis using orthogonal polynomials. Journalof Applied Mathematics and Decision Sciences, 5, 35--45.

Beh, E.J. (2004a) Simple correspondence analysis: a bibliographic review. International StatisticalReview, 72, 257--284.

Beh, E.J. (2004b) A Bibliography of the Theory and Application of Correspondence Analysis, Vol. III-- By Year. Available from the author upon request.

Beh, E.J. (2009) A few moments for simple correspondence analysis, in Proceedings of Third AnnualASEARC Conference, December 8--9, 2009, 4 page.

Beh, E.J. (2010) Elliptical confidence regions for simple correspondence analysis. Journal of StatisticalPlanning and Inference, 140, 2582--2588.

Beh, E.J. and Farver, T.B. (2012) A numerical evaluation of the classification of Portuguese red wine.Current Analytical Chemistry, 8, 218--22.

Beh, E.J. and Holdsworth, C.I. (2012) A visual evaluation of a classification method for inves-tigating the physicochemical properties of Portuguese wine. Current Analytical Chemistry, 8,205--217.

Beh, E.J. and Lombardo, R. (2012) A genealogy of correspondence analysis. The Australian and NewZealand Journal of Statistics, 54, 137--168.

Beh, E.J. and Lombardo, R. (2014) Correspondence analysis and the Freeman--Tukey statistic in press.

Beh, E.J., Lombardo, R., and Simonetti, B. (2011) A European perception of food using two methodsof correspondence analysis. Food Quality and Preference, 22, 226--231.

Beh, E.J. and Simonetti, B. (2010) A few moments for non-symmetrical correspondence analysis,in Proceedings of the 11th European Symposium on Statistical Methods for the Food Industry(AGROSTAT 2010), pp. 277--286.

Beh, E.J. and Simonetti, B. (2011) Investigating the European perception of food using moments ob-tained from non-symmetrical correspondence analysis. Journal of Statistical Planning and Inference,141, 2953--2960.

Beishuizen, J.J., Hof, E., Van Putten, C.M., Bouwmeester, S., and Asscher, J.J. (2001) Students’and teachers’ cognitions about good teachers. British Journal of Educational Psychology, 71,185--201.

Benasseni, J. (1993) Perturbational aspects in correspondence analysis. Computational Statistics andData Analysis, 15, 393--410.

Benasseni, J. (2002) A complementary proof of an eigenvalue property in correspondence analysis.Linear Algebra and Its Applications, 354, 49--51.

Bendixen, M. (1996) A practical guide to the use of correspondence analysis in marketing research.Marketing Research On-Line, 1, 16--38.

Bendixen, M. and Yurova, Y. (2012) How respondents use verbal and numeric rating scales. Interna-tional Journal of Market Research, 54, 261--282.

Bentley, T., Page, S., Meyer D., Chalmers, D., and Laird, I. (2001) How safe is adventure tourism inNew Zealand? An exploratory analysis. Applied Ergonomics, 32, 327--338.

Benzecri, J-P. (1973) L’Analyse des Donnees (two volumes). Dunad, Paris.

Benzecri, J-P. (1977) Histoire et prehistoire de l’analyse des donnees. Partie V: l’analyse des corre-spondances. Cahiers de l’Analyse des Donnees, 2, 9--40.

Benzecri, J-P. (1982) Histoire et Prehistoire de L’Analyse des Donnees, Dunod.

Benzecri, J-P. (1992) Correspondence Analysis Handbook, Marcel Dekker.

Berthon, P., Pitt, L., Berthon, J-P., Crowther, C., Bruwer, L., Lyall, P., and Money, A. (1997) Mappingthe marketspace: evaluating industry Web sites using correspondence analysis. Journal of StrategicMarketing, 5, 233--242.


Bickel, P.J. and Lehmann, E.L. (1975) Descriptive statistics for non-parametric models. I. Introduction.The Annals of Statistics, 3, 1038--1044.

Birks, H., Peglar, S., and Austin, H. (1996) An annotated bibliography of canonical correspondenceanalysis and related constrained ordination methods 1986--1993. Abstracta Botanica, 20, 17--36.

Blasius, J. (1994) Correspondence analysis in social science research, in Correspondence Analysis inthe Social Sciences (eds. M. Greenacre and J. Blasius), Academic Press, pp. 23--52.

Blasius, J., Eilers, P.H.C., and Gower, J. (2009) Better biplots. Computational Statistics and DataAnalysis, 53, 3145--1358.

Bock, T. (2011a) Improving the display of correspondence analysis using moon plots. InternationalJournal of Market Research, 53, 307--326.

Bock, T. (2011b)We really do need correspondence analysis. International Journal ofMarket Research,53, 587--591.

Bolviken, E., Helskog, E., Helskog, K., Holm-Olsen, I.M., Solheim, L., and Bertelsen, R.(1982) Correspondence analysis: an alternative to principal components. World Archaeology, 14,41--60.

Bonnet, N. (1998) Multivariate statistical methods for the analysis of microscope image series: appli-cations in material science. Journal of Microscopy, 190, 2--18.

Bove, G. (1992) Asymmetric multidimensional scaling and correspondence analysis for square tables.Statistica Applicata, 4, 587--598.

Bradu, D. and Gabriel, K.R. (1978) The biplot as a diagnostic tool for models of two-way tables.Technometrics, 20, 47--68.

Braga, J., Heuze, Y., Chabadel, O., Sonan, N.K., and Gueramy, A. (2005) Non-adult dental ageassessment: correspondence analysis and linear regression versus Bayesian predictions. InternationalJournal of Legal Medicine, 119, 260--274.

Burtchy, B. (1984) Analyse factorielle des matrices d’echanges, in Data Analysis and Informatics III(eds. E. Diday, L. Jambu, J. Lebart, J. Pages and R. Tomassone), North-Holland, pp. 447--464.

Busold, C.H, Winter, S., Hauser, N., Bauer, A., Dippon, J., Hoheisel, J.D., and Fellenberg, K. (2005)Integration of GO annotations in correspondence analysis: facilitating the interpretation ofmicroarraydata. Bioinformatics, 21, 2424--2429.

Cakir, H.I, Khorram, S., and Nelson, S.A.C. (2006) Correspondence analysis for detecting land coverchange. Remote Sensing of Environment, 102, 306--317.

Calantone, R.J, Di Benedetto, C.A., Hakam, A., and Bojanic, D.C. (1989) Multiple multinationaltourism positioning using correspondence analysis. Journal of Travel Research, 28, 25--32.

Carlier, A. and Kroonenberg, P.M. (1996) Decompositions and biplots in three-way correspondenceanalysis. Psychometrika, 61, 355--373.

Carroll, J.D, Green, P.E., and Schaffer, C.M. (1986) Interpoint distance comparisons in correspondenceanalysis. Journal of Marketing Research, 23, 271--280.

Carroll, J.D, Green, P.E., and Schaffer, C.M. (1987) Comparing interpoint distances in correspondenceanalysis: a clarification. Journal of Marketing Research, 24, 445--450.

Carroll, J.D, Green, P.E., and Schaffer, C.M. (1989) Reply to Greenacre’s commentary on theCarroll--Green--Schaffer scaling of two-way correspondence analysis solutions. Journal of Mar-keting Research, 26, 366--368.

Cattell, R.B. (1966) The scree test for the number of factors. Journal of Multivariate BehavioralResearch, 1, 245--276.

Cavedon, G., D’Arcengelo, E., and De Antoni, F. (1982) Analysis of relationships among categori-cal variables: a comparison between correspondence analysis and log-linear models. Metron, 40,117--143.


Cazes, P. (1986) Correspondance entre deux ensembles et partition de ces deux ensembles. Cahiers del’Analyse des Donnees, 11, 335--340.

Chen, J.S. (2001) A case study of Korean outbound travelers’ destination images by using correspon-dence analysis. Tourism Management, 22, 345--350.

Choulakian, V. (1988) Exploratory analysis of contingency tables by loglinear formulation and gener-alizations of correspondence analysis. Psychometrika, 53, 235--250.

Ciampi, A., Dyachenko, A., Gonzalez-Marcos, A., and Lechevallier, Y. (2012) Two-way classificationof a data table with non-negative entries: the role of the 𝜒2 distance and correspondence analysis.Communications in Statistics (Simulation and Computation), 41, 1006--1022.

Ciampi, A., Marcos, A.G., and Limas, M.C. (2005) Correspondence analysis and two-way clustering.SORT, 29, 24--42.

Colin, B., MacGibbon, B., and De Tibeiro, J. (1992) Correspondence analysis of a survey on nursingattitudes to the workplace and turnover. The New Zealand Statistician, 27, 55--67.

Collins, M. (2002) Analyzing brand image data.Marketing Research, 14, 32--36.

Collins, M. (2011) Do we really need correspondence analysis? International Journal of MarketResearch, 53, 583--586.

Constantine, A.G. and Gower, J.C. (1978) Graphical representation of asymmetry. Applied Statistics,27, 297--304.

Corbellini, A., Riani, M., and Donatini, A. (2008) Multivariate data analysis techniques to detect earlywarnings of elderly frailty. Statistica Applicata, 20, 159--178.

Cortez, P., Cerdeira, A., Almeida, F., Matos, T., and Reis, J. (2009) Modeling wine preference by datamining from physicochemical properties. Decision Support Systems, 47, 547--553.

Craddock, J.M. and Flood, C.R. (1969) Eigenvectors for representing the 50 mb geopotential sur-face over the Northern Hemisphere. Quarterly Journal of the Royal Meteorological Society, 95,576--593.

Cuadras, C.M. and Cuadras, D. (2006) A parametric approach to correspondence analysis. LinearAlgebra and its Applications, 417, 64--74.

D’Ambra, L. and Kiers, H.A.L. (1991) Analysis of log-trilinear models for a three-way contingencytable using PARAFAC/CANDECOMP. In: Societ�� Italiana di Statistica, Atti delle giornate di studiodi Pescara, 11-12 Ottobre 1990, pp.101-114. Chieti: Marino Solfanelli Editore.

Daigle, G. and Rivest, L-P. (1992) A robust biplot. The Canadian Journal of Statistics, 20,241--255.

Danaher, P.J. (1991) A canonical expansion model for multivariate media exposure distributions:a generalization of the ‘Duplication of Viewing Law’. Journal of Marketing Research, 28,361--367.

Daudin, J.J, Tomassone, R., and Trecourt, P. (1985) Large-scale survey analysis (with discussions),in Recent Developments in the Analysis of Large-Scale Data Sets (ed. A.Z. Israels), Eurostat,pp. 189--202.

De Falguerolles, A. (2008) L’analyse des donnees; before and around. Electronic Journal for Historyof Probability and Statistics, 4(2), 32 page.

De Falguerolles, A., Jmel, S., andWhittaker, J. (1995) Correspondence analysis and association modelsby a conditional independence graph. Psychometrika, 60, 161--180.

De Leeuw, J. and Van der Heijden, P.G.M. (1988) Correspondence analysis of incomplete contingencytables. Psychometrika, 53, 223--233.

De Tibeiro, J.J and D’Ambra, L. (2010) An integrated approach to regression analysis using correspon-dence analysis and cluster analysis. Statistica & Applicazioni, 8, 27--56.


De Tibeiro, J.J.S. andMurdoch, D.J. (2010) Correspondence analysis with incomplete paired data usingBayesian imputation. Bayesian Analysis, 5, 519--532.

Devaux, M.F, Qannari, E.M., and Gallant, D.J. (1992) Multiple-correspondence analysis optical mi-croscopy for determination of starch granules. Journal of Chemometrics, 6, 163--175.

Deville, J-C. and Saporta G. (1983) Correspondence analysis, with an extension towards nominal timeseries. Journal of Econometrics, 22, 169--189.

Doey, L. and Kurta, J. (2011) Correspondence analysis applied to psychological research. Tutorials inQuantitative Methods for Psychology, 7, 5--14.

Domenges, D. and Volle, M. (1979) Analyse factorielle spherique: une exploration. Ann Insee, 35,3--84.

Dossou-Gbete, S. and Grorud A. (2002) Biplots for matched two-way tables. Annales de la Faculte desSciences de Toulouse, 11, 469--483.

Elevli, S., Uzgaren, N., and Elevli, B. (2008) Correspondence analysis of repair data: a case study forelectric shovels. Journal of Applied Statistics, 35, 901--908.

Escofier, B. (1978) Analyse factorielle et distances repondant au principe d’equivalence distribution-nelle. Revue de Statistique Appliquee, 26, 29--37.

Escofier, B. and Le Roux, B. (1976) Influence d’un element sur les facteurs en analyse des correspon-dances. Cahiers de l’Analyse des Donnees, 1, 297--318.

Escoufier, Y. (1988) Beyond correspondence analysis, in Classification and Related Methods of DataAnalysis (ed. H.H. Bock), North-Holland, Amsterdam, pp. 505--514.

Esti, M., Moneta, E., Peparaio, M., and Sinesio, F. (2009) Exploring sensory properties by correspon-dence analysis. Ingredienti Alimentari, 8 (43), 26--28.

Everitt, B.S. (1992) The Analysis of Contingency Tables, 2nd edn, Chapman & Hall.

Fellenberg, K., Hauser, N.C., Brors, B., Neutzner, A., Hoheisel, J.D., and Vingron, M. (2001) Corre-spondence analysis applied to microarray data. Proceedings of the National Academy of Sciences ofthe United States of America, 98, 10781--10786.

Fichet, B. (2009)Metrics of𝐿𝑝-type and distributional equivalence principle.Advances inData Analysisand Classification, 3, 305--314.

Fisher, R.A. (1939) The sampling distribution of some statistics obtained from non linear equations.Annals of Eugenics, 9, 238--249.

Fontana, R. (2008) The use of correspondence analysis to study daily tourism flows. Statistica Applicata,20, 93--101.

Foucart, T. (1985) Tableaux symmetriques et tableaux d’echanges. Revue de Statistique Appliquee, 33,37--54.

Fox, R.J. (1998) Perceptual mapping using the basic structure matrix decomposition. Journal of theAcademy of Marketing Science, 16, 47--59.

Freudenthal, M., Martin-Suarez, E., Gallardo, J.A, Daroca, A.G-A., and Minwer-Barakat, R. (2009)The application of correspondence analysis in palaeontology. Comptes Rendus Palevol, 8,1--8.

Gabriel, K.R. (1971) The biplot graphic display of matrices with application to principal componentanalysis. Biometrika, 58, 453--467.

Gabriel, K.R. (1981) Biplot display of multivariate matrices for inspection of data and diagnosis,in Interpreting Multivariate Data (ed. V. Barnett), John Wiley & Sons, Ltd, Chichester, UK,pp. 147--173.

Gabriel, K.R. (1995a) Biplot display of multivariate categorical data, with comments on multiple corre-spondence analysis, in Recent Advances in Descriptive Multivariate Analysis (ed. W.J. Krzanowski),Oxford University Press, pp. 469--485.


Gabriel, K.R. (1995b) MANOVA biplots for two-way contingency tables, in Recent Advances inDescriptive Multivariate Analysis (ed. W.J. Krzanowski), Oxford University Press, pp. 469--485.

Gabriel, K.R. (2002) Goodness of fit of biplots and correspondence analysis. Biometrika, 89,423--436.

Gabriel, K.R. (2006) Biplots. Encyclopedia of Statistical Sciences, 1, 563--570.

Gabriel, K.R. and Odoroff, C.L. (1990) Biplots in biomedical research. Statistics in Medicine, 9,469--485.

George, C., Kaplan, N., and Main, M. (1985) Adult Attachment Interview. Unpublished manuscript,University of California, Berkeley, CA.

Gifi, A. (1990) Nonlinear Multivariate Analysis, John Wiley & Sons, Ltd, Chichester, UK.

Gilula, Z. (1986) Grouping and association in contingency tables: an exploratory canonical correlationapproach. Journal of the American Statistical Association, 81, 773--779.

Gokhale, D.V. and Johnson, N.S. (1978) A class of alternatives to independence in contingency tables.Journal of the American Statistical Association, 73, 800--804.

Goodman, L.A. (1981) Criteria for determining whether certain categories in a cross-classification tableshould be combined with special reference to occupational categories in an occupational mobilitytable. American Journal of Sociology, 87, 612--650.

Goodman, L.A. (1996) A single general method for the analysis of cross-classified data: reconciliationand synthesis of some methods of Pearson, Yule, and Fisher, and also some methods of corre-spondence analysis and association analysis. Journal of the American Statistical Association, 91,408--428.

Gower, J.C. (1977) The analysis of asymmetry and orthogonality, in Recent Developmentsin Statistics (eds. J.R. Barra, F. Brodeau, G. Romer and B. Van Cutsem), North-Holland,pp. 109--123.

Gower, J.C. (1990) Three-dimensional biplots. Biometrika, 77, 773--785.

Gower, J.C. (1992) Generalized biplots. Biometrika, 79, 475--493.

Gower, J.C. (1993) Recent advances in biplot methodology, inMultivariate Analysis: Future Directions2 (eds. C.M. Cuadras and C.R. Rao), North-Holland, pp. 295--325.

Gower, J.C. (2004) The geometry of biplot scaling. Biometrika, 91, 705--714.

Gower, J.C. and Digby, P.G.N. (1981) Expressing complex relationships in two dimensions, inInterpreting Multivariate Data (ed. V. Barnett), John Wiley & Sons, Ltd, Chichester, UK,pp. 83--118.

Gower, J.C., Groenen, P.J.F., and Van de Velden, M. (2010) Area biplots. Journal of Computationaland Graphical Statistics, 19, 46--61.

Gower, J.C. and Hand, D.J. (1996) Biplots, Chapman & Hall.

Gower, J.C. and Harding, S.A. (1988) Nonlinear biplots. Biometrika, 75, 445--455.

Gower, J.C, Lubbe, S., and le Roux, N. (2011) Understanding Biplots, John Wiley & Sons, Ltd,Chichester, UK.

Graffelman, J. and Aluja-Banet, T. (2003) Optimal representation of supplementary variables in bi-plots from principal component analysis and correspondence analysis. Biometrical Journal, 45,491--509.

Greenacre, M.J. (1984) Theory and Applications of Correspondence Analysis, Academic Press.

Greenacre, M.J. (1987) Correspondence analysis on a personal computer.Chemometrics and IntelligentLaboratory Systems, 2, 233--234.

Greenacre, M.J. (1988) Clustering the rows and columns of a contingency table. Journal of Classifica-tion, 5, 39--51.


Greenacre, M.J. (1989) The Carroll--Green--Schaffer scaling in correspondence analysis: a theoreticaland empirical appraisal. Journal of Marketing Research, 26, 358--365.

Greenacre, M.J. (1990) Some limitations of multiple correspondence analysis. Computational StatisticsQuarterly, 3, 249--256.

Greenacre, M. (1992) Correspondence analysis in medical research. Statistical Methods in MedicalResearch, 1, 97--117.

Greenacre, M.J. (1993) Biplots in correspondence analysis. Journal of Applied Statistics, 20,251--269.

Greenacre, M. (2000) Correspondence analysis of square asymmetric matrices. Applied Statistics, 49,297--310.

Greenacre, M. (2002) Correspondence analysis of the Spanish National Health Survey. Gaceta Sani-taria, 16, 160--170.

Greenacre, M. (2009) Power transformations in correspondence analysis. Computational Statistics andData Analysis, 53, 3107--3116.

Greenacre, M. (2010a) Correspondence analysis. WIREs Computational Statistics, 2, 613--619.

Greenacre, M. (2010b) Biplots in Practice, Fundacion BBVA.

Greenacre, M. (2012) Contribution biplots. Journal of Computational and Graphical Statistics, 22,107--122.

Greenacre, M.J. and Hastie, T. (1987) The geometric interpretation of correspondence analysis. Journalof the American Statistical Association, 82, 437--447.

Greenacre, M. and Pardo, R. (2006) Subset correspondence analysis: visualizing relationships among aselected set of response categories from a questionnaire survey. Sociological Methods & Research,35, 193--218.

Greenacre, M.J. and Vrba, E.S. (1984) Graphical display and interpretation of antelope census data inAfrican wildlife areas, using correspondence analysis. Ecology, 65, 984--997.

Guerrero, L., Claret, A., Verbeke, W., Enderli, G., Zakowska-Biemans, S., Vanhonacker, F., Issanchou,S., Sajdakowska, M., Granli, B.S, Scalvedi, L., Contel, M., and Hersleth, M. (2010) Perception oftraditional food products in six European regions using free word association. Food Quality andPreference, 21, 225--233.

Guerrero, L., Claret, A., Verbeke, W., Vanhonacker, F., Enderli, G., Sulmont-Rosse, C., Hersleth, M.,and Guardia, M.D. (2012) Cross-cultural conceptualization of the words Traditional and Innovationin a food context by means of sorting task and hedonic evaluation. Food Quality and Preference, 25,69--78.

Gursoy, D. and Chen, J.S. (2000) Competetive analysis of cross-cultural information search behaviour.Tourism Management, 21, 583--590.

Hammer, Ø. and Harper, D.A.T. (2006) Paleontological Data Analysis, Blackwell.

Harcourt, B.E. (2002) Measured interpretation: introducing the method of correspondence analysis tolegal studies. University of Illinois Law Review, 4, 979--1018.

Headen, R.S, Klompmaker, J.E., and Rust, R.T. (1979) The duplication of viewing law and televisionmedia schedule evaluation. Journal of Marketing Research, 16, 333--340.

Hesse, E. (1999) The adult attachment interview, in Handbook of Attachment: Theory, Research andClinical Applications (eds. J. Cassidy and P.R. Shaver), Guildford Press, pp. 552--598.

Higgs, N.T. (1991) Practical and innovative uses of correspondence analysis. The Statistician, 40,183--194.

Hill, M.O. (1973) Reciprocal averaging: an eigenvector method of ordination. Journal of Ecology, 61,237--249.


Hill, M.O. (1974) Correspondence analysis: a neglected multivariate technique. Applied Statistics, 23,340--354.

Hoffman, D., De Leeuw, J., and Arjunji, R. (1995) Multiple correspondence analysis, in AdvancedMethods of Marketing Research (ed. R.P. Bagozzi), Blackwell, pp. 260--294.

Hoffman, D.L. and Franke, G.R. (1986) Correspondence analysis: Graphical representation of categor-ical data in marketing research. Journal of Marketing Research, 23, 213--227.

Holmes, S. (2008) Multivariate data analysis: the French way, in Probability and Statistics: Essaysin Honor of David A. Freedman (eds. D. Nolan and T. Speed), Institute of Mathematical Statistics,pp. 219--233.

Horn, J.L. (1965) A rationale and test for the number of factors in factor analysis. Psychometrika, 30,179--185.

Hsieh, M-H. (2004) An investigation of country-of-origin effect using correspondence analysis: across-national context. International Journal of Marketing Research, 46, 267--295.

Hsu, P.L. (1939) On the distribution of the roots of certain determinantal equations. Annals of Eugenics,9, 250--258.

Hubert, L. and Arabie, P. (1992) Correspondence analysis and optimal structural representations.Psychometrika, 56, 119--140.

Husson, F.Le.S. and Pages J. (2011) Exploratory Multivariate Analysis by Example Using R, CRCPress.

Israels, A.Z., Bethlehem, J.G., Van Driel, J., Jansen,M.E., Pannekoek, J., De Ree, S.J.M., and Sikkel, D.(1985) Multivariate analysis methods for discrete variables, in Recent Developments in the Analysisof Large-Scale Data Sets (ed. A.Z. Israels), Eurostat, pp. 241--302.

Jambu, M. (1978) Classification Automatique pour l’Analyse de Donnees, I -- Methodes et Algorithms.Dunod.

Jolliffe, I.T. (1986) Principal Component Analysis, Springer.

Kaiser, H.F. (1960) The application of electronic computers to factor analysis. Educational and Psy-chological Measurement, 20, 141--151.

Karadjov, M. and Simeonov, V. (1990) Interpretation of geochemical data by the use of multivariatedata analysis.Mikrochimica Acta, 3, 191--199.

Kiers, H.A.L. (1991) Simple structure in component analysis techniques for mixtures of qualitative andquantitative variables. Psychometrika, 56, 197--212.

Kroonenberg, P.M. (1997) Introduction to biplots for G×E tables. Research Report #51, Centre forStatistics, The University of Queensland, Brisbane, Australia.

Kroonenberg, P.M. and Greenacre, M.J. (2006) Correspondence analysis, in Encyclopedia of StatisticalSciences, 2nd edn (eds. S. Kotz, C.B. Read, N. Balakrishnan and B. Vidakovic), John Wiley & Sons,Inc., Hoboken, NJ, pp. 1394--1403.

Krzanowski, W.J. (1993) Attribute selection in correspondence analysis of incidence matrices. AppliedStatistics, 42, 529--541.

Lang, C. (1978) Factorial correspondence analysis of oligochaeta communities according to eutrophi-cation level. Hydrobiologia, 57, 241--247.

Larsen, R.J. and Marx, M.L. (1986) An Introduction to Mathematical Statistics and its Applications,2nd edn, Prentice-Hall.

Lebart, L. (1976) The significancy of eigenvalues issued from correspondence analysis of contingencytables, in COMPSTAT. 1976 (eds. J. Gordesch and P. Naeve), Physica-Verlag, pp. 38--45.

Lebart, L. (1985) Exploratory analysis of survey data: The role of correspondence analysis (withdiscussions), in Recent Developments in the Analysis of Large-Scale Data Sets (ed. A.Z. Israels),Eurostat, pp. 169--188.


Lebart, L. (1997) Correspondence analysis, discrimination and neural networks, in Data Science,Classification and Related Methods (eds. C. Hayashi, N. Ohsumi, K. Yajima, Y. Tanaka, H.-H. Bockand Y. Baba), Springer, Berlin, pp. 423--430.

Lebart, L. (1998) Visualizations of textual data, in Visualizations of Categorical Data (eds. J. Blasiusand M. Greenacre), Academic Press, pp. 133--147.

Lebart, L. (2008) Exploratory multivariate data analysis from its origins to 1980: nine contributions.Electronic Journal for History of Probability and Statistics, 4 (2), 17 pp.

Lebart, L. andMirkin, B.G. (1993) Correspondence analysis and classification, inMultivariate Analysis,Future Directions (eds. C. Cuadras and C.R. Rao), North-Holland, pp. 341--357.

Lebart, L., Morineau, A., andWarwick, K.M. (1984)Multivariate Descriptive Statistical Analysis, JohnWiley & Sons, Ltd, Chichester, UK.

Le Roux, B. and Rouanet, H. (1998) Interpreting axes in multiple correspondence analysis: methods ofthe contributions of points and deviations, in Visualizations of Categorical Data (eds. J. Blasius andM. Greenacre), Academic Press, pp. 197--220.

Li, B.-H., Sun, Z.-Q., and Dong, S.-F. (2010) Correspondence analysis and its application in oncology.Communications in Statistics (Theory and Methods), 39, 1229--1236.

Li, H. and Yamanishi, K. (2001) Mining from open answers in questionnaire data. Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (eds. D. Lee, M. Schkolnick, F. Provost and R. Srikant), ACM, pp. 443--449.

Lipkovich, I.A. and Smith, E.P. (2002) Biplot and singular value decomposition macros for Excel ©. Journal of Statistical Software, 7 (5), 15 pages.

Lombardo, R., Carlier, A., and D’Ambra, L. (1996) Nonsymmetric correspondence analysis for three-way contingency tables. Methodologica, 4, 59--80.

Lorenzo-Seva, U. (2011) Horn’s parallel analysis for selecting the number of dimensions in correspondence analysis. European Journal of Research Methods for the Behavioral and Social Sciences, 7, 96--102.

Lorenzo-Seva, U., Van de Velden, M., and Kiers, H.A.L. (2009) Oblique rotation in correspondence analysis: a step forward in the search for the simplest interpretation. British Journal of Mathematical and Statistical Psychology, 62, 583--600.

MacGillivray, H.L. (1986) Skewness and asymmetry: measures and orderings. The Annals of Statistics, 14, 994--1011.

Marques, A.P., San Romao, M.V., and Tenreiro, R. (2012) RNA fingerprinting analysis of Oenococcus oeni strains under wine conditions. Food Microbiology, 31, 238--245.

Mazzarol, T.W. and Soutar, G.N. (2008) Australian educational institutions’ international markets: a correspondence analysis. International Journal of Educational Management, 22, 229--238.

Mellinger, M. (1987a) Correspondence analysis: the method and its application. Chemometrics and Intelligent Laboratory Systems, 2, 61--77.

Mellinger, M. (1987b) Interpretation of lithogeochemistry using correspondence analysis. Chemometrics and Intelligent Laboratory Systems, 2, 93--108.

Moser, E.B. (1985) Exploring contingency tables with correspondence analysis. Computational and Applied Bioscience, 5, 183--189.

Moussa, M.A.A. and Ouda, B.A. (1988) Correspondence analysis of contingency tables. Computer Methods and Programs in Biomedicine, 27, 111--119.

Murtagh, F. (1982) Verifying examination results: a general approach. SIGCSE Bulletin, 14 (4), 2--11.


Nakayama, T. (2001) Tests for redundancy of some variables in correspondence analysis. Hiroshima Mathematical Journal, 31, 1--34.

Nowak, E. and Bar-Hen, A. (2005) Influence function and correspondence analysis. Journal of Statistical Planning and Inference, 134, 26--35.

Osmond, C. (1985) Biplot models applied to cancer mortality rates. Applied Statistics, 34, 63--70.

Pacheco, F.A.L. (1998) Application of correspondence analysis in the assessment of groundwater chemistry. Mathematical Geology, 30, 129--161.

Pack, P. and Jolliffe, I.T. (1992) Influence in correspondence analysis. Applied Statistics, 41, 365--380.

Parsons, B.G., Watmough, S.A., Dillon, P.J., and Somers, K.M. (2010a) A bioassessment of lakes in the Athabasca Oil Sands Region, Alberta, using benthic macroinvertebrates. Journal of Limnology, 69, 105--117.

Parsons, B.G., Watmough, S.A., Dillon, P.J., and Somers, K.M. (2010b) Relationships between lake water chemistry and benthic macroinvertebrates in the Athabasca Oil Sands Region, Alberta. Journal of Limnology, 69, 118--125.

Polackova, J. and Jindrova, A. (2010) Innovative approach to education and teaching of statistics. Journal of Efficiency and Responsibility in Education and Science, 3, 14--27.

Rao, C.R. (1995) A review of canonical coordinates and an alternative to correspondence analysis using Hellinger distance. Questiio, 19, 23--63.

Rhodes, H.R. and Myers, D.E. (1991) Correspondence analysis used in the evaluation of lakewater chemistry in the Adirondacks. Journal of Chemometrics, 5, 273--290.

Ringrose, T.J. (1992) Bootstrapping and correspondence analysis in archaeology. Journal of Archaeological Science, 19, 615--629.

Ringrose, T.J. (1996) Alternative confidence regions for canonical variate analysis. Biometrika, 83, 575--587.

Ringrose, T.J. (2012) Bootstrap confidence regions for correspondence analysis. Journal of Statistical Computation and Simulation, 83, 1397--1413.

Ruppert, D. (1987) What is kurtosis? An influence function approach. The American Statistician, 41, 1--5.

Saenz-Navajas, M., Campo, E., Sutan, A., Ballester, J., and Valentin, D. (2013) Perception of wine quality according to extrinsic cues: the case of Burgundy wine consumers. Food Quality and Preference, 27, 44--53.

Schifferstein, H.N.J., Fenko, A., Desmet, P.M.A., Labbe, D., and Martin, N. (2013) Influence of package design on the dynamics of multisensory and emotional experience. Food Quality and Preference, 27, 18--25.

Selikoff, I.J. (1981) Household risks with inorganic fibers. Bulletin of the New York Academy of Medicine, 57, 947--961.

Shanka, T., Quintal, V., and Taylor, R. (2006) Factors influencing international students’ choice of an education destination -- a correspondence analysis. Journal of Marketing for Higher Education, 15 (2), 31--46.

Shennan, S. (1997) Quantifying Archaeology, 2nd edn, Edinburgh University Press.

Silic, A., Morin, A., Chauchat, J.-H., and Basic, B.D. (2012) Visualization of temporal text collections based on correspondence analysis. Expert Systems with Applications, 39, 12143--12157.

Smith, W.F. and Cornell, J.A. (1993) Biplot displays for looking at multiple response data in mixture experiments. Technometrics, 35, 337--350.

Sourial, N., Wolfson, C., Zhu, B., Quail, J., Fletcher, J., Karunananthan, S., Bandeen-Roche, K., Beland, F., and Bergman, H. (2010) Correspondence analysis is a useful tool to uncover the relationships among categorical variables. Journal of Clinical Epidemiology, 63, 638--646 [Erratum: Journal of Clinical Epidemiology, 63, 809].

Takane, Y. and Jung, S. (2009) Regularized nonsymmetric correspondence analysis. Computational Statistics and Data Analysis, 53, 3159--3170.

Takane, Y., Yanai, H., and Mayekawa, S. (1991) Relationships among several methods of linearly constrained correspondence analysis. Psychometrika, 56, 667--684.

Tanaka, Y. and Tarumi, T. (1985) Computational aspect of sensitivity analysis in multivariate methods. Technical Report 12, Okayama Statistician Group, Okayama University.

Tarnai, C. and Wuggenig, U. (1998) Normative integration of the avant-garde? Traditionalism in the art worlds of Vienna, Hamburg, and Paris, in Visualization of Categorical Data (eds. J. Blasius and M. Greenacre), Academic Press, pp. 171--184.

Ter Braak, C.J.F. (1985) Correspondence analysis of incidence and abundance data: properties in terms of a unimodal response model. Biometrics, 41, 859--873.

Ter Braak, C.J.F. (1987) Ordination, in Data Analysis in Community and Landscape Ecology (eds. R.H. Jongman, C.J.F. Ter Braak and O.F.R. Van Tongeren), Pudoc, pp. 91--173.

Thiessen, V. and Blasius, J. (1998) Using multiple correspondence analysis to distinguish between substantive and nonsubstantive responses, in Visualizations of Categorical Data (eds. J. Blasius and M. Greenacre), Academic Press, pp. 239--252.

Van der Heijden, P.G.M. (1992) Three approaches to study the departure from quasi-independence. Statistica Applicata, 4, 465--480.

Van der Heijden, P.G.M., De Falguerolles, A., and De Leeuw, J. (1989) A combined approach to contingency table analysis using correspondence analysis and log-linear analysis. Applied Statistics, 38, 249--292.

Van de Velden, M. and Kiers, H.A.L. (2005) Rotation in correspondence analysis. Journal of Classification, 22, 251--271.

Van de Velden, H. and Neudecker, H. (2000) On an eigenvalue property relevant in correspondence analysis and related methods. Linear Algebra and Its Applications, 321, 347--364.

Van IJzendoorn, M.H. (1995) Adult attachment representations, parental responsiveness and infant attachment. A meta-analysis on the predictive validity of the Adult Attachment Interview. Psychological Bulletin, 117, 387--403.

Van Meter, K., Schiltz, M-A., Cibois, P., and Mounier, L. (1994) Correspondence analysis: a history and French sociological perspective, in Correspondence Analysis in the Social Sciences (eds. M. Greenacre and J. Blasius), Elsevier, pp. 128--137.

Van Rijckevorsel, J.L.A. and De Leeuw, J. (1988) Component and Correspondence Analysis: Dimension Reduction by Functional Approximation, John Wiley & Sons, Ltd, Chichester, UK.

Vicente-Villardon, J.L., Galindo Villardon, M.P., and Blazquez-Zaballos, A. (2006) Logistic biplots, in Multiple Correspondence Analysis and Related Methods (eds. M. Greenacre and J. Blasius), Chapman & Hall/CRC Press, pp. 503--521.

Watts, D.D. (1997) Correspondence analysis: a graphical technique for examining categorical data. Nursing Research, 46, 235--239.

Whitlark, D.B. and Smith, S.M. (2001) Using correspondence analysis to map relationships. Marketing Research, 13 (3), 22--27.

Whitlark, D.B. and Smith, S.M. (2004) Pick and choose: a correspondence analysis of ‘pick data’. Marketing Research, 16 (4), 9--14.

Wickens, T.D. (1989) Multiway Contingency Tables Analysis for the Social Sciences, Lawrence Erlbaum Associates Inc.

Wickens, T.D. (1998) Categorical data analysis. Annual Review of Psychology, 49, 537--557.


Yamakawa, A., Ichihashi, H., and Miyoshi, T. (1998) Multiple correspondence analysis based on Ls-Norm and its application to an analysis of senior simulation, in Proceedings of the Second Japan--Australia Joint Workshop on Intelligent and Evolutionary Systems, pp. 99--106.

Yamakawa, A., Ichihashi, H., and Miyoshi, T. (1999a) Multiple correspondence analysis of mimetic experiences of advanced aged persons, in Proceedings of International Conference on Production Research, vol. 2, pp. 1065--1068.

Yamakawa, A., Kanaumi, Y., Ichihashi, H., and Miyoshi, T. (1999b) Simultaneous application of clustering and correspondence analysis, in Proceedings of International Conference on Neural Networks. No. 625.

Yehia, A.Y. (1993) A continuous analog of correspondence analysis with an application to the Dirichlet distribution. The Egyptian Statistical Journal, 37, 190--197.

Yelland, P.M. (2010) An introduction to correspondence analysis. The Mathematica Journal, 12, 1--23.
