[Wiley Series in Probability and Statistics] Applied Multiway Data Analysis || Improving Interpretation through Rotations
Post on 08-Dec-2016
Embed Size (px)
IMPROVING INTERPRETATION THROUGH ROTATIONS
The topic of this chapter is the rotation of component matrices and core arrays to enhance the interpretability of the results of multiway analysis and in particular, three-mode analyses. This enhancement is realized by increasing the number of large coefficients and decreasing the number of small coefficients in the sets of parameters, the idea being that such rotated solutions are simpler to interpret. Thus, the quest for an increase in the interpretability of the solutions of component models is a search for simplicity of the components, the core array, or both. This quest is particularly relevant for the Tucker models because they have parameter matrices, which can be transformed without loss of fit of the model to the data. In contrast, the other major multiway model, the Parafac model, is defined in such a way that rotating its solution will lead to a loss of fit. Therefore, the chapter will primarily deal with Tuckermodels.
Rotations versus setting parameters to zero. A separate matter is whether one is willing to sacrifice a certain amount of fit in favor of more clear-cut solutions by
App/ied Mulriwuy Data Analysis. By Pieter M. Kroonenberg Copyright @ 2007 John Wiley & Sons, Inc.
238 IMPROVING INTERPRETATION THROUGH ROTATIONS
effectively setting small coefficients to zero. Setting specific parameters of a model to zero introduces restrictions on the solution, the model itself changes, and such changes are essentially different from rotations that operate within the chosen model. It is possible to fit models with restrictions on the component configurations, but applications of such restrictions do not feature very prominently in this book.
When restrictions are introduced, a step toward confirmatory models is made, because with restrictions the models become testable and generally require special algorithms. There is a class of multiway models, that is, multimode covariance structure models, that are clearly confirmatory in nature; see Bloxom (1984), Bentler and Lee (1978b, 1979), Bentler et al. (1988), Oort (1999), and Kroonenberg and Oort (2003a). However, the confirmatory approach in multiway data analysis is a topic that falls outside the scope of this book.
Transformations and rotations. In order to avoid confusion later on, a brief re- mark about the word rotation is in order. Most mathematicians associate the word with orthonormal transformations. However, especially in the social and behavioral sciences, expressions like oblique rotations crop up, referring to nonsingular trans- formations (mostly with unit-length columns) that transform orthogonal axes into nonorthogonal or oblique axes. Note that orthogonal rotations of other than ortho- normal matrices also lead to nonorthogonal axes. Here, we will use the terms rotation and transformation interchangeably as generic terms and specify which kind of trans- formation is intended. However, we will generally refer to rotating components rather than transforming components, because the former expression seems less ambiguous.
Rotation for simplification. Simplifying components has a different rationale than simplification of core arrays. Few large coefficients and many small ones in com- ponent matrices make it easier to equate a component with a latent variable with a narrowly defined meaning. A small number of large elements in a core array indicates that certain combinations of components contribute substantially to the structural im- age, while combinations of components associated with very small or zero core entries do not. In other words, if we succeed in rotating the core array to such a form that there are only a few large elements, we effectively reduce the size of the model itself, because many of the multiplicative combinations do not seriously contribute to the model.
Influence of simplifications. Simpler structures in one or more of the sets of para- meters allow for easier interpretation of that set. However, simplicity in one part of the model will generally create complexity in another part, so that the interpretation of those parts is impaired. In order to obtain coherent descriptions of the results, rotation in one part of a three-mode model should be accompanied by a counterro- tation or inverse transformation of an appropriate other part of the parameters in the model. Mostly, this may mean that if a set of easily interpretable components is real- ized through rotations, the counterrotations of the core array equalize the sizes of the core entries in the same way that rotations in two-mode PCA redistribute the variance
among the components. Equalizing the elements of the core array means that there are fewer large and small core elements and more medium-sized ones, which means that more component combinations have to be interpreted. On the other hand, a simple core array with only a few large elements generally yields complex and correlated components, which tend to be difficult to interpret. Correlated components destroy the variance partitioning in the core array, which may or may not be convenient. On the other hand, correlated variables are the bread and butter of the social and behav- ioral sciences. However, transformations that create an extremely simple core array may also introduce such strong correlations between components as to make it nearly impossible to designate them as separate components.
Model-based and content-based arguments for rotations. Two types of arguments may be distinguished when deciding which parts of a model should be rotated in which way. Model-based arguments prescribe which type of rotations are appropriate with respect to the model, and content-based arguments determine which modes have to be rotated with which type of rotation. Furthermore, much research has gone into the investigation of the number of zero entries a full core array can have given its size by making use of the transformational freedom. It turns out that mathematical theory shows that some core arrays can be extremely simple in this sense. In other words, some simple cores are not the result of a particular data set, but are the result of the mathematical structure of the problem. In such cases the simplicity itself cannot be used as an argument in support of a substantive claim (see Section 10.5).
Rotations and conceptualization of three-mode model. Within the multivariate point of view rotations are strongly linked to the (factor-analytic) idea of latent vari- ables. However, not in all applications is a latent variable a useful concept. More- over, within the multidimensional point of view one may be satisfied with finding the unique low-dimensional subspaces and may look at patterns among the entities (subjects, variables, etc.) in these low-dimensional spaces without attempting to find simple axes in the space. Instead, one could look for directions (possibly more than the number of dimensions of the space) in which the variables, subjects, and so on show interesting variation. The interpretation will then primarily take place in terms of the original quantities (levels of the modes), and rotations may only be useful to find a convenient way of looking around in the low-dimensional space, especially if it has more than two dimensions.
Rotations are two-way not multiway procedures. Even though we speak about rotations of core arrays and seem to imply that we handle the core array in its entirety, in fact, nearly all rotations discussed in multiway modeling pertain to rotations of matrices, that is, two-way matrices rather than three-way arrays. The latter are handled for each mode separately by first matricizing in one direction, then applying some (standard) two-way rotational procedure, and reshaping the rotated matrix back into an array again. This may be done for each mode in turn. Exceptions to this approach
240 IMPROVING INTERPRETATION THROUGH ROTATIONS
are techniques that explicitly attempt to maximize and minimize specific elements of a core array via specific optimization algorithms (i.e.. not via explicit rotations).
Rotations do not change models. In all of what follows, it should be kept in mind that rotations do not change the information contained in the model but only rearrange it. Thus, all rotations are equally valid given a particular model. The rotations do not alter the information itself but only present it in a different way. Nevertheless, certain forms of the models may lend themselves more easily to theoretical development in a subject area than others, and therefore some forms may be preferred above others on grounds external to the data.
There are some model-based aspects as well. For instance, the statistical properties of the coefficients in the original solution may not be the same as the coefficients of the rotated solution. For instance, correlations between components may be introduced where they were not present before.
10.2 CHAPTER PREVIEW
In this chapter we will briefly review procedures for rotation common in standard PCA, in particular, the varimax and the Harris-Kaiser independent cluster rotation of component matrices. Then, we will discuss rotations for the core of both the Tucker2 and Tucker3 models. In particular, we will consider procedures to achieve simplicity in the core itself as well as procedures to get as close as possible to Parafac solutions of the same data. Finally, techniques that combine simplicity rotation for the core array and the components will be examined.
The material dealing with rotations is primarily described in a series of papers by Kiers (1992a, 1997a, 1998b, 1998d) in which rotations of components, rotations of full core arrays, and joint rotations of components and core arrays in the Tucker3 context are discussed. Earlier discussions of rotating Tucker2 extended core arrays toward diagonality can be found in Kroonenberg (1983c, Chapter 5 ) and Brouwer and Kroonenberg (1991), who also mention a few earlier references. Andersson and Henrion (1999) discuss a rotation procedure to diagonalize full core arrays.
The discussion of rotating core arrays is closely tied in with the problem of the maximum number of elements that can be zeroed in a core array without loss of fit (a maximally simple core array). This problem is discussed in Kierss papers mentioned above and in Murakami, Ten Berge, and Kiers (1 998b). There are also important links with the rank of a three-way array but these aspects will not be discussed here (see Section 8.4, p. 177, for a brief discussion and references).
10.2.1 Example: Coping at school
Most rotational procedures will be illustrated by means of a subset of the Coping data set discussed in Section 14.4 (p. 354). In brief, we have 14, specially selected,
ROTATING COMPONENTS 241
primary school children who for 6 potentially stressful situations mainly related to school matters have indicated to what extent they would feel emotions (like being angry, experiencing fear) and to which extent they would use a set of coping strategies to handle the stressful situation (Roder, 2000). The data will be treated as three-way rating scale data in a scaling arrangement (see Section 3.6.2. p. 35). Thus, we have placed the 14 children in the third, the 8 scales (4 emotions, 4 strategies) in the second and the 6 situations in the first mode. The data are centered across situations for each variable-child combination. If we normalize, it will be per variable slice (for more details on preprocessing choices see Chapter 6). We will emphasize the numerical aspects but introduce some substantive interpretation as well.
10.3 ROTATING COMPONENTS
In standard principal component analysis. it is customary to rotate the solution of the variables to some kind of "simple structure" or to "simplicity", most often using Kaiser's (1958) varimax procedure in the case of orthogonal rotations and oblimin (e.g.. Harman. 1976) or promax (Hendrickson & White, 1964) in the case of oblique rotations. These and other rotational procedures have occasionally been applied in three-mode principal component analysis. In this book we will not review the tech- nical aspects of these procedures as good overviews can be found in the standard literature (e.g.. Browne, 2001), but only touch upon those aspects of rotating compo- nents that are different or specifically relevant for three-mode analysis.
Several authors have advocated particular rotations of component matrices from specific types of data. For instance. Lohmoller (1978a) recommends rotation of time-mode component matrices to a matrix of orthogonal polynomials as a target, a proposal also put forward by Van de Geer (1974). Subject components are often not rotated because the first component (especially in research using rating scales) can be used to describe the consensus of the subjects about the relationships between the other two modes. Sometimes subject-mode components are rotated so that the axes pass through centroids of clusters of individuals. On the strength of content- based arguments, Tucker (1972, pp. 10-12) advocated that the "first priority for these transformations should be given to establishing meaningful dimensions for the object space". which concurs more with a multidimensional point of view.
One of the confusing aspects in multiway analJsis when discussing rotations to schieke simplicity is that the coefficients of the component matrices do not automati- cally have the characteristics of scores and variable-component correlations they have in the covariance form of two-mode component analysis (see Section 9.3.1. p. 215). The reason for this is that in the covariance form of two-mode component analysis. variables are automatically standardized and the subject scores are commonly pre-
Rotating normalized versus principal coordinates
242 IMPROVING INTERPRETATION THROUGH ROTATIONS
sented such that they are standardized as well, while the variable coefficients are, at least before rotation, variablesomponent correlations. In three-mode analysis, there is nothing automatic about the preprocessing or the scaling of the components them- selves (see Chapter 6). The basic form of the Tucker models is that all component matrices are orthonormal (i.e., in normalized coordinates), and the size of the data is contained in the core array. Whatever scaling or preprocessing is appropriate has to be explicitly decided by the researcher (see Chapter 9), and when it comes to rotations the type of rotation and the mode to which it has to be applied has to be a conscious decision based both on content and on model arguments.
For the Tucker3 model, three rotation matrices can be defined (two for the Tucker2 model) and after the components of a mode have been rotated the inverse transforma- tion has to be applied to the core array. Thus, if we applied the orthonormal matrix S to A, t...