

APPLIED STOCHASTIC MODELS IN BUSINESS AND INDUSTRY
Appl. Stochastic Models Bus. Ind. 2008; 24:439–458
Published online 28 August 2008 in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/asmb.728

REBUS-PLS: A response-based procedure for detecting unit segments in PLS path modelling

V. Esposito Vinzi¹,∗,†, L. Trinchera², S. Squillacciotti³ and M. Tenenhaus⁴

¹ESSEC Business School of Paris, Avenue Bernard Hirsch - B.P. 50105, 95021 Cergy-Pontoise, France
²DMS, University of Naples ‘Federico II’, Via Cintia 27, C.Monte Sant’Angelo, 80126 Naples, Italy
³Électricité de France, R&D, 1 Avenue du Général de Gaulle, 92141 Clamart, France
⁴HEC School of Management (GREGHEC), 1 Rue de la Libération, 78351 Jouy-en-Josas, France

SUMMARY

Structural equation models (SEMs) make it possible to estimate the causal relationships, defined according to a theoretical model, linking two or more latent complex concepts, each measured through a number of observable indicators, usually called manifest variables. Traditionally, the component-based estimation of SEMs by means of partial least squares (PLS path modelling, PLS-PM) assumes homogeneity over the observed set of units: all units are supposed to be well represented by a unique model estimated on the overall data set. In many cases, however, it is reasonable to expect classes made of units showing heterogeneous behaviours to exist.

Two different kinds of heterogeneity could be affecting the data: observed and unobserved heterogeneity. The first refers to the case of a priori existing classes, whereas in unobserved heterogeneity no information is available either on the number of classes or on their composition.

If a group structure for the statistical units is given, the aim of the analysis is to search for any differences in the behaviours of the a priori given classes. In PLS-PM this would mean studying the effect of directly observed moderating variables, i.e. estimating as many (local) models as there are classes.

Unobserved heterogeneity, instead, implies identifying classes of units (a priori unknown) having similar behaviours. Such heterogeneity is captured by an unobserved (latent) discrete moderating variable defining both the number of classes and the class membership.

A new method for unobserved heterogeneity detection in PLS-PM is proposed in this paper: response-based procedure for detecting unit segments in PLS-PM (REBUS-PLS). REBUS-PLS, according to PLS-PM features, does not require distributional hypotheses and may lead to local models that are different in terms of both structural and measurement models. An application of REBUS-PLS on real data will be shown. Copyright © 2008 John Wiley & Sons, Ltd.

Received 17 May 2008; Revised 28 May 2008; Accepted 25 June 2008

∗Correspondence to: V. Esposito Vinzi, ESSEC Business School of Paris, Avenue Bernard Hirsch - B.P. 50105, 95021 Cergy-Pontoise, France.

†E-mail: [email protected]

Contract/grant sponsor: CERESSEC
Contract/grant sponsor: MURST grant

KEY WORDS: classification; latent classes; moderating effects; structural equation modelling; FIMIX-PLS; typological PLS-PM

1. INTRODUCTION

Structural equation models (SEMs) include a number of statistical methodologies that make it possible to estimate the causal relationships, defined according to a theoretical model, linking two or more latent complex concepts, each measured through a number of observable indicators known as manifest variables.

Two different approaches are available for the estimation of a SEM: a covariance structure-based approach, maximum likelihood estimation of structural equation models (SEM-ML), also known as linear structural relations (LISREL) [1, 2], and a component-based (or variance-based) approach, the PLS approach to structural equation modelling, also known as partial least squares path modelling (PLS-PM) [3, 4]. The two approaches are more complementary than competing, and the choice of one rather than the other should depend on the purpose of the analysis, the nature of the model and the research context [5]. However, since PLS-PM is known to be a ‘soft modelling’ technique requiring no distributional assumptions on latent or manifest variables, it is often better suited to empirical data. Indeed, many real cases show violations of the hypotheses required by SEM-ML, and may thus lead to incoherent or illogical results [6].

Traditionally, SEMs assume homogeneity over the observed set of units. In other words, all units are supposed to be well represented by a unique global model. This assumption may, however, often turn out to be false. In many cases it is reasonable to expect that different classes of units showing heterogeneous behaviours exist, and that treating all units as belonging to a single class may lead to biased results both in terms of model parameters and of validation indexes [7, 8]. This problem is particularly relevant in marketing research, where SEMs are often used to model customers’ behaviours and choices. In this particular domain, treating all units as belonging to the same model may hide differences in customer behaviour [9, 10], whereas marketing managers are interested in finding ways to exploit heterogeneity in customers as new opportunities in business strategies. Indeed, when customers do show different behaviours, models accounting for this heterogeneity allow targeted and more efficient strategies to be defined.

The traditional approach to clustering in structural equation modelling consists of estimating separate models for customer segments that have been obtained by external clustering techniques. This can be achieved either by assigning customers to a priori classes on the basis of external variables, such as demographic or consumption variables, or through cluster analysis. Concerning this last point, in SEMs classes can be obtained by performing a cluster analysis either on the manifest or on the estimated latent variable scores. None of these a priori approaches, however, can be considered really satisfactory for several different reasons. First, heterogeneity in the models can very rarely be captured by well-known observable variables [11]. Clustering techniques on manifest variables or on latent variable scores in no way take into account the model and its network of relationships. On the other hand, local models obtained by cluster analysis on the latent variable scores will lead to differences in the group averages of latent variables but not necessarily to different models. The same method performed on the manifest variables is unlikely to lead to different and well-separated models, both in terms of model parameters and of average latent variable scores. Moreover, clustering procedures may show some theoretical problems: traditional cluster analysis assumes independence among variables, while SEMs are based on the assumption that variables (latent or manifest) are correlated; preliminary data reduction techniques may also lead to statistical problems [7]. Apart from the methodological considerations, a priori clustering is not conceptually acceptable in this framework since no causal structure among the variables is postulated: when information concerning the causal relationships among variables is available (as it is in the theoretical causal network defining the structural model), classes should be looked for while taking into account this relevant piece of information. In other words, a response-based clustering method should be used, where the obtained classes are homogeneous with respect to the postulated model. This approach to clustering is opposed to traditional a priori clustering, where classes are defined according to external criteria and, therefore, not related to the existing model.

In this paper we focus on techniques for detecting unit segments uniquely in a PLS-PM framework. A new response-based approach for capturing unit heterogeneity in PLS structural equation modelling is proposed: response-based procedure for detecting unit segments in PLS-PM (REBUS-PLS). This approach shows a number of interesting features as compared with existing response-based clustering techniques in a PLS-PM framework, such as finite-mixture PLS (FIMIX-PLS) [11] and PLS typological path modelling (PLS-TPM) [12]. REBUS-PLS is distribution-free, coherently with PLS basic principles, and accounts for heterogeneity in both the structural and the measurement models.

In the following, we will use the term local whenever the model is estimated over a single subset (class) of units, while the global model is meant to be estimated over the whole set of observed units.

The first part of this paper deals with existing response-based clustering techniques in PLS-PM. The main methodological aspects of FIMIX-PLS and of PLS-TPM will be presented. A discussion on the main features and the principal issues of both techniques will lead us to present an original methodology that is able to handle heterogeneity in PLS-PM: REBUS-PLS. After describing the methodological foundations, the new proposal is tested on real data from a study published by The Economist in 1993 [13].

2. PLS-PM IN THE PRESENCE OF A GROUP STRUCTURE: MULTI-GROUP ANALYSIS AND LATENT CLASS DETECTION

Two different kinds of heterogeneity could affect the data: observed and unobserved heterogeneity. The first refers to the case of a priori existing classes, while in unobserved heterogeneity no information is available either on the number of classes or on their composition.

If the classes’ composition is known, the aim of the analysis is to search for the differences in the behaviours of the units belonging to the a priori classes by estimating as many local models as there are classes. In PLS-PM, this would mean studying the effect of directly observed moderating variables [14–16]. Traditionally, observed heterogeneity in PLS-PM is dealt with by means of multi-group analysis. After the estimation of a global model, units are assigned to classes according to prior knowledge or external variables, and then local models are estimated. These models are then compared, essentially in terms of structural coefficients and goodness-of-fit (GOF) indexes, in order to identify differences among classes. However, not being based on the model itself, the procedure might not necessarily lead to an improvement in the predictive power of the local models. Moreover, the obtained groups might not be homogeneous with respect to the structural and the measurement models.

Unobserved heterogeneity, on the other hand, implies identifying classes of units having similar behaviours. Such heterogeneity is captured by an unobserved (latent) discrete moderating variable defining both the number of classes and the class membership. In order to achieve this goal, a homogeneity criterion must be defined. To the authors’ knowledge, three methods exist for handling unobserved heterogeneity by response-based techniques in a PLS-PM framework: FIMIX-PLS [11, 17], PLS-TPM [12, 18] and PATHMOX [19]. The latter, being more a method based on binary segmentation leading to classes showing different models than one of classification, will not be discussed within the present paper.

As described by Hahn et al. [11], FIMIX-PLS is an extension of finite mixture models from SEM-ML [7, 20] to a PLS-PM framework. This technique combines a finite mixture procedure with an expectation–maximization (EM) algorithm [10, 20], which specifically concerns the PLS-PM predictions, obtained by means of classical OLS regressions. It is based on the assumption that if separate classes exist among the units, the unobserved heterogeneity will be concentrated in the structural model, i.e. in the relationships between latent variables. Similar to finite mixture models in SEM-ML, the population is supposed to be a mixture of two or more sub-populations (hereafter called classes), each characterized by a different distribution, and mixed in different proportions. The aim is to identify the class membership probability for each unit.

The first step of FIMIX-PLS consists of estimating the path model through the PLS-PM algorithm. Then, the estimated latent variable scores are used to detect the classes by an EM-based procedure.

In order to ensure model identification, a normality assumption is required, at least at the endogenous latent variable level. Nevertheless, PLS-PM is a distribution-free technique that does not require distributional assumptions on manifest variables. Indeed, the estimated latent variable scores are unlikely to follow a normal distribution except for the case of latent variables corresponding to super-blocks in PLS hierarchical models. As shown in [14], these particular latent variables may approach a normal distribution even if both the manifest variables and the other latent variable scores are far from normality.

Moreover, in FIMIX-PLS, heterogeneity is supposed to be only concentrated in the structural model. Therefore, in order to identify the classes and calculate the latent variable scores, the measurement model is kept constant over the iterations. In other words, in all local models, the outer weights are constant and equal to those obtained for the global model. Thus, if units differ with respect to the measurement model, FIMIX-PLS is not able to capture this source of heterogeneity as long as the structural coefficients are similar across the different classes. Ringle et al. proposed to solve this problem by performing an ex post analysis in the last step of the procedure [17]. The ex post analysis consists of looking for one or more external variables possibly able to lead to the same classes as those identified by FIMIX-PLS. Then, a multi-group PLS-PM [16] is performed over the a priori segmented data. This analysis will lead to class-specific latent variable scores as well as to different measurement and structural models. Very rarely, however, external variables lead unambiguously to the same classes as those yielded by FIMIX-PLS.

Another issue concerning FIMIX-PLS is that the number of classes is not known a priori nor is it included as a parameter in the estimation process. In order to detect the optimal partition, FIMIX-PLS has to be repeated, each time using a different choice for the number of classes G, i.e. G = 1, G = 2, G = 3, and so on. Since mixture models are not asymptotically distributed as a chi-square and the likelihood ratio test (LRT) does not have an asymptotically full rank quadratic form, the LRT statistic is not valid. For this reason, different global measures of fit need to be used when choosing the appropriate number of classes, such as Akaike’s information criterion (AIC), the consistent AIC (CAIC) and the Bayesian information criterion (BIC), or the entropy measure. The latter indicates the degree of separation among the classes. Bounded between 0 and 1, its value will increase with the improvement of class separation. Values higher than 0.5 indicate an unambiguous segmentation. Experience has however shown that the choice of the appropriate number of classes is not at all straightforward in empirical applications. When trying FIMIX-PLS with increasing number of classes, the method may lead to unacceptable results, such as $R^2$ higher than 1 or negative variances. These problems appear especially when additional class sizes are too small.
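As an illustration of the fit measures just mentioned, the following minimal sketch computes them for one fitted solution. It is not taken from the paper: the formulas are the usual textbook definitions, it is assumed that ‘the consistent criterion’ refers to the consistent AIC (CAIC) and that the entropy measure is the normalized entropy computed from the class membership probabilities, and the function names `information_criteria` and `normalized_entropy` are hypothetical.

```python
import math


def information_criteria(log_lik, n_params, n_units):
    """Return (AIC, BIC, CAIC) for one fitted mixture solution.

    log_lik  : maximized log-likelihood of the solution
    n_params : number of free parameters (supplied by the analyst)
    n_units  : number of observed units
    """
    aic = -2.0 * log_lik + 2.0 * n_params
    bic = -2.0 * log_lik + n_params * math.log(n_units)
    caic = -2.0 * log_lik + n_params * (math.log(n_units) + 1.0)
    return aic, bic, caic


def normalized_entropy(memberships):
    """Normalized entropy in [0, 1]; values near 1 mean well-separated classes.

    memberships: one row per unit, each row holding the G class
    membership probabilities of that unit (rows sum to 1).
    """
    n_units = len(memberships)
    n_classes = len(memberships[0])
    if n_classes < 2:
        raise ValueError("the entropy measure needs at least two classes")
    total = 0.0
    for row in memberships:
        for p in row:
            if p > 0.0:  # treat 0 * log(0) as 0
                total -= p * math.log(p)
    return 1.0 - total / (n_units * math.log(n_classes))
```

In such a sketch, the solutions obtained for G = 2, 3, ... would be compared by looking for low values of the information criteria together with a high entropy value.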

To conclude, since FIMIX-PLS is based on the EM algorithm, all the drawbacks of the EM algorithm (such as convergence to a local optimum) are likewise present in FIMIX-PLS.

In order to overcome some of the FIMIX-PLS drawbacks, PLS-TPM [12, 18] has been recently presented. This is a prediction oriented, distribution-free, response-based clustering technique. The assignment of units to the classes is based on a measure of unit-model ‘distance’. In other words, the procedure iteratively assigns the units to the classes corresponding to the closest local model, according to a measure that stems from the DModY distance used in PLS regression and in PLS typological regression [21, 22].

In the original formulation [12], PLS-TPM starts by estimating the global model over the whole set of observed units. After having randomly assigned the units to a previously chosen number G of classes, the starting local models are estimated, and the distances between each unit and each local model are computed. Units are then re-assigned to the closest local model on the basis of a distance measure expressly developed. If this leads to modifications in the composition of the classes, new local models are estimated, and new distances computed, and so on until the classes’ compositions become stable. The final local models are then compared and classes are eventually characterized by means of available external variables.
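The alternation between local model estimation and unit re-assignment described above can be sketched as follows. This is only a schematic illustration under strong simplifying assumptions: the local model is a stand-in (a one-predictor least-squares line, not a PLS path model), the ‘distance’ is a stand-in (the squared residual of the unit, not the PLS-TPM distance), and the names `fit_line` and `response_based_partition` are hypothetical.

```python
import random


def fit_line(points):
    """Least-squares intercept and slope for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx if sxx > 0 else 0.0
    return mean_y - slope * mean_x, slope


def response_based_partition(points, n_classes, seed=0, max_iter=100):
    """Alternate local-model estimation and re-assignment until stable."""
    rng = random.Random(seed)
    labels = [rng.randrange(n_classes) for _ in points]  # random start
    models = {}
    for _ in range(max_iter):
        # Estimate one local model per class (stand-in for a local PLS-PM).
        models = {}
        for g in range(n_classes):
            members = [p for p, lab in zip(points, labels) if lab == g]
            if len(members) >= 2:  # skip classes too small to be fitted
                models[g] = fit_line(members)
        # Re-assign each unit to the closest local model.
        new_labels = []
        for x, y in points:
            dist = {g: (y - a - b * x) ** 2 for g, (a, b) in models.items()}
            new_labels.append(min(dist, key=dist.get))
        if new_labels == labels:  # stable class composition: stop
            break
        labels = new_labels
    return labels, models
```

As with the procedure it mimics, the result of such a loop depends on the random start, and stability of the partition is used as the stopping rule rather than a proven convergence criterion.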

The main problem with this procedure is the choice of the number of classes. As in FIMIX-PLS, the appropriate number of classes is generally a priori unknown, and PLS-TPM has to be repeated with different values of G in order to choose the optimal partition. Unlike FIMIX-PLS, however, no indicator of class separation is available in PLS-TPM since it does not lead to a ‘fuzzy’ classification but rather to a ‘hard’ one. In a more recent version of the algorithm [18], searching for a solution to this problem, the number of classes and the initial unit assignment to classes are obtained by means of a hierarchical cluster analysis over the global model redundancy residuals for the endogenous latent variable defined as a target by the user. In PLS-PM, redundancy residuals are defined as the residuals computed when predicting the manifest variables of an endogenous latent variable from its adjacent latent variables. Once the number of classes is identified, units are iteratively assigned to the class corresponding to the closest local model, according to the distance measure. Stability of results in terms of classes’ compositions is considered as a stopping criterion.

PLS-TPM leads to a clustering of the units based on the specified path model. Units are assigned to the closest local model, where the ‘distance’ of the unit from the model takes into account both the structural and the measurement model. The latter model plays a role only for the target endogenous latent variable. As the ‘distance’ is computed from the target latent variable space, local models lead to a higher predictability in terms of $R^2$ value associated with the target latent variable only. Finally, models concerning the classes are re-estimated at each step, leading to an evolving re-estimation of all elements in the models (path coefficients, latent variable scores, outer weights, etc.).

The main advantage of this methodology is that it is distribution-free, coherently with PLS principles. It is however based on the assumption that unit heterogeneity is embedded in the structural model and in the outer model only of the ‘target’ variable. Although well adapted to particular domains (namely, the measurement of customer satisfaction or loyalty), some problems may arise in empirical applications whenever the model does not comprise a well-identified target variable, or whenever heterogeneity can be supposed to affect also the part of the measurement model related to exogenous blocks. In the particular case where classes show identical structural models but different exogenous measurement models, PLS-TPM would not be able to detect the classes.

So far, PLS-TPM has been implemented on models with reflective blocks as they are most frequent in customer satisfaction surveys. However, the ‘distance’ is not adapted to formative target blocks. Finally, there is no formal proof of convergence for the PLS-TPM algorithm, although it has always been attained in practice.

3. REBUS-PLS

In order to overcome some of the main limits mentioned above, a new method for capturing unobserved heterogeneity in PLS-PM is proposed in this paper: REBUS-PLS.

The aim of REBUS-PLS is to detect sources of heterogeneity in both the structural and the outer model for all exogenous and endogenous latent variables. This is attained through the definition of a new ‘distance’. As in PLS-TPM, this new ‘distance’ has, so far, only been implemented on models with reflective blocks. Although this is not to be considered a strict limitation for many applications, it must be pointed out that REBUS-PLS requires all blocks to be reflective. Finally, this technique, coherently with PLS-PM features, will require no distributional hypotheses.

In REBUS-PLS the ‘distance’ of a unit from a model is defined by taking into account the model performance in terms of the residuals related to the structural and the measurement model for all available latent variables.

The measure of this ‘distance’ is defined as a sum of squared residuals and therefore referred to as a ‘closeness’ measure (CM).

This CM is defined according to the structure of the GOF index [23], the only available measure of global fit for a PLS path model. The GOF is obtained as the geometric mean of the average communality and the average $R^2$ value so as to assess the overall performance on both the structural and the measurement model. The GOF is defined as follows:

$$
\mathrm{GOF}=\sqrt{\frac{\sum_{j=1}^{J}\sum_{q=1}^{p_j}\operatorname{Cor}^2(x_{qj},\hat{\xi}_j)}{\sum_{j=1}^{J}p_j}\times\frac{\sum_{j^*=1}^{J^*}R^2(\hat{\xi}_{j^*},\{\hat{\xi}_j\text{'s explaining }\hat{\xi}_{j^*}\})}{J^*}} \tag{1}
$$

where $J$ is the number of latent variables in the model, i.e. the number of blocks; $J^*$ is the number of endogenous latent variables in the model and $j^*$ indicates a generic endogenous block; $\operatorname{Cor}(x_{qj},\hat{\xi}_j)$ is the correlation between the $q$th manifest variable of the $j$th block and the corresponding latent variable scores; $R^2(\hat{\xi}_{j^*},\{\hat{\xi}_j\text{'s explaining }\hat{\xi}_{j^*}\})$ is the $R^2$ value of the regression linking the $j^*$th endogenous latent variable to its explanatory latent variables.

The left term of the product in (1) can be considered as an index measuring the predictive performance of the measurement models: the communality index. It is obtained as the mean of the squared correlations linking each manifest variable ($x_{qj}$) to the corresponding latent variable ($\hat{\xi}_j$) over all blocks. The term on the right side of the product, the average $R^2$, is instead an index measuring the predictive performance of the structural model.
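As a small numerical illustration of Equation (1), the sketch below computes the GOF from squared correlations and $R^2$ values that are assumed to have been obtained already from a PLS-PM estimation. The function name `gof_index` and the input layout (one list of squared correlations per block) are hypothetical choices made for this example.

```python
import math


def gof_index(squared_correlations_by_block, r2_endogenous):
    """GOF = sqrt(average communality * average R^2), following Equation (1).

    squared_correlations_by_block : one list per block, holding the squared
        correlation of each manifest variable with its latent variable score
    r2_endogenous : R^2 of each endogenous latent variable
    """
    all_sq_cor = [c for block in squared_correlations_by_block for c in block]
    # Dividing by the total number of manifest variables (sum of the p_j).
    avg_communality = sum(all_sq_cor) / len(all_sq_cor)
    # Dividing by the number of endogenous latent variables (J*).
    avg_r2 = sum(r2_endogenous) / len(r2_endogenous)
    return math.sqrt(avg_communality * avg_r2)
```

For instance, with two blocks of two manifest variables each, squared correlations 0.64, 0.81, 0.49 and 0.36, and two endogenous latent variables with $R^2$ values 0.5 and 0.3, the average communality is 0.575, the average $R^2$ is 0.4, and the GOF is the square root of 0.23.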

Following the GOF structure, the chosen CM is based on the residuals of the communality model (i.e. regressions of the manifest variables over their respective latent variables) and of the structural model (regressions of the endogenous latent variables over their respective explanatory latent variables). The CM is defined as follows:

$$
\mathrm{CM}_{ig}=\sqrt{\frac{\sum_{j=1}^{J}\sum_{q=1}^{p_j}\left[e_{iqjg}^2/\operatorname{Com}(\hat{\xi}_{jg},x_{qj})\right]}{\dfrac{\sum_{i=1}^{N}\sum_{j=1}^{J}\sum_{q=1}^{p_j}\left[e_{iqjg}^2/\operatorname{Com}(\hat{\xi}_{jg},x_{qj})\right]}{(n_g-m_g-1)}}\times\frac{\sum_{j^*=1}^{J^*}\left[f_{ij^*g}^2/R^2(\hat{\xi}_{j^*},\{\hat{\xi}_j\text{'s explaining }\hat{\xi}_{j^*}\})\right]}{\dfrac{\sum_{i=1}^{N}\sum_{j^*=1}^{J^*}\left[f_{ij^*g}^2/R^2(\hat{\xi}_{j^*},\{\hat{\xi}_j\text{'s explaining }\hat{\xi}_{j^*}\})\right]}{(n_g-m_g-1)}}} \tag{2}
$$

where $\operatorname{Com}(\hat{\xi}_{jg},x_{qj})$ is the communality index for the $q$th variable of the $j$th block in the $g$th latent class; $e_{iqjg}$ is the measurement model residual for the $i$th unit in the $g$th latent class, corresponding to the $q$th manifest variable in the $j$th block, i.e. the communality residual; $f_{ij^*g}$ is the structural model residual for the $i$th unit in the $g$th latent class, corresponding to the $j^*$th endogenous block; $n_g$ is the number of units belonging to the $g$th latent class; $m_g$ is the number of dimensions. Since all blocks are supposed to be reflective, there will always be one dimension (latent variable) per block.
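To make the structure of Equation (2) concrete, the following sketch evaluates the CM of every unit with respect to one local model. It assumes that the residuals, communalities and $R^2$ values for class $g$ have already been computed elsewhere; the function name `closeness_measures` and the argument layout are hypothetical, and $m_g$ is simply passed in by the caller.

```python
import math


def closeness_measures(meas_resid, communalities, struct_resid, r2_values,
                       n_g, m_g):
    """Return CM_ig for every unit i with respect to the local model of class g.

    meas_resid[i][k]   : communality residual of unit i on manifest variable k
    communalities[k]   : communality of manifest variable k in class g
    struct_resid[i][s] : structural residual of unit i on endogenous block s
    r2_values[s]       : R^2 of endogenous block s in class g
    n_g, m_g           : class size and number of dimensions, as in Equation (2)
    """
    dof = n_g - m_g - 1
    # Per-unit numerators of the two factors of Equation (2).
    meas = [sum(e * e / c for e, c in zip(row, communalities))
            for row in meas_resid]
    struct = [sum(f * f / r for f, r in zip(row, r2_values))
              for row in struct_resid]
    # Denominators: totals over all N units, divided by (n_g - m_g - 1).
    meas_scale = sum(meas) / dof
    struct_scale = sum(struct) / dof
    return [math.sqrt((m / meas_scale) * (s / struct_scale))
            for m, s in zip(meas, struct)]
```

In a response-based assignment step, such values would be computed once per class, and each unit would then be compared across classes by its CM.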

The measurement residual, or communality residual, of the $i$th unit belonging to the $g$th latent class, is obtained as shown in Equation (3), where $x_{iqj}$ is the observed value of the $q$th manifest variable in the $j$th block for the $i$th unit and $\hat{x}_{iqjg}$ is the corresponding estimated value of $x_{iqj}$ (obtained by regression of $x_{qj}$ on the $j$th latent variable computed for the $g$th latent class, i.e. $\hat{\xi}_{jg}$):

$$
e_{iqjg}=x_{iqj}-\hat{x}_{iqjg} \tag{3}
$$

Hence, for the $i$th unit, $\hat{x}_{iqjg}=\lambda_{qjg}\hat{\xi}_{ijg}$, with $\lambda_{qjg}$ representing the loading associated with the $q$th variable of the $j$th block in the $g$th latent class, and $\hat{\xi}_{ijg}$ is the score of the $j$th latent variable for the $i$th unit yielded by using the external weight estimated for the $g$th latent class:

$$
\hat{\xi}_{ijg}=\sum_{q=1}^{p_j}w_{qjg}\,x_{iqj} \tag{4}
$$

where $w_{qjg}$ is the external (or outer) weight linking the $q$th manifest variable of the $j$th block to the corresponding latent variable in the $g$th latent class. The external weight $w_{qjg}$ is obtained by performing a PLS-PM on the units belonging to the $g$th latent class. In other words, the communality residuals are the residuals of the simple regressions of each manifest variable on the corresponding latent variable.
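Equations (3) and (4) can be illustrated for a single block with the short sketch below. It assumes that the outer weights and loadings of the $g$th local model are already available from a PLS-PM run on that class; the function name `communality_residuals` and the data layout are hypothetical.

```python
def communality_residuals(x_block, outer_weights, loadings):
    """Communality residuals of all units for one block under class-g parameters.

    x_block[i][q]    : value of manifest variable q of the block for unit i
    outer_weights[q] : class-g outer weight of manifest variable q
    loadings[q]      : class-g loading of manifest variable q
    """
    residuals = []
    for row in x_block:
        # Equation (4): latent variable score as weighted sum of the indicators.
        score = sum(w * x for w, x in zip(outer_weights, row))
        # Equation (3): observed value minus loading times score.
        residuals.append([x - lam * score for x, lam in zip(row, loadings)])
    return residuals
```

Note that the unit does not need to belong to class $g$: the same observed values can be passed through the weights and loadings of every local model, which is what allows a unit to be compared with each class.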

The structural residuals are the residuals of the multiple regressions of the endogenous latent variables on their explanatory latent variables:

$$
f_{ij^*g}=\hat{\xi}_{ij^*g}-\hat{y}_{ij^*g} \tag{5}
$$

where $\hat{y}_{ij^*g}=\sum_{j=1}^{J\text{'s on }j^*}\beta_{jj^*g}\,\hat{\xi}_{ijg}$ and $\beta_{jj^*g}$ is the path coefficient linking the $j$th explanatory latent variable to the $j^*$th endogenous latent variable in the $g$th latent class.

If two models show identical path coefficients, but differ with respect to one or more outer weights in the exogenous blocks, REBUS-PLS is now able to identify this other source of heterogeneity, which might be of major importance in practical applications. Moreover, since the CM is defined according to the structure of the GOF index, REBUS-PLS should naturally lead to local models with better overall predictive performance.

Figure 1 displays the steps of the REBUS-PLS iterative algorithm.

Figure 1. The REBUS-PLS algorithm.

As in PLS-TPM, the stability of the classes’ composition from one step to the other is considered as evidence of convergence although, so far, no formal proof of convergence is available.

The identification of heterogeneous classes uniquely based on the model might not provide information that is relevant for the decision-making process, in the sense that the classes might not be referred to real interpretable segments. If external variables (e.g. socio-demographic descriptors) are available, once the final local models are obtained, they should be used to describe and possibly characterize the identified classes.

4. EMPIRICAL APPLICATION

In 1993, The Economist published a study with the aim of ranking countries according to the quality of life assessed by western journalists. In the article Where to live: Nirvana by Numbers [13], which appeared in The Economist, the unknown authors presented a semi-serious study to identify the best country where a western journalist could live out of 22 possible countries. Four different dimensions of life quality were taken into account by the authors to rank the 22 countries through 31 items: the economic dimension, the social dimension, the cultural dimension and the political dimension. The ranking procedure used was not described in detail in the article. However, five different indexes were used to provide the countries’ ranking: one index for each of the specific aspects and a global index. In what follows, according to other analyses previously performed on the same data [24, 25], only 26 out of the 31 original items and 3 synthetic indexes (summarizing the economic, social and cultural tables) are used to perform the PLS-PM analysis by referring to the path model shown in Figure 2. In the specified model, the economic dimension influences all other dimensions, whereas the political dimension is influenced by all others. Moreover, a global index is also used in the analysis. A previous study [25] also showed that 3 out of the 22 countries investigated by The Economist could be considered as outliers according to results of a multiple factorial analysis: Russia, Brazil and India. Taking into account the classification purpose of REBUS-PLS, we have decided to omit these three countries from the analysis.

Figure 2. The PLS-PM structural model.

To summarize, the data set is made up of 29 manifest variables observed over 19 countries and grouped into 5 blocks (corresponding to latent variables) as shown in Table I.
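For readers who wish to set up the same model in software, the block structure of Table I and the structural paths can be written down compactly. The following Python sketch is illustrative only: the names `blocks` and `paths` are our own, and the explanatory latent variables of each structural equation are an assumption read off the rows of Table III rather than a reproduction of Figure 2.

```python
# Manifest-variable codes per block, transcribed from Table I.
blocks = {
    "Economic": ["GDP", "GROW", "INFL", "UNEM", "TAX", "TEL", "CARS", "CO2"],
    "Social": ["SCHO", "LIFE", "INFA", "DOC", "DENS", "POP", "CRIM", "DIVO"],
    "Cultural": ["TV", "CINE", "NEWS", "MAC", "ALCO", "TEMP"],
    "Political": ["PUBL", "WOMP", "PFI", "CLI"],
    "Global": ["I ECO", "I SOC", "I CUL"],
}

# Inner model: endogenous LV -> explanatory LVs.
# Assumption: the predecessor sets follow the rows of Table III.
paths = {
    "Social": ["Economic"],
    "Cultural": ["Economic"],
    "Political": ["Economic", "Social", "Cultural"],
    "Global": ["Economic", "Social", "Cultural", "Political"],
}

n_mvs = sum(len(mvs) for mvs in blocks.values())
n_paths = sum(len(sources) for sources in paths.values())
exogenous = [lv for lv in blocks if lv not in paths]

print(n_mvs, len(blocks), n_paths, exogenous)  # prints: 29 5 9 ['Economic']
```

The counts agree with the 29 manifest variables and 5 blocks stated above; under the assumed paths, the economic latent variable is the only exogenous one.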

Table I. Composition of the blocks.

Economic indicators (LV1)
  GDP     Gross domestic product per head in 1991 $
  GROW    PACH annual average growth % in 1983–1992
  INFL    Inflation annual average % in 1983–1992
  UNEM    Unemployment % of labour force in 1992
  TAX     Total tax as percentage of GDP in 1991
  TEL     Telephone lines per 100 people in 1990
  CARS    Cars per 1000 people in 1990
  CO2     Pollution CO2 emissions tons per head in 1989

Social indicators (LV2)
  SCHO    Sec. school enrolment rate % 12–17 yrs in 1990
  LIFE    Life expectancy at birth years in 1991
  INFA    Infant mortality per 1000 live births in 1991
  DOC     Doctors per 100 000 people in 1990
  DENS    Population density people/1000 hectares in 1991
  POP     Population of largest city in 1991 millions
  CRIM    Murders per 100 000 men in 1990
  DIVO    Divorces as % of marriages in 1990

Cultural indicators (LV3)
  TV      TV sets per 1000 people in 1990
  CINE    Cinemas per million people in 1991
  NEWS    Daily newspapers per 1000 people in 1988–1990
  MAC     McDonald's rest. per million people in 1993
  ALCO    Alcohol consumption litres per head in 1991
  TEMP    Average temperature in Celsius

Political indicators (LV4)
  PUBL    Public sector as % of employment in 1991
  WOMP    Women MPs as % of all MPs in 1991
  PFI     Freedom house political freedom index in 1992
  CLI     Freedom house civil liberties index in 1992

Global indicators (LV5)
  I ECO   Global index for economic aspects
  I SOC   Global index for social aspects
  I CUL   Global index for cultural aspects

REBUS-PLS analyses are performed through a SAS-IML [26] macro developed by Trinchera [27]. The new approach leads to the identification of two latent classes: the first comprises nine countries (U.S.A., Australia, Great Britain, Bahamas, South Korea, Mexico, Japan, Canada and Hong Kong), while the second comprises ten countries (New Zealand, Israel, Spain, Hungary, Switzerland, China, Germany, France, Sweden and Italy). For both classes, as well as for the global model, the strong correlations between latent variables (Tables II(a–c)) lead to unstable OLS regression estimates for the path coefficients, which are mostly non-significant and have incoherent signs (Table III). Therefore, no assessment of the model quality is provided for these estimates. In order to overcome this problem, after estimating the latent variable scores for the two classes, we estimated the path coefficients by running PLS regressions in the XLSTAT 2008 software [28]. The results for the structural local models are shown in Table IV and in Figure 3. For each PLS regression run in the global model, as well as in the local model for group 1, only one PLS component is retained. For the local model for group 2, instead, two PLS components are retained for both structural equations. In the global model the economic, social and cultural indexes have similar impacts on the political index. However, the local models show that:

• In group 2 the social index has a stronger impact on the political index than the other explanatory latent variables;

• In group 2 the cultural index has a non-significant impact on the political index;

• In group 2 the social index shows a higher impact on the global index than in group 1 or in the global model.

We can conclude that countries belonging to group 2 are more driven by social aspects than countries belonging to group 1. In fact, for the former, the social aspects are more important than the economic and cultural ones in forming both the global index and the political one.

The results for the measurement models are shown in Table V and in Figure 4, where differences arise between the global model and the two local ones. With respect to the economic latent variable, the following major differences can be detected:

• In group 1, the manifest variable GROW (PACH annual average growth) is clearly less correlated with the latent variable than in the local model for group 2;

• The manifest variable TAX (total tax as percentage of GDP) is more strongly correlated with the latent variable in the local model for group 1 than in the global model, while it is negatively correlated with the latent variable in group 2;

• In group 2, the manifest variable UNEM (unemployment % of labour force) is weakly correlated with the economic index.

Table II. Correlations between latent variables for: (a) global model; (b) group 1; and (c) group 2.

(a) Global model
             Social   Economic   Cultural   Political   Global
  Social     1
  Economic   0.878    1
  Cultural   0.821    0.918      1
  Political  0.811    0.863      0.785      1
  Global     0.839    0.826      0.755      0.765       1

(b) Group 1
             Social   Economic   Cultural   Political   Global
  Social     1
  Economic   0.922    1
  Cultural   0.779    0.942      1
  Political  0.708    0.880      0.857      1
  Global     0.806    0.898      0.798      0.833       1

(c) Group 2
             Social   Economic   Cultural   Political   Global
  Social     1
  Economic   0.941    1
  Cultural   0.807    0.852      1
  Political  0.979    0.919      0.830      1
  Global     0.878    0.835      0.632      0.815       1

Table III. Path coefficients estimated by OLS regressions for the global and the local models yielded by REBUS-PLS.

  Endogenous LV   Explanatory LV   Global      Group 1     Group 2
  Pol. index      R²               0.758       0.868       0.966
                  Eco. index       0.717*      2.445*      0.146
                  Soc. index       0.237       −1.069      −0.995
                  Cul. index       −0.068      −0.614      −0.152
  Glob. index     R²               0.741       0.900       0.826
                  Eco. index       0.369       −3.203      0.190
                  Soc. index       0.477       1.036       1.747
                  Cul. index       −0.057      1.316       −0.163
                  Pol. index       0.104       0.125       0.936
  Soc. index      R²               0.771       0.851       0.886
                  Eco. index       0.878***    0.922***    0.941***
  Cul. index      R²               0.842       0.888       0.726
                  Eco. index       0.918***    0.942***    0.852***

  *P-value less than 0.1; **P-value less than 0.05; ***P-value less than 0.01.
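The instability of the OLS path estimates can be illustrated directly from the correlations among the latent variable scores. The sketch below assumes standardized scores, so that the OLS coefficients of the political index on the economic, social and cultural indexes solve R b = r, with R and r taken from Table II(a). The helper `solve` and the use of variance inflation factors (the diagonal of the inverse of R) as a collinearity diagnostic are our additions for illustration, not part of the original analysis.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(M[i][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for i in range(n):
            if i != col:
                factor = M[i][col] / M[col][col]
                M[i] = [a - factor * c for a, c in zip(M[i], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

# Global model, Table II(a). Predictor order: Economic, Social, Cultural.
R = [[1.000, 0.878, 0.918],
     [0.878, 1.000, 0.821],
     [0.918, 0.821, 1.000]]
r = [0.863, 0.811, 0.785]  # correlations of each predictor with the political index

beta = solve(R, r)                            # standardized OLS coefficients
r2 = sum(b * c for b, c in zip(beta, r))      # R² = b'r for standardized variables
vif = [solve(R, [1.0 if i == j else 0.0 for i in range(3)])[j] for j in range(3)]

print([round(b, 2) for b in beta], round(r2, 2))
print([round(v, 1) for v in vif])
```

Up to the rounding of the published correlations, the coefficients agree with the Global column of Table III (roughly 0.72, 0.24 and −0.07), and the variance inflation factor of the economic index exceeds 9, which is consistent with the instability noted in the text.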


Table IV. Path coefficients estimated by PLS regressions for the global and the local models yielded by REBUS-PLS.

  Endogenous LV   Explanatory LV   Global   Group 1   Group 2
  Pol. index      No. PLS comp.    1        1         2
                  R²               0.736    0.731     0.950
                  Eco. index       0.315    0.320     0.261
                  Soc. index       0.296    0.258     0.714
                  Cul. index       0.286    0.312     0.014
  Glob. index     No. PLS comp.    1        1         2
                  R²               0.719    0.782     0.778
                  Eco. index       0.234    0.253     0.371
                  Soc. index       0.237    0.227     0.549
                  Cul. index       0.214    0.223     0.342
                  Pol. index       0.216    0.235     0.248
  Soc. index      R²               0.771    0.851     0.886
                  Eco. index       0.878    0.922     0.941
  Cul. index      R²               0.842    0.888     0.726
                  Eco. index       0.918    0.942     0.852
  GOF                              0.567    0.637     0.602
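The one-component PLS estimates can be recovered from the correlations in Table II. For a single response and one retained component on standardized variables, the PLS weight vector is proportional to r (the correlations between predictors and response), which gives the closed form b = r (r'r)/(r'Rr) and R² = (r'r)²/(r'Rr), where R is the predictor correlation matrix. The sketch below assumes this standard PLS1 construction; whether XLSTAT follows exactly these conventions is not stated here, and the function name is hypothetical.

```python
def pls1_one_component(R, r):
    """Standardized coefficients and R² of a one-component PLS regression.

    R: predictor correlation matrix (list of lists)
    r: correlations between each predictor and the response
    """
    p = len(r)
    rr = sum(v * v for v in r)                                             # r'r
    rRr = sum(r[i] * R[i][j] * r[j] for i in range(p) for j in range(p))   # r'Rr
    scale = rr / rRr
    beta = [scale * v for v in r]   # b = r (r'r) / (r'Rr)
    r2 = scale * rr                 # R² = (r'r)² / (r'Rr)
    return beta, r2

# Political index regressed on (Economic, Social, Cultural).
# Global model, Table II(a).
R_global = [[1.000, 0.878, 0.918],
            [0.878, 1.000, 0.821],
            [0.918, 0.821, 1.000]]
r_global = [0.863, 0.811, 0.785]

# REBUS-PLS group 1, Table II(b).
R_group1 = [[1.000, 0.922, 0.942],
            [0.922, 1.000, 0.779],
            [0.942, 0.779, 1.000]]
r_group1 = [0.880, 0.708, 0.857]

beta_global, r2_global = pls1_one_component(R_global, r_global)
beta_group1, r2_group1 = pls1_one_component(R_group1, r_group1)

print([round(b, 3) for b in beta_global], round(r2_global, 3))
print([round(b, 3) for b in beta_group1], round(r2_group1, 3))
```

Under these assumptions the output agrees, to within 0.001, with the Global and Group 1 columns of Table IV for the political index. Group 2 retains two components and is therefore not covered by this closed form.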

Figure 3. Path coefficients compared among REBUS local models and the global model.

Differences arise also for the social index as follows:

• The manifest variables DOC (doctors per 100 000 people), CRIM (murders per 100 000 men) and DIVO (divorces as % of marriages) are more strongly correlated with the social latent variable in group 1 than in group 2;

• Group 2 is, instead, characterized by a positive correlation between the manifest variable DENS (population density) and the related latent variable.


Table V. Standardized loadings for the global and the local models yielded by REBUS-PLS.

  LV            MVs     Global   Group 1   Group 2
  Eco. index    GDP     0.835    0.846     0.919
                GROW    −0.587   −0.334    −0.730
                INFL    −0.409   −0.604    −0.229
                UNEM    0.356    0.485     0.182
                TAX     0.307    0.739     −0.402
                TEL     0.816    0.861     0.840
                CARS    0.953    0.935     0.949
                CO2     0.700    0.898     0.789
  Soc. index    SCHO    0.873    0.845     0.919
                LIFE    0.850    0.825     0.866
                INFA    −0.964   −0.934    −0.993
                DOC     0.587    0.827     0.501
                DENS    −0.196   −0.694    0.228
                POP     −0.433   −0.377    −0.407
                CRIM    −0.614   −0.750    0.122
                DIVO    0.653    0.808     0.592
  Cul. index    TV      0.904    0.934     0.924
                CINE    0.645    0.688     0.654
                NEWS    0.328    −0.123    0.792
                MAC     0.502    0.905     0.495
                ALCOL   0.698    0.804     0.637
                TEMP    −0.648   −0.466    −0.841
  Pol. index    PUBL    0.470    0.908     0.075
                WOMP    0.241    0.437     0.016
                PFI     −0.914   −0.833    −0.990
                CLI     −0.924   −0.962    −0.991
  Glob. index   I ECO   −0.065   −0.526    0.616
                I SOC   0.806    0.662     0.841
                I CUL   0.666    0.824     0.609

Groups 1 and 2 also differ with respect to the correlations between the cultural index and its own manifest variables, namely NEWS (daily newspapers per 1000 people), MAC (number of McDonald's restaurants per million people) and TEMP (average temperature in Celsius).

Finally, with respect to the political index, group 1 is characterized by a higher correlation between PUBL (public sector as % of employment) and the latent variable, while for group 2 both PUBL and WOMP (women MPs as % of all MPs in 1991) are practically uncorrelated with the political index.
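The comparison of the two measurement models can be screened mechanically. The short Python sketch below transcribes the Group 1 and Group 2 columns of Table V and lists the manifest variables whose loading changes sign between the two local models, together with the five largest absolute gaps. It is only a descriptive aid with variable names of our choosing, not a test of significance.

```python
# Standardized loadings (Group 1, Group 2) transcribed from Table V.
loadings = {
    "GDP": (0.846, 0.919), "GROW": (-0.334, -0.730), "INFL": (-0.604, -0.229),
    "UNEM": (0.485, 0.182), "TAX": (0.739, -0.402), "TEL": (0.861, 0.840),
    "CARS": (0.935, 0.949), "CO2": (0.898, 0.789),
    "SCHO": (0.845, 0.919), "LIFE": (0.825, 0.866), "INFA": (-0.934, -0.993),
    "DOC": (0.827, 0.501), "DENS": (-0.694, 0.228), "POP": (-0.377, -0.407),
    "CRIM": (-0.750, 0.122), "DIVO": (0.808, 0.592),
    "TV": (0.934, 0.924), "CINE": (0.688, 0.654), "NEWS": (-0.123, 0.792),
    "MAC": (0.905, 0.495), "ALCOL": (0.804, 0.637), "TEMP": (-0.466, -0.841),
    "PUBL": (0.908, 0.075), "WOMP": (0.437, 0.016), "PFI": (-0.833, -0.990),
    "CLI": (-0.962, -0.991),
    "I ECO": (-0.526, 0.616), "I SOC": (0.662, 0.841), "I CUL": (0.824, 0.609),
}

# Manifest variables whose loading has opposite signs in the two groups.
sign_flips = [mv for mv, (g1, g2) in loadings.items() if g1 * g2 < 0]

# Five largest absolute differences between the two local models.
largest_gaps = sorted(
    loadings, key=lambda mv: abs(loadings[mv][0] - loadings[mv][1]), reverse=True
)[:5]

print(sign_flips)    # prints: ['TAX', 'DENS', 'CRIM', 'NEWS', 'I ECO']
print(largest_gaps)
```

The sign changes involve TAX, DENS, CRIM, NEWS and the global economic index, which are also the five variables with the largest gaps; PUBL, discussed above, comes next in size.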

Figure 4. Measurement model compared among REBUS local models and the global model.

As expected, both local models show higher values for GOF and R² than the global model. Hence, in this application REBUS-PLS has led not only to the identification of latent classes whose local models differ in both the measurement and the structural models, but also to local models showing a better performance (higher values for the GOF index).

In order to understand whether a group structure obtained independently of the path model might still provide well-performing PLS-PM results, the REBUS results are compared with local models computed on groups identified by traditional clustering techniques. Namely, REBUS results are compared with the local models estimated on groups obtained by performing a cluster analysis first on all the manifest variables, and then on the latent variable scores of the global model. Both ascendant hierarchical cluster analyses are performed using the Ward criterion in XLSTAT 2008 [28]. The dendrogram from the cluster analysis performed on the manifest variables is shown in Figure 5. Even this non-response-based technique leads to the detection of two groups of countries: a first group containing 13 countries (U.S.A., Australia, Great Britain, Canada, New Zealand, Israel, Spain, Hungary, Switzerland, Germany, France, Sweden and Italy) and a second group containing 6 countries (Japan, Hong Kong, Bahamas, South Korea, Mexico and China).
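As an illustration of the Ward criterion, the sketch below runs an agglomerative clustering that, at each step, merges the two clusters whose fusion yields the smallest increase in within-cluster sum of squares, n_a n_b / (n_a + n_b) times the squared distance between the two centroids. It is a naive pure-Python version applied to made-up two-dimensional points. It is not the XLSTAT implementation, the data are hypothetical rather than the country data, and the standardization of variables is omitted.

```python
def ward_clusters(points, k):
    """Naive agglomerative clustering with the Ward merging cost."""
    clusters = [[i] for i in range(len(points))]  # start from singletons

    def centroid(members):
        dim = len(points[0])
        return [sum(points[i][d] for i in members) / len(members) for d in range(dim)]

    def merge_cost(a, b):
        # Increase in within-cluster sum of squares if a and b are merged.
        ca, cb = centroid(a), centroid(b)
        dist2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return len(a) * len(b) / (len(a) + len(b)) * dist2

    while len(clusters) > k:
        _, i, j = min(
            (merge_cost(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters

# Hypothetical data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
clusters = ward_clusters(points, k=2)
print(sorted(sorted(c) for c in clusters))  # prints: [[0, 1, 2], [3, 4, 5]]
```

Note that the criterion depends only on the positions of the units in the variable space, not on any relationship between variables, which is why such a partition need not improve the fit of the local path models.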

OLS regression estimates for path coefficients (Table VI) are unstable because of strong correlations between latent variables. PLS estimates of the path coefficients are therefore obtained through the XLSTAT 2008 software [28] by performing a PLS regression among the latent variables (Table VII). For each PLS regression run in the global model only one component is retained. For group 1, two PLS components are retained when the political index is the endogenous latent variable, while for group 2, two components are retained when the global index is the endogenous latent variable. Results for the local models estimated for the two classes yielded by cluster analysis on the manifest variables are shown in Tables VII and VIII as well as in Figures 6 and 7.

Figure 5. Hierarchical cluster analysis performed on manifest variables.

Table VI. Path coefficients estimated by OLS regressions for the global and the local models yielded by cluster analysis on manifest variables.

  Endogenous LV   Explanatory LV   Global      Group 1     Group 2
  Pol. index      R²               0.758       0.598       0.995
                  Eco. index       0.717*      0.930**     3.856***
                  Soc. index       0.237       −0.290      −0.884**
                  Cul. index       −0.068      0.098       −2.209**
  Glob. index     R²               0.741       0.409       0.887
                  Eco. index       0.369       0.709       −2.854
                  Soc. index       0.477       0.195       1.288
                  Cul. index       −0.057      0.441       2.055
                  Pol. index       0.104       0.500       0.598
  Soc. index      R²               0.771       0.708       0.921
                  Eco. index       0.878***    0.841***    0.959***
  Cul. index      R²               0.842       0.578       0.960
                  Eco. index       0.918***    0.760***    0.980***

  *P-value less than 0.1; **P-value less than 0.05; ***P-value less than 0.01.

As expected, the composition of the two groups differs from that yielded by REBUS-PLS because of the different optimization criterion. The local model performance for the first group is clearly worse (in terms of GOF values) than that of the global model (see Table VII). Nevertheless, all the fit indexes of the second group, except for the R² value associated with the political latent variable, are higher than the ones obtained by REBUS-PLS.
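The extent to which the two partitions differ can be made explicit by cross-tabulating the class memberships reported in the text. The Python sketch below uses only the country lists given above for REBUS-PLS and for the cluster analysis on manifest variables; the dictionary names are ours.

```python
# Class memberships as listed in the text.
rebus = {
    1: {"U.S.A.", "Australia", "Great Britain", "Bahamas", "South Korea",
        "Mexico", "Japan", "Canada", "Hong Kong"},
    2: {"New Zealand", "Israel", "Spain", "Hungary", "Switzerland", "China",
        "Germany", "France", "Sweden", "Italy"},
}
ward_mv = {
    1: {"U.S.A.", "Australia", "Great Britain", "Canada", "New Zealand",
        "Israel", "Spain", "Hungary", "Switzerland", "Germany", "France",
        "Sweden", "Italy"},
    2: {"Japan", "Hong Kong", "Bahamas", "South Korea", "Mexico", "China"},
}

# Number of countries shared by each (REBUS class, cluster-analysis group) pair.
crosstab = {(r, w): len(rebus[r] & ward_mv[w]) for r in rebus for w in ward_mv}

for r in rebus:
    print(f"REBUS class {r}:", [crosstab[(r, w)] for w in ward_mv])
# prints:
# REBUS class 1: [4, 5]
# REBUS class 2: [9, 1]
```

REBUS class 2 falls almost entirely into the first cluster-analysis group (China being the exception), whereas REBUS class 1 is split between the two groups.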

The dendrogram corresponding to the cluster analysis performed on the latent variable scores obtained for the global model is shown in Figure 8. Once again, two groups of countries are detected. The groups' composition is almost the same: only Japan has switched from group 2 to group 1.


Table VII. Path coefficients estimated by PLS regressions for the global and the local models yielded by cluster analysis on manifest variables.

  Endogenous LV   Explanatory LV   Global   Group 1   Group 2
  Pol. index      No. PLS comp.    1        2         1
                  R²               0.736    0.592     0.609
                  Eco. index       0.315    0.907     0.290
                  Soc. index       0.296    0.144     0.249
                  Cul. index       0.286    0.030     0.252
  Glob. index     No. PLS comp.    1        1         2
                  R²               0.719    0.267     0.882
                  Eco. index       0.234    0.168     0.269
                  Soc. index       0.237    0.159     0.446
                  Cul. index       0.214    0.181     0.387
                  Pol. index       0.216    0.061     0.206
  Soc. index      R²               0.771    0.709     0.921
                  Eco. index       0.878    0.842     0.959
  Cul. index      R²               0.842    0.578     0.960
                  Eco. index       0.918    0.760     0.980
  GOF                              0.587    0.442     0.614

However, these two classes are homogeneous with respect to the latent variable scores, but not with respect to the relationships estimated in the models.

The dendrogram in Figure 8 shows that this second cluster analysis has led to classes similar to those obtained when using manifest variables. For the sake of brevity we will not give a detailed description of the results yielded by the path models on these classes: classes with similar composition will not yield different results. Indeed, the similar results yielded by the two cluster analyses confirm that in PLS-PM the structural model plays a marginal role in estimating the latent variable scores. Consequently, the levels of manifest variables are very close to the levels of the corresponding latent variables.

5. CONCLUSIONS AND FUTURE PERSPECTIVES

REBUS-PLS has performed well in detecting latent classes within the PLS-PM framework. From the application standpoint, the obtained local models show differences (in both the structural and the measurement models) that allow a coherent interpretation of the obtained classes. The cluster analyses on both the manifest and the latent variables provide two classes with a similar composition, but only the local models obtained through REBUS-PLS show a better performance for both identified classes, in terms of R² and GOF.

The performance of REBUS-PLS still needs to be assessed more generally by means of a thorough simulation study, aimed at understanding the sensitivity of the method to differently structured data sets. REBUS-PLS performance must also be compared with that of FIMIX-PLS and PLS-TPM, methods that share a similar objective though with different optimizing criteria. The results of the simulation study will be the subject of a future paper.


Table VIII. Standardized loadings for the global and the local models yielded by clustering analysis on manifest variables.

  LV            MVs     Global   Group 1   Group 2
  Eco. index    GDP     0.835    0.920     0.969
                GROW    −0.587   0.301     −0.316
                INFL    −0.409   −0.577    −0.419
                UNEM    0.356    −0.571    0.115
                TAX     0.307    −0.628    −0.443
                TEL     0.816    0.725     0.943
                CARS    0.953    0.940     0.665
                CO2     0.700    0.685     0.861
  Soc. index    SCHO    0.873    0.587     0.924
                LIFE    0.850    0.498     0.915
                INFA    −0.964   −0.572    −0.957
                DOC     0.587    −0.551    0.698
                DENS    −0.196   −0.026    −0.467
                POP     −0.433   0.635     −0.175
                CRIM    −0.614   0.414     −0.507
                DIVO    0.653    0.747     0.911
  Cul. index    TV      0.904    0.915     0.894
                CINE    0.645    −0.092    −0.014
                NEWS    0.328    0.193     0.847
                MAC     0.502    0.954     0.800
                ALCOL   0.698    −0.307    0.573
                TEMP    −0.648   −0.312    0.280
  Pol. index    PUBL    0.470    0.540     0.796
                WOMP    0.241    −0.212    0.940
                PFI     −0.914   0.881     0.912
                CLI     −0.924   0.766     0.964
  Glob. index   I ECO   −0.065   0.208     0.955
                I SOC   0.806    −0.828    0.880
                I CUL   0.666    0.911     0.342

From the methodological standpoint, the differences among models estimated over distinct samples are, in PLS-PM, generally assessed by means of partial comparisons, i.e. coefficient by coefficient [16], through bootstrap-based or permutation tests. A relevant research perspective concerns the development of descriptive indexes and inferential tests that make it possible to verify that the local models yielded by response-based class detection techniques are well (or significantly) separated in statistical terms.
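To fix ideas on what such a partial comparison involves, the sketch below runs a permutation test on the difference between two group-specific coefficients. It is a deliberately simplified stand-in: a simple-regression slope plays the role of a path coefficient, the data are simulated (hypothetical), and the function names are ours. It should not be read as the procedure of [16], which operates on full PLS path models.

```python
import random

def slope(xs, ys):
    """OLS slope of y on x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def permutation_test(g1, g2, n_perm=999, seed=0):
    """Two-sided permutation p-value for the difference between group slopes."""
    rng = random.Random(seed)
    observed = abs(slope(*zip(*g1)) - slope(*zip(*g2)))
    pooled = g1 + g2
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random reallocation of units to the two groups
        a, b = pooled[:len(g1)], pooled[len(g1):]
        if abs(slope(*zip(*a)) - slope(*zip(*b))) >= observed:
            count += 1
    return observed, (count + 1) / (n_perm + 1)

# Simulated data: the true slope is 0.9 in group 1 and 0.1 in group 2.
rng = random.Random(1)
xs = [rng.gauss(0, 1) for _ in range(60)]
group1 = [(x, 0.9 * x + rng.gauss(0, 0.3)) for x in xs[:30]]
group2 = [(x, 0.1 * x + rng.gauss(0, 0.3)) for x in xs[30:]]

observed, p_value = permutation_test(group1, group2)
print(round(observed, 3), p_value)
```

Each permutation reallocates the units to the two groups at random, so the reference distribution reflects the hypothesis that group membership is unrelated to the coefficient. When the groups themselves result from a response-based search, this reference distribution is no longer strictly appropriate, which is one reason why dedicated validation tools are called for.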

Figure 6. Path coefficients compared among local models yielded via cluster analysis on manifest variables and the global model.

Figure 7. Measurement model compared among local models obtained via cluster analysis on manifest variables and the global model.

Figure 8. Hierarchical cluster analysis performed on latent variable scores estimated for the global model.

Moreover, real applications may often require a search for classes that are homogeneous in terms of both the model parameters and external features (defined by concomitant variables) capable of specifying the traits of the statistical units belonging to the different latent classes. For instance, in consumer science, it could be important to detect groups of consumers showing not only significantly different models, but also common features in terms of socio-demographic variables or purchase attitudes for interpretation purposes. At present, REBUS-PLS identifies homogeneous latent classes well. However, these classes are not necessarily well characterized in terms of external variables, as the latter do not play an active role in the algorithm.

Finally, when estimating model parameters in association with an unsupervised clustering approach, there is a risk of overfitting. The authors are aware that the improved performance of local models could, in all the presented examples, also be explained by this factor and are investigating ways of assessing the impact and the size of this problem.

ACKNOWLEDGEMENTS

The participation of V. Esposito Vinzi in this research was supported by CERESSEC, Research Center of the ESSEC Business School in Paris. The participation of L. Trinchera in this research was supported by the MURST grant 'Multivariate statistical models for the ex ante and the ex post analysis of regulatory impact', coordinated by C. Lauro (2006).

REFERENCES

1. Jöreskog KG. A general method for analysis of covariance structure. Biometrika 1970; 57:239–251.

2. Jöreskog KG. A general method for estimating a linear structural equation system. In Structural Equation Models in the Social Sciences, Goldberger AS, Duncan OD (eds). Academic Press: New York, 1973; 85.

3. Wold H. Estimation of principal components and related models by iterative least squares. In Multivariate Analysis, Krishnaiah PR (ed.). Academic Press: New York, 1966; 391.

4. Tenenhaus M, Esposito Vinzi V, Chatelin Y-M, Lauro C. PLS path modeling. Computational Statistics and Data Analysis 2005; 48:159–205.

5. Lohmöller J-B. Latent Variable Path Modelling with Partial Least Squares. Physica-Verlag: Heidelberg, 1989.

6. Chen F, Bollen KA, Paxton P, Curran P, Kirby J. Improper solutions in structural equation models: causes, consequences, and strategies. Sociological Methods and Research 2001; 29:468–508.

7. Jedidi K, Jagpal SH, De Sarbo WS. STEMM: a general finite mixture structural equation model. Journal of Classification 1997; 14:23–50.


8. Jedidi K, Jagpal SH, De Sarbo WS. Finite-mixture structural equation models for response-based segmentation and unobserved heterogeneity. Marketing Science (1986–1998) 1997; 16:39–60.

9. Ozcan K. Modelling consumer heterogeneity in marketing science. Working Paper, Department of Marketing, University of Michigan Business School, 1998.

10. Wedel M, Kamakura WA. Market Segmentation: Conceptual and Methodological Foundations (2nd edn). International Series in Quantitative Marketing. Kluwer Academic Publishers: Dordrecht, U.S.A., 2000.

11. Hahn C, Johnson M, Herrmann A, Huber F. Capturing customer heterogeneity using a finite mixture PLS approach. Schmalenbach Business Review 2002; 54:243–269.

12. Squillacciotti S. Prediction oriented classification in PLS path modelling. In Handbook of Partial Least Squares: Concepts, Methods and Applications, Esposito Vinzi V, Chin W, Henseler J, Wang H (eds). Springer: Heidelberg, 2008.

13. The Economist. Where to live: nirvana by numbers. The Economist 25 December 1993; 73–76.

14. Tenenhaus M, Mauger E, Guinot C. Use of ULS-SEM and PLS-SEM to measure interaction effect in a regression model relating two blocks of binary variables. In Handbook of Partial Least Squares: Concepts, Methods and Applications, Esposito Vinzi V, Chin W, Henseler J, Wang H (eds). Springer: Heidelberg, 2008.

15. Henseler J, Fassott G. Testing moderating effects in PLS path models: an illustration of available procedures. In Handbook of Partial Least Squares: Concepts, Methods and Applications, Esposito Vinzi V, Chin W, Henseler J, Wang H (eds). Springer: Heidelberg, 2008.

16. Chin W, Dibbern J. A permutation based procedure for multigroup PLS analysis: results of tests of differences on simulated data and a cross cultural analysis of the sourcing of information system services between Germany and the U.S.A. In Handbook of Partial Least Squares: Concepts, Methods and Applications, Esposito Vinzi V, Chin W, Henseler J, Wang H (eds). Springer: Heidelberg, 2008.

17. Ringle CM, Wende S, Will A. Finite mixture partial least squares analysis: methodology and numerical examples. In Handbook of Partial Least Squares: Concepts, Methods and Applications, Esposito Vinzi V, Chin W, Henseler J, Wang H (eds). Springer: Heidelberg, 2008.

18. Trinchera L, Squillacciotti S, Esposito Vinzi V. PLS typological path modeling: a model-based approach to classification. In Proceedings of KNEMO 2006, Esposito Vinzi V, Lauro C, Braverman A, Kiers HAL, Schimek MG (eds), 2006; 87. ISBN 88-89744-00-6.

19. Sanchez G, Aluja T. Pathmox: a PLS-PM segmentation algorithm. In Proceedings of KNEMO 2006, Esposito Vinzi V, Lauro C, Braverman A, Kiers HAL, Schimek MG (eds), 2006; 69. ISBN 88-89744-00-6.

20. McLachlan GJ, Basford KE. Mixture Models: Inference and Application to Clustering. Marcel Dekker: New York and Basel, 2004. ISBN: 0-8247-7691-7.

21. Tenenhaus M. La Régression PLS: théorie et pratique. Technip: Paris, 1998.

22. Esposito Vinzi V, Lauro C. PLS regression and classification. Proceedings of the PLS'03 International Symposium. DECISIA: France, 2003; 45–56.

23. Tenenhaus M, Amato S, Esposito Vinzi V. A global goodness-of-fit index for PLS structural equation modelling. Proceedings of the XLII SIS Scientific Meeting, Contributed Papers. CLEUP: Padova, 2004; 739–742.

24. Camiz S, Langrand C. Nirvana by numbers: a study on the study. Conference Proceedings of the International Conference on Intelligent Technologies, Assumption University, Bangkok, Thailand, 2000; 440–449.

25. Camiz S, Chatelin YM, Trinchera L. Nirvana by numbers: comparison between MFA and PLS approaches on an example. In Proceedings of PLS-05 International Symposium, Aluja T, Casanovas J, Esposito Vinzi V, Morineau A, Tenenhaus M (eds). SPAD Test&Go: Paris, France, 2005.

26. SAS-IML® 9.1. SAS Institute Inc.: Cary, North Carolina 2, U.S.A.

27. Trinchera L. Unobserved heterogeneity in structural equation models: a new approach in latent class detection in PLS path modeling. Ph.D. Thesis, Department of Mathematics and Statistics, University of Naples, Italy, 2007.

28. XLSTAT 2008. XLSTAT Addinsoft: Paris, France (www.xlstat.com).
