Model-based Reliability and Validity of Measurement Models

Using Structural Equation Modeling

Leila Karimi

Ph.D

2015

To

“The hands of my mum who showed me how the impossible

can become possible with hard work,

and the big heart of my dad who always gave me the courage

to follow my dreams”

ABSTRACT

Structural Equation Modeling (SEM) has a long and interesting history and it

continues to evolve, providing exciting research opportunities. This study considers the

roots of SEM and model-based reliability and the developments in these areas in the

context of measurement models. Looking to the future, the research provides important new

applications of model-based reliability in bifactor models using Covariance-based SEM

(CB-SEM) and reflective-formative measurement models, using Partial Least Squares SEM

(PLS-SEM). In addition, the application of Bentler’s covariate-dependent reliability to

reliability assessment and to the demonstration of common method bias is presented for the

first time.

The thesis considers three important research studies involving work ability, work

organisational assessment and a survey of social desirability, wellness, drinking habits and

emotional intelligence. These studies are used to demonstrate the above new developments

in model-based reliability. The contribution of the research and directions for future

research are discussed separately for each study and in general.

ACKNOWLEDGEMENTS

I would like to express my sincere appreciation to my supervisor and mentor,

Associate Professor Denny Meyer, for her continual support, encouragement, patience and

kindness throughout the life of this PhD. I would also like to extend my gratitude to

Professor Peter Bentler, University of California, Los Angeles (UCLA) who introduced me

to a new era of research.

Thank you also to my associate supervisors Professor Philip Taylor from Monash

University and Associate Professor Christine Critchley from Swinburne University for their

support through different stages of this project. Thanks to the Business, Work & Ageing

Centre for Research (BWA) at Swinburne University for sharing the database on WAS, and to

Dr Jodi Oakman from La Trobe University for generously sharing the paramedics data. Thanks also to Dr

James Gaskin from Brigham Young University, Professor Joerg Henseler from the

University of Twente and Professor Christian M. Ringle from the Hamburg University of

Technology (TUHH) for introducing me to the world of Partial Least Squares (PLS).

Special thanks to my lovely family and friends, particularly my brothers ‘Puya’ and

‘Hamid’ for their unconditional love and support which kept me strong during the difficult

times. I am so lucky to have them in my life. Last but not least, thanks to my caring,

supportive partner ‘Arron’ for putting up with ‘crazy me’ in the last few stressful months of

wrapping up this project.

DECLARATION

This is to declare that the examinable outcome:

contains no material which has been accepted for the award to the candidate

of any other degree or diploma, except where due reference is made in the

text of the examinable outcome;

to the best of the candidate’s knowledge contains no material previously

published or written by another person except where due reference is made

in the text of the examinable outcome.

Signature

Date

TABLE OF CONTENTS

1 INTRODUCTION TO THE THESIS

1.1 Introduction

1.2 Study Structure

1.3 Summary

2 THE HISTORY OF THE EVOLUTION OF SEM IN PSYCHOLOGY

2.1 First Trend: Exploratory Factor Analysis

2.2 Second Trend: Confirmatory Factor Analysis (CFA)

2.3 Third Trend: Factor Analysis of SEM (FASEM)

2.4 Current Developments in SEM

2.5 Conclusion

3 THE EVOLUTION OF MODEL-BASED RELIABILITY ESTIMATES

3.1 Introduction

3.2 Classical Test Theory and Coefficient Alpha

3.3 Major Problems with Using a Coefficient Alpha Reliability Analysis

3.4 Unidimensional Model-based Reliability

3.5 Recent Developments

3.6 Summary

4 THE VALIDITY OF BIFACTOR VERSUS HIGHER-ORDER MEASUREMENT MODELS

4.1 Bifactor Model of WOAQ

4.2 Summary

5 THE VALIDITY OF FORMATIVE MEASUREMENT MODELS VERSUS REFLECTIVE MODELS

5.1 Differences between Formative and Reflective Models

5.2 Applications of Formative Models

5.3 Developing a Framework for Distinguishing Reflective-Formative Models

5.4 Measurement Model Misspecification in Organisational Psychology Literature

5.5 Summary and Conclusion

6 STUDY 1: MODEL-BASED RELIABILITY, VALIDITY AND CROSS VALIDITY OF BIFACTOR MODEL FOR WOAQ

6.1 Rationale and Objectives

6.2 Method

6.3 Summary

7 STUDY 1: RESULTS

7.1 Results: Study of Nurses-Validation of Bifactor Model of WOAQ

7.2 Results: Study of Paramedics-Cross Validation of Bifactor Model WOAQ

8 STUDY 1: DISCUSSION

8.1 Discussion: Study of Nurses-Validation of Bifactor Model of WOAQ

8.2 Discussion: Study of Paramedics-Cross Validation of Bifactor Model of WOAQ

8.3 Strengths and Limitations

8.4 Summary and Conclusion

9 STUDY 2: APPLICATIONS OF COVARIATE-DEPENDENT RELIABILITY

9.1 Rationale and Objectives

9.2 Method

10 STUDY 2: RESULTS

10.1 Results of Application for Reliability Assessments – The study of WOAQ

10.2 Model Fit Evaluation

10.3 Application in Demonstrating CMB using Social Desirability

11 STUDY 2: DISCUSSION

11.1 Discussion: Application in Reliability Assessment of WOAQ

11.2 Discussion: Application in Demonstrating CMB

11.3 Strengths

11.4 Limitations and Directions for Future Research

12 STUDY 3: MODEL-BASED RELIABILITY AND VALIDITY OF REFLECTIVE-FORMATIVE MODEL OF WAS USING PLS-SEM

12.1 Rationale and Objectives

12.2 Method

13 STUDY 3: RESULTS

13.1 Results of Model Fit Evaluation

13.2 Comparison of the Misspecified Models with the Correctly Specified WAS Model

14 STUDY 3: DISCUSSION

14.1 Implications for Work Ability Assessments

14.2 Limitations and Directions for Future Research

15 SUMMARY

15.1 Study 1: Model-Based Reliability, Validity and Cross Validity of the Bifactor Model for WOAQ

15.2 Study 2: Applications of Covariate-dependent Reliability

15.3 Study 3: Model-based Reliability and Validity of Reflective-formative Model of WAS

15.4 Thesis contributions to SEM

15.5 Summary

16 APPENDICES

16.1 PUBLISHED ARTICLES

16.2 Validity and model-based reliability of the Work Organisation Assessment Questionnaire (WOAQ) among nurses

16.3 Structural Equation Modeling in Psychology: The History, Development and Current Challenges

16.4 Cross-validation of the Work Organization Assessment Questionnaire across gender: A study of Australian Health Organization

16.5 EVALUATING A HIGHER-ORDER MISSPECIFIED REFLECTIVE MODEL OF WAS USING CB-SEM

16.6 DEFINITIONS OF IMPORTANT TERMS

16.7 THE WOAQ AND ITS SUBFACTORS ITEMS

16.8 The R-WAS questionnaire

16.9 Appendix F. List of items used in construction of WAS

16.10 Ethics clearance

16.11 A List of Articles Included in the Review

17 REFERENCES

LIST OF TABLES

Table 7.1 Descriptive Statistics of the Demographic Variables

Table 7.2 Subscales and WOAQ Items

Table 7.3 Item Characteristics of WOAQ

Table 7.4 Completely Standardized Maximum Likelihood (ML) Solutions of Higher order Model and the Bifactor Model

Table 7.5 Summary of Model Fit Statistics of the CFA Models of WOAQ

Table 7.6 The Reliability Coefficients of WOAQ among Nursing Sample (n=312)

Table 7.7 Characteristics of Paramedic Participants

Table 7.8 Summary of Model Fit Statistics for the Baseline Bifactor Model of WOAQ across Gender

Table 7.9 Invariance Testing Across Gender for the Bifactor Model of WOAQ.

Table 10.1 The Descriptive Characteristics of the Main Study Constructs and Parameters (n=1255)

Table 10.2 Nursing and Paramedic Demographic Characteristics

Table 10.3 Mean Age Differences between Nursing and Paramedic

Table 10.4 Summary of Model Fit Statistics for the Baseline Bifactor Model of WOAQ across Organisations

Table 10.5 Summary of Model Fit Statistics of the Bifactor Models of WOAQ (n=1257)

Table 10.6 WOAQ Reliability Statistics for Nursing (as reported in Chapter 7) and Paramedic Organisations

Table 10.7 WOAQ Reliability Statistics across Organisations (n=1257)

Table 10.8 Invariance Testing Across Organisations for the Bifactor Model of WOAQ

Table 10.9 Summary of the Demographic Characteristics of the Participants (n=341)

Table 10.10 Comparison of the Covariate-dependent and Covariate-free Reliability Coefficients of the Scales after Including CMB

Table 10.11 Summary of Fit Indices of Comparison Models

Table 10.12 Standardised Factor Loadings for Different Models Compared to Baseline

Table 12.1 Items of the Work Ability Index

Table 13.1 Quality Criteria of the Reflective-formative WAS First-order Constructs using PLS-SEM

Table 13.2 Intercorrelation Analysis and the Square Roots of AVE of First-order Constructs of Reflective-formative PLS-SEM Model †

Table 13.3 The Standardised Mean Coefficients of the Second-order formative Constructs of Reflective-formative PLS-SEM Model (n=5000 bootstrap)

Table 13.4 Results for Third-order formative Constructs of Reflective-formative WAS (n=5000 bootstrap samples)

Table 13.5 Comparing the Standardized Path Coefficients of Misspecified and Correctly Specified WAS Models

Table 13.6 Comparing the Model-based Reliability Coefficients of a Misspecified Reflective-reflective WAS (CB-SEM) with the Correctly Specified Reflective-formative Model of WAS

Table 16.1 The Standardised Path Parameter Estimates for the Misspecified Reflective WAS Model Using the CB-SEM Procedure

Table 16.2 The Parameter Estimates for First-order Reflective Model Using the CB-SEM Procedure

Table 16.3 Intercorrelation analysis and the square roots of AVE for subfactors.

Table 16.4 Structural Model Results for Second-order Reflective Constructs of Reflective PLS-SEM Model (n=5000 bootstrap).

Table 16.5 Structural Model Results for Higher-order Reflective Constructs of Reflective PLS-SEM Model (n=5000 bootstrap samples).

Table 16.6 Structural Model Results for Second-order Reflective Constructs of Full Formative PLS-SEM Model (n=5000 bootstrap).

Table 16.7 Structural Model Results for Higher-order Reflective Constructs (n=5000 bootstrap samples).

LIST OF FIGURES

Figure 1.1. The study structures in this thesis.

Figure 2.1. A pseudo path diagram and timeline for some of the developments in SEM modeling and SEM model structures.

Figure 2.2. One of Wright's first path diagrams for genetic modeling.

Figure 3.1. A pseudo path diagram and timeline for some of the developments in historical review of the conceptualisation and estimation of model-based reliability

Figure 3.2. Demonstrating Omega reliability coefficients for WOAQ

Figure 3.3. A unidimensional construct with four indicators

Figure 3.4. A covariate-dependent construct with four indicators and two covariates

Figure 4.1. Higher-order vs. Bifactor model of WOAQ

Figure 5.1. First-order reflective model

Figure 5.2. First-order formative model

Figure 5.3. Higher-order reflective-reflective measurement model

Figure 5.4. Higher-order formative-formative measurement model

Figure 5.5. The developed framework for assessing the formative vs. reflective measurement models

Figure 6.1. A demonstration of Bifactor model of WOAQ with phantom variables for calculating omega coefficients on the right-hand side

Figure 7.1. The proposed bifactor model of WOAQ vs. higher order

Figure 9.1. Covariate-dependent reliability assessment with the bifactor model of WOAQ across the nursing and paramedic organisations.

Figure 10.1. The effects of latent factor bias (social desirability) on the reliability of the constructs.

Figure 10.2. The proposed model for evaluating CMB/CMV.

Figure 10.3. Model 1: Baseline model when all the study constructs are correlated without controlling for CMV and CMB

Figure 10.4. Model 2. Constrained equal loadings from CMV to the study indicators.

Figure 10.5. Model 3. Free loadings from CMV to the study indicators

Figure 12.1. Multidimensional work ability model.

Figure 12.2. WAI scores: Australia and Finland.

Figure 12.3. The correctly specified reflective-formative model of WAS.

Figure 12.4. The misspecified reflective-reflective model of WAS

Figure 12.5. The misspecified formative-formative model of WAS.

Figure 12.6. Step one: constructing the first-order sub-constructs of both personal and organisational capacities of reflective-formative model of WAS using PLS path modeling.

Figure 12.7. Step two: building the second-order formative constructs (organisational and personal capacities) for the reflective-formative model of WAS.

Figure 12.8. Step three: The scores of the first-order latent factors are used as the manifests of the second-order factors (i.e. organisational and personal capacities) and forming the higher-order construct (WAS).

Figure 13.1. The final model of reflective-formative WAS development using PLS path modeling.

Figure 16.1. The standardised path parameter estimates of the misspecified reflective WAS model

Figure 16.2. The reflective model of WAS using PLS-SEM

Figure 16.3. The reflective WAS development using PLS path modeling.

Figure 16.4. The model building process for full formative model of WAS using PLS-SEM.

Figure 16.5. The full formative model of WAS using PLS-SEM

1

INTRODUCTION TO THE THESIS

So here it is, my final version of the thesis. A thesis which was the best journey of

my life. One which I fell in love with and developed over the years. A thesis that made me

learn a lot and taught me to never stop learning. A thesis that was by my side for seven

years, for all the ups and downs.

1.1 Introduction

The journey started with an initial interest in exploring the SEM-based validation of

measures for formative constructs. At the start I was lost. I felt like I was walking in the

dark with no hope of finding the light. But as time passed and the more I immersed myself

in the subject, the more it all started to make sense. My first realisation was that constructs

are not reflective ‘by default’, as the majority of scholars assume. A real-life example

exists right in front of us: socioeconomic status (SES). The three main

components of SES are income, education and occupational status. Although these

components may be correlated, they measure different constructs, which means that SES

cannot be a reflective construct, as assumed by some scholars. I grew up in a middle-class

family in a highly populated, developing country, where your social status is often defined

by your parents’ income or occupational status. This experience showed me that parents’

income and occupational status do not always depend on their level of education. This is

a perfect example of a formative construct.

I then reviewed the literature to see how big the problem of model misclassification

actually is, and as I was expecting, I found that it was big enough to do something about. It

is very difficult to evaluate formative models using the conventional SEM procedures,

mainly due to identification problems. It is therefore no wonder that the majority of

scholars - ‘by default’ - consider their constructs to be reflective. Specifically, evaluating

formative models with the conventional Covariance-Based Structural Equation Modeling

(CB-SEM) procedure in standard statistical software results in model identification

problems. In my search for a solution I discovered and became familiar with Partial Least

Squares SEM.

Later in my research on formative models, I met Distinguished Professor

Peter M. Bentler at UCLA. After I attended his statistics classes and had frequent discussions with him,

the application of SEM began to make more sense than ever before. During one of our meetings,

Peter mentioned Covariate-Dependent Reliability, prompting my second realisation.

The more I read and thought about it, the more I realised the potential applications

for this approach. Why is it that some measurement scales do not have an acceptable level

of reliability in all situations or populations? Is an IQ test derived within a European

context applicable for a remote aboriginal community? Can the reliability of the IQ test that

is associated with ethnicity be separated from the reliability that is independent of

ethnicity? Clearly, the reliability of any measure can be influenced by covariates or

confounding variables. However, nobody seems to care about this issue, or if they do, they

don’t know how to account for it. With permission from Professor Bentler, I started

evaluating the application of Covariate-dependent Reliability. I was warned that this was a

risky exercise given the novelty of the topic and the lack of previous literature. But

nevertheless, the importance of this topic pushed me out of my comfort zone and stretched

the boundaries of my thesis.

Through my investigation of Covariate-dependent Reliability, I came to understand

its application in demonstrating Common Method Bias (CMB), caused by factors such as

social desirability. Not all students who fill in surveys about their drinking behaviours or

emotional intelligence tell us the truth. I came to realise that it was possible to test for CMB

in the following way. If CMB is treated as a covariate and the reliability of the

measurement scales is evaluated with and without it, then any change in

reliability allows us to argue that CMB exists. Moreover, we could extract the effect

of the CMB.
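The logic of this test can be illustrated with a small simulation. The sketch below is only a simplified stand-in and is not Bentler's covariate-dependent reliability estimator: it uses coefficient alpha and ordinary least squares to remove a covariate from each item. All names and numbers (`trait`, `social_desirability`, the loadings 0.6 and 0.5, the sample size) are hypothetical values chosen for illustration.

```python
import numpy as np


def cronbach_alpha(items):
    """Coefficient alpha for an (n_respondents, n_items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    sum_item_var = items.var(axis=0, ddof=1).sum()
    total_score_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - sum_item_var / total_score_var)


def residualise(items, covariate):
    """Remove the linear effect of one covariate from every item (OLS)."""
    design = np.column_stack([np.ones(len(covariate)), covariate])
    beta, *_ = np.linalg.lstsq(design, items, rcond=None)
    return items - design @ beta


# Simulated data (assumption): each item reflects a substantive trait plus
# a method effect driven by social desirability, plus random noise.
rng = np.random.default_rng(0)
n, k = 5000, 6
trait = rng.normal(size=n)
social_desirability = rng.normal(size=n)
noise = rng.normal(scale=np.sqrt(0.39), size=(n, k))
items = 0.6 * trait[:, None] + 0.5 * social_desirability[:, None] + noise

alpha_with = cronbach_alpha(items)
alpha_without = cronbach_alpha(residualise(items, social_desirability))

print(f"alpha with the covariate left in:   {alpha_with:.3f}")
print(f"alpha with the covariate removed:   {alpha_without:.3f}")
print(f"difference attributable to method:  {alpha_with - alpha_without:.3f}")
```

In this simulated case the reliability drops once the covariate is removed, which is the kind of change that would be read as evidence of CMB. With real data the covariate would be a measured social desirability score rather than a simulated one.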

Finally, after learning about the importance of bifactor models for multidimensional

modeling, I was encouraged to also explore this neglected area. For the complex

multidimensional Work Organisation Assessment Questionnaire (WOAQ),

bifactor analysis provided a significant improvement in the measurement model.

1.2 Study Structure

The main theme of this journey is the testing of model-based reliability and validity

of measurement models using SEM. I have pursued three new developments in this area in

the following three studies:

1) Model-based reliability and validity evaluation in bifactor measurement

models, with special focus on the Work Organisation Assessment Questionnaire (WOAQ),

comparing the results with a second-order model;

2) The application of the newly developed theory of Bentler’s Covariate-dependent

Reliability, not only for reliability assessment but also for demonstrating Common Method

Bias (CMB); and

3) Evaluating Model-Based Reliability and validity in reflective-formative models,

using Partial Least Squares SEM. By using this procedure, the existing misspecified model

of WAS will be compared with a correctly specified model to highlight the impacts of

model specification errors.

A summary of these three studies is presented in Figure 1.1. Covariance-Based

SEM is used for the first two studies in the context of reflective measurement

models, and PLS-SEM is used for the third study in the context of reflective-formative

measurement models. The theory for these methods is covered in Chapters 2-5. In Study one

(Chapters 6 to 8) the validation of a bifactor model for the Work Organisation Assessment

Questionnaire (WOAQ) will be demonstrated in a health setting. No such validation has been

carried out before for the WOAQ. In addition, the

model-based reliability coefficients (Omega total, Omega hierarchical and Omega

subscales) will be computed and compared with the conventional coefficient alpha. The

first part of the study was conducted with a sample of community nurses. The second part

of this study concerns the cross validation of the bifactor model with a paramedic sample

(across gender), to find out whether the bifactor model of the WOAQ

has similar properties in another, very different population within

the health sector.
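To make the comparison between the omega coefficients and coefficient alpha concrete, the following sketch computes them from the standardised loadings of a bifactor model with orthogonal factors. It is a minimal illustration under stated assumptions: the function name `bifactor_reliability`, the six-item layout and the loadings (0.6 general, 0.4 specific) are hypothetical and are not WOAQ estimates. The subscale coefficient here is the share of subscale-score variance due to the group factor alone, which is one common convention and may differ from the definition used later in the thesis.

```python
import numpy as np


def bifactor_reliability(general, specific, groups):
    """Omega coefficients and alpha from standardised bifactor loadings.

    general  : loading of each item on the general factor
    specific : loading of each item on its own group factor
    groups   : group-factor label of each item
    Assumes standardised items and mutually orthogonal factors.
    """
    general = np.asarray(general, dtype=float)
    specific = np.asarray(specific, dtype=float)
    groups = np.asarray(groups)

    # With standardised items, unique variance is whatever the factors leave.
    unique = 1.0 - general**2 - specific**2
    labels = np.unique(groups).tolist()

    general_part = general.sum() ** 2
    group_part = sum(specific[groups == g].sum() ** 2 for g in labels)
    total_var = general_part + group_part + unique.sum()

    omega_subscale = {}
    for g in labels:
        m = groups == g
        sub_var = general[m].sum() ** 2 + specific[m].sum() ** 2 + unique[m].sum()
        omega_subscale[g] = specific[m].sum() ** 2 / sub_var

    # Alpha from the model-implied correlation matrix: its trace is k and
    # the sum of all its elements equals total_var.
    k = general.size
    alpha = k / (k - 1) * (1.0 - k / total_var)

    return {
        "omega_total": (general_part + group_part) / total_var,
        "omega_hierarchical": general_part / total_var,
        "omega_subscale": omega_subscale,
        "alpha": alpha,
    }


# Hypothetical example: six items, two group factors "A" and "B".
result = bifactor_reliability(
    general=[0.6] * 6,
    specific=[0.4] * 6,
    groups=["A"] * 3 + ["B"] * 3,
)
for name, value in result.items():
    print(name, value)
```

Omega hierarchical isolates the variance in the total score that is due to the general factor, whereas omega total also counts the group factors. Coefficient alpha does not separate these two sources, which is why the two kinds of coefficient can differ for a multidimensional measure.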

In Study two (Chapters 9 – 11), two applications of covariate-dependent reliability

will be demonstrated empirically. This is the first time that applications of covariate-

dependent reliability have been undertaken. One of these applications demonstrates

reliability evaluations in the context of Common Method Bias (CMB). This application

demonstrates that, if CMB exists, then the reliability of the measurements will be affected

when CMB is treated as a covariate source of reliability.

In Study three (Chapters 12 – 14), an empirical example of fitting a reflective-

formative measurement model using Partial Least Squares SEM is presented step by step.

To the best of the researcher’s knowledge, there has been no previous clear guideline or

procedure for fitting a reflective-formative model in the literature of Partial Least Squares

SEM. This model is fitted for a work ability measure (WAS) allowing the testing of both

validity and reliability. This reflective-formative model is compared with a misspecified

reflective-reflective model, demonstrating the errors that occur as a result of

misspecification.

Figure 1.1. The study structures in this thesis.

Note: CB-SEM: Covariance-based Structural Equation Modeling; PLS-SEM: Partial Least Squares Structural Equation Modeling; WOAQ: Work Organisation Assessment Questionnaire; WAS: Work Ability Scale.

Model-based reliability and validity

CB-SEM (Reflective models), Chapters 2-4

Study 1: Bifactor and higher-order model validation of WOAQ, Chapters 6-8

Part I: Study of nurses - Validation and model-based reliability of WOAQ

Part II: Study of paramedics - Cross validation of WOAQ

Study 2: Applications of covariate-dependent reliability, Chapters 9-11

Part I: Study of nurses & paramedics - Application in reliability

Part II: Study of students - Application in common method bias

PLS-SEM (Reflective-formative models), Chapter 5

Study 3: Reflective-formative model validation of WAS, Chapters 12-14

Part I: Fitting hierarchical models with formative constructs

Part II: Validation and model-based reliability of WAS

Chapter 2 provides an overview of SEM and Chapter 3 covers model-based

reliability, along with new developments, current gaps and applications in these areas.

Chapter 4 compares bifactor and second-order models and Chapter 5 compares reflective

and formative models, presenting an overview of the history of misspecification for

formative models. The misspecification rates for formative and reflective models are

assessed for two top journals in Organisational Psychology over a nine-year period (2006-2014).

A solution to this model identification problem is provided by way of a decision flowchart

for distinguishing formative from reflective models. Chapters 6 to 8 cover the validity,

cross validity and model-based reliability of the Work Organisation Assessment

Questionnaire (WOAQ), using both bifactor and second-order factor models. Chapters 9 to

11 cover two new applications of covariate-dependent reliability, and also demonstrate how

CMB can be detected and measured using this approach. Chapters 12 to 14 concentrate on

the validity and reliability assessments of reflective-formative models as opposed to

misspecified reflective-reflective models using PLS-SEM. Finally, all three studies are

summarised in Chapter 15.

1.3 Summary

In spite of the importance of multidimensional model-based reliability

measurement, there is limited empirical study of model-based reliability coefficients. In

addition, bifactor models and measures with formative constructs have received less

attention in the literature compared to higher-order and reflective models. Either scholars

do not recognise the importance of these topics, or the appropriate statistical software is not

readily available for performing analysis. This research is designed to fill some theoretical

and methodological gaps in this area. Moreover, for the first time, this study demonstrates

the practical implications of the newly introduced theory of covariate-dependent reliability

of Bentler (2014) for reliability and common method bias assessment.

2

THE HISTORY OF THE EVOLUTION OF SEM IN PSYCHOLOGY

Structural equation modeling (SEM) is one of the major research tools that is

rapidly growing in popularity. SEM is a statistical technique for testing and examining

measurement models and causal relations, using a combination of statistical data and

qualitative causal assumptions (Pearl, 2000). SEM techniques are a major component of

applied multivariate statistical analysis and are widely used by researchers in different

disciplines.

In a broader sense, SEM represents a series of cause-effect relationships between

variables in a composite testable model (Shipley, 2000). It extends conventional

multivariate statistical analysis by accounting for measurement error and by more

thoroughly examining goodness-of-fit. The SEM technique has grown out of methods such

as path and factor analysis.

SEM has attracted attention primarily because it lends itself to effectively studying

problems or models that cannot be easily investigated using other approaches. In this

chapter, a history of the original roots of SEM in psychology will be traced, followed by a

discussion of the current developments in SEM. The structure of the chapter is based on the

path diagram presented in Figure 2.1. The idea of showing the history of SEM using a

graph originated from a personal conversation with Professor Peter Bentler in 2012. The

researcher was inspired by the idea and extensively developed the graph to include all the

major developments in SEM.

Figure 2.1. A pseudo path diagram and timeline for some of the developments in SEM modeling and SEM model structures. Acknowledgement:

Special thanks to Professor Peter Bentler (personal communication, 2012), for his inspiration and input into development of the diagram.

[Diagram labels, arranged along a timeline from 1900 to 2000 in 20-year steps: Reliability; Classical Test Theory; Modern Test Theory; Regression; Principal Components; Factor Analysis; Confirmatory FA; Path Analysis; Simultaneous Equations; FASEM; Linear Bentler/Weeks; Nonlinear; Multilevel; Mixture; Bootstrapping; Formative Models; MIMIC; PLS; GLLAMM; Meta Analysis]

2.1 First Trend: Exploratory Factor Analysis¹

Exploratory factor analysis (EFA) has made an important contribution,

especially in the social sciences, by addressing the needs and interests of various

disciplines. The primary roots of SEM in psychology can be traced to the work of

Pearson (1901) on orthogonal least squares. Pearson’s theory was not fully appreciated

at the time, but it later became a foundation for principal component analysis and

correlation matrix analysis (Hotelling, 1933). Spearman (1904), an English

psychologist, also contributed substantially. Spearman is commonly regarded as the

pioneer of factor analysis, based on his work involving the finding of relationships

between multiple correlated measures of cognitive performance. Factor analysis is

defined as a type of statistical procedure that is conducted to identify clusters or groups

of related items (called factors).

Using factor analytic data, Spearman postulated his original two-factor models for

ability and intelligence, highlighting the theory testing nature of the method. Spearman

found that children's scores on different subjects were connected. Spearman then

extended his theory, proposing three types of factors: a general factor (g), referring to all

activities; a specific factor (s) that refers to a specific mental activity; and a group factor

which is common to some of the variables but not all. Other scholars gradually adopted

this approach using factor analysis (e.g. Mosier, 1939; Guttman, 1952; Lawley, 1940;

Anderson and Rubin, 1956).

¹ Shown in blue in Figure 2.1.

2.1.1 Multiple factor analysis

Spearman's two-factor theory was criticised widely. Thomson (1916, 1935)

strongly criticised the sampling theory approach in regard to abilities in the early stages

of the development of Spearman’s two-factor theory. It was claimed that the analysis

considered only a sample of possible abilities, making such an analysis incomplete.

However, the biggest critic of the Spearman model was Wilson (1928a, 1928b, 1929), a

famous mathematician who made a significant contribution to the development of factor

analysis. In different papers and using different examples, Wilson highlighted

indeterminacy issues; lack of uniqueness in the variable (g) of Spearman’s theory; and

the identifiability problem in the variance-covariance parameters of factor analysis. A

number of scholars (such as Irwin, 1935; Thomson, 1935) added to Wilson’s work by

further developing an understanding of indeterminacy (Steiger & Schönemann, 1978).

In the early stages of Spearman's development of factor analysis, some scholars (e.g.

Wilson, 1929; Thomson, 1935) suggested that factor indeterminacy might seriously

affect the ultimate purpose of the model, making this a very important theoretical issue

between 1928 and 1939. However, the focus moved away from factor indeterminacy

when Wolfe (1940) wrote his objective historical review of factor analysis, until 1955,

when factor indeterminacy again attracted attention.

The Spearman two-factor methods were also criticised on the grounds that they

were not appropriate for situations that involved more than one group factor. In 1931,

Thurstone considered this as one of the serious limitations of Spearman's method,

mainly because psychological problems usually involved two or more group factors

(Thurstone, 1935). This limitation led to an interest in multiple-factor analysis to

supplement Spearman’s model, whereby group factors were identified after extracting a

general factor (e.g., Holzinger 1941). In multiple-factor analysis there are no restrictions

on the number of general factors or the number of group factors (Thurstone, 1935).

Thurstone (1947) further developed his multiple-factor analysis using the centroid

method of factoring for a correlation matrix, a pragmatic alternative to the

computationally burdensome principal axis method. This method of factor analysis

attracted further attention in the 1960s (Bentler, 1968; McDonald, 1970). However, from

the early 1980s, explicit optimisation functions, such as least squares, maximum

likelihood (ML), and minimum chi-square, became more popular.

Thurstone (1947), with later contributions from Cattell (1978), developed the

foundations for the concept of factor rotation. Other scholars extended Thurstone’s

work by proposing practical solutions for rotation. The most popular rotation methods

included the Varimax orthogonal rotation, which forced factors to be uncorrelated

(Kaiser, 1958); and various oblique rotation methods (Jennrich & Sampson, 1966;

Jennrich & Clarkson, 1980) which allowed the factors to be correlated. As a result of

these developments exploratory multiple-factor analysis became popular during this

period.

2.2 Second Trend: Confirmatory Factor Analysis (CFA)

There is obviously a connection between exploratory and confirmatory factor

analysis methods. However, other statistical theory apart from exploratory factor

analysis has also made a significant contribution to the development of confirmatory

factor analysis (Bentler, 1986). These theories include analyses for higher order factors.

Although Thurstone (1947) seems to be acknowledged for proposing the

mathematical foundation of second-order factor analysis, it was Jöreskog in 1970 who

wrote an equation including first and second-order factors as a single model and it was

Bentler in 1976 who offered a complete and general structure for higher-order factors.

The problem of rotating factor solutions was avoided when confirmatory factor

analysis (CFA) was introduced. In CFA, the factors and their loadings are

specified before the analysis starts, transforming the problem into one of identification of a

model’s parameters from observed moments (Matsueda, 2012).

CFA was introduced originally by Tucker (1955). It was further developed

following the introduction of an ML approach to factor analysis (Lawley, 1940;

Anderson & Rubin, 1956). Finally, it was Jöreskog (1969) who developed the first

computer software programs for CFA estimation using ML.

2.3 Third Trend: Factor Analysis of SEM (FASEM)

Real progress in the evolution of SEM was produced by the integration of the

earlier SEM developments in psychometrics, sociology, econometrics, and biometry

(Bentler, 1986). The factor analysis of structural equation modeling (FASEM) and the

resulting linear structural relations (LISREL) software were the main outcomes of this

integration. At the time, simultaneous equation and path analysis methods were the

main new contributors to FASEM and LISREL.

2.3.1 Path Analysis

Sewall Wright was one of the first scholars to use path analysis in medical science,

applying it in his studies from the 1920s. Path analysis was one of the

primary procedures used to determine a causal structure. Wright used observed

variables to develop a correlation matrix, and drew path diagrams indicating direct and

indirect effects.

Path analyses led Wright to develop the Multiple Indicators Multiple Causes

(MIMIC) model among others (Matsueda, 2012). Figure 2.2 presents an early path

analysis by Wright (1920) indicating path modeling of heredity and environment in

shaping the piebald pattern of guinea-pigs.

Figure 2.2. One of Wright's first path diagrams for genetic modeling.

Source: Wright, Sewall (1920). The relative importance of heredity and

environment in determining the piebald pattern of guinea-pigs. Proceedings of the

National Academy of Sciences, 6, 320-332.

2.3.2 Simultaneous equation and errors-in-variables models in economics

The development of SEM in econometrics can be attributed perhaps to Frisch

and Waugh (1933), Haavelmo (1943), and Koopmans (1945). Frisch (1934), the

founder of the Econometric Society and the Econometrica Journal, invented the term

“econometrics” and developed many of the identification principles in SEM. The

advances made by Haavelmo (1943), another economist, together with Mann and Wald,

led to work on SEM at the Cowles Commission (1952). This resulted in Haavelmo

solving the major problems of identification, estimation, and testing in SEM.

Koopmans et al. (1945) made some empirical advances on Haavelmo’s model.

However, according to Matsueda (2012), it was Klein (1950) who made the most

significant contribution to the empirical application of simultaneous equation models

using Keynesian economic models, culminating with the 15-equation Klein-Goldberger

model estimated by limited-information methods (Klein & Goldberger 1955). Other

scholars made further contributions to the model (e.g. Anderson & Rubin 1949; Zellner,

1962; Zellner & Theil, 1962).

Frisch (1934) first created an errors-in-variables model and then a graphical

presentation of regression coefficients (the method of bunch maps) which was proposed

as a tool to discover underlying structures, often obtaining approximate bounds for

relationships. According to Hendry and Morgan (1989), Frisch treated observed

variables as fallible indicators of latent variables, examining the interrelationships

among all latent and observed variables to distinguish true relations from confluent

relations.

Frisch’s errors-in-variables model was ignored until the early 1970s when

Zellner became interested and demonstrated the use of generalised least squares (GLS)

and Bayesian approaches in estimating a model with a fallible endogenous predictor.

Later, Goldberger (1971) showed that GLS was equivalent to ML only when errors

were normally distributed with known variances. He also showed that when error

variances were unknown, an iterated GLS converged to ML.

According to Bentler (1986), Goldberger was one of the first researchers to

realise the need to integrate some SEM-related ideas into other disciplines (Goldberger,

1971; Goldberger & Duncan, 1973). This integration was one of the turning points in

the evolution of SEM in the 1970s.

2.3.3 FASEM

FASEM is a generic acronym for factor analysis (FA) structural equation

modeling (SEM), a major development in the 1970s and 1980s. FASEM was first used

by Bentler (1986) to refer to conceptual approaches to continuous variables in SEM.

The Conference on Structural Equation Models in 1970 contributed greatly to

the integration of SEM disciplines. The conference was an interdisciplinary forum of

economists, sociologists, psychologists, statisticians, and political scientists, and the

academic papers were published in the volume Structural Equation Models in the

Social Sciences, edited by Goldberger and Duncan (1973).

According to Bentler (1986), the major achievements in the 1970s and 1980s

can be categorised into three sections: structural concepts, statistical theory and practical

development. The two key papers published in this period were written by Hauser and

Goldberger (1971) and Jöreskog (1973). Hauser and Goldberger’s (1971) examination

of unobservable variables is an exemplar of cross-disciplinary integration, drawing on

path analysis and moment estimators from Wright, as well as work by sociologists. It

also incorporates factor-analytic models from psychometrics, efficient estimation, and

Neyman-Pearson hypothesis testing from statistics and econometrics. Hauser and

Goldberger used limited-information estimation to gain a better understanding of

structural equations estimated by ML. Jöreskog (1973) presented an ML framework for

estimating the parameters of these SEM models, developed a computer program for

empirical applications, and showed how the general model could be applied to a myriad

of important substantive models.

2.3.4 Nonlinear SEM

The turning point in the application of SEM in psychology dates back to the

1970s and 1980s, primarily through the work of Bentler and, more particularly, his

development of the EQS SEM software (Matsueda, 2012). Using such analytical

software for evaluating SEMs allows researchers to make better use of their data and to

study the empirical applications of some of the methods proposed by certain scholars in

the literature (Bentler, 1986). During the 1980s some researchers paid attention to

nonlinear SEMs, which helped to extend the overall scope of SEM. Some important

developments in nonlinear latent variable SEM, particularly those for categorical data,

emerged in the 1980s, mainly in the works of Bock and Aitkin (1981), Mislevy (1984)

and Muthén (1984).

2.3.5 Formative models

The first appearance of formative measures probably dates back to the Berkson

error model for radiation epidemiology studies described below. In the 1950s, the U.S.

carried out nuclear testing in the state of Nevada. Due to the sudden increase in thyroid

disease in surrounding areas, a major epidemiological study was carried out at the

University of Utah to evaluate the outcomes of radiation on health. The researchers

found that the main exposure to radiation came from milk and vegetable consumption.

Based on that finding, the people in the study who had similar milk intake were

assigned to the same dose group. Because the effect of radiation on the thyroid cannot

be observed directly, Berkson designed a method in which the true exposure to radiation

(true score) was a function of the amount of food consumption (observed score) with

some degree of uncertainty (measurement error).

In Classical Test Theory, a true score, with its measurement error, forms an

observed variable, while in the Berkson error model it is the opposite in that the true

score is equal to the observed score plus measurement error (Carroll, Ruppert, Stefanski, &

Crainiceanu, 2006). The Berkson measurement error concept has become the

cornerstone of what is today known as formative models. Although the concept of

formative measures was introduced by Berkson in 1950, it did not attract attention until

the late 1960s. Influenced by principal component and composite-like ideas, attention to

using formative measures in SEM has since increased. The biggest surge in the use of

formative models in certain situations occurred in the early 2000s. Many scholars (e.g.

Blalock, 1971; Diamantopoulos and Winklhofer, 2001; Jarvis, MacKenzie & Podsakoff,

2003; Petter, Straub & Rai, 2007) have alerted researchers to the relevance of formative

models in specific situations; however, this fact has unfortunately been

underemphasised in the literature.
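
The contrast between the classical and Berkson error models described above can be written compactly. The following is only a notational sketch: the symbols X (observed score), T (true score) and E (measurement error) are assumed here for illustration and are not taken from Carroll et al. (2006), and the independence conditions (written with the symbol ⊥) are the usual textbook assumptions for each model.

```latex
% Classical Test Theory: error is added to the true score,
% and the error is assumed independent of the true score
X = T + E, \qquad E \perp T
\;\Rightarrow\; \operatorname{Var}(X) = \operatorname{Var}(T) + \operatorname{Var}(E)

% Berkson error model: error is added to the observed (assigned) score,
% and the error is assumed independent of the observed score
T = X + E, \qquad E \perp X
\;\Rightarrow\; \operatorname{Var}(T) = \operatorname{Var}(X) + \operatorname{Var}(E)
```

Under these assumptions the observed score is more variable than the true score in the classical case, whereas the true score is more variable than the observed score in the Berkson case.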

2.3.6 Multiple-indicators multiple-causes model (MIMIC)

One of the techniques implemented by Wright in the 1920s using path analysis

is similar to what is now known as a MIMIC model (Matsueda, 2012). The main

advancement in MIMIC was achieved through the works of Jöreskog, Hauser, and

Goldberger in the 1970s. They introduced ML as the estimation method for over-

identified MIMIC models.

The release of the LISREL statistical software by Jöreskog in the 1970s

produced the greatest advancement in estimating MIMIC models. LISREL is still

popular among scholars because of its ability to incorporate factor analysis, path

analysis, and FASEMs into a general covariance structure model (Jöreskog and Sörbom

2001; Matsueda, 2012). Using the MIMIC model, identification and estimation of

formative models has become feasible in some circumstances.

2.4 Current Developments in SEM

Some of the most current developments in SEM include multilevel models,

generalised linear latent and mixed modeling (GLLAMM), partial least squares (PLS)

and SEM-based meta-analysis.

2.4.1 Multilevel models

Multilevel models can be estimated using multiple indicator measurement

models in SEM. Using this approach separate models for within and between group

covariances are considered. Further, by using a multiple group analysis, the parameters

can be calculated simultaneously for both levels (Muthén, 1994). Although this

estimation method can be applied using almost any SEM software, this is generally only

for a few specific models.

2.4.2 GLLAMM

As mentioned above, the use of multilevel models is limited to specific models

and cannot be applied to all models. In response to this limitation, a more advanced and

general estimation method, GLLAMM, was introduced by Rabe-Hesketh, Skrondal &

Pickles (2004), and further developed by Skrondal and Rabe-Hesketh (2004).

GLLAMM has three main components: a generalised linear model, a structural equation

model for latent variables, and distributional assumptions for these latent variables

(Matsueda, 2012). The generalised linear model is capable of analysing all types of

data: continuous, ordinal, dichotomous and discrete. The GLLAMM program is now

part of the Stata program. Many of the GLLAMM models can also be analysed by

Mplus, a powerful software package developed by Muthén and Muthén (2004).

2.4.3 PLS

The roots of PLS, as well as graphical models, can be traced to Herman Wold in

1977 (Geladi, 1988). Wold's PLS modeling was enhanced by the idea of principal

component analysis as well as Jöreskog's LISREL software program.

Originally, PLS was developed to solve the problem of multicollinearity in

multiple regression analysis. According to Wold (1979), PLS regression was an

appropriate estimation method for complex models with undeveloped theoretical

backgrounds. The original application of PLS was more for predictive models (Barclay,

Higgins, & Thompson, 1995). Later, as an alternative to Jöreskog’s covariance-based

SEM approach, Wold introduced SEM based on PLS. Because PLS-based SEM imposes

fewer requirements, for example regarding normally distributed data and large sample sizes,

it is known as soft modeling. Despite its less restrictive nature, PLS-based SEM did not

become as popular as covariance-based SEM. The main reason for this was a lack of

software for model estimation.

However, since 1984, and especially from the early 2000s, more user-friendly

software has been introduced for the estimation of PLS-based SEM, adding to the

popularity of the method. Software such as PLS-GUI (Li, 2005), Visual PLS (Fu,

2006a), PLS-Graph (Chin, 2004), SmartPLS (Ringle et al., 2005), SPAD-PLS (Test &

Go, 2006) and XLSTAT (Addinsoft, 2008) are among the recent developments

(Morales, 2011).

There has been much debate among the scholars about the application of PLS

and the lack of a goodness-of-fit test. These issues are discussed in Chapter 3.

2.4.4 SEM-based meta-analysis

The concept of SEM-based meta-analysis was introduced by Cheung (2008) to

integrate SEM results from different studies. As a result, studies in meta-analysis are

relevant to SEM. Although Cheung's proposed approach added a new and important

methodological development in SEM, it is not yet fully incorporated into the current

popular SEM software, limiting its further application in practice.

2.5 Conclusion

SEM is rapidly growing in popularity as a major research tool in psychology.

The early foundation of SEM can be traced back to factor analysis, principal component

analysis, regression and path analysis. It started in various disciplines such as

psychometrics, sociology, econometrics and biometry. The Interdisciplinary Conference

on Structural Equation Models in 1970 greatly influenced the integration of SEM work

in these disciplines. The work of Bentler and, especially, his development of EQS in

the 1970s and 1980s, was another turning point for the application of SEM in psychology. Since then,

SEM has rapidly developed. In particular MIMIC models were developed for the fitting

of formative models in some circumstances. Other recent developments such as PLS,

GLLAMM and multilevel models have extended the application of SEM techniques to a

higher level.

Although this area is progressing rapidly, there is a risk that the technique will

be misused due to its complexity or lack of knowledge among psychological

researchers. Some of the most controversial debates relate to model-based reliability,

model misspecification (formative vs. reflective) and the use of Partial Least Squares

SEM (vs. covariance-based SEM). These three issues are described in more detail in the

following three chapters to highlight their importance to researchers.

3

THE EVOLUTION OF MODEL-BASED RELIABILITY ESTIMATES

3.1 Introduction

The literature on reliability has developed following the introduction of classical

test theory in the 1900s. Coefficient alpha (α) has been widely utilised as a coefficient

of internal consistency for tests (measurement scales) where overall scores are

generated from the summation of test items (Bollen, 1989; Miller, 1995). Despite its

popularity, the application of coefficient alpha as a reliability estimate has been

contentious and has been subjected to numerous criticisms by scholars such as Green

and Hershberger (2000), Green and Yang (2009a) and Sijtsma (2009a). Some scholars

argue that coefficient alpha has been commonly misinterpreted as a measure of test

homogeneity or unidimensionality (Green & Yang, 2009b; Miller, 1995). Other scholars

such as Miller (1995) as well as Rogers, Schmitt, and Mullins (2002) claim that

coefficient alpha may not be suitable for multidimensional composites. From a different

viewpoint, others claim that the conventional coefficient alpha can overestimate or

underestimate true reliability (Raykov, 1997, 1998; Miller, 1995). According to Raykov

(1998) and Bentler (2009), coefficient alpha estimates reliability correctly only when

the error terms are uncorrelated and the assumption of essential tau-equivalence is met.

This term is explained below.

Due to the limitations of coefficient alpha, over the past decades attention has

shifted to a model-based internal consistency coefficient for measuring test score

reliability. Some scholars have embraced Structural Equation Modeling (SEM)

approaches for the estimation of model-based reliability as an alternative to coefficient

alpha to improve the reporting of psychometric internal consistency (Sijtsma, 2009b).

Unlike classical test theory which considers true-score variance, the SEM approach to

model-based reliability focuses on the composition of the true score. This means that

the true-score variance is partitioned into variance components allowing the researcher

to consider the importance of the different variance components that contribute to test-

score reliability (Sijtsma, 2009b).

In this chapter, the evolution of model-based reliability estimates

will be explored. The focus will initially be mainly on model-based reliability assuming

a unidimensional reliability coefficient (Jöreskog, 1971; McDonald, 1985; Bentler,

2007); however, reliability coefficients for multidimensional and bifactor models will

also be considered, in the form of the Omega, Omega total, Omega hierarchical and

Omega subscale reliability coefficients (McDonald, 1978, 1999; Zinbarg, Revelle,

Yovel, & Li, 2005; Reise, Bonifay, & Haviland, 2012). The newer theory of covariate-

dependent and covariate-free reliability of Bentler (2014) will also be discussed. The

above-mentioned model-based reliability coefficients are estimated using CB-SEM. For

completeness, the chapter will also briefly discuss composite reliability (CR) using

PLS-SEM. The application of composite reliability in scale models, involving formative

constructs, will be elaborated upon in a later chapter.

3.2 Classical Test Theory and Coefficient Alpha

Constructs or latent variables are commonly used to classify or group similar

behaviours or attributes. However, constructs in psychology are usually measured

indirectly, through tests, surveys, or tasks. Designing such measurement instruments

(scales) for measuring constructs is challenging. The test developer must deal with

many measurement problems. The study of measurement problems, including the extent

to which they influence the measurements and methods for dealing with these problems,

has evolved into a specialised discipline known as Test Theory. Test Theory “provides a

general framework for viewing the process of instrument development” (Crocker and

Algina, 1986; p. 7).

Historically the roots of Test Theory were developed mainly by psychologists

from Europe and the United States. In Europe, the early development of Test Theory

dates back to the mid-1800s with the work of Wilhelm Wundt, Ernst Weber, Gustav

Fechner and their colleagues in Germany. In Great Britain, scientists including Sir

Francis Galton, Charles Darwin and Karl Pearson were among the main scholars who

significantly contributed to the development of Test Theory.

Figure 3.1. A pseudo path diagram and timeline for some of the developments in the conceptualisation and estimation of model-based reliability.

[Timeline from 1900 to 2000, running from “Early Roots” to “Recent Developments”, with bands for Partial Least Squares SEM (PLS-SEM) and Covariance-based SEM (CB-SEM). Entries:]

Internal consistency reliability: Kuder and Richardson (1937); Hoyt (1941); Guttman (1945)

Coefficient alpha: Cronbach (1951)

Unidimensional reliability (Jöreskog, 1971; Heise & Bohrnstedt, 1970)

Multidimensional Omega reliability (McDonald, 1978)

Latent variable model reliability rho (Bentler, 2007)

Covariate-dependent and covariate-free reliability (Bentler, 2014)

Omega hierarchical, Omega subscales and Omega total (McDonald, 1999; Zinbarg, Revelle, Yovel, & Li, 2005; Reise, Bonifay, & Haviland, 2012)

Composite reliability (ρc) (Werts, Linn & Jöreskog, 1974)

Between 1905 and 1908, the French psychologists Alfred Binet and Theophile

Simon established an example of psychological assessment that has stood the test of

time. They successfully created an intelligence (IQ) test to measure the level of

intellectual ability in children. Empirical test analysis and the advanced concept of norms

can be attributed to the work of Binet and are still used by modern test developers.

Early in the 20th century, American scientists made some progress developing

Test Theory further. In 1904, E. L. Thorndike published the first textbook on Test

Theory and James McKeen Cattell acknowledged the significance of norms and errors

in observations. The founding of the Psychometric Society in the 1930s promoted

further advancements for the establishment of Test Theory. Through the journals Psychometrika

and Educational and Psychological Measurement, more opportunities were

available for scholars to exchange ideas and theories in this field.

In 1869, Galton's study of students from Cambridge University showed that

mental abilities could be distributed as a normal curve allowing the application of

statistical techniques to psychological test data. Karl Pearson’s work with the

computational formula of the correlation coefficient followed. The procedure known as

factor analysis was originally developed based on an advanced set of correlational

procedures designed by Charles Spearman, later becoming one of the most popular

statistical procedures for assessing the validity of measurement instruments.

The importance of Test Theory in research and evaluation is well recognised. In

order to achieve accurate, comparable outcomes, it is crucial for researchers to adhere to

the principles of Test Theory for developing or testing measurement instruments and to

evaluate the accuracy and sensitivity of these tools before utilising them for research

purposes.

The reliability coefficient is defined as “… the degree to which

individuals’ deviation scores, or z scores, remain relatively consistent over repeated

administration of the same test or alternate test forms” (Crocker & Algina, 1986; p.

105). In Test Theory, different types of reliability are introduced: ‘Test-retest’, ‘Parallel

Forms’ and ‘Internal Consistency’.

‘Parallel Forms’ reliability is based on creating two scales which provide

composite scores for measuring the same construct (Nunnally & Bernstein, 1994). This

reliability measure is calculated as the squared correlation between the composite scores

of the two scales. However, although this procedure is good for identifying
sources of error variance (Nunnally & Bernstein, 1994), it is rarely used in the
psychology literature.

‘Test-retest’ reliability is more commonly used. Instead of creating two different

scales and comparing the results as in parallel forms reliability, the consistency of the
responses over different time points is considered. Random measurement errors are
one of the main sources of inconsistency in the responses of individuals over time.
However, given the often limited time interval between test and retest, the accuracy of
the procedure has been criticised in the literature (Nunnally & Bernstein, 1994).

‘Internal Consistency’ reliability is less complex than parallel forms and test-

retest, in that a single scale is measured at only one time point. Two popular procedures

for estimating internal consistency reliability are ‘split-half’ and ‘Cronbach’s alpha’

(hereafter called coefficient alpha).

In the ‘split-half’ procedure, the scale is split into two parts and the
correlation between the two parts is computed. The stronger the positive correlation
between the two parts of the scale, the better the internal consistency of the scale. The
split-half procedure has two main limitations. Firstly, there is no clear procedure or
justification for splitting the scale into halves. Secondly, for time-limited testing, such
as ability or IQ measurement, with items arranged from easy to hard, the reliability
estimates may be upwardly biased (Cronbach, 1960). Because of these limitations,
coefficient alpha was introduced by Cronbach (1951) as the average of all
possible split-half reliability estimates of internal consistency.

Coefficient alpha was first cited in Cronbach’s famous article in Psychometrika

(1951). Other scholars (e.g. Kuder and Richardson, 1937; Miller, 1995) are credited

with the further development of this measure. In particular Kuder and Richardson

generated variance estimates for this measure using the mean of a series of reliability

coefficients calculated from a single study using a random split of items. Later, Hoyt

(1941) proposed a conservative estimation procedure for assessing the reliability of a

scale based on an analysis of variance decomposition of the data. This estimation

procedure delivers similar results to the KR20, described below, but underestimates

reliability.

As explained above, the coefficient alpha formula was proposed as the mean of
all possible split-half coefficients for a particular scale, with

$$\alpha = \frac{n}{n-1}\left(1 - \frac{\sum_{i=1}^{n} s_i^2}{s_x^2}\right)$$    Equation 3.1

where the number of items is n, the estimated variance of item i is $s_i^2$ and the estimated
variance of the scale (X) is $s_x^2$. A value closer to one suggests a scale with better
internal consistency.
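As an illustration of Equation 3.1, the following minimal sketch computes coefficient alpha from a respondents-by-items score matrix. The function name `coefficient_alpha` and the small data set are hypothetical and are not taken from the studies in this thesis; sample variances with n − 1 in the denominator are assumed.

```python
import numpy as np

def coefficient_alpha(scores):
    """Coefficient alpha (Equation 3.1) for an (n_respondents, n_items) array."""
    scores = np.asarray(scores, dtype=float)
    n_items = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1)       # s_i^2 for each item
    total_var = scores.sum(axis=1).var(ddof=1)   # s_x^2 of the sum score
    return n_items / (n_items - 1) * (1.0 - item_vars.sum() / total_var)

# Hypothetical responses: 5 respondents answering 3 items
demo_scores = np.array([[2, 3, 3],
                        [4, 4, 5],
                        [1, 2, 2],
                        [5, 4, 4],
                        [3, 3, 4]])
alpha_demo = coefficient_alpha(demo_scores)
print(round(alpha_demo, 3))
```

When all items are identical, the item variances sum to 1/n of the scale variance and the function returns exactly one, which is the upper limit described above.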


3.3 Major Problems with Using a Coefficient Alpha Reliability Analysis

Based on Cronbach's approach (1951, pp. 331-332), three essential
assumptions should be considered before applying coefficient alpha to evaluate
internal consistency. Unfortunately, these assumptions are ignored by many researchers,
resulting in concerns regarding the validity of coefficient alpha results for internal
consistency evaluation. These assumptions are “essential
tau equivalency”, “uncorrelated errors” and “uni-dimensionality”.

Essential Tau Equivalency assumes that each item makes an equal contribution

of variance to the true scale variance (Green and Yang, 2009a). However, equal factor

loadings are seldom found in a scale and, moreover, the majority of scales are

multidimensional with unequal variances explained by each dimension. Thus, the
Essential Tau Equivalency assumption is often violated (Sijtsma, 2009). When
the assumption is violated, the reliability coefficient can be negatively biased, so
the alpha coefficient often underestimates true reliability (Green & Yang, 2009;
Sijtsma, 2009).

Uncorrelated Errors assumes no correlation between the item errors ($e_i$) when
the ith item ($x_i$) is expressed as a linear function of the factor (f):

$$x_i = \lambda_i f + e_i$$    Equation 3.2

This assumption is also commonly invalid (for details see Green & Yang, 2009;

Sijtsma, 2009; Bentler, 2009). Violating the uncorrelated errors assumption leads to

several problems. For example, Bentler (2009) explains that violating the assumption

results in overestimated alpha coefficients because of unwanted systematic variance.


Other scholars (e.g. Green & Yang, 2009a; Sijtsma, 2009) argue that violating the

assumption can lead to either overestimation or underestimation of coefficient alpha.

Due to the common violation of coefficient alpha’s assumptions, the use of

coefficient alpha is criticised by several researchers (Bentler, 2009; Green &

Hershberger, 2000; Green & Yang, 2009; Sijtsma, 2009). This has led to several

improvements being suggested. These suggestions include reporting the greatest lower

bound (glb) for coefficient alpha as a measure of internal consistency (Sijtsma, 2009),

and Bentler’s dimension-free lower bound for reliability (blb)1 (1972).

The above-mentioned recommendations are not always appropriate for
computing reliability coefficients, for several reasons. The assumptions of blb and glb are
specified at a population level (lower bound reliability) and assume no sampling

covariance. As stated by Bentler (2009), “in practice, sample covariance and correlation

matrices must be used in the computation instead of their population counterparts,

which are essentially never available” (p. 141).

In addition, they also assume uncorrelated error terms and/or no known

dimension for the factor model. Thus, in the presence of correlated errors or strong

theoretical and empirical knowledge on the dimensionality of the model, it does not

seem appropriate to use the blb or glb.

Therefore, coefficient alpha and the above blb or glb related measures are not
appropriate when:

a) the assumptions of using coefficient alpha are violated,

b) the dimensionality of the measurement model is already established (as
unidimensional or multidimensional), and

c) the model fits the data well.

1 Bentler’s dimension-free lower bound reliability ($\rho_{blb}$) was proposed by Bentler (1972) based on no assumption on the number of factors. Under the same assumptions glb and blb are equal.

In many situations a model-based reliability measure is preferable to coefficient alpha
and the above blb or glb related measures. The unidimensional and multidimensional
model-based reliability estimates discussed in the next section are based on sample
covariance matrices and have weaker assumptions than coefficient alpha.

3.4 Unidimensional Model-based Reliability

In response to coefficient alpha's limitations and within the setting of
confirmatory factor analysis, the analysis of congeneric measures was introduced by
Jöreskog (1971) to calculate the uni-dimensional model-based reliability coefficient
$\rho_{11}$. This reliability coefficient is perhaps one of the earliest proposals for assessing
the reliability of 1-factor models that does not require equal item reliabilities
(Gerbing & Anderson, 1988). Using Maximum Likelihood (ML) estimation, $\rho_{11}$ can
be estimated in SEM using the following formula when the item residuals ($e_i$) are assumed
independent and k items with loadings $\lambda_i$ are included in a scale.

$$\rho_{11} = \frac{\left(\sum_{i=1}^{k}\lambda_i\right)^2}{\left(\sum_{i=1}^{k}\lambda_i\right)^2 + \sum_{i=1}^{k}\mathrm{Var}(e_i)}$$    Equation 3.3


The assumptions of essential tau equivalency or equal variance among items are
less important for this coefficient. The reliability coefficient will not be affected by
large differences between item variances. However, in the presence of equal factor loadings
and item variances, $\rho_{11}$ is equal to coefficient alpha.

Like the reliability coefficient $\rho_{11}$ proposed by Jöreskog (1971), the $\rho_t$
coefficient of Zimmerman (1972), defined below, is useful for estimating the model-based
reliability of 1-factor models when the assumption of equal factor loadings across
all items is not met, and it additionally accommodates correlated error terms
(McDonald, 1978; Raykov, 2001). However, as with $\rho_{11}$, when we have a
unidimensional construct with equal factor loadings and error variances for all items,
with no correlation between the residuals, the numerical value of $\rho_t$ will be
equivalent to that of coefficient alpha (Raykov & Shrout, 2002).

$$\rho_t = \frac{\left(\sum_{i=1}^{k}\lambda_i\right)^2}{\left(\sum_{i=1}^{k}\lambda_i\right)^2 + \sum_{i=1}^{k}\mathrm{Var}(e_i) + 2\sum_{1 \le i < j \le k}\mathrm{Cov}(e_i, e_j)}$$    Equation 3.4
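Equations 3.3 and 3.4 can be sketched numerically as follows. This is a minimal illustration, assuming the factor variance is fixed to one and that the loadings and residual (co)variances have already been estimated by SEM software; the function names and the numerical values are hypothetical.

```python
import numpy as np

def rho_11(loadings, error_vars):
    """Equation 3.3: one-factor reliability with independent residuals."""
    lam_sum_sq = np.sum(loadings) ** 2
    return lam_sum_sq / (lam_sum_sq + np.sum(error_vars))

def rho_t(loadings, error_cov):
    """Equation 3.4: one-factor reliability allowing correlated residuals.

    error_cov is the full k x k residual covariance matrix; the sum of all
    its elements equals sum Var(e_i) + 2 * sum_{i<j} Cov(e_i, e_j).
    """
    lam_sum_sq = np.sum(loadings) ** 2
    return lam_sum_sq / (lam_sum_sq + np.asarray(error_cov, dtype=float).sum())

# Hypothetical standardised estimates for a three-item scale
lam = np.array([0.7, 0.8, 0.6])
err = np.array([0.51, 0.36, 0.64])

r11 = rho_11(lam, err)
rt_uncorrelated = rho_t(lam, np.diag(err))   # reduces to Equation 3.3

err_cov = np.diag(err)
err_cov[0, 1] = err_cov[1, 0] = 0.10         # one positive residual covariance
rt_correlated = rho_t(lam, err_cov)
```

In this sketch a positive residual covariance enlarges the denominator, so `rt_correlated` is smaller than `r11`; ignoring such a covariance would overstate reliability.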

Akin to coefficient alpha, the above mentioned methods involve a one-factor

model which explains a set of items and are therefore not suitable for instruments and

scales that are multidimensional. However, it should be noted that the uni-dimensional

rho (ρ) reliability coefficients mentioned previously, can also be interpreted as a

unidimensional measure that quantifies the proportion of variance due to the most

reliable single dimension in a multidimensional space (Bentler, 2007).

In order to address the need for multi-dimensional reliability measures, the
Omega (ω) procedure was developed using multi-dimensional measurement models
fitted using SEM. McDonald’s (1978) coefficient omega (ω) is defined below in the
context of a 2-factor model with loadings ($\lambda_{ij}$) for the ith item on the jth factor ($\eta_j$) and
errors ($e_i$) for the ith item, i = 1, 2, …, k. It represents the ratio of the true variance to the
observed variance for this measurement model.

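$$\omega = \frac{\mathrm{Var}\left(\sum_{i=1}^{k}\sum_{j=1}^{2}\lambda_{ij}\eta_j\right)}{\mathrm{Var}\left(\sum_{i=1}^{k}\sum_{j=1}^{2}\lambda_{ij}\eta_j\right) + \sum_{i=1}^{k}\mathrm{Var}(e_i)}$$    Equation 3.5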

More recent developments have addressed more complex measurement models
such as the bi-factor models (Gignac, 2013; Reise et al., 2012) described below. This is
an under-investigated area in psychometrics, in which a general factor exists alongside
sub-factors (Revelle & Zinbarg, 2008; Zinbarg, Revelle, Yovel, & Li, 2005).

3.5 Recent Developments

3.5.1 Omega Hierarchical and Omega Subscale for Bi-factor Models

Using the same Omega formula described above, Omega hierarchical ($\omega_h$) and
Omega subscale ($\omega_s$) together describe how reliably a test measures the general factor
and the subscales of a hierarchical or bi-factor model. Omega hierarchical ($\omega_h$)
assesses the reliability of only the general factor of a bifactor
model. More specifically, it measures the proportion of variance in total scores that arises
from the general factor running across all the items (Reise, Bonifay & Haviland, 2013).

The degree of reliability of the proposed subscale scores can then be evaluated
after controlling for the variance generated from the general factor. This procedure
creates reliability measures known as Omega subscale ($\omega_s$) for each subscale (Reise et
al., 2012), using the same Omega formula for each subscale. Reise et al. (2012)
advocate the reporting of these reliability indices for all subscales (see Figure 5 for an
example). Reporting the Omega subscales is also very useful in bifactor models when
the plausibility of the subscales is of special interest. Omega hierarchical and omega
subscales can be easily estimated using the R psych package (Revelle, 2013) and
AMOS. In addition, calculating confidence intervals for the omega reliability
coefficients makes the estimates more informative.

A bi-factor model with 5 subscales is illustrated in Figure 3.2. The extent to
which multidimensionality affects both the general factor and subscale scores can be
appraised more accurately when the corresponding $\omega_h$ and $\omega_s$ values are reported in the
case of bifactor models (for more details on bifactor models, please see Chapter 4).

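The formulae for Omega hierarchical ($\omega_h$) and Omega subscale ($\omega_s$) are
provided below in the case of k items (i = 1, 2, …, k) contributing to a general factor with
loadings $\lambda_{gi}$, and P subscales $S_j$ (j = 1, 2, …, P) with loadings $\lambda_{S_j i}$ for the items
i = 1, 2, …, $S_j$ of each subscale.

$$\omega_h = \frac{\left(\sum_{i=1}^{k}\lambda_{gi}\right)^2}{\left(\sum_{i=1}^{k}\lambda_{gi}\right)^2 + \left(\sum_{i=1}^{S_1}\lambda_{S_1 i}\right)^2 + \left(\sum_{i=1}^{S_2}\lambda_{S_2 i}\right)^2 + \dots + \left(\sum_{i=1}^{S_P}\lambda_{S_P i}\right)^2 + \sum_{i=1}^{k}\mathrm{Var}(e_i)}$$    Equation 3.6

$$\omega_{s_1} = \frac{\left(\sum_{i=1}^{S_1}\lambda_{S_1 i}\right)^2}{\left(\sum_{i=1}^{S_1}\lambda_{gi}\right)^2 + \left(\sum_{i=1}^{S_1}\lambda_{S_1 i}\right)^2 + \sum_{i=1}^{S_1}\mathrm{Var}(e_i)}$$    Equation 3.7

$$\omega_{s_2} = \frac{\left(\sum_{i=1}^{S_2}\lambda_{S_2 i}\right)^2}{\left(\sum_{i=1}^{S_2}\lambda_{gi}\right)^2 + \left(\sum_{i=1}^{S_2}\lambda_{S_2 i}\right)^2 + \sum_{i=1}^{S_2}\mathrm{Var}(e_i)}$$    Equation 3.8

etc.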

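where the items i = 1, 2, …, $S_1$ all belong to the $S_1$ scale, the items i = 1, 2, …, $S_2$ all
belong to the $S_2$ scale, etc. Combining these reliabilities, the total reliability of the
P-factor measurement model is obtained using Omega total ($\omega_t$):

$$\omega_t = \frac{\left(\sum_{i=1}^{k}\lambda_{gi}\right)^2 + \left(\sum_{i=1}^{S_1}\lambda_{S_1 i}\right)^2 + \left(\sum_{i=1}^{S_2}\lambda_{S_2 i}\right)^2 + \dots + \left(\sum_{i=1}^{S_P}\lambda_{S_P i}\right)^2}{\left(\sum_{i=1}^{k}\lambda_{gi}\right)^2 + \left(\sum_{i=1}^{S_1}\lambda_{S_1 i}\right)^2 + \left(\sum_{i=1}^{S_2}\lambda_{S_2 i}\right)^2 + \dots + \left(\sum_{i=1}^{S_P}\lambda_{S_P i}\right)^2 + \sum_{i=1}^{k}\mathrm{Var}(e_i)}$$    Equation 3.9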
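The omega coefficients in Equations 3.6 to 3.9 can be sketched as below. This is only an illustration under stated assumptions: every item loads on the general factor and on exactly one subscale, all factors are orthogonal with unit variance, and the loadings and residual variances are hypothetical values rather than estimates from the WOAQ data. The function name `bifactor_omegas` is likewise an assumption and is not part of the R psych package or AMOS.

```python
import numpy as np

def bifactor_omegas(general, group, membership, error_vars):
    """Return (omega_h, {subscale: omega_s}, omega_t) for a bifactor model.

    general    : loading of each item on the general factor
    group      : loading of each item on its own subscale factor
    membership : subscale label of each item
    error_vars : residual variance of each item
    """
    general = np.asarray(general, dtype=float)
    group = np.asarray(group, dtype=float)
    membership = np.asarray(membership)
    error_vars = np.asarray(error_vars, dtype=float)
    labels = np.unique(membership).tolist()

    gen_part = general.sum() ** 2
    grp_part = sum(group[membership == s].sum() ** 2 for s in labels)
    total = gen_part + grp_part + error_vars.sum()

    omega_h = gen_part / total                   # Equation 3.6
    omega_t = (gen_part + grp_part) / total      # Equation 3.9
    omega_s = {}
    for s in labels:                             # Equations 3.7, 3.8, etc.
        mask = membership == s
        num = group[mask].sum() ** 2
        den = general[mask].sum() ** 2 + num + error_vars[mask].sum()
        omega_s[s] = num / den
    return omega_h, omega_s, omega_t

# Hypothetical standardised solution: 6 items, 2 subscales of 3 items each
omega_h, omega_s, omega_t = bifactor_omegas(
    general=[0.6] * 6,
    group=[0.4] * 6,
    membership=[1, 1, 1, 2, 2, 2],
    error_vars=[0.48] * 6,
)
```

In this hypothetical solution the general factor dominates, so omega hierarchical is much larger than either omega subscale, which is the pattern that would cast doubt on the plausibility of reporting separate subscale scores.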


Figure 3.2. Demonstrating Omega reliability coefficients for WOAQ

Note: $\omega_h$ = Omega hierarchical; $\omega_s$ = Omega subscale

3.5.2 Covariate-dependent and covariate free reliability

Bentler’s approach to covariate-dependent and covariate-free reliability based
on coefficient rho is the next new development to be discussed. To establish an
acceptable level of reliability in measurement instruments, there should be positive
intercorrelations among indicators, especially when these indicators are supposed to
represent a single latent construct (Nunnally, 1978; Zeller & Carmines, 1980).

Reliability is usually quantified with a reflective measurement model that implies that

the latent factor generates the systematic responses to a set of items (Bollen & Lennox

1991). In such a model, since error and specific scores are confounded, an increase in

error variance or specific item variance implies a decrease in internal consistency

(Green & Yang, 2009).

More recently, Bentler (2014) introduced the concept of covariate-dependent

and covariate-free reliability that partitions total reliability into parts based on external

covariates and a part which is unaffected by such covariates. The following material on

covariate-dependent reliability is adapted from either personal conversations with

Bentler (2012, 2013) or Bentler (2014). Only the practical application of this concept is

assessed in this study.

Suppose that the covariance matrix for a given set of p variables $X_i$ can be
modelled as $\Sigma = \Sigma_c + \Psi$, where $\Sigma_c$ is the part of the covariance matrix that contains all
influences of latent common factors on the observed variables and $\Psi$ is the covariance
matrix of the error (unique or residual) variation. In a confirmatory factor model with
$\Sigma = \Lambda\Phi\Lambda' + \Psi$, $\Sigma_c = \Lambda\Phi\Lambda'$ represents the factor-implied covariances of the variables.
In this case the reliability coefficient rho, describing the internal consistency of the sum
score $X = \sum_{i=1}^{p} X_i$, is calculated using a unit-weighting vector $\mathbf{1}$ as:

$$\rho_{XX} = 1 - \frac{\mathbf{1}'\Psi\mathbf{1}}{\mathbf{1}'\Sigma\mathbf{1}}$$    Equation 3.10

where $\mathbf{1}'\Psi\mathbf{1}$ is the sum of the unique or error variances associated with the X-variables,
and $\sigma_x^2 = \mathbf{1}'\Sigma\mathbf{1}$ is the sum of all the elements in the model-reproduced covariance matrix
of the X-variables. Clearly, $\rho_{XX}$ represents the proportion of construct-based variance to
the total variance of the sum score (Figure 6).

Now suppose a model contains latent variables that are influenced by a set of

covariates that we may call Z-variables. For simplicity, we consider only models that

have a single latent factor, say F. Then, assuming that the covariates Z predict F, i.e.,

that the model contains one or more Z → F paths, the covariate-free rho is:

$$\rho_{XX}^{\perp Z} = \frac{\Delta\,(\mathbf{1}'\Lambda)^2}{\mathbf{1}'\Sigma\mathbf{1}}$$    Equation 3.11

where $\Delta$ is the variance of the residuals in the regression of F on the Z-variables and
$\mathbf{1}'\Lambda$ is the sum of the factor loadings in the unstandardised solution.

The covariate-dependent rho can then be defined as
$\rho_{XX}^{(Z)} = \rho_{XX} - \rho_{XX}^{\perp Z}$. Equivalently, it can be calculated as:

$$\rho_{XX}^{(Z)} = \frac{(\gamma'\phi\gamma)\,(\mathbf{1}'\Lambda)^2}{\mathbf{1}'\Sigma\mathbf{1}}$$    Equation 3.12

where $\gamma'$ is the row vector of regression coefficients of F on the Z covariates and $\phi$ is
the covariance matrix of the Z’s. In EQS, F may be called F1 and the residual in the
regression Z → F may be called D1. Then $\gamma'\phi\gamma$ can be most simply computed as var(F1)
− var(D1), where var(F1) comes from the model-reproduced covariance matrix and
var(D1) is a parameter estimate obtained from the residual variance of the model
(Bentler, 2006; 2014; personal communications 2012, 2013).
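The partition in Equations 3.10 to 3.12 can be checked numerically with a short sketch. It assumes a single factor F with uncorrelated residuals, so that var(F) = γ'φγ + Δ and 1'Σ1 = var(F)(1'Λ)² + 1'Ψ1. All parameter values and the function name `covariate_rho_partition` are hypothetical and are not EQS output.

```python
import numpy as np

def covariate_rho_partition(loadings, error_vars, gamma, phi, delta):
    """Return (rho_XX, covariate-free rho, covariate-dependent rho).

    loadings   : unstandardised factor loadings (Lambda)
    error_vars : unique variances (diagonal of Psi)
    gamma      : regression coefficients of F on the covariates Z
    phi        : covariance matrix of the covariates Z
    delta      : residual variance of F, i.e. var(D1)
    """
    lam = np.asarray(loadings, dtype=float)
    gamma = np.asarray(gamma, dtype=float)
    phi = np.asarray(phi, dtype=float)
    psi_sum = float(np.sum(error_vars))          # 1' Psi 1

    explained = float(gamma @ phi @ gamma)       # gamma' phi gamma
    var_f = explained + delta                    # var(F1)
    lam_sum_sq = lam.sum() ** 2                  # (1' Lambda)^2
    total = var_f * lam_sum_sq + psi_sum         # 1' Sigma 1

    rho_total = 1.0 - psi_sum / total            # Equation 3.10
    rho_free = delta * lam_sum_sq / total        # Equation 3.11
    rho_dep = explained * lam_sum_sq / total     # Equation 3.12
    return rho_total, rho_free, rho_dep

# Hypothetical four-indicator factor predicted by two covariates
rho_total, rho_free, rho_dep = covariate_rho_partition(
    loadings=[1.0, 0.9, 0.8, 1.1],
    error_vars=[0.5, 0.6, 0.7, 0.4],
    gamma=[0.3, 0.2],
    phi=[[1.0, 0.2], [0.2, 1.0]],
    delta=0.846,
)
```

The two parts add up to the total rho, which mirrors the definition of the covariate-dependent rho as the difference between the total and the covariate-free coefficient.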


3.5.2.1 Interpretation of Covariate-dependent Reliability. Assuming a group

covariate, Bentler (2014; personal communications, 2012, 2013) defines the covariate-

dependent reliability as “… a measure of the effect of group differences on the trait

being measured relative to total variation, while the covariate-free reliability is a

measure of the reliable individual difference variance freed from any mean differences

due to the covariate(s)”.

The traditional view of reliability is defined as a measure of stable individual

difference variation (if the data comes from individuals) relative to total variation. But

in Bentler’s view (personal communications, 2013), “… any individual score might be

influenced by other sources, including group and other individual differences”. For

example, if nurses measure a wound size using a wound-measurement device, how

much of the accuracy in the measurement is due to individual differences, and how

much is due to factors such as level of experience or training? In this case

accuracy/reliability in wound measurement is measured with many indicators, and the

latent factor (= true score) is the trait of interest, with the researcher obtaining reliability

measures in the usual way.

A path diagram of four indicators, V1-V4, as measures of a construct (F1),

shown in Figure 3.3, can be used to illustrate the standard and covariate-dependent

reliability measures discussed above. In the standard case, one may assume the

unidimensional model described below.


Figure 3.3. A unidimensional construct with four indicators

The coefficients $\lambda_1$ to $\lambda_4$ are constants representing the strengths of the effect
of F1 on the various indicators (observed variables V1-V4). Typically these are called
factor loadings. Letting the factor F1 be F and the random measurement errors E1-E4 be
$E_1$ to $E_4$, the diagram corresponds to the following measurement equations:

$$\begin{aligned}
V_1 &= \lambda_1 F + E_1 \\
V_2 &= \lambda_2 F + E_2 \\
V_3 &= \lambda_3 F + E_3 \\
V_4 &= \lambda_4 F + E_4
\end{aligned}$$    Equation 3.13

Standard internal consistency reliability coefficients attempt to provide the

proportion of variance in the scale (sum) score V1+V2+V3+V4 that is due to F. There is

no further variance partitioning.

Covariate-dependent reliability is illustrated in Figure 3.4 by adding two covariates
(V5 and V6) to the model that predict the latent factor F.

Figure 3.4. A covariate-dependent construct with four indicators and two covariates

By path tracing, one can determine from Figure 3.4 that the variance of F1 (F)

can be partitioned into the variance due to the covariates V5 and V6, plus the variance

due to the residual D1. The former is used to yield the covariate dependent factor

variance, while the latter represents the part of the variance of F1 that is covariate-free.

These variances are then used in the model-based reliability formulae given above for

covariate-free and covariate-dependent reliability. Coefficient alpha can also
be partitioned in this way.

3.5.2.2 Covariate-dependent and Covariate-free Partition of Coefficient Alpha.

Coefficient alpha, previously given in equation 3.1, represents an estimate of
the reliability of $X = \sum_{i=1}^{p} X_i$. Partitioning coefficient alpha into a part due to
covariates and a part unaffected by covariates requires another approach. As
presented by Bentler (2014; personal communications, 2012, 2013), the
joint covariance matrix of covariates ($Z_i$) and variables of interest ($X_i$) can be
presented as:

$$\begin{pmatrix} \Sigma_{xx} & \Sigma_{xz} \\ \Sigma_{zx} & \Sigma_{zz} \end{pmatrix}$$    Equation 3.14

where $\Sigma_{xx}$ is the covariance matrix of the original p variables X, $\Sigma_{zz}$ is the covariance
matrix of the set of q covariates (Z-variables), and $\Sigma_{xz}$ gives their joint covariances.

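In order to calculate a covariate-dependent alpha coefficient, the computations
essentially require regressing X on Z. It is well-known in the regression literature that
such a regression partitions the covariance matrix $\Sigma_{xx}$ into two parts, the part
$\Sigma_{xz}\Sigma_{zz}^{-1}\Sigma_{zx}$ predictable from Z and the part $\Sigma_{xx} - \Sigma_{xz}\Sigma_{zz}^{-1}\Sigma_{zx}$ not predictable from Z, that
is,

$$\Sigma_{xx} = \left(\Sigma_{xx} - \Sigma_{xz}\Sigma_{zz}^{-1}\Sigma_{zx}\right) + \Sigma_{xz}\Sigma_{zz}^{-1}\Sigma_{zx}$$    Equation 3.15

As a consequence, $\bar{\sigma}_{ij}$, the average covariance in $\Sigma_{xx}$, can also be partitioned as

$$\bar{\sigma}_{ij} = \bar{\sigma}_{ij}^{\perp Z} + \bar{\sigma}_{ij}^{(Z)}$$    Equation 3.16

where $\bar{\sigma}_{ij}^{\perp Z}$ is the average off-diagonal element of the 1st right-hand term in equation
(3.15) and $\bar{\sigma}_{ij}^{(Z)}$ is the corresponding average of the 2nd right-hand term. The defining
formula for alpha given in equation (3.1) can be written equivalently as
$\alpha = p^2\bar{\sigma}_{ij}/\sigma_x^2$, where p is the number of items and $\sigma_x^2$ is the variance of the sum
score. Substituting equation (3.16) into this form, we have

$$\alpha = \frac{p^2\bar{\sigma}_{ij}}{\sigma_x^2} = \frac{p^2\left(\bar{\sigma}_{ij}^{\perp Z} + \bar{\sigma}_{ij}^{(Z)}\right)}{\sigma_x^2} = \frac{p^2\bar{\sigma}_{ij}^{\perp Z}}{\sigma_x^2} + \frac{p^2\bar{\sigma}_{ij}^{(Z)}}{\sigma_x^2} = \alpha^{\perp Z} + \alpha^{(Z)}$$    Equation 3.17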

Hence, coefficient alpha can be partitioned into two additive parts, where one

part is free of the covariates and the other part is covariate-dependent. Two major

applications of this procedure will be discussed in chapters 9 to 11. One of the main

applications concerns the effect of a covariate on the reliability of a scale. The second
application concerns applying this method for demonstrating the effect of Common
Method Bias (CMB) on reliability.
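The partition of coefficient alpha in Equations 3.15 to 3.17 can be sketched directly from the blocks of the joint covariance matrix in Equation 3.14. The covariance values below are hypothetical and the function name `alpha_partition` is an assumption made for illustration only.

```python
import numpy as np

def alpha_partition(sigma_xx, sigma_xz, sigma_zz):
    """Return (alpha, covariate-free alpha, covariate-dependent alpha)."""
    sigma_xx = np.asarray(sigma_xx, dtype=float)
    sigma_xz = np.asarray(sigma_xz, dtype=float)
    sigma_zz = np.asarray(sigma_zz, dtype=float)
    p = sigma_xx.shape[0]

    # Equation 3.15: part of Sigma_xx predictable from Z, and the remainder
    predicted = sigma_xz @ np.linalg.solve(sigma_zz, sigma_xz.T)
    residual = sigma_xx - predicted

    off_diag = ~np.eye(p, dtype=bool)            # mask of off-diagonal cells
    var_total = sigma_xx.sum()                   # sigma_x^2 of the sum score

    # Equation 3.17: alpha = p^2 * (average off-diagonal covariance) / sigma_x^2
    alpha_free = p ** 2 * residual[off_diag].mean() / var_total
    alpha_dep = p ** 2 * predicted[off_diag].mean() / var_total
    return alpha_free + alpha_dep, alpha_free, alpha_dep

# Hypothetical covariances: three items (X) and one covariate (Z)
s_xx = np.array([[1.0, 0.5, 0.4],
                 [0.5, 1.0, 0.3],
                 [0.4, 0.3, 1.0]])
s_xz = np.array([[0.3], [0.2], [0.1]])
s_zz = np.array([[1.0]])

alpha_all, alpha_free, alpha_dep = alpha_partition(s_xx, s_xz, s_zz)
```

The total returned by the sketch agrees with coefficient alpha computed from Equation 3.1 on the same covariance matrix, so the partition only redistributes alpha between the covariate-free and covariate-dependent parts.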

3.5.3 Composite Reliability using PLS

All the above-mentioned model-based reliability assessments require the use of
reflective measurement models and covariance-based SEM (CB-SEM). However,
CB-SEM is not the only appropriate method for assessing model-based reliability.
Partial Least Squares (PLS) SEM provides an alternative approach. CB-SEM uses
Maximum Likelihood (ML) estimation while PLS-SEM uses partial least squares
estimation. PLS-SEM has fewer underlying restrictions than CB-SEM, which usually
requires normally distributed data and large sample sizes. The composite reliability
measure obtained using PLS-SEM will be fully explored in study 3 (chapters 12-14).

3.6 Summary

In this chapter the history of model-based reliability using SEM was critically

explored. When Cronbach’s famous article on his coefficient alpha was published in

Psychometrika in 1951, a single general coefficient for assessing internal consistency

and reliability became available. Since then, the alpha coefficient has been widely used

by researchers in many fields. However, it has been recently criticised by several

researchers (Bentler, 2009; Green & Hershberger, 2000; Green & Yang, 2009; Sijtsma,

2009), resulting in recommendations for improvements. Although these

recommendations may be useful, there are other methods (such as model-based

reliability) that should also be considered.


Measures for model-based reliability calculated using SEM include one-factor
model coefficients such as rho or $\rho_{11}$ (Jöreskog, 1971), multi-factor model coefficients
such as McDonald’s Omega (ω) (1978), and coefficients such as Omega
hierarchical ($\omega_h$), Omega subscale ($\omega_s$) and Omega total ($\omega_t$) for bi-factor
models (Revelle et al., 2009). Finally, the covariate-free and covariate-dependent

reliability coefficients of Bentler (2014) are recent practical methods developed to

examine the effects of covariates on the internal consistency of scales using SEM.

Two major recent model-based reliability measurement developments were

discussed in more detail in this chapter. They included a) Omega hierarchical and

subscale with a focus on their application in bifactor models, b) the covariate-dependent

and covariate-free reliability coefficients of Bentler (2014). A third development, the

PLS-SEM procedure for computing Composite Reliability, will be introduced in a later

chapter.

Unfortunately, software for calculating these new model-based reliabilities has

not routinely been available to scholars, and despite the importance of multidimensional

model-based reliability measurement, there is a lack of empirical studies where these

coefficients are estimated. Either scholars in the disciplines do not recognise the

importance of model-based reliability coefficients based on latent constructs over the

classical alpha coefficient, or the appropriate statistical software is still not readily

available. For example, except for EQS (Bentler, 2006), which calculates the model-based
reliability rho, and the R psych package (Revelle, 2013), which computes omega hierarchical
and omega subscales, most packaged software (e.g., SPSS) provides only the classical
alpha coefficient calculation.


Model-based reliability estimation provides a more accurate representation of

the true relative magnitude of systematic variance to total variance in a scale or

instrument. Therefore, once a SEM model fits well with its proposed constructs and

measured variables, a more accurate representation of its reliability can be obtained

using model-based reliability measures.

The following chapters apply these recent
developments in practice, with a special focus on bifactor and reflective-formative
models. Chapters 4-6 lay the theoretical groundwork for these applications.


4

THE VALIDITY OF BIFACTOR VERSUS HIGHER-ORDER

MEASUREMENT MODELS

This chapter opens with an introduction to bifactor models. The chapter then
considers the application of the bifactor model in an organisational study using the Work
Organisation Assessment Questionnaire (WOAQ).

Bifactor Models

Constructs are often operationalised as multidimensional units (Diamantopoulos,

2010; Edwards & Bagozzi, 2000). When a number of dimensions or related attributes form

a latent factor, it is considered multi-dimensional. In a multi-dimensional construct,

dimensions can be conceptualised under an overall concept or a second-order (higher-

order) construct (Law, Wong, & Mobley, 1998). In second-order constructs, two levels of

constructs exist: the first-order level constructs composed of indicators and the second-

order level constructs composed of first-order constructs (Jarvis et al., 2003). Such models

are known as hierarchical (higher-order) models.

The majority of researchers in the behavioural sciences use higher-order
modeling by default to evaluate multidimensionality. However, this is not the only procedure for
evaluating multidimensionality and it may not always be the best way to evaluate a
multidimensional model. Other approaches, such as bifactor (direct hierarchical
order) modeling, are not commonly found in the literature (Gignac, 2007; Reise, Moore, &
Haviland, 2010). In a bifactor model, all latent variables are modelled as first-order
constructs, in which first-order factors are nested within a general factor (Gignac, 2007;
Gustafsson & Balke, 1993; Holzinger & Swineford, 1937).


Perhaps the early roots for the development of bifactor (nested factors) models can

be traced back to the work of Holzinger and Swineford (1937) (for full history of SEM

development, see Karimi & Meyer, 2014). However, the bifactor approach is not well

appreciated in the literature, although there are some important advantages in using this

model as an alternative to the conventional higher-order modeling (Gignac, 2007, 2013).

As discussed by Gignac (2007) “the advantages are that bifactor (nested factors) models:

(a) tend to be associated with non-negligibly higher level of model fit; (b) allow for

statistical significance testing for all parameter estimates, and (c) allow for less ambiguous

interpretations of the factor loadings and the narrow factors ‘nested’ within the higher-order

factor(s)” (p. 40). As asserted, one can achieve a better model fit using bifactor modeling.

Imposing fewer restrictions on parameter estimates (as opposed to the number of

restrictions required in conventional CFA procedures) improves the validity of the reported

results in bifactor models. In addition, the bifactor model provides some evidence regarding

the plausibility of the subfactors and the extent of their contribution in a practical sense.

However, the bi-factor procedure is not without disadvantages. One of the main

limitations of using nested factor models is that they have fewer degrees of freedom which

may lead to model identification problems (Gignac, 2007). However this problem can be

managed simply by constraining some of the parameters in the model (Gignac, 2007,

2013). In this study all the latent variable variances are constrained to 1.0 in order to

achieve an identified model.

It is evident that, if the aim of the proposed model is to present both
multidimensionality and a general single factor at the same time, then a bifactor model is an
appropriate way to present the model (Reise et al., 2010). A bifactor model not
only demonstrates the contribution of the items to a general factor (broad construct) but
also provides information on the item contributions to sub-dimensions (narrow constructs)
(Reise et al., 2010).

4.1 Bifactor Model of WOAQ

The Work Organisation Assessment Questionnaire (WOAQ) was developed as part

of a risk assessment procedure for stress-related exposures inherent in the manufacturing

sector. For a widely-used measure like the WOAQ, using a bifactor model is deemed to be

appropriate for several reasons.

First, having a broad or macro level assessment (using a general factor) would help

to get an overall picture of the organisation. Conversely, being able to assess the

organisation at a narrow or micro level (using subfactors) has practical implications in that

specific problematic areas can be identified and addressed. Evaluating the plausibility of

subfactors is very important in such contexts, making a direct hierarchical model for

WOAQ a good choice.

Second, as highlighted in recent studies (e.g. Wynne-Jones, Varnaya, Buck,

Karanika-Murray, Griffiths, Phillips, & Main, 2009), the latent structure of the WOAQ in

non-manufacturing sectors did not demonstrate a good fit to the model, suggesting that

conventional models are inadequate. Model fit is often a problem for the WOAQ when

conventional second-order models are considered. In the context of risk assessment in

organisations, a tool like WOAQ presents the overall work condition as a general single

factor. Additionally, it adds further benefit by highlighting the different subsections of work
organisation characteristics. Thus, evaluating the plausibility of subfactors is very important
in such contexts, suggesting that a direct hierarchical model for WOAQ would be an
appropriate choice.

One of the aims of study 1, therefore, is to compare a bifactor (nested factor) model

with a conventional second-order (higher-order) model of WOAQ. This is done in a health

setting. This study is expected to open up some empirical and methodological avenues for

further developments in this area.

A higher-order model (or full mediation model) and a bifactor model (partial

mediation model) of WOAQ can be distinguished both statistically (Gignac, 2007, 2008, 2013;

Yung, Thissen, & McLeod, 1999) and diagrammatically, as illustrated below.

Model 1. Higher-order model of WOAQ

Model 2: Bifactor model of WOAQ

Figure 4.1. Higher-order vs. Bifactor model of WOAQ
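
The statistical distinction can be made concrete with the Schmid-Leiman transformation, under which a higher-order model is rewritten as a bifactor model whose general and specific loadings are constrained to be proportional within each subfactor (the relationship discussed by Yung, Thissen, & McLeod, 1999). The Python sketch below is illustrative only: the function name and the loading values are hypothetical and are not WOAQ estimates.

```python
import math

def schmid_leiman(first_order_loadings, second_order_loading):
    """Rewrite one higher-order subfactor as bifactor-style loadings.

    first_order_loadings: standardised item loadings on the subfactor.
    second_order_loading: standardised loading of the subfactor on the
    general (second-order) factor.
    """
    # Items reach the general factor only through the subfactor (full mediation).
    general = [lam * second_order_loading for lam in first_order_loadings]
    # What is left of the subfactor after removing the general factor.
    residual = math.sqrt(1.0 - second_order_loading ** 2)
    specific = [lam * residual for lam in first_order_loadings]
    return general, specific

# Hypothetical values, chosen only for illustration.
general, specific = schmid_leiman([0.7, 0.6, 0.8], 0.6)
ratios = [g / s for g, s in zip(general, specific)]
print(general, specific, ratios)
```

Every item on the subfactor has the same general-to-specific ratio. A bifactor model drops this proportionality constraint and estimates the two sets of loadings freely, which is why the higher-order model is nested within it.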

4.2 Summary

The distinction between a bifactor and a higher-order measurement model was the

focus of this chapter. It is evident that a bifactor model is superior to a higher-order

model when the aim of validating a measurement model is to present not only the

multidimensionality and plausibility of the subfactors but also the underlying general factor

of the scale on its own. A comprehensive measure of WOAQ, using a bifactor model, offers

multiple benefits. Firstly, it demonstrates the contribution of the items to a general factor of

WOAQ. Secondly, it provides information on the item contributions to subscales and

indicates the relative importance of the subscales. This procedure has practical implications

in organisational studies as it provides the researchers/practitioners with both a broad and a

detailed picture of WOAQ in a given setting. The general factor of WOAQ highlights if any

problems exist within the organisation. If so, the subscales of WOAQ would highlight the

more critical points that need attention.

In Chapters 6 to 8, a bifactor model of WOAQ will be validated and cross validated

across gender in a nursing and paramedics setting. The results will be compared with a

higher-order model.

5

THE VALIDITY OF FORMATIVE MEASUREMENT MODELS VERSUS

REFLECTIVE MODELS

By default, many researchers use reflective models, usually without precise

evaluation of the model (Diamantopoulos & Winklhofer, 2001). The ensuing

model misspecification can lead to both Type I and Type II errors. Recently

researchers in information systems (IS), leadership, management and marketing have

highlighted problems of misspecification in measurement model construction

(Diamantopoulos & Winklhofer, 2001; Jarvis, MacKenzie & Podsakoff, 2003; Podsakoff,

Shen, & Podsakoff, 2006).

As a result of this type of misspecification, some of the findings in the literature

may be misleading (Jarvis et al., 2003; MacKenzie et al., 2005; Petter, Straub, & Rai, 2007).

As mentioned by Jarvis et al. (2003), construct misspecification issues can lead to “serious

consequences for the theoretical conclusions drawn from the model” (p. 212). The extent of

this misspecification problem has been studied in several areas but never in organisational

psychology (Diamantopoulos & Winklhofer, 2001; Jarvis et al., 2003; Petter et al., 2007).

Therefore the four key aims in this chapter are to:

a) distinguish between reflective and formative SEM models,

b) review some of the literature to identify the extent of possible SEM measurement

model misspecification problems in the area of organisational psychology,

c) present an empirical example of misspecification using the Work Ability Scale

(WAS), and

d) propose a framework for distinguishing formative from reflective models.

It is hoped that the findings will assist researchers to distinguish which types of

measurement models to use for their research. This chapter is organised into four sections

based on the above aims.

5.1 Differences between Formative and Reflective Models

In 1973, the Swedish statistician Karl Jöreskog combined the factor analytic work of

two psychometricians, Charles Spearman and Louis Thurstone, and the path analysis work

of geneticist Sewall Wright, to develop what is now known as SEM (Cunningham,

2008). For nearly half a century, the SEM technique and the computer program LISREL,

which was the result of Jöreskog's work, aided the mapping of interrelated constructs in

broad areas of study. More programs have since been developed (AMOS, EQS, MPLUS)

extending the scope and simplifying the application of this technique.

SEM distinguishes between two different measurement models: reflective and

formative. When indicators are affected by a latent variable, reflective models are

appropriate. Yet in many settings, indicators may be considered as the cause of latent

variables, making formative models more appropriate.

By default, most researchers assume that models are reflective, although many

scholars (e.g. Blalock, 1971; Bollen, 1984; Diamantopoulos and Winklhofer, 2001; Jarvis

et al., 2003; Petter et al., 2007) have alerted researchers to the relevance of formative

models in some specific situations. This advice is unfortunately ignored in much of the

research literature.

The following section compares reflective and formative models conceptually.

5.1.1 First-order Reflective and Formative Models

In classical test theory, indicators (items) are considered to be dependent on a latent

variable, in which case:

$x_i = \lambda_i \xi + \delta_i$ Equation 5.1

where $x_i$ (the ith indicator) is defined by the latent variable (ξ), the measurement error ($\delta_i$),

and the expected coefficient ($\lambda_i$).

Such measures can be called reflective in that the items are indicators of a latent

factor (Fornell & Bookstein, 1982). Such models provide the basis for reliability

evaluation and common/confirmatory factor analysis (Bollen 1989; Long, 1983; Nunnally,

1978). A simple first-order model of reflective measurement is represented in Figure 5.1, in

which the latent variable (ξ) is conceptualised as the common cause of three items or

indicators, identified as x₁, x₂, and x₃.

x₁= λ₁ξ+ δ₁

x₂= λ₂ξ+ δ₂

x₃= λ₃ξ+ δ₃

Figure 5.1. First-order reflective model

Conversely, based on the nature of the model, the indicators might cause the

construct (Bollen & Lennox, 1991). When the construct is formed from its indicators a

formative model is suggested (Fornell & Bookstein, 1982). Equation 5.2 presents an

example of a formative model in which a weighted sum of the indicators ($\sum_i \lambda_i x_i$) represents

the construct (ξ) with an error (ζ):

$\xi = \sum_i \lambda_i x_i + \zeta$ Equation 5.2

Figure 5.2 presents an example of a first-order formative model in which the causal action

flows from the indicators (x₁, x₂, x₃) to the composite variable (ξ).

ξ=λ₁x₁+λ₂x₂ +λ₃x₃+ζ Equation 5.3

Figure 5.2. First-order formative model
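
The contrast between Equations 5.1 and 5.3 can be sketched by simulation. The example below is a minimal illustration and not part of the original analysis: the loadings, weights, error standard deviations and sample size are assumed values, and the helper names are hypothetical. Reflective indicators are generated from a common latent variable, whereas formative indicators are generated independently and then combined into the composite.

```python
import random

def pearson(a, b):
    """Pearson correlation between two equally long lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def simulate(n=5000, seed=1):
    rng = random.Random(seed)
    lam = [0.8, 0.8, 0.8]  # assumed loadings (reflective) / weights (formative)
    # Reflective (Equation 5.1): x_i = lambda_i * xi + delta_i
    xi = [rng.gauss(0, 1) for _ in range(n)]
    reflective = [[l * v + rng.gauss(0, 0.6) for v in xi] for l in lam]
    # Formative (Equation 5.3): xi = sum_i lambda_i * x_i + zeta
    formative = [[rng.gauss(0, 1) for _ in range(n)] for _ in lam]
    composite = [
        sum(l * x[j] for l, x in zip(lam, formative)) + rng.gauss(0, 0.5)
        for j in range(n)
    ]
    return reflective, formative, composite

reflective, formative, composite = simulate()
r_reflective = pearson(reflective[0], reflective[1])
r_formative = pearson(formative[0], formative[1])
r_composite = pearson(formative[0], composite)
print(round(r_reflective, 2), round(r_formative, 2), round(r_composite, 2))
```

Under these assumptions the reflective indicators correlate strongly because they share a common cause, while the formative indicators need not correlate with one another yet each still contributes to the composite.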

5.1.2 Higher-order Reflective and Formative Models

The reflective and formative models specified in Equations 5.1 and 5.2 are examples

of first-order reflective and formative measurement models. However, constructs are often

operationalised as multidimensional units (Diamantopoulos, 2010; Edwards and Bagozzi,

2000). When a number of dimensions or related first-order constructs form a latent factor, it

is considered a multi-dimensional construct. In a multi-dimensional construct, dimensions

can be conceptualised under an overall concept or a second-order construct (Law, Wong,

and Mobley, 1998), or both, as was seen in the bi-factor model in the last chapter.

In second-order constructs, two levels of constructs exist: the first-order level with

indicators and the second-order level with first-order constructs (Jarvis et al., 2003). As

illustrated in Figure 5.3, a reflective-reflective higher-order model has a reflective model

for each of the first-order constructs as well as a reflective model for the second-order

construct (η):

$\xi_i = \gamma_i \eta + r_i$ Equation 5.4

where the construct η is conceptualised as a second-order latent variable upon which the

first-order latent constructs ($\xi_i$) are dependent, with measurement error ($r_i$) for each of these

first-order constructs and expected coefficients ($\gamma_i$).

Figure 5.3. Higher-order reflective-reflective measurement model

Higher-order effects between constructs can also be incorporated as a formative

model in which:

$\eta = \sum_i \gamma_i \xi_i + \zeta$

A higher-order formative-formative model is presented in Figure 5.4 as an example.

In the model each first-order construct is represented as a formative model while the

second-order construct (η) is also represented as a formative construct.

Figure 5.4. Higher-order formative-formative measurement model

5.2 Applications of Formative Models

The most common uses of formative models include:

- Creating an induced latent variable

- Creating a block variable

- Illustrating the influence of an experimental intervention on a construct (Edwards &

Bagozzi, 2000).

Creating an induced latent variable is one of the common uses of formative models.

An example of an induced latent variable is presented by Crossley, Bennett, Jex and Burnfield

(2007) in their study of job embeddedness. Job

embeddedness represents “a broad array of influences on employee retention. The critical

aspects of job embeddedness are (a) the extent to which the job and community are similar

to, or fit with, the other aspects in a person's life space, (b) the extent to which this person

has links to other people or activities and, (c) what the person would sacrifice if he or she

left”. These aspects are important both on the job and off the job (Holtom, Mitchell, & Lee,

2006, p. 320). Composite job embeddedness in this study is operationalised by three main

measures: organisation- and community-fit (“an employee's perceived compatibility or

comfort with an organisation and with his or her environment”, p. 320); links (“formal or

informal connections between an employee and institutions or people”, p. 320); and sacrifice

(“the perceived cost of material or psychological benefits that are forfeited by

organizational departure”, p. 320). Each construct represents a different aspect of job

embeddedness. In this example, all three constructs define the construct of job embeddedness,

allowing the construction of a job-embeddedness index.

Some other examples of induced latent variables are social support indices, which

include items that capture different aspects of social support (MacCallum & Browne, 1993),

and a socioeconomic status (SES) index, created as a function of education, income and job

status (Bollen & Lennox, 1991). In the latter instance, the combination of three diverse variables

(income, education, occupation) allows the construction of an SES index.

First-order formative models can also be used for creating block variables. A block

variable is a single construct which summarises the influence of a block of several variables

on one or more outcome variables (Edwards & Bagozzi, 2000). In such cases, variables which

constitute the block variable usually illustrate the distinctive causes of the outcome. This

type of formative model was well-illustrated by Howell, Breivik, & Wilcox (2007), using

the study of family socialisation by Heise (1972). A block variable called “family

socialisation” was introduced by Heise (1972), which was a construct formed by the

mother/father’s liberalism, and other unspecified (disturbance) variables (Edwards &

Bagozzi, 2000).

Finally, another common application of formative modeling can be seen in studies

which involve intervention and the assessment of intervention effects on a construct

(Bagozzi, 1977; Costner, 1971). For instance, in an experimental study (Costner, 1971), a

fatigue construct was manipulated by depriving participants of sleep (indicator). In such

experimental studies that involve intervention, the measures can be considered as formative

constructs (Edwards & Bagozzi, 2000; Bagozzi, 1977; Costner, 1971).

5.3 Developing a Framework for Distinguishing Reflective-Formative Models

In this section an attempt is made to develop a clear and well-defined decision-

making framework for assessing whether a reflective or a formative model is appropriate.

Unfortunately, there are few practical guidelines for distinguishing between

reflective and formative models. The major work in this area was introduced by Jarvis et al.

(2003) and Diamantopoulos and Winklhofer (2001), and then extended by Petter, Straub,

and Rai (2007) and Coltman, Devinney, Midgley, and Venaik (2008).

What is presented here is a practical decision-making tree for evaluating reflective

and formative models of measurement, based mainly on a review of the works of Jarvis et

al (2003), Petter, Straub, and Rai (2007) and Diamantopoulos and Winklhofer (2001).

The background theory. The first step in identifying formative vs. reflective models

is to refer to the relevant background theory, to determine whether a construct is typically

viewed as a formative or reflective construct. This is usually considered to be the best way

of distinguishing between formative and reflective models. If there is doubt in the literature

or there are no solid theoretical frameworks available, then the following criteria might help

researchers in distinguishing between formative and reflective models. These criteria are

based mainly on the guidelines proposed by Jarvis, MacKenzie and Podsakoff (2003).

Direction of causality. The next step involves consideration of the direction of

causality between each construct and its indicators. As suggested by Jarvis et al. (2003),

the researchers need to know, in the first instance:

1) Whether the items explain the latent factor, or the latent factor explains the

items. In formative models, the indicators influence the latent factor or

“composite” variable (MacKenzie et al., 2005). But if the indicator items are

manifestations of the latent factor, so that causality flows from the factor to its items, a reflective

model is suggested.

2) The nature of changes in the latent factor. In formative models, the measurement

error is at the factor level; the latent factor is partially explained by random error

and is not fully explainable by its items. Any change in an item would lead to a

change in the latent factor, but not vice versa. The opposite is true in reflective

models; the measurement errors are at the item level, therefore, any change in the

indicator does not necessarily result in a change in the latent factor. However, any

change in the latent factor would result in a change in the items (Jarvis et al, 2003;

Petter, Straub, & Rai, 2007).

The interchangeability of the measures. The third step involves examining the

interchangeability of the measures (Jarvis et al, 2003):

1) The similarity of contents of the indicators. In reflective models, measures are

interchangeable and follow a common theme. Employing different themes suggests

formative measures which are not interchangeable.

2) Changes in the indicators. In formative measures, the latent factor is explained by

its items; removing any item of a formative factor would influence the meaning of

the latent or composite factor. In reflective models, however, removing an indicator

would not affect the meaning of the latent factor because the indicators are outcomes of the

construct and not its cause (Jarvis et al., 2003; Petter, Straub, & Rai, 2007).

Co-variation among measures. The fourth step involves consideration of the

correlations among indicators; in other words, would variation in one indicator be

correlated with variation in the other indicators? (Jarvis et al., 2003). In formative models,

because a construct is formed by different indicators, high correlations between the

indicators are not expected. The indicators in such models might represent totally different

content. However, with reflective models, because the indicators are manifestations of the latent

factor, high correlation between indicators is required. This suggests multicollinearity

which seems to be desirable for reflective measures. That is why establishing an acceptable

level of internal consistency is required for reflective models while it is not really

appropriate for formative models.
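
To illustrate why internal consistency suits reflective but not formative measures, the sketch below computes the standardised coefficient alpha from the average inter-item correlation, $\alpha = k\bar{r} / (1 + (k-1)\bar{r})$. The correlation values are hypothetical, chosen to mimic a reflective block (high correlations) and a formative block (near-zero correlations); the function name is illustrative.

```python
def standardised_alpha(correlations, k):
    """Standardised coefficient alpha for k items.

    correlations: the k*(k-1)/2 distinct inter-item correlations.
    """
    r_bar = sum(correlations) / len(correlations)
    return k * r_bar / (1 + (k - 1) * r_bar)

# Hypothetical three-item blocks.
alpha_reflective = standardised_alpha([0.60, 0.60, 0.60], k=3)
alpha_formative = standardised_alpha([0.05, 0.05, 0.05], k=3)
print(round(alpha_reflective, 2), round(alpha_formative, 2))
```

A low alpha for the second block would say nothing about the quality of a formative index, since its indicators are not expected to covary.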

Nomological net of the latent factor indicators. The final decision rule is based on

the following criterion for reflective models: the indicators share the same antecedents and consequences.

With formative constructs, it is not expected that the observed variables have similar

predictors or outcomes. This is because the composite factors are formed by indicators that

are not necessarily correlated nor do they necessarily share the same content. Conversely,

with reflective models, due to the interchangeability of reflective indicators, the same

patterns of antecedents and consequences are expected for all indicators (Jarvis et al., 2003;

Petter, Straub, & Rai, 2007). Depending on the extent to which this criterion is met, the

researcher will be able to decide if it is a reflective or formative construct.

A summary of the above is presented in Figure 5.5 in the form of a decision tree.

These decision rules will be used in an examination of the organisational psychology

literature in the next section. However, although using this guideline helps to identify a

reflective or formative construct, in practice many constructs are mixed. In other words, a

construct has some items consistent with formative constructs and other items which are

consistent with reflective constructs.

[Figure 5.5: decision tree applying four criteria in sequence (direction of causality; interchangeability of the indicators; co-variation among measures; nomological net of the factor), leading to either a reflective model or a formative model.]

Figure 5.5. The developed framework for assessing the formative vs. reflective measurement models. Acknowledgement: The main contents of this framework are built based on the guidelines proposed by

Jarvis et al. (2003), Journal of Consumer Research, 30.

5.4 Measurement Model Misspecification in Organisational Psychology Literature

5.4.1 Empirical Evidence on Measurement Model Misspecification

In recent times, some researchers have focussed on misspecification in

measurement models. One of the earliest studies which highlighted misspecification in

formative constructs is that of Jarvis et al. (2003). In this study, the extent of

misspecification was assessed by reviewing four marketing journals (Journal of Marketing,

Journal of Marketing Research, Marketing Science and Journal of Consumer Research). A

29 per cent model misspecification rate was reported.

Fassot followed in 2006 (cited in Diamantopoulos, Riefler, & Roth, 2008),

reviewing three German management journals (Zeitschrift für Betriebswirtschaft,

Zeitschrift für betriebswirtschaftliche Forschung and Die Betriebswirtschaft). A 35 per cent

misspecification rate was reported.

In a similar process, Podsakoff et al. (2006) reported a misspecification rate of 62

per cent after reviewing the three most important strategic management journals (Academy

of Management Journal, Administrative Science Quarterly and Strategic Management

Journal). Similar results were also reported for leadership research (47 per cent

misspecification) by Podsakoff, MacKenzie, Podsakoff and Lee (2003) based on articles

published in The Leadership Quarterly, Journal of Applied Psychology, and Academy of

Management Journal.

Petter et al. (2007) examined complete volumes of MIS Quarterly and Information

Systems Research over three years. The study reported a 30 per cent misspecification rate for

formative constructs.

In more recent times, Roy et al. (2012) reviewed four journals in the area of

production, manufacturing and operations management (Journal of Management Science,

Journal of Operations Management, Decision Sciences Journal, and Journal of Production

and Operations Management Society) published between 2002 and 2006. They reported a

misspecification rate of 42.5 per cent.

In summary, the existing studies show a significant degree of misspecification in the

disciplines of information systems (IS), leadership, management and marketing. The

question is: “To what extent does misspecification exist in other disciplines such as

psychology?” To the researcher’s knowledge, no such study has been conducted in the area

of organisational psychology, hence the need for this study. The hypothesis of this study is

that:

Hypothesis 5.1: There is some degree of misspecification in formative vs reflective

measurement models in the organisational psychology literature.

5.4.2 Literature review strategy

Initially, the methodology for identifying misspecification will be discussed based

on the recent organisational psychology literature. To assess the prevalence of measurement

model misspecification, articles published within a nine-year period between 2006 and

2014 in two high profile journals - Journal of Applied Psychology and Personnel

Psychology - were reviewed. While this is not a broad review of the literature, it is

reasonable to assume that a problem exists if construct misspecifications are found in these

most cited journals in the discipline. As a result, there is a likelihood of Type I or II errors

in reported results in misspecified models. If a problem of misspecification exists, then

there is a need to take action and pay more attention to this neglected area of study.

The following inclusion criteria were followed in this review:

- Papers with measurement models

- Constructs measured by two or more items.

The exclusion criteria were as follows:

- Papers consisting only of single-item measures

- Papers that did not report their measurement items.

Based on these criteria, a total of 301 studies were considered in the analysis (See

Appendix H). The measurement items for each construct were examined by two researchers

independently, using the decision making framework provided in Figure 5.5. If both

researchers agreed that at least one construct was misspecified (e.g. modelled as reflective

while it should be formative or vice versa), the study was coded as misspecified, on the

grounds that misspecification for any construct could lead to error.

5.4.3 Inter-rater Reliability

IBM SPSS Statistics (SPSS) for MS Windows Release 21.0 (SPSS Inc., Chicago,

IL) was used to analyse these data. Cohen’s Kappa was applied to measure the inter-rater

reliability of the decision for the two researchers (the student and her principal supervisor),

both experts in SEM and Organisational Psychology studies. Using only two researchers to

rate the measures is considered to be one of the limitations of this review. All the papers

were examined and the appropriateness of formative and reflective models was judged by

both raters in each case. The Cohen’s Kappa test examines the level of agreement between

raters, with a result of higher than 0.70 indicating good agreement between raters.
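
As a reminder of how the statistic is obtained, Cohen's Kappa compares the observed agreement $p_o$ with the agreement expected by chance $p_e$, so that $\kappa = (p_o - p_e)/(1 - p_e)$. The sketch below uses a hypothetical 2 × 2 agreement table; the counts are invented for illustration and are not the ratings from this review, which were analysed in SPSS.

```python
def cohens_kappa(table):
    """Cohen's kappa from a square agreement table.

    table[i][j] = number of items placed in category i by rater A
    and in category j by rater B.
    """
    n = sum(sum(row) for row in table)
    k = len(table)
    p_observed = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    p_expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n ** 2
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical counts: rows = rater A, columns = rater B,
# categories = (misspecified, correctly specified).
kappa = cohens_kappa([[20, 5], [10, 65]])
print(round(kappa, 3))
```

Because chance agreement is subtracted, Kappa is lower than the raw proportion of agreement whenever the raters' marginal totals make agreement by chance likely.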

5.4.4 Results of the Review

A high level of agreement was obtained between the raters (Cohen’s Kappa=0.89),

suggesting that the classification of the articles based on the guidelines provided in Figure

5.5 was reliable. The findings of this review are summarised in Table 5.1.

Table 5.1

Measurement Model Classification

| | Should be Reflective | Should be Formative | Should be Mixed | Total |
|---|---|---|---|---|
| Modeled as Reflective | 215 | 39 | 16 | 270 (90%) |
| Modeled as Formative | 0 | 21 | 0 | 21 (7%) |
| Modeled as Mixed | 0 | 0 | 10 | 10 (3%) |
| Total | 215 (71%) | 60 (20%) | 26 (9%) | 301 (100%) |

* A total of 301 studies from articles published in the Journal of Applied

Psychology and Personnel Psychology between 2006 and 2014 were reviewed.

A misspecification level of 18 per cent (55/301) was found in this review. The

misspecification involved misspecifying a formative model as reflective or a mixed model

as a fully reflective model. Not surprisingly, the majority of the studies (90%) by default

considered measurement models as reflective. Unfortunately, there is no similar

misspecification study in this area to allow a comparison; however, higher percentages have

been found in other disciplines, as explained previously. As mentioned previously, the

results of such misspecification in measurement models can lead to Type I or II errors.
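
The 18 per cent figure can be reproduced directly from Table 5.1 by summing the off-diagonal cells, that is, the studies whose modelled specification differs from the specification judged appropriate. The short sketch below does this arithmetic; the dictionary layout and variable names are illustrative choices.

```python
# Counts from Table 5.1: table[modelled_as][should_be]
table = {
    "reflective": {"reflective": 215, "formative": 39, "mixed": 16},
    "formative":  {"reflective": 0,   "formative": 21, "mixed": 0},
    "mixed":      {"reflective": 0,   "formative": 0,  "mixed": 10},
}

total = sum(sum(row.values()) for row in table.values())
misspecified = sum(
    count
    for modelled, row in table.items()
    for should_be, count in row.items()
    if modelled != should_be
)
rate = misspecified / total
print(misspecified, total, f"{rate:.1%}")
```

All 55 misspecified studies fall in the first row of the table: 39 formative and 16 mixed constructs modelled as reflective.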

5.4.5 Discussion

The issue of measurement model misspecification is a very critical topic in

measurement models. As mentioned before, the majority of scholars by default consider

measurement models as reflective, which leads to misspecification. As indicated in

previous studies (Jarvis et al., 2003; Petter et al., 2007; Roy et al., 2012), model

misspecification can bias the parameter estimation leading to Type I and II errors and

incorrect conclusions. Although a higher degree of misspecification has been reported in

other disciplines (e.g. Jarvis et al., 2003; Podsakoff et al., 2006; Petter, Straub and Rai,

2007; Roy et al., 2012), the finding of an 18 per cent reported misspecification rate in two

prestigious organisational psychology journals is nevertheless significant. If such a high

percentage of misspecification is found in top-ranked journals, the researchers predict

significantly higher misspecification rates in journals with less influence.

Given the reported problem of misspecification in the field, greater attention to

measurement model specification is imperative. A lack of awareness about the nature of

formative constructs could be one of the reasons for misspecification. As demonstrated by

previous studies (e.g. Jarvis et al., 2003; Petter et al., 2007) and as shown in Table 5.1, in

all the misspecified studies, researchers had miscategorised formative or mixed constructs as

reflective rather than the reverse.

What is needed is a simple but comprehensive framework to distinguish formative

and reflective measures. Also, it is important to ask why formative models are frequently

misspecified as reflective models. The problems that occur with the fitting of formative

models are partly to blame. Overall it is easier to fit reflective models. This topic is

discussed later using empirical examples in the context of a work ability measurement

model.

The review, however, is not without limitations. One of the main limitations of this

review is the use of only two researchers to rate the measurement models. Reviewing only

two journals is another limitation of the review, which restricts the

generalisability of the results.

5.5 Summary and Conclusion

In this chapter an informative introduction was provided to distinguish formative

from reflective measurement models. A simple and easy to understand framework for

distinguishing formative models from reflective models was proposed in this chapter. The

misspecification of formative vs reflective models along with the possible outcomes of

misspecifications were discussed. Then it was demonstrated how big the problem of

formative model misspecification is in the organisational psychology discipline. Using a

comprehensive literature review of misspecification over a nine-year period in two highly

ranked journals in the discipline of organisational psychology, the misspecification rate was

demonstrated for the first time in this discipline.

In Study 3 an example of misspecification involving the measurement of work

ability using the WAS measure is presented. In this study it will be empirically

demonstrated how different model specification/misspecification can yield different results

for a measurement model. The initial second-order WAS model will be re-examined using

reflective-reflective, formative-formative and reflective-formative models. Based on the

guidelines provided in Figure 5.5, and theoretical background, the model should be fitted as

reflective-formative (reflective for first-order constructs and formative for the second-order

construct). In Chapters 12 to 14, therefore, the validity and reliability of the

correctly specified reflective-formative model of WAS are assessed using Partial

Least Squares SEM. The results will be compared and discussed with those obtained from

the misspecified reflective-reflective and formative-formative models of WAS, along with

the implications for the discipline.

6

STUDY 1: MODEL-BASED RELIABILITY, VALIDITY AND CROSS

VALIDITY OF BIFACTOR MODEL FOR WOAQ

In chapters 6 to 8 the validity, cross validity and model-based reliability of the

Work Organisation Assessment Questionnaire (WOAQ) are assessed for a sample of nurses

and a sample of paramedics using a bi-factor model. This chapter introduces the data and

the relevant theory and hypotheses, Chapter 7 reports the results and Chapter 8 discusses

and summarises the implications of the results.

The Work Organisation Assessment Questionnaire (WOAQ) was previously

validated by Griffiths, et al. (2006) in a study involving manufacturing workers. In this

study, WOAQ was viewed as a bi-factor measure, including a general measure of the Work

Organisation Assessment Questionnaire (WOAQ) and five nested subfactors, with each

subfactor representing different dimensions of work organisation risk assessment. The five

nested subfactors are: quality of relationships with management, reward and recognition,

workload issues, quality of relationships with colleagues, and quality of the physical

environment.

This study is the first of its kind to be conducted on two groups of employees in an

Australian health setting. As mentioned in Chapter 4, in recent years bifactor modeling has been

gaining popularity among scholars of different disciplines. However, applications of the

bifactor model in the field of organisational psychology have been very limited. Lack of

knowledge or lack of information on the advantages of a bifactor model over a higher order

model in specific contexts may be among the main reasons for this neglect (Reise, 2012).

The focus of study one is validation of the Work Organisation Assessment Questionnaire

(WOAQ), which was originally proposed by Griffiths et al. in 2006. Although the scale has

been used in many studies since its development, there is little work evaluating its validity.

Among these studies, a poor fit is reported for WOAQ using a second-order model and

none of these studies employed bifactor modeling for validity assessment. In this research,

Study 1 consisted of three sub-studies which are discussed below.

a) Validation of the Work Organisation Assessment Questionnaire (WOAQ) for

nurses. This study presents a validity assessment of the WOAQ for nurses, in

which a conventional second-order model of WOAQ will be compared with a

bifactor model. No other such study has been undertaken in an Australian health

setting using such a broad and rigorous examination of model-based reliability and

validity procedures. In particular, the bifactor model of Work Organisation

Assessment Questionnaire (WOAQ), including its general factor and five sub-

factors, will be assessed in terms of construct validity.

b) Model-based reliability of WOAQ. The conventional coefficient alpha reliability

measures for WOAQ will be compared with the model-based reliability of Omega

total, Omega hierarchical and Omega subscales. Based on the literature on

multidimensional scales, the coefficient alpha overestimates reliability. Model-

based reliability coefficients are expected to provide more accurate reliability

measures for multidimensional models and/or when the assumptions for coefficient

alpha are not met.

c) Cross validation of the Work Organisation Assessment Questionnaire (WOAQ)

across gender among paramedics. The final section of study one considers the

cross-validity of the Work Organisation Assessment Questionnaire (WOAQ) across

gender considering only the paramedics sample. The best fitting model of Work

Organisation Assessment Questionnaire (WOAQ), obtained in the assessment of

cross-validity across the nurse and paramedic samples, will be tested for invariance

for males and females using the MACS procedure. This allows a statistical

comparison of factor structures and observed means. MACS was first introduced

by Sörbom (1974) for the cross validation of SEM models. However, practical use

of this procedure is often neglected in the literature.

This chapter, along with Chapters 7 and 8, presents the rationale and objectives, the

methodology used, the results and discussion of the findings, the unique strengths of the

research, and possible directions for future related studies.

6.1 Rationale and Objectives

6.1.1 Validity of Bifactor Model of WOAQ

One of the greatest challenges for society is sustaining an individual’s health and

quality of life in the workplace (Cox, 1997). There is a broad body of research revealing

damage to health and wellbeing in workplaces. Increasing awareness of the possible

deleterious effects of work related factors on health has led to the enforcement of

regulations and the introduction of legislation in many developed countries to ensure that

organisations put the health of their employees as a high priority (Faragher, Cooper &


Cartwright, 2004). As a result, management has also been encouraged to conduct risk

assessments for psychosocial hazards with a view to ensuring employees’ health and safety

in the workplace (Rick & Briner, 2000).

There are several driving forces which contribute to making the workplace a less

convivial place for employees to work in. For instance, the growing competitiveness of the

marketplace, the constant need to improve organisation efficiency and profitability and

radical changes in employment conditions are amongst the major driving forces responsible

for increasing stress in the workplace (Faragher, Cooper & Cartwright, 2004). But in

particular, an inability to incorporate proper work design in the workplace leads to a

negative effect on both employees and organisations (Griffiths, et al., 2006).

Much of the attention in the occupational health and safety (OH&S) literature has

been focusing on linking this inability to incorporate a suitable work design with the right

assessment tools and decreasing negative work related outcomes for individuals and

organisations (Griffiths, et al., 2006).

The efficacy of an OH&S tool in assessing the risk factors in the workplace

environment depends on how well it is designed, implemented, and developed. A more

practical approach is required in order to obtain information from relevant respondents,

taking into consideration the nature of their work (LaMontagne, 2004).

Adapting such approaches to a specific work context provides a benchmark which

can be used to identify the main organisational hazards and to progressively improve OH&S

by improving safe work design and practices. The main challenge is to use a suitable

instrument to improve the capture of OH&S indicators.


Based on recommendations from previous studies, a good risk assessment process

can only be achieved by using multiple methods of assessment. A well-designed

assessment should recognize the risks in the workplace and also the employees at risk (The

Health and Safety Executive Guidelines, 2000). The organisational risk assessment is

obtained using questionnaire/survey scales. In order to evaluate risk and stress effectively,

this questionnaire must meet some important criteria such as being reliable and valid; easy

to complete; measuring the possible risks, their predictability of outcomes related to the

employees’ health, their size and impact on the target population; and applicable to both

organisations as a whole and at different work levels. To be able to meet such criteria, the

questionnaires are usually quite lengthy. As a result, the large amount of time it takes to

complete a questionnaire leads to a low response rate (Faragher, Cooper & Cartwright,

2004).

A short yet comprehensive risk assessment questionnaire is desirable. One such

instrument, the Work Organisation Assessment Questionnaire (WOAQ) developed by

Griffiths et al. (2006), may be able to overcome problems identified in previously validated

measures because it is short yet comprehensive in content. The methodology

developed in WOAQ was based on identifying and collecting employees’ opinions on their

work, health, and their workplace design and management (Griffiths, et al., 2006). It was

designed to measure risk factors pertaining to the work design and management which may

influence employee health and health related behaviours in a manufacturing setting

(Griffiths, et al., 2006; Wynne-Jones et al., 2009). The overall score on WOAQ indicates

the extent to which the respondents believe that these dimensions of work are good and can


be used as predictors of wellbeing, subjective health and job satisfaction. A high score on

WOAQ indicates that the respondents perceive dimensions of work as good, and a low

score on WOAQ indicates that the respondents perceive dimensions of work as problematic

(Griffiths, et al., 2006).

The WOAQ was initially developed for a manufacturing setting and implemented in

the private sector; however the comprehensive approach to the risk assessment means that

this questionnaire may be used in other settings including non-manufacturing or health

settings.

It is therefore important to check if the WOAQ can be implemented effectively in

other work settings or professions (Wynne-Jones et al., 2009). Only a few studies have

evaluated the application of WOAQ in other workplaces. For example, Wynne-Jones et al.

(2009), in their research on two large public sector organisations in South Wales, evaluated

the validity and reliability of WOAQ in the public sector. Using a higher order CFA, the

researchers only found a marginal fit for the original five subfactors of WOAQ. In the end

they identified a two-factor structure linked to four of the five scales of the WOAQ,

assessing Management and Work Design, and Work Culture. One of the aims in this study

is therefore to find out if the general and five subfactors of WOAQ can be implemented in a

non-manufacturing, health setting in Australia. Also, in addition to the conventional higher

order model of CFA used frequently by other scholars in the field (including the study by

Wynne-Jones et al., 2009), a more practical bifactor model will be used to assess the general factor of

WOAQ and the plausibility of its five subfactors in an Australian community nursing


setting. As fully discussed in Chapter 4, a bifactor model of WOAQ is deemed to deliver a

better fit and more valuable information in such contexts. It is therefore hypothesised that:

Hypothesis 6.1. A bifactor model of WOAQ has acceptable construct validity in a

non-manufacturing, health setting in Australia.

Hypothesis 6.2: A bifactor model of WOAQ has superior fit over the conventional

higher order, five-factor model of the WOAQ.

In this study covariance-based SEM is used to fit reflective models to the WOAQ.

This allows the evaluation of model fit using conventional goodness of fit measures. In

addition it allows the extraction of model-based measures of reliability. It also allows the

use of invariance tests for comparing the cross-validity of models for different groups (e.g.

nurses and paramedics, males and females) as described below.
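
The two competing structures compared in Hypothesis 6.2 can be written out as follows. This is a sketch in standard factor-analytic notation, not the exact EQS specification used in the study. The symbols F_j, γ_j and ζ_j are assumptions introduced here for illustration, while λ_gi and λ_Sj,i follow the notation used for the omega formulae in Section 6.2.6.

```latex
\begin{align*}
% Bifactor model: each item loads directly on the general factor G and on
% exactly one specific factor S_j; G and the S_j are assumed orthogonal.
x_i &= \lambda_{gi}\,G + \lambda_{S_j,i}\,S_j + e_i,
  & \operatorname{Cov}(G, S_j) &= 0 \\
% Higher-order model: items load only on first-order factors F_j (assumed
% symbol); the general factor reaches the items indirectly through F_j.
x_i &= \lambda_i\,F_j + e_i,
  & F_j &= \gamma_j\,G + \zeta_j
\end{align*}
```

In the higher-order form the effect of the general factor on item i is the product λ_i γ_j, so the direct general-factor paths of the bifactor form are not freely estimated. This is one way of seeing why the bifactor model is usually regarded as the less restrictive of the two.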

6.1.2 Model-based Reliability

One of the commonly used measures for reliability is coefficient alpha which was

proposed originally by Cronbach in 1951. Coefficient alpha was developed for only one-

dimensional scales and is therefore not appropriate for multidimensional constructs as

discussed previously (Sijtsma, 2009; Zinbarg, Revelle, Yovel, & Li, 2005). In the case of

multidimensional scales, coefficient alpha may lead to overestimation of the reliability

(Cortina, 1993).
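
For reference, coefficient alpha can be computed directly from an item score matrix. The block below is a minimal sketch using only the Python standard library; the function name `cronbach_alpha` and the toy responses are hypothetical and are not taken from the WOAQ data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Coefficient alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of the summed (total) score.
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses from five people on three 5-point items.
toy = [[1, 2, 2], [2, 3, 3], [3, 3, 4], [4, 5, 4], [5, 5, 5]]
alpha = cronbach_alpha(toy)
print(round(alpha, 3))
```

Alpha depends only on the item variances and the total-score variance, so it carries no information about how many dimensions underlie the items. This is the motivation for the model-based alternatives discussed next.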

However, model-based reliability assessments for multi-dimensional scales were

provided many years ago by Bentler (1968) and Heise and Bohrnstedt (1970) and, more

recently, by Bentler (2007, 2009) for factor analytic types of models, and, in a generalised


form, for any structural equation model with additive errors. Although reliability for a

general SEM model is rationalised based on the model’s multidimensional structure, it

should be noted that a uni-dimensional model-based coefficient, which we will call ρ or

rho, still quantifies the proportion of variance due to the most reliable single dimension in

multidimensional space (Bentler, 2007). However, there are a few empirical studies that

have also reported reliability coefficients such as omega hierarchical, omega subscale and

omega total that are suitable for bi-factor models with multiple subscales (e.g. Gignac &

Watkins 2013; Reise, Bonifay, & Haviland, 2012; Zinbarg et al., 2012).

For the purpose of this study, both traditional (conventional) estimates (i.e. the

conventional Coefficient alpha) and more modern model-based reliability estimates of

Omega (i.e. omega hierarchical, omega subscale and omega total) will be assessed and

compared for the bifactor WOAQ model.

Omega hierarchical (ωh) indicates how well the total score measures the general factor

in a bifactor model (Revelle & Zinbarg, 2008; Zinbarg, Revelle, Yovel, & Li,

2005). It is a measure of the variance in total scores that arises from the general factor

running across all the items (Reise, Bonifay, & Haviland, 2013). Omega subscale (ωs) is

used to determine the degree of reliability of the proposed subscale scores after controlling

for the variance generated from the general factor (Reise et al., 2012). Omega total (ωt)

estimates the combined reliability for the general factor and the subscales (McDonald,

1978).


Reporting both conventional item based reliability estimates as well as all three

types of omega model-based reliability measures for a bifactor model will provide a more

detailed and comprehensive evaluation of the reliability for WOAQ. All the previous

studies on WOAQ reported only single item-based reliability measures which are not

appropriate given the multidimensional nature of the scale. These reliability measures will

be calculated and compared for the sample of paramedics.

It is hypothesised that:

Hypothesis 6.3: The model-based Omega reliability coefficients will provide

acceptable levels of internal consistency for the bifactor model of WOAQ for a sample of

paramedics.

Hypothesis 6.4: The conventional internal consistency reliability of alpha

coefficient overestimates the reliability of the WOAQ scale compared to the model-based

reliability coefficients of Omega for a sample of paramedics.

6.1.3 Cross Validation of Bifactor Model of WOAQ

An important aspect of a tool’s psychometric properties is its cross validity, or

whether the tool has good fit in other groups of individuals or populations. Once the

validity of a tool is established, it is time to assess its cross-validity. The main issue of

cross-validity is whether the validated tool fits well in more specific populations.

Measurement Model Invariance can be tested for two or more distinct samples using CFA.

If the model fits well in all the samples then it can be concluded that the model is

acceptable and valid across the corresponding populations. However, it is also necessary to


test whether the population parameters can be considered equal for all the samples. There

are several procedures for evaluating cross-validity in CFA models, each relating to a

hypothesis for a different set of key population parameters (e.g. Meredith, 1993; Widaman

and Reise, 1997; Byrne, 1995; Byrne & Watkins, 2003; Cheung & Rensvold, 2002). Little

(1997) categorises invariance testing into two major categories. The first type of invariance

procedure refers to evaluating the psychometric characteristics of the model parameters

(e.g. factor loading, measured-variable loading, variance/covariance of errors or factor

residuals) using analysis of covariance (COVS). This type of invariance must be

established before progressing to category two of invariance analysis, relating to invariance

in factor means (Cheung & Rensvold, 2002; Widaman and Reise, 1997). The invariance

analysis of mean and covariance structure (MACS) procedures for latent constructs was

first introduced by Sörbom (1974) for the cross validation of SEM models. Although

category one invariance testing (COVS) of measured (observable) parameters described by

Little (1997) is widely demonstrated by researchers, few scholars have paid attention to the

category two invariance testing of MACS (Cheung & Rensvold, 2002; Chen, Sousa, &

West, 2005; Vandenberg & Lance, 2000). In this study the invariance of all parameters for

the bifactor WOAQ model will be assessed between males and females in the paramedics’

sample.

The main technical aspects of invariance procedures, at both measurement and

construct levels, were introduced by Meredith (1993), Widaman and Reise (1997) and

Meredith and Horn (2001). The main limitation of the techniques they have introduced is

their design for only first-order models. For more complex bifactor or higher order models,


the literature on practical assessment techniques is more limited. As mentioned by Chen,

Sousa, and West (2005), previous scholars have not paid enough attention to the invariance

testing of bifactor or higher order models (e.g. Byrne, 1995; Byrne & Campbell, 1999;

Marsh & Hocevar, 1985). In particular, this is true in the case of MACS invariance tests for

bifactor models.

Thus, in this part of the study, invariance testing will be conducted on a bifactor

model of the WOAQ at both parameter and construct levels, building on the

recommendations of previous scholars (Cheung and Rensvold, 2002; Byrne, 1995; Byrne

and Watkins, 2003; Meredith, 1993, Meredith & Horn, 1997; Widaman & Reise, 1997 and

Chen, Sousa, & West, 2005; Yap et al., 2014).

In order to assess the validity and the cross-validity of the WOAQ across the data in

this study, three analyses will be carried out across gender for the bifactor model WOAQ

with its five nested factors. In the first set of analyses, the validity of the bifactor model of

WOAQ will be independently tested for male and female employees from a paramedic

organisation. Invariance testing for the measures will be carried out at the second step. If

the model shows satisfactory invariance of the measures, then in the final analysis the

construct means can be tested for invariance using the MACS approach. In this analysis

only the paramedic sample is considered because there were too few males in the nursing

sample to make a valid comparison.
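
The sequence of invariance tests described above can be summarised in mean and covariance structure notation. The block below is a sketch in conventional multi-group CFA notation. The symbols τ, Λ, ξ, κ, Φ and Θ and the group labels m (males) and f (females) are assumptions introduced for illustration. The intercept-equality step is included because it is conventionally imposed before latent means are compared, not because the text states it.

```latex
\begin{align*}
% Measurement model for group g, with g = m (males) or g = f (females)
x^{(g)} &= \tau^{(g)} + \Lambda^{(g)}\xi^{(g)} + \varepsilon^{(g)} \\
% Implied mean and covariance structures
\mu^{(g)} &= \tau^{(g)} + \Lambda^{(g)}\kappa^{(g)} \\
\Sigma^{(g)} &= \Lambda^{(g)}\Phi^{(g)}\Lambda^{(g)\top} + \Theta^{(g)} \\
% Nested constraints, each added to the previous step
\text{Loadings:}\quad \Lambda^{(m)} &= \Lambda^{(f)} \\
\text{Intercepts:}\quad \tau^{(m)} &= \tau^{(f)} \\
\text{Factor means:}\quad \kappa^{(m)} &= \kappa^{(f)}
\end{align*}
```

The configural step places no equality constraints and only requires the same pattern of free and fixed loadings in both groups. For identification, the factor means of one group are usually fixed to zero so that the other group's means are estimated as differences.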

It is evident that any occupational safety and health interventions should be

beneficial for both males and females. However, gender mainstreaming, or the gender-

sensitive approach to occupational health and safety (OH&S), has been recognised as


important to the OH&S agenda by the European Commission in its community safety and

health strategy 2002-06 (EU-OSHA – European Agency for Safety and Health at Work,

2014). Both male and female employees can benefit more from health interventions

that are developed using a gender-sensitive approach. To obtain such

equality in OH&S, it is critical to recognise the gender differences and as a result the

differences in their work organisations and the way they perceive working conditions. This

should not just be limited to the physical elements (such as designing safety gear

specifically fitted for women) but should also take into account the psychosocial elements

of the work setting. Therefore, in recognition of the importance of a gender-sensitive

approach, the main goal of this study is to examine whether work characteristics are

experienced in the same way by both genders and, in this way, validate the WOAQ across

male and female paramedics.

Therefore it is hypothesised that:

Hypothesis 6.5: Baseline invariance. There is a baseline invariance of the bifactor

CFA model of the WOAQ in that the model describes both female and male paramedics.

Hypothesis 6.6: Configural invariance. There is a configural invariance of the

bifactor CFA model of the WOAQ across gender in that the model describes the combined

data set well.

Hypothesis 6.7: Invariant factor loadings. The bifactor CFA model of the WOAQ

exhibits invariance across gender, even after constraining the factor loadings on observed

variables to be equal for males and females.


Hypothesis 6.8: Invariant factor means. The factor (construct) means of the bifactor

CFA model of the WOAQ are invariant for male and female paramedics.

6.2 Method

The data collection for both studies of nurses and paramedics along with the

measures, ethical considerations and data analysis are described below.

6.2.1 Nursing Participants

Data were collected from a sample of Australian nurses for the validation of the

WOAQ. The study design was cross-sectional. A self-report questionnaire was used to

capture demographic-work characteristics and the WOAQ described below.

A questionnaire package that included a cover letter, information sheet, consent

form, questionnaires, and reply-paid envelopes was forwarded to all potential participants.

Three weeks after the mail-out, a letter was forwarded to the employees to thank them for

their participation, or to ask if they could complete and return the questionnaire if they had

not already done so. A total of 334 surveys were returned. Some of the returned surveys

were incomplete with a high percentage of missing data, therefore the decision was made to

remove these incomplete surveys. After data cleaning and removing the incomplete data,

312 surveys were included in the final data analysis.


6.2.2 Paramedic Participants.

The paramedic data was collected from a large Australian health organisation

employing paramedics (see footnote 3). The study design was cross-sectional. A self-report electronic

questionnaire was used to capture the variables of interest anonymously. Nine hundred and

seventy nine responses were received from the paramedics. Of these, 33 were from

volunteer paramedics which were excluded from the final database.

6.2.3 Measures

The measurement scale used was the comprehensive Work Organisation

Assessment Questionnaire (WOAQ) consisting of 28 items pertinent to aspects of the

respondents’ work organisation (Griffiths, et al. 2006). Respondents were asked to rate how

problematic or good each of the items were for them in the last six months, with higher

scores representing better quality of work environment. It was assumed that the WOAQ

consisted of a general 28-item summative factor with a five sub-factor structure. The five-

factor structure of the scale included: workload issues, reward and recognition, quality of

relationships with management, relationships with colleagues, and physical environment.

3 The data was collected as part of a study on the prevention of work-related musculoskeletal disorders and the development of a tool kit for workplace users. For this study, psychosocial workplace hazards were recognised as a significant predictor of discomfort /pain levels and absenteeism due to sickness (Jodi & Macdonald, 2012). Therefore, data on the WOAQ which was collected as part of quantifying the psychosocial workplace hazards in that study was also used in this study for evaluating covariate-dependent reliability and cross-validity.


6.2.4 Ethics

Human Research Ethics Committee approval was obtained from both the lead

university and the participating nursing and paramedic organisations.

6.2.5 Overview of Statistical Analysis

Normality of the data was assessed before conducting the CFA at both item level

and group level. At the first step of the validation process, the construct validity of a

bifactor model of WOAQ was compared with a higher order model. Although the responses

are captured on a 5-point ordinal scale they are treated as continuous normally distributed

variables. This is a limitation of the analysis, although ordinal variables with five categories

are commonly treated as continuous. There is some evidence that this treatment is unlikely

to have any significant practical impact on the results (e.g., Babakus, Ferguson,

& Jöreskog, 1987; Dolan, 1994; Johnson & Creech, 1983; Hutchinson & Olmos, 1998;

Rhemtulla, Brosseau-Liard, & Savalei, 2012). As demonstrated in simulation

studies (Rhemtulla, Brosseau-Liard, & Savalei, 2012), for five to seven categories robust

continuous methods of estimation, such as Maximum Likelihood (ML), deliver similar

results to categorical methods of estimation such as categorical Least Squares (cat-LS).

Also, as noted by Rhemtulla, Brosseau-Liard, and Savalei (2012), continuous

methods of estimation are very familiar to researchers, whereas knowledge of

estimation methods for categorical data is more limited.

An important factor that was considered in choosing the suitable fit indices was the

degree of penalty included for model complexity. Based on the suggestion by scholars (e.g.

Gignac, 2013), for evaluation of bifactor models, it is better to choose those close-fit


indices that include relatively greater penalties for model complexity (i.e. RMSEA,

NNFI/TLI, & AIC).

The fit indices reported in this study are summarised as follows:

- The root mean square error of approximation (RMSEA)

- The Tucker-Lewis Index (TLI) or Non-normed fit index (NNFI)

- The Akaike Information Criterion (AIC)

RMSEA values of less than .08 and .05 (MacCallum, Browne, & Sugawara, 1996)

and NNFI values of greater than .90 and .95 (Hu & Bentler, 1999) were considered as

marginal and good fit levels respectively. Model comparisons will be based

on a practical improvement in NNFI: following Vandenberg & Lance (2000), a change in NNFI

of at least .010 is taken to indicate a meaningful difference in fit between models.

The Akaike Information Criterion (AIC) is a comparative measure of fit which is

meaningful only when two different models are estimated. A smaller value of AIC and a

reduction (ΔAIC) of more than 10 indicates a superior model fit (Akaike, 1973; Raftery,

1995; Schwarz, 1978).

The chi-square goodness of fit test was also reported as the conventional, commonly

reported measure of fit in the literature. Traditionally, a chi square statistic is used for

assessing if the proposed model describes the data adequately. However, as acknowledged

by Hu & Bentler (1999), the chi square statistic is highly dependent on sample size and is

not appropriate for complex or non-normal data. The relative chi square (Chi-Square/DF) is

therefore preferred as a measure of model fit. For this statistic a value of 1 to 2 reflects


good fit, less than 3 represents acceptable fit (Kline, 1998), and less than 5 represents

adequate fit (Schumacker & Lomax, 2004).
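
The cut-offs quoted above can be applied mechanically once the chi-square statistics are available. The following is a minimal sketch, assuming the common textbook formulas for RMSEA (with the N - 1 convention) and TLI/NNFI; programs differ in such details, so the values may not match EQS output exactly. AIC is left out because its definition varies between programs. All function names and the chi-square values in the example are invented for illustration and are not results from this study.

```python
import math


def rmsea(chi2, df, n):
    """Root mean square error of approximation (N - 1 convention)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def tli(chi2_model, df_model, chi2_null, df_null):
    """Tucker-Lewis index (NNFI) relative to the independence (null) model."""
    null_ratio = chi2_null / df_null
    model_ratio = chi2_model / df_model
    return (null_ratio - model_ratio) / (null_ratio - 1.0)


def classify_fit(rmsea_value, tli_value):
    """Cut-offs quoted in the text: .05/.95 for good fit, .08/.90 for marginal fit."""
    if rmsea_value < 0.05 and tli_value > 0.95:
        return "good"
    if rmsea_value < 0.08 and tli_value > 0.90:
        return "marginal"
    return "poor"


def classify_relative_chi2(chi2, df):
    """Relative chi-square bands quoted in the text (2, 3 and 5)."""
    ratio = chi2 / df
    if ratio <= 2:
        return "good"
    if ratio < 3:
        return "acceptable"
    if ratio < 5:
        return "adequate"
    return "poor"


# Invented example values (NOT taken from the thesis results).
chi2_m, df_m, n = 700.0, 322, 312   # fitted model
chi2_0, df_0 = 5000.0, 378          # independence model

fit_rmsea = rmsea(chi2_m, df_m, n)
fit_tli = tli(chi2_m, df_m, chi2_0, df_0)
label = classify_fit(fit_rmsea, fit_tli)
ratio_label = classify_relative_chi2(chi2_m, df_m)
print(round(fit_rmsea, 3), round(fit_tli, 3), label, ratio_label)
```

Requiring both indices to pass a band before assigning its label is a design choice of this sketch. The text reports the indices separately and does not prescribe a combined rule.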

The other commonly reported fit indices are the Standardized Root Mean Square

Residual (SRMSR) and Comparative Fit Index (CFI); however, neither of these fit indices

was considered in this study because they do not adequately penalise for model

complexity (Marsh, Hau, & Grayson, 2005; Gignac, 2013).

6.2.6 Model-based reliability

Model-based reliability coefficients of omega total, omega hierarchical and omega

subscales, and conventional item-based coefficient alpha will be used for testing the

reliability of the WOAQ. Only the R psych package (Revelle, 2013) calculates these omega

coefficients directly. In other SEM software such as AMOS and EQS, omega coefficients

can be calculated indirectly using what is known as a reliability index (Fan, 2003), which is

in fact the implied correlation between a latent variable and its corresponding composite

score (Gignac, 2007).

As recommended by Gignac (2014), a practical approach for the estimation of ωh

and ωs is “to estimate the (squared) correlation between latent variables within a bifactor

model and their corresponding equally weighted composites scores (known as phantom

variables) within structural equation modeling programs” (p. 9). Figure 6.1 demonstrates

an example of this procedure using EQS. The confidence intervals associated with the

reliability coefficients in this procedure can also be evaluated using a combination of the

phantom variable squared correlation approach and bootstrapping. Due to an identification


problem (having only two indicators for the ‘relationship with colleagues’ construct), the

method could not be used in this study. Instead, the omega coefficients were calculated

manually in an Excel spreadsheet, applying the formulae below to the factor

loadings and error variances of the well-fitting WOAQ model.

The formulae for Omega hierarchical (ωh) and Omega subscale (ωs) are provided

below for k = 28 items (i = 1, 2, …, k) loading on a general factor with loadings

λgi and on five subscales Sj (j = 1, 2, …, 5), where the items i = 1, 2, …, Sj of subscale Sj have loadings λSj,i.

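\[
\omega_h = \frac{\left(\sum_{i=1}^{k}\lambda_{gi}\right)^{2}}
{\left(\sum_{i=1}^{k}\lambda_{gi}\right)^{2}
 + \left(\sum_{i=1}^{S_1}\lambda_{S_1,i}\right)^{2}
 + \left(\sum_{i=1}^{S_2}\lambda_{S_2,i}\right)^{2}
 + \dots
 + \left(\sum_{i=1}^{S_5}\lambda_{S_5,i}\right)^{2}
 + \sum_{i=1}^{k}\mathrm{Var}(e_i)}
\]
Equation 6.1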

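\[
\omega_{s_j} = \frac{\left(\sum_{i=1}^{S_j}\lambda_{S_j,i}\right)^{2}}
{\left(\sum_{i=1}^{S_j}\lambda_{gi}\right)^{2}
 + \left(\sum_{i=1}^{S_j}\lambda_{S_j,i}\right)^{2}
 + \sum_{i=1}^{S_j}\mathrm{Var}(e_i)}
\]
Equation 6.2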

where the items i=1,2,…Sj all belong to the Sj scale for j=1, 2, ..5. Combining these

reliabilities the total reliability of the 5-factor measurement model is measured using

Omega total (ωt)


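\[
\omega_t = \frac{\left(\sum_{i=1}^{k}\lambda_{gi}\right)^{2}
 + \left(\sum_{i=1}^{S_1}\lambda_{S_1,i}\right)^{2}
 + \left(\sum_{i=1}^{S_2}\lambda_{S_2,i}\right)^{2}
 + \dots
 + \left(\sum_{i=1}^{S_5}\lambda_{S_5,i}\right)^{2}}
{\left(\sum_{i=1}^{k}\lambda_{gi}\right)^{2}
 + \left(\sum_{i=1}^{S_1}\lambda_{S_1,i}\right)^{2}
 + \left(\sum_{i=1}^{S_2}\lambda_{S_2,i}\right)^{2}
 + \dots
 + \left(\sum_{i=1}^{S_5}\lambda_{S_5,i}\right)^{2}
 + \sum_{i=1}^{k}\mathrm{Var}(e_i)}
\]
Equation 6.3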
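
The spreadsheet calculation described for Equations 6.1 to 6.3 can be sketched in a few lines of Python. This is an illustration only, assuming orthogonal factors with unit variance as in a standard bifactor model. The function name `omega_coefficients`, the argument layout and the toy loadings are hypothetical and are not the WOAQ estimates.

```python
def omega_coefficients(general, specific, error_var, subscales):
    """
    general   : general-factor loading of each item
    specific  : loading of each item on its own subscale factor
    error_var : error (unique) variance of each item
    subscales : dict mapping subscale name -> list of item positions
    Returns (omega_h, dict of omega_s per subscale, omega_t).
    """
    # Squared sum of general loadings over all items (numerator of Equation 6.1).
    g_all = sum(general) ** 2
    # Squared sum of specific loadings, one term per subscale.
    s_terms = {name: sum(specific[i] for i in idx) ** 2
               for name, idx in subscales.items()}
    total_var = g_all + sum(s_terms.values()) + sum(error_var)

    omega_h = g_all / total_var                              # Equation 6.1
    omega_t = (g_all + sum(s_terms.values())) / total_var    # Equation 6.3

    omega_s = {}
    for name, idx in subscales.items():                      # Equation 6.2
        g_sub = sum(general[i] for i in idx) ** 2
        e_sub = sum(error_var[i] for i in idx)
        omega_s[name] = s_terms[name] / (g_sub + s_terms[name] + e_sub)
    return omega_h, omega_s, omega_t


# Hypothetical standardised loadings for a toy scale with 4 items and 2 subscales.
general = [0.6, 0.6, 0.5, 0.5]
specific = [0.4, 0.4, 0.5, 0.5]
error_var = [1 - g ** 2 - s ** 2 for g, s in zip(general, specific)]
subscales = {"A": [0, 1], "B": [2, 3]}

omega_h, omega_s, omega_t = omega_coefficients(general, specific, error_var, subscales)
print(round(omega_h, 3), {k: round(v, 3) for k, v in omega_s.items()}, round(omega_t, 3))
```

Because the numerator of omega total adds the subscale terms to the general-factor term over the same denominator, omega total can never be smaller than omega hierarchical for the same set of loadings.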

6.2.7 Cross-validation of WOAQ

The WOAQ was initially validated using the nursing data and was then cross-

validated on the paramedics data. Finally invariance was tested for males and females in the

paramedics sample. At the first step of this invariance analysis, the baseline bifactor model

was tested separately for males and females and at the second step the cross validity of the

WOAQ was assessed using invariance testing across gender.


Figure 6.1. A demonstration of Bifactor model of WOAQ with phantom variables for calculating omega coefficients on the right-hand side (Note that the phantom variable

paths are constrained equal to 1 creating equally weighted composite scores).


6.3 Summary

This study has both theoretical and empirical implications. The WOAQ was

originally developed and used in manufacturing settings. In addition, the previous studies

used a higher order model of WOAQ and some reported poor fit for the scale. To the best

of the researchers’ knowledge, no study has been conducted in a non-manufacturing, health

setting in Australia using a bifactor modeling procedure. The present study used data

collected from a group of Australian nurses and a group of paramedics to assess the

validity, cross-validity and model-based reliability of WOAQ, a well-designed instrument

for assessing work and organisational factors as potential risks to employee health. The

main aims of the study were:

1) To assess the validity of WOAQ in an Australian health setting.

2) To compare a bifactor model (nested factor models) with a conventional higher

order model of WOAQ using Confirmatory Factor Analysis (CFA).

3) To assess and compare model-based reliability coefficients of Omega

hierarchical, Omega subscales and Omega total with the conventional

coefficient alpha.

4) To assess the cross-validity of the Work Organisation Assessment

Questionnaire (WOAQ) on a group of paramedics.

5) To assess the cross-validity of the Work Organisation Assessment

Questionnaire (WOAQ) on male and female paramedics.


Unlike previous studies which used a higher-order Confirmatory Factor

Analysis (CFA) model for WOAQ, a bifactor modeling procedure was used in

this study. There is very limited literature on the invariance testing of bifactor

models, which adds to the novelty of this research.


7

STUDY 1: RESULTS

In this chapter, the study involving 312 nurses is used to validate the bifactor

WOAQ model. This model is then fitted for a sample of 945 paramedics and a test of

invariance is used to evaluate the cross-validity of this model for male and female

paramedics.

7.1 Results: Study of Nurses-Validation of Bifactor Model of WOAQ

In this section, the bifactor model for WOAQ is validated for the sample of nurses

described in Chapter 6. Descriptive statistics are presented and then goodness of fit

statistics and reliability measures are derived using a higher order model and a bifactor

model. The results indicate that the bifactor model is a more valid representation of

WOAQ for this sample.

7.1.1 Descriptive Statistics for Demographics

Table 7.1 presents the frequencies, means and standard deviations for the

demographic variables. The majority of the participants were female (94.5%) with an

average age of 45.19. The majority had more than 4 years’ experience working in a nursing

setting (97.1%). About 40% of the participants were working full-time with the remaining

60% part-time employees.


Table 7.1 Descriptive Statistics of the Demographic Variables*

Variable                               Category     Frequency (%)
Gender                                 Male         17 (5.5)
                                       Female       290 (94.5)
Contacts with clients (hrs/workday)    < 2          22 (7.1)
                                       2-4          30 (9.7)
                                       4-6          122 (39.4)
                                       6-8          134 (43.2)
                                       > 8          2 (0.6)
Years of experience (years)            < 1          2 (0.7)
                                       1-3          7 (2.3)
                                       4-6          16 (5.2)
                                       > 6          282 (91.9)
Employment status                      Part-time    183 (60)
                                       Full-time    123 (40)
Mean Age (SD)                                       45.19 (9.54)

* n varies between 306-312 due to some missing responses

7.1.1.1 Descriptive Statistics at Item Level.

The twenty eight WOAQ items (Griffiths et al., 2006) are shown in Table 7.2,

grouped according to the five subscales.


Table 7.2 Subscales and WOAQ Items

WOAQ Subscales                              WOAQ Items

Quality of relationships with management    3. Clear roles and responsibilities
                                            5. Support from supervisor
                                            7. Feedback on your performance
                                            11. Appreciation or recognition of your efforts by supervisors
                                            16. Senior management attitudes
                                            17. Clear reporting lines
                                            22. Communication with supervisor
                                            26. Status/recognition in the company
                                            27. Clear company objectives, values, procedures

Reward & recognition                        12. Consultation about changes in your job
                                            13. Sufficient training for this job
                                            14. Amount of variety in the work you do
                                            21. Opportunities for promotion
                                            23. Opportunities for learning new skills
                                            24. Flexibility of working hours
                                            25. Opportunities to use your skills

Workload issues                             6. Pace of work
                                            8. Your workload
                                            15. Impact of family/social life on work
                                            19. Impact of work on family/social life

Quality of relationships with colleagues    10. How you get on with your co-workers (personally/socially)
                                            28. How well you work with your co-workers (as a team)

Quality of physical environment             1. Facilities for taking breaks (places for breaks, meals)
                                            2. Work surroundings (noise, light, temperature, etc.)
                                            4. Exposure to physical danger
                                            9. Health and safety at work
                                            18. Equipment, tools, IT or software that you use
                                            20. Work stations and work space

As shown in Table 7.3, none of the skewness and kurtosis coefficients exceeded

one in absolute value, demonstrating behaviour reasonably close to normality at item level

(West, Finch, & Curran, 1995).

Table 7.3 Item Characteristics of WOAQ

Items                                              Mean    SD      Skew    Kurtosis
WOAQ - quality of relationships with management    3.43    1.03    -.33    -.48
  3                                                3.60    1.02    -.38    -.49
  5                                                3.60    1.21    -.55    -.71
  7                                                3.15    1.04    -.11    -.55
  11                                               3.29    1.11    -.22    -.85
  16                                               3.12    1.15    -.09    -.78
  17                                               3.55    .91     -.40    .09
  22                                               3.49    1.05    -.41    -.45
  26                                               3.39    .96     -.38    -.13
  27                                               3.69    .88     -.48    .33
WOAQ - reward & recognition                        3.37    .80     -.21    -.34
  12                                               2.99    1.01    .01     -.50
  13                                               3.52    .97     -.36    -.24
  14                                               3.63    .83     -.18    -.16
  21                                               3.06    .90     -.03    .11
  23                                               3.63    .92     -.47    -.29
  24                                               3.20    1.0     -.14    -.61
  25                                               3.61    .89     -.28    -.39
WOAQ - workload issues                             2.79    .98     .23     .64
  6                                                2.79    1.16    .17     -1.0
  8                                                2.68    1.0     .28     -.85
  15                                               2.94    .83     .15     .58
  19                                               2.75    .93     .34     .13
WOAQ - quality of relationships with colleagues    3.94    .83     .58     .47
  10                                               3.83    .82     -.31    -.23
  28                                               4.06    .84     -.85    .72
WOAQ - quality of physical environment             2.97    1.07    .24     -.64
  1                                                2.80    1.27    .26     -1.0
  2                                                2.84    1.09    .27     -.61
  4                                                3.00    .90     .62     .18
  9                                                3.35    .99     .03     -.60
  18                                               2.88    1.14    .21     -.99
  20                                               3.00    1.04    -.05    -.49
Total                                              3.30    .94     -.31    -.51

7.1.1.2 Test of Model Assumptions

Although the normality assumptions were reasonably valid at item level, the

multivariate distribution of the items also needs to be checked. In this study CFA is testing

a multivariate statistical model using Maximum Likelihood (ML) estimation, assuming

multivariate normality (Hoyle, 2000). Multivariate normality can be evaluated using

Mardia’s multivariate skewness and kurtosis tests (Bentler & Wu, 2002). Although the

preliminary assessment at item level showed a relatively normal distribution for the data,

Mardia’s multivariate kurtosis coefficient (Mardia's coefficient (G2, P) = 109.40;

normalised estimate = 23.57) is a little high, indicating a violation of the multivariate

normality assumption. A value below 20 is usually required for Mardia’s

multivariate kurtosis coefficient (Byrne, 2010).
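The relationship between the two reported figures can be checked by hand. The sketch below assumes that the reported coefficient (109.40) is the centred kurtosis, that is, the sample multivariate kurtosis minus its expected value p(p+2) under normality, and that the normalised estimate divides this by the usual large-sample standard error sqrt(8p(p+2)/n). The function name is hypothetical; p = 28 items and n = 312 nurses are taken from the text and Table 7.6.

```python
import math


def normalised_mardia_kurtosis(centred_kurtosis: float, n_items: int, n_cases: int) -> float:
    """Normalised estimate: centred kurtosis / sqrt(8 * p * (p + 2) / n).

    Assumption: `centred_kurtosis` is already b2p - p(p + 2).
    """
    standard_error = math.sqrt(8 * n_items * (n_items + 2) / n_cases)
    return centred_kurtosis / standard_error


# Values reported for the nursing sample: 28 WOAQ items, n = 312.
z = normalised_mardia_kurtosis(109.40, n_items=28, n_cases=312)
print(round(z, 2))  # 23.57
```

Under these assumptions the calculation returns the normalised estimate quoted in the text.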

Robust test statistics were therefore used for evaluating the model. As described in the literature (Hu, Bentler, & Kano, 1992; Curran, West, & Finch, 1996), the Satorra-Bentler (1988, 1994) scaled chi-square test should be used when the assumption of normality is violated. Satorra and Bentler (1988, 1994) suggested using the scaled chi-square statistic together with robust standard errors under ML estimation, and this appears to be a good general approach for dealing with departures from normality. As noted previously, the ratio of the scaled χ2 to its degrees of freedom (χ2/df) ideally has a value between 1 and 2.

7.1.2 Model fit evaluation

The dimensionality of the general score of WOAQ and its five nested subfactors was assessed using confirmatory factor analysis (CFA). Both the second-order (higher order) model (Figure 7.1, model 1) and the bifactor model (Figure 7.1, model 2) were assessed. The modification indices suggested correlations between three pairs of item measurement errors: one pair of physical environment items (health and safety at work with exposure to physical danger) and two pairs of workload items (impact of family/social life on work with impact of work on family/social life; pace of work with workload). As suggested by Kenny (2011), the errors of items with similar content may be correlated when doing so is theoretically meaningful.

Figure 7.1. The proposed bifactor model of WOAQ vs. the higher order model (Model 1: higher order model of WOAQ; Model 2: bifactor model of WOAQ).

The results indicated that the higher order model provides a marginally acceptable

model for WOAQ and its five subfactors (SB scaled χ2/df=2.14, RMSEA=0.06, NNFI=0.89).

The factor loadings for the subfactors suggest well-defined subfactors. In addition, the

factor loadings of the five subscales over the higher order factor of WOAQ were strong and

significant. The path coefficients were 0.71 for ‘quality of physical environment’, 0.57 for

‘quality of relationship with colleagues’, 0.93 for ‘quality of relationship with

management’, 0.99 for ‘reward and recognition’ and 0.76 for ‘workload issues’.

For meaningful comparison of the higher order model with the bifactor model, the

Schmid-Leiman transformation was conducted to obtain loadings for all items on the higher

order factor. Table 7.4 provides the Schmid–Leiman transformed factor loadings for the

higher order factor. As suggested by Gignac (2007), the Schmid–Leiman (S-L)

transformations were calculated by multiplying the first-order factor loadings with their

respective second-order factor loadings.
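The multiplication described above can be sketched in a few lines. The general loading follows the rule given in the text; the residualised subfactor loading uses the standard Schmid-Leiman step (first-order loading times the square root of one minus the squared second-order loading), which the text does not spell out and is included here as an assumption. The function name and the first-order loading of 0.80 are hypothetical; 0.93 is the second-order path reported for ‘quality of relationship with management’.

```python
import math


def schmid_leiman(first_order_loading: float, second_order_loading: float):
    """Return (general, residualised subfactor) loadings for one item.

    Assumes a completely standardised higher order solution.
    """
    general = first_order_loading * second_order_loading
    subfactor = first_order_loading * math.sqrt(1.0 - second_order_loading ** 2)
    return general, subfactor


# 0.80 is a hypothetical item loading; 0.93 is the reported second-order path.
general, subfactor = schmid_leiman(0.80, 0.93)
print(f"general={general:.3f}, subfactor={subfactor:.3f}")
```

The two transformed loadings partition the item's original common variance: their squares sum to the squared first-order loading, so a subfactor with a second-order loading close to one retains very little unique loading.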

The results of the bifactor model suggest an acceptable fit (SB scaled χ2/df=1.71,

RMSEA =0.04, NNFI=0.93). Table 7.5 presents the results of the CFA evaluation. Based

on these results, the bifactor model of WOAQ provides a superior fit with a smaller AIC

value (AIC=-89.66) compared to the conventional higher order model (AIC=50.44). The

ΔNNFI is bigger than 0.04 and ΔAIC is -140.10 indicating significant superiority of the

bifactor model over the higher order model.
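The differences quoted here follow directly from the rounded values in Table 7.5, taking model 2 minus model 1. A minimal check, with hypothetical variable names:

```python
# Rounded fit statistics as reported in Table 7.5.
higher_order = {"nnfi": 0.89, "aic": 50.44}
bifactor = {"nnfi": 0.93, "aic": -89.66}

# Differences are model 2 (bifactor) minus model 1 (higher order).
delta_nnfi = round(bifactor["nnfi"] - higher_order["nnfi"], 2)
delta_aic = round(bifactor["aic"] - higher_order["aic"], 2)

print(delta_nnfi, delta_aic)  # 0.04 -140.1
```

A negative ΔAIC and a positive ΔNNFI both point in favour of the bifactor model.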

Important differences were found in the factor loadings of the bifactor model

compared to the higher order model. The most important difference was found for the

‘quality of relationship with management’. The S-L solution of the higher order model

showed positively defined fairly uniform factor loadings for this factor, while the bifactor

model detected differentially directed loadings. In addition, in the bifactor model for the

two subscales ‘the quality of relationship with management’ and ‘the reward and

recognition’, items were highly loaded on the general WOAQ but poorly loaded on their

nested group constructs. However, in the higher order model the items for ‘the reward and

recognition’ subscale had low loadings in both cases.

Table 7.4 Completely Standardized Maximum Likelihood (ML) Solutions of Higher order Model and the Bifactor Model

                    S-L Higher order       Bifactor
Item   Subfactor    S-L      Subfactor     G      Subfactor
1      QPE          0.47     0.55          .32    .60
2      QPE          0.54     0.62          .35    .78
3      QRM          0.57     0.23          .65    -.18
4      QPE          0.29     0.34          .24    .29
5      QRM          0.74     0.30          .72    .42
6      WI           0.52     0.46          .47    .37
7      QRM          0.71     0.28          .73    .12
8      WI           0.52     0.46          .48    .37
9      QPE          0.42     0.49          .44    .33
10     QRC          0.37     0.55          .35    .65
11     QRM          0.79     0.31          .78    .33
12     RR           0.00     0.02          .74    -.11
13     RR           0.00     0.02          .64    .06
14     RR           0.00     0.02          .48    .43
15     WI           0.43     0.38          .37    .58
16     QRM          0.73     0.29          .72    .29
17     QRM          0.72     0.28          .72    .18
18     QPE          0.34     0.39          .31    .28
19     WI           0.48     0.42          .44    .52
20     QPE          0.54     0.62          .49    .52
21     RR           0.57     0.02          .56    .06
22     QRM          0.78     0.31          .76    .40
23     RR           0.74     0.02          .69    .30
24     RR           0.60     0.02          .55    .09
25     RR           0.69     0.02          .66    .44
26     QRM          0.73     0.29          .80    -.05
27     QRM          0.72     0.28          .79    -.11
28     QRC          0.45     0.67          .45    .55

Note: S-L= Schmid-Leiman Transformation of Item Loadings, G=General factor of WOAQ, QPE=Quality of physical environment, QRC=Quality of relationship with colleagues, QRM=Quality of relationship with management, RR=reward and recognition, WI=Workload issues. Each item loads on the general factor and on the one subfactor shown.

Table 7.5 Summary of Model Fit Statistics of the CFA Models of WOAQ

Model                   SB χ2/df   RMSEA               NNFI   AIC      ΔNNFI†   ΔAIC†
0. Independent Model    11.48
1. Higher order model   2.14       0.06 (.05, 0.06)    0.89   50.44
2. Bifactor model       1.71       0.04 (0.04, 0.05)   0.93   -89.66   0.04     -140.10

Note: SB=Satorra-Bentler scaled χ2. RMSEA= Root Mean Square Error of Approximation; NNFI=Non-normed fit index; AIC = Akaike Information Criterion; †=differences model 2 - model 1.

7.1.3 Model-based reliability

Further analysis was carried out to assess the reliability of the well-fitting bifactor

model of WOAQ. The results of the model-based reliability evaluation of the

multidimensional WOAQ using the ωt reliability coefficient (combined true score variance

across the general factor of WOAQ and its five nested subfactors) indicated excellent total

reliability for this scale (0.92). It seems that 92% of the WOAQ variance is true variance,

leaving 8% for error.

The Omega hierarchical reliability coefficient demonstrates that the general factor of

WOAQ explains 87% of the variance while the total contribution of the subfactors is minimal

in the presence of the general WOAQ. In other words, a substantial proportion of internal

consistency belongs to the general factor of WOAQ rather than its nested five subfactors.

To better understand the individual reliability of each nested subfactor, the omega subscale reliability coefficient (ωs) was calculated for each nested subfactor, controlling for the effects of the general factor of WOAQ. The results show that among the five nested subfactors, ‘physical environment’ (ωs=.52), ‘workload issues’ (ωs=.39), and ‘relationships with colleagues’ (ωs=.36) demonstrated higher reliability than the other two nested subscales, independent of general WOAQ. The lowest omega subscale reliability coefficients belonged to ‘the quality of relationship with management’ and ‘reward and recognition’, demonstrating more dependency on general WOAQ for these two subscales.
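To make the three omega coefficients concrete, the following sketch computes omega total, omega hierarchical and omega subscale from completely standardised bifactor loadings. It assumes orthogonal factors, unit item variances and uncorrelated residuals, so it will not exactly reproduce the WOAQ values, whose model includes correlated errors. The function name and the six-item, two-subfactor loadings are hypothetical.

```python
def omega_coefficients(general, specific, groups):
    """Omega total, hierarchical and subscale for a standardised bifactor model.

    general, specific: dict item -> loading; groups: dict subfactor -> list of items.
    Assumes orthogonal factors and uncorrelated residuals.
    """
    unique = {i: 1.0 - general[i] ** 2 - specific[i] ** 2 for i in general}
    general_var = sum(general.values()) ** 2
    group_var = {g: sum(specific[i] for i in items) ** 2 for g, items in groups.items()}
    total_var = general_var + sum(group_var.values()) + sum(unique.values())

    omega_t = (general_var + sum(group_var.values())) / total_var
    omega_h = general_var / total_var

    omega_s = {}
    for g, items in groups.items():
        # Variance of the subscale score: general part + subfactor part + error.
        sub_general = sum(general[i] for i in items) ** 2
        sub_total = sub_general + group_var[g] + sum(unique[i] for i in items)
        omega_s[g] = group_var[g] / sub_total
    return omega_t, omega_h, omega_s


# Hypothetical loadings: six items, two subfactors (A is strong, B is weak).
general = {i: 0.6 for i in range(1, 7)}
specific = {1: 0.5, 2: 0.5, 3: 0.5, 4: 0.2, 5: 0.2, 6: 0.2}
groups = {"A": [1, 2, 3], "B": [4, 5, 6]}

omega_t, omega_h, omega_s = omega_coefficients(general, specific, groups)
print(round(omega_t, 3), round(omega_h, 3), {g: round(v, 3) for g, v in omega_s.items()})
```

In this illustration subfactor B has a low omega subscale value because its items carry little variance beyond the general factor, which mirrors the pattern reported for the two weakest WOAQ subscales.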

As expected, the conventional coefficient alpha overestimated the reliability (α =.94), probably because of a violation of the unidimensionality and independent residuals assumptions.
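One way alpha can overstate reliability is through correlated residuals. The sketch below uses a hypothetical four-item, single-factor model (not the WOAQ model) in which one pair of items shares a residual covariance. Alpha is computed from the model-implied covariance matrix, whereas omega counts only the factor-based variance as true score variance. All numbers and names are illustrative assumptions.

```python
def cronbach_alpha(cov):
    """Coefficient alpha from a square covariance matrix (list of lists)."""
    k = len(cov)
    total = sum(sum(row) for row in cov)
    trace = sum(cov[i][i] for i in range(k))
    return k / (k - 1) * (1.0 - trace / total)


# Hypothetical standardised loadings and one correlated residual pair.
loadings = [0.7, 0.7, 0.7, 0.7]
residual_cov_12 = 0.30  # shared residual covariance between items 1 and 2

k = len(loadings)
cov = [[loadings[i] * loadings[j] for j in range(k)] for i in range(k)]
for i in range(k):
    cov[i][i] = 1.0  # unit item variances
cov[0][1] += residual_cov_12
cov[1][0] += residual_cov_12

total_var = sum(sum(row) for row in cov)
omega = sum(loadings) ** 2 / total_var  # factor-based (true score) share
alpha = cronbach_alpha(cov)

print(round(alpha, 3), round(omega, 3))
```

Here alpha exceeds omega because the residual covariance inflates the inter-item covariances on which alpha is based, even though that shared variance is not attributable to the factor.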

Table 7.6 The Reliability Coefficients of WOAQ among Nursing Sample (n=312)

Constructs                                 α     ωt    ωh    ωs
General WOAQ                               .94   .92   .87   -
Physical environment                                         .51
Relationships with colleagues                                .35
Quality of relationships with management                     .16
Reward & Recognition                                         .15
Workload issues                                              .39

This bifactor model was then fitted to the paramedics data.

7.2 Results: Study of Paramedics-Cross Validation of Bifactor Model WOAQ

After establishing the validity of the bifactor model of WOAQ in one health population, it is important to cross-validate the model in a different health population. Therefore, using a different sample (here paramedics), the bifactor model of WOAQ was assessed, first for the combined sample and then for invariance across gender. The demographic characteristics of the total sample as well as the male and female samples are presented in Table 7.7.

7.2.1 Descriptive Statistics for Demographics

As indicated in Table 7.7 this was a much larger sample with a much better

representation of males than the nursing sample. Also, there were very few part-time

employees compared to the nursing sample and average age was younger than for the nursing

sample. These results suggest therefore that this is a very different population, making it

appropriate for this sample to be used for the cross-validation of the bifactor model for

WOAQ in a health setting.

Table 7.7 Characteristics of Paramedic Participants

                                   Total n=945     Males n=623     Females n=322
                                   Frequency (%)
Gender
  Male                             623 (65.9)      -               -
  Female                           322 (34.1)
Employment status Ϯ
  Part-time                        895 (94.7)      610 (97.91)     287 (89.13)
  Full-time                        48 (5.1)        13 (2.09)       35 (10.87)
Years of experience
  < 1 year                         92 (9.8)        38 (6.1)        54 (16.9)
  1-3 years                        127 (13.5)      55 (8.9)        72 (22.5)
  4-6 years                        133 (14.1)      62 (10.0)       71 (22.2)
  > 6 years                        588 (62.6)      465 (75.0)      123 (38.4)
Age (years) Mean (Range)           40.15 (21-65)   43.72 (22-65)   33.24 (21-56)

Note: Ϯ Due to their very low percentage (1.3 per cent in the paramedic organisation), casual employees have been allocated to the part-time category.

At the first step, the baseline bifactor model of WOAQ that was evaluated in the

previous section was assessed separately for the male and female paramedic groups. The

results in Table 7.8 show adequate model fit for the baseline bifactor model for males

(RMSEA =0.04, NNFI=0.94), and females (RMSEA =0.05, NNFI=0.92).

Table 7.8 Summary of Model Fit Statistics for the Baseline Bifactor Model of WOAQ across Gender

Model             SB χ2/df   CFI    RMSEA   NNFI
Baseline Model:
  Male            2.47       0.94   0.04    0.94
  Female          1.87       0.93   0.05    0.92

Note: RMSEA= Root Mean Square Error of Approximation; NNFI=Non-normed fit index.

Model 1: Configural model with no constraints. At the next step, a configural bifactor

model was fitted for the male and female groups simultaneously to determine if the model is

appropriate when there are no constraints. Based on the results, the configural model

(RMSEA =0.03, NNFI=0.97) showed good model fit (Table 7.9). This model shows that the

bifactor model is appropriate for paramedics as well as nursing staff, suggesting that this

model may have general applicability in health.

Model 2: Invariant loadings. After constraining the loadings to be equal for males and

females, the results still showed good fit (RMSEA=0.03, NNFI=0.96). To test for evidence of

invariance, the differences between the NNFI and AIC of Model 2 and Model 1 were

considered. The NNFI suggests no substantial deterioration in model fit for constrained loadings

compared to the configural model (unconstrained loadings) (Table 7.9),

but the AIC increased by more than 10. In these circumstances it was unclear whether

invariance could be claimed across gender.

As previously explained, reaching full invariance for all the parameters, or even the

most important ones, is very rare in most models (e.g. Byrne, Shavelson & Muthen, 1989). In

view of the conflicting results obtained above, a decision was therefore made to proceed to

the next stage of the invariance analysis considering the differences in construct means for

males and females.

Table 7.9 Invariance Testing Across Gender for the Bifactor Model of WOAQ.

Models                                      SB χ2/df   CFI    RMSEA   NNFI   AIC       Δ*NNFI   Δ*AIC
Model 1 - Configural model-no constraints   1.49       0.97   0.03    0.97   -315.84   -        -
Model 2 - M1+loadings invariance            1.60       0.96   0.03    0.96   -274.65   0.007    -41.19

Note: RMSEA= Root Mean Square Error of Approximation; NNFI=Non-normed fit index; AIC = Akaike Information Criterion; Δ=change

The Group Differences for Construct Means. Gender differences in the mean values

for the general factor of WOAQ and the mean values for the five nested factors were

considered with the female group means selected as the reference level. The construct means

for the female group were therefore set to zero while the construct means associated with the

male group were estimated, providing an estimate for the mean differences between groups

for the constructs.

After setting equality constraints on the loadings and intercepts for the measured

variables, with factor intercepts of zero for female employees, the results showed a marginal

fit for the model (RMSEA=0.05, NNFI=0.917). The mean differences between the male and

female groups were significant on two of the nested constructs (‘co-worker’ and ‘reward-

recognition’) and also for the general factor of WOAQ. The Z score results showed that the

mean scores on these two nested constructs and the general factor of WOAQ are significantly

higher for male employees than for female employees. In the next chapter the implications of

these results are discussed.

8

STUDY 1: DISCUSSION

In this chapter we start by considering the previous WOAQ bifactor model results

obtained for the nursing sample. We then consider this model in the case of the paramedics

sample and, in particular, we probe the implications of the gender differences that were

exposed.

8.1 Discussion: Study of Nurses-Validation of Bifactor Model of WOAQ

The most common problems detected in the literature on full risk assessment relate to

the length of questionnaires. These are either very long or detailed or they are unable to detect

the hazardous nature of any identified problems in a work setting. In response to the

evidenced need for a short, valid risk assessment, the WOAQ was developed.

The Work Organisation Assessment Questionnaire (WOAQ) (Griffiths et al., 2006) seems

to overcome these problems with its short length (28 items) and yet comprehensive content.

The WOAQ seeks to identify and collect employees’ opinions on their work and health

(Griffiths, Cox, Karanika, Khan, & Toma, 2006). The WOAQ was originally developed for a

manufacturing setting but it is widely used in non-manufacturing settings without having been

properly validated in these new settings.

The present research examines the validity and model-based reliability of WOAQ for

a group of Australian employees using the conventional higher order model and a bifactor

model using CFA. The WOAQ higher order model included a second-order factor and five

first order subfactors, each representing different dimensions of work organisation risk

assessment. The five subfactors are: ‘quality of relationships with management’, ‘reward and

recognition’, ‘workload issues’, ‘quality of relationships with colleagues’, and ‘quality of

physical environment’. The bifactor model of WOAQ included a general measure of WOAQ

and the above five subfactors.

Previous studies, using higher order modeling of WOAQ, failed to validate the model,

reporting a poor fit (e.g. Waynne-Jones, 2009). The present study therefore considered a

bifactor model of WOAQ, and compared this model with the conventional higher order model

of WOAQ. Based on previous studies, bifactor models in general demonstrate superior fit

over the higher order models (e.g. Gignac, 2007, 2014; Reise, Bonifay, & Haviland, 2012;

Reise, 2012). In spite of their importance in the context of organisational studies, evaluation

of bifactor models such as WOAQ is something quite new for the organisational psychology

discipline. While a conventional model of WOAQ provides an indirect relationship between

the higher order construct and the items, a bifactor model of WOAQ provides a full first order

multidimensional model, where both the general factor of WOAQ and its nested subfactors

are evaluated with a direct relationship to the WOAQ items. In addition, using the model-based

omega reliability coefficients, more valuable information can be obtained about the internal

consistency of each construct. The results of this study revealed the superiority of the bifactor

model over the conventional higher order model. In addition, very important differences were

found between the higher order model and the bifactor model. The most important difference

was detected when the conventional higher order model failed to recognise the low and

differentially directed loadings of the ‘quality of relationship with management’ items. The

results of the bifactor model showed that the subfactor for ‘quality of relationship with

management’ was poorly defined independently of the general measure of WOAQ. The

subfactor of ‘reward and recognition’ was found to be implausible in both models. Given the

fact that high correlations were observed between these two subscales and their high

dependency on the general factor of WOAQ, the reliability of these two subscales clearly relies

more on the variation in the general factor of WOAQ than on any sub-factor.

These results have important practical implications. They show that in the context of

community nursing, although the general measure of WOAQ is a valid and reliable measure

for organisational risk assessment, the most important plausible subscales are, in order, ‘quality of

physical environment’, ‘workload issues’ and ‘quality of relationship with colleagues’.

Based on the findings, focusing on the two subscales of ‘the quality of

relationship with management’ and ‘reward and recognition’ without considering the general

WOAQ indicators will be unlikely to lead to any significant improvements. In contrast, the

other three sub-constructs, especially ‘the quality of work environment’, seem to have

significant unique reliability, independent of the general factor of WOAQ. In practice, this

means that any intervention to improve only the work environment would still have

significant effects on the level of perceived risks in workplaces.

Unfortunately, lack of previous studies makes it difficult to compare the findings in

other health areas. The majority of previous studies on WOAQ have been conducted in

manufacturing settings using the conventional higher order model procedure (e.g. Griffiths, et

al, 2006; Waynne-Jones, 2009). However, close evaluation of the work setting of nurses,

indicates that these findings should not be much of a surprise. These findings fit with the

nature of the community-nursing work environment. The reason behind this is that although

the nurses belong to a large organisation, they work in different, small branches with their

own immediate managers/supervisors. In such an environment, there is a more informal

relationship between the nurse and manager/supervisor. The relationships in community

nursing settings are more colleague-colleague relationships rather than nurse-manager

relationships so it should be expected that ‘the quality of relationship with management’

would be unimportant. Also ‘the reward and recognition’ factor is strongly tied to the

management relationship and only items representing a variety of tasks, opportunity for

learning and using the new skills appeared as important indicators of this subscale. Thus, in

practice, if an organisation wants to make risk management improvements, the main

plausible subfactors to look at are the physical work environment, the relationship with

colleagues and managing the workloads.

The bifactor model of WOAQ could also have some critical cost and efficacy

implications in the workplace. For example, consider the situation where there are limited

budgets or resources to be allocated to improve the overall quality of the work organisation,

or, if it is not feasible or realistic to change all the subconstructs of risk in the workplace

simultaneously. Using a bifactor model one can separate the specific effects of each subfactor

from the general factor of WOAQ and determine the most plausible construct for an

immediate, more feasible intervention. In more costly or complicated situations, the

practitioners or policy makers could take advantage of such bifactor modeling to determine

the most plausible sub-constructs for achieving improvement in the short-term.

Unsurprisingly, the results also indicated that the conventional reliability coefficient of

alpha was overestimated (though slightly) compared to the omega total and omega

hierarchical coefficients. This is consistent with results of previous studies in other disciplines

(Gignac, 2007, 2014; Reise, Bonifay, & Haviland, 2012). These scholars indicated that in

models violating the assumptions required for a reliable alpha, the coefficient will often be

overestimated. Therefore, it is highly recommended that in future studies, and especially

for complicated multidimensional models, scholars should by default use model-based

reliability coefficients. Although this is deemed to be more critical in clinical or health

studies, overall it is important for scholars of all disciplines to use these more accurate

reliability assessments in order to avoid serious errors.

8.2 Discussion: Study of Paramedics-Cross Validation of Bifactor Model of WOAQ

The cross validation results obtained using the paramedics sample indicate that the

bifactor WOAQ model is a valid tool in another very different health setting. It appears that

using a bifactor model presents not only a better fit than a higher order model, but also

highlights the importance of the subscales relative to a single general factor in a health setting.

Moreover the results suggest that this model can be used with both male and female

paramedics. However, although these results demonstrate good validity for the WOAQ across

gender, the mean differences between males and females were found to be significant. The

results showed that the scores on two of the five nested constructs and the general factor of

WOAQ were significantly higher for male employees compared to female employees,

demonstrating that male employees are happier with the ‘quality of relationships with co-

workers’ and the system of ‘reward and recognition’, as well as the general quality of WOAQ

than female employees in this paramedics organisation. The results have important

implications in practice, and specifically in relation to the occupational health and safety of

paramedics, particularly female paramedics.

Overall, the WOAQ, especially the general WOAQ measure, appears to be a superior

instrument for assessing risk factors associated with employees’ health and health-related

behaviour, due to its satisfactory psychometric properties and short length. More importantly,

based on the results from the bifactor analysis, it was shown that some of the subscales are

more important than others in a health setting. This indicated that concerns relating to the

importance of an identified problem in the work setting can be solved by fitting the bifactor

WOAQ model to assess the importance of various risk factors in the workplace. Ultimately,

this will assist management in identifying problem areas which may cause harm to their

employees and the organisation, and thus allow proper action to be undertaken in order to

improve the work environment.

The WOAQ tool can be especially useful when access to specialist occupational health

support is limited. This is because WOAQ is short and easy to use and, in more practical

terms, can directly inform workplace interventions to improve employee health and well-

being. Furthermore, by directly informing the development of targeted workplace

interventions to improve the psychosocial factors and work conditions for paramedics, the

WOAQ offers a potential to help avoid the structural labour force shortages experienced in

this area, especially in the developed world.

8.3 Strengths and Limitations

The strengths of the study lie in the context of the research and the methodology used.

These can be elaborated as five key points.

Firstly, to the best of the researchers’ knowledge, this study is one of the first that is

comparing a conventional higher order model with a bifactor model of WOAQ in a health

setting. The methodology used also has theoretical and practical implications in other

organisational studies. The conventional higher order modeling is based on full mediation of

item effects by first-order sub-constructs. In practice and real life situations, and especially in

organisational studies such as WOAQ, this has limited applicability. Bifactor modeling

assumes partial mediation which is much closer to the reality.

Depending on their nature of work (e.g. manufacturing vs. non-manufacturing) and

occupation types, organisations will have significant differences in regard to WOAQ. A risk

assessment tool like WOAQ is a very useful tool for assessing the organisational risk factors.

However in practice, not all of the WOAQ subfactors are plausible or important, as was found

in this study. Therefore, in the work setting, bifactor modelling of WOAQ is deemed to be

more appropriate as the results relate well to real life expectations.

The second key point is that this study has considered only the most suitable fit indices,

based on the degree of penalty included for model complexity. These indices (i.e. RMSEA,

NNFI/TLI, & AIC) and differences between these indices have been demonstrated in this

empirical study for interpreting the complex model of WOAQ.

The third key point is that the study has used model-based reliability coefficients.

Taking into account the multidimensional nature of WOAQ, i.e. both the general and the five-

factor model of WOAQ, omega reliability coefficients have been used to assess measurement

reliability. Using the omega model-based reliability measure rather than the conventional

coefficient alpha is recommended for multidimensional models such as the WOAQ.

The fourth key point is that this is one of the first studies that has been conducted for a

group of Australian employees in a health setting as opposed to a manufacturing setting for

which the original WOAQ was developed. No previous studies have been completed in a

health setting using a comprehensive, short scale of risk assessment similar to the WOAQ.

This study therefore initiates a critical avenue for more research in WOAQ.

As the final key point, the WOAQ is a useful tool in practice because of its ability to

provide organisational risk assessment using only 28 items. This meets workplace

requirements in terms of cost, time and resources. Using bifactor modeling the most plausible

subfactors were identified for improving the organisational risk environment in a health

setting.

One of the limitations in this study is that it focuses only on health professionals.

Further studies are needed to expand the concept to other non-manufacturing or ‘blue collar’

occupations.

In spite of the importance of the omega reliability coefficient, there is still no detailed

guideline on the cutoff points for interpreting omega for general scales and for subscales.

Although Reise et al. (2012) suggested a minimum cutoff point of greater than .50, this is not yet

backed up by any significant evidence. Further studies are needed to shed more light on this.

The lack of background literature in an Australian context for the use of bifactor

modeling of WOAQ makes it difficult to evaluate or compare the results with other studies.

Further studies are needed to fill this gap.

8.4 Summary and Conclusion

In this study, attempts were made to assess the validity and reliability of WOAQ in an

Australian health setting, using robust methodological procedures. Based on the literature,

several robust procedures were adopted for assessing the validity of WOAQ, including a

comparison of the conventional higher order model of WOAQ with a bifactor model of WOAQ and

the testing of model-based reliability.

In general, results showed that the WOAQ appears to be a superior instrument for

assessing risk factors associated with employees’ health and health-related behaviour due to

its satisfactory psychometric properties and short length. Although the general factor of

WOAQ seems to be the dominant factor, some evidence of multidimensionality was found

and some subfactors appeared to play more critical roles in risk assessment in a nursing

setting. The cross validity of the scale on a paramedic sample was demonstrated when these

results were replicated in another very different health setting. However, interesting

differences in mean values for male and female paramedics indicated that this was a gender-

sensitive assessment tool.

In conclusion, this study adds to the evidence supporting the feasibility of the WOAQ

for both research and practice in a range of settings. However, future research should continue

to validate the WOAQ with other occupational groups and sectors using a bifactor model.

9

STUDY 2: APPLICATIONS OF COVARIATE-DEPENDENT RELIABILITY

The purpose of Chapters 9 to 11 is to empirically demonstrate Bentler’s (2014)

approach to covariate-dependent reliability. There are two main proposed applications: the

first application considers the effects of potential covariates on scale reliability, the second

demonstrates the effects of Common Method Bias (CMB) on scale reliability. Different data

sets were used for these two applications. The WOAQ data was used for the first application.

For the second application, a student study of social desirability, emotional intelligence,

wellbeing and alcohol drinking behaviour was used. These applications are described in this

chapter but the actual analyses are left until Chapter 10 with the discussion following in

Chapter 11.

9.1 Rationale and Objectives

9.1.1 Application of Covariate-dependent Reliability in Reliability Assessments

In 2012 (personal conversations), Bentler introduced the concept of covariate-

dependent and covariate-free reliability, which partitions total reliability into two parts. The first

part relates to external covariates (covariate-dependent reliability) and the second part is unaffected by such covariates

(covariate-free reliability). The approach was officially presented in 2014 (Bentler, 2014).

The following material on covariate-dependent reliability was adapted from either personal

conversations with Bentler (2012, 2013) or Bentler (2014). Only the practical application of

this concept was assessed in this study, using the data previously described in Chapter 7 and a

second student data set relating to social desirability, emotional intelligence, wellbeing and

alcohol drinking behaviour.

9.1.1.1 First Application of Covariate-dependent Reliability.

Based on the above development, covariate-dependent reliability and covariate-free

reliability can be evaluated for the bifactor model of WOAQ, using the nursing/paramedic

group variable as a covariate. Although the model-based reliability of WOAQ has been

found to be acceptable in the nursing and paramedic organisations (within organisation

assessment), an evaluation of reliability across organisations has yet to be established. It is

hypothesised that although both organisations are health related, due to differences in the

nature of work and different demographic characteristics of the paramedics and home-based

nursing organisations, the type of organisation will affect the reliability of the WOAQ. Hence,

the home-based nursing organisation and the paramedic organisation must be compared in a

reliability assessment of the bifactor model as illustrated in Figure 9.1.

Figure 9.1. Covariate-dependent reliability assessment with the bifactor model of WOAQ across the nursing and paramedic organisations.

Both nursing and paramedic occupations can be categorised as providing clinical care;

however, the nature and demands of these two occupations are very different. A clinician in

the home-based nursing service provides services over a period of time that is defined by a

client’s needs. While these clients may have acute clinical needs, they are generally medically

stable, and often have been discharged from hospital as they no longer require the acute

clinical care provided in the hospital setting. In contrast, paramedical practitioners are called

to respond quickly to clients in need of urgent medical care. The paramedics have short,

intensive interactions with their clients who are often acutely ill. Clearly, the demands and

expectations are different for each of these professions, and consequently for the organisations

in which they work. While both professional groups complete most of their work away from

their formal organisational settings, the nature of their interactions with their clients is

fundamentally different. Typically, the nurses interact with the clients in their own homes

while paramedics work with patients in a wide variety of settings where urgent medical

response is required. The nurses have the opportunity to ‘get to know’ the clients and interact

with them over time, while the paramedics normally interact for only a single short term

episode, during which the clients may not even be responsive.

In addition to the different nature of work and workplace demands, the two organisations

have different demographic characteristics. For example, in this study the majority of home-

based nurses are female while the majority of paramedics are male. Also, in comparison to

the community nursing organisation, the paramedics’ organisation has more part-time workers

and a significantly lower average age of workers.

When there is a group covariate, such as organisation type, that affects a latent factor

(WOAQ in this case), the question is whether there are mean differences in the latent factor as

a function of the group covariate. As mentioned previously, covariate-dependent reliability is

a measure of the group differences in the trait being measured relative to total variation

(Equation 3.12). Covariate-free reliability is a measure of the individual differences relative to

total variation, freed from any mean differences due to the covariate(s) (Equation 3.11). None

of the Omega reliability coefficients based on the WOAQ bifactor model introduced

previously have partitioned the variance into its covariate-dependent and covariate-free parts.

Based on the above information, it is rational to argue that due to differences in the

nature of the work and the demographic characteristics of paramedic and home-based nursing

organisations, the type of organisation will influence the reliability of work organisation

assessments such as the Work Organisation Assessment Questionnaire (WOAQ), that

measure psychosocial/physical aspects of an organisation. As a result, it was hypothesised

that:

Hypothesis 9.1: The type of organisation (home-based nursing vs. paramedic) will be

one of the possible covariates affecting model-based reliability coefficients of the WOAQ.

Method. The data used for Study 1 (home-based nursing and paramedics) were used

to demonstrate the application of a covariate-dependent (here organisation-dependent)

reliability assessment of WOAQ.

The procedure proposed by Bentler (2014), and fully discussed in Chapter 3, was used

to calculate the covariate-dependent and covariate-free coefficients of WOAQ in this study.

This procedure is only available in EQS, and only for higher-order models. The calculation for

bifactor models is not yet implemented in EQS; therefore, all the calculations for the bifactor

model were conducted manually.

9.1.2 Second Application of Covariate-dependent Reliability for Demonstrating CMB

There is a general belief among scholars that measurement error is a source of many

problems in research (Podsakoff, MacKenzie, Lee, & Podsakoff, 2003). Measurement error

has the capacity to misrepresent and confound the empirical findings of research, causing

erroneous conclusions to be drawn (Bagozzi & Yi, 1991). This issue becomes more salient

when researchers rely on a single source of data collection and self-report measures (Glick,

Jenkins, & Gupta, 1986; Meade, Watson, & Kroustalis, 2007). The widespread use of single-

source data collection tools is a potential concern in relation to common method variance

(CMV), which has been of interest to psychology since the 1950s.

One of the more popular and most convenient procedures for collecting data in

psychology is the self-rated questionnaire (Malhotra, Kim, & Patil, 2006). It is also a common

practice in psychology that data is gathered from a single self-rated questionnaire (Avolio,

Yammarino, & Bass, 1991). As a result, CMV appears to be a common problem in

psychological studies (Malhotra, et al., 2006). Yet, despite its long history in the field of

psychology, there seems to be a gap in the literature on CMV. The literature has paid more

attention to post-hoc statistical remedies, while the causes of the bias have been neglected.

Although a few researchers have investigated the consequences of CMV for

measurement models, only a limited number of studies have tried to

determine the effects of CMV on reliability (Williams et al., 2010). The goal of the present

study was to address the current CMV gap in the literature, using the newly developed

procedure termed ‘covariate-dependent and covariate-free reliability’ (Bentler, 2014). Using

this procedure, CMV was introduced as a covariate for the study scales. If any covariate-

dependence is identified among the study scales, then we can conclude that CMV exists and

needs to be controlled for in the data analysis.

Acquiescence is a potential source of the common method bias (CMB) that results

from self-report surveys (Spector, 2006). According to Winkler, Kanouse and Ware (1982),

acquiescence response sets refer to the propensity of the respondent to indicate agreement

with items on the questionnaire independent of content.

Results from self-report measures are also susceptible to social desirability bias. Social

desirability bias describes the inclination of the respondent to complete the questionnaire in a

way which enables them to be presented in a positive light and to be in line with the norms

and standards defined by their culture (Donaldson & Grant-Vallone, 2002; Ganster,

Hennessey, & Luthans, 1983; Podsakoff & Organ, 1986). Their responses to questions are

usually determined by the level of social desirability (Schriesheim, Kinicki, & Schriesheim,

1979) inherent in the items of a questionnaire. This form of bias usually serves to hide the true

bivariate relationships between variables and interferes with the interpretation of average

tendencies as well as individual differences (Ganster, et al., 1983; Podsakoff, et al., 2003).

A preventive technique for detecting and controlling CMB can be used when the

assumed cause of the method bias is known to the researcher and can be identified and

measured. For example, this is commonly the case for social desirability. This preventive

technique involves the inclusion of the CMB measure as a covariate with the study variables.

This also allows the effects of the surrogate measure for CMB (e.g. a social desirability scale)

on the reliability of the study measures to be assessed.

9.1.3 The Effects of CMB and CMV on reliability of measures

CMB and CMV may affect the validity of a study (Doty & Glick, 1998). CMB and

CMV have the ability to confound the true relationship between variables, resulting in a bias

between the observed and the true relationships by either inflating or deflating the estimates

(Doty & Glick, 1998). Although assessing the presence and quality of CMB provides

important information on the effects on the parameter estimates, it can also be used to

demonstrate the effects on scale reliability. Williams et al. (2010) proposed that the estimation

of the reliability should be achieved by evaluating the decomposition of the overall reliability,

both with and without a “marker variable” to reflect CMB. The reliability decomposition

(composite reliability) formula was originally proposed by Werts, Linn, and Joreskog (1974).

\[
\text{Overall reliability} = \frac{F}{F + E} \qquad \text{(Equation 9.1)}
\]

[F = the squared sum of the factor loadings; E = the sum of the error variances]

The composite reliability in this instance is equal to coefficient rho for a single factor

model and Omega total (Equation 3.9). For the present study, Bentler’s (2014) approach was

used to compare the reliability of models with and without the presence of a CMB marker.

The procedure is simple and easy to calculate using EQS (see Chapter 3 for full details on this

procedure). Note that, as explained earlier, when rho is used to represent the reliability of a

multi-dimensional model it is quantifying the proportion of variance due to the most reliable

single dimension in this multidimensional space. Using this procedure, one can obtain

estimates of the CMB-dependent reliability, the CMB-free reliability, and the total reliability

in one calculation.
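The composite reliability formula of Equation 9.1 can be sketched in a few lines of Python. This is an illustrative sketch only: the function name and the loadings are hypothetical, and the error variances are derived under the assumption of standardized indicators (error variance = 1 minus the squared loading). It is not the EQS procedure used in the thesis.

```python
def composite_reliability(loadings, error_variances):
    """Composite reliability F / (F + E) as in Equation 9.1."""
    f = sum(loadings) ** 2        # F: squared sum of the factor loadings
    e = sum(error_variances)      # E: sum of the error variances
    return f / (f + e)


# Hypothetical standardized loadings for a three-indicator factor
loadings = [0.7, 0.8, 0.6]
# Assumption: standardized indicators, so error variance = 1 - loading^2
error_variances = [1 - l ** 2 for l in loadings]

rho = composite_reliability(loadings, error_variances)
print(round(rho, 3))
```

Larger loadings raise F relative to E, so the coefficient approaches 1 as the indicators share more variance with the factor.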

This method was used to assess the influence of social desirability on the reliability of

the constructs used in a student study of emotional intelligence, wellbeing and alcohol

drinking behaviour. It was hypothesised that:

Hypothesis 9.2: A covariate-dependent reliability assessment, using social desirability as a

potential source of bias, will demonstrate the effects of CMB on the reliability of the

constructs in the study (Figure 9.2).

The main constructs in the data described below contained sensitive questions

(emotional intelligence, wellbeing, and alcohol drinking habits), sometimes prompting

participants to demonstrate socially acceptable responses rather than presenting their true

opinions. Socially desirable responses could therefore lead to bias for the estimated

relationships between the constructs of the study (Ganster, Hennessey & Luthans, 1983).

Based on the literature, if factor or item responses are highly correlated with social

desirability, then social desirability could be a potential source of bias that needs to be

controlled (Podsakoff et al., 2003; Thomas & Kilmann, 1975).

However, the above model can be adapted to control for CMV due to a single survey

source as well as CMB, using SEM procedures. By integrating both an unmeasured latent

variable for CMV and a directly measured latent method factor representing CMB into the

SEM model, CMV and CMB can be evaluated simultaneously. In the context of the above

student survey, this method can be used to test for rater bias, including social desirability, as

well as a single source survey bias. It was therefore hypothesised that:

Hypothesis 9.3: A SEM integrated approach, including an unmeasured latent common

method factor and a directly measured method factor (social desirability), can be used to

evaluate the presence of CMV in the above context.

9.2 Method

The data for this study were collected from a group of undergraduate students in one

faculty of the participating university. Participant groups were randomly selected, using a list

of all the active subjects in the faculty for Semester Two in 2011. Upon receiving the

lecturer’s consent, a questionnaire package that included a cover letter, information sheet,

consent form, and a questionnaire, was provided to each student during their lecture break.

The information sheet provided assurances that all participant information would remain

confidential. Upon completion of the survey, students were asked to place their questionnaires

in the locked box provided in the classroom. After discarding the incomplete surveys, the

final number of surveys included in the analysis was 341.

9.2.1 Measures. The questionnaire contained questions relating to wellbeing, alcohol

drinking behaviour, emotional intelligence, social desirability and demographics.

Emotional Intelligence. Emotional intelligence was measured using the 33-item Self-

Report Emotional Intelligence Test (SREIT) (Schutte, Malouff, Hall, Haggerty, Cooper,

Golden, & Dornheim, 1998). On a five-point Likert scale, respondents were asked to self-

report their preferences on a scale from 1 (strongly agree) to 5 (strongly disagree). The

reliability and validity evidence for this scale has been positively assessed in previous studies

(e.g., Schutte & Malouff, 1999; Abraham, 1999; Ciarrochi, Chan, & Caputi, 2000; Petrides &

Furnham, 2000).

General Wellbeing. General wellbeing was tested using the General Wellbeing

Questionnaire (GWBQ) (Cox, Thirlaway, Gotts, & Cox, 1983). The GWBQ is a 24-item

instrument used to measure sub-optimal health, using self-reported symptoms of general

malaise. It includes a set of general non-specific symptoms of ill-health, including reportable

aspects of cognitive, emotional, behavioural, and physiological function, none of which are

clinically significant in themselves. The GWBQ consists of two 12-item subscales of sub-

optimum health: (a) worn-out/exhausted and (b) tense/nervous. Respondents were asked to

indicate how often they had experienced the listed 24 symptoms within the previous six

months on a scale from 0 (never) to 4 (all the time).

Social Desirability. The 16-item Social Desirability Scale (SDS-16) (Stöber, 2001)

was used to measure the social desirability of the respondents. The scale is presented with six

reverse-keyed items. The original scale has 17 items, but the item “I have tried illegal drugs”

(e.g., marijuana, cocaine, etc.) was excluded because it is not suitable for the measurement of

social desirability (Stöber, 2001). The items were parcelled into three scales in order to

achieve SEM model identification.

Alcohol Drinking Behaviour Screening. The World Health Organisation’s Alcohol

Use Disorders Identification Test (AUDIT) is a tool used for screening alcohol drinking

behaviour. AUDIT was originally developed by Saunders, Aasland, Babor, de la Fuente and

Grant in 1993 and has been validated extensively across different populations. It consists of

three items on alcohol consumption, three on drinking behaviour and dependency, and four on

the consequences or problems related to drinking. The items were parcelled into three parcels to

achieve model identification.

9.2.2 Overview of analysis. Confirmatory factor analysis (CFA) was conducted to evaluate

the proposed models. EQS 6.2 (build 100) and standard fit indices (CMIN/DF, CFI, NNFI,

and RMSEA) were used to evaluate the model fit. For reliability assessment and comparison,

coefficient Omega, and covariate-dependent and covariate-free reliability coefficients were

calculated as described in chapter 3.

To evaluate hypothesis 9.3, both constrained (equal-method factor loadings) and

unconstrained (free-method factor loadings) models were assessed to find out if CMV exists

and whether it has equal effects on the constructs of the study. Recently, a partial correlation

technique was introduced by Lindell and Whitney (2001) that can be used to test for CMV. In

this procedure, a ‘marker variable’ representing CMV was included in the analysis. Using a

partial correlation procedure, the association between the marker variable and any construct in

the model is used as an estimate of CMV. This allows all correlations among the constructs of

the study to be corrected for CMV using a partial correlation adjustment (Williams, Hartman,

& Cavazotte, 2010). This method is called the correlational marker technique. Building on the

partial correlation procedure of Lindell and Whitney (2001), further development has been

carried out by Richardson et al. (2009) and Williams, Hartman, & Cavazotte (2010) using a

structural equation modelling procedure for capturing and adjusting for CMV. This marker

variable procedure using CFA was employed to evaluate hypothesis 9.3.
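The partial correlation adjustment behind the correlational marker technique can be illustrated with a short Python sketch. It assumes the commonly cited form of the Lindell and Whitney (2001) correction, in which the marker correlation is subtracted from the observed correlation and the result is rescaled by one minus the marker correlation. The function name and the input values are hypothetical and are not taken from the study data.

```python
def cmv_adjusted_correlation(r_observed, r_marker):
    """Partial-correlation adjustment for CMV (assumed Lindell-Whitney form).

    r_observed: observed correlation between two substantive constructs
    r_marker:   correlation used as the estimate of CMV (marker variable)
    """
    if r_marker >= 1.0:
        raise ValueError("marker correlation must be below 1")
    return (r_observed - r_marker) / (1.0 - r_marker)


# Hypothetical values, for illustration only
adjusted = cmv_adjusted_correlation(r_observed=0.40, r_marker=0.10)
print(round(adjusted, 3))
```

The CFA marker approach used for Hypothesis 9.3 replaces this single correction with method-factor loadings that are estimated within the model.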

In SEM by default, ML is used for parameter estimation. When the sample size is

large and data is normally distributed, ML provides the most accurate estimation with the

smallest standard errors (Bentler, 2006). However, ML is sensitive to departures from

normality. Therefore, assessment of normality is an essential requirement when using this

procedure. Although the preliminary assessment of the data showed a relatively normal

distribution for the data, Mardia’s normalised coefficient was high, (G2, P) = 216.79,

indicating a violation of normality assumptions. Outliers were detected in a further analysis;

however, the deletion of these observations did not result in a significant improvement in the

fit indices. As a result, all cases were kept and a suitable, non-parametric test was used to

evaluate the model. The Satorra-Bentler (1988, 1994) chi-square test delivers a more accurate

assessment of model fit when the data does not have a normal distribution.

10

STUDY 2: RESULTS

In this chapter the two applications described in the previous chapter are applied. The

application relating to reliability assessment is illustrated using the WOAQ data and the

application relating to CMB (in the form of social desirability) and CMV (in the form of a

single survey source), is illustrated using the student data for emotional intelligence, well-

being, alcohol drinking behaviour and social desirability.

10.1 Results of Application for Reliability Assessments – The study of WOAQ

Because hypothesis 9.1 states that the type of organisation will affect the reliability of

the WOAQ, organisation was added to the validated bifactor model of the WOAQ as a

covariate, allowing the evaluation of the effect of organisation on the reliability of the WOAQ

assessment tool (see Figure 9.1).

10.1.1 Descriptive statistics at item level

At the first stage, the validity of the model was assessed before proceeding with the

reliability assessment. As shown in Table 10.1, the data at item level is relatively normal. All

skewness and kurtosis coefficients were below two and seven in absolute value, respectively, demonstrating reasonable

normality at item level (West, Finch, & Curran, 1995).
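The item-level screening rule described above can be sketched as follows. This is a minimal illustration, assuming simple moment-based estimators of skewness and excess kurtosis; statistical packages usually apply small-sample corrections, so their values can differ slightly. The function names and the response vector are hypothetical.

```python
def skewness_and_excess_kurtosis(values):
    """Moment-based skewness and excess kurtosis (no small-sample correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n   # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n   # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n   # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


def within_cutoffs(skew, kurt, max_skew=2.0, max_kurt=7.0):
    """Apply the cut-offs cited in the text (|skewness| < 2, |kurtosis| < 7)."""
    return abs(skew) < max_skew and abs(kurt) < max_kurt


# Hypothetical responses to a single five-point item
responses = [1, 2, 2, 3, 3, 3, 4, 4, 5]
skew, kurt = skewness_and_excess_kurtosis(responses)
acceptable = within_cutoffs(skew, kurt)
print(skew, kurt, acceptable)
```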

10.1.2 Descriptive statistics at group level

As discussed before, the multivariate normality in EQS can be evaluated using

Mardia’s multivariate skewness and kurtosis tests (Bentler & Wu, 2002). Because Mardia’s

coefficient ((G2, P) = 116.64; normalised estimate = 50.44) indicated violation of the

normality assumptions, non-parametric tests were used to evaluate the model.

Table 10.1 The Descriptive Characteristics of the Main Study Constructs and Parameters (n=1255)

Constructs                                          Mean   SD     Skewness   Kurtosis
WOAQ - quality of relationships with management     2.72   .90    .27        -.52
  Item 3                                            3.18   1.04   -.09       -.57
  Item 5                                            3.10   1.27   -.06       -1.09
  Item 7                                            2.63   1.13   .20        -.76
  Item 11                                           2.42   1.21   .44        -.86
  Item 16                                           2.15   1.19   .79        -.41
  Item 17                                           2.85   1.06   .08        -.60
  Item 22                                           3.06   1.13   -.03       -.79
  Item 26                                           2.52   1.12   .18        -.89
  Item 27                                           2.65   1.13   .19        -.75
WOAQ - reward & recognition                         2.60   .82    .41        -.41
  Item 12                                           2.23   1.09   .54        -.58
  Item 13                                           2.62   1.18   .29        -.90
  Item 14                                           3.32   .97    -.17       -.26
  Item 21                                           2.35   1.02   .29        -.60
  Item 23                                           2.63   1.18   .26        -.93
  Item 24                                           2.08   1.18   .82        -.38
  Item 25                                           2.96   1.07   -.07       -.62
WOAQ - workload issues                              2.65   .90    .09        -.49
  Item 6                                            2.47   1.14   .38        -.82
  Item 8                                            2.30   1.08   .46        -.70
  Item 15                                           2.56   .79    .48        1.69
  Item 19                                           2.90   1.12   .00        -1.2
WOAQ - quality of relationships with colleagues     3.92   .85    -.46       -.49
  Item 10                                           3.93   .88    -.42       -.54
  Item 28                                           3.92   .94    -.60       -.25
WOAQ - quality of physical environment              2.54   .81    .45        -.16
  Item 1                                            2.60   1.26   .39        -.94
  Item 2                                            2.73   1.12   .31        -.69
  Item 4                                            2.46   .97    .75        .40
  Item 9                                            2.55   1.10   .47        -.54
  Item 18                                           2.42   1.05   .56        -.37
  Item 20                                           2.49   1.08   .38        -.53
Total                                               2.73   .71    .32        -.29

Table 10.2 summarises the characteristics of both the nursing and paramedic

organisations showing some demographic differences between the organisations. In terms of

gender, there was a greater percentage of females in the nursing organisation (94.5%), while

there were more males in the paramedic group (65.9%). The majority of the paramedics were

part-time employees (94.7%), had more than six years of experience (62.6%), and their

average age was lower than that of the nurses (40 vs. 45 years old).

Table 10.2 Nursing and Paramedic Demographic Characteristics

                        Nursing (n=312*)    Paramedics (n=945)
                        Frequency (%)       Frequency (%)
Gender
  Male                  17 (5.5)            623 (65.9)
  Female                290 (94.5)          322 (34.1)
Employment status†
  Part-time             183 (60.0)          895 (94.7)
  Full-time             123 (40.0)          48 (5.1)
Years of experience
  < 1 year              2 (.7)              92 (9.8)
  1-3 years             7 (2.3)             127 (13.5)
  4-6 years             16 (5.2)            133 (14.1)
  > 6 years             282 (91.9)          588 (62.6)
Age, Mean (Range)       45 (22-77)          40 (21-65)

Note. * Due to some missing data, n varies between 306 and 312. † Due to their very low percentage (7% in the nursing organisation and 1.3% in the paramedic organisation), casual employees have been allocated to part-time categories.

Further analysis was conducted to see if the demographic differences between

organisations were statistically significant. Chi-Squared tests of association were carried out

to compare gender ratios, employment status ratios (part-time vs. full-time), and years of

experience between the two organisations. The results showed significant differences between

the paramedic and nursing organisations in terms of the gender of employees, level of

experience and employment status (p<0.05).
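As an illustration of the test of association, the following Python sketch computes a Pearson chi-squared statistic for the gender counts reported in Table 10.2. It assumes no continuity correction, and it compares the statistic with 3.841, the standard critical value for one degree of freedom at the 0.05 level. The thesis does not report the statistic itself, so the function name and the resulting value are illustrative rather than a reproduction of the original output.

```python
def chi_square_statistic(table):
    """Pearson chi-squared statistic for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2


# Gender counts from Table 10.2: rows = nursing, paramedics; columns = male, female
gender_counts = [[17, 290], [623, 322]]
chi2 = chi_square_statistic(gender_counts)

CRITICAL_VALUE_DF1_ALPHA_05 = 3.841   # chi-squared critical value, df = 1
significant = chi2 > CRITICAL_VALUE_DF1_ALPHA_05
print(round(chi2, 2), significant)
```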

Table 10.3 Mean Age Differences between Nursing and Paramedic Organisations

Organisation   N     Mean   SD      t       p
Nursing        308   45     9.54    7.77*   0.001
Paramedic      942   40     10.88

Note. * Equal variances were not assumed.

The results of the t-test showed that the mean age difference was significant, with the

paramedics being on average younger than the nurses (Table 10.3). It is therefore evident that

the two organisations have significantly different demographic characteristics.
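The unequal-variance (Welch) t statistic can be recomputed from the summary values in Table 10.3, as sketched below. The function name is hypothetical. Because the table reports means rounded to whole years, the result is expected to be close to, but not identical to, the reported t value.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom."""
    v1 = sd1 ** 2 / n1            # squared standard error, group 1
    v2 = sd2 ** 2 / n2            # squared standard error, group 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Summary statistics from Table 10.3 (means are rounded in the source)
t_stat, df = welch_t(45, 9.54, 308, 40, 10.88, 942)
print(round(t_stat, 2), round(df, 1))
```

With these rounded inputs the statistic is about 7.7, consistent in size and direction with the value in Table 10.3.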

10.2 Model Fit Evaluation

In the next step, the model fit was evaluated for the whole population (combined

nursing and paramedics) and separately for each organisation. Only if the fitted models

described the data well could the reliability assessment proceed.

The bifactor model of WOAQ was assessed separately for each organisation. Table

10.4 shows adequate model fit for the bifactor model for the nursing organisation (RMSEA =

0.04, NNFI = 0.93 as reported in Chapter 7) and the paramedics organisation (RMSEA =

0.05, NNFI = 0.93).

Table 10.4 Summary of Model Fit Statistics for the Baseline Bifactor Model of WOAQ across Organisations

Model         SB χ2/df   AIC      CFI    RMSEA              NNFI
Nurses        1.71       89.66    0.94   0.04 (0.04-0.05)   0.93
Paramedics    3.77       565.32   0.93   0.05 (0.05-0.06)   0.93

Note. RMSEA = Root Mean Square Error of Approximation; NNFI = Non-normed fit index.

The model fit for both organisations combined (Table 10.5) was also good (RMSEA =

0.03, NNFI = 0.96, CFI = 0.97), so it was appropriate to proceed with the reliability

assessment of the model.

Table 10.5 Summary of Model Fit Statistics of the Bifactor Models of WOAQ (n=1257)

Model SB χ2/df RMSEA NNFI CFI

0. Independent Model 52.91

1. Bifactor model 2.83 0.03 (.03-.04) .96 .97

Note. RMSEA = Root Mean Square Error of Approximation; NNFI = Non-normed Fit Index.

The model-based reliability coefficients of Omega total, Omega hierarchical, and

Omega subscales were calculated for each organisation separately. As shown in Table 10.6,

both organisations demonstrated high reliability for both Omega total and Omega

hierarchical, with Omega hierarchical representing the reliability of the general WOAQ

factor. The reliability for the Omega subscales of ‘quality of relationship with management’

and ‘reward and recognition’ were also similar for the two samples. However, the reliabilities

for ‘the quality of physical environment’, ‘the workload issues’ and ‘the relationships with the

colleagues’ were quite different in these two samples. The ‘quality of physical environment’

and ‘workload issues’ reliabilities were higher for the nurses, while the reliability for the

‘relationships with the colleagues’ construct for the paramedics was almost double that for the

nurses.
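To make the distinction between Omega total and Omega hierarchical concrete, the following Python sketch computes both from a small bifactor solution. It assumes the usual textbook definitions for a bifactor model with orthogonal factors and standardized loadings; the definitions actually used in the thesis are those given in Chapter 3. All loadings, factor labels, and function names here are hypothetical.

```python
def bifactor_omegas(general_loadings, group_loadings, error_variances):
    """Omega total and Omega hierarchical for an orthogonal bifactor model.

    general_loadings: loadings of every item on the general factor
    group_loadings:   dict mapping each group factor to its item loadings
    error_variances:  unique (error) variance of every item
    """
    general_var = sum(general_loadings) ** 2
    group_var = sum(sum(l) ** 2 for l in group_loadings.values())
    total_var = general_var + group_var + sum(error_variances)
    omega_total = (general_var + group_var) / total_var
    omega_hierarchical = general_var / total_var
    return omega_total, omega_hierarchical


# Hypothetical six-item scale with two group factors of three items each
general = [0.6] * 6
groups = {"A": [0.4] * 3, "B": [0.3] * 3}
# Assumption: standardized items, so error = 1 - general^2 - group^2
errors = [1 - g ** 2 - s ** 2 for g, s in zip(general, groups["A"] + groups["B"])]

omega_t, omega_h = bifactor_omegas(general, groups, errors)
print(round(omega_t, 3), round(omega_h, 3))
```

Omega hierarchical is never larger than Omega total, because it credits only the general factor with reliable variance.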

Table 10.6 WOAQ Reliability Statistics for Nursing (as reported in Chapter 7) and Paramedic Organisations

                                            Nursing             Paramedics
Constructs                                  ωt    ωh    ωs      ωt    ωh    ωs
General WOAQ                                .92   .87   -       .94   .89   -
Physical environment                        -     -     .51     -     -     .42
Relationships with colleagues               -     -     .35     -     -     .70
Quality of relationships with management    -     -     .16     -     -     .22
Reward & recognition                        -     -     .15     -     -     .13
Workload issues                             -     -     .39     -     -     .29

The covariate-free and covariate-dependent reliability coefficients are given in Table

10.7. The model-based reliability coefficient rho shows that although the WOAQ is very

reliable for the whole sample (coefficient rho = 0.95), some part of the reliability is

dependent on organisational type (covariate-dependent coefficient rho = 0.32). Based on the

results, the type of organisation accounts for around 33% of the reliability. This indicates that

once the organisation type is controlled, there is less consistency left in the WOAQ

(covariate-free coefficient rho = .63). This result suggests that different parameter estimates

might be required for the nursing and paramedic samples. This will be tested below.
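The partition of the combined-sample coefficient can be checked with simple arithmetic, as in the sketch below. It uses the three coefficients reported for the combined organisations (rho = .95, covariate-dependent rho = .32, covariate-free rho = .63). The function name is hypothetical, and the only assumption is that the two parts are additive up to rounding.

```python
def reliability_shares(total, dependent, free, tolerance=0.01):
    """Share of total reliability that is covariate-dependent and covariate-free."""
    # The two parts are expected to sum to the total, apart from rounding
    if abs((dependent + free) - total) > tolerance:
        raise ValueError("covariate-dependent and covariate-free parts do not sum to total")
    return dependent / total, free / total


# Coefficients reported for the combined nursing and paramedic sample
dependent_share, free_share = reliability_shares(total=0.95, dependent=0.32, free=0.63)
print(round(dependent_share, 3), round(free_share, 3))
```

The covariate-dependent share comes out at roughly one third of the total, in line with the proportion quoted in the text.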

Table 10.7 WOAQ Reliability Statistics across Organisations (n=1257)

Bifactor WOAQ              Combined organisations
Coefficient rho            .95
Covariate-dependent rho    .32
Covariate-free rho         .63

Model 1: Configural model with no constraints. The next step included fitting a

configural bifactor model for the nursing and paramedic organisations simultaneously, to

determine if the model was appropriate across organisations when no constraints were

imposed. Based on the results, the configural model (Table 10.8) showed marginal model fit

across organisations (RMSEA = 0.06, NNFI = 0.88), suggesting that there are indeed

significant differences in the parameter estimates for these two samples.

Model 2: Invariant loadings. However, after constraining the loadings to be equal for

both nursing and paramedics, the results showed good fit (RMSEA = 0.04, NNFI = 0.92).

Table 10.8 Invariance Testing Across Organisations for the Bifactor Model of WOAQ

Models                                       SB χ2/df   AIC       CFI   RMSEA           NNFI
Model 1 - Configural model, no constraints   3.73       1086.76   .90   .06 (.06-.07)   .88
Model 2 - M1 + loadings invariance           2.68       478.16    .93   .04 (.04-.05)   .92
Construct mean differences                   2.99       698.89    .95   .05 (.05-.06)   .94

Note. RMSEA = Root Mean Square Error of Approximation; NNFI = Non-normed Fit Index; CFI = Comparative Fit Index.

Group Differences for Construct Mean. The mean differences for the nursing and

paramedic organisations were therefore considered for the general factor of WOAQ and the

nested five sub-factors, with the paramedic organisation means selected as the reference

category. The construct means for the paramedic organisation were therefore set to zero,

while the construct means associated with the nursing were tested, providing an estimate for

the mean differences between the groups for all the constructs.

After setting equality constraints on loadings and intercepts for the measured

variables, with factor intercepts of zero only for paramedics, the results showed a good fit for

the model (RMSEA = 0.05, NNFI = 0.94). The mean differences between nurses and

paramedics demonstrated differences primarily on two of the nested constructs (‘co-worker’

and ‘workload issues’). The results showed that the mean scores on ‘the relationships with the

co-workers’ were higher for the paramedics than for the nurses, while the mean scores on

‘workload issues’ were lower for the paramedics, confirming that the parameter estimates do

differ for these two samples and explaining why the covariate-dependent reliability is so high.

The reasons for these differences will be explored in the discussion in Chapter 11.

But now we return to the second application in which the effects of CMB, measured

using a social desirability scale, and CMV due to a single survey source, are evaluated for a

student study of emotional intelligence, wellbeing and alcohol drinking behaviour.

10.3 Application in Demonstrating CMB using Social Desirability

As explained in the previous chapter, covariate-dependent reliability can be used for

common method bias evaluation. This section reports on the results of a different sample

(students) with the purpose of demonstrating possible CMB. The demographic

characteristics of the participants are presented in Table 10.9.

Table 10.9

Summary of the Demographic Characteristics of the Participants (n=341)

                     %
Gender
  Male               18.18
  Female             81.82
Study status
  Part-time          2.64
  Full-time          97.36
Age, Mean (SD)       20 (3.98)

The overall reliability of the model, with CMB (social desirability) as a covariate, is

illustrated in Figure 10.1.

Figure 10.1. The effects of latent factor bias (social desirability) on the reliability of the constructs. Note. Due to the identification issue, item parcelling was used for the emotional intelligence (EI) construct, allowing four EI subfactors to load as observed variables for this construct.

The effect of social desirability as a source of CMB was assessed by conducting

Bentler’s covariate-dependent and covariate-free reliability assessment procedures. If there

is a difference in the reliability coefficients of the constructs after including CMB as a

covariate in the model, then there is some degree of covariate-dependent reliability,

and we can therefore conclude that CMB, in the form of social desirability, has biased the

reliability of the constructs.

[Diagram labels for Figure 10.1: CMB-Social Desirability, F, Alcohol Habits, Emotional Intelligence, Wellbeing.]

Table 10.10

Comparison of the Covariate-dependent and Covariate-free Reliability Coefficients of the

Scales after Including CMB

Reliability coefficients rho (ωt)
              Overall reliability   Covariate-dependent reliability   Covariate-free reliability
All Scales    .84                   .18                               .66

As presented in Table 10.10, the CMB variable (social desirability) inflated the

reliability of the scales by around 27%. Removing the effect of CMB reduced the reliability of

the scales to only 0.66. This result suggests that CMB had a remarkable effect on reliability,

providing support for its existence in this study. Further analysis has been conducted in order

to also test for common method variance (CMV) using the model shown in Figure 10.2.
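One way to read the inflation figure quoted for Table 10.10 is as the covariate-dependent reliability expressed relative to the covariate-free reliability. The sketch below works through that arithmetic with the reported coefficients (.18 and .66). The interpretation is an assumption made for illustration, and the function name is hypothetical.

```python
def relative_inflation(dependent, free):
    """Covariate-dependent reliability as a proportion of covariate-free reliability."""
    if free <= 0:
        raise ValueError("covariate-free reliability must be positive")
    return dependent / free


# Coefficients reported in Table 10.10 for all scales
inflation = relative_inflation(dependent=0.18, free=0.66)
print(f"{inflation:.1%}")
```

Under this reading the inflation is about 27%, which matches the figure given in the text.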

Figure 10.2. The proposed model for evaluating CMB/CMV.

Note. CMV = common method variance, CMB = common method bias (social desirability).

[Diagram labels. Indicators: WORN, NERVE, ALC1, ALC2, ALC3, EI-FAC, EI-REG, EI-UND, EI-PER, Social Des 1, Social Des 2, Social Des 3. Latent factors: WELLBEING, ALCOHOL, EMOTIONAL INTELLIGENCE, CMV, CMB*. Disturbance: D5.]

159

Model 1. This is the baseline model, illustrated in Figure 10.3, in which the three study constructs (wellbeing, alcohol drinking behaviour, and emotional intelligence) are correlated with each other but the CMV and CMB weights are constrained to zero (i.e., method effects are not controlled for). It serves as the comparison model with no control for method bias or variance.

Figure 10.3. Model 1: Baseline model in which all the study constructs are correlated without controlling for CMV and CMB.

[Path diagram. Indicators: WORN, NERVE, ALC1, ALC2, ALC3, EI-FAC, EI-REG, EI-UND, EI-PER, Social Des 1, Social Des 2, Social Des 3. Latent variables: WELLBEING, ALCOHOL, CMV, CMB*. Other label: D5.]

Model 2. The second model, illustrated in Figure 10.4, was compared with the

baseline model. This is a constrained model in which CMV and CMB were included but the

loadings from CMV to the study indicators were constrained to have equal effects. CMB

(social desirability) was included in the model as a predictor of CMV. Social desirability was expected to be the main source of bias in the study’s self-rated questionnaire when asking about alcohol drinking behaviour and emotional intelligence skills. The model nevertheless controls both for CMB caused by social desirability and for other random sources of CMV (e.g. a single survey source).

Figure 10.4. Model 2. Constrained equal loadings from CMV to the study indicators.

[Path diagram. Indicators: WORN, NERVE, ALC1, ALC2, ALC3, EI-FAC, EI-REG, EI-UND, EI-PER, Social Des 1, Social Des 2, Social Des 3. Latent variables: WELLBEING, ALCOHOL, CMV, CMB*. Other label: D5.]

Model 3. The third model, illustrated in Figure 10.5, is the same as the previous model

but the loadings of CMV to the indicators were allowed to differ. A comparison of the

constrained Model 2 and unconstrained Model 3 with the baseline model tests the amount of

CMV for each of the study constructs individually. A comparison of Model 2 (constrained)

with Model 3 examines whether the effects of CMV are equal for all three constructs

(wellbeing, alcohol drinking behaviour, and emotional intelligence).

As shown in Table 10.11, both the constrained CMV model (Model 2; SBχ²/df = 2.44, df = 48, RMSEA = .07, CFI = .86) and the unconstrained CMV model (Model 3; SBχ²/df = 1.45, df = 39, RMSEA = .04, CFI = .96) describe the data significantly better than the baseline model, which does not account for any common method variance or bias (SBχ²/df = 3.61, df = 52, RMSEA = .09, CFI = .73). These results suggest that social desirability accounts for part of the method bias.

A comparison of Model 2 and Model 3 also showed that the latter, with varying

weights from CMV to the indicators, describes the data significantly better than Model 2 with

equal indicator loadings for CMV. The results suggest that CMV has different effects on the

indicator loadings for the three constructs (wellbeing, alcohol drinking behaviour, and

emotional intelligence), perhaps suggesting that social desirability may not be the only source

of CMV.

Figure 10.5. Model 3. Free loadings from CMV to the study indicators.

[Path diagram. Indicators: WORN, NERVE, ALC1, ALC2, ALC3, EI-FAC, EI-REG, EI-UND, EI-PER, Social Des 1, Social Des 2, Social Des 3. Latent variables: WELLBEING, ALCOHOL, CMV, CMB*. Other label: D5.]

Table 10.11

Summary of Fit Indices of Comparison Models

Model                 SBχ² (df)*    χ²/df   CFI   RMSEA (CI)       Comparison models                        Δχ²      Δdf   P
1. Baseline           188.16 (52)   3.61    .73   .09 (.08, .11)
2. Constrained CMV†   117.19 (48)   2.44    .86   .07 (.05, .08)   Baseline vs. Constrained CMV (1 vs. 2)   70.97    4     <0.001
3. CMB-CMV            56.62 (39)    1.45    .96   .04 (.01, .06)   Baseline vs. CMB-CMV (1 vs. 3)           131.54   13    <0.001
                                                                   (2 vs. 3)                                60.57    9     <0.001

*Satorra-Bentler scaled chi-square; † loadings from CMV set to be equal in this model.
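The difference tests in Table 10.11 can be reproduced from the reported statistics with a short sketch. The helper names (`chi2_sf`, `compare`) are hypothetical, and the sketch simply subtracts the reported values, which matches the tabled Δχ² and Δdf. For Satorra-Bentler scaled statistics a scaled difference test would normally be preferred, but it requires scaling correction factors that are not reported here.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variate with integer df."""
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite correction series.
    term, total = 1.0, 0.0
    for k in range((df - 1) // 2):
        if k > 0:
            term *= x / (2 * k + 1)
        total += term
    correction = math.sqrt(2.0 * x / math.pi) * math.exp(-half) * total
    return math.erfc(math.sqrt(half)) + correction

# (SB chi-square, df) for Models 1 to 3, as reported in Table 10.11.
models = {1: (188.16, 52), 2: (117.19, 48), 3: (56.62, 39)}

def compare(restricted, general):
    """Return (delta chi-square, delta df, p) for two nested models."""
    d_chi = round(models[restricted][0] - models[general][0], 2)
    d_df = models[restricted][1] - models[general][1]
    return d_chi, d_df, chi2_sf(d_chi, d_df)

for pair in [(1, 2), (1, 3), (2, 3)]:
    d_chi, d_df, p = compare(*pair)
    print(f"Model {pair[0]} vs. Model {pair[1]}: "
          f"delta chi2 = {d_chi}, delta df = {d_df}, p < 0.001: {p < 0.001}")
```

All three comparisons give p-values well below 0.001, in line with the table.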


Table 10.12 presents the differences between the standardised loadings of the three

constructs (wellbeing, emotional intelligence and alcohol drinking behaviour) when

CMV/CMB are controlled. As can be seen, in Model 3, CMV does not have equal effects

on the indicator loadings, and, comparing Model 1 and Model 3, emotional intelligence and

alcohol drinking behaviours have the most inflated weights when CMV and CMB are not

controlled.

Table 10.12

Standardised Factor Loadings for Different Models Compared to Baseline

Indicators                       Baseline    Constrained CMV-CMB   CMB-CMV
                                 (Model 1)   (Model 2)             (Model 3)
Wellbeing: worn-out/exhausted    0.71        0.68                  0.70
Wellbeing: nervous/tense         0.85        0.62                  0.74
EI-Facilitation                  0.40        0.36                  0.19
EI-Regulation                    0.92        0.71                  0.69
EI-Understanding                 0.43        0.56                  0.26
EI-Perceiving                    0.46        0.62                  0.29
Alcohol-1                        0.55        0.39                  0.11
Alcohol-2                        0.93        0.66                  0.25
Alcohol-3                        0.43        0.37                  0.09
Social Desirability-1                        0.50                  0.61
CMB>CMV path                                 -0.003                0.45
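The comparison of Model 1 and Model 3 loadings in Table 10.12 can be summarised per construct with a few lines of code. This is a sketch only: the dictionary layout and the function name `mean_drop` are hypothetical, and the mean difference is used here simply as a convenient summary of how much the loadings shrink once CMV and CMB are controlled.

```python
# (Model 1 loading, Model 3 loading) pairs taken from Table 10.12.
loadings = {
    "Wellbeing": [(0.71, 0.70), (0.85, 0.74)],
    "Emotional intelligence": [(0.40, 0.19), (0.92, 0.69), (0.43, 0.26), (0.46, 0.29)],
    "Alcohol drinking behaviour": [(0.55, 0.11), (0.93, 0.25), (0.43, 0.09)],
}

def mean_drop(pairs):
    """Average reduction in standardised loading from Model 1 to Model 3."""
    return sum(baseline - controlled for baseline, controlled in pairs) / len(pairs)

drops = {construct: mean_drop(pairs) for construct, pairs in loadings.items()}

# List constructs from the largest to the smallest average reduction.
for construct, drop in sorted(drops.items(), key=lambda item: item[1], reverse=True):
    print(f"{construct}: mean loading reduction = {drop:.3f}")
```

On these values the average reduction is largest for alcohol drinking behaviour, followed by emotional intelligence, and smallest for wellbeing, which is consistent with the pattern described in the text.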


This example supports the hypothesis that the SEM-integrated approach allows

control for the effects of common method variance and bias due to social desirability as

well as other possible sources of CMV.


11

STUDY 2: DISCUSSION

An overview of the history of reliability assessment was presented in Chapter 3,

starting with the single general coefficient for assessing internal consistency reliability

which was published by Cronbach in 1951. The critique of this coefficient and

recommendations for improvements were discussed. Although these recommendations may

be useful, there are other methods that could be considered. In particular, the newly

developed covariate-free and covariate-dependent coefficients (Bentler, 2014) provide

insight into the internal consistency of scales when covariates are controlled.

The influence of covariates on rho (a model-based reliability coefficient) and the development of covariate-free coefficients of reliability were described in Chapter 9. In Chapter 10, an empirical study demonstrated the role of organisational type in the reliability of the WOAQ. The WOAQ is a widely used measure for risk assessment in organisations, based on the identification and collection of employee opinion regarding their work and health (Griffiths, Cox, Karanika, Khan, & Toma, 2006). The scale is relatively short, with 28 items. A second data set, drawn from a student survey (Chapter 10), was used to demonstrate how the effect of CMB on reliability can be evaluated using the covariate-dependent and covariate-free reliability measures. In addition, the effects of CMV and CMB on each of the constructs (emotional intelligence, wellbeing and alcohol drinking behaviour) were compared in Chapter 10. This chapter provides a discussion of these two applications.


11.1 Discussion: Application in Reliability Assessment of WOAQ

In this section, the reliability and covariate-dependent reliability of the WOAQ are

discussed for Australian employees in two separate organisations – a community nursing

organisation and a paramedic organisation. The WOAQ was validated as a bifactor measure

in Chapter 7, including a general measure of WOAQ and five nested subfactors, each

representing different dimensions of work organisation risk assessment. The five nested

subfactors were: ‘quality of relationships with management’, ‘reward and recognition’,

‘workload issues’, ‘quality of relationships with colleagues’, and ‘quality of physical

environment’. Although the employees in both organisations provide clinical services, the

nature and demands of the occupations are very different. In Chapter 10, the results of

Bentler’s covariate-dependent approach to reliability assessment showed that almost one

third of the reliability was accounted for by the type of organisation.

The invariance testing supported this conclusion. It was found that ‘relationship

with colleagues’ was much more important to paramedics than to nurses. This can be

explained by the nature of work done by the paramedics. The service delivery for

paramedics is based on teamwork; without teamwork the quality of work would be

affected. Therefore, it makes complete sense that for paramedics their relationship with

their colleagues had higher loadings than was the case for home-based nurses.

In contrast, workload issues were less important to paramedics than to nurses. The

items captured for the workload issue construct consisted of pace of work, workload, and

the impact of work on family and family on work. When comparing the demographic

characteristics of both organisations, it is not surprising that the workload issue had less


importance for paramedics than for nurses. One main reason is working hours. The

majority of paramedics work part-time, while the majority of nurses work full-time. Also,

the majority of the nurses were female and the majority of the paramedics were male. The

literature on work-family conflict demonstrates that women, both as employees and

caregivers, tend to be more exposed to the experience of conflict in juggling their work and

family demands. It is therefore likely that the impact of work-family conflicts and workload

are less overwhelming for paramedics, who are mainly male and work part-time, than for

nurses, who are mainly female and work full-time.

‘Pace of work’ was another item included in the workload construct. Because

paramedics work in a fast-paced work environment, it is possible that individuals interested

in such working conditions tend to become paramedics. This suggests that the pace of work

does not bother them as much as it does home-based nurses. Both the home-based nursing

staff and the paramedics provide their services outside of their formal organisational

settings. However, the nurses’ work sites tend to be within the homes of their clients. The

atmosphere is one of trust, as the nurses and their clients interact over a number of clinical

treatments over an extended period of time. In contrast, the paramedics have a variety of

work sites, ranging from road, school and workplace accidents, through to nursing homes

and clients’ homes. Their interaction is by definition urgent and filled with emotion. Often,

their clients are unable to interact with the paramedics. For these two seemingly similar

organisational types, there are very significant differences in the work environment that

influence the responses to both the ‘Your Work’ and ‘Your Well-being’ components of the

WOAQ.


The above comparisons show how the type of organisation and the nature of work

impact on the way work organisation is assessed. As the example of WOAQ has

demonstrated, using Bentler’s covariate-dependent and covariate-free reliability

assessment could have many benefits in practice, allowing scholars and researchers to

extract meaningful information from these measures. The results of Study 1 and Study 2

clearly show that the WOAQ is a useful tool for assessing different aspects of work

organisations in health settings. However, different types of organisations put different

weightings on the parameters assessed in the WOAQ. When assessing WOAQ in a

paramedic organisation, more attention should be paid to teamwork and fostering a spirit of

teamwork in the organisation. If WOAQ needs to be improved in a home-based nursing

setting, the focus should be on workload issues and managing work-life balance.

Therefore, it is very important to consider the possible covariates of reliability to get

more precise and meaningful outcomes in assessments. Study 2 presented an example

relating to the application of WOAQ, but this procedure has many other potential uses in

educational, clinical and/or health settings that need further investigation.

11.2 Discussion: Application in Demonstrating CMB

The other application of Bentler’s covariate-dependent and covariate-free reliability

procedure covered in Chapter 10 offers new and comprehensive techniques for controlling

CMB and CMV, which are common problems in psychological studies. The issue of CMV

becomes more noticeable when researchers rely on a single source of data collection and

CMB becomes more noticeable when self-report measures are used. In Chapter 10 the

covariate-dependent reliability procedure was used to demonstrate the effect of CMB and


CMV on scale reliability. The effect of social desirability was evaluated as a covariate in the student study of emotional intelligence, wellbeing and alcohol drinking behaviour. The results showed that CMB, as measured by social desirability, inflated the scale reliability by around 27%. An SEM approach introduced by Williams, Hartman, and Cavazotte (2003) and Podsakoff et al. (2003), and further developed by other researchers (e.g., Richardson et al., 2009; Williams et al., 2010), was then used to assess the effect of CMV and CMB on each of the study constructs: emotional intelligence, wellbeing and alcohol drinking behaviour.

The results produced two main findings:

a) CMB (due to social desirability) appears to inflate the reliability measures of the

scales.

b) The measures of emotional intelligence and alcohol drinking habits were more

influenced by CMB and CMV than wellbeing.

Consistent with previous findings (e.g., Podsakoff et al., 2003; Richardson et al.,

2009; Williams et al., 2010), it seems that the CFA approach provides a practical method

for controlling for method variances and biases. The findings also demonstrate that forcing

equal CMV effects for all measures is not appropriate because it adversely affects the

model fit. The findings showed that CMV effects differ depending on the nature of each

measure; CMV weights should therefore be allowed to vary. This result is similar to the

finding of Williams et al. (2010) but contradicts the equivalent method effects technique

proposed by Lindell and Whitney (2001).


11.3 Strengths

The research conducted in Study 2 and reported in chapters 9 to 11 was underpinned

by several strengths. First of all, to the best of the researcher’s knowledge, this study was

the first of its kind to demonstrate covariate-dependent reliability empirically. While

methods of reliability generalization (Vacha-Haase, 1998; Vacha-Haase & Thompson,

2011) have been proposed to study variation in reliability, reliability generalization is a

meta-analysis methodology requiring data from a large number of studies. In contrast,

Bentler’s (2014) new procedure for covariate-dependent and covariate-free reliability

coefficients requires only a single study and can provide more accurate coefficients of

reliability with estimates of how these are affected by group characteristics. This approach

can be adopted for the conventional internal consistency measure (coefficient α), as well as

model-based reliability for multi-dimensional studies. In Study 2 this method was initially adopted in the context of a bifactor model. Using Omega hierarchical and Omega subscales, the reliabilities of a general factor and its subfactors were assessed. Then, using a covariate-dependent reliability assessment, the between-group variation in reliability was assessed. This is a novel approach and has application potential whenever the reliability of

a given scale might be affected by grouping variations.

Secondly, this study has shown that the type of organisation influences the

reliability of assessments such as the WOAQ that measure the psychosocial and physical

aspects of an organisation. This finding introduces a novel area of research that needs

further exploration.


Thirdly, this study could be considered as one of the first of its kind that

demonstrates the application of covariate-dependent and covariate-free reliability in

assessing CMB effects on reliability. The procedure appears to provide a very

comprehensive and simple quantification of the method effects in self-reported studies of

this kind. More studies of this type are needed to provide a comprehensive understanding of

common method effects on the reliability of measures. The proposed covariate-dependent

and covariate-free reliability procedure can be easily calculated using EQS.

Fourthly, this is one of the first studies that integrates a measured and unmeasured

latent variable procedure (Podsakoff et al., 2003), controlling for CMV and CMB. The

procedure appears to provide a very comprehensive way of controlling method effects in

self-reported studies. However, there is a need for further studies in order to shed more

light on this procedure.

11.4 Limitations and Directions for Future Research

Despite the above strengths, this study is not without weaknesses and limitations.

One of the limitations in this study is the lack of background literature on covariate-

dependent reliability. This makes it difficult to evaluate and compare the results with other

studies. Further studies are required to expand this new, practical area of research.

In this study, only two occupations in the health field were compared in order to

assess the covariate-dependent reliability of the WOAQ. Further studies are needed to

expand the concept to white, blue, and pink collar workers, as well as other health

professional occupations.


Covariate-dependent reliability may have practical implications for cultural

comparisons using the WOAQ or other similar scales. Therefore, future studies could

consider the role of culture as a covariate in the assessment of reliability of such scales.

The marker variable choice for CMB (in this study, social desirability) is

controversial. Some scholars believe that the marker variable should not have any

relationship with other substantive constructs while others believe the opposite (e.g.,

Lindell and Whitney, 2001; Richardson et al., 2009; Williams et al., 2010). In their review

of previous studies, Williams et al. (2010) demonstrated that researchers use a broad range

of variables as marker variables for CMB. They concluded that “... no consideration has

been given to the role of theory associated with method processes to guide the selection of

marker variables and the understanding of their effects” (p. 505). Further studies are needed

to determine the best criteria for choosing an appropriate marker variable when controlling

for CMB.


12

STUDY 3: MODEL-BASED RELIABILITY AND VALIDITY OF

REFLECTIVE-FORMATIVE MODEL OF WAS USING PLS-SEM

All studies discussed in the previous chapters had constructs measured using

reflective models. The literature rarely reports formative-formative or reflective-formative

measurement models. The purpose of Study 3 is to demonstrate the validity and model-

based reliability of a reflective-formative model of the Work Ability Scale (WAS). The

results are compared with the misspecified reflective forms of the WAS model to highlight

the possible errors that occur because of misspecification of measurement models. The

results of Covariance-Based Structural Equation Modelling (CB-SEM) and Partial Least

Squares SEM procedures are presented in Chapter 13. The Discussion in Chapter 14

presents the practical implications of these findings. This chapter introduces this study.

12.1 Rationale and Objectives

In Australia, the Work Ability Survey (WAS) and the Work Ability Survey

Revised (WAS-R) were developed by the Business, Work and Ageing Research Centre, at

Swinburne University of Technology, Melbourne (Taylor & McLoughlin, Dec 2011). As

described in Chapter 5, a decision-making tree has been developed for distinguishing

formative from reflective models (Figure 5.5, Chapter 5).

As explained below, this decision-making tree suggests that the WAS should be

specified as reflective in the first-order and formative at higher orders (i.e. a reflective-

formative model).


12.1.1 Empirical Example: The Study of Work Ability

This section presents background on an empirical example (Study 3) comparing the results of a reflective-reflective higher-order measurement model for the work ability scale, fitted using CB-SEM, with corresponding reflective-reflective, reflective-formative and formative-formative work ability models fitted using the PLS-SEM procedure. Before

discussing the methodology used, there is a need to fully explain the theoretical background

on which this model is based. As demonstrated in the decision-making framework in

Chapter 5 (Figure 5.5), the first step is to evaluate the background theory of the measure to

find out if the work ability model has previously been considered as reflective or formative.

Therefore, a review of the empirical and theoretical background of work ability will be

provided next.

12.1.2 History of Work Ability Research

More than thirty years of international research on work ability and age

management have provided proof that the working life can be improved and extended.

Work ability is predominantly about work-life balance. In the early 1980s, research on

work ability started in Finland to determine the length of people’s working life and how

this is affected by work contentment and job demands (Ilmarinen, 2009). Through the

years, work ability conceptualisation has progressed to become more holistic. The history

of work ability research can be divided into three phases: (1) Evolution (1980 – 1989), (2)

Conceptualisation and Implementation (1990 – 1999), and (3) Internationalisation (2000 –

present). A brief description of each phase will be presented next.


Evolution (1980 – 1989). Work ability in the 1980s was defined as “How good is the

worker at present and in the near future, and how able is he or she to do his or her work

with respect to work demands, health and mental resources?” (Ilmarinen, 2003, p. 3).

This was a period of longitudinal research, driven primarily by the question as to

what would happen employment-wise to the post-war baby boomers in the 1990s as they

started approaching retirement. The research also examined the extent of people’s working

life span and retirement (Ilmarinen, 2010).

Based on this positive approach, a multidisciplinary team of scientists constructed

and validated the Work Ability Index (WAI) and, using a stress/strain concept, they applied

and evaluated the work ability index on a large number of participants between 1981 and

2009 (Ilmarinen, 1991).

Conceptualisation and Implementation (1990 – 1999). The main characteristic of

the research during this period was the large number of longitudinal studies of men and

women who worked in the same occupation throughout the entire study period. The aim of

these longitudinal studies was to find a way to prevent disease and disability among

workers who were approaching retirement. Concurrently, the researchers were seeking a

way to maintain workers’ health and work ability (Tuomi, Ilmarinen, Klockars, Nygård,

Seitsamo, & Huuhtanen, 1997). Emphasis was on changes in work, lifestyle, health, stress

symptoms and work ability, as well as on the causes of any change. Changes were analysed

based on age, gender, work contentment and work profile. The occupations of the

participants were divided into physical work, mental work, and both physically and


mentally demanding work (Tuomi, Ilmarinen, Seitsamo, Huuhtanen, Martikainen &

Klockars, 1997).

The results highlighted that the different interactions between biological ageing,

health, lifestyle and work strongly affect work ability. But it appeared that, in general, work

ability decreases with age (Ilmarinen, Tuomi, & Klockars, 1997).

Even though a decline was observed in the work ability of the participants with age,

the initial age did not explain observed differences in the magnitude of these changes in the

participants’ work ability. The authors suggested that, in order to improve work ability,

there is a need for better supervisor attitudes, increasing variety at work, leisure and

physical activity (Tuomi, Ilmarinen, Martikainen, Aalto, & Klockars, 1997). It seemed that

while the work ability of senior employees usually declined with age, the work ability of

employees could be improved regardless of their age.

It was also found that the mean WAI improved among 10 per cent of the

participants and declined dramatically among 30 per cent. For 60 per cent of participants,

the index was steady at a good or excellent level (Tuomi, 1997). Based on a logistic

regression analysis, it was found that factors relating to lifestyle, management, and

ergonomics explained both positive and negative changes in work ability (Ilmarinen et al.,

1997).

The outcomes of the research had a profound impact in Finland. The Finnish social

partners made an agreement to promote and maintain work ability in workplaces. A work

ability measure was created and validated, and health professionals including physicians

and nurses were trained in the application of the WAI (Ilmarinen, 2009).


The study showed that the behaviours of managers and supervisors are among the

most critical factors influencing work ability. Also, improved work ability of ageing

employees and workers was directly related to age awareness (Ilmarinen, 2010). Based on

the study results, a focus on age management became popular in the early 1990s, and

training in age management started shortly afterwards. This developed into an international

course on age management which is still running (Ilmarinen, 2010).

Internationalisation (2000 – present). The original WAI was translated into many

languages in the early 1990s. The international validation of the index showed good results.

The psychometric properties of the scale and its predictive ability and cultural

appropriateness have been acknowledged to be constant across Europe (Gould, Ilmarinen,

Järvisalo, & Koskinen, 2008).

The global use of the original WAI provides excellent possibilities for international

networks and databases related to the index. This allows new possibilities for research,

which will strengthen WAI networks worldwide.

However, the work ability concept has changed over time. Current

multidimensional work ability theory focuses on the promotion of longer and healthier

careers with employment growth and improved wellbeing of the population until retirement

and beyond. Today, work ability is related to nearly all factors of work and life including

work-related, individual and social factors (Gould et al., 2008). These connections to most

aspects of daily living make the definition of work ability challenging and its promotion

demanding.


Since the 1980s, a large amount of research which focused on work ability and its

related factors has helped in the understanding of work ability and its complex relationship

with these factors. The growing importance of work ability research and applications is also

due to changes in the organisation of work and wider societal and population trends across

the world. In order to preserve work ability, it is essential to strive for a healthy work-life

balance (Gould et al., 2008).

There are several other indicators of work ability available; however, the original

Work Ability Index is by far the most widely used measure. In a three-level assessment of

work ability, participants evaluate their current work ability regardless of whether they

work. They may be completely fit for work, partially disabled for work, or completely

disabled for work. The score is usually referred to as the ‘work ability estimate’ and ranges from 0 (full work disability) to 10 (best work ability) (Gould et al., 2008). In the next section the current WAI, which incorporates the

original work ability estimate, is explained briefly.

The current Work Ability Index (WAI). The Finnish Institute of Occupational

Health originally developed the current index as a tool to predict retirement age and to

record the work ability of employees. It was designed to identify the health risks of

employees at an early stage and to highlight the risks of early retirement so as to avoid

these risks (Morschhäuser & Sochert, 2006). The WAI validity was tested using clinical

studies for many years. It has been used for years in occupational health and safety research

and practice in order to investigate the association between human resources and other


work-life factors, as well as to compare work ability in different age groups (Ilmarinen,

Tuomi, & Seitsamo, June 2005).

The index involves a self-assessment questionnaire and has a strong focus on health

status, resources and the subjective estimation of work ability (Gould et al., 2008). It is

based on questions that incorporate both the physical and mental demands of an employee’s

work (Tuomi, Ilmarinen, Jahkola, Katajarinne, & Tulkki, 2006). In the original study, after

completing the questionnaire, each employee was interviewed by an occupational health

professional. Based on the assessment, an evaluation was made as to whether there could be

any restriction or improvement on the employee’s current work ability in the future (Tuomi

et al., 2006).

The WAI has seven items (see Table 12.1), with a total score ranging from 7 to 49.

There are four categories derived from the WAI score, reflecting poor work ability (7 – 27

points), moderate work ability (28 – 36 points), good work ability (37 – 43 points) and

excellent work ability (44 – 49 points) (Martus, Jakob, Rose, Seibt, & Freude, 2010). The

score refers not only to the employee’s current status of work ability but also provides some

information on health-related risk factors. The results give an indication as to whether the

appropriate strategy is to maintain the current work ability, improve it and support it, or re-

establish it. According to Ilmarinen, the WAI is capable of reliably predicting work

disability, retirement and mortality (Ilmarinen, 2007).


Table 12.1

Items of the Work Ability Index

Item                                                         Range
1   Current work ability compared with the lifetime best     0 – 10
2   Work ability in relation to the demands of the job       2 – 10
3   Number of current diseases diagnosed by a physician      1 – 7
4   Estimated work impairment due to diseases                1 – 6
5   Sick leave during the past year (12 months)              1 – 5
6   Own prognosis of work ability two years from now         1 – 7
7   Mental resources                                         1 – 4

Note: Reprinted from Ilmarinen, J. (2007). The Work Ability Index (WAI). Occupational Medicine, 57, p. 160.
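The scoring described above can be illustrated with a short sketch. It assumes that each item has already been scored as a whole number on the range shown in Table 12.1 and that the total is a simple sum; the function names `wai_total` and `wai_category` are hypothetical, and any item weighting used in the official scoring instructions is not represented.

```python
# Item score ranges as listed in Table 12.1: item number -> (minimum, maximum).
ITEM_RANGES = {1: (0, 10), 2: (2, 10), 3: (1, 7), 4: (1, 6), 5: (1, 5), 6: (1, 7), 7: (1, 4)}

# Score bands as reported in the text (Martus et al., 2010).
CATEGORIES = [(7, 27, "poor"), (28, 36, "moderate"), (37, 43, "good"), (44, 49, "excellent")]

def wai_total(item_scores):
    """Sum the seven item scores after checking each against its range."""
    for item, (low, high) in ITEM_RANGES.items():
        score = item_scores[item]
        if not low <= score <= high:
            raise ValueError(f"Item {item} score {score} is outside {low}-{high}")
    return sum(item_scores[item] for item in ITEM_RANGES)

def wai_category(total):
    """Map an integer total score to its work ability category."""
    for low, high, label in CATEGORIES:
        if low <= total <= high:
            return label
    raise ValueError(f"Total score {total} is outside the 7-49 range")

# The item ranges imply the total score range quoted in the text.
lowest = sum(low for low, _ in ITEM_RANGES.values())
highest = sum(high for _, high in ITEM_RANGES.values())
print(lowest, highest)  # 7 49

example = {1: 8, 2: 8, 3: 5, 4: 5, 5: 4, 6: 7, 7: 3}
print(wai_total(example), wai_category(wai_total(example)))  # 40 good
```

Summing the item minima and maxima gives 7 and 49, which agrees with the total score range stated for the WAI.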

The index is analysed based on two factors. The first factor reflects a subjective

assessment of current and future work ability. The second factor reflects objective data

regarding health status and sick leave (Radkiewicz, Widerszal-Bazyl, & the NEXT-Study group, 2005). Items one, two, six and seven measure the subjective component of

the index. The third, fourth and fifth items measure the objective component of the index,

based on the occurrence or absence of different illnesses listed in the questionnaire.

The WAI is easy to use. It takes about ten to fifteen minutes to administer the

questionnaire and a further three to five minutes for evaluation (Ilmarinen & Tuomi, 2004).

It is highly recommended that participation is voluntary because the WAI surveys provide

confidential data on an employee’s illnesses and work ability. This means that data

protection must be strictly observed.


There are several other instruments designed to assess work ability or health-related

risk factors. While most instruments focus on labour and human resources policies, the

advantage of the WAI is that it concentrates directly on the employee's self-assessed work

ability (Morschhäuser & Sochert, 2006).

The reliability of the index was further analysed in the Netherlands using a test-

retest evaluation within a four-week interval. The results indicated that 25 per cent of

participants achieved the same WAI score on both measurements. The average test and

retest results were also similar, indicating scale reliability (de Zwart, Frings-Dresen & van

Duivenbooden, 2002).

Validity and reliability for this index have been assessed using correlation analyses.

Other psychometric properties of the WAI have also been tested, including internal and

predictive validity, and the results have been published in peer-reviewed literature and

reports (European Network for Workplace Health Promotion (ENWHP) & National Work

Ability Index Network, 2012).

For example, de Zwart, Frings-Dresen, & van Duivenbooden (2002) explored work

ability among well-educated professionals, while another study (Pensola, Järvikoski, &

Järvisalo, 2008) looked at unemployment and work ability. Long unemployment, poverty,

and a lack of education are well known risks for marginalisation, and the results showed

clearly that unemployed individuals had more limited work ability than those who were

employed. Work ability scores were directly related to the extent of unemployment such as

its length and frequency. It was noted that part of the relationship between unemployment

and limited work ability was linked to economic difficulties, especially among the long-

term unemployed. In a 1991 study, subjective assessments reported via the WAI

questionnaire were compared with clinical examinations including cardiovascular,

musculoskeletal, and psychological measurements. The clinical examinations for both male

and female workers were selected according to health and subjective work ability as

reported by the questionnaire. The results suggested a

relationship between the level of work ability and other clinically assessed factors. There

were some individual differences observed, but they were explained on the basis of the

available data (Eskelinen, Kohvakka, Merisalo, Hurri, & Wagar, 1991).

The WAI can be used for individuals and groups, or even an entire company

(Morschhäuser & Sochert, 2006). A selected review panel from European countries stated

that the index is a useful, valid, and reliable tool that addresses a very relevant issue in the

workplace (European Network for Workplace Health Promotion (ENWHP) & National

WAI Network, 2012). They also viewed the index as a powerful predictive tool for

premature retirement. Organisations can implement strategies to moderate the risk of early

retirement based on the item responses for this index. However, the panel highlighted a few

challenges in terms of practical applicability and this has led to the development of new

multidimensional work ability models.

Multidimensional work ability model. With a large amount of research

undertaken internationally, there have been substantial changes to the work ability concept

and the conceptual models used to describe work ability. At the beginning of this

development, the aim was to predict retirement age and to try to find out how long people

are able to continue working after retirement, and what role work satisfaction and job

demands play in determining these factors. Health status was viewed as the most important

component of an individual’s functional capacity. With the development of the concepts of

work ability in a more holistic direction, consensus grew that work ability could not be

analysed individually and that there was a need for a conceptual shift to more of a life-work

balance model of work ability (Gould et al., 2008).

In early 2000, the Finnish Institute of Occupational Health in Helsinki introduced a

more advanced model of work ability. It is based on studies and development projects

conducted in the 1990s on occupational wellbeing in different industrial sectors and among

different age groups. The multidimensional image of work ability includes both individual

resources as well as work-related and personal factors (Finnish Institute of Occupational

Health, 2011). The dimensions of work ability are presented in the form of a ‘work ability

house’.

The factors influencing work ability represent four floors in a house (Figure 12.1).

The first floor includes human resources such as health - physical, mental, and social

functioning. If the first floor is strong, the chances are that a person will have stronger work

ability throughout his or her working life.

The second floor of the house contains knowledge and skills and their constant

updating, including education and relevant training. The third floor refers to the inner

values and attitudes and also to circumstances that motivate people at work. Work

environment is located on the fourth floor, right above attitudes because it directly affects

attitudes. When a person is exposed to good experiences, his or her positive values and

attitudes towards work are strengthened. On the other hand, bad experiences weaken both

attitudes and values (Finnish Institute of Occupational Health, 2011; Ilmarinen, 2010). As

clearly presented in the work ability house, work ability is formed by the work environment

as well as personal health and abilities.

Figure 12.1. Multidimensional work ability model. Reprinted from Finnish Institute of Occupational Health. (2011). Multidimensional work ability model. Helsinki, Finland, p. 1.

Outside the work ability house are additional influences on work ability.

Community organisations that support work, occupational health care and safety, as well as

the immediate social environment (family, friends, relatives, etc.) are also important.

Finally, the operational environment of work is added, including society, culture, social and

health policies and legislation. Government policies contribute to creating significant

prerequisites for work ability, but they also create challenges for work ability, such as

demanding a higher employment rate. Evidence shows that the core structure of work

ability is very dynamic and can change greatly during a person’s career. Any conflict

between family life and work life will have an impact on work ability. Also, support or lack

of support from the community will affect one’s work ability. Likewise, the introduction of

new technologies, the impact of globalisation, or changes in retirement/health/welfare

systems and legislation status will make a difference to work ability (Gould et al., 2008).

The multidimensional work ability model is very versatile and can be applied to

planning research and developmental projects, as well as training and education programs

(Ilmarinen, 2010).

Work ability in Australia. The work ability index has been used in Australia for

more than ten years. The major reason researchers are interested in its application is the

ageing population, and the need to enhance health and labour systems (Taylor, 2010). Such

considerations have caused policymakers to rethink the length of working lives. As

Australia faces a skills shortage and an ageing workforce, the focus is on finding answers to

the following question: “How can we tap into the available talent in the workforce and

remove the barriers to a life in work?” (Australian Government, Comcare, 2011).

Researchers in Australia have used the WAI for different purposes such as

predicting employees’ retirement intentions (Oakman & Wells, 2009), predicting work

ability of employees (Palermo, Webber, Smith, & Khor, 2009), and examining the

relationship between age and work ability (Webber, Smith, & Scott, 2006).

However, while individual factors remain significant predictors of work ability,

Palermo (2010) has found that other organisational factors such as occupational stress, job

satisfaction, leadership effectiveness and the nurturing of workers are significant positive

predictors of work ability. The outcomes of this study strongly support the Finnish findings

that managers and supervisors play key roles in influencing work ability (Ilmarinen, 2010).

Organisations that advocate and endorse caring values for others are more likely to return a

better work ability score.

Figure 12.2. WAI scores: Australia and Finland. Reprinted from Taylor, P., & McLoughlin, C. C. (Dec 2011). Pilot Study on Workability. Monash University. Unpublished presentation. Melbourne, p. 10.

Australian studies have compared the predictive power of retirement intentions,

investigated the connection between age, injury proneness and work ability, and assessed

the influence of organisational values on work ability. Three predictors of the WAI

accounted for 42 per cent of its variation. These variables were “management respects

you”, “working beyond physical capacity” and “unevenly distributed work” (Brooke,

Goodall, & Mawren, 2010). A surprising finding in these studies was an extremely high

mean work ability score (Figure 12.2) compared to the Finnish population (Taylor &

McLoughlin, Dec 2011; Palermo et al., 2009). These scores had a more negatively skewed

distribution compared to the Finnish distribution, even though the population studied varied

from private to public organisations, across locations and across different industries.

In Australia, a Work Ability Survey (WAS) and the Work Ability Survey Revised

(WAS-R) were instruments developed in four companies by the Business, Work and

Ageing Research Centre, Melbourne (Taylor & McLoughlin, Dec 2011). These authors

have kindly provided their data for use in this study. WAS is an organisational survey that

is aligned with the four levels of the multidimensional work ability model, as well as the

WAI described above. It consists of physical and psychosocial work demand measures. The

original model (McLoughlin, 2009; Taylor & McLoughlin, Dec 2011) was specified as a

higher order reflective-reflective model as illustrated in Figure 12.4.

The personal and organisational capacities are two independent constructs that

jointly form the WAS. However, the six factors contributing to the organisational capacity

construct and the five factors contributing to the personal capacity construct are expected to

be correlated, suggesting that a reflective-formative specification of this model would have

been preferable to the originally specified reflective-reflective format. Considering the

theoretical background and other criteria demonstrated in Figure 5.5, it is confirmed that

the WAS should be modeled as a reflective-formative model.

The major aim of this study was to demonstrate the validity and model-based

reliability of a correctly specified reflective-formative model of WAS. It was hypothesised

that:

Hypothesis 12.1: A reflective-formative higher order model of WAS has acceptable validity

and model-based reliability.

The review of the misspecification literature in Chapter 5 showed that

misspecification of measurement models is common. As mentioned in that chapter,

misspecification can lead to Type I and II errors. A Type I error is a false positive error that

occurs when a path is declared significant when it is not (incorrect rejection of a true null

hypothesis). A Type II error is a false negative error that occurs when declaring a path to be

nonsignificant when it is significant (failure to reject a false null hypothesis). In SEM, Type

I errors may result from the erroneous application of a reflective model instead of a

formative model, while a Type II error can occur with the erroneous application of

formative models in place of reflective models (Jarvis et al., 2003; MacKenzie, Podsakoff,

& Jarvis, 2005).

A study by Petter et al. (2007) considered a series of simulations for structural

models that contained no significant paths. They found that when the formative construct

was misspecified as reflective, upward bias in the parameter estimates often produced a

Type I error. Roy, Tarafdar, Ragu-Nathan & Marsillac (2012) presented similar results.

They reported that misspecifying a reflective model as formative leads to a deflation of path

coefficients and R square values (Type II error). Conversely, Petter et al. (2007)

found that misspecifying a formative model as reflective results in the inflation of path

coefficients and R square values (Type I error). Petter et al. (2007, p. 631) stated that “The

danger of Type I error is that we, as researchers, may build new theories and models based

on prior research that finds support for a given relationship that does not actually exist. This

may affect the implications of our research for both academia and practice. The danger of

Type II error is that some interesting, valuable research may not be published if many of

the relationships within the model are found to be nonsignificant”. In a misspecified model

such as the reflective-reflective WAS (Figure 12.4), the variance of the constructs will

increase due to shared error. As a result, the path coefficients to the higher order constructs

will be increased, creating an upward bias in the result. The opposite will happen in a

misspecified formative-formative model of WAS, resulting in a downward bias. To the best

of this researcher’s knowledge, no study has investigated the consequences of

misspecifying a mixed model (reflective-formative and formative-reflective).

The secondary aim of this study was to increase the awareness of the

misspecification problem by demonstrating the possible consequences of model

misspecification in an empirical study. The correctly specified reflective-formative model

for WAS was therefore compared with the misspecified reflective-reflective and formative-

formative models, fitted using Partial Least Squares SEM in order to quantify any Type I or

II errors.

The Partial Least Squares SEM was used to evaluate the WAS models for several

reasons. First, evaluating the WAS reflective-formative model using the conventional

Covariance-Based SEM procedure was very problematic due to identification problems. As

asserted by Bollen and Lennox (1991), a reflective measure can be easily identified and

evaluated using Covariance-Based SEM, while a formative measure cannot be easily

identified, except by placing the measure in a larger path structure with other variables that

can be evaluated (e.g. using a MIMIC model). The Partial Least Squares SEM procedure is a

better alternative for evaluating models with formative measures; these can be simply

evaluated in isolation using this procedure. There is also less restriction in terms of

normality or sample size compared to Covariance-Based SEM (Roy et al., 2012).

The PLS-SEM results for the correctly specified reflective-formative model were

compared with the misspecified reflective-reflective WAS model evaluated using CB-SEM,

to evaluate the consequences of the misspecification, allowing the testing of the following

hypothesis:

Hypothesis 12.2: The results for a misspecified reflective-reflective model (fitted

using Covariance-Based SEM) will demonstrate inflated loadings, compared to a correctly

specified reflective-formative WAS model (fitted using Partial Least Squares SEM).

The methodological procedure for fitting a reflective-formative model in Partial

Least Squares SEM is demonstrated in detail below.

Figure 12.3. The correctly specified reflective-formative model of WAS.

[Path diagram: indicators Q1 to Q34; first-order constructs CONTROL, TRUST, RESPECT, SUPPORT, HARASSMENT, TRAINING, MENTAL H, PHYSICAL H, WORK-HOME, HOME-WORK and LEISURE; higher-order constructs ORGANISATIONAL, PERSONAL and WAS.]

Figure 12.4. The misspecified reflective-reflective model of WAS

[Path diagram: indicators Q1 to Q34; first-order constructs CONTROL, TRUST, RESPECT, SUPPORT, HARASSMENT, TRAINING, MENTAL H, PHYSICAL H, WORK-HOME, HOME-WORK and LEISURE; higher-order constructs ORGANISATIONAL, PERSONAL and WAS.]

Figure 12.5. The misspecified formative-formative model of WAS.

[Path diagram: indicators Q1 to Q34; first-order constructs CONTROL, TRUST, RESPECT, SUPPORT, HARASSMENT, TRAINING, MENTAL H, PHYSICAL H, WORK-HOME, HOME-WORK and LEISURE; higher-order constructs ORGANISATIONAL, PERSONAL and WAS.]

12.1.3 Composite reliability using PLS

All the model-based reliability assessments mentioned in Studies 1 and 2 require the

use of reflective models and covariance-based SEM (CB-SEM). CB-SEM commonly uses maximum

likelihood (ML) estimation. Partial Least Squares (PLS) requires an alternative model

estimation approach called partial least squares estimation.

In the absence of normality or when the sample size is small, PLS-SEM seems to be

an appropriate alternative to CB-SEM for computing model-based reliability coefficients.

PLS-SEM is considered to be a correct and feasible method for estimating formative or

reflective-formative models. Usually models involving formative constructs present

identification problems and are difficult to evaluate using CB-SEM, while PLS-SEM is

commonly regarded as a good tool for evaluating such models. An additional advantage of

PLS-SEM is acknowledged when developing measurements with new theoretical or

empirical backgrounds (Rigdon, 2012); PLS-SEM seems to provide a more appropriate

procedure for reliability assessments in this case. Pro-PLS scholars believe that by using

research data, one can help in building empirical background and unobservable conceptual

variables (Rigdon, 2012). On the other hand, CB-SEM followers believe that one should

specify a conceptual structure and seek evidence regarding whether these structures are

consistent with empirical evidence, so that results can challenge, support, or modify those

conceptualizations (please see previous chapter for more details on PLS-SEM vs. CB-

SEM).

Despite the less restrictive nature of PLS-based SEM, it is still not as popular as

covariance-based SEM in model-based reliability assessments. The main reason for this

previously was a lack of software for model estimation, but this problem is now being

addressed. Since 1984, and especially from the early 2000s, more user-friendly software

has been introduced for the estimation of PLS-based SEM, adding to the popularity of the

method.

Built on classical test theory and using PLS-SEM, Composite Reliability can be

estimated for constructs (Werts, Linn & Jöreskog, 1974). Composite reliability (CR) is the

reliability of a composite of multiple items that measure the same construct. In other words,

CR is the ratio of true score variance to total scale variance.

The CR will be equal to coefficient alpha when essential tau-equivalence holds for all

items; otherwise CR is usually higher than coefficient alpha.

The reliability of reflective measures using PLS can be tested using Composite

Reliability (ρc) (Werts, Linn & Jöreskog, 1974). Composite reliability takes into account

the different outer loadings of the indicator variables in a model and therefore seems to

better reflect the model-based reliability compared to internal consistency coefficients such

as coefficient alpha (Hair et al., 2014). Values of 0.60-0.70 or higher are acceptable for CR

in the early stages of scale development, and values of 0.80 and higher are satisfactory for

more developed (established) measures (Nunnally & Bernstein, 1994).
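
As a hedged illustration of the relationship between CR and coefficient alpha, the sketch below computes CR from standardised outer loadings as (Σλ)² / ((Σλ)² + Σ(1 − λ²)), assuming uncorrelated measurement errors. The loading values are hypothetical and are not taken from the WAS data. Coefficient alpha is computed from the correlation matrix implied by the same one-factor model, so that the two coefficients can be compared for equal and unequal loadings.

```python
def composite_reliability(loadings):
    """CR from standardised loadings, assuming uncorrelated errors."""
    total = sum(loadings)
    error = sum(1.0 - l * l for l in loadings)
    return total ** 2 / (total ** 2 + error)


def implied_alpha(loadings):
    """Coefficient alpha for the correlation matrix implied by a
    one-factor model with the given standardised loadings."""
    k = len(loadings)
    # Off-diagonal implied correlations are products of loadings.
    off_diag = sum(
        loadings[i] * loadings[j]
        for i in range(k) for j in range(k) if i != j
    )
    total_var = k + off_diag  # k unit variances plus all covariances
    return (k / (k - 1)) * (1.0 - k / total_var)


equal = [0.7, 0.7, 0.7, 0.7]      # hypothetical: equal loadings
unequal = [0.9, 0.8, 0.6, 0.5]    # hypothetical: unequal loadings

cr_equal, alpha_equal = composite_reliability(equal), implied_alpha(equal)
cr_unequal, alpha_unequal = composite_reliability(unequal), implied_alpha(unequal)
```

With equal loadings the two coefficients coincide; with unequal loadings CR exceeds alpha, which is the pattern described above.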

To test the reliability of constructs, some scholars (Chin 1998; Hair et al., 2014;

Fornell & Larcker, 1981) suggest reporting not only the Composite Reliability of the scale

but also the reliability of each indicator (since the reliability of each indicator may differ) as

well as the Average Variance Extracted (AVE), which measures how much indicator

variance is explained by the common factors.

As before, the convergent validity of the indicators of a construct is defined as “the

extent to which a measure correlates positively with alternative measures of the same

construct” (Hair et al., 2014, p. 102) and the average variance extracted (AVE) can be used

to test for convergent validity (Fornell & Larcker, 1981) with a cutoff point of greater than

0.50 required. In addition, if the square root of the AVE exceeds the estimated

correlations of the construct with other constructs, discriminant validity is supported

(Chin 1998; Fornell & Larcker 1981). Reporting the composite reliability as the reliability

of a summated scale is needed as much as the average variance extracted (Fornell &

Larcker, 1981).
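
The two checks described above can be sketched as follows. This is a minimal illustration only: AVE is taken as the mean squared standardised loading, the 0.50 cutoff and the square-root comparison follow the criteria cited in the text, and the loadings and the construct correlation are hypothetical values rather than WAS estimates.

```python
import math


def average_variance_extracted(loadings):
    """Mean squared standardised loading of a construct's indicators."""
    return sum(l * l for l in loadings) / len(loadings)


def fornell_larcker_ok(ave_by_construct, correlations):
    """Return True if sqrt(AVE) of each construct exceeds its absolute
    correlation with every other construct."""
    for (a, b), r in correlations.items():
        if abs(r) >= math.sqrt(ave_by_construct[a]):
            return False
        if abs(r) >= math.sqrt(ave_by_construct[b]):
            return False
    return True


# Hypothetical loadings and construct correlation, for illustration only.
loadings = {"TRUST": [0.85, 0.80, 0.75], "RESPECT": [0.90, 0.70, 0.65]}
ave = {name: average_variance_extracted(l) for name, l in loadings.items()}
convergent = {name: value > 0.50 for name, value in ave.items()}
discriminant = fornell_larcker_ok(ave, {("TRUST", "RESPECT"): 0.55})
```

Here `convergent` flags each construct against the 0.50 cutoff and `discriminant` reports whether the square-root criterion holds for the supplied correlation.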

LISREL does not output the CR directly and some manual calculation needs to be

done in order to obtain CR. However, SmartPLS reports not only the reliability at item

level and the CR at scale level but also the AVEs, all in one single analysis. In addition,

using SmartPLS, the confidence interval of the composite reliability can be estimated by a

bootstrapping procedure. This allows the testing of the hypothesis that the reliability

coefficient exceeds a specified value in the population.
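
The idea of a bootstrapped confidence interval for composite reliability can be sketched as below. This is not the SmartPLS algorithm: as a simplifying assumption, loadings are approximated by the correlation of each item with a unit-weighted composite score, the data are simulated from a single common factor, and the 0.70 threshold is a hypothetical value chosen for illustration. Respondents are resampled with replacement and a percentile interval is taken from the resulting CR values.

```python
import math
import random


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def composite_reliability(rows):
    """CR with loadings approximated by item-composite correlations."""
    items = list(zip(*rows))                  # one tuple per item
    composite = [sum(r) for r in rows]        # unit-weighted score
    lam = [pearson(item, composite) for item in items]
    total = sum(lam)
    return total ** 2 / (total ** 2 + sum(1 - l * l for l in lam))


def bootstrap_ci(rows, n_boot=500, level=0.95, seed=1):
    """Percentile bootstrap interval: resample respondents with replacement."""
    rng = random.Random(seed)
    stats = sorted(
        composite_reliability([rng.choice(rows) for _ in rows])
        for _ in range(n_boot)
    )
    lo = stats[int((1 - level) / 2 * n_boot)]
    hi = stats[int((1 + level) / 2 * n_boot) - 1]
    return lo, hi


# Simulated respondents: one common factor plus item-specific noise.
rng = random.Random(0)
data = []
for _ in range(300):
    f = rng.gauss(0, 1)
    data.append(tuple(0.8 * f + 0.6 * rng.gauss(0, 1) for _ in range(4)))

point = composite_reliability(data)
lower, upper = bootstrap_ci(data)
supports_threshold = lower > 0.70   # hypothetical cutoff check
```

If the lower bound of the interval lies above the specified value, the data support the hypothesis that the population reliability exceeds it.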

12.2 Method

12.2.1 Participants

The data for the present study was obtained from the Redesigning Work for an

Ageing Society research program conducted by the Business, Work & Ageing Centre for

Research (BWA) at Swinburne University of Technology in Melbourne. The data was

collected from four case study organisations during 2007-2008 with an overall response

rate of around 40% (a total sample of 1687 respondents). The final data used in the present

study contained 1344 respondents, allowing for the removal of 343 incomplete survey

responses.

12.2.2 Measure

The Redesigning Work for an Ageing Society research program developed the

Work Ability Survey (WAS), through the works of McLoughlin (2009) and Taylor and

McLoughlin (2011). The WAS has two main sub-constructs entitled personal and

organisational capacities. The organisational capacities scale consists of six subconstructs:

control, respect, trust, support, harassment, and training. The personal capacities scale has five

subconstructs: leisure, work-home balance, home-work balance, mental health, and

physical health. A version of the questionnaire is presented in Appendix E with

permission from the researchers involved in the original study.

12.2.3 Ethics

The original study obtained ethics clearance from the participating organisations

and permission to reuse the database in similar studies.

12.2.4 Overview of analysis

Covariance-Based SEM analysis using AMOS software and Partial Least

Squares SEM using SmartPLS (v2.0) were used to assess the reflective-reflective model of

WAS. An overview of the similarities and differences between Covariance-Based SEM and

Partial Least Squares SEM follows.

Debates regarding the superiority of Covariance-Based SEM over Partial Least

Squares SEM have existed since the early years of development of these procedures (see

Chapter 2 for details on the origins of each procedure). In particular, some scholars have

questioned the practicality and generalisation of the PLS method for factor estimation.

In spite of the wide criticism of Partial Least Squares in the literature, PLS has

specific strengths in specific situations that some Covariance-Based SEM scholars have

misunderstood or ignored. A comparison of some of the main features of both approaches,

along with some of the criticisms follows.

Predicting validity. The literature shows that PLS has capability as a prediction tool,

a fact that has not been fully appreciated. As such PLS provides a correct method for

evaluating formative constructs and for developing measurements with new theoretical or

empirical backgrounds (Rigdon, 2012). Scholars supporting PLS believe that using

research data allows the building of empirically-based theory and constructs (Rigdon,

2012). On the other hand, Covariance-Based SEM followers believe that theory is needed

to specify a conceptual structure, while research data is needed to test whether these

structures are consistent with empirical evidence. The argument is that the results can

challenge, support, or modify those conceptualisations.

Fit assessment test. Covariance-Based SEM assesses the overall fit of the model

using the covariance among the items, assuming that all measures are reflective, with less

interest in the individual effects of construct or path coefficients. In contrast, PLS does not

rely on item covariance and overall goodness-of-fit; instead, the focus is on the variances of

predicted variables or construct variances (Chin, 2010). In practice, in the presence of

formative constructs, PLS might be a better choice than Covariance-Based SEM. Indeed, as

explained below, Covariance-Based SEM cannot be used for third-order models, such as

the WAS model considered here.

Theoretical background. Due to the holistic and confirmatory approach of

Covariance-Based SEM, it is more appropriate when there is solid theoretical and

background knowledge of the model. In contrast, a Partial Least Squares approach, with its

exploratory nature and focus on the significance and strengths of individual paths and

constructs, seems to be an appropriate procedure for new models. It is particularly useful in

social and behavioural sciences when the background knowledge of the expected model is

limited (Chin & Newsted, 1999; Chin, 2010; Roldán & Sánchez-Franco, 2012).

Normality assumption. Covariance-Based SEM commonly uses ML estimation

assuming a normal distribution for the data, while for PLS there is no underlying

assumption for the data distribution. This indicates that for non-normal data, the use of

variance-based PLS is justified when sample sizes are too small to allow asymptotically

distribution-free Covariance-Based SEM or bootstrap analyses.

Sample size. One of the requirements for using Covariance-Based SEM is to have a

relatively large sample size, while PLS can be conducted with small sample sizes.

However, in PLS, the estimators are inconsistent and biased in that standard errors do not

decline with increasing sample size and expected parameter estimates do not converge to

their true values. This lack of consistency means that increasing sample size does not

provide a more reliable analysis in the case of Partial Least Squares SEM. However, in

Covariance-Based SEM models, if the underlying assumptions are met, consistency is

ensured and larger sample sizes do provide a more reliable analysis.

Reflective and Reflective-formative models. Partial Least Squares SEM and

Covariance-Based SEM are two different approaches for estimating a SEM model and both

can be used to fit reflective models. PLS can also be used to fit reflective-formative models

and formative-formative models. However, Covariance-Based SEM can only be used to fit

reflective-formative models when there is a reliable measure for the higher-order latent

constructs (using MIMIC models). Each approach is suitable for a specific context.

Researchers need to appreciate the characteristics of each method to be able to choose the

most suitable approach (Hair et al., 2010; Hair, Ringle, & Sarstedt, 2011; Hair, Hult,

Ringle, & Sarstedt, 2014). As acknowledged by Hair et al. (2011), neither method is superior

to the other. They further state that “depending on the specific empirical context and

objectives of a SEM study, PLS‑SEM’s distinctive methodological features make it a

valuable and potentially better-suited alternative to the more popular Covariance-Based

SEM approach” (p. 149).

The attempt to use the conventional Covariance-Based SEM procedure in this study,

using the MIMIC model to evaluate a formative-formative WAS model, failed to identify

the model. When partial least squares (PLS) Structural Equation Modelling (SEM) was

used instead of the conventional CB-SEM to evaluate a formative-formative WAS model,

model identification was achieved. In order to evaluate the correctly specified reflective-

formative WAS model, Partial Least Squares SEM was also needed.

12.2.4.1 Building a higher-order reflective-formative model of WAS using PLS-SEM.

To the researcher's best knowledge, there are only a handful of studies (Becker,

Klein, & Wetzels, 2012; Wetzels, Odekerken-Schroder, & van Oppen, 2009) that recommend

guidelines for fitting a higher-order model in Partial Least Squares SEM. Wetzels et al.

(2009) developed guidelines for building such a higher-order ‘reflective’ model. “PLS path

modeling can also be used for higher-order models with formative constructs or a mix of

formative and reflective constructs” (Wetzels et al., 2009, p. 189). In this study, mixed

approaches suggested by Wetzels et al. (2009) and Becker et al. (2012) were used to fit the

proposed reflective-formative WAS model, with some amendments to the guidelines as

proposed by Wetzels et al. (2009) for reflective models. To clarify the approaches used in

this study, a brief description of each approach including their advantages and

disadvantages, is provided below.

In the reflective-formative WAS model, the first-order constructs are reflective but

the higher-order constructs are formative (Figure 12.3). In the formative-formative WAS

model (Figure 12.5), the first-order and second-order constructs are both formative. The

repeated indicators approach and the two-stage approach are recommended to test a higher-

order reflective-formative model in Partial Least Squares SEM (Becker et al., 2012; Hair,

Hult, Ringle, & Sarstedt, 2014; Wold, 1982). In the repeated indicators approach, all the

indicators of the first-order constructs are allocated to the second-order construct. This is

called the repeated indicators approach (Wold, 1982) because the indicator variables are

used twice in the model (i.e., for the first and second-order constructs). The two-stage

approach requires two steps in the model analysis. The first-order constructs are evaluated

at the first stage and the predicted values for the first-order constructs are then used in the

second stage as indicators for the second-order constructs (Becker et al., 2012; Hair et al.,

2014; Wetzels et al., 2009). According to the simulation study by Becker et al. (2012) and

recommendations by other researchers in this area (Ringle, Sarstedt, & Straub, 2012; Hair et

al., 2014), these approaches are only appropriate in specific circumstances.
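
The repeated indicators specification described above can be sketched as a simple data structure. The construct names follow the WAS sub-constructs named in the text, but only a subset is shown, and the item labels are hypothetical placeholders rather than the actual WAS item assignment. The function name is likewise illustrative and does not belong to any PLS software.

```python
# First-order constructs named in the text; the item labels are
# hypothetical placeholders, not the actual WAS item assignment.
first_order = {
    "CONTROL": ["c1", "c2", "c3"],
    "TRUST": ["t1", "t2", "t3"],
    "MENTAL_H": ["m1", "m2"],
    "LEISURE": ["l1", "l2"],
}
second_order = {
    "ORGANISATIONAL": ["CONTROL", "TRUST"],
    "PERSONAL": ["MENTAL_H", "LEISURE"],
}


def repeated_indicator_blocks(first_order, second_order):
    """Assign every first-order indicator again to its parent construct."""
    blocks = {name: list(items) for name, items in first_order.items()}
    for parent, children in second_order.items():
        blocks[parent] = [i for child in children for i in first_order[child]]
    return blocks


blocks = repeated_indicator_blocks(first_order, second_order)

# Each indicator now appears in two blocks: its own construct and the parent.
usage = {}
for items in blocks.values():
    for item in items:
        usage[item] = usage.get(item, 0) + 1
```

Under this specification every indicator is used twice, which is the source of both the convenience (a single analysis) and the weaknesses of the approach.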

The benefit of the repeated indicator procedure is the estimation of all constructs in

a single analysis. However, there are some weaknesses with this approach. First,

misspecifying the repeated loadings of higher-order constructs (reflective vs. formative)

could lead to incorrect results. Becker et al. (2012) advise that for reflective

higher order models (reflective-reflective and formative-reflective models), the inner

indicators of the higher-order constructs should be reflective; while for any type of higher-

order formative model the repeated indicators of the higher-order constructs should be

specified as formative. Another weakness of this procedure is that unequally important

indicators of first-order constructs could lead to biased results (Chin et al., 2003; Ringle et

al., 2012). However, simulation studies indicate this is a concern for reflective models

only, not for formative models (Becker et al., 2012). A further weakness is the production

of incorrectly correlated residuals due to repeated use of the same indicators for the first

and second-order constructs (Becker et al., 2012). A final weakness of this procedure is that

most of the variance is explained by the lower-order constructs. As a result, the path

coefficients of the higher-order constructs are usually zero or non-significant (Ringle et al., 2012).

The two-stage approach also has advantages and disadvantages. In this approach, a

higher-order model is estimated separately from the first-order model, resulting in no risk

of misspecification of the repeated indicators for higher-order constructs. For reflective

models with unequally important indicators, this approach delivers a more reliable result

compared with the repeated indicator approach (Becker et al., 2012). Most importantly,

applying the two-stage approach and estimating the first-order constructs in a separate

analysis of higher-order constructs allows other variables to emerge to explain some of the

variances contributing to the higher-order formative constructs (Ringle et al., 2012). The

disadvantage is that the first-order and higher-order constructs are not estimated

simultaneously. Therefore, the model estimators might not be as precise as those obtained

with the repeated indicator approach.

For this study, a reflective-formative model was fitted using a mixture of the ‘repeated

indicator’ and ‘two-stage’ approaches. Based on the above recommendations, the repeated

indicator approach was used at the first stage, with the construct scores of the first-order

constructs used at the second stage as the manifests/indicators of the higher-order

constructs. Applying the repeated indicator approach at two stages creates less bias, more

reliable parameter estimates/scores, and a more precise estimation of path coefficients of

constructs (Becker et al., 2012).


Figures 12.6 to 12.8 present the PLS model building process used in this study.

Figure 12.6. Step one: constructing the first-order sub-constructs of both personal and organisational capacities of the reflective-formative model of WAS using PLS path modeling.

At step one of the PLS path modeling of WAS, the first-order sub-constructs of both

personal capacity (mental and physical health, leisure, work-home and home-work factors)

and organisational capacity (control, trust, training, respect, support, harassment) were

constructed individually (Figure 12.6). Although a small number of the constructs (e.g.,

physical health) had a formative structure, for consistency purposes all were considered as

reflective.


Figure 12.7. Step two: building the second-order formative constructs (organisational and personal capacities) for the reflective-formative model of WAS.

Note. Due to limited space, only some of the indicators of organisational capacity

are shown.

At step two, the second-order formative constructs (organisational and personal

capacities) were built by relating them to their first-order reflective sub-constructs and the

first-order indicators. Both personal and organisational constructs were estimated separately

to obtain the scores for the first-order latent factors (Figure 12.7).


Figure 12.8. Step three: the scores of the first-order latent factors are used as the manifests of the second-order factors (i.e., organisational and personal capacities), which form the higher-order construct (WAS).

At step three (Figure 12.8), the scores of the first-order latent factors were used as

the manifests of the second-order factors (i.e., organisational and personal capacities) and

the higher-order construct (WAS) was built by relating it to the second-order constructs

(organisational and personal capacities). The inner and outer loadings were built on

repeated predictors of the first-order observed scores that were obtained at Step 1.

The third-order model was assessed in the final step (Figure 13.1). The inner and outer

models of first-, second-, and third-order loadings were estimated using SmartPLS.

The inner model in SEM refers to the relationships between the independent and dependent

latent variables, while the outer model demonstrates the relationships between the latent


variables and their observed indicators. Because Partial Least Squares SEM does not

require any normality assumptions for the data, parametric test results could not be used for

inferential decision-making. Instead, to evaluate the significance of the coefficients, a

nonparametric bootstrapping procedure was applied (Chin 1998; Efron & Tibshirani, 1993;

Tenenhaus, Vinzi, Chatelin, & Lauro, 2005) in order to draw an inferential conclusion. The

number of bootstrapping subsamples needs to be higher than the number of valid

observations in the original dataset (in this study, higher than 1344). As a rule, 5000

bootstrap samples are recommended in Partial Least Squares SEM (Hair et al., 2014). The

number of cases used for each randomly chosen bootstrap sample is the same as the number

of cases used in the analysis (1344 cases in this study).
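
The bootstrap logic described above can be sketched in Python as follows. This is an illustrative sketch only and not the SmartPLS implementation: the data are simulated, the function name bootstrap_t is hypothetical, and a simple correlation stands in for the re-estimation of the full PLS path model on each bootstrap sample. Each bootstrap sample contains as many cases as the original data, and the t-value is the original estimate divided by the bootstrap standard error.

```python
import numpy as np


def bootstrap_t(x, y, n_boot=5000, seed=0):
    """Return (estimate, bootstrap mean, t-value) for a standardised coefficient.

    A simple correlation is used as a stand-in for a PLS path coefficient
    (an assumption made for illustration only).
    """
    rng = np.random.default_rng(seed)
    n = len(x)
    estimate = np.corrcoef(x, y)[0, 1]
    boot = np.empty(n_boot)
    for b in range(n_boot):
        # Draw n cases with replacement, i.e. same size as the original sample.
        idx = rng.integers(0, n, size=n)
        boot[b] = np.corrcoef(x[idx], y[idx])[0, 1]
    standard_error = boot.std(ddof=1)
    return estimate, boot.mean(), estimate / standard_error


# Simulated data (hypothetical): 1344 cases, true standardised coefficient 0.5.
rng = np.random.default_rng(42)
x = rng.standard_normal(1344)
y = 0.5 * x + np.sqrt(1 - 0.5**2) * rng.standard_normal(1344)

estimate, boot_mean, t_value = bootstrap_t(x, y)
print(f"estimate={estimate:.2f}, bootstrap mean={boot_mean:.2f}, t={t_value:.1f}")
```

A path would be treated as significant at the 5% level when the absolute t-value exceeds 1.96.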

12.2.4.2 Measurement Model Evaluation Criteria for WAS Model

Reliability, convergent and discriminant validity of the WAS. The reliability of the

reflective measures at first-order was tested using model-based reliability coefficients, or

what is known in Partial Least Squares SEM as composite reliability (Chin 1998; Fornell &

Larcker 1981). The conventional Coefficient alpha was compared with these values.

Composite reliability takes into account the different outer loadings of the indicator

variables and therefore better reflects the reliability compared to internal consistency

coefficients such as Coefficient alpha (Hair et al., 2014). Similarly to Omega, the

composite reliability coefficient is defined as the ratio F/(F+E), where F is the squared sum of

the factor loadings and E is the sum of the error variances. As was the case for Omega,

composite reliability refers to a model-based reliability coefficient, and values between

0.70 and 0.90 are considered satisfactory (Nunnally & Bernstein, 1994).
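
In symbols, and as a sketch under the assumption of standardised indicators, the ratio F/(F+E) can be written as shown below. The notation (λ_i for the outer loading of indicator i, θ_i for its error variance, and k for the number of indicators) is introduced here for illustration and is not taken from the original sources.

```latex
\rho_c \;=\; \frac{F}{F + E}
       \;=\; \frac{\left(\sum_{i=1}^{k} \lambda_i\right)^{2}}
                  {\left(\sum_{i=1}^{k} \lambda_i\right)^{2} + \sum_{i=1}^{k} \theta_i},
\qquad \theta_i = 1 - \lambda_i^{2}.
```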


Internal consistency reliability coefficients such as Coefficient alpha are not

appropriate for second- and third-order formative constructs. For the higher-order model of

WAS, where the first-order constructs are reflective, the model-based reliability was

calculated only for the first-order constructs; however, as explained by Edward (2001),

reliability is not an issue for the higher order formative constructs. Instead, the validity for

formative constructs is critical. According to Bollen and Lennox (1991), if the path from

each subconstruct, considered as an indicator of its corresponding formative construct, is

significant, then the validity of the formative construct is confirmed. The significance of all

these coefficient paths demonstrates the validity of the formative model for this construct.

Another part of the validation process was to find out how distinguishable the

constructs were (discriminant validity). As emphasised by Campbell and Fiske (1959, p.

84), “One cannot define without implying distinctions, and the verification of these

distinctions is an important part of the validation process.” One procedure for evaluating

discriminant validity requires assessing the intercorrelation of the constructs. If the square

root of the average variance extracted (AVE) exceeds the estimates of the intercorrelation

of a construct with the other constructs, discriminant validity is supported (Chin 1998;

Fornell & Larcker 1981).
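
The criterion described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name fornell_larcker is hypothetical, and the AVE values and correlations are made-up numbers rather than results from this study. For each construct, the square root of its AVE is compared with its largest absolute correlation with any other construct.

```python
import numpy as np


def fornell_larcker(ave, corr):
    """Return a boolean array: True where sqrt(AVE) exceeds every
    inter-construct correlation for that construct."""
    ave = np.asarray(ave, dtype=float)
    corr = np.asarray(corr, dtype=float)
    sqrt_ave = np.sqrt(ave)
    # Zero the diagonal so that only correlations with OTHER constructs count.
    off_diagonal = np.abs(corr - np.diag(np.diag(corr)))
    return sqrt_ave > off_diagonal.max(axis=1)


# Hypothetical example with three constructs.
ave = [0.52, 0.82, 0.30]
corr = [
    [1.00, 0.11, 0.60],
    [0.11, 1.00, 0.23],
    [0.60, 0.23, 1.00],
]
supported = fornell_larcker(ave, corr)
print(supported)
```

In this made-up example the third construct fails the criterion, because the square root of its AVE (about 0.55) is below its correlation of 0.60 with the first construct.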

In a previous chapter, convergent validity of a construct was defined as “the extent

to which a measure correlates positively with alternative measures of the same construct”

(Hair et al., 2014, p. 102). The average variance extracted (AVE) can be used to test for

convergent validity (Fornell & Larcker, 1981) with a cut-off point of greater than 0.50

required for demonstrating an acceptable convergent validity.
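
As a hedged illustration of how these quantities follow from the outer loadings, the sketch below computes the AVE and the composite reliability for one reflective construct. It assumes standardised indicators, so that each error variance equals one minus the squared loading. The function name and the loadings are hypothetical and are not taken from the WAS data.

```python
import numpy as np


def ave_and_composite_reliability(loadings):
    """Return (AVE, composite reliability) from standardised outer loadings."""
    lam = np.asarray(loadings, dtype=float)
    error_variances = 1.0 - lam**2       # assumes standardised indicators
    ave = np.mean(lam**2)                # average variance extracted
    f = lam.sum() ** 2                   # squared sum of the loadings (F)
    e = error_variances.sum()            # sum of the error variances (E)
    return ave, f / (f + e)


# Hypothetical loadings for a three-indicator construct.
ave, cr = ave_and_composite_reliability([0.7, 0.8, 0.9])
convergent_validity = ave > 0.50         # cut-off used in this chapter
print(f"AVE={ave:.3f}, composite reliability={cr:.3f}, convergent={convergent_validity}")
```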


13

STUDY 3: RESULTS

In this chapter the correctly specified reflective-formative model of WAS was fitted

and evaluated using the Partial Least Squares SEM procedure described in Chapter 12. The

results were then compared with the misspecified models of WAS to demonstrate the

consequences of misspecification.

13.1 Results of Model Fit Evaluation

SmartPLS was employed to estimate the inner and outer first-, second-, and third-

order loadings. Tables 13.1 and 13.2 show the reliability results and convergent-

discriminant validity of the constructs at the first-order of WAS. Table 13.1 demonstrates

the standardised coefficients, Average Variance Extracted (AVE) for first-order constructs,

the model-based reliability at construct level, and the conventional coefficient alpha

reliability at item level. The model-based reliability measures for the constructs are higher

than the conventional coefficient alpha. The model-based reliability of all constructs

exceeds the minimum acceptable level of .70, demonstrating acceptable reliability of the

constructs. The AVE of all constructs, with the exception of ‘training’ and ‘harassment’,

exceeds the cut-off point of .50, suggesting convergent validity in all but these two

constructs. Most importantly, all the lower-order loadings were significant.

Table 13.2 presents the intercorrelations of the first-order constructs along with their

Square Roots of Average Variance extracted (AVE) for assessing the discriminant validity

of the constructs. The results confirm that discriminant validity of the first-order constructs


exists because the square root of AVE for each construct is higher than any intercorrelation

with the other constructs.


Table 13.1 Quality Criteria of the Reflective-formative WAS First-order Constructs using PLS-SEM

Latent Variable     Indicator   Loading   AVE    Model-based   Coefficient alpha   Convergent
                                                 reliability   reliability         validity
LEISURE             Q56d        0.68      0.52   0.77          0.55                Yes
                    Q56h        0.68
                    Q56i        0.78
HOME-WORK           Q51c        0.92      0.82   0.90          0.78                Yes
                    Q51d        0.89
WORK-HOME BALANCE   Q51a        0.94      0.88   0.94          0.86                Yes
                    Q51b        0.93
PHYSICAL HEALTH     Diagnosis   0.64      0.61   0.75          0.39                Yes
                    Q37         0.89
MENTAL HEALTH       Q47a        0.89      0.74   0.90          0.83                Yes
                    Q47b        0.89
                    Q47c        0.79
CONTROL             Q15a        0.67      0.50   0.85          0.79                Yes
                    Q15b        0.68
                    Q15c        0.65
                    Q19a        0.68
                    Q19b        0.72
                    Q19c        0.76
TRAINING            Q30a        0.53      0.44   0.76          0.59                No
                    Q30c        0.77
                    Q30d        0.63
                    Q30e        0.68
HARASSMENT          Q42a        0.74      0.49   0.79          0.65                No
                    Q42c        0.63
                    Q42d        0.73
                    Q42h        0.68
SUPPORT             Q27a        0.89      0.80   0.92          0.87                Yes
                    Q27e        0.90
                    Q27f        0.89
RESPECT             Q22c        0.89      0.81   0.93          0.89                Yes
                    Q22d        0.93
                    Q22e        0.87
TRUST               Q24e        0.85      0.78   0.92          0.86                Yes
                    Q24f        0.89
                    Q24g        0.90


Table 13.2 Intercorrelation Analysis and the Square Roots of AVE of First-order Constructs of Reflective-formative PLS-SEM Model †

Construct        Discriminant  LEISURE   HOME-     WORK-     PHYSICAL  MENTAL    CONTROL   TRAINING  HARASS-   SUPPORT   RESPECT   TRUST
                 validity                WORK      HOME      HEALTH    HEALTH                        MENT
LEISURE          YES           0.72
HOME-WORK        YES           0.11      0.91
WORK-HOME        YES           0.21      0.23      0.94
PHYSICAL HEALTH  YES           0.20      0.11      0.21      0.78
MENTAL HEALTH    YES           0.29      0.20      0.31      0.34      0.95
CONTROL          YES           0.10      -0.01     0.19      0.10      0.19      0.71
TRAINING         YES           0.05      -0.07     0.01      0.01      0.09      0.21      0.67
HARASSMENT       YES           0.04      0.05      0.24      0.15      0.20      0.16      0.07      0.70
SUPPORT          YES           0.10      0.00      0.20      0.09      0.15      0.30      0.37      0.24      0.89
RESPECT          YES           0.10      0.09      0.30      0.20      0.26      0.37      0.20      0.41      0.53      0.90
TRUST            YES           0.11      0.10      0.30      0.18      0.24      0.32      0.19      0.35      0.52      0.78      0.89

† The square roots of the average variance extracted (AVE) are shown on the diagonal.


Upon satisfying the validity and reliability of the first-order constructs, the next step

involved the assessment of the validity of the second- and third-order formative constructs. As

mentioned previously, the issue of reliability is meaningless for formative constructs; instead, the

significance of the predictors’ paths (the path from the subconstructs to their corresponding

formative construct) is important. Tables 13.3 and 13.4 present the path coefficients of

subconstructs for the higher-order construct/s, confirming that all these paths are significant, and

hence all these formative constructs are valid, supporting hypothesis 12.1.

Table 13.3 The Standardised Mean Coefficients of the Second-order Formative Constructs of Reflective-formative PLS-SEM Model (n=5000 bootstrap)

Path                             Standardised path     T value   Support
                                 coefficient (mean)
LEISURE -> PERSONAL              0.31                  56.82     YES
HOME-WORK -> PERSONAL            0.34                  55.95     YES
WORK-HOME -> PERSONAL            0.23                  53.81     YES
PHYSICAL HEALTH -> PERSONAL      0.20                  50.33     YES
MENTAL HEALTH -> PERSONAL        0.63                  59.63     YES
CONTROL -> ORGANISATIONAL        0.74                  64.46     YES
TRAINING -> ORGANISATIONAL       0.19                  54.31     YES
HARASSMENT -> ORGANISATIONAL     0.33                  58.62     YES
SUPPORT -> ORGANISATIONAL        0.34                  67.59     YES
RESPECT -> ORGANISATIONAL        0.37                  61.33     YES
TRUST -> ORGANISATIONAL          0.39                  62.67     YES

Note: p<0.05


Table 13.4 Results for Third-order Formative Constructs of Reflective-formative WAS (n=5000 bootstrap samples)

Path                             Standardised path     T value   Support
                                 coefficient (mean)
ORGANISATIONAL -> WAS            0.67                  74.79     YES
PERSONAL -> WAS                  0.50                  52.48     YES

Note: p<0.05

Both Table 13.4 and Figure 13.1 demonstrate the significant path coefficients for the

reflective-formative WAS model, tested using bootstrapping (n=5000). The path coefficients

for organisational capacities (β=0.67) and personal capacities (β=0.50) suggest that

organisational capacity is a somewhat stronger component of WAS than personal capacity.


Figure 13.1. The final model of reflective-formative WAS development using PLS path

modeling.

To evaluate the next hypothesis (12.2) and to demonstrate the possible Type I and II

errors resulting from measurement model misspecification, the misspecified models of WAS (i.e.,

reflective-reflective and formative-formative) were evaluated using PLS-SEM. In addition,

Covariance-based SEM was applied to evaluate the reflective-reflective model of WAS to

establish whether the difference in model or the difference in estimation method was responsible

for the differences in the results. Unfortunately, due to an identification problem, the formative-

formative model of WAS could not be evaluated using the MIMIC method with Covariance-

based SEM procedure. The full details of the results and the step-by-step guide to evaluating the


misspecified models are presented in Appendix E. The next section presents the comparison of

path coefficients and reliability coefficients of the misspecified models with the correctly

specified model of WAS, fitted using PLS-SEM.

13.2 Comparison of the Misspecified Models with the Correctly Specified WAS Model

In this analysis, four sets of coefficients were compared: the misspecified

reflective-reflective models (fitted using both CB-SEM and PLS-SEM), the misspecified formative-formative model, and the correctly

specified reflective-formative WAS model. The full analysis and results are presented in

Appendix B. Table 13.5 presents a comparison of the path coefficients of all four analyses. The

results showed that, in comparison with the correctly specified reflective-formative model, most

paths of the misspecified reflective-reflective models were highly inflated, regardless of the

estimation procedure used (i.e., CB-SEM or Partial Least Squares SEM). Conversely, in the

misspecified formative-formative model, the path coefficients were mostly deflated, especially for

the lower-order constructs. The results indicate that the inflated (in the misspecified reflective models)

and deflated (in the misspecified formative model) path coefficients can lead to Type I and II errors

respectively.


Table 13.5 Comparing the Standardized Path Coefficients of Misspecified and Correctly Specified WAS Models

                                 Misspecified models                                      Correctly specified model
Constructs                       1) Reflective-     2) Reflective-     3) Formative-      4) Reflective-
                                 Reflective         Reflective         Formative          formative
                                 CB-SEM             PLS-SEM            PLS-SEM            PLS-SEM
LEISURE -> PERSONAL              .44                .55                -0.02†             0.31
HOME-WORK -> PERSONAL            .34                .46                -0.01†             0.34
WORK-HOME -> PERSONAL            .57                .65                0.70               0.23
PHYSICAL HEALTH -> PERSONAL      .68                .53                0.20               0.20
MENTAL HEALTH -> PERSONAL        .71                .80                0.40               0.63
CONTROL -> ORGANISATIONAL        .53                .58                0.24               0.74
TRAINING -> ORGANISATIONAL       .32                .39                -0.15              0.19
HARASSMENT -> ORGANISATIONAL     .52                .50                0.36               0.33
SUPPORT -> ORGANISATIONAL        .65                .74                0.04†              0.34
RESPECT -> ORGANISATIONAL        .95                .87                0.33               0.37
TRUST -> ORGANISATIONAL          .92                .84                0.36               0.39
ORGANISATIONAL -> WAS            .72                0.90               0.68               0.67
PERSONAL -> WAS                  .62                0.71               0.51               0.50

Note: † non-significant paths.

In contrast, the model-based reliability coefficients of the misspecified reflective-reflective

model fitted using CB-SEM show a downward (deflating) bias compared to the correctly specified reflective-

formative WAS model fitted using PLS-SEM (Table 13.6). This is primarily due to the shared

measurement errors in reflective second- and third-order models. In the reflective-formative

models, the first-order constructs predict the second-order construct, so measurement error is not

shared across levels and does not affect the reliability of the first-order constructs. As expected, the results of the

reflective-reflective WAS model, evaluated with Partial Least Squares SEM, showed the same

reliability coefficients as the correctly specified reflective-formative WAS model. This occurred

because in Partial Least Squares SEM, the reliability coefficients of the first-order constructs

were evaluated in isolation. Therefore, the misspecification of second or third-order constructs

does not affect the reliability of the first-order constructs. The reliability coefficients for the

misspecified formative-formative model of WAS were not calculated. As stated previously, the

issue of reliability is meaningless for formative constructs where the indicators are predictors of

the construct.


Table 13.6 Comparing the Model-based Reliability Coefficients of a Misspecified Reflective-reflective WAS (CB-SEM) with the Correctly Specified Reflective-formative Model of WAS

                                                           Reflective-reflective     Reflective-formative
                                                           WAS (CB-SEM)              WAS (PLS-SEM)
Latent Variable     Indicators                             Model-based reliability   Model-based reliability
LEISURE             Q56d, Q56h, Q56i                       .59                       0.77
HOME-WORK           Q51c, Q51d                             .80                       0.90
WORK-HOME BALANCE   Q51a, Q51b                             .86                       0.94
PHYSICAL HEALTH     No of conditions, Q37                  .42                       0.75
MENTAL HEALTH       Q47a, Q47b, Q47c                       .83                       0.90
CONTROL             Q15a, Q15b, Q15c, Q19a, Q19b, Q19c     .89                       0.85
TRAINING            Q30a, Q30c, Q30d, Q30e                 .59                       0.76
HARASSMENT          Q42a, Q42c, Q42d, Q42h                 .66                       0.79
SUPPORT             Q27a, Q27e, Q27f                       .87                       0.92
RESPECT             Q22c, Q22d, Q22e                       .89                       0.93
TRUST               Q24e, Q24f, Q24g                       .86                       0.92


14

STUDY 3: DISCUSSION

This chapter provides a discussion of the results obtained in Chapter 13. While

validation of reflective models is common in the literature, there is little work on the

validation and assessment of model-based reliability for measurement models containing

formative constructs. The purpose of Study 3 was to illustrate empirically the fitting,

validation, and model-based reliability assessments of a reflective-formative model for the

Work Ability Scale (WAS), using Partial Least Squares SEM. The Work Ability Scale is

misspecified in the literature as a full reflective model. The proposed reflective-formative

model of WAS is a correctly specified model based on the theory described in Chapter 5

and the related decision-making tree that was developed in Chapter 5 (Figure 5.5). The

secondary aim of this study was to demonstrate the likelihood of Type I and II errors

occurring. This was achieved by comparing the correctly specified model with a

misspecified reflective-reflective model (fitted using Partial Least Squares SEM and

Covariance-Based SEM), and a misspecified full formative-formative model (fitted using

Partial Least Squares SEM). Unfortunately, due to identification issues, evaluation of the

formative-formative model using Covariance-Based SEM was not achieved, allowing only

an evaluation of the formative-formative model fitted using Partial Least Squares SEM.

The proposed revised reflective-formative WAS model was based on the work of the

Redesigning Work for an Ageing Society research program at Swinburne University of

Technology (2009) and was evaluated using AMOS and SmartPLS. The three different


models of WAS (reflective-reflective, formative-formative and reflective-formative) were

built using the PLS path modelling approach, employing the repeated indicators approach

and the two-stage approach (Becker et al., 2012; Hair, Hult, Ringle, & Sarstedt, 2014;

Wold, 1982). Based on the literature, applying the repeated indicator approach at two stages

creates less bias, more reliable parameter estimates/scores, and a more precise estimation of

path coefficients of constructs (Becker et al., 2012).

The results of the Partial Least Squares SEM approach to the fitting of the

reflective-formative model for WAS showed that the proposed second-order WAS model in

this empirical illustration contained relatively valid indicators and predictors for the WAS

model. The t-statistics generated by bootstrapping also showed significant paths for both

organisational and personal capacities. The correctly specified model demonstrated

acceptable discriminant and convergent validity (with exceptions in the case of the training

and harassment subconstructs).

The model-based reliability measures were acceptable for the first-order reflective

constructs. Coefficient alpha clearly underestimated reliability compared

to the model-based reliability coefficients of the first-order constructs. The main reason for

this underestimation is thought to be the assumption

of essential tau-equivalence made in calculating coefficient alpha. Coefficient

alpha is also only a lower bound for reliability and therefore results

in underestimation of the true reliability (Graham, 2006).
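
The lower-bound property can be illustrated with a small sketch. The code below is not the analysis used in this study: it builds the population covariance matrix implied by a one-factor model with standardised indicators (an assumption), and the loadings and function names are hypothetical. When the loadings are unequal (tau-equivalence violated), coefficient alpha falls below the model-based coefficient; when they are equal, the two coincide.

```python
import numpy as np


def implied_covariance(loadings):
    """Covariance matrix implied by a one-factor model with unit-variance indicators."""
    lam = np.asarray(loadings, dtype=float)
    sigma = np.outer(lam, lam)       # off-diagonal covariances: lambda_i * lambda_j
    np.fill_diagonal(sigma, 1.0)     # standardised indicators have variance 1
    return sigma


def coefficient_alpha(sigma):
    """Coefficient alpha computed from a covariance matrix."""
    k = sigma.shape[0]
    return k / (k - 1) * (1.0 - np.trace(sigma) / sigma.sum())


def model_based_reliability(loadings, sigma):
    """Omega-type coefficient: squared sum of loadings over total variance."""
    lam = np.asarray(loadings, dtype=float)
    return lam.sum() ** 2 / sigma.sum()


unequal = [0.5, 0.7, 0.9]            # hypothetical congeneric loadings
equal = [0.7, 0.7, 0.7]              # hypothetical tau-equivalent loadings

alpha_unequal = coefficient_alpha(implied_covariance(unequal))
omega_unequal = model_based_reliability(unequal, implied_covariance(unequal))
alpha_equal = coefficient_alpha(implied_covariance(equal))
omega_equal = model_based_reliability(equal, implied_covariance(equal))

print(f"unequal loadings: alpha={alpha_unequal:.3f}, omega={omega_unequal:.3f}")
print(f"equal loadings:   alpha={alpha_equal:.3f}, omega={omega_equal:.3f}")
```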

Comparisons of the model-based reliability coefficients of the correctly specified

model of WAS with its misspecified reflective-reflective model produced thought-provoking

results. The model-based reliability for the misspecified model underestimated

the reliability compared to the correct model. These results are important for several

reasons. Comparing the correlations between the first-order constructs, it seems that the

misspecified model overestimated the correlations among its constructs

compared to the correct model. The results support the claim in the literature

stating that underestimated reliability coefficients increase the correlations between the

constructs (e.g., Fan, 2003; Revelle & Zinbarg, 2008). As determined by these scholars,

underestimating the reliability results in an overestimation of correlation among constructs

and vice versa. Revelle and Zinbarg (2008) stress that selecting the proper reliability

coefficient is important in multidimensional studies. The findings of this study add to their

recommendation that researchers should also pay attention to the specification of the

measurement models. Even if proper model-based reliability coefficients are used, a

model that is misspecified (in terms of its reflective vs. formative nature) leads to bias in the

reliability estimates.
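
One way to see the link between reliability and construct correlations is the classical correction for attenuation, given here as a hedged illustration and not as the computation used in this study. The symbols (r_xy for the observed correlation, ρ_xx and ρ_yy for the reliabilities of the two measures) are notation assumed for this sketch, and the numbers are hypothetical. For the same observed correlation, a lower reliability estimate yields a larger corrected correlation.

```latex
\hat{r}_{xy} \;=\; \frac{r_{xy}}{\sqrt{\rho_{xx}\,\rho_{yy}}}
\qquad\text{e.g.}\qquad
\frac{0.30}{\sqrt{0.80 \times 0.80}} = 0.375,
\qquad
\frac{0.30}{\sqrt{0.60 \times 0.60}} = 0.50 .
```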

The results of the comparison of the correct and misspecified models showed

inflated and deflated path coefficients when the model was misspecified as reflective or

formative respectively. When comparing reflective-reflective misspecified models (fitted

using Covariance-Based SEM and Partial Least Squares SEM) with the correctly specified

reflective-formative model, the inflated coefficients found for the majority of the paths

indicated a higher likelihood of Type I error. Based on the above comparison of

reliabilities, this inflation of inter-relationships is related to the lower reliabilities found in

the misspecified reflective-reflective model. Nonsignificant and deflated coefficients were


reported for some paths when a full formative-formative model was compared with the

reflective-formative model. This demonstrated the presence of Type II error (failing to reject a

false null hypothesis). Based on the simulation study of Jarvis et al. (2003), when the structural

paths originate from a construct misspecified as reflective, there is a high

possibility of inflated path estimates, resulting in Type I error.

As mentioned in the reliability discussion, since reliability is under-estimated in

reflective constructs, the path coefficients are inflated compared to the formative constructs

(Fan 2003). In both misspecified reflective-reflective models fitted using Covariance-

Based SEM and Partial Least Squares SEM, the path coefficients were therefore inflated

compared to the correctly specified reflective-formative model. This is consistent with

previous empirical and simulation studies (Aguirre-Urreta and Marakas 2008, 2012; Jarvis

et al., 2003; Law and Wong 1999; MacKenzie et al., 2005; Petter et al., 2007). Similar to

previous studies, when a formative model was misspecified as reflective, the misspecified

constructs upwardly biased the coefficients of the model (Petter et al., 2011).

Any bias in a study leads to misleading conclusions and therefore it is critical to pay

attention to model specification (Petter et al., 2011). The substantial level of

misspecification in the area of psychology identified in Chapter 5 (18%) demonstrates the

need to pay greater attention to model specification in order to achieve reliable results.

These findings are important not only for this specific example but also for future studies,

opening a new area of study necessitating further research and development.


14.1 Implications for Work Ability Assessments

A validated scale of work ability would have many practical benefits. The work

ability concept and the Work Ability Index have far-reaching and strategic benefits for

work organisations, resulting in better productivity. Specific benefits of the concept include

early prediction of work disability, initiation of preventive procedures, recognition of work

ability status and the need for promotion (Daws, 2012; Ilmarinen, 2010).

The concept of work ability has advanced significantly from the original research on

the Work Ability Index due to the multidimensional holistic view provided by the work

ability model. According to scholars, work ability research in the future will include some

of the following (Daws, 2012; Ilmarinen, 2010):

• utilisation of a multidimensional work ability model with a link between research and practice;

• development of new work ability measures with better capacities for the identification of problems;

• comprehensive evaluation of effects of interventions;

• development of national and international work ability networks;

• development of national surveys and the creation of datasets;

• international studies of long-term effects on the Third Age (silent and boom generations) using a generational framework of analysis;

• improvement of tools for training; and

• development of curricula for occupational gerontology at universities.

The concept of work ability provides an all-inclusive and evidence-based concept

for quality of work life as well as positive ageing. However, major attitudinal, managerial

and occupational health and safety (OH&S) reforms are needed in the modern work-life

environment (Ilmarinen, 2010; Taylor, Sep 2008).


The importance of the workplace as a component of quality of life is well known.

Effective evaluation of work ability, appropriate management and supervision of workers,

and the improvement of work ability and occupational well-being to achieve a win-win

situation are the key ingredients. While the work ability concept is primarily concerned

with the working population, it is equally important to maintain the workability of the

unemployed.

Population ageing in many countries has led to concerns about labour supply, thus

giving rise to an increasing emphasis on prolonging working life (Taylor, Sep 2008). The

creation of a ‘golden age’ for older workers requires overcoming an early retirement

mentality, changing business behaviour and attitudes among the social actors, and

instituting new public policies.

In the meantime, we need reliable information based on follow-up studies or data

from workplaces. We also need international comparisons of the work ability of

populations and, more particularly, we need to identify the factors that maintain and

promote work ability (Gould et al., 2008). Estimates of the work ability of different

populations are required to support decision-making on health, work, and pension policies.

One of the critical challenges is for studies to focus on the future – “How can we find the

best predictors for the development of the population’s (future) work ability?”

14.2 Limitations and Directions for Future Research

Part of the study focus was to clarify the difference between reflective and

formative measurement models. The literature review has revealed that there are some

serious misspecification problems in the field of organisational psychology. It seems that


lack of knowledge could be one of the main reasons for misspecified models. Based on the

literature, a framework for identifying formative vs. reflective models has been presented in

Chapter 5 to help researchers to better identify the most appropriate type of measurement

models for constructs. The proposed decision making framework is easy to understand and

at the same time very comprehensive, and should therefore be of benefit to researchers.

However, some important issues regarding the identification of formative vs.

reflective models still need to be resolved. In some cases, the relevance of

reflective/formative constructs may differ according to the group (e.g., gender, occupation

level, etc.) or situation. Further studies are required to shed more light on such specific

group/situation complexities.

The difficulties encountered when fitting models for formative constructs using

Covariance-Based SEM are another hurdle in choosing formative constructs. Some of the

well-known solutions include using Monte Carlo simulations and MIMIC models in which

the reflective-formative models are expanded by adding reflective indicators for the higher-

order latent constructs. Despite MIMIC being a suitable procedure for the identification of

formative measures in most cases, this solution is criticised in the literature. With this

procedure the formative ƒ construct is replaced with F (represented by another standard

common factor), resulting in the deterioration of the intended meaning of the formative

construct, which is formed by its antecedents (Treiblmaier, Bentler & Maira, 2011). More

importantly, it is not clear how to use this method with a third-order model, such as that

considered in this study. This problem is solved by using PLS-based modelling for

formative constructs instead of the more popular variance-covariance-based SEM. A more


recent solution for the estimation of reflective-formative models was proposed by Treiblmaier,

Bentler and Maira (2011). They proposed substituting ƒ with F with minimal manipulation.

In this procedure, using canonical correlation in a two-step approach, the items belonging

to each formative construct are split into two (or more) composites. The newly developed

canonical constructs can then be treated as common reflective factors and can be placed

into any reflective SEM model (Treiblmaier, Bentler, & Maira, 2011). However, further

studies are needed to shed more light on the estimation problems encountered when fitting

formative constructs.

The results of both studies (Study 1 and Study 2) showed that conventional

coefficient alpha is not the best method for the estimation of internal consistency. Further

studies should report model-based reliability coefficients especially for multidimensional

scales like WAS.


15

SUMMARY

In the final chapter of this thesis, a summary of each study and its main

contributions to the literature is presented, followed by a general concluding summary.

15.1 Study 1: Model-Based Reliability, Validity and Cross Validity of the Bifactor Model

for WOAQ

In this study, attempts were made to assess the validity, cross-validity and reliability of

the WOAQ in two Australian health settings, the community nursing and paramedic industries.

Based on the literature, a robust procedure of bifactor modeling was adopted for assessing the

validity of the WOAQ, which was then compared with a higher order model. Cross-validity

procedures, using mean and covariance structures (MACS), were adopted to evaluate the

invariance across gender in regard to covariance structure and observed means. Also the means

at construct level in the bifactor model were evaluated. This is a neglected area in the literature.

To estimate robust and more accurate reliability coefficients, instead of relying on the

conventional internal consistency measure (coefficient α), model-based omega reliability

coefficients were used.

In general, the results showed that the WOAQ appears to be a superior instrument for

assessing risk factors due to its satisfactory psychometric properties and short length. Also it was

demonstrated that a bifactor model of WOAQ fits the data better than a higher order model. A

bifactor model of WOAQ provides more information not only for the general (overall) WOAQ

factor but also for its nested factors and their relative importance in a given setting. In this study,

it was documented that a general WOAQ factor has more importance than its nested factors in a

health setting. This result may be related to the differences in model structure for a

nursing/paramedics setting compared to a manufacturing setting.

WOAQ was initially developed as part of a risk assessment in the manufacturing sector

where direct line management is important. However, relationships with colleagues are more

important in the health sector. Therefore the nested factors on management can be expected to be

less important than relationships with colleagues in health settings.

The path loadings for some of the nested constructs were low indicating that a dominant

proportion of the variation within each indicator is attributable to a general factor of WOAQ

rather than its nested sub factor. Therefore, it is recommended that future studies should consider

WOAQ as a single general score even though it contains five nested factors. Methodologically,

calculating only a separate single score for each of the nested factors does not appear to be a good

choice in such settings.

This recommendation is further supported by the results of omega model-based reliability

at both general level and subscales’ level. These model-based reliabilities were acceptable though

demonstrating more reliability for the general factor of WOAQ than its subscales. The general

factor of WOAQ accounted for the largest portion of the variance compared to the nested factors,

especially ‘the relationship with the management’ and ‘reward and recognition’ subscales. It is

therefore suggested that for the current sample of community nursing service, it is more

appropriate to use and report the general factor of WOAQ rather than its nested factors in

isolation. When they were compared with the conventional coefficient alpha, the results showed

overestimated reliability for coefficient alpha compared to the omega coefficients. It is therefore

recommended that in any future studies researchers should by default use only model-based

reliability coefficients.
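
As a hedged illustration of the conventional coefficient alpha that the omega coefficients were compared against, the following minimal Python sketch computes alpha from raw item scores using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of the total score). The function name and the toy scores are assumptions made for illustration only; they are not taken from the WOAQ data or from the software used in this thesis.

```python
from statistics import variance


def cronbach_alpha(items):
    """Coefficient alpha for a list of item score lists.

    items[j][i] is the score of respondent i on item j (hypothetical layout).
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Total score per respondent (sum across items).
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Toy example: three items that differ only by a constant shift.
toy_items = [
    [1, 2, 3, 4],
    [2, 3, 4, 5],
    [1, 2, 3, 4],
]
toy_alpha = cronbach_alpha(toy_items)
```

Alpha weights every item equally and assumes a single common factor, so it does not separate general from nested-factor variance; this is the reason model-based coefficients are preferred for multidimensional scales.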

15.2 Study 2: Applications of Covariate-dependent Reliability

Study two presented two applications of the newly proposed covariate-dependent and

covariate-free reliability approach of Bentler’s. The applications demonstrated in this study were

the reliability assessment of WOAQ and the role of occupation type and also the effects of CMB

on reliability. Using the covariate-dependent and covariate-free approach, it was demonstrated that

although WOAQ showed acceptable reliability in a nursing and paramedic organisation

separately, when these samples were combined a considerable proportion of the WOAQ variance was

attributable to the organisation type. Surprisingly the results showed that although ‘within’

organisation reliability exists for the WOAQ, ‘between’ organisation assessments failed to

demonstrate a high degree of reliability between the nursing and paramedic samples. The reasons

for such differences in reliability were explained in terms of the differences between these

organisations, their demographic characteristics, the different pace of work, different work

settings and different ways of interacting with the patient and providing service delivery. Often

scholars neglect to perform reliability assessments for their scales even when being used for the

first time in a new setting. WOAQ is one example of many scales that are highly influenced by

the type of organisation and/or the demographic characteristics of the population.

The second application of Bentler’s covariate-dependent approach was demonstrated

in the context of CMB. This new procedure was proposed for assessing the effects of CMB on

the reliability of a model. It appears that CMB has a marked effect on the reliability of the

model considered in this study. This seems to be an interesting area to be explored in further

research. The presence of CMB was backed up by further analysis using a confirmatory factor

analysis (CFA) marker approach for controlling for CMV/CMB. The results supported the

presence of CMB in this application therefore supporting the influence of CMB on the reliability

of the scales. This is an important finding; if scholars using covariate-dependent reliability

assessment can show any CMB effect, then they must control for CMB in the rest of their

analysis using a marker variable or some other approaches. Covariate-dependent reliability

assessment therefore provides a new quick and easy method for testing for CMB/CMV.

Focusing on the causes and consequences of CMB using preventive procedures, as well

as statistical procedures, better ways to prevent and control for the possible effects of CMV/CMB

are recommended. In this study using a preventative procedure, one of the potential common

method biases (social desirability) was detected and measured. Using statistical procedures,

unmeasured sources of possible bias (CMV) were also controlled and evaluated.

15.3 Study 3: Model-based Reliability and Validity of Reflective-formative Model of WAS

In study 3, a comprehensive statistical and theoretical explanation of the differences

between formative and reflective models was presented. Then, based on the literature, a

comprehensive, simple and easy to follow decision making tree was proposed to easily

distinguish reflective from formative models. The next aim was to illustrate how big the

misspecification problem is in the area of organisational psychology. Although a few literature

reviews have been carried out in other disciplines highlighting the misspecification rate, no study

has been carried out in the psychology area. Given that scholars in the psychology discipline

usually hold strong statistical knowledge, there was an expectation of a lower level of error in this

area. Using this decision making tree, a broad literature review of two top journals of

Organisational Psychology was undertaken over a 9-year period (2006-2014). The two

researchers found a high level of agreement (Kappa=.89) on distinguishing between misspecified

models using the proposed decision making tree. An 18 percent misspecification rate was found

in the literature review of articles in these two organisational psychology journals.
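
The inter-rater agreement reported above is Cohen's kappa, which corrects the observed proportion of agreement for the agreement expected by chance. The following minimal Python sketch shows the calculation; the function name and the example classifications are hypothetical and do not reproduce the ratings made in this study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters classifying the same objects."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must classify the same non-empty set")
    n = len(rater_a)
    # Observed proportion of agreement.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal category frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    if expected == 1:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1 - expected)


# Hypothetical ratings: "R" = correctly specified, "M" = misspecified.
ratings_a = ["R", "R", "M", "M"]
ratings_b = ["R", "R", "M", "R"]
kappa = cohens_kappa(ratings_a, ratings_b)
```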

One of the main reasons for misspecification could be a lack of knowledge or problems in

fitting formative models. The majority of the readily available software for SEM is designed for

fitting reflective models. Therefore the main aim of study 3 was to empirically fit and evaluate

the validity and model-based reliability of a mixed reflective-formative model for WAS. WAS is

misspecified in the literature as a reflective model. The second aim was to compare and to

demonstrate the outcomes of model misspecification and the likelihood of Type I and II errors.

As a first step, it is important to design and distinguish the structure of a measurement

model before commencing the data collection. Using the literature and conceptual background of

the constructs, it should be determined at the outset whether the constructs are formative or

reflective. The decision flowchart was applied in the context of a work ability survey (WAS).

Using empirical data, an evaluation of the reflective-formative, formative-formative and

reflective-reflective higher order models was performed. In this evaluation, two model fitting

procedures (CB-SEM and PLS-SEM) were used for all models. Unfortunately, due to

identification problems, the evaluation of a formative-formative model using CB-SEM was not

possible. Two common procedures, repeated indicators and a two-stage path-modeling approach

were used for fitting the three models using PLS-SEM with SmartPLS software. The fitted

models showed major differences between the correctly specified reflective-formative model and

the incorrectly specified reflective-reflective and formative-formative models. For the incorrectly

specified reflective-reflective model, the structural paths were significantly inflated compared to

the correctly specified reflective-formative model, suggesting a higher probability for Type I

errors. Interestingly this was more of a problem when PLS-SEM was used to fit the reflective-

reflective model than when CB-SEM was used. The comparison of the incorrectly specified fully

formative-formative model with the correctly specified reflective-formative model showed some

deflated/nonsignificant loadings. This was more evident for the lower order constructs,

suggesting a higher probability for Type II error. These findings exhibited empirically the

dangers of model misspecification.

It is highly recommended that scholars specify their measurement models with more

caution in order to avoid Type I or II errors. The nature of constructs needs to be identified before

the model fitting software is chosen. The theoretical background should always be considered as

a first step to identify and conceptualise the nature of constructs (reflective vs formative or

mixed).

15.4 Thesis contributions to SEM

The contribution of this thesis and directions for future research are discussed in detail at

the end of each chapter. A summary of the contribution of the thesis, specifically in regard to the

SEM discipline, is presented below. The findings of the 3 studies undertaken in this project,

contribute to SEM by:

- Path diagrams showing history of SEM and model based reliability. An overview

of the literature on SEM and model-based reliability in psychology was provided using

two simple, yet comprehensive path diagrams. In Chapter 2, an overview of the

development of SEM in psychology was presented using a path diagram (Figure 2.1).

Similarly, in Chapter 3 a history of model-based reliability was presented using a path

diagram (Figure 3.1). This diagram illustrates the history, recent developments and

current gaps in the literature, and highlights some justifications for carrying out the studies in

this thesis. These diagrams can be utilized as effective training tools for

both Statistical and Psychology students to better understand the early roots of SEM and

how more recent SEM developments relate to each other.

- Validating a bifactor model in a health setting using SEM. A comprehensive

procedure for the validation of a bifactor measurement model was assessed in study 1

using SEM. Study of bifactor models and their implications is a neglected area in the

literature and especially in the psychology discipline. These findings shed more light on

this poorly investigated area.

- Model-based reliability of a bifactor scale. Calculating and comparing the model-

based reliability coefficients of a bifactor model with the overestimated conventional

coefficient alpha, demonstrated the importance of using model-based reliability for

multidimensional constructs or complex scales.

- Cross-validation of a bifactor scale using latent factor means and covariance

structures (MACS) procedure in SEM. In study 1, cross-validation of a bifactor

measurement scale WOAQ was assessed across gender using MACS. The conventional

procedure for the cross-validation of measurement models considers only covariances and

observed means. Using MACS, the cross validity goes beyond observed parameter

invariance assessment and looks at the mean differences at construct level. This procedure

for a bifactor model is a contribution to the SEM literature that provides a more

comprehensive assessment of the validity of a scale in different populations.

- Presenting an empirical application for the novel concept of covariate-dependent

reliability using SEM. Two new applications of covariate-dependent reliability were

introduced for the first time in study 2. Using an empirical example, it is shown how ‘type

of occupation’ can affect the reliability of a scale. As such, a tool that is highly reliable in

one specific organisation might show very poor reliability in another organisation after

controlling for confounding variables; for example, controlling for ‘organisation type’

reduces the reliability of a scale considerably. This procedure is expected to have many

implications in the SEM discipline and with issues related to model-based reliability.

- Demonstrating the novel application of covariate-dependent reliability in the

evaluation of CMB using SEM. A novel approach is proposed in study 2 by drawing

attention to the possible effects of CMB on the reliability of a model. In this study the

covariate-dependent reliability procedure was applied to assess the effects of CMB

(measured using a social desirability scale) on the reliability of a model. The results

highlight clearly how CMB can influence the reliability of scales. This is a novel area of

study and will have many applications in further studies.

- Developing a flowchart for distinguishing formative versus reflective models.

Providing a simple, yet comprehensive guideline using a flowchart (Figure 5.5) for

distinguishing between formative and reflective SEM measurement models was another

contribution to SEM. By having a clear guideline or procedure, researchers gain better

confidence in specifying the nature of new appropriate measurement models (e.g.

formative models). Lack of knowledge or clear rules/principles may otherwise create a

high risk of misspecification in the field.

- Demonstrating the misspecification rate of SEM measurement models (formative

vs. reflective) in the Organisational Psychology literature. As mentioned in the literature

review of Chapter 5, lack of knowledge is one of the reasons for the misspecification in

the case of the measurement models used in the field of Organisational Psychology.

Presenting a misspecification review for a 9-year period in the Organisational Psychology

literature will create some awareness of the extent of the problem. The results showed an

18% misspecification rate suggesting a problem in this discipline, as is the case in some

other disciplines. Misspecification of measurement models may lead to incorrect findings

and the development of misleading theories.

- Presenting an empirical example for fitting a reflective, a formative and a

reflective-formative model using a partial least squares-based SEM approach. The

majority of the SEM software on the market is built mainly for conducting CB-SEM

evaluations. However, fitting a formative model of any type using CB-SEM is deemed to

be difficult and usually results in identification problems. One of the solutions in such

situations is the use of a PLS-SEM program to fit formative models. This is still a new

area of study and as a result there is limited knowledge on fitting and evaluation

approaches. In study 3, three different types of SEM measurement model were fitted and

evaluated using PLS-SEM. The procedures developed in this thesis for fitting higher-

order models for mixed models using PLS-SEM therefore represent a significant

methodological advance.

- Empirical comparisons of correctly specified versus misspecified measurement

models. The majority of the studies in the SEM discipline compare different types of

measurement models using simulation studies. In study 3, empirical data was used to

evaluate different types of measurement model using two common approaches (CB-SEM

and PLS-SEM).

- Assessing the likelihood of Type I and Type II errors as a result of measurement

model misspecification. The results of study 3 clearly showed how misspecified models

can lead to inflated, deflated or non-significant results, thereby increasing the risk of Type

I and/or Type II errors. These findings highlight the importance of correctly specifying

measurement models, thereby avoiding fundamental biases or errors. The danger of Type

I and II errors was well highlighted in study 3. This contributes to increasing awareness

among scholars about the consequences of measurement model misspecification.

These findings are all based on solid SEM theory and they are illustrated with

empirical analyses, which are of interest in their own right.

15.5 Summary

Overall this thesis has shown interesting applications where SEM is used for evaluating

model-based reliability and validity using both CB-SEM and PLS-SEM procedures. It has

also highlighted the procedures and applications of model-based reliability and validity

for under-investigated measurement models, such as the bifactor and mixed reflective-

formative models. In particular it has been shown how SEM makes possible the

estimation of model-based reliability, covariate-dependent reliability and covariate-free

reliability. In addition this thesis has demonstrated the need for careful identification of

the nature of constructs as formative or reflective in Organisational Psychology and the

usefulness of PLS-SEM for fitting formative models. Recent SEM developments suggest

that the importance of SEM will be growing in the future as its capabilities become more

powerful. It is hoped that this thesis has contributed to this growth in a small way.

16 APPENDICES

PLEASE NOTE: The articles listed below are not able to be reproduced online. Please consult the print copy of this thesis held in the Swinburne library.

Karimi, L & Meyer, D 2015, ‘Validity and model-based reliability of the work organisation assessment questionnaire among nurses’, Nursing Outlook, vol. 63, no. 3, pp. 318-330, doi: 10.1016/j.outlook.2014.09.003

Karimi, L & Meyer, D 2014, ‘Structural equation modelling in psychology: the history, development and current challenges’, International Journal of Psychology Studies, vol. 6, no. 4, pp. 123-133, doi: 10.5539/ijps.v6n4p123

Karimi, L 2015 (in press), ‘Cross-validation of the work organization assessment questionnaire across gender: a study of Australia health organization’, Journal of Occupational and Environmental Medicine.

16.5 EVALUATING A HIGHER-ORDER MISSPECIFIED REFLECTIVE MODEL OF WAS USING CB-SEM.

Using AMOS and the CB-SEM approach, the misspecified reflective models of WAS

were assessed using ML estimation, assuming normally distributed data. However, Mardia’s

Multivariate Kurtosis coefficient of 120.6 suggests that normality assumptions are not supported

(DeCarlo, 1997). Therefore bootstrapping methods were used to determine bias-corrected

confidence intervals for the parameter estimates. The bootstrap analysis indicated that the

structural paths were significant. Based on the results from the second-order model in Figure 16.1

and according to Byrne (2009), this reflective model of WAS describes the data well (χ2/df =

2.54, CFI=.95, RMSEA=.03). The standardised path parameter estimates are presented in Figure

16.1 and Table 16.1. The loading for Organisational Capacity is clearly stronger than the loading

for Personal Capacity suggesting that Organisational Capacity is a more important component of

work ability than Personal Capacity in the Australian context.
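
The bias-corrected bootstrap used above is carried out internally by AMOS for the SEM parameter estimates. To make the idea concrete, the following minimal Python sketch applies the same bias-corrected (BC) percentile logic to a much simpler statistic, the mean of a skewed sample. The function name, the simulated data and the choice of statistic are assumptions made for illustration; this is not the AMOS implementation and it does not use the WAS data.

```python
import math
import random
from statistics import NormalDist, mean


def bias_corrected_ci(data, statistic, n_boot=2000, alpha=0.05, seed=1):
    """Bias-corrected (BC) bootstrap confidence interval for statistic(data)."""
    rng = random.Random(seed)
    n = len(data)
    estimate = statistic(data)
    # Resample cases with replacement and recompute the statistic each time.
    boots = sorted(
        statistic([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    # Bias correction: share of bootstrap estimates below the sample estimate.
    below = sum(b < estimate for b in boots) / n_boot
    below = min(max(below, 1 / n_boot), 1 - 1 / n_boot)  # keep inv_cdf finite
    normal = NormalDist()
    z0 = normal.inv_cdf(below)
    z = normal.inv_cdf(1 - alpha / 2)
    # Shifted percentile levels replace the plain alpha/2 and 1 - alpha/2.
    lower_level = normal.cdf(2 * z0 - z)
    upper_level = normal.cdf(2 * z0 + z)
    lower_idx = min(n_boot - 1, max(0, math.floor(lower_level * n_boot)))
    upper_idx = min(n_boot - 1, max(0, math.ceil(upper_level * n_boot) - 1))
    return boots[lower_idx], boots[upper_idx]


# Hypothetical skewed (non-normal) sample, not thesis data.
sample_rng = random.Random(42)
sample = [sample_rng.expovariate(1.0) for _ in range(200)]
low, high = bias_corrected_ci(sample, mean)
```

An interval that excludes zero would be read as a significant estimate, which is how the bootstrap results for the structural paths are interpreted here.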

Figure 16.1. The standardised path parameter estimates of the misspecified reflective WAS model

Table 16.1 The Standardised Path Parameter Estimates for the Misspecified Reflective WAS Model Using the CB-SEM Procedure

Estimate

LEISURE <--- PERSONAL CAPACITY .44

HOME-WORK BALANCE <--- PERSONAL CAPACITY .34

WORK-HOME BALANCE <--- PERSONAL CAPACITY .57

PHYSICAL HEALTH <--- PERSONAL CAPACITY .68

MENTAL HEALTH <--- PERSONAL CAPACITY .71

CONTROL <--- ORG. CAPACITY .53

TRAINING <--- ORG. CAPACITY .32

HARASSMENT <--- ORG. CAPACITY .52

SUPPORT <--- ORG. CAPACITY .65

RESPECT <--- ORG. CAPACITY .95

TRUST <--- ORG. CAPACITY .92

ORG. CAPACITY <--- WORK ABILITY INDEX (WAI) .72

PERSONAL CAPACITY <--- WORK ABILITY INDEX (WAI) .62

The reliability of the reflective WAS subfactors was assessed. The results are

presented in Table 16.2. In summary, the majority of the subfactors produced acceptable

levels of model-based reliability (CR>0.60) (Byrne, 2009) with the exception of “leisure”

(CR=0.59), “physical health” (CR=0.42), and “training” (CR=0.59). Construct reliability will

be discussed in more detail in the next chapter.

Convergent validity is defined as “the extent to which a measure correlates positively

with alternative measures of the same construct” (Hair et al., 2014, p. 102). One procedure for

evaluating convergent validity uses the average variance extracted (AVE) (Fornell & Larcker,

1981). An AVE greater than 0.50 is used as the cutoff, indicating that the construct accounts for more than 50

per cent of the variance of its indicators (Fornell & Larcker, 1981). Discriminant validity

was assessed by comparing the intercorrelations between subfactors with the

square root of each construct’s average variance extracted (AVE) (Table 16.3). If the square root

of AVE was higher than the construct’s highest correlation with other constructs, then

discriminant validity existed. Based on the results, discriminant validity was shown for all

factors with no cross-loadings, apart from the “trust” construct. There is high cross-loading

between “trust” and “respect” constructs showing lack of discriminant validity for these two

subfactors.
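
The AVE, composite reliability (CR) and discriminant validity checks described above can be sketched in a few lines of Python. The sketch assumes standardised loadings, so that each indicator's error variance is 1 minus its squared loading, and uses the usual Fornell and Larcker formulas: AVE is the mean squared loading and CR is (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances). The function names are hypothetical; the SUPPORT loadings and the TRUST correlations are copied from Tables 16.2 and 16.3 purely as an illustration, and small rounding differences from the tabled values are to be expected.

```python
import math


def average_variance_extracted(loadings):
    """Mean of the squared standardised loadings."""
    return sum(l * l for l in loadings) / len(loadings)


def composite_reliability(loadings):
    """CR from standardised loadings, with error variance 1 - loading^2."""
    squared_sum = sum(loadings) ** 2
    error_sum = sum(1 - l * l for l in loadings)
    return squared_sum / (squared_sum + error_sum)


def fornell_larcker_ok(ave, correlations):
    """True if sqrt(AVE) exceeds the largest correlation with other constructs."""
    return math.sqrt(ave) > max(abs(r) for r in correlations)


# SUPPORT indicators (Q27a, Q27e, Q27f) from Table 16.2.
support_loadings = [0.829, 0.851, 0.827]
support_ave = average_variance_extracted(support_loadings)
support_cr = composite_reliability(support_loadings)

# TRUST: AVE from Table 16.2, correlations from its row of Table 16.3.
trust_correlations = [0.18, 0.14, 0.23, 0.28, 0.29, 0.48, 0.29, 0.47, 0.59, 0.87]
trust_discriminant = fornell_larcker_ok(0.67, trust_correlations)
```

With these inputs the TRUST check fails because its correlation with RESPECT (.87) exceeds the square root of its AVE, which is the pattern reported in the text.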

Table 16.2 The Parameter Estimates for First-order Reflective Model Using the CB-SEM Procedure

Path                             Estimate*   AVE   CR    Cronbach's alpha   Convergent validity
Q56d <--- LEISURE                .351        .34   .59   .79                NO
Q56h <--- LEISURE                .586
Q56i <--- LEISURE                .743
Q5 <--- HOMEWORK_BALANCE         .922        .67   .80   .89                YES
Q51d <--- HOMEWORK_BALANCE       .700
Q51b <--- WORKHOME_BALANCE       .824        .76   .86   .93                YES
Q51a <--- WORKHOME_BALANCE       .922
Diagnosis <--- PHYSICAL_HEALTH   .371        .28   .42   .67                NO
Q37 <--- PHYSICAL_HEALTH         .652
Q47c <--- MENTAL_HEALTH          .658        .90   .83   .91                YES
Q47b <--- MENTAL_HEALTH          .850
Q47a <--- MENTAL_HEALTH          .850
Q19c <--- CONTROL                .827        .56   .89   .86                YES
Q19b <--- CONTROL                .719
Q19a <--- CONTROL                .710
Q15a <--- CONTROL                .734
Q15b <--- CONTROL                .797
Q15c <--- CONTROL                .735
Q30e <--- TRAINING               .545        .26   .59   .80                NO
Q30d <--- TRAINING               .543
Q30c <--- TRAINING               .580
Q30a <--- TRAINING               .381
Q42h <--- HARASSMENT             .538        .32   .66   .83                NO
Q42d <--- HARASSMENT             .621
Q42c <--- HARASSMENT             .493
Q42a <--- HARASSMENT             .624
Q27a <--- SUPPORT                .829        .69   .87   .93                YES
Q27e <--- SUPPORT                .851
Q27f <--- SUPPORT                .827
Q22c <--- RESPECT                .858        .72   .89   .94                YES
Q22d <--- RESPECT                .898
Q22e <--- RESPECT                .804
Q24e <--- TRUST                  .757        .67   .86   .92                YES
Q24f <--- TRUST                  .873
Q24g <--- TRUST                  .838

Note: *= all loadings are significant at P<0.05. CR=composite reliability (model-based reliability)

Table 16.3 Intercorrelation analysis and the square roots of AVE for all subfactors

(Columns 1-5: Personal Capacity subfactors; columns 6-11: Org. Capacity subfactors)

                       Discriminant Validity   1     2     3     4     5     6     7     8     9     10    11
PERSONAL CAPACITY      .45
1.LEISURE              YES                     .58
2.HOME-WORK BALANCE    YES                     .14   .82
3.WORK-HOME BALANCE    YES                     .25   .19   .87
4.PHYSICAL HEALTH      YES                     .29   .23   .39   .53
5.MENTAL HEALTH        YES                     .30   .24   .40   .48   .95
ORG. CAPACITY
6.CONTROL              YES                     .10   .08   .13   .16   .16   .75
7.TRAINING             YES                     .06   .04   .08   .09   .10   .16   .51
8.HARASSMENT           YES                     .10   .08   .13   .16   .16   .27   .16   .57
9.SUPPORT              YES                     .12   .09   .16   .19   .20   .33   .20   .33   .83
10.RESPECT             YES                     .18   .14   .24   .29   .30   .50   .30   .49   .61   .85
11.TRUST               NO                      .18   .14   .23   .28   .29   .48   .29   .47   .59   .87   .82

† The square roots of the average variance extracted (AVE) are presented in bold.

16.5.1 Measurement Model Evaluation Results for the PLS-SEM Misspecified Reflective Model:

To be able to compare the path coefficients of a correctly specified reflective-

formative model with a misspecified reflective model, another analysis was run using PLS-

SEM for a full reflective model (Figure 16.2). A similar procedure to the one described

above was used for model construction in PLS, but modified to demonstrate a reflective

model. The results are presented in the following section with Table 16.4 and Table 16.5

indicating significant paths in all cases.

Figure 16.2. The reflective model of WAS using PLS-SEM

Table 16.4 The Path Coefficients Results for Second-order Reflective Constructs of Reflective PLS-SEM Model (n=5000 bootstrap).

Standardised Path Coefficients (M)   T Value   Support*

LEISURE -> PERSONAL .55 18.15 YES

HOME-WORK -> PERSONAL .46 11.26 YES

WORK-HOME -> PERSONAL .65 29.32 YES

PHYSICAL HEALTH -> PERSONAL .53 17.58 YES

MENTAL HEALTH-> PERSONAL .80 60.53 YES

CONTROL -> ORGANISATIONAL .58 21.77 YES

TRAINING -> ORGANISATIONAL .39 13.24 YES

HARASSMENT -> ORGANISATIONAL .50 17.08 YES

SUPPORT -> ORGANISATIONAL .74 55.38 YES

RESPECT -> ORGANISATIONAL .87 125.56 YES

TRUST -> ORGANISATIONAL .84 99.62 YES

Note: * p<0.05

Table 16.5 The Path Coefficients Results for Higher-order Reflective Constructs of Reflective PLS-SEM Model (n=5000 bootstrap samples).


T Value Support*



Note: * p<0.05

As demonstrated in Figure 16.3 and Table 16.5, the WAS reflective-reflective

model presents significant regression paths for higher-order constructs of the reflective

WAS. As before, the path coefficients for organisational capacities (β=0.90) represent a

stronger path than those for personal capacities (β=0.71). However, both these paths are

much stronger than those obtained for the reflective-formative model, suggesting that the

risk of Type I errors has been increased as a result of the second-order model

misspecification. Also these paths are much stronger than those for the reflective-reflective

model fitted using CB-SEM, suggesting perhaps that models fitted using PLS-SEM are

more sensitive to model misspecification than models fitted using CB-SEM.

Figure 16.3. The reflective WAS development using PLS path modeling.

16.5.2 Measurement Model Evaluation Results for the PLS-SEM Misspecified Full Formative Model:

To be able to compare the path coefficients of a correctly specified reflective-

formative model with a misspecified formative-formative model, another analysis was

carried out using PLS-SEM for a full formative model (Figure 12.5). The same steps as for

reflective-formative model building were followed, with one main difference. Throughout the

model building process using PLS, all the indicators at first and second order and

repeated measures were regarded as formative. A snapshot of the process is presented in

Figure 16.4.

Step 1: Building the repeated measures of personal capacities construct

Step 2: Building the repeated measures of organisational capacities construct

Step 3: Building the repeated measures of WAS

Figure 16.4. The model building process for full formative model of WAS using PLS-SEM.

The path coefficients for the first-order constructs are presented in

Table 16.6. As shown in the table, all path coefficients are significant except for two

sub-constructs of personal capacities (leisure and home-work) and one sub-construct of

organisational capacities (support). However, the T-values are much smaller than was the

case for the correctly specified reflective-formative model, suggesting the occurrence of

Type II error as a result of the misspecification of the first-order model.

Table 16.6 The Path Coefficients Results for Second-order Reflective Constructs of Full Formative PLS-SEM Model (n=5000 bootstrap).


T Value Support

LEISURE -> PERSONAL -0.02 0.40 NO†

HOME-WORK -> PERSONAL -0.01 0.16 NO†

WORK-HOME -> PERSONAL 0.70 11.94 YES

PHYSICAL HEALTH -> PERSONAL 0.20 2.72 YES

MENTAL HEALTH-> PERSONAL 0.40 5.23 YES

CONTROL -> ORGANISATIONAL 0.24 3.43 YES

TRAINING -> ORGANISATIONAL -0.15 2.08 YES

HARASSMENT -> ORGANISATIONAL 0.36 4.97 YES

SUPPORT -> ORGANISATIONAL 0.04 0.58 NO†

RESPECT -> ORGANISATIONAL 0.33 2.87 YES

TRUST -> ORGANISATIONAL 0.36 3.12 YES

Note: † p>0.05, hence not significant.
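
The "Support" column in the table above follows from comparing each bootstrap T value with the two-sided 5 per cent critical value. A minimal Python sketch of that decision rule is given below. It assumes a standard normal reference distribution, which is a close approximation when the number of bootstrap samples is large; the function name is hypothetical and this is not the SmartPLS routine. The two T values are taken from Table 16.6.

```python
from statistics import NormalDist


def path_support(t_value, alpha=0.05):
    """Two-sided p-value (normal approximation) and the significance decision."""
    p_value = 2 * (1 - NormalDist().cdf(abs(t_value)))
    return p_value, p_value < alpha


# TRAINING -> ORGANISATIONAL (T = 2.08) and SUPPORT -> ORGANISATIONAL (T = 0.58).
training_p, training_supported = path_support(2.08)
support_p, support_supported = path_support(0.58)
```

A path that is truly non-zero but has a deflated T value below the critical value would be declared nonsignificant, which is the Type II error risk discussed for the misspecified model.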

Figure 16.5. The full formative model of WAS using PLS-SEM

As shown in Figure 16.5 and Table 16.7, the WAS full formative-formative model

demonstrates significant regression paths for the higher-order constructs. The path

coefficients for organisational capacities (β=0.68) and for personal capacities (β=0.51) are

similar to those for the reflective-formative model.

Table 16.7 The Path Coefficient Results for Higher-order Reflective Constructs (n=5000 bootstrap samples).


T Value Support



Note: p<0.05

16.6 DEFINITIONS OF IMPORTANT TERMS

Measure. A measure in this study is defined as “a quantified record, or datum, taken as an

empirical analogy to a construct” (Edwards and Bagozzi, 2000, p. 156). In this definition,

a measure is not a tool for data gathering; instead, it is considered to be an observed score

gathered through self-report, interview, observation or some other means (e.g. Messick,

1995).

Construct. A construct is a conceptual term used to describe a phenomenon of theoretical

interest (Cronbach & Meehl, 1955; Nunnally, 1978 and Schwab, 1980 as cited by Edwards

& Bagozzi, 2000). In this definition construct refers to a real phenomenon (observable or

unobservable) in an abstract sense. As acknowledged by Edwards and Bagozzi (2000),

these phenomena involve some degree of measurement error and must be viewed through

an imperfect epistemological lens.

Measurement error. Measurement error is defined “as that part of an observed variable

that is not 'determined by' a construct” (Lord and Novick, 1968, p. 531).

Reflective and formative models. The indicators of reflective and formative models are

also known as effect and cause indicators respectively (Blalock, 1971). A reflective model

considers effect indicators and a formative model considers cause indicators. Detailed

characteristics of both models will be discussed in chapter two.

Covariate-dependent reliability. Bentler (in press; personal communications, 2013)

defines (group) covariate-dependent reliability as “… a measure of the group differences

on the trait being measured relative to total variation, while covariate-free reliability is a

measure of the reliable individual difference variance freed from any mean differences due

to the covariate(s)”.

Higher-order model. In multidimensional measurement constructs, a minimum of two

levels of constructs exist: the first-order level with indicators and the second-order (higher-

order) level with first-order constructs (Jarvis et al., 2003). Such models are known as

hierarchical (higher-order or second-order in this example) models.

Bifactor model. Among the higher-order models, a bifactor model is defined as one where all

latent variables are modelled as first-order constructs, with the specific first-order factors nested

within the general factor (Gignac, 2007; Gustafsson & Balke, 1993; Holzinger &

Swineford, 1937).

Reliability rho/ρ₁₁. Within the setting of model-based reliability, Jöreskog (1971)

introduced the analysis of congeneric measures to calculate the reliability coefficient

ρ₁₁. This reliability coefficient is perhaps one of the earliest

proposals for assessing the reliability of a 1-factor model without assuming equal item reliabilities

(Gerbing & Anderson, 1988).
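As an illustration of how a reliability coefficient of this kind can be computed for a 1-factor congeneric model, the following minimal Python sketch uses the commonly cited composite form, (sum of loadings)² / [(sum of loadings)² + sum of error variances]. It assumes the factor variance is fixed to 1 and the errors are uncorrelated. The function name and the loading values are hypothetical and are not taken from the studies in this thesis.

```python
from typing import Sequence


def composite_reliability(loadings: Sequence[float],
                          error_variances: Sequence[float]) -> float:
    """Composite reliability of a unit-weighted sum score under a
    1-factor congeneric model (factor variance fixed to 1,
    uncorrelated errors: both are assumptions of this sketch)."""
    if len(loadings) != len(error_variances):
        raise ValueError("one error variance is required per loading")
    true_variance = sum(loadings) ** 2      # variance due to the factor
    error_variance = sum(error_variances)   # variance due to unique errors
    return true_variance / (true_variance + error_variance)


# Hypothetical standardized loadings; items are not assumed equally reliable.
loadings = [0.7, 0.8, 0.6]
error_variances = [1 - l ** 2 for l in loadings]
rho = composite_reliability(loadings, error_variances)
print(f"reliability rho = {rho:.3f}")
```

Because each item keeps its own loading and error variance, the coefficient does not require the equal item reliabilities assumed by simpler indices.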

Omega total reliability coefficient. Similar to reliability rho/ρ₁₁, omega total (ωₜ)

estimates the combined proportion of true score variance from the general factor and any

subscales (McDonald, 1978).

Omega hierarchical reliability coefficient. Omega hierarchical (ωₕ) estimates the degree

of proficiency of a test measure in assessing the reliability of a hierarchical model

(Revelle & Zinbarg, 2008; Zinbarg, Revelle, Yovel, & Li, 2005).

Omega subscales reliability coefficient. Omega subscale (ωₛ) determines the degree of

reliability of the subscale scores of a bifactor model after controlling for the reliable

variance generated from the general factor (Reise et al., 2012).
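The three omega coefficients defined above can be sketched from the standardized loadings of a bifactor solution. The Python example below is a minimal illustration, assuming orthogonal factors with unit variance, uncorrelated errors, and each item loading on the general factor and exactly one group factor. The function name, argument names and loading values are hypothetical. The subscale coefficient follows the form in which general-factor variance stays in the denominator but is excluded from the numerator.

```python
from typing import Dict, List, Sequence, Tuple


def omega_coefficients(
    general: Sequence[float],
    group: Sequence[float],
    errors: Sequence[float],
    subscales: Dict[str, List[int]],
) -> Tuple[float, float, Dict[str, float]]:
    """Return (omega total, omega hierarchical, omega subscale per factor).

    general[i] : loading of item i on the general factor
    group[i]   : loading of item i on its own group factor
    errors[i]  : unique (error) variance of item i
    subscales  : group-factor name -> indices of its items
    """
    general_var = sum(general) ** 2
    group_var = sum(sum(group[i] for i in idx) ** 2
                    for idx in subscales.values())
    total_var = general_var + group_var + sum(errors)

    omega_total = (general_var + group_var) / total_var
    omega_hier = general_var / total_var

    omega_sub = {}
    for name, idx in subscales.items():
        g = sum(general[i] for i in idx) ** 2   # general-factor part
        s = sum(group[i] for i in idx) ** 2     # group-factor part
        e = sum(errors[i] for i in idx)         # error part
        omega_sub[name] = s / (g + s + e)       # general factor controlled
    return omega_total, omega_hier, omega_sub


# Hypothetical four-item bifactor solution with two group factors.
general = [0.6, 0.6, 0.6, 0.6]
group = [0.5, 0.5, 0.4, 0.4]
errors = [1 - g ** 2 - s ** 2 for g, s in zip(general, group)]
subscales = {"A": [0, 1], "B": [2, 3]}

omega_t, omega_h, omega_s = omega_coefficients(general, group, errors, subscales)
print(omega_t, omega_h, omega_s)
```

Under these assumptions omega hierarchical can never exceed omega total, since its numerator drops the group-factor variance that omega total retains.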


Common method bias. Common method bias (CMB) is a type of error inherent in a

measure attributed to the particular method used for data collection (Bagozzi and Yi,

1990).

Common method variance. Common method variance (CMV) is a major type of

systematic measurement error (Bagozzi & Yi, 1990). It represents the variance in

measurement generated from the specific instrument used to collect the data (Spector,

1987).

Type I error. Type I error is a false positive error that occurs when a path is declared as

significant when it is not really significant (rejecting the null hypothesis when it is true)

(Jarvis et al., 2003; MacKenzie, Podsakoff, & Jarvis, 2005).

Type II error. Type II error is a false negative error that occurs when a path is declared as nonsignificant when

it is really significant (failure to reject the null hypothesis when it is not true) (Jarvis et al.,

2003; MacKenzie, Podsakoff, & Jarvis, 2005). MacKenzie et al. (2005) reported that the

primary cause of Type II errors is when both constructs (i.e. exogenous and endogenous)

are misspecified as reflective instead of formative, resulting in a higher standard error for

the parameter being reported.

Convergent validity of constructs. Convergent validity is evaluated by focusing on the

individual loadings of each indicator on its own construct

(MacKenzie, Podsakoff & Podsakoff, 2011). The correlations of variables with their related

factors indicate the convergent validity of the tool. If the t-ratios for the loadings are

significant and the factor loadings are above the recommended level (0.40), the convergent

validity of the scale is supported (Hair et al., 2014).
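The decision rule described above can be written as a short check. The Python sketch below is illustrative only. It assumes standardized loadings, computes each t-ratio as estimate divided by standard error, and uses 1.96 as the critical value (a two-sided 5% test), which is an assumption not stated in the definition. The class, function and item names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Loading:
    item: str          # hypothetical item label
    estimate: float    # standardized factor loading
    std_error: float   # standard error of the loading


def supports_convergent_validity(loadings: Iterable[Loading],
                                 min_loading: float = 0.40,
                                 critical_t: float = 1.96) -> bool:
    """True only if every loading is significant and above min_loading.

    critical_t = 1.96 is an assumed two-sided 5% cut-off.
    """
    for ld in loadings:
        t_ratio = ld.estimate / ld.std_error
        if abs(t_ratio) < critical_t:
            return False               # loading not significant
        if abs(ld.estimate) <= min_loading:
            return False               # loading not above the threshold
    return True


# Hypothetical loadings for one construct.
scale = [Loading("item1", 0.72, 0.08), Loading("item2", 0.55, 0.10)]
ok = supports_convergent_validity(scale)

weak = scale + [Loading("item3", 0.30, 0.05)]
not_ok = supports_convergent_validity(weak)
print(ok, not_ok)
```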

Analysis of covariance structures (COVS) in invariance testing. This type of invariance procedure

refers to comparing the covariance structure of the model parameters (e.g. factor loading,

measured-variable loading, variance/covariance of errors or factor residuals) across groups

using analysis of covariance structures (COVS) (Byrne, 2009).

The invariance analysis of mean and covariance structure (MACS). This procedure

refers to invariance testing of construct means. Once the invariance of the covariance

structure has been established, the invariance of construct means can be evaluated using

MACS (Cheung & Rensvold, 2002; Widaman and Reise, 1997). MACS was first

introduced by Sörbom (1974) for the cross validation of SEM models.


16.7 The WOAQ and its subfactor items

Item number/Factor

Quality of physical environment
1- Facilities for taking breaks
2- Work surroundings
4- Exposure to physical danger
9- Safety at work
18- The equipment/IT that you use
20- Work station/work space

Quality of relationship with colleagues
10- Your relationship with your co-workers (socially)
28- How well you work with your co-workers (as a team)

Quality of relationship with management
3- Clear roles and responsibilities
5- Support from line manager/supervisor
7- Feedback on your performance
11- Appreciation of efforts from line managers/supervisors
16- Senior management attitudes
17- Clear reporting line(s)
22- Communication with line manager/supervisor
26- Status/recognition in the workplace
27- Clear workplace objectives, values, procedures

Reward and recognition
12- Consultation about changes in your job
13- Adequate training for your current job
14- Variety of different tasks
21- Opportunities for promotion
23- Opportunities for learning new skills
24- Flexibility of working hours
25- Opportunities to use your skills

Workload issues
6- Pace of work
8- Your work load
15- Impact of family/social life on work
19- Impact of your work on family/social life


16.8 The R-WAS questionnaire

The complete R-WAS questionnaire (copied with permission from Prof Philip

Taylor, the Redesigning Work for an Ageing Society research program conducted by the

Business, Work & Ageing Centre for Research (BWA) at Swinburne University of

Technology) (2009)


16.9 List of items used in construction of WAS

The list of items of WAS copied with permission from the Redesigning Work for

an Ageing Society research program conducted by the Business, Work and Ageing Centre

for Research (BWA) at Swinburne University of Technology (2009)


16.10 Ethics clearance

a) Letter of approval

To: Dr Denny Meyer, FLSS/Ms Leila Karimi

[BC: Ms Leila Karimi]

Dear Dr Meyer,

SUHREC Project 2011/175 The effects of common method variance on structural equation

modeling

Dr Denny Meyer, FLSS/Ms Leila Karimi

Approved Duration: 22/09/2011 To 28/02/2014

I refer to the ethical review of the above project protocol undertaken on behalf of

Swinburne's Human Research Ethics Committee (SUHREC) by SUHREC Subcommittee

(SHESC2) at a meeting held on 5 September 2011. Your response to the review as e-

mailed on 16 September 2011 was reviewed by a SHESC2 delegate.

I am pleased to advise that, as submitted to date, the project has approval to

proceed in line with standard on-going ethics clearance conditions here outlined.

- All human research activity undertaken under Swinburne auspices must conform to

Swinburne and external regulatory standards, including the National Statement on Ethical

Conduct in Human Research and with respect to secure data use, retention and disposal.

- The named Swinburne Chief Investigator/Supervisor remains responsible for any

personnel appointed to or associated with the project being made aware of ethics clearance

conditions, including research and consent procedures or instruments approved. Any

change in chief investigator/supervisor requires timely notification and SUHREC

endorsement.


- The above project has been approved as submitted for ethical review by or on behalf of

SUHREC. Amendments to approved procedures or instruments ordinarily require prior

ethical appraisal/ clearance. SUHREC must be notified immediately or as soon as possible

thereafter of (a) any serious or unexpected adverse effects on participants and any redress

measures; (b) proposed changes in protocols; and (c) unforeseen events which might affect

continued ethical acceptability of the project.

- At a minimum, an annual report on the progress of the project is required as well as at the

conclusion (or abandonment) of the project.

- A duly authorised external or internal audit of the project may be undertaken at any time.

Please contact me if you have any queries about on-going ethics clearance. The SUHREC

project number should be quoted in communication. Chief Investigators/Supervisors and

Student Researchers should retain a copy of this e-mail as part of project record-keeping.

Best wishes for the project.

Yours sincerely

XXXX

Secretary, SHESC2

*******************************************


XXXX

Administrative Officer (Research Ethics)

Swinburne Research (H68)

Swinburne University of Technology

P O Box 218

HAWTHORN VIC 3122

Tel +61 3 9214 8468


MEMORANDUM

RESEARCH SERVICES

To: Dr Leila Karimi, School of Public Health, Faculty of Health

Sciences

From: Secretary, La Trobe University Human Ethics Committee

Subject: Review of Human Ethics Committee Application No. 11-054

Title: The effects of common method variance on structural equation

modeling

Thank you for your recent correspondence in relation to the research project

referred to above. The project has been assessed as complying with the National

Statement on Ethical Conduct in Human Research. I am pleased to advise that your

project has been granted ethics approval and you may commence the study.

The project has been approved from the date of this letter until 31 December

2012.


Please note that your application has been reviewed by a sub-committee of the

University Human Ethics Committee (UHEC) to facilitate a decision about the study

before the next Committee meeting. This decision will require ratification by the full

UHEC at its next meeting and the UHEC reserves the right to alter conditions of approval

or withdraw approval. You will be notified if the approval status of your project changes.

The UHEC is a fully constituted Ethics Committee in accordance with the National

Statement on Ethical Conduct in Research Involving Humans- March 2007 under Section

5.1.29.

The following standard conditions apply to your project:

• Limit of Approval. Approval is limited strictly to the research proposal as

submitted in your application while taking into account any additional conditions advised

by the UHEC.

• Variation to Project. Any subsequent variations or modifications you wish

to make to your project must be formally notified to the UHEC for approval in advance

of these modifications being introduced into the project. This can be done using the

appropriate form: Ethics - Application for Modification to Project which is available on

the Research Services website at http://www.latrobe.edu.au/research-services/ethics/HEC_human.htm.

If the UHEC considers that the proposed changes are

significant, you may be required to submit a new application form for approval of the

revised project.

• Adverse Events. If any unforeseen or adverse events occur, including

adverse effects on participants, during the course of the project which may affect the

ethical acceptability of the project, the Chief Investigator must immediately notify the

UHEC Secretary on telephone (03) 9479 1443. Any complaints about the project

received by the researchers must also be referred immediately to the UHEC Secretary.



• Withdrawal of Project. If you decide to discontinue your research before its

planned completion, you must advise the UHEC and clarify the circumstances.

• Annual Progress Reports. If your project continues for more than 12

months, you are required to submit an Ethics - Progress/Final Report Form annually, on

or just prior to

12 February. The form is available on the Research Services website (see above

address). Failure to submit a Progress Report will mean approval for this project will

lapse. An audit may be conducted by the UHEC at any time.

• Final Report. A Final Report (see above address) is required within six months of the

completion of the project or by 30 June 2013.

If you have any queries on the information above or require further clarification

please contact me through Research Services on telephone (03) 9479-1443, or e-mail at:

[email protected].

On behalf of the University Human Ethics Committee, best wishes with your

research!

XXXX

Administrative Officer (Research Ethics) University Human Ethics Committee

Research Compliance Unit / Research Services

La Trobe University Bundoora, Victoria 3086

P: (03) 9479 – 1443 / F: (03) 9479 - 1464 http://www.latrobe.edu.au/research-services/ethics/HEC_human.htm




16.11 A list of articles included in the review

No Authors Title Journal / year/ issue / page

1 John E. Mathieu and Lucy L. Gilson, Thomas M. Ruddy

Empowerment and Team Effectiveness: An Empirical Test of an Integrated Model Journal of Applied Psychology, 2006, Vol. 91, No. 1, 97–108

2 Yaping Gong and Jinyan Fan Longitudinal Examination of the Role of Goal Orientation in Cross-Cultural Adjustment Journal of Applied Psychology, 2006, Vol. 91, No. 1, 176–184

3 Christopher C. Rosen, Paul E. Levy, and Rosalie J. Hall

Placing Perceptions of Politics in the Context of the Feedback Environment, Employee Attitudes, and Job Performance

Journal of Applied Psychology, 2006, Vol. 91, No. 1, 211–220

4 Bradley J. Alge, Gary A. Ballinger, Subrahmaniam Tangirala, and James L. Oakley

Information Privacy in Organizations: Empowering Creative and Extrarole Performance Journal of Applied Psychology, 2006, Vol. 91, No. 1, 221–232

5 Sabine Sonnentag, Fred R. H. Zijlstra Job Characteristics and Off-Job Activities as Predictors of Need for Recovery, Well-Being, and Fatigue Journal of Applied Psychology, 2006, Vol. 91, No. 2, 330–350

6 Kimberly A. Eddleston, John F. Veiga and Gary N. Powell

Explaining Sex Differences in Managerial Career Satisfier Preferences: The Role of Gender Self-Schema Journal of Applied Psychology, 2006, Vol. 91, No. 2, 437–445

7 Sharon K. Parker, Helen M. Williams Modeling the Antecedents of Proactive Behavior at Work Journal of Applied Psychology, 2006, Vol. 91, No. 3, 636–652

8 J. Craig Wallace, Eric Popp and Scott Mondore

Safety Climate as a Mediator Between Foundation Climates and Occupational Accidents: A Group-Level Investigation Journal of Applied Psychology, 2006, Vol. 91, No. 3, 681–688

9 Douglas J. Brown, Richard T. Cober, Kevin Kane, Paul E. Levy and Jarrett Shalhoop

Proactive Personality and the Successful Job Search: A Field Investigation With College Graduates Journal of Applied Psychology, 2006, Vol. 91, No. 3, 717–726

10 Dishan Kamdar, Daniel J. McAllister and Daniel B. Turban

“All in a Day’s Work”: How Follower Individual Differences and Justice Perceptions predict OCB Role Definitions and Behavior

Journal of Applied Psychology, 2006, Vol. 91, No. 4, 841–855

11 Christine L. Jackson, Jason A. Colquitt, Michael J. Wesson and Cindy P. Zapata-Phelan

Psychological Collectivism: A Measurement Validation and Linkage to Group Member Performance Journal of Applied Psychology, 2006, Vol. 91, No. 4, 884–899

13 Debra A. Major, Jonathan E. Turner, and Thomas D. Fletcher

Linking Proactive Personality and the Big Five to Motivation to Learn and Development Activity Journal of Applied Psychology, 2006, Vol. 91, No. 4, 927–935


14 Vivien K. G. Lim and Qing Si Sng Does Parental Job Insecurity Matter? Money Anxiety, Money Motives, and Work Motivation Journal of Applied Psychology, 2006, Vol. 91, No. 5, 1078–1087

15 Alannah E. Rafferty and Mark A. Griffin Perceptions of Organizational Change: A Stress and Coping Perspective Journal of Applied Psychology, 2006, Vol. 91, No. 5, 1154–1162

16 Frederick P. Morgeson, Stephen E. Humphrey The Work Design Questionnaire (WDQ): Developing and Validating a Comprehensive Measure for Assessing Job Design and the Nature of Work


17 Laura M. Graves, Patricia J. Ohlott and Marian N. Ruderman

Commitment to Family Roles: Effects on Managers’ Attitudes and Performance Journal of Applied Psychology, 2007, Vol. 92, No. 1, 44–56

18 Jonathon R. B. Halbesleben, Wm. Matthew Bowler

Emotional Exhaustion and Job Performance: The Mediating Role of Motivation Journal of Applied Psychology, 2007, Vol. 92, No. 1, 93–106

19 Samuel Aryee, Zhen Xiong Chen, Li-Yun Sun, and Yaw A. Debrah

Antecedents and Outcomes of Abusive Supervision: Test of a Trickle-Down Model Journal of Applied Psychology, 2007, Vol. 92, No. 1, 191–201

21 Gilad Chen, Bradley L. Kirkman, Ruth Kanfer, Don Allen, Benson Rosen

A Multilevel Study of Leadership, Empowerment, and Performance in Teams Journal of Applied Psychology, 2007, Vol. 92, No. 2, 331–346

22 Mo Wang Profiling Retirees in the Retirement Transition and Adjustment Process: Examining the Longitudinal Change Patterns of Retirees’ Psychological Well-Being


23 Hui Liao Do It Right This Time: The Role of Employee Service Recovery Performance in Customer-Perceived Justice and Customer Loyalty After Service Failures


24 Adam B. Butler Job Characteristics and College Performance and Attitudes: A Model of Work–School Conflict and Facilitation Journal of Applied Psychology, 2007, Vol. 92, No. 2, 500–510

25 Jo Silvester, Fiona Patterson, Anna Koczwara, Eamonn Ferguson

“Trust Me. . .”: Psychological and Behavioral Predictors of Perceived Physician Empathy Journal of Applied Psychology, 2007, Vol. 92, No. 2, 519–527

26 Richard D. Arvey, Zhen Zhang, Bruce J. Avolio, Robert F. Krueger

Developmental and Genetic Determinants of Leadership Role Occupancy Among Women Journal of Applied Psychology, 2007, Vol. 92, No. 3, 693–706

27 Seokhwa Yun, Riki Takeuchi, Wei Liu Employee Self-Enhancement Motives and Job Performance Behaviors: Investigating the Moderating Effects of Employee Role Ambiguity and


28 Hilary J. Gettman and Michele J. Gelfand When the Customer Shouldn’t Be King: Antecedents and Consequences of Sexual Harassment by Clients and Customers Journal of Applied Psychology, 2007, Vol. 92, No. 3, 757–770

29 James M. Diefendorff, Kajal Mehta The Relations of Motivational Traits With Workplace Deviance Journal of Applied Psychology, 2007, Vol. 92, No. 4, 967–977


30 Craig D. Crossley, Rebecca J. Bennett, Steve M. Jex and Jennifer L. Burnfield,

Development of a Global Measure of Job Embeddedness and Integration Into a Traditional Model of Voluntary Turnover Journal of Applied Psychology, 2007, Vol. 92, No. 4, 1031–1042

31 Michael Frese, Harry Garst, Doris Fay Making Things Happen: Reciprocal Relationships Between Work Characteristics and Personal Initiative in a Four-Wave Longitudinal Structural Equation Model


32 Marie S. Mitchell and Maureen L. Ambrose Abusive Supervision and Workplace Deviance and the Moderating Effects of Negative Reciprocity Beliefs Journal of Applied Psychology, 2007, Vol. 92, No. 4, 1159–1168

33 Christian Vandenberghe, Kathleen Bentein, Richard Michon, Jean-Charles Chebat, Michel Tremblay, Jean-François Fils

An Examination of the Role of Perceived Support and Employee Commitment in Employee–Customer Encounters Journal of Applied Psychology, 2007, Vol. 92, No. 4, 1177–1187

34 Daniel J. McAllister, Dishan Kamdar, Elizabeth Wolfe Morrison, Daniel B. Turban

Disentangling Role Perceptions: How Perceived Role Breadth, Discretion, Instrumentality, and Efficacy Relate to Helping and Taking Charge


35 Kathi Miner-Rubino, Lilia M. Cortina Beyond Targets: Consequences of Vicarious Exposure to Misogyny at Work Journal of Applied Psychology, 2007, Vol. 92, No. 5, 1254–1269

36 Dishan Kamdar, Linn Van Dyne The Joint Effects of Personality and Workplace Social Exchange Relationships in Predicting Task Performance and Citizenship Performance


37 Mo Wang, Riki Takeuchi The Role of Goal Orientation During Expatriation: A Cross-Sectional and Longitudinal Investigation Journal of Applied Psychology, 2007, Vol. 92, No. 5, 1437–1445

38 Christine A. Sprigg, Christopher B. Stride, Toby D. Wall, and David J. Holman, Phoebe R. Smith

Work Characteristics, Musculoskeletal Disorders, and the Mediating Role of Psychological Strain: A Study of Call Center Employees Journal of Applied Psychology, 2007, Vol. 92, No. 5, 1456–1466

40 Michael Frese, Stefanie I. Krauss, Nina Keith, Susanne Escher, Rafal Grabarkiewicz, Siv Tonje Luneng, Constanze Heers, Jens Unger, and Christian Friedrich

Business Owners’ Action Planning and Its Relationship to Business Success in Three African Countries Journal of Applied Psychology, 2007, Vol. 92, No. 6, 1481–1498

42 Wei-Chi Tsai, Chien-Cheng Chen, Hui-Lu Liu

Test of a Model Linking Employee Positive Moods and Task Performance Journal of Applied Psychology, 2007, Vol. 92, No. 6, 1570–1583

43 Brent A. Scott, Jason A. Colquitt and Cindy P. Zapata-Phelan

Justice as a Dependent Variable: Subordinate Charisma as a Predictor of Interpersonal and Informational Justice Perceptions Journal of Applied Psychology, 2007, Vol. 92, No. 6, 1597–1609

45 Julia Levashina, Michael A. Campion Measuring Faking in the Employment Interview: Development and Validation of an Interview Faking Behavior Scale Journal of Applied Psychology, 2007, Vol. 92, No. 6, 1638–1656


46 David G. Allen, Raj V. Mahto, Robert F. Otondo

Web-Based Recruitment: Effects of Information, Organizational Brand, and Attitudes Toward a Web Site on Applicant Attraction

Journal of Applied Psychology, 2007, Vol. 92, No. 6, 1696–1708

48 Zhi-Xue Zhang, Paul S. Hempel, Yu-Lan Han, Dean Tjosvold

Transactive Memory System Links Work Team Characteristics and Performance Journal of Applied Psychology, 2007, Vol. 92, No. 6, 1722–1730

50 Henry Moon, Dishan Kamdar, David M. Mayer, Riki Takeuchi

Me or We? The Role of Personality and Justice as Other-Centered Antecedents to Innovative Citizenship Behaviors Within Organizations


51 Sandy Lim, Lilia M. Cortina, Vicki J. Magley Personal and Workgroup Incivility: Impact on Work and Health Outcomes Journal of Applied Psychology, 2008, Vol. 93, No. 1, 95–107

52 Bradford S. Bell, Steve W. J. Kozlowski, Active Learning: Effects of Core Training Design Elements on Self-Regulatory Processes, Learning, and Adaptability Journal of Applied Psychology, 2008, Vol. 93, No. 2, 296–316

53 Lillian T. Eby, Jaime R. Durley, and Sarah C. Evans, Belle Rose Ragins

Mentors’ Perceptions of Negative Mentoring Experiences: Scale Development and Nomological Validation Journal of Applied Psychology, 2008, Vol. 93, No. 2, 358–373

54 James R. Detert, Linda Klebe Treviño, Vicki L. Sweitzer

Moral Disengagement in Ethical Decision Making: A Study of Antecedents and Outcomes Journal of Applied Psychology, 2008, Vol. 93, No. 2, 374–391

56 Severin Hornung, Denise M. Rousseau, Jürgen Glaser

Creating Flexible Work Arrangements Through Idiosyncratic Deals Journal of Applied Psychology, 2008, Vol. 93, No. 3, 655–664

57 Dov Zohar and Orly Tenne-Gazit Transformational Leadership and Group Interaction as Climate Antecedents: A Social Network Analysis Journal of Applied Psychology, 2008, Vol. 93, No. 4, 744–757

58 Mahesh Subramony, Nicole Krause, Jacqueline Norton, and Gary N. Burns

The Relationship Between Human Resource Investments and Organizational Performance: A Firm-Level Examination of Equilibrium Theory

Journal of Applied Psychology, 2008, Vol. 93, No. 4, 778–788

60 Arnold B. Bakker, Evangelia Demerouti, Maureen F. Dollard

How Job Demands Affect Partners’ Experience of Exhaustion: Integrating Work–Family Conflict and Crossover Theory Journal of Applied Psychology, 2008, Vol. 93, No. 4, 901–911

61 Shaul Oreg, Mahmut Bayazıt, Maria Armenakis, Rasa Barkauskiene, Nikos Bozionelos, Yuka Fujimoto, Luis Gonzalez, Jian Han, Martina Hřebíčková, Nerina Jimmieson, Jana Kordacova, Hitoshi Mitsuhashi, Boris Mlačić, Ivana Ferić, Marina Kotrla Topic, Sandra Ohly, Per Øystein Saksvik, Hilde Hetland and Ingvild Saksvik and Karen van Dam

Dispositional Resistance to Change: Measurement Equivalence and the Link to Personal Values Across 17 Nations Journal of Applied Psychology, 2008, Vol. 93, No. 4, 935–944


63 Greg L. Stewart and Susan L. Dustin, Murray R. Barrick, Todd C. Darnold Exploring the Handshake in Employment Interviews Journal of Applied Psychology, 2008, Vol. 93, No. 5, 1139–1146

64 David J. Henderson and Sandy J. Wayne, Lynn M. Shore, William H. Bommer, Lois E. Tetrick

Leader–Member Exchange, Differentiation, and Psychological Contract Fulfillment: A Multilevel Examination Journal of Applied Psychology, 2008, Vol. 93, No. 6, 1208–1219

65 Fiona A. White, Margaret A. Charles, and Jacqueline K. Nelson

The Role of Persuasive Arguments in Changing Affirmative Action Attitudes and Expressed Behavior in Higher Education Journal of Applied Psychology, 2008, Vol. 93, No. 6, 1271–1286

66 Daniel P. Skarlicki, Danielle D. van Jaarsveld, and David D. Walker

Getting Even for Customer Mistreatment: The Role of Moral Identity in the Relationship Between Customer Interpersonal Injustice and Employee Sabotage


67 Samantha D. Montes, P. Gregory Irving Disentangling the Effects of Promised and Delivered Inducements: Relational and Transactional Contract Elements and the Mediating Role of Trust


68 Brent A. Scott, Timothy A. Judge The Popularity Contest at Work: Who Wins, Why, and What Do They Receive? Journal of Applied Psychology, 2009, Vol. 94, No. 1, 20–33

69 Eric Kearney, Diether Gebert Managing Diversity and Enhancing Team Outcomes: The Promise of Transformational Leadership Journal of Applied Psychology, 2009, Vol. 94, No. 1, 77–8

70 Hans-Georg Wolff and Klaus Moser Effects of Networking on Career Success: A Longitudinal Study Journal of Applied Psychology, 2009, Vol. 94, No. 1, 196–206

71 Susan M. Stewart, Mark N. Bing and H. Kristl Davison, David J. Woehr and Michael D. McIntyre

In the Eyes of the Beholder: A Non-Self-Report Measure of Workplace Deviance Journal of Applied Psychology, 2009, Vol. 94, No. 1, 207–215

72 Jin Nam Choi, Jae Yoon Chang Innovation Implementation in the Public Sector: An Integration of Institutional and Collective Dynamics Journal of Applied Psychology, 2009, Vol. 94, No. 1, 245–253

73 Yaping Gong, Kenneth S. Law and Song Chang, Katherine R. Xin

Human Resources Management and Firm Performance: The Differential Role of Managerial Affective and Continuance Commitment


74 Peter W. Hom and Anne S. Tsui, Joshua B. Wu, Thomas W. Lee, Ann Yan Zhang, Ping Ping Fu, Lan Li

Explaining Employment Relationships With Social Exchange and Job Embeddedness Journal of Applied Psychology, 2009, Vol. 94, No. 2, 277–297

75 Greet Van Hoye and Filip Lievens Tapping the Grapevine: A Closer Look at Word-of-Mouth as a Recruitment Source Journal of Applied Psychology, 2009, Vol. 94, No. 2, 341–352

76 Tove Helland Hammer, Mahmut Bayazit, David L. Wazeter

Union Leadership and Member Attitudes: A Multi-Level Analysis Journal of Applied Psychology, 2009, Vol. 94, No. 2, 392–410


77 Steven L. Blader and Tom R. Tyler Testing and Extending the Group Engagement Model: Linkages Between Social Identity, Procedural Justice, Economic Outcomes, and Extrarole Behavior


78 Maureen L. Ambrose and Marshall Schminke The Role of Overall Justice Judgments in Organizational Justice Research: A Test of Mediation Journal of Applied Psychology, 2009, Vol. 94, No. 2, 491–500

79 Lei Lai, Denise M. Rousseau, Klarissa Ting Ting Chang Idiosyncratic Deals: Coworkers as Interested Third Parties Journal of Applied Psychology, 2009, Vol. 94, No. 2, 547–556

80 Gregory M. Hurtz, Kevin J. Williams Attitudinal and Motivational Antecedents of Participation in Voluntary Employee Development Activities Journal of Applied Psychology, 2009, Vol. 94, No. 3, 635–653

81 Chad H. Van Iddekinge, Gerald R. Ferris, Alexa A. Perryman, Fred R. Blass, Thomas D. Heetderks

Effects of Selection and Training on Unit-Level Performance Over Time: A Latent Growth Modeling Approach Journal of Applied Psychology, 2009, Vol. 94, No. 4, 829–843

82 D. Scott DeRue and Ned Wellman Developing Leaders via Experience: The Role of Developmental Challenge, Learning Orientation, and Feedback Availability


83 Remus Ilies, Ingrid Smithey Fulmer, Matthias Spitzmuller, Michael D. Johnson

Personality and Citizenship Behavior: The Mediating Role of Job Satisfaction Journal of Applied Psychology, 2009, Vol. 94, No. 4, 945–959

84 Karin A. Orvis, Sandra L. Fisher and Michael E. Wasserman

Power to the People: Using Learner Control to Improve Trainee Reactions and Learning in Web-Based Instructional Environments


85 Jessica B. Rodell and Jason A. Colquitt, Looking Ahead in Times of Uncertainty: The Role of Anticipatory Justice in an Organizational Change Context Journal of Applied Psychology, 2009, Vol. 94, No. 4, 989–1002

86 Brian C. Holtz, Crystal M. Harold Fair Today, Fair Tomorrow? A Longitudinal Investigation of Overall Justice Perceptions Journal of Applied Psychology, 2009, Vol. 94, No. 5, 1185–1199

87 Jerry W. Grizzle, Alex R. Zablah, Tom J. Brown, and John C. Mowen, James M. Lee

Employee Customer Orientation in Context: How the Environment Moderates the Influence of Customer Orientation on Performance Outcomes

88 David R. Hekman, H. Kevin Steensma, Gregory A. Bigley, and James F. Hereford

Effects of Organizational and Professional Identification on the Relationship Between Administrators’ Social Influence and Professional Employees’ Adoption of New Work Behavior

Journal of Applied Psychology, 2009, Vol. 94, No. 5, 1325–1335

90 Martha C. Andrews, K. Michele Kacmar, Kenneth J. Harris

Got Political Skill? The Impact of Justice on the Importance of Political Skill for Job Performance Journal of Applied Psychology, 2009, Vol. 94, No. 6, 1427–1437

91 Abraham Carmeli and Batia Ben-Hador, David A. Waldman, Deborah E. Rupp

How Leaders Cultivate Social Capital and Nurture Employee Vigor: Implications for Job Performance Journal of Applied Psychology, 2009, Vol. 94, No. 6, 1553–1561


92 Timothy A. Judge, Remus Ilies and Nikolaos Dimotakis

Are Health and Happiness the Product of Wisdom? The Relationship of General Mental Ability to Educational and Occupational Attainment, Health, and Well-Being


93 Leigh Anne Liu, Chei Hwee Chua, Günter K. Stahl

Quality of Communication Experience: Definition, Measurement, and Implications for Intercultural Negotiations Journal of Applied Psychology, 2010, Vol. 95, No. 3, 469–487

94 Richard G. Netemeyer and James G. Maxham III, Donald R. Lichtenstein

Store Manager Performance and Satisfaction: Effects on Store Employee Performance and Satisfaction, Store Customer Satisfaction, and Store Customer Spending Growth


95 Tracy D. Hecht, Julie M. McCarthy Coping With Employee, Family, and Student Roles: Journal of Applied Psychology, 2010, Vol. 95, No. 4, 631–647

96 Thomas W. H. Ng, Daniel C. Feldman, Simon S. K. Lam

Psychological Contract Breaches, Organizational Commitment, and Innovation-Related Behaviors: A Latent Growth Modeling Approach


97 Elizabeth E. Umphress, John B. Bingham, Marie S. Mitchell

Unethical Behavior in the Name of the Company: The Moderating Effect Of Organizational Identification and Positive Reciprocity Beliefs on Unethical Pro-Organizational Behavior


98 Robert Eisenberger, Gokhan Karagonlar, Florence Stinglhamber, Pedro Neves, Thomas E. Becker, M. Gloria Gonzalez-Morales, Meta Steiger-Mueller

Leader–Member Exchange and Affective Organizational Commitment: The Contribution of Supervisor’s Organizational Embodiment


99 Xiao-Hua (Frank) Wang, Jane M. Howell Exploring the Dual-Level Effects of Transformational Leadership On Followers Journal of Applied Psychology, 2010, Vol. 95, No. 6, 1134–1144

100 Murray R. Barrick and Brian W. Swider, Greg L. Stewart

Initial Evaluations in the Interview: Relationships with Subsequent Interviewer Evaluations and Employment Offers Journal of Applied Psychology, 2010, Vol. 95, No. 6, 1163–1172

101 John P. Trougakos, Christine L. Jackson, Daniel J. Beal

Service Without a Smile: Comparing the Consequences of Neutral and Positive Display Rules Journal of Applied Psychology, 2010

102 Myriam N. Bechtoldt, Sonja Rohrmann, Irene E. De Pater and Bianca Beersma

The primacy of perceiving: Emotion regulation buffers negative effects of emotional labor Journal of Applied Psychology, 2011, Vol. 96, No. 5, 1087-1094

103 Pamela Tierney and Steven M. Farmer Creative self-efficacy development and creative performance over time Journal of Applied Psychology, 2011, Vol. 96, No. 2, 277-293

104 Bradley L. Kirkman, John E. Mathieu, John L. Cordery, Benson Rosen and Michael Kukenberger

Managing a new collaborative entity in business organizations: Understanding organizational communities of practice effectiveness

Journal of Applied Psychology, 2011, Vol. 96, No. 6, 1234-1245

105 Tal Yaffe and Ronit Kark Leading by example: The case of leader OCB Journal of Applied Psychology, 2011, Vol. 96, No. 4, 806-826

106 Chad H. Van Iddekinge, Dan J. Putka and John P. Campbell

Reconsidering vocational interests for personnel selection: The validity of an interest-based selection test in relation to job knowledge, job performance, and continuance intentions


107 J. Craig Wallace, Paul D. Johnson, Kimberly Mathe and Jeff Paul

Structural and psychological empowerment climates, performance, and the moderating role of shared felt accountability: A managerial perspective


108 Scott E. Seibert, Gang Wang and Stephen H. Courtright

Antecedents and consequences of psychological and team empowerment in organizations: A meta-analytic review Journal of Applied Psychology, 2011, Vol. 96, No. 5, 981-1003

109 Debra L. Shapiro, Alan D. Boss, Silvia Salas, Subrahmaniam Tangirala and Mary Ann Von Glinow

When are transgressing leaders punitively judged? An empirical test Journal of Applied Psychology, 2011, Vol. 96, No. 2, 412-422

110 Jason D. Shaw, Jing Zhu, Michelle K. Duffy, Kristin L. Scott, Hsi-An Shih and Ely Susanto A contingency model of conflict and team effectiveness Journal of Applied Psychology, 2011, Vol. 96, No. 2, 391-400

111 Stefan Diestel and Klaus-Helmut Schmidt Costs of simultaneous coping with emotional dissonance and self-control demands at work: Results from two German samples

Journal of Applied Psychology, 2011, Vol. 96, No. 3, 643-653

112 Jia Hu and Robert C. Liden Antecedents of team potency and team effectiveness: An examination of goal and process clarity and servant leadership Journal of Applied Psychology, 2011, Vol. 96, No. 4, 851-862

113 Ronald Bledow, Antje Schmitt, Michael Frese and Jana Kuhnel The affective shift model of work engagement Journal of Applied Psychology, 2011, Vol. 96, No. 6, 1246-1257

114 Gilad Chen, Payal Nangia Sharma, Suzanne K. Edinger, Debra L. Shapiro and Jiing-Lih Farh

Motivating and demotivating forces in teams: Cross-level influences of empowering leadership and relationship conflict Journal of Applied Psychology, 2011, Vol. 96, No. 3, 541-557

115 Spencer H. Harrison, David M. Sluss, Blake E. Ashforth

Curiosity adapted the cat: The role of trait curiosity in newcomer adaptation Journal of Applied Psychology, 2011, Vol. 96, No. 1, 211-220

116 John Schaubroeck, Simon S. K. Lam and Ann Chunyan Peng

Cognition-based and affect-based trust as mediators of leader behaviour influences on team performance Journal of Applied Psychology, 2011, Vol. 96, No. 4, 863-871

117 John P. Hausknecht, Michael C. Sturman and Quinetta M. Roberson

Justice as a dynamic construct: Effects of individual trajectories on distal work outcomes Journal of Applied Psychology, 2011, Vol. 96, No. 4, 872-880

118 Sven Gross, Norbert K. Semmer, Laurenz L. Meier, Wolfgang Kalin, Nicola Jacobshagen and Franziska Tschan

The effect of positive events at work on after-work fatigue: They matter most in face of adversity Journal of Applied Psychology, 2011, Vol. 96, No. 3, 654-664

119 Filip Lievens and Fiona Patterson The validity and incremental validity of knowledge tests, low-fidelity simulations, and high-fidelity simulations for predicting job performance in advanced-level high-stakes selection


120 Maria L. Kraimer, Scott E. Seibert, Sandy J. Wayne, Robert C. Liden and Jesus Bravo

Antecedents and outcomes of organizational support for development: The critical role of career opportunities Journal of Applied Psychology, 2011, Vol. 96, No. 3, 485-500

121 Jessica Lang, Paul D. Bliese, Jonas W. B. Lang and Amy B. Adler

Work gets unfair for the depressed: Cross-lagged relations between organizational justice perceptions and depressive symptoms


122 Huy Le, In-Sue Oh, Steven B. Robbins, Remus Ilies, Ed Holland and Paul Westrick

Too much of a good thing: Curvilinear relationships between personality traits and job performance Journal of Applied Psychology, 2011, Vol. 96, No. 1, 113-133

123 Ning Li, T. Brad Harris, Wendy R. Boswell and Zhitao Xie

The role of organizational insiders’ developmental feedback and proactive personality on newcomers’ performance: An interactionist perspective


124 Dong Liu, Xiao-Ping Chen and Xin Yao From autonomy to creativity: A multilevel investigation of the mediating role of harmonious passion Journal of Applied Psychology, 2011, Vol. 96, No. 2, 294-309

125 Dong Liu and Ping-ping Fu Motivating proteges’ personal learning in teams: A multilevel investigation of autonomy support and autonomy orientation Journal of Applied Psychology, 2011, Vol. 96, No. 6, 1195-1208

126 Christopher D. Nye and Fritz Drasgow Effect size indices for analyses of measurement equivalence: Understanding the practical importance of differences between groups


127 Dong Liu, Shu Zhang, Lei Wang and Thomas W. Lee

The effects of autonomy and empowerment on employee turnover: Test of a multilevel model in teams Journal of Applied Psychology, 2011, Vol. 96, No. 6, 1305-1316

128 Nora Madjar, Ellen Greenberg and Zheng Chen

Factors for radical creativity, incremental creativity, and routine, noncreative performance Journal of Applied Psychology, 2011, Vol. 96, No. 4, 730-743

129 Jake G. Messersmith, Pankaj C. Patel, David P. Lepak and Julian Gould-Williams

Unlocking the black box: Exploring the link between high-performance work systems and performance Journal of Applied Psychology, 2011, Vol. 96, No. 6, 1105-1118

130 Elizabeth Wolfe Morrison, Sara L. Wheeler-Smith and Dishan Kamdar

Speaking up in groups: A cross-level study of group voice climate and voice Journal of Applied Psychology, 2011, Vol. 96, No. 1, 183-191

131 Kok-Yee Ng, Christine Koh, Soon Ang, Jeffrey C. Kennedy, and Kim-Yin Chan

Rating leniency and halo in multisource feedback ratings: Testing cultural assumptions of power distance and individualism-collectivism


132 Muammer Ozer A moderated mediation model of the relationship between organizational citizenship behaviors and job performance Journal of Applied Psychology, 2011, Vol. 96, No. 6, 1328-1336

133 S. Douglas Pugh, Markus Groth and Thorsten Hennig-Thurau

Willing and able to fake emotions: A closer examination of the link between emotional dissonance and employee well-being Journal of Applied Psychology, 2011, Vol. 96, No. 2, 377-390

134 Simon Lloyd D. Restubog, Kristin L. Scott and Thomas J. Zagenczyk

When distress hits home: The role of contextual factors and psychological distress in predicting employees’ responses to abusive supervision


135 Zhaoli Song, Maw-Der Foo, Marilyn A. Uy and Shuhua Sun

Unraveling the daily stress crossover between unemployed individuals and their employed spouses Journal of Applied Psychology, 2011, Vol. 96, No. 1, 151-168

136 Sabine Sonnentag, Eva J. Mojza, Evangelia Demerouti and Arnold B. Bakker

Reciprocal relations between recovery and work engagement: The moderating role of job stressors Journal of Applied Psychology, 2012, Vol. 97, No. 4, 842-853

137 Andreas W. Richter, Giles Hirst, Daan van Knippenberg and Markus Baer

Creative self-efficacy and individual creativity in team contexts: Cross-level interactions with team informational resources Journal of Applied Psychology, 2012, Vol. 97, No. 6, 1282-1290

138 Anat Rafaeli, Amir Erez, Shy Ravid, Rellie Derfler-Rozin, Dorit Efrat Treister and Ravit Scheyer

When customers exhibit verbal aggression, employees pay cognitive costs Journal of Applied Psychology, 2012, Vol. 97, No. 5, 931-950

139 Steffen Raub and Hui Liao Doing the right thing without being told: Joint effects of initiative climate and general self-efficacy on employee proactive customer service performance


140 Steven W. Whiting, Timothy D. Maynes, Nathan P. Podsakoff and Philip M. Podsakoff

Effects of message, source, and context on evaluations of employee voice behaviour Journal of Applied Psychology, 2012, Vol. 97, No. 1, 159-182

141 Chia-Huei Wu and Mark A. Griffin Longitudinal relationships between core self-evaluations and job satisfaction Journal of Applied Psychology, 2012, Vol. 97, No. 2, 331-342

142 Thomas W. H. Ng and Daniel C. Feldman The effects of organizational and community embeddedness on work-to-family and family-to-work conflict Journal of Applied Psychology, 2012, Vol. 97, No. 6, 1233-1251

143 Karsten Mueller, Kate Hattrup, Sven-Oliver Spiess and Nick Lin-Hi

The effects of corporate social responsibility on employees’ affective commitment: A cross-cultural investigation Journal of Applied Psychology, 2012, Vol. 97, No. 6, 1186-1200

144 Lisa Schurer Lambert, Bennett J. Tepper, Jon C. Carr, Daniel T. Holt and Alex J. Barelka

Forgotten but not gone: An examination of fit between leader consideration and initiating structure needed and received Journal of Applied Psychology, 2012, Vol. 97, No. 5, 913-930

145

Hannes Leroy, Bart Dierynck, Frederik Anseel, Tony Simons, Jonathon R. B. Halbesleben, Deirdre McCaughey, Grant T. Savage and Luc Sels

Behavioral integrity for safety, priority of safety, psychological safety, and patient safety: A team-level study Journal of Applied Psychology, 2012, Vol. 97, No. 6, 1273-1281

146 Jason A. Colquitt, Jeffery A. LePine, Ronald F. Piccolo, Cindy P. Zapata, and Bruce L. Rich

Explaining the justice-performance relationship: Trust as exchange deepener or trust as uncertainty reducer? Journal of Applied Psychology, 2012, Vol. 97, No. 1, 1-15

147 Deanne N. Den Hartog and Frank D. Belschak When does transformational leadership enhance employee proactive behaviour? The role of autonomy and role breadth self-efficacy


148 Bart A. de Jong and Kurt T. Dirks Beyond shared perceptions of trust and monitoring in teams: Implications of asymmetry and dissensus Journal of Applied Psychology, 2012, Vol. 97, No. 2, 391-406

149 Marne L. Arthaud-Day, Joseph C. Rode and William H. Turnley

Direct and contextual effects of individual values on organisational citizenship behaviour in teams Journal of Applied Psychology, 2012, Vol. 97, No. 4, 792-807

150 Samuel Aryee, Fred O. Walumbwa, Emmanuel Y. M. Seidu and Lilian E. Otaye

Impact of high-performance work systems on individual- and branch level performance: Test of a multilevel model of intermediate linkages


151 Richard G. Netemeyer, Carrie M. Heilman and James G. Maxham, III

Identification with the retail organization and customer-perceived employee similarity: Effects on customer spending Journal of Applied Psychology, 2012, Vol. 97, No. 5, 1049-1058

152 Richard P. Bagozzi, Massimo Bergami, Gian Luca Marzocchi, and Gabriele Morandin

Customer-organization relationships: Development and test of a theory of extended identities Journal of Applied Psychology, 2012, Vol. 97, No. 1, 63-76

153 Uta K. Bindl, Sharon K. Parker, Peter Totterdell and Gareth Hagger-Johnson

Fuel of the self-starter: How mood relates to proactive goal regulation Journal of Applied Psychology, 2012, Vol. 97, No. 1, 134-150

154 Xiao-Ping Chen, Dong Liu and Rebecca Portnoy

A multilevel investigation of motivational cultural intelligence, organizational diversity climate, and cultural sales: Evidence from U.S. real estate firms


155 Lisa Dragoni and Maribeth Kuenzi Better understanding work unit goal orientation: Its emergence and impact under different types of work unit structure Journal of Applied Psychology, 2012, Vol. 97, No. 5, 1032-1048

156 D. Scott DeRue, Jennifer D. Nahrgang, John R. Hollenbeck and Kristina Workman

A quasi-experimental study of after-event reviews and leadership development Journal of Applied Psychology, 2012, Vol. 97, No. 5, 997-1015

157 Crystal I. C. Chien Farh, Myeong-Gu Seo and Paul E. Tesluk

Emotional intelligence, teamwork effectiveness, and job performance: The moderating role of job context Journal of Applied Psychology, 2012, Vol. 97, No. 4, 890-900

158 David M. Fisher, Suzanne T. Bell, Erich C. Dierdorff, and James A. Belohlav

Facet personality and surface-level diversity as team mental model antecedents: Implications for implicit coordination Journal of Applied Psychology, 2012, Vol. 97, No. 4, 825-841

159 Ravi S. Gajendran and Aparna Joshi Innovation in globally distributed teams: The role of LMX, communication frequency, and member influence on team decisions


160 Michele J. Gelfand, Lisa M. Leslie, Kirsten Keller and Carsten de Dreu

Conflict cultures in organizations: How leaders shape conflict cultures and their organizational-level consequences Journal of Applied Psychology, 2012, Vol. 97, No. 6, 1131-1147

161 Dvora Geller and Peter A. Bamberger The impact of help seeking on individual task performance: The moderating effect of help seekers’ logics of action Journal of Applied Psychology, 2012, Vol. 97, No. 2, 487-497

162 Robert T. Keller Predicting the performance and innovativeness of scientists and engineers Journal of Applied Psychology, 2012, Vol. 97, No. 1, 225-233

163 Karoline Strauss, Mark A. Griffin and Sharon K. Parker

Future work selves: How salient hoped-for identities motivate proactive career behaviors Journal of Applied Psychology, 2012, Vol. 97, No. 3, 580-598

164 Sharon Toker and Michal Biron Job burnout and depression: Unraveling their temporal relationship and considering the role of physical activity Journal of Applied Psychology, 2012, Vol. 97, No. 3, 699-710

165 Le Zhou, Mo Wang, Gilad Chen and Junqi Shi Supervisors’ upward exchange relationships and subordinate outcomes: Testing the multilevel mediation role of empowerment


166 Herman H. M. Tse, Catherine K. Lam, Sandra A. Lawrence and Xu Huang

When my supervisor dislikes you more than me: The effect of dissimilarity in leader-member exchange on coworkers’ interpersonal emotion and perceived help


167 Subrahmaniam Tangirala, Dishan Kamdar, Vijaya Venkataramani and Michael R. Parke

Doing right versus getting ahead: The effects of duty and achievement orientations on employees’ voice Journal of Applied Psychology, 2013, Vol. 98, No. 6, 1040-1050

168 Daniel J. Beal, John P. Trougakos, Howard M. Weiss, and Reeshad S. Dalal Affect spin and the emotion regulation process at work Journal of Applied Psychology, 2013, Vol. 98, No. 4, 593-605

169 Junqi Shi, Russell E. Johnson, Yihao Liu and Mo Wang

Linking subordinate political skill to supervisor dependence and reward recommendations: A moderated mediation model Journal of Applied Psychology, 2013, Vol. 98, No. 2, 374-384

170 Mindy K. Shoss, Robert Eisenberger, Simon Lloyd D. Restubog and Thomas J. Zagenczyk

Blaming the organization for abusive supervision: The roles of perceived organizational support and supervisor’s organizational embodiment


171 Aaron M. Watson, Lori Foster Thompson, Jane V. Rudolph, Thomas J. Whelan, Tara S. Behrend, and Amanda L. Gissel

When big brother is watching: Goal orientation shapes reactions to electronic monitoring during online training Journal of Applied Psychology, 2013, Vol. 98, No. 4, 642-657

172 Julie Holliday Wayne, Wendy J. Casper, Russell A. Matthews, and Tammy D. Allen

Family-supportive organization perceptions and organizational commitment: The mediating role of work-family conflict and enrichment and partner attitudes


173 James W. Beck and Aaron M. Schmidt State-level goal orientations as mediators of the relationship between time pressure and performance: A longitudinal study Journal of Applied Psychology, 2013, Vol. 98, No. 2, 354-363

174 D. Lance Ferris, Russell E. Johnson, Christopher C. Rosen, Emilija Djurdjevic, Chu-Hsiang Chang and James A. Tan

When is success not satisfying? Integrating regulatory focus and approach/avoidance motivation theories to explain the relation between core-self-evaluation and job satisfaction


175 Adam M. Grant and Nancy P. Rothbard When in doubt, seize the day? Security values, prosocial values, and proactivity under ambiguity Journal of Applied Psychology, 2013, Vol. 98, No. 5, 810-819

176 Nina Gupta, Daniel C. Ganster and Sven Kepes Assessing the validity of sales self-efficacy: A cautionary tale Journal of Applied Psychology, 2013, Vol. 98, No. 4, 690-700

177

Sean T. Hannah, John M. Schaubroeck, Ann C. Peng, Robert G. Lord, Linda K. Trevino, Steve W. J. Kozlowski, Bruce J. Avolio, Nikolaos Dimotakis and Joseph Doty

Joint influences of individual and work unit abusive supervision on ethical intentions and behaviors: A moderated mediation model


178 Daniel S. Stanhope, Samuel B. Pond III and Erica A. Surface

Core self-evaluations and training effectiveness: Prediction through motivational intervening mechanisms Journal of Applied Psychology, 2013, Vol. 98, No. 5, 820-831

179 Kristin L. Scott, Simon Lloyd D. Restubog and Thomas J. Zagenczyk

A social exchange-based model of the antecedents of workplace exclusion Journal of Applied Psychology, 2013, Vol. 98, No. 1, 37-48

180 Scott E. Seibert, Maria L. Kraimer, Brooks C. Holtom and Abigail J. Pierotti

Even the best laid plans sometimes go askew: Career self-management processes, career shocks, and the decision to pursue graduate education


181 Guo-hua Huang, Helen Hailin Zhao, Xiong-ying Niu, Susan J. Ashford and Cynthia Lee

Reducing job insecurity and increasing performance ratings: Does impression management matter? Journal of Applied Psychology, 2013, Vol. 98, No. 5, 852-862

182 Laura Huang, Marcia Frideger and Jone L. Pearce

Political skill: Explaining the effects of non-native accent on managerial hiring and entrepreneurial investment decisions Journal of Applied Psychology, 2013, Vol. 98, No. 6, 1005-1017

183

Michael G. Hughes, Eric Anthony Day, Xiaoqian Wang, Matthew J. Schuelke, Matthew L. Arsenault, Lauren N. Harkrider, and Olivia D. Cooper

Learner-controlled practice difficulty in the training of a complex task: Cognitive and motivational mechanisms Journal of Applied Psychology, 2013, Vol. 98, No. 1, 80-98

184 Ryan C. Johnson and Tammy D. Allen Examining the links between employed mothers’ work characteristics, physical activity, and child health Journal of Applied Psychology, 2013, Vol. 98, No. 1, 148-157

185 Timothy A. Judge, Jessica B. Rodell, Ryan L. Klinger, Lauren S. Simon and Eean R. Crawford

Hierarchical representations of the five-factor model of personality in predicting job performance: Integrating three organizing frameworks with two theoretical perspectives


186 Jun Liu, Cynthia Lee, Chun Hui, Ho Kwong Kwan and Long-Zeng Wu

Idiosyncratic deals and employee outcomes: The mediating roles of social exchange and self-enhancement and the moderating role of individualism


187 Lisa M. Leslie, Mark Snyder and Theresa M. Glomb

Who gives? Multilevel effects of gender and ethnicity on workplace charitable giving Journal of Applied Psychology, 2013, Vol. 98, No. 1, 49-62

188 Wu Liu, Subrahmaniam Tangirala and Rangaraj Ramanujam The relational antecedents of voice targeted at different leaders Journal of Applied Psychology, 2013, Vol. 98, No. 5, 841-851

189 Julie M. McCarthy, Chad H. Van Iddekinge, Filip Lievens, Mei-Chuan Kung, Evan F. Sinar and Michael A. Campion

Do candidate reactions relate to job performance or affect criterion-related validity? A multistudy investigation of relations among reactions, selection test scores, and job performance


190 Laurenz L. Meier and Paul E. Spector Reciprocal effects of work stressors and counterproductive work behaviour: A five-wave longitudinal study Journal of Applied Psychology, 2013, Vol. 98, No. 3, 529-539

191 Nathan T. Carter, Dev K. Dalal, Anthony S. Boyce, Matthew S. O’Connell, Mei-Chuan Kung and Kristin M. Delgado

Uncovering curvilinear relationships between conscientiousness and job performance: How theoretically appropriate measurement makes an empirical difference


192 Song Chang, Liangding Jia, Riki Takeuchi and Yahua Cai

Do high-commitment work systems affect creativity? A multilevel combinational approach to employee creativity Journal of Applied Psychology, 2014, Vol. 99, No. 4, 665-680

193 Jinseok S. Chun and Jin Nam Choi Members’ needs, intragroup conflict, and group performance Journal of Applied Psychology, 2014, Vol. 99, No. 3, 437-450

194 Stephen H. Courtright, Amy E. Colbert and Daejeong Choi

Fired up or burned out? How developmental challenge differentially impacts leader behaviour Journal of Applied Psychology, 2014, Vol. 99, No. 4, 681-696

195 Jeroen P. de Jong, Petru L. Curseu and Roger Th. A. J. Leenders

When do bad apples not spoil the barrel? Negative relationships in teams, team performance, and buffering mechanisms Journal of Applied Psychology, 2014, Vol. 99, No. 3, 514-522

196 Lisa Dragoni, Haeseen Park, Jim Soltis and Sheila Forte-Trammell

Show and tell: How supervisors facilitate leader development among transitioning leaders Journal of Applied Psychology, 2014, Vol. 99, No. 1, 66-86

197 Lisa Dragoni, In-Sue Oh, Paul E. Tesluk, Ozias A. Moore, Paul VanKatwyk and Joy Hazucha

Developing leaders’ strategic thinking through global work experience: The moderating role of cultural distance Journal of Applied Psychology, 2014, Vol. 99, No. 5, 867-882

198 Crystal I. C. Farh and Zhijun Chen Beyond the individual victim: Multilevel consequences of abusive supervision in teams Journal of Applied Psychology, 2014, Vol. 99, No. 6, 1074-1095

199 David M. Fisher Distinguishing between taskwork and teamwork planning in teams: Relations with coordination and interpersonal processes Journal of Applied Psychology, 2014, Vol. 99, No. 3, 423-436

200 David M. Fisher A multilevel cross-cultural examination of role overload and organizational commitment: Investigating the interactive effects of context


201 Erik Gonzalez-Mule, David S. DeGeest, Brian W. McCormick, Jee Young Seong and Kenneth G. Brown

Can we get some cooperation around here? The mediating role of group norms on the relationship between team personality and individual helping behaviors


202 Vicente Gonzalez-Roma and Ana Hernandez Climate uniformity: Its influence on team communication quality, task conflict, and team performance Journal of Applied Psychology, 2014, Vol. 99, No. 6, 1042-1058

203 Rebecca L. Greenbaum, Matthew J. Quade, Mary B. Mawritz, Joongseo Kim and Durand Crosby

When the customer is unethical: The explanatory role of employee emotional exhaustion onto work-family conflict, relationship conflict with coworkers, and job neglect


204 Julia E. Hoch and Steve W. J. Kozlowski Leading virtual teams: Hierarchical leadership, structural supports, and shared team leadership Journal of Applied Psychology, 2014, Vol. 99, No. 3, 390-403

205 Xu Huang, JJ Po-An Hsieh, and Wei He Expertise dissimilarity and creativity: The contingent roles of tacit and explicit knowledge sharing Journal of Applied Psychology, 2014, Vol. 99, No. 5, 816-830

206 Jaclyn M. Jensen, Pankaj C. Patel and Jana L. Raver

Is it better to be average? High and low performance as predictors of employee victimization Journal of Applied Psychology, 2014, Vol. 99, No. 2, 296-309

207 Howard J. Klein, Joseph T. Cooper, Janice C. Molloy and Jacqueline A. Swanson

The assessment of commitment: Advantages of a unidimensional, target-free approach Journal of Applied Psychology, 2014, Vol. 99, No. 2, 222-238

208 Alex Ning Li and Hui Liao How do leader-member exchange quality and differentiation affect performance in teams? An integrated multilevel dual process model


209 Wen-Dong Li, Doris Fay, Michael Frese, Peter D. Harms and Xiang Yu Gao

Reciprocal relationship between proactive personality and work characteristics: A latent change score approach Journal of Applied Psychology, 2014, Vol. 99, No. 5, 948-965

210 Huiwen Lian, D. Lance Ferris, Rachel Morrison and Douglas J. Brown

Blame it on the supervisor or the subordinate? Reciprocal relations between abusive supervision and organizational deviance


211 Sandy Lim and Kenneth Tai Family incivility and job performance: A moderated mediation model of psychological distress and core self-evaluation Journal of Applied Psychology, 2014, Vol. 99, No. 2, 351-359

212 Songqi Liu, Mo Wang, Hui Liao and Junqi Shi Self-regulation during job search: The opposing effects of employment self-efficacy and job search behaviour self-efficacy Journal of Applied Psychology, 2014, Vol. 99, No. 6, 1159-1172

213 Russell A. Matthews, Julie Holliday Wayne and Michael T. Ford

A work-family conflict/subjective well-being process model: A test of competing theories of longitudinal effects Journal of Applied Psychology, 2014, Vol. 99, No. 6, 1173-1187

214 M. Travis Maynard, Margaret M. Luciano, Lauren D’Innocenzo, John E. Mathieu and Matthew D. Dean

Modeling time-lagged reciprocal psychological empowerment-performance relationships Journal of Applied Psychology, 2014, Vol. 99, No. 6, 1244-1253

215 Timothy D. Maynes and Philip M. Podsakoff Speaking more broadly: An examination of the nature, antecedents, and consequences of an expanded set of employee voice behaviors


216 Susan Mohammed and Sucheta Nadkarni Are we all on the same temporal page? The moderating effects of temporal team cognition on the polychronicity diversity-team performance relationship


217 Inbal Nahum-Shani, Melanie M. Henderson, Sandy Lim and Amiram D. Vinokur

Supervisor support: Does supervisor support buffer or exacerbate the adverse effects of supervisor undermining? Journal of Applied Psychology, 2014, Vol. 99, No. 3, 484-503

218 Christopher D. Nye, Bradley J. Brummel and Fritz Drasgow

Understanding sexual harassment using aggregate construct models Journal of Applied Psychology, 2014, Vol. 99, No. 6, 1204-1221

219 Jerel E. Slaughter, Daniel M. Cable and Daniel B. Turban

Changing job seekers’ image perceptions during recruitment visits: The moderating role of belief confidence Journal of Applied Psychology, 2014, Vol. 99, No. 6, 1146-1158

220 Gergana Todorova, Julia B. Bear and Laurie R. Weingart

Can conflict be energizing? A study of task conflict, positive emotions, and job satisfaction Journal of Applied Psychology, 2014, Vol. 99, No. 3, 451-467

221 Prajya R. Vidyarthi, Berrin Erdogan, Smriti Anand, Robert C. Liden and Anjali Chaudhry

One member, two leaders: Extending leader-member exchange theory to a dual leadership context Journal of Applied Psychology, 2014, Vol. 99, No. 3, 468-483

222 David D. Walker, Danielle D. van Jaarsveld and Daniel P. Skarlicki

Exploring the effects of individual customer incivility encounters on employee incivility: The moderating roles of entity (in)civility and negative affectivity


223 Kai Chi Yam, Ryan Fehr and Christopher M. Barnes

Morning employees are perceived as better employees: Employees’ start times influence supervisor performance ratings Journal of Applied Psychology, 2014, Vol. 99, No. 6, 1288-1299

224 Craig Wallace, Gilad Chen A Multilevel Integration Of Personality, Climate, Self-Regulation, And Performance Personnel Psychology, 2006, 59, 529–557

225 Michael Mount, Remus Ilies, Erin Johnson Relationship of personality traits and counterproductive work behaviors: The mediating effects of job satisfaction Personnel Psychology, 2006, 59, 591–622

226 David A. Hofmann, Barbara Mark An investigation of the relationship between safety climate and medication errors as well as other nurse and patient outcomes Personnel Psychology, 2006, 59, 847–869

227 Patrick F. McKay, Derek R. Avery, Scott Tonidandel, Mark A. Morris, Morela Hernandez, Michelle R. Hebl

Racial differences in employee retention: Are diversity climate perceptions the key? Personnel Psychology, 2007, 60, 35–62

228 Erich C. Dierdorff, Eric A. Surface Placing peer ratings in context: Systematic influences beyond ratee performance Personnel Psychology, 2007, 60, 93–126

229 Peng Wang, Fred O. Walumbwa Family-friendly programs, organizational commitment, and work withdrawal: The moderating role of transformational leadership

Personnel Psychology, 2007, 60, 397–427

230 Fred Luthans, Bruce J. Avolio, James B. Avey, Steven M. Norman

Positive psychological capital: Measurement and relationship with performance and satisfaction Personnel Psychology, 2007, 60, 541–572

231 Hao Zhao, Sandy J. Wayne, Brian C. Glibkowski, Jesus Bravo

The impact of psychological contract breach on work-related outcomes: A meta-analysis Personnel Psychology, 2007, 60, 647–680

233 Colin M. Gill, Gerard P. Hodgkinson Development and validation of the Five-Factor Model Questionnaire: An adjectival-based personality inventory for use in occupational settings


234 Mel Fugate, Angelo J. Kinicki, Gregory E. Prussia

Employee coping with organizational change: An examination of alternative theoretical perspectives and models Personnel Psychology, 2008, 61, 1–36

235 Paul J. Taylor, Wen-Dong Li, Kan Shi, Walter C. Borman The transportability of job information across countries Personnel Psychology, 2008, 61, 69–111

236 Michael K. Mount, In-Sue Oh, and Melanie Burns

Incremental validity of perceptual speed and accuracy over general mental ability Personnel Psychology, 2008, 61, 113–139

237 Jeffery A. LePine, Ronald F. Piccolo, Christine L. Jackson, John E. Mathieu, Jessica R. Saul

A meta-analysis of teamwork processes: Tests of a multidimensional model and relationships with team effectiveness criteria


238 Lisa H. Nishii, David P. Lepak, Benjamin Schneider

Employee attributions of the “why” of HR practices: Their effects on employee attitudes and behaviors, and customer satisfaction Personnel Psychology, 2008, 61, 503–545

239 Ronald Bledow and Michael Frese A situational judgment test of personal initiative and its relationship to performance Personnel Psychology, 2009, 62, 229–258

240 Jixia Yang, James M. Diefendorff The relations of daily counterproductive workplace behavior with emotions, situational antecedents, and personality moderators: A diary study in Hong Kong


241 Janet H. Marler, Sandra L. Fisher, Weiling Ke Employee self-service technology acceptance: A comparison of pre-implementation and post-implementation relationships Personnel Psychology, 2009, 62, 327–358

242 Herman Aguinis, Mark D. Mazurkiewicz, Eric D. Heggestad

Using web-based frame-of-reference training to decrease biases in personality-based job analysis: An experimental field study Personnel Psychology, 2009, 62, 405–438

243 Chad H. Van Iddekinge, Gerald R. Ferris, Tonia S. Heffner

Test of a multistage model of distal and proximal antecedents of leader performance Personnel Psychology, 2009, 62, 463–495

244 Daniel B. Turban, Cynthia K. Stevens Effects of conscientiousness and extraversion on new labor market entrants’ job search: The mediating role of metacognitive activities and positive emotions


245 Brian J. Hoffman, David J. Woehr Disentangling the meaning of multisource performance rating source and dimension factors Personnel Psychology, 2009, 62, 735–765

246 Chih-Hsun Chuang, Hui Liao Strategic Human Resource Management In Service Context: Taking Care Of Business By Taking Care Of Employees And Customers


247 Connie R. Wanberg, Zhen Zhang, Erica W. Diehn

Development of the “Getting Ready for Your Next Job” inventory for unemployed individuals Personnel Psychology, 2010, 63, 439–478

248 Gary J. Greguras, James M. Diefendorff Why does proactive personality predict employee life satisfaction and work behaviors? A field investigation of the mediating role of the self-concordance model


249 María del Carmen Triana, María Fernanda García, Adrienne Colella

Managing diversity: How organizational efforts to support diversity moderate the effects of perceived racial discrimination on affective commitment


250 Dawn S. Carlson, Merideth Ferguson, Pamela L. Perrewe and Dwayne Whitten

The fallout from abusive supervision: An examination of subordinates and their partners Personnel Psychology, 2011, 64, 937-961

251 Shoshana R. Dobrow and Jennifer Tosti-Kharas Calling: The development of a scale measure Personnel Psychology, 2011, 64, 1001-1049

252 Lisa Dragoni, In-Sue Oh, Paul Vankatwyk and Paul E. Tesluk

Developing executive leaders: The relative contribution of cognitive ability, personality, and the accumulation of work experience in predicting strategic thinking competency

Personnel Psychology, 2011, 64, 829-864

253 J. Robert Baum, Barbara Jean Bird and Sheetal Singh

The practical intelligence of entrepreneurs: Antecedents and a link with new venture growth Personnel Psychology, 2011, 64, 397-425

254 Sean T. Hannah, Fred O. Walumbwa and Louis W. Fry

Leadership in action teams: Team leader and members’ authenticity, authenticity strength, and team outcomes Personnel Psychology, 2011, 64, 771-802

255 Theresa M. Glomb, Devasheesh P. Bhave, Andrew G. Miner and Melanie Wall

Doing good, feeling good: Examining the role of organizational citizenship behaviors in changing mood Personnel Psychology, 2011, 64, 191-223

256 Brian J. Hoffman, Klaus G. Melchers, Carrie A. Blair, Martin Kleinmann and Robert T. Ladd

Exercises and dimensions are the currency of assessment centers Personnel Psychology, 2011, 64, 351-395

257 Jason L. Huang and Ann Marie Ryan Beyond personality traits: A study of personality states and situational contingencies in customer service jobs Personnel Psychology, 2011, 64, 451-488

258 Brian K. Griepentrog, Crystal M. Harold, Brian C. Holtz, Richard J. Klimoski and Sean M. Marsh

Integrating social identity and the theory of planned behaviour: Predicting withdrawal from an organizational recruitment process


259 Scott B. Mackenzie, Philip M. Podsakoff, Nathan P. Podsakoff

Challenge-oriented organizational citizenship behaviors and organizational effectiveness: Do challenge-oriented behaviors really have an impact on the organization’s bottom line?


260 Shaul Oreg and Yair Berson Leadership and employees’ reactions to change: The role of leaders’ personal attributes and transformational leadership style Personnel Psychology, 2011, 64, 627-659

261 Suzanne J. Peterson, Fred Luthans, Bruce J. Avolio, Fred O. Walumbwa and Zhen Zhang

Psychological capital and employee performance: A latent growth modelling approach Personnel Psychology, 2011, 64, 427-450

262 Christopher R. Plouffe and Yany Gregoire Intraorganizational employee navigation and socially derived outcomes: Conceptualization, validation, and effects on overall performance


263 John J. Sumanth and Daniel M. Cable Status and organizational entry: How organizational and individual status affect justice perceptions of hiring systems Personnel Psychology, 2011, 64, 963-1000

264 Fred O. Walumbwa, Russell Cropanzano and Barry M. Goldman

How leader-member exchange influences effective work behaviors: Social exchange and internal-external efficacy perspectives


265 Mo Wang and Elizabeth McCune Understanding newcomers’ adaptability and work-related outcomes: Testing the mediating roles of perceived P-E fit variables

Personnel Psychology, 2011, 64, 163-189

266 Riki Takeuchi, Zhijun Chen and Siu Yin Cheung

Applying uncertainty management theory to employee voice behaviour: An integrative investigation Personnel Psychology, 2012, 65, 283-323

267 Subrahmaniam Tangirala and Rangaraj Ramanujam

Ask and you shall hear (but not always): Examining the relationship between manager consultation and employee voice Personnel Psychology, 2012, 65, 251-282

268 Belle Rose Ragins, Jorge A. Gonzalez, Kyle Ehrhardt and Romila Singh

Crossing the threshold: The spillover of community racial diversity and diversity climate to the workplace Personnel Psychology, 2012, 65, 755-787

269 Mary Bardes Mawritz, David M. Mayer, Jenny M. Hoobler, Sandy J. Wayne and Sophia V. Marinova

A trickle-down model of abusive supervision Personnel Psychology, 2012, 65, 325-357

270 Celia Moore, James R. Detert, Linda Klebe Trevino, Vicki L. Baker and David M. Mayer

Why employees do bad things: Moral disengagement and unethical organizational behaviour Personnel Psychology, 2012, 65, 1-48

271 Brian J. Hoffman, C. Allen Gorman, Carrie A. Blair, John P. Meriac, Benjamin Overstreet and E. Kate Atchley

Evidence for the effectiveness of an alternative multisource performance rating methodology Personnel Psychology, 2012, 65, 531-563

272 Yuanyuan Huo, Wing Lam, Ziguang Chen Am I the only one this supervisor is laughing at? Effects of aggressive humor on employee strain and addictive behaviors Personnel Psychology, 2012, 65, 859-885

273 Suzanne J. Peterson, Benjamin M. Galvin and Donald Lange

CEO servant leadership: Exploring executive characteristics and firm performance Personnel Psychology, 2012, 65, 565-596

274 Myeong-Gu Seo, M. Susan Taylor, N. Sharon Hill, Xiaomeng Zhang, Paul E. Tesluk and Natalia M. Lorinkova

The role of affect and leadership during organizational change Personnel Psychology, 2012, 65, 121-165

275 Sabine Sonnentag and Adam M. Grant Doing good at work feels good at home, but not right away: When and why perceived prosocial impact predicts positive affect


276 Stanley M. Gully, Jean M. Phillips, William G. Castellano, Kyongji Han and Andrea Kim

A mediated moderation model of recruiting socially and environmentally responsible job applicants Personnel Psychology, 2013, 66, 935-973

277 Derek R. Avery, Mo Wang, Sabrina D. Volpone and Le Zhou

Different strokes for different folks: The impact of sex dissimilarity in the empowerment-performance relationship Personnel Psychology, 2013, 66, 757-784

278 Erik R. Eddy, Scott I. Tannenbaum and John E. Mathieu

Helping teams to help themselves: Comparing two team-led debriefing methods Personnel Psychology, 2013, 66, 975-1008

279 Alicia A. Grandey, Nai-Wen Chi and Jennifer A. Diamond

Show me the money! Do financial rewards for performance enhance or undermine the satisfaction from emotional labor? Personnel Psychology, 2013, 66, 569-612

280 Angelo J. Kinicki, Kathryn J. L. Jacobson, Suzanne J. Peterson and Gregory E. Prussia

Development and validation of the performance management behaviour questionnaire Personnel Psychology, 2013, 66, 1-45

281 Ning Li, Dan S. Chiaburu, Bradley L. Kirkman and Zhitao Xie

Spotlight on the followers: An examination of moderators of relationships between transformational leadership and subordinates’ citizenship and taking charge


282 Thomas W. H. Ng and Daniel C. Feldman Changes in perceived supervisor embeddedness: Effects on employees’ embeddedness, organizational trust, and voice behaviour


283 Gera Noordzij, Edwin A. J. Van Hooft, Heleen Van Mierlo, Arian Van Dam and Marise Ph. Born

The effects of a learning-goal orientation training on self-regulation: A field experiment among unemployed job seekers Personnel Psychology, 2013, 66, 723-755

284 Robert S. Rubin, Erich C. Dierdorff and Daniel G. Bachrach

Boundaries of citizenship behaviour: Curvilinearity and context in the citizenship and task performance relationship Personnel Psychology, 2013, 66, 377-406

285 Deborah E. Rupp, Ruodan Shao, Meghan A. Thornton and Daniel P. Skarlicki

Applicants’ and employees’ reactions to corporate social responsibility: The moderating effects of first-party justice perceptions and moral identity


286 Daniel B. Turban, Felissa K. Lee, Serge P. Da Motta Veiga, Dana L. Haggard and Sharon Y. Wu

Be happy, don’t wait: The role of trait affect in job search Personnel Psychology, 2013, 66, 483-514

287 Devasheesh P. Bhave The invisible eye? Electronic performance monitoring and employee job performance Personnel Psychology, 2014, 67, 605-635

288 Stephan A. Boehm, Florian Kunze and Heike Bruch

Spotlight on age-diversity climate: The impact of age-inclusive HR practices on firm-level outcomes Personnel Psychology, 2014, 67, 667-704

289 Wendy R. Boswell, Julie B. Olson-Buchanan and T. Brad Harris

I cannot afford to have a life: Employee adaptation to feelings of job insecurity Personnel Psychology, 2014, 67, 887-915

290 Amy E. Colbert, Murray R. Barrick and Bret H. Bradley

Personality and leadership composition in top management teams: Implications for organizational effectiveness Personnel Psychology, 2014, 67, 351-387

291 Hong Deng and Kwok Leung Contingent punishment as a double-edged sword: A dual-pathway model from a sense-making perspective Personnel Psychology, 2014, 67, 951-980

292 Graham Brown, Craig Crossley and Sandra L. Robinson

Psychological ownership, territorial behaviour, and being perceived as a team contributor: The critical role of trust in the work environment


293 T. Brad Harris, Ning Li, Wendy R. Boswell, Xin-An Zhang and Zhitao Xie

Getting what’s new from newcomers: Empowering leadership, creativity, and adjustment in the socialization context Personnel Psychology, 2014, 67, 567-604

294 Dong Liu, Morela Hernandez and Lei Wang The role of leadership and trust in creating structural patterns of team procedural justice: A social network investigation Personnel Psychology, 2014, 67, 801-845

295 Jean M. Phillips, Stanley M. Gully, John E. McCarthy, William G. Castellano and Mee Sook Kim

Recruiting global travellers: The role of global travel recruitment messages and individual differences in perceived fit, attraction, and job pursuit intentions


296 Belle Rose Ragins, Karen S. Lyness, Larry J. Williams and Doan Winkel

Life spillovers: The spillover of fear of home foreclosure to the workplace Personnel Psychology, 2014, 67, 763-800

297 B. Sebastian Reiche, Pablo Cardona, Yih-teen Lee, Miguel Angel Canela, Esther Akinnukawe, et al.

Why do managers engage in trustworthy behaviour? A multilevel cross-cultural study in 18 countries Personnel Psychology, 2014, 67, 61-98

298 Hong Ren, Margaret A. Shaffer, David A. Harrison, Carmen Fu and Katherine M. Fodchuk

Reactive adjustment or proactive embedding? Multistudy, multiwave evidence for dual pathways to expatriate retention Personnel Psychology, 2014, 67, 203-239

299 Ruodan Shao and Daniel P. Skarlicki Service employees’ reactions to mistreatment by customers: A comparison between North America and East Asia Personnel Psychology, 2014, 67, 23-59

300 Jerel E. Slaughter, Michael S. Christian, Nathan P. Podsakoff, Evan F. Sinar and Filip Lievens

On the limitations of using situational judgement tests to measure interpersonal skills: The moderating influence of employee anger

Personnel Psychology, 2014, 67, 847-885

301 Jeffrey R. Spence, Douglas J. Brown, Lisa M. Keeping and Huiwen Lian

Helpful today, but not tomorrow? Feeling grateful as a predictor of daily organizational citizenship behaviors Personnel Psychology, 2014, 67, 705-738

17 REFERENCES

Abraham, R. (1999). Emotional intelligence in organisations: a conceptualization.

Genetic, Social and General Psychology Monographs 125, 209–227.

Akaike, H. (1973). Information theory and an extension of the maximum

likelihood principle. In B. N. Petrov & F. Csaki (Eds.), Second International

Symposium on Information Theory, (pp. 267-281). Academiai Kiado: Budapest.

Anderson, T. W., & Rubin, H. (1956). Statistical inference in factor analysis.

Proceedings of the Third Berkeley Symposium on Mathematical Statistics and

Probability (pp. 111-150). Berkeley: University of California Press.

Australian Government. (Comcare, 2011). Work ability: Professor Juhani

Ilmarinen. Retrieved 1 March, 2013, from

http://www.comcare.gov.au/news__and__media/news_listing/work_abilityprofessor_juhani_ilmarinen

Avolio, B. J., Yammarino, F. J., & Bass, B. M. (1991). Identifying common

methods variance with data collected from a single source: An unresolved sticky issue.

Journal of Management, 17, 571-586.

Bagozzi, R. P. (1977). Structural equation models in experimental research.

Journal of Marketing Research, 14, 209-226.

Bagozzi, R. P. (1980). Causal Modeling in Marketing. New York, NY:

Wiley & Sons.

Bagozzi, R. P., & Yi, Y. (1990). Assessing method variance in multitrait-

multimethod matrices: The case of self-reported affect and perceptions at work. Journal

of Applied Psychology, 75(5), 547-560.

Bagozzi, R. P., & Yi, Y. (1991). Multitrait-Multimethod matrices in consumer

research. Journal of Consumer Research, 17(4), 426-439.

Barclay, D., Higgins, C. and Thompson, R. (1995). The Partial Least Squares

(PLS) Approach to Causal Modeling: Personal Computer Adoption and Use as an

Illustration, Technology Studies, 2, 285-309.

Becker, J. M., Klein, K., & Wetzels, M. (2012). Hierarchical latent variable models

in PLS-SEM: guidelines for using reflective-formative type models, Long Range

Planning 45 (6), 359-394.

Bentler, P. M. (1968). Alpha-maximized factor analysis (Alphamax): Its relation

to alpha and canonical factor analysis. Psychometrika, 33, 335-345.

Bentler, P. M. (1972). A lower-bound method for the dimension-free

measurement of internal consistency. Social Science Research, 1, 343-357.

Bentler, P. M. (1986). Structural modeling and Psychometrika: An historical

perspective on growth and achievements. Psychometrika, 51(1), 35-51.

Bentler, P. M. (2006). EQS 6 structural equations program manual. Encino, CA:

Multivariate Software (www.mvsoft.com). Los Angeles.

Bentler, P. M. (2007). Covariance structure models for maximal reliability of

unit-weighted composites. In S.-Y. Lee (Ed.), Handbook of latent variable and related

models (pp. 1-19). Amsterdam: North-Holland.

Bentler, P. M. (2009). Alpha, dimension-free, and model-based internal

consistency reliability. Psychometrika, 74, 137-143.

Bentler, P. M. (2014). Covariate-free and Covariate-dependent Reliability. The

79th Annual Meeting of the Psychometric Society. Madison, Wisconsin. July 22-25.

Blakeley, J. A. & Ribeiro, V. E. S. (2008). Early retirement among registered

nurses: Contributing factors. Journal of Nursing Management, 16, 29-37.

Blalock, H. M. (1971). Causal models in the social sciences. Chicago: Aldine-

Atherton.

Bock, R. D., and Aitkin, M. (1981). Marginal maximum likelihood estimation of

item parameters: Application of an EM algorithm. Psychometrika, 46, 443-459.

Bollen, K. A. (1984). Multiple indicators: Internal consistency or no necessary

relationship? Quality and Quantity, 18, 377-385.

Bollen, K. A. (1989). Structural Equations with Latent Variables, New York:

John Wiley & Sons.

Bollen, K., & Lennox, R. (1991). Conventional wisdom on measurement: A

structural equation perspective. Psychological Bulletin, 110, 305-314.

Boumans, N. P. G., de Jong, A. H. J., & Vanderlinden, L. (2008). Determinants

of early retirement intentions among Belgian nurses. Journal of Advanced Nursing,

63(1), 64-74.

Brooke, E., Goodall, J., & Mawren, D. (2010). Retaining older workforces in

aged care work. Paper presented at the 4th International Symposium on Work Ability:

Age Management during the Life Course, pp. 187-197, Tampere, Finland.

Browne, M. W. (1968). A comparison of factor analytic techniques.

Psychometrika, 33, 267-334.

Buck, R., Varnava, A., Wynne-Jones, G., Phillips, C., Farewell, D., Porteous, C.,

Webb, K., Button, L., Cooper, L., & Main, C. (2008). Health and well-being in work in

Merthyr Tydfil: A biopsychosocial approach. Well-being in Work Stage 2: Final Report

to the Wales Centre for Health and Welsh Assembly Government.

www.wellbeinginwork.org

Burnham, K. P., & Anderson, D. R. (2004). Multimodel inference: Understanding

AIC and BIC in model selection. Sociological Methods & Research, 33(2).

BWA Centre for Research. (2007a). The redesigning work for an ageing society

project: Fact Sheet 1. Retrieved 2 March, 2013, from

http://www.swinburne.edu.au/business/business-work-

ageing/documents/ARC_FactSheet1_16Mar07.pdf

BWA Centre for Research. (2007b). What is work ability?: Fact Sheet 2.

Retrieved 2 March, 2013, from http://www.swinburne.edu.au/business/business-work-

ageing/documents/ARC_FactSheet2_10Sep07.pdf

Byrne, B. M. (2006). Structural equation modeling with EQS: Basic concepts,

application, and programming. New Jersey: Lawrence Erlbaum Associates.

Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation

by the multitrait-multimethod matrix. Psychological Bulletin, 56(2), 81-105.

Carroll, R. J., Ruppert, D., Stefanski, L. A. and Crainiceanu, C. M. (2006).

Measurement Error in Nonlinear Models: A Modern Perspective, 2nd ed. Chapman and

Hall/CRC Press: Boca Raton.

Cattell, R. B. (1978). The scientific use of factor analysis in behavioural and life

sciences. New York: Plenum.

Cheung, G. W., & Rensvold, R. B. (2002). Evaluating Goodness-of-Fit Indexes

for Testing Measurement Invariance, Structural Equation Modeling: A Multidisciplinary

Journal, 9(2), 233-255.

Cheung, M. W. (2008). A Model for Integrating Fixed-, Random-, and Mixed-

Effects Meta-Analyses Into Structural Equation Modeling. Psychological Methods,

13(3), 182–202.

Chin, W. W., & Newsted, P. R. (1999). Structural equation modeling analysis

with small samples using partial least squares. In Hoyle, R. (Ed.), Statistical strategies

for small samples research (pp. 307–341). Thousand Oaks, CA: Sage.

Chin, W.W. (1998). The partial least squares approach to structural equation

modeling. In: Marcoulides, G.A. (Ed.), Modern Methods for Business Research.

Erlbaum, Mahwah, pp. 295–358.

Chin, W. W., Marcolin, B. L., & Newsted, P. R. (2003). A partial least squares

latent variable modeling approach for measuring interaction effects: Results from a

Monte Carlo simulation study and an electronic-mail emotion/adoption study.

Information Systems Research, 14(2), 189–217.

Chin, W. W. (2010). How to write up and report PLS analyses. In Esposito, V., et

al. (Eds.), Handbook of Partial Least Squares, pp. 655-690.

Ciarrochi, J., Chan, A. Y. C., & Caputi, P. (2000). A critical evaluation of the

emotional intelligence concept. Personality and Individual Differences 28, 1477–1490.

Copertano, A., Bevilacqua, G., Barbaresi, M., Barchiesi, F., & Copertano, B.

(2010). Work-related stress: Risk assessment in the local regional health service unit of

Ancona [La valutazione dello stress lavoro-correlato nell’azienda sanitaria di Ancona].

Giornale Italiano di Medicina del Lavoro ed Ergonomia, 29(4), 128-129.

Cortina, J. M. (1993). What is coefficient alpha? An examination of theory and

applications. Journal of Applied Psychology, 78, 98-104.

Costner, H. L. (1971). Utilizing causal models to discover flaws in experiments.

Sociometry, 34, 398-410.

Cox, T., Thirlaway, M., Gotts, G., & Cox, S. (1983). The nature and assessment of

general wellbeing. Journal of Psychosomatic Research, 27, 353-359.

Cox, T. (1997). Workplace health promotion. Work & Stress, 11, 1-5.

Crocker, L., & Algina, J. (1986). Introduction to classical and modern test

theory. New York: Holt, Rinehart, and Winston.

Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests.

Psychometrika, 16, 297-334.

Cronbach, L. J., & Meehl, P. E. (1955). Construct validity in psychological

tests. Psychological Bulletin, 52, 281-302.

Crossley, C. D., Bennett, R. J., Jex, S. M., & Burnfield, J. L. (2007).

Development of a global measure of job embeddedness and integration into a traditional

model of voluntary turnover. Journal of Applied Psychology, 92(4), 1031-1042.

Cunningham, E. (2008). A practical guide to Structural Equation Modeling using

AMOS. Melbourne: Statsline.

Daws, J. (2012). Finnish history of work ability research and age management.

Retrieved 6 March, 2013, from

http://www.ngssuper.com.au/assets/Images/Supermembers/NGS-SA-2011-12-WinnerJimDaws-1102-1012.pdf

D’Errico, A., Viotti, S., Baratti, A., Mottura, B., Barocelli, A.P., Tagna, M.,

Sgambelluri, B., Battaglino, P., & Converso, D. (2013). Low back pain and associated

presenteeism among hospital nursing staff. Journal of Occupational Health, 55, 276-283.

de Zwart, B. C., Frings-Dresen, M. H., & van Duivenbooden, J. C. (2002). Test–

retest reliability of the Work Ability Index questionnaire. Occupational Medicine, 52(4),

177-181.

DeCarlo, L. T. (1997). On the meaning and use of kurtosis. Psychological

Methods, 2, 292-307.

DeMars, C. (2010). Item Response Theory. New York: Oxford University Press.

Diamantopoulos, A., & Winklhofer, H. M. (2001). Index Construction with Formative

Indicators: An Alternative to Scale Development. Journal of Marketing Research, 38

(2), 269-277.

Diamantopoulos, A. (2010). Reflective and Formative Metrics of Relationship

Value: Response to Baxter’s Commentary Essay, Journal of Business Research, 63(1),

91-93.

Diamantopoulos, A., Riefler, P., and Roth, K. P. (2008). Advancing Formative

Measurement Models, Journal of Business Research, 61(12), pp. 1203-1218.

Donaldson, S. I., & Grant-Vallone, E. J. (2002). Understanding self-report bias in

organizational behavior research. Journal of Business and Psychology, 17(2), 245-260.

Doty, D. H., & Glick, W. H. (1998). Common method bias: Does common

methods variance really bias results? Organizational Research Methods, 1(4), 374-406.

Edwards, J. R., & Bagozzi, R. P. (2000). On the Nature and Direction of

Relationships Between Constructs and Measures. Psychological Methods, 5(2), 155-174.

Efron, B., & Tibshirani, R. J. (1993). An introduction to the bootstrap.

Monographs on Statistics and Applied Probability, no. 57. New York, NY: Chapman

and Hall.

Eskelinen, L., Kohvakka, A., Merisalo, T., Hurri, H., & Wagar, G. (1991).

Relationship between the self-assessment and clinical assessment of health status and

work ability. Scand J Work Environ Health, 17(Suppl 1), 40-47.

European Network for Workplace Health Promotion (ENWHP) & National

WORK ABILITY INDEX (WAI) Network. (2012). Work Ability Index - Europe.

Retrieved 27 Feb, 2013, from http://www.thcu.ca/workplace/sat/pubs/tool_159.pdf

Faragher, E. B., Cooper, C. L., & Cartwright, S. (2004). A shortened stress

evaluation tool (ASSET). Stress and Health, 20, 189-201.

Finnish Institute of Occupational Health. (2011). Multidimensional work ability

model. Helsinki, Finland.

Fochsen, G., Josephson, M., Hagberg, M., Toomingas, A., & Lagerström, M.

(2006). Predictors of leaving nursing care: a longitudinal study among Swedish nursing

personnel. Occupational and Environmental Medicine, 63(3), 198-201.

doi:10.1136/oem.2005.021956

Fornell, C., & Larcker, D. F. (1981). Evaluating structural equation models with

unobservable variables and measurement error. Journal of Marketing Research, 18, 39–

50.

Fornell, C., and Bookstein, F. L. (1982). A Comparative Analysis of Two

Structural Equation Models: LISREL and PLS Applied to Market Data, in C. Fornell

(ed.), A Second Generation of Multivariate Analysis, New York: Praeger, 289-324.

Frisch, R. (1934). Statistical confluence analysis by means of complete

regression systems. Oslo: Oslo University.

Frisch, R. and Waugh, F. (1933). Partial Time Regressions as Compared with

Individual Trends. Econometrica, 1 (4), 387-401.

Ganster, D. C., Hennessey, H. W., & Luthans, F. (1983). Social desirability

response effects: Three alternative models. The Academy of Management Journal 26(2),

321-331.

Geladi, P. (1988) "Notes on the History and Nature of Partial Least Squares

(PLS) Modeling", Journal of Chemometrics, 2, 231-246

Gerbing, D. W., & Anderson, J. C. (1988). An updated paradigm for scale

development incorporating unidimensionality and its assessment. Journal of Marketing

Research, 25, 186-192.

Gignac, G. E. (2007). Multifactor modeling in individual differences research:

Some recommendations and suggestions. Personality and Individual Differences, 42, 37-48.

Gignac, G. E. (2008). Higher-order models versus direct hierarchical models: g as

superordinate or breadth factor? Psychology Science, 50, 21-43.

Gignac, G. E. (2013). Modeling the Balanced Inventory of Desirable

Responding: Evidence in favour of a revised model of socially desirable responding.

Journal of Personality Assessment, 95, 645-656.

Gignac, G. E., & Watkins, M. W. (2013). Bifactor modeling and the estimation

of model-based reliability in the WAIS-IV. Multivariate

Behavioral Research, 48, 639-662.

Gilbreath, B., & Frew, E. J. (2008). The stress-related presenteeism scale.

Pueblo, CO: Colorado State University-Pueblo, Hasan School of Business.

Glick, W. H., Jenkins, G. D., Jr., & Gupta, N. (1986). Method versus substance:

How strong are underlying relationships between job characteristics and attitudinal

outcomes? Academy of Management Journal, 29(3), 441-464.

Goldberger, A. S. (1971). Econometrics and psychometrics: A survey of

communalities. Psychometrika, 36, 83-107.

Goldberger, A. S., & Duncan, O. D. (Eds.). (1973). Structural equation models in

the social sciences. New York.

Gould, R., Ilmarinen, J., Järvisalo, J., & Koskinen, S. (Eds.). (2008). Dimensions

of Work Ability: Results of the Health 2000 Survey. Helsinki, Finland.

Grandey, A. A. (2000). Emotion regulation in the workplace: a new way to

conceptualize emotional labor. Journal of Occupational Health Psychology 5, 95-110.

Green, S. B., & Hershberger, S. L. (2000). Correlated errors in true score models

and their effect on coefficient alpha. Structural Equation Modeling, 7, 251-270.

Green, S. B., & Yang, Y. (2009a). Commentary on coefficient alpha: A

cautionary tale. Psychometrika, 74, 121-135.

Green, S. B., & Yang, Y. (2009b). Reliability of summed item scores using

structural equation modeling: An alternative to coefficient alpha. Psychometrika, 74,

155-167.

Griffiths, A., Cox, T., Karanika, M., Khan, S., & Tomas, J.M. (2006). Work

design and management in the manufacturing sector: development and validation of the

work organisation assessment questionnaire. Occupational and Environmental Medicine,

63, 669-675.

Guo, K.H., Yuan, Y., Archer, N.P., & Connelly, C.E. (2011). Understanding

nonmalicious security violations in the workplace: A composite behavior model. Journal

of Management Information Systems, 28(2), 203-236.

Gustafsson, J. E., & Balke, G. (1993). General and specific abilities as predictors

of school achievement. Multivariate Behavioral Research, 28, 407–434.

Guttman, L. (1952). Multiple group methods for common factor analysis: Their

basis, computation, and interpretation. Psychometrika, 17, 209-222.

Guttman, L. A. (1945). A basis for analyzing test-retest reliability.

Psychometrika, 10, 255-282.

Haavelmo, T. (1943). The Statistical Implications of a System of Simultaneous

Equations. Econometrica, 11, 1-12.

Hair, J. F., Hult, G., Ringle, C., & Sarstedt, M. (2014). A primer on partial least

squares structural equation modeling (PLS-SEM). SAGE Publications, Thousand Oaks,

Calif.

Hair, J.F., Black, W.C., Babin, B.J., & Anderson, R.E. (2010). Multivariate Data

Analysis, seventh ed. Prentice Hall, Englewood Cliffs.

Hair, J. F., Ringle, C. M., & Sarstedt, M. (2011). PLS-SEM: Indeed a silver bullet.

Journal of Marketing Theory and Practice, 19(2), 139-151.

Hasselhorn, H-M., Muller, B.H., & Tackenberg, P. (2005, July). NEXT Scientific

Report. Retrieved 22 September 2015 from

http://www.econbiz.de/archiv1/2008/53602_nurses_work_europe.pdf

Hauser, R. M., and Goldberger, A. S. (1971). The Treatment of Unobservable

Variables in Path Analysis. Chapter 4 in Sociological Methodology, edited by H.L.

Costner. San Francisco: Jossey-Bass.

Health and Safety Executive Guidelines (2010). Retrieved from:

http://www.hse.gov.uk/guidance/ on 12/10/2011.

Heise, D. R., & Bohrnstedt, G. W. (1970). Validity, invalidity, and reliability. In

E. F. Borgatta (Ed.), Sociological methodology 1970 (pp. 104-129). San Francisco:

Jossey-Bass.

Heise, D.R. (1972). Employing nominal variables, induced variables, and block

variables in path analysis. Sociological Methods & Research, 1, 147–173.

Hendry, D. F., & Morgan, M. (1989). A Re-Analysis of Confluence Analysis.

Oxford Economic Papers. 41, 35-52.

Holtom, B. C., Mitchell, T. R., & Lee, T. W. (2006). Increasing human and

social capital by applying job embeddedness theory. Organizational Dynamics, 35(4),

316–331.

Holzinger, K. J. (1941). Factor Analysis. Chicago: University of Chicago Press.

Holzinger, K. J., & Swineford, F. (1937). The bi-factor method. Psychometrika,

2, 41-54.

Hotelling, H. (1933). Analysis of a Complex of Statistical Variables into

Principal Components. Journal of Educational Psychology, 24, 498-520.

Howell, R.D., Breivik, E., & Wilcox, J.B. (2007). Reconsidering formative

measurement. Psychological Methods, 12, 205–218.

Hoyt, C. (1941). Test Reliability Estimated by Analysis of Variance.

Psychometrika, 6, 153-160.

Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance

structure analysis: Conventional criteria versus new alternatives. Structural Equation

Modeling, 6, 1-55.

Ilmarinen, J. (2003). Work Ability Index: a tool for occupational health research

and practise. Paper presented at the 11th Annual EUPHA meeting, Rome, Italy.

Ilmarinen, J. (2007). The Work Ability Index (WAI). Occupational Medicine,

57, 160.

Ilmarinen, J. (2009). Work ability: a comprehensive concept for occupational

health research and prevention; Editorial. Scand J Work Environ Health, 35(1), 1-5.

Ilmarinen, J. (2010). 30 years’ work ability and 20 years’ age management.

Paper presented at the 4th International Symposium on Work Ability: Age Management

during the Life Course, pp. 12-22, Tampere, Finland.

Ilmarinen, J., & Tuomi, K. (2004). Past, present and future of work ability.

Helsinki, Finland: Finnish Institute of Occupational Health.

Ilmarinen, J., Tuomi, K., & Klockars, M. (1997). Changes in the work ability of

active employees as measured by the work ability index over an 11-year period. Scand J

Work Environ Health 23(Suppl 1), 49-57.

Ilmarinen, J., Tuomi, K., & Seitsamo, J. (June 2005). New dimensions of work

ability. International Congress Series, 1280, 3-7.

Irwin, J. O. (1935). On the indeterminacy in the estimate of g. British Journal of

Psychology, 25, 393-394.

Jackson, P., & Agunwamba, C. (1977). Lower bounds for the reliability of the

total score on a test composed of non-homogeneous items: I: Algebraic lower bounds.

Psychometrika, 42(4), 567-578.

Jarvis, C. B., MacKenzie, S. B., and Podsakoff, P. M. (2003). A Critical Review

of Construct Indicators and Measurement Model Misspecification in Marketing and

Consumer Research. Journal of Consumer Research, 30(2), 199-218.

Jennrich, R. I., & Sampson, P. F. (1966). Rotation for simple loadings.

Psychometrika, 31, 313-323.

Jennrich, R. I., & Clarkson, D. B. (1980). A Feasible Method for Standard Errors of

Estimate in Maximum Likelihood Factor Analysis. Psychometrika, 45, 237-247.

Johnson, S., Cooper, C., Cartwright, S., Donald, I., Taylor, P., & Millet, C. (2005).

The experience of work-related stress across occupations. Journal of Managerial

Psychology, 20(2), 178-187.

Jöreskog, K. G. (1969). A general approach to confirmatory maximum likelihood

factor analysis. Psychometrika, 34, 183-202.

Jöreskog, K. G. (1971). Simultaneous factor analysis in several populations.

Psychometrika, 36, 409–426.

Jöreskog, K. G. (1973). A General Method for Estimating a Linear Structural

Equation System. In Structural Equation Models in the Social Sciences. Edited by A.

Goldberger and O.D. Duncan. (pp. 85-112), New York: Academic Press.

Jöreskog, K. G., and Sörbom, D. (2001). LISREL 8 User's Reference Guide.

Chicago: Scientific Software International.

Kaiser, H. F. (1958). The Varimax Criterion for Analytic Rotation in Factor

Analysis. Psychometrika, 23, 187-200.

Karimi, L., & Bentler, P. M. (under review). Application of covariate-free and

covariate-dependent reliability.

Karimi, L., & Meyer, D. (2014). Validity and Model-Based Reliability of the

Work Organisation Assessment Questionnaire WOAQ Among Nurses. Nursing Outlook.

Klein, L., and Goldberger, A. S. (1955). An Econometric Model of the United

States 1929-1952. Amsterdam: North-Holland.

Klein, L. (1950). Economic Fluctuations in the United States 1921-1941. New

York: John Wiley.

Kline, R. B. (1998). Principles and Practice of Structural Equation Modeling.

New York: The Guilford Press.

Koopmans, T. (1945). Statistical Estimation of Simultaneous Economic

Relations. Journal of the American Statistical Association, 40, 448-466.

Kuder, G. F., & Richardson, M. W. (1937). The theory of estimation of test

reliability. Psychometrika, 2, 151-160.

LaMontagne, A. (2004). Improving OHS policy through intervention research.

Journal of Occupational Health and Safety, 20 (2), 107-113.

Laschinger, H. K. S. (2012). Job and career satisfaction and turnover intentions of

newly graduated nurses. Journal of Nursing Management, 20, 472–484.

Law, K. S., Wong, C. S., Mobley, W. H. (1998). Towards a Taxonomy of

Multidimensional Constructs. Academy of Management Review, 23 (4), 741-755.

Lawley, D. N. (1940). The Estimation of Factor Loadings by the method of

Maximum Likelihood. Proceedings of the Royal Society of Edinburgh, 60, 64-82.

Lindell, M. K., & Whitney, D. J. (2001). Accounting for common method

variance in cross-sectional research designs. Journal of Applied Psychology, 86, 114-

121.

Ilmarinen, J. (1991). The aging worker. Editorial. Scand J Work Environ Health,

17 (Suppl 1), 141 p.

Long, J. S. (1983). Confirmatory Factor Analysis, Beverly Hills, CA: Sage

Publications.

Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores.

Reading, MA: Addison-Wesley.

MacCallum, R. C., & Browne, M.W. (1993). The use of causal indicators in

covariance structure models: Some practical issues. Psychological Bulletin, 114, 533-

541.

MacCallum, R. C., Browne, M. W., & Sugawara, H. M. (1996). Power analysis

and determination of sample size for covariance structure modeling. Psychological

Methods, 1, 130-149.

MacKenzie, S. B., Podsakoff, P. M., & Podsakoff, N. P. (2011). Construct

Measurement and Validation Procedures in MIS and Behavioral Research: Integrating

New and Existing Techniques. MIS Quarterly, 35 (2), 293–334.

MacKenzie, S. B., Podsakoff, P. M., and Jarvis, C. B. (2005). The Problem of

Measurement Model Misspecification in Behavioral and Organizational Research and

Some Recommended Solutions, Journal of Applied Psychology, 90 (4), 710-730.

Magnavita, N., Mammi, F., Roccia, K., & Vincenti, F. (2007). WOA: un

questionario per la valutazione dell’organizzazione del lavoro. Traduzione e

validazione della versione italiana. [WOA: a questionnaire for the evaluation of work

organization. Translation and validation of the Italian version]. Giornale Italiano di

Medicina del Lavoro ed Ergonomia, 29, 663-665.

Malhotra, N. K., Kim, S. S., & Patil, A. (2006). Common method variance in IS

research: A comparison of alternative approaches and a reanalysis of past research.

Management Science, 52(12), 1865-1883.

Mann, H. B., and Wald, A. (1943). On the Statistical Treatment of Linear

Stochastic Difference Equations. Econometrica 11, 173-220.

Marsh, H. W., Hau, K. T., & Grayson, D. (2005). Goodness of fit in structural

equation models. In A. Maydeu-Olivares & J. J. McArdle (Eds.), Contemporary

psychometrics: A Festschrift for Roderick P. McDonald (pp. 225-340). Mahwah, NJ:

Erlbaum.

Martus, P., Jakob, O., Rose, U., Seibt, R., & Freude, G. (2010). A comparative

analysis of the Work Ability Index. Occupational Medicine, 60(7), 517-524.

Matsueda, R. L. (2012). Key Advances in the History of Structural Equation

Modeling. In R. Hoyle (Ed.), Handbook of Structural Equation Modeling. New York,

NY: Guilford Press.

McDonald, R. P. (1970). The theoretical foundations of principal factor analysis,

canonical factor analysis, and alpha factor analysis. British Journal of Mathematical and

Statistical Psychology, 23, 1-21.

McDonald, R. P. (1999). Test Theory: A unified treatment. Mahwah, NJ:

Erlbaum.

Meade, A. W., Watson, A. M., & Kroustalis, C. M. (2007). Assessing common

methods bias in organisational research. Paper presented at the 22nd Annual Meeting of

the Society for Industrial and Organizational Psychology, New York.

Messick, S. (1995). Validity of psychological assessment: Validation of

inferences from persons’ responses and performances as scientific inquiry into score

meaning. American Psychologist, 50, 741-749.

Miller, M. (1995). Coefficient alpha: A basic introduction from the perspectives

of classical Test Theory and structural equation modeling. Structural Equation

Modeling, 2, 255-273.

Mislevy, R. J. (1984). Estimating latent distributions. Psychometrika, 49, 359-

381.

Morales, M. G. (2011). Partial Least Squares (PLS) Methods: Origins, Evolution,

and Application to Social Sciences. Communications in Statistics Theory and Methods,

40 (13), 2305-2317.

Morschhäuser, M., & Sochert, R. (2006). Healthy Work in an Ageing Europe:

Strategies and Instruments for Prolonging Working Life. Essen, Germany.

Mosier, C. I. (1939). Determining a simple structure when loadings for certain

tests are known. Psychometrika, 4, 149-162.

Muthén, B. (1984). A General Structural Equation Model with Dichotomous,

Ordered Categorical, and Continuous Latent Variable Indicators. Psychometrika, 49,

115-32.

Muthén, B. (1994). Multi-Level Covariance Structure Analysis. Sociological

Methods and Research, 22, 376-98.

Muthén, B., and Muthén, L. K. (2004). Mplus User’s Guide. Los Angeles, CA:

Nelson, C. R. (1972). The Prediction Performance of the FRB-MIT-PENN

Model of the U.S. Economy. American Economic Review, 62, 902-917.

Nunnally, J. C. (1978). Psychometric Theory (2nd ed.), McGraw-Hill, New

York.

Nunnally, J. C., and Bernstein, I. H. (1994). Psychometric Theory (3rd ed.), New

York: McGraw Hill.

Oakman, J., & Wells, Y. (2009). Can organizations influence employees’

intentions to retire? Paper presented at the 3rd International Symposium on Work

Ability: Promotion of Work Ability Towards Productive Aging, pp. 133 -138, Hanoi,

Vietnam.

Palermo, J. (2010). Investigating modifiable organizational factors relating to

workability: a focus on gendered culture. Paper presented at the 4th International

Symposium on Work Ability: Age Management during the Life Course, pp. 365-377,

Tampere, Finland.

Palermo, J., Webber, L., Smith, K., & Khor, A. (2009). Factors that predict work

ability: Incorporating a measure of organizational values towards ageing. Paper

presented at the 3rd International Symposium on Work Ability: Promotion of Work

Ability Towards Productive Aging, pp. 45 -58, Hanoi, Vietnam.

Pearl, J. (2000). Causality: Models, Reasoning, and Inference. Cambridge, UK:

Cambridge University Press.

Pearson, K. (1901). On Lines and Planes of Closest Fit to Systems of Points in

Space. Philosophical Magazine, 6, 559-72.

Pensola, T., Järvikoski, A., & Järvisalo, J. (2008). Unemployment and Work

Ability. In Gould, R., Ilmarinen, J., Järvisalo, J., & Koskinen, S. (Eds.) Dimensions of

Work Ability: Results of the Health 2000 Survey. Helsinki, Finland.

Petrides K.V. & Furnham A. (2000) On the dimensional structure of emotional

intelligence. Personality and Individual Differences 29, 313–320.

Petter, S., Straub, D., and Rai, A. (2007). Specifying Formative Constructs in

Information Systems Research, MIS Quarterly, 31 (4), 623-656.

Podsakoff, N. P., Shen, W., and Podsakoff, P. M. (2006). The Role of Formative

Measurement Models in Strategic Management Research: Review, Critique, and

Implications for Future Research, Research Methodology in Strategy and Management

(3), 197-252.

Podsakoff, P. M., & Organ, D. W. (1986). Self-reports in organizational research:

Problems and prospects. Journal of Management, 12, 531-544.

Podsakoff, P. M., MacKenzie, S. B., Lee, J.-Y., & Podsakoff, N. P. (2003).

Common method biases in behavioral research: A critical review of the literature and

recommended remedies. Journal of Applied Psychology, 88(5), 879-903.

Rabe-Hesketh, S., Skrondal, A., & Pickles, A. (2004). Generalized Multilevel

Structural Equation Modeling. Psychometrika 69, 167-90.

Radkiewicz, P., Widerszal-Bazyl, M., & the NEXT-Study group. (2005).

Psychometric properties of Work Ability Index in the light of comparative survey study.

International Congress Series, 1280, 304–309.

Raftery, A. E. (1995). Bayesian Model Selection in Social Research.

Sociological Methodology, 25, 111-95.

Raykov, T., & Marcoulides, G. A. (2011). 7 Procedures for Estimating

Reliability. In Introduction to Psychometric Theory (pp. 160–196). Abingdon, Oxon:

Routledge.

Reise, S. P., Moore, T. M., & Haviland, M. G. (2010). Bifactor Models and

Rotations: Exploring the Extent to which Multidimensional Data Yield Univocal Scale

Scores. Journal of Personality Assessment, 92(6), 544–559.

Reise, S. P., Bonifay, W. E., & Haviland, M. G. (2012). Scoring and modeling

psychological measures in the presence of multidimensionality. Journal of Personality

Assessment, 95, 129-140.

Revelle, W., & Zinbarg, R. E. (2008). Coefficient alpha, beta, omega, and the

glb: Comments on Sijtsma. Psychometrika, 74, 145-154.

Richardson, H. A., Simmering, M. J., & Sturman, M. C. (2009). A tale of three

perspectives: Examining post hoc statistical techniques for detection and correction of

common method variance. Organizational Research Methods, 12, 762-800.

Rick, J., & Briner, R.B. (2000). Psychosocial risk assessment: problems and

prospects. Occupational Medicine, 50(5), 310-314.

Rigdon, E.E. (2012). Rethinking partial least squares path modeling: in praise of

simple methods, Long Range Planning 45 (5-6), 341-358.

Ringle, C.M., Sarstedt, M., & Straub, D.W. (2012). A critical look at the use of

PLS-SEM in MIS Quarterly, MIS Quarterly 36 (1), iii-xiv.

Ringle, C.M., Wende, S., & Will, S. (2005). SmartPLS 2.0 (M3) Beta,

Hamburg http://www.smartpls.de.

Rogers, W. M., Schmitt, N., & Mullins, M. E. (2002). Correction for unreliability

of multifactor measures: Comparison of alpha and parallel forms approaches.

Organizational Research Methods, 5, 184-199.

Roldán, J. L. and Sánchez-Franco, M. J. (2012). Variance-Based Structural

Equation Modeling: Guidelines for Using Partial Least Squares in Information Systems

Research. Research Methodologies, Innovations and Philosophies in Software Systems

Engineering and Information Systems. IGI Global, 193-221.

Roy, S., Tarafdar, M., Ragu-Nathan, T. S., & Marsillac, E. (2012). The Effect of

Misspecification of Reflective and Formative Constructs in Operations and

Manufacturing Management Research. Journal of Business Research Methods, 10 (1),

34-52.

Satorra, A., & Bentler, P. M. (1988). Scaling corrections for statistics in

covariance structure analysis (UCLA Statistics Series 2). Los Angeles: UCLA,

Department of Psychology.

Satorra, A., & Bentler, P. M. (1994). Corrections to test statistics and standard

errors in covariance structure analysis. In A. von Eye & C. C. Clogg (Eds), Latent

variables analysis: Applications for developmental research (pp. 399-419).

Saunders, J.B., Aasland, O.G., Babor, T.F., de la Fuente, J.R. and Grant, M.

(1993). Development of the Alcohol Use Disorders Identification Test (AUDIT): WHO

collaborative project on early detection of persons with harmful alcohol consumption. II.

Addiction, 88, 791-804.

Schmid, J., & Leiman, J. (1957). The development of hierarchical factor

solutions. Psychometrika, 22, 53-61.

Schriesheim, C. A., Kinicki, A. J., & Schriesheim, J. F. (1979). The effect of

leniency on leader behavior descriptions. Organizational Behavior and Human

Performance 23, 1-29.

Schumacker, R. E., & Lomax, R. G. (2004). A beginner's guide to structural

equation modeling, Second edition. Mahwah, NJ: Lawrence Erlbaum Associates.

Schutte N.S. & Malouff J.M. (1999) Measuring Emotional Intelligence and

Related Constructs. E. Mellen Press, Lewiston, NY.

Schutte N.S., Malouff J.M., Hall L.E., Haggerty D.J., Cooper J.T., Golden C.J. &

Dornheim L. (1998) Development and validation of a measure of emotional intelligence.

Personality and Individual Differences 25, 167–177.

Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 6,

461-464.

Shipley, B. (2000). Cause and Correlation in Biology; A User’s Guide to Path

Analysis, Structural Equations and Causal Inference. Cambridge, UK: Cambridge

University Press.

Sijtsma, K. (2008). On the use, the misuse, and the very limited usefulness of

Cronbach’s alpha. Psychometrika, 74, 107-120.

Sijtsma, K. (2009b). Reliability beyond theory and into practice. Psychometrika,

74, 169-173.

Skrondal, A., and Rabe-Hesketh, S. (2004). Generalized Latent Variable

Modeling: Multilevel, Longitudinal, and Structural Equation Models. Boca Raton:

Chapman and Hall.

Sočan, G. (2000). Assessment of reliability when test items are not essentially

τ-equivalent. Developments in Survey Methodology, 15, 23-35.

Sörbom, D. (1974). A general method for studying differences in factor means

and factor structures between groups. British Journal of Mathematical and Statistical

Psychology, 27, 229-239.

Spearman, C. (1904). General Intelligence, Objectively Determined and

Measured. American Journal of Psychology, 15, 201-93.

Spector, P. E. (1987). Method variance as an artifact in self-reported affect and

perceptions at work: Myth or significant problem. Journal of Applied Psychology, 72(3),

438-443.

Steiger, J. H., & Schönemann, P. H. (1978). A history of factor indeterminacy. In

S. Shye (Ed.), Theory construction and data analysis. Chicago: University of Chicago

Press.

Stober, J. (2001). The social desirability scale-17 (SDS17): Convergent validity,

discriminant validity, and relationship with age. European Journal of Psychological

Assessment, 17(3), 222–232.

Taylor, P. (2010). Planning for an Ageing Workforce. Paper presented at the 4th

International Symposium on Work Ability: Age Management during the Life Course,

pp. 23-33, Tampere, Finland.

Taylor, P. (Sep 2008). Assessing Workability in the Workplace. Unpublished

presentation. OHSIG. Aotea Centre, Auckland, New Zealand.

Taylor, P., & McLoughlin, C. C. (Dec 2011). Pilot Study on Workability.

Monash University. Unpublished presentation. Melbourne, Australia.

Tenenhaus, M., Vinzi, V. E., Chatelin, Y.-M., & Lauro, C. (2005). PLS path

modeling. Computational Statistics & Data Analysis, 48(1), 159–205.

Thomas, K. W., & Kilmann, R. H. (1975). The social desirability variable in

organizational research: An alternative explanation for reported findings. The Academy

of Management Journal, 18(4), 741-752.

Thomson, G. H. (1916). A hierarchy without a general factor. British Journal of

Psychology, 1904-1920, 8, 271–281.

Thomson, G. H. (1935). The definition and measurement of "g" (general

intelligence). Journal of Educational Psychology, 26, 241-262.

Thurstone, L. L. (1935). The Vectors of Mind. Chicago: University of Chicago

Press.

Thurstone, L. L. (1947). Multiple factor analysis. Chicago: University of Chicago

Press.

Treiblmaier, H., Bentler, P., and Mair, P. (2011). Formative Constructs

Implemented via Common Factors, Structural Equation Modeling, 18 (1), 1-17.

Tucker, L. R. (1955). The Objective Definition of Simple Structure in Linear Factor

Analysis. Psychometrika, 20, 209-225.

Tuomi, K. (1997). Eleven-year follow-up of aging workers; Editorial. Scand J

Work Environ Health, 23(Suppl 1), 66–71.

Tuomi, K., Ilmarinen, J., Jahkola, M., Katajarinne, L., & Tulkki, A. (2006). Work

Ability Index. 2nd revised edition. Helsinki, Finnish Institute of Occupational Health.

Tuomi, K., Ilmarinen, J., Klockars, M., Nygård, C.-H., Seitsamo, J., &

Huuhtanen, P. (1997). Finnish research project on aging workers in 1981-1992. Scand J

Work Environ Health, Suppl 1, 7-11.

Tuomi, K., Ilmarinen, J., Martikainen, R., Aalto, L., & Klockars, M. (1997).

Aging, work, life-style and work ability among Finnish municipal workers in 1981-

1992. Scand J Work Environ Health, 23(Suppl 1), 58-65.

Tuomi, K., Ilmarinen, J., Seitsamo, J., Huuhtanen, P., Martikainen, R., Nygård, C.-H.,

& Klockars, M. (1997). Summary of the Finnish research project (1981-1992) to

promote the health and work ability of aging workers. Scand J Work Environ Health,

Suppl 1, 66-71.

Vacha-Haase, T. (1998). Reliability generalization: Exploring variance in

measurement error affecting score reliability across studies. Educational and

Psychological Measurement, 58, 6–20.

Vacha-Haase, T., & Thompson, B. (2011). Score reliability: A retrospective look

back at 12 years of reliability generalization studies. Measurement and Evaluation in

Counseling and Development, 44, 159-168.

Van der Heijden, B.I.J.M., Van Dam, K., & Hasselhorn, H.M. (2009). Intent to

leave nursing. The importance of interpersonal work context, work-home interference,

and job satisfaction beyond the effect of occupational commitment. Career Development

International, 14(7), 616-635.

Vandenberg, R. J., & Lance, C. E. (2000). A review and synthesis of the

measurement invariance literature: suggestions, practices, and recommendations for

organizational research. Organizational Research Methods, 3(1), 4–70.

Warr, P., Cook, J., & Wall, T. (1979). Scales for the measurement of some work

attitudes and aspects of psychological well-being. Journal of Occupational

Psychology, 52(2), 129-148.

Webber, L., Smith, K., & Scott, K. (2006). Age, work ability and plans to leave

work. Paper presented at the Joint Conference of the Australian Psychological Society

and the New Zealand Psychological Society, pp. 479-483, Auckland, New Zealand.

Werts, C. E., Linn, R. L., & Jöreskog, K. G. (1974). Intraclass reliability

estimates: Testing structural assumptions. Educational and Psychological Measurement,

34, 25-33.

Wetzels, M., Odekerken-Schröder, G., & van Oppen, C. (2009). Using PLS path

modeling for assessing hierarchical construct models: guidelines and empirical

illustration, MIS Quarterly 33 (1), 177-195.

Widaman, K. F., & Reise, S. P. (1997). Exploring the measurement invariance of

psychological instruments: Applications in the substance use domain. In K. J. Bryant, M.

Windle, & S. G. West (Eds.), The science of prevention: Methodological advances from

alcohol and substance abuse research (pp. 281–324). Washington, DC: American

Psychological Association.

Williams, L. J., Cote, J. A., & Buckley, M. R. (1989). Lack of method variance

in self-reported affect and perceptions at work: Reality or artifact? Journal of Applied

Psychology, 74(3), 462-468.

Williams, L. J., Hartman, N., & Cavazotte, F. (2010). Method Variance

and Marker Variables: A Review and Comprehensive CFA Marker Technique.

Organizational Research Methods, 13 (3), 477-514.

Wilson, E. B. (1928a) Review of 'The abilities of man, their nature and

measurement' by C. Spearman. Science, 67, 244-248.

Wilson, E. B. (1928b). On hierarchical correlation systems. Proceedings of the

National Academy of Sciences, 14, 283-291.

Wilson, E. B. (1929). Review of 'Crossroads in the mind of man: A study of

differentiable mental abilities' by T. L. Kelley. Journal of General Psychology, 2, 153-

169.

Wilson, E. B., & Worcester, J. (1939). A note on factor analysis. Psychometrika,

4, 133-148.

Winkler, J. D., Kanouse, D. E., & Ware, J. E., Jr. (1982). Controlling for

acquiescence response set in scale development. Journal of Applied Psychology, 67(5),

555-561.

Wold, H. (1979). Model construction and evaluation when theoretical knowledge

is scarce: Theory and application of partial least squares. In J. Kmenta & J. B. Ramsey

(Eds.), Evaluation of econometric models (pp. 47-74). New York: Academic.

Wold, H. (1982). Soft modeling: the basic design and some extensions, In:

Jöreskog, K.G., Wold, H. (Eds.), Systems Under Indirect Observations: Part II. North-

Holland, Amsterdam, pp. 1-54.

Wolfe, A. W. (1966). Factor analysis to 1940. Psychometric Monograph, 3.

Woodhouse, B., & Jackson, P. (1977). Lower bounds for the reliability of the

total score on a test composed of non-homogeneous items: II: A search procedure to

locate the greatest lower bound. Psychometrika, 42(4), 579-591.

Wright, S. (1920). The relative importance of heredity and environment in

determining the piebald pattern of guinea-pigs. Proceedings of the National Academy of

Sciences, 6, 320-332.

Wynne-Jones, G., Buck, R., Varnava, C.J., & Main, C. (2011). Impacts on work

performance; what matters 6 months on? Occupational Medicine, 61, 205-208.

Wynne-Jones, G., Varnava, A., Buck, R., Karanika-Murray, M., Griffiths, A.,

Phillips, C., & Main, C.J. (2009). Examination of the work organisation assessment

questionnaire in public sector workers. Journal of Occupational & Environmental

Medicine, 51(5): 586-593.

Yang, Y., & Green, S. B. (2011). Coefficient alpha: A reliability coefficient for

the 21st century. Journal of Psychoeducational Assessment, 29, 377-392.

Yung, Y. F., Thissen, D., & McLeod, L. D. (1999) On the relationship between

the higher-order factor model and the hierarchical factor model. Psychometrika, 64,

113–128.

Zeller, R. A., & Carmines, E. G. (1980). Measurement in the Social Sciences: The Link

Between Theory and Data. Cambridge University Press.

Zellner, A., and Theil, H. (1962). Three-Stage Least Squares: Simultaneous

Estimation of Simultaneous Equations. Econometrica, 30, 54-78.

Zellner, A. (1962). An Efficient Method of Estimating Seemingly Unrelated

Regressions and Tests of Aggregation Bias. Journal of the American Statistical

Association, 57, 348-68.

Zimmerman, D.W. (1972) Test reliability and the Kuder-Richardson formulas:

Derivation from probability theory. Educational and Psychological Measurement, 32,

939-954.

Zinbarg, R. E., Revelle, W., Yovel, I., & Li, W. (2005). Cronbach’s α, Revelle’s

β, and McDonald’s ωh: Their relations with each other and two alternative

conceptualizations of reliability. Psychometrika, 70, 123-133.
