Model-based Reliability and Validity of Measurement Models
Using Structural Equation Modeling
Leila Karimi
Ph.D
2015
To
“The hands of my mum who showed me how the impossible
can become possible with hard work,
and the big heart of my dad who always gave me the courage
to follow my dreams”
ABSTRACT
Structural Equation Modeling (SEM) has a long and interesting history and it
continues to evolve, providing exciting research opportunities. This study considers the
roots of SEM and model-based reliability and the developments in these areas in the
context of measurement models. Looking to the future, the research provides important new
applications of model-based reliability in bifactor models using Covariance-based SEM
(CB-SEM) and reflective-formative measurement models, using Partial Least Squares SEM
(PLS-SEM). In addition, applications of Bentler’s covariate-dependent reliability to
reliability assessment and to the demonstration of common method bias are presented for the
first time.
The thesis considers three important research studies involving work ability, work
organisation assessment and a survey of social desirability, wellness, drinking habits and
emotional intelligence. These studies are used to demonstrate the above new developments
in model-based reliability. The contribution of the research and directions for future
research are discussed separately for each study and in general.
ACKNOWLEDGEMENTS
I would like to express my sincere appreciation to my supervisor and mentor,
Associate Professor Denny Meyer, for her continual support, encouragement, patience and
kindness throughout the life of this PhD. I would also like to extend my gratitude to
Professor Peter Bentler, University of California, Los Angeles (UCLA) who introduced me
to a new era of research.
Thank you also to my associate supervisors Professor Philip Taylor from Monash
University and Associate Professor Christine Critchley from Swinburne University for their
support through different stages of this project. Thanks to the Business, Work & Ageing
Centre for Research (BWA) at Swinburne University for sharing the database on WAS and
Dr Jodi Oakman from La Trobe University for generously sharing the paramedics data, Dr
James Gaskin from Brigham Young University, Professor Joerg Henseler from the
University of Twente and Professor Christian M. Ringle from the Hamburg University of
Technology (TUHH) for introducing me to the world of Partial Least Squares (PLS).
Special thanks to my lovely family and friends, particularly my brothers ‘Puya’ and
‘Hamid’ for their unconditional love and support which kept me strong during the difficult
times. I am so lucky to have them in my life. Last but not least, thanks to my caring,
supportive partner ‘Arron’ for putting up with ‘crazy me’ in the last few stressful months of
wrapping up this project.
DECLARATION
This is to declare that the examinable outcome:
contains no material which has been accepted for the award to the candidate
of any other degree or diploma, except where due reference is made in the
text of the examinable outcome;
to the best of the candidate’s knowledge contains no material previously
published or written by another person except where due reference is made
in the text of the examinable outcome.
Signature
Date
TABLE OF CONTENTS
1 INTRODUCTION TO THE THESIS ................................................................................ 12
1.1 Introduction ............................................................................................ 12
1.2 Study Structure ....................................................................................... 14
1.3 Summary ................................................................................................. 18
2 THE HISTORY OF THE EVOLUTION OF SEM IN PSYCHOLOGY .................................... 20
2.1 First Trend: Exploratory Factor Analysis................................................... 22
2.2 Second Trend: Confirmatory Factor Analysis (CFA) .................................. 24
2.3 Third Trend: Factor Analysis of SEM (FASEM) .......................................... 25
2.4 Current Developments in SEM ................................................................ 31
2.5 Conclusion .............................................................................................. 33
3 THE EVOLUTION OF MODEL-BASED RELIABILITY ESTIMATES ................................... 35
3.1 Introduction ............................................................................................ 35
3.2 Classical Test Theory and Coefficient Alpha ............................................. 36
3.3 Major Problems with Using a Coefficient Alpha Reliability Analysis ......... 42
3.4 Unidimensional Model-based Reliability ................................................. 44
3.5 Recent Developments ............................................................................. 46
3.6 Summary ................................................................................................. 56
4 THE VALIDITY OF BIFACTOR VERSUS HIGHER-ORDER MEASUREMENT MODELS ..... 59
4.1 Bifactor Model of WOAQ......................................................................... 61
4.2 Summary ................................................................................................. 64
5 THE VALIDITY OF FORMATIVE MEASUREMENT MODELS VERSUS REFLECTIVE MODELS........................................................................................................................... 65
5.1 Differences between Formative and Reflective Models ........................... 66
5.2 Applications of Formative Models ........................................................... 72
5.3 Developing a Framework for Distinguishing Reflective-Formative Models ................................................................................................................ 74
5.4 Measurement Model Misspecification in Organisational Psychology Literature ............................................................................................................. 79
5.5 Summary and Conclusion ........................................................................ 84
6 STUDY 1: MODEL-BASED RELIABILITY, VALIDITY AND CROSS VALIDITY OF BIFACTOR MODEL FOR WOAQ ......................................................................................................... 86
6.1 Rationale and Objectives ........................................................................... 88
6.2 Method ................................................................................................... 98
6.3 Summary ............................................................................................... 106
7 STUDY 1: RESULTS .................................................................................................. 108
7.1 Results: Study of Nurses-Validation of Bifactor Model of WOAQ ........... 108
7.2 Results: Study of Paramedics-Cross Validation of Bifactor Model of WOAQ .............................................................................................................. 120
8 STUDY 1: DISCUSSION ............................................................................................ 125
8.1 Discussion: Study of Nurses-Validation of Bifactor Model of WOAQ ...... 125
8.2 Discussion: Study of Paramedics-Cross Validation of Bifactor Model of WOAQ ............................................................................................................ 129
8.3 Strengths and Limitations ...................................................................... 130
8.4 Summary and Conclusion ...................................................................... 132
9 STUDY 2: APPLICATIONS OF COVARIATE-DEPENDENT RELIABILITY........................ 134
9.1 Rationale and Objectives ......................................................................... 134
9.2 Method ................................................................................................. 143
10 STUDY 2: RESULTS .............................................................................................. 147
10.1 Results of Application for Reliability Assessments – The study of WOAQ .......................................................................................................... 147
10.2 Model Fit Evaluation .......................................................................... 150
10.3 Application in Demonstrating CMB using Social Desirability ............... 156
11 STUDY 2: DISCUSSION ......................................................................................... 169
11.1 Discussion: Application in Reliability Assessment of WOAQ ............... 170
11.2 Discussion: Application in Demonstrating CMB .................................. 172
11.3 Strengths ........................................................................................... 174
11.4 Limitations and Directions for Future Research ................................. 175
12 STUDY 3: MODEL-BASED RELIABILITY AND VALIDITY OF REFLECTIVE-FORMATIVE MODEL OF WAS USING PLS-SEM ................................................................................... 177
12.1 Rationale and Objectives ................................................................... 177
12.2 Method ............................................................................................. 201
13 STUDY 3: RESULTS .............................................................................................. 213
13.1 Results of Model Fit Evaluation .......................................................... 213
13.2 Comparison of the Misspecified Models with the Correctly Specified WAS Model ..................................................................................................... 221
14 STUDY 3: DISCUSSION ......................................................................................... 225
14.1 Implications for Work Ability Assessments......................................... 229
14.2 Limitations and Directions for Future Research ................................. 230
15 SUMMARY .......................................................................................................... 233
15.1 Study 1: Model-Based Reliability, Validity and Cross Validity of the Bifactor Model for WOAQ ............................................................................... 233
15.2 Study 2: Applications of Covariate-dependent Reliability ................... 235
15.3 Study 3: Model-based Reliability and Validity of Reflective-formative Model of WAS ................................................................................................. 236
15.4 Thesis contributions to SEM .............................................................. 238
15.5 Summary ........................................................................................... 242
16 APPENDICES ........................................................................................................ 243
16.1 PUBLISHED ARTICLES ......................................................................... 244
16.2 Validity and model-based reliability of the Work Organisation Assessment Questionnaire (WOAQ) among nurses ......................................... 244
16.3 Structural Equation Modeling in Psychology: The History, Development and Current Challenges ................................................................................... 257
16.4 Cross-validation of the Work Organization Assessment Questionnaire across gender: A study of Australian Health Organization ............................... 268
16.5 EVALUATING A HIGHER-ORDER MISSPECIFIED REFLECTIVE MODEL OF WAS USING CB-SEM. ....................................................................................... 283
16.6 DEFINITIONS OF IMPORTANT TERMS ................................................. 296
16.7 THE WOAQ AND ITS SUBFACTORS ITEMS. .......................................... 300
16.8 The R-WAS questionnaire .................................................................. 302
16.9 Appendix F. List of items used in construction of WAS ....................... 323
16.10 Ethics clearance ................................................................................. 326
16.11 A List of Articles Included in the Review ............................................. 333
17 REFERENCES ........................................................................................................ 355
LIST OF TABLES
Table 7.1 Descriptive Statistics of the Demographic Variables ............................ 109
Table 7.2 Subscales and WOAQ Items ................................................................ 110
Table 7.3 Item Characteristics of WOAQ ............................................................. 111
Table 7.4 Completely Standardized Maximum Likelihood (ML) Solutions of Higher order Model and the Bifactor Model .................................................................. 117
Table 7.5 Summary of Model Fit Statistics of the CFA Models of WOAQ ............. 118
Table 7.6 The Reliability Coefficients of WOAQ among Nursing Sample (n=312) . 119
Table 7.7 Characteristics of Paramedic Participants ............................................ 121
Table 7.8 Summary of Model Fit Statistics for the Baseline Bifactor Model of WOAQ across Gender .................................................................................................... 121
Table 7.9 Invariance Testing Across Gender for the Bifactor Model of WOAQ. ... 123
Table 10.1 The Descriptive Characteristics of the Main Study Constructs and Parameters (n=1255) .......................................................................................... 148
Table 10.2 Nursing and Paramedic Demographic Characteristics ........................ 149
Table 10.3 Mean Age Differences between Nursing and Paramedic ................... 150
Table 10.4 Summary of Model Fit Statistics for the Baseline Bifactor Model of WOAQ across Organisations ............................................................................... 151
Table 10.5 Summary of Model Fit Statistics of the Bifactor Models of WOAQ (n=1257) ............................................................................................................. 151
Table 10.6 WOAQ Reliability Statistics for Nursing (as reported in Chapter 7) and Paramedic Organisations .................................................................................... 152
Table 10.7 WOAQ Reliability Statistics across Organisations (n=1257) ............... 153
Table 10.8 Invariance Testing Across Organisations for the Bifactor Model of WOAQ ................................................................................................................ 154
Table 10.9 Summary of the Demographic Characteristics of the Participants (n=341) ............................................................................................................... 156
Table 10.10 Comparison of the Covariate-dependent and Covariate-free Reliability Coefficients of the Scales after Including CMB .................................................... 158
Table 10.11 Summary of Fit Indices of Comparison Models ................................ 166
Table 10.12 Standardised Factor Loadings for Different Models Compared to Baseline .............................................................................................................. 167
Table 12.1 Items of the Work Ability Index ......................................................... 184
Table 13.1 Quality Criteria of the Reflective-formative WAS First-order Constructs using PLS-SEM .................................................................................................... 215
Table 13.2 Intercorrelation Analysis and the Square Roots of AVE of First-order Constructs of Reflective-formative PLS-SEM Model † ......................................... 217
Table 13.3 The Standardised Mean Coefficients of the Second-order formative Constructs of Reflective-formative PLS-SEM Model (n=5000 bootstrap) ............. 218
Table 13.4 Results for Third-order formative Constructs of Reflective-formative WAS (n=5000 bootstrap samples)....................................................................... 219
Table 13.5 Comparing the Standardized Path Coefficients of Misspecified and Correctly Specified WAS Models ......................................................................... 222
Table 13.6 Comparing the Model-based Reliability Coefficients of a Misspecified Reflective-reflective WAS (CB-SEM) with the Correctly Specified Reflective-formative Model of WAS .................................................................................................... 224
Table 16.1 The Standardised Path Parameter Estimates for the Misspecified Reflective WAS Model Using the CB-SEM Procedure .......................................... 285
Table 16.2 The Parameter Estimates for First-order Reflective Model Using the CB-SEM Procedure ................................................................................................... 286
Table 16.3 Intercorrelation analysis and the square roots of AVE for subfactors. 288
Table 16.4 Structural Model Results for Second-order Reflective Constructs of Reflective PLS-SEM Model (n=5000 bootstrap). .................................................. 290
Table 16.5 Structural Model Results for Higher-order Reflective Constructs of Reflective PLS-SEM Model (n=5000 bootstrap samples). .................................... 291
Table 16.6 Structural Model Results for Second-order Reflective Constructs of Full Formative PLS-SEM Model (n=5000 bootstrap). ................................................. 294
Table 16.7 Structural Model Results for Higher-order Reflective Constructs (n=5000 bootstrap samples). ............................................................................................ 295
LIST OF FIGURES
Figure 1.1. The study structures in this thesis. ...................................................... 17
Figure 2.1. A pseudo path diagram and timeline for some of the developments in SEM modeling and SEM model structures. ........................................................... 21
Figure 2.2. One of Wright's first path diagrams for genetic modeling. .................. 26
Figure 3.1. A pseudo path diagram and timeline for some of the developments in historical review of the conceptualisation and estimation of model-based reliability ............................................................................................................................. 38
Figure 3.2. Demonstrating Omega reliability coefficients for WOAQ..................... 49
Figure 3.3. A unidimensional construct with four indicators ................................. 53
Figure 3.4. A covariate-dependent construct with four indicators and two covariates ............................................................................................................. 54
Figure 4.1. Higher-order vs. Bifactor model of WOAQ ................................ 63
Figure 5.1. First-order reflective model ................................................................ 68
Figure 5.2. First-order formative model ................................................................ 69
Figure 5.3. Higher-order reflective-reflective measurement model ...................... 70
Figure 5.4. Higher-order formative-formative measurement model ..................... 71
Figure 5.5. The developed framework for assessing the formative vs. reflective measurement models........................................................................................... 78
Figure 6.1. A demonstration of Bifactor model of WOAQ with phantom variables for calculating omega coefficients on the righthand side .................................... 105
Figure 7.1. The proposed bifactor model of WOAQ vs. higher order ................... 114
Figure 9.1. Covariate-dependent reliability assessment with the bifactor model of WOAQ across the nursing and paramedic organisations. .................................... 136
Figure 10.1. The effects of latent factor bias (social desirability) on the reliability of the constructs. ................................................................................................... 157
Figure 10.2. The proposed model for evaluating CMB/CMV. .............................. 159
Figure 10.3. Model 1: Baseline model when all the study constructs are correlated without controlling for CMV and CMB ................................................................ 161
Figure 10.4. Model 2. Constrained equal loadings from CMV to the study indicators. .......................................................................................................... 163
Figure 10.5. Model 3. Free loadings from CMV to the study indicators ............... 165
Figure 12.1. Multidimensional work ability model. ............................................. 188
Figure 12.2. WAI scores: Australia and Finland. .................................................. 190
Figure 12.3. The correctly specified reflective-formative model of WAS. ............ 195
Figure 12.4. The misspecified reflective-reflective model of WAS ....................... 196
Figure 12.5. The misspecified formative-formative model of WAS. .................... 197
Figure 12.6. Step one: constructing the first-order sub-constructs of both personal and organisational capacities of reflective-formative model of WAS using PLS path modeling. ........................................................................................................... 208
Figure 12.7. Step two: building the second-order formative constructs (organisational and personal capacities) for the reflective-formative model of WAS. ........................................................................................................................... 209
Figure 12.8. Step three: The scores of the first-order latent factors are used as the manifests of the second-order factors (i.e. organisational and personal capacities) and form the higher-order construct (WAS). .................................................. 210
Figure 13.1. The final model of reflective-formative WAS development using PLS path modeling. ................................................................................................... 220
Figure 16.1. The standardised path parameter estimates of the misspecified reflective WAS model ......................................................................................... 284
Figure 16.2. The reflective model of WAS using PLS-SEM.................................... 289
Figure 16.3. The reflective WAS development using PLS path modeling. ............ 291
Figure 16.4. The model building process for full formative model of WAS using PLS-SEM. ................................................................................................................... 293
Figure 16.5. The full formative model of WAS using PLS-SEM ............................. 295
1
INTRODUCTION TO THE THESIS
So here it is, my final version of the thesis. A thesis which was the best journey of
my life. One which I fell in love with and developed over the years. A thesis that made me
learn a lot and taught me to never stop learning. A thesis that was by my side for seven
years, for all the ups and downs.
1.1 Introduction
The journey started with an initial interest in exploring the SEM-based validation of
measures for formative constructs. At the start I was lost. I felt like I was walking in the
dark with no hope of finding the light. But as time passed and the more I immersed myself
in the subject, the more it all started to make sense. My first realisation was that constructs
are not reflective ‘by default’, as the majority of scholars assume. A real-life example
exists right in front of us: socioeconomic status (SES). The three main
components of SES are income, education and occupational status. Although these
components may be correlated, they measure different constructs, which means SES
cannot be a reflective construct, as assumed by some scholars. I grew up in a middle-class
family in a highly populated, developing country, where your social status is often defined
by your parents’ income or occupational status. This experience showed me that parents’
income and occupational status do not always depend on their level of education. This is
a perfect example of a formative construct.
I then reviewed the literature to see how big the problem of model misclassification
actually is, and as I was expecting, I found that it was big enough to do something about. It
is very difficult to evaluate formative models using the conventional SEM procedures,
mainly due to identification problems. It is therefore no wonder that the majority of
scholars consider their constructs to be reflective ‘by default’. Fitting
formative models with the conventional Covariance-Based Structural Equation Modeling
(CB-SEM) procedure in standard statistical software results in model identification
problems. In my search for a solution I discovered and became familiar with Partial Least
Squares SEM.
Later in my research on formative models, I met the distinguished Professor
Peter M. Bentler at UCLA. After I attended his statistics classes and had frequent discussions
with him, the application of SEM began to make more sense than ever before. During one of our
meetings, Peter mentioned Covariate-Dependent Reliability, prompting my second realisation.
The more I read and thought about it, the more I realised the potential applications
for this approach. Why is it that some measurement scales do not have an acceptable level
of reliability in all situations or populations? Is an IQ test derived within a European
context applicable for a remote Aboriginal community? Can the reliability of the IQ test that
is associated with ethnicity be separated from the reliability that is independent of
ethnicity? Clearly, the reliability of any measure can be influenced by covariates or
confounding variables. However, few researchers seem to address this issue and, if they do, they
do not know how to account for it. With permission from Professor Bentler, I started
evaluating the application of Covariate-dependent Reliability. I was warned that this was a
risky exercise given the novelty of the topic and the lack of previous literature. But
nevertheless, the importance of this topic pushed me out of my comfort zone and stretched
the boundaries of my thesis.
Through my investigation of Covariate-dependent Reliability, I came to understand
its application in demonstrating Common Method Bias (CMB), caused by factors such as
social desirability. Not all students who fill in surveys about their drinking behaviours or
emotional intelligence tell us the truth. I came to realise that it was possible to test for CMB
in the following way. If CMB is treated as a covariate and the reliability of the
measurement scales is evaluated with and without it, then a change in the
reliability suggests that CMB exists. Moreover, the effect of the CMB can be
extracted.
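The comparison described above can be pictured with a small numerical sketch. It is an illustration only and not Bentler's estimator: it assumes a single factor F measured by four indicators, with F regressed on one covariate Z (standing in for social desirability), and it splits the reliability of the unit-weighted sum score into the share that runs through Z and the share that does not. The function name, the loadings and all variances are made-up values chosen for the example.

```python
import numpy as np


def reliability_split(loadings, uniquenesses, gamma, covariate_var, disturbance_var):
    """Split the reliability of a unit-weighted sum score into two parts.

    Assumed (hypothetical) model, for illustration only:
        x_i = loading_i * F + e_i      (one factor, uncorrelated errors)
        F   = gamma * Z + D            (Z: covariate, D: disturbance)
    """
    loadings = np.asarray(loadings, dtype=float)
    uniquenesses = np.asarray(uniquenesses, dtype=float)

    weight = loadings.sum() ** 2                    # (sum of loadings)^2
    factor_var_from_z = gamma ** 2 * covariate_var  # factor variance explained by Z
    total_var = weight * (factor_var_from_z + disturbance_var) + uniquenesses.sum()

    dependent = weight * factor_var_from_z / total_var  # covariate-dependent share
    free = weight * disturbance_var / total_var         # covariate-free share
    return dependent, free


# Made-up values: four indicators, total factor variance equal to 1 in both scenarios.
loadings = [0.8, 0.7, 0.6, 0.5]
uniquenesses = [0.36, 0.51, 0.64, 0.75]

# Scenario 1: the covariate has no effect on the factor.
no_bias = reliability_split(loadings, uniquenesses, gamma=0.0,
                            covariate_var=1.0, disturbance_var=1.0)
# Scenario 2: a quarter of the factor variance is due to the covariate.
with_bias = reliability_split(loadings, uniquenesses, gamma=0.5,
                              covariate_var=1.0, disturbance_var=0.75)

print("no covariate effect  (dependent, free):", no_bias)
print("with covariate effect (dependent, free):", with_bias)
```

In this sketch the total reliability is the same in both scenarios, but once the covariate has an effect, part of it is attributed to the covariate and the covariate-free share falls. A gap of this kind between the two shares is what would be read as evidence of CMB under the stated assumptions.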
Finally, after learning about the importance of bifactor models for multidimensional
modeling, I was encouraged to also explore this neglected area. For the complex,
multidimensional Work Organisation Assessment Questionnaire (WOAQ),
bifactor analysis provided a significant improvement in the measurement model.
1.2 Study Structure
The main theme of this journey is the testing of model-based reliability and validity
of measurement models using SEM. I have pursued three new developments in this area in
the following three studies:
1) The model-based reliability and validity evaluation in Bifactor Measurement
Models with special focus on the Work Organisation Assessment Questionnaire (WOAQ),
comparing the results with a second-order model;
2) The application of the newly developed theory of Bentler’s Covariate-dependent
Reliability, not only for reliability assessment but also for demonstrating Common Method
Bias (CMB); and
3) Evaluating Model-Based Reliability and validity in reflective-formative models,
using Partial Least Squares SEM. By using this procedure, the existing misspecified model
of WAS will be compared with a correctly specified model to highlight the impacts of
model specification errors.
A summary of these three studies is presented in Figure 1.1. Covariance-Based SEM
is used for the first two studies in the context of reflective measurement
models, and PLS-SEM is used for the third study in the context of reflective-formative
measurement models. The theory for these methods is covered in Chapters 2 to 5. In Study one
(Chapters 6 to 8) the validation of a bifactor model for the Work Organisation Assessment
Questionnaire (WOAQ) will be demonstrated in a health setting. A bifactor validation of this
kind has not previously been carried out for the WOAQ. In addition, the
model-based reliability coefficients (Omega total, Omega hierarchical and Omega
subscales) will be computed and compared with the conventional coefficient alpha. The
first part of the study was conducted with a sample of community nurses. The second part
of this study concerns the cross validation of the bifactor model with a paramedic sample
(across gender), to find out whether the bifactor model of the WOAQ has similar properties
in another very different population within
the health sector.
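As a rough illustration of how the model-based coefficients differ from coefficient alpha, the sketch below computes alpha, Omega total and Omega hierarchical from the loadings of a bifactor model using the standard textbook formulas. It assumes orthogonal factors with unit variance and uncorrelated errors; the six items, two group factors and all loadings are invented for the example and are not WOAQ estimates. Omega subscale coefficients are not computed here.

```python
import numpy as np


def bifactor_reliability(general, group, uniquenesses):
    """Return (alpha, omega_total, omega_hierarchical) for a unit-weighted sum score.

    general      : loadings on the general factor, one per item
    group        : items x group-factors matrix of group-factor loadings
    uniquenesses : unique (error) variances, one per item
    Factors are assumed orthogonal with unit variance.
    """
    general = np.asarray(general, dtype=float)
    group = np.asarray(group, dtype=float)
    uniquenesses = np.asarray(uniquenesses, dtype=float)

    # Model-implied covariance matrix of the items.
    sigma = np.outer(general, general) + group @ group.T + np.diag(uniquenesses)
    total_var = sigma.sum()          # variance of the sum score
    k = len(general)

    general_var = general.sum() ** 2               # sum-score variance due to the general factor
    group_var = (group.sum(axis=0) ** 2).sum()     # sum-score variance due to the group factors

    alpha = (k / (k - 1)) * (1.0 - np.trace(sigma) / total_var)
    omega_total = (general_var + group_var) / total_var
    omega_hier = general_var / total_var
    return alpha, omega_total, omega_hier


# Hypothetical standardised loadings: six items, two group factors of three items each.
general = np.full(6, 0.6)
group = np.array([[0.5, 0.0], [0.5, 0.0], [0.5, 0.0],
                  [0.0, 0.4], [0.0, 0.4], [0.0, 0.4]])
uniquenesses = 1.0 - general ** 2 - (group ** 2).sum(axis=1)

alpha, omega_total, omega_hier = bifactor_reliability(general, group, uniquenesses)
print(f"alpha={alpha:.3f}  omega_total={omega_total:.3f}  omega_hierarchical={omega_hier:.3f}")
```

Omega hierarchical counts only the general factor as reliable variance, so it can never exceed Omega total for the same model. The gap between the two indicates how much of the sum-score reliability is carried by the group factors.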
In Study two (Chapters 9 – 11), two applications of covariate-dependent reliability
will be demonstrated empirically. This is the first time that covariate-dependent
reliability has been applied empirically. The first application assesses the reliability
of the WOAQ across the nursing and paramedic organisations. The second evaluates
reliability in the context of Common Method Bias (CMB), demonstrating that, if CMB exists,
the reliability of the measurements will be affected when CMB is treated as a covariate.
In Study three (Chapters 12 – 14), an empirical example of fitting a reflective-
formative measurement model using Partial Least Squares SEM is presented step by step.
To the best of the researcher’s knowledge, there is no clear published guideline or
procedure for fitting a reflective-formative model in the Partial Least Squares SEM
literature. This model is fitted for a work ability measure (WAS), allowing the testing of both
validity and reliability. This reflective-formative model is compared with a misspecified
reflective-reflective model, demonstrating the errors that occur as a result of
misspecification.
Figure 1.1. The study structures in this thesis.
Note: CB-SEM: Covariance-based Structural Equation Modeling; PLS-SEM: Partial Least Squares Structural Equation Modeling; WOAQ: Work Organisation Assessment Questionnaire; WAS: Work Ability Scale.
[Figure content: Model-based reliability and validity, in two branches]
CB-SEM (Reflective models), Chapters 2-4
Study 1 (Chapters 6-8): Bifactor and higher-order model validation of WOAQ
Part I: Study of nurses – validation and model-based reliability of WOAQ
Part II: Study of paramedics – cross validation of WOAQ
Study 2 (Chapters 9-11): Applications of covariate-dependent reliability
Part I: Study of nurses and paramedics – application in reliability
Part II: Study of students – application in common method bias
PLS-SEM (Reflective-formative models), Chapter 5
Study 3 (Chapters 12-14): Reflective-formative model validation of WAS
Part I: Fitting hierarchical models with formative constructs
Part II: Validation and model-based reliability of WAS
Chapters 2 provides an overview of SEM and Chapter 3 covers model-based
reliability, along with new developments, current gaps and applications in these areas.
Chapter 4 compares bifactor and second-order models and Chapter 5 compares reflective
and formative models, presenting an overview of the history of misspecification for
formative models. The misspecification rates for formative and reflective models are
assessed for two top journals in Organisational Psychology over a nine-year period (2006-2014).
A solution to this model identification problem is provided by way of a decision flowchart
for distinguishing formative from reflective models. Chapters 6 to 8 cover the validity,
cross validity and model-based reliability of the Work Organisation Assessment
Questionnaire (WOAQ), using both bifactor and second-order factor models. Chapters 9 to
11 cover two new applications of covariate-dependent reliability, and also demonstrate how
CMB can be detected and measured using this approach. Chapters 12 to 14 concentrate on
the validity and reliability assessments of reflective-formative models as opposed to
misspecified reflective-reflective models using PLS-SEM. Finally, all three studies are
summarised in Chapter 15.
1.3 Summary
In spite of the importance of multidimensional model-based reliability
measurement, there is limited empirical study of model-based reliability coefficients. In
addition, bifactor models and measures with formative constructs have received less
attention in the literature compared to higher-order and reflective models. Either scholars
do not recognise the importance of these topics, or the appropriate statistical software is not
readily available for performing analysis. This research is designed to fill some theoretical
and methodological gaps in this area. Moreover, for the first time, this study demonstrates
the practical implications of the newly introduced theory of covariate-dependent reliability
of Bentler (2014) for reliability and common method bias assessment.
2
THE HISTORY OF THE EVOLUTION OF SEM IN PSYCHOLOGY
Structural equation modeling (SEM) is one of the major research tools that is
rapidly growing in popularity. SEM is a statistical technique for testing and examining
measurement models and causal relations, using a combination of statistical data and
qualitative causal assumptions (Pearl, 2000). SEM techniques are a major component of
applied multivariate statistical analysis, which are widely used by researchers in different
disciplines.
In a broader sense, SEM represents a series of cause-effect relationships between
variables in a composite testable model (Shipley, 2000). It extends conventional
multivariate statistical analysis by accounting for measurement error and by more
thoroughly examining goodness-of-fit. The SEM technique has grown out of methods such
as path and factor analysis.
SEM has attracted attention primarily because it lends itself to effectively studying
problems or models that cannot be easily investigated using other approaches. In this
chapter, a history of the original roots of SEM in psychology will be traced, followed by a
discussion of the current developments in SEM. The structure of the chapter is based on the
path diagram presented in Figure 2.1. The idea of showing the history of SEM using a
graph originated from a personal conversation with Professor Peter Bentler in 2012. The
researcher was inspired by the idea and extensively developed the graph to include all the
major developments in SEM.
Figure 2.1. A pseudo path diagram and timeline for some of the developments in SEM modeling and SEM model structures. Acknowledgement:
Special thanks to Professor Peter Bentler (personal communication, 2012), for his inspiration and input into development of the diagram.
[Figure: pseudo path diagram on a timeline marked at 1900, 1920, 1940, 1960, 1980 and 2000. Labelled elements: Reliability; Classical Test Theory; Modern Test Theory; Regression; Principal Components; Factor Analysis; Confirmatory FA; Path Analysis; Simultaneous Equations; FASEM; Linear Bentler/Weeks; Nonlinear; Formative Models; MIMIC; Multilevel; Mixture; Bootstrapping; PLS; GLLAMM; Meta Analysis.]
2.1 First Trend: Exploratory Factor Analysis¹
Exploratory factor analysis (EFA) has made an important contribution,
especially in the social sciences, by addressing the needs and interests of various
disciplines. The primary roots of SEM in psychology can be traced to the work of
Pearson (1901) on orthogonal least squares. Pearson’s theory was not fully appreciated
at the time, but it later became a foundation for principal component analysis and
correlation matrix analysis (Hotelling, 1933). Spearman (1904), an English
psychologist, also contributed substantially. Spearman is commonly regarded as the
pioneer of factor analysis, based on his work involving the finding of relationships
between multiple correlated measures of cognitive performance. Factor analysis is
defined as a type of statistical procedure that is conducted to identify clusters or groups
of related items (called factors).
Using factor analytic data, Spearman postulated his original two-factor models for
ability and intelligence, highlighting the theory testing nature of the method. Spearman
found that children's scores on different subjects were connected. Spearman then
extended his theory, proposing three types of factors: a general factor (g), referring to all
activities; a specific factor (s) that refers to a specific mental activity; and a group factor
which is common to some of the variables but not all. Other scholars gradually adopted
this approach using factor analysis (e.g. Mosier, 1939; Guttman, 1952; Lawley, 1940;
Anderson and Rubin, 1956).
¹ Shown in blue in Figure 2.1.
2.1.1 Multiple factor analysis
Spearman's two-factor theory was criticised widely. Thomson (1916 & 1935)
strongly criticised the sampling theory approach in regard to abilities in the early stages
of the development of Spearman’s two-factor theory. It was claimed that the analysis
considered only a sample of possible abilities, making such an analysis incomplete.
However, the biggest critic of the Spearman model was Wilson (1928a, 1928b, 1929), a
famous mathematician who made a significant contribution to the development of factor
analysis. In different papers and using different examples, Wilson highlighted
indeterminacy issues; lack of uniqueness in the variable (g) of Spearman’s theory; and
the identifiability problem in the variance-covariance parameters of factor analysis. A
number of scholars (such as Irwin, 1935; Thomson, 1935) added to Wilson’s work by
further developing an understanding of indeterminacy (Steiger & Schönemann, 1978).
In the early stages of Spearman's development of factor analysis, some scholars (e.g.
Wilson, 1929; Thomson, 1935) suggested that factor indeterminacy might seriously
affect the ultimate purpose of the model, making this a very important theoretical issue
between 1928 and 1939. However, the focus moved away from factor indeterminacy
after Wolfe (1940) published his objective historical review of factor analysis, and the
issue did not attract attention again until 1955.
The Spearman two-factor methods were also criticised on the grounds that they
were not appropriate for situations that involved more than one group factor. In 1931,
Thurstone considered this as one of the serious limitations of Spearman's method,
mainly because psychological problems usually involved two or more group factors
(Thurstone, 1935). This limitation led to an interest in multiple-factor analysis to
supplement Spearman’s model, whereby group factors were identified after extracting a
general factor (e.g., Holzinger 1941). In multiple-factor analysis there are no restrictions
on the number of general factors or the number of group factors (Thurstone, 1935).
Thurstone (1947) further developed his multiple-factor analysis using the centroid
method of factoring for a correlation matrix, a pragmatic compromise to the
computationally burdensome principal axis method. This method of factor analysis
attracted further attention in the 1960s (Bentler, 1968; McDonald, 1970). However, from
the early 1980s, explicit optimisation functions, such as least squares, maximum
likelihood (ML), and minimum chi-square, became more popular.
Thurstone (1947), with later contributions from Cattell (1978), developed the
foundations for the concept of factor rotation. Other scholars extended Thurstone’s
work by proposing practical solutions for rotation. The most popular rotation methods
included the Varimax orthogonal rotation, which forced factors to be uncorrelated
(Kaiser, 1958); and various oblique rotation methods (Jennrich & Sampson, 1966;
Jennrich & Clarkson, 1980) which allowed the factors to be correlated. As a result of
these developments exploratory multiple-factor analysis became popular during this
period.
2.2 Second Trend: Confirmatory Factor Analysis (CFA)
There is obviously a connection between exploratory and confirmatory factor
analysis methods. However, other statistical theory apart from exploratory factor
analysis has also made a significant contribution to the development of confirmatory
factor analysis (Bentler, 1986). These theories include analyses for higher order factors.
Although Thurstone (1947) is generally credited with proposing the
mathematical foundation of second-order factor analysis, it was Jöreskog in 1970 who
wrote an equation including first and second-order factors as a single model and it was
Bentler in 1976 who offered a complete and general structure for higher-order factors.
The problem of rotating factor solutions was avoided when confirmatory factor
analysis (CFA) was introduced. In CFA, the factors and their loadings are
specified before the analysis starts, transforming the problem into one of identifying a
model’s parameters from observed moments (Matsueda, 2012).
CFA was introduced originally by Tucker (1955). It was further developed
following the introduction of an ML approach to factor analysis (Lawley, 1940;
Anderson & Rubin, 1956). Finally, it was Jöreskog (1969) who developed the first
computer software programs for CFA estimation using ML.
2.3 Third Trend: Factor Analysis of SEM (FASEM)
Real progress in the evolution of SEM was produced by the integration of the
earlier SEM developments in psychometrics, sociology, econometrics, and biometry
(Bentler, 1986). The factor analysis of structural equation modeling (FASEM) and the
resulting linear structural relations (LISREL) software were the main outcomes of this
integration. At the time, simultaneous equation and path analysis methods were the
main new contributors to FASEM and LISREL.
2.3.1 Path Analysis
Sewall Wright was one of the first scholars to use path analysis in medical science,
having begun to use it in his studies in the 1920s. Path analysis was one of the
primary procedures used to determine a causal structure. Wright used observed
variables to develop a correlation matrix, and drew path diagrams indicating direct and
indirect effects.
Path analyses led Wright to develop the Multiple Indicators Multiple Causes
(MIMIC) model among others (Matsueda, 2012). Figure 2.2 presents an early path
analysis by Wright (1920) indicating path modeling of heredity and environment in
shaping the piebald pattern of guinea-pigs.
Figure 2.2. One of Wright's first path diagrams for genetic modeling.
Source: Wright, Sewall (1920). The relative importance of heredity and
environment in determining the piebald pattern of guinea-pigs. Proceedings of the
National Academy of Sciences, 6, 320-332.
2.3.2 Simultaneous equation and errors-in-variables models in economics
The development of SEM in econometrics can perhaps be attributed to Frisch
and Waugh (1933), Haavelmo (1943), and Koopmans (1945). Frisch (1934), the
founder of the Econometric Society and the journal Econometrica, invented the term
“econometrics” and developed many of the identification principles in SEM. The
advances made by Haavelmo (1943), another economist, together with Mann and Wald,
led to work on SEM at the Cowles Commission (1952). This resulted in Haavelmo
solving the major problems of identification, estimation, and testing in SEM.
Koopmans et al. (1945) made some empirical advances on Haavelmo’s model.
However, according to Matsueda (2012), it was Klein (1950) who made the most
significant contribution to the empirical application of simultaneous equation models
using Keynesian economic models, culminating with the 15-equation Klein-Goldberger
model estimated by limited-information methods (Klein & Goldberger 1955). Other
scholars made further contributions to the model (e.g. Anderson & Rubin 1949; Zellner,
1962; Zellner & Theil, 1962).
Frisch (1934) first created an errors-in-variables model and then a graphical
presentation of regression coefficients (the method of bunch maps) which was proposed
as a tool to discover underlying structures, often obtaining approximate bounds for
relationships. According to Hendry and Morgan (1989), Frisch treated observed
variables as fallible indicators of latent variables, examining the interrelationships
among all latent and observed variables to distinguish true relations from confluent
relations.
Frisch’s errors-in-variables model was ignored until the early 1970s when
Zellner became interested and demonstrated the use of generalised least squares (GLS)
and Bayesian approaches in estimating a model with a fallible endogenous predictor.
Later, Goldberger (1971) showed that GLS was equivalent to ML only when errors
were normally distributed with known variances. He also showed that when error
variances were unknown, an iterated GLS converged to ML.
According to Bentler (1986), Goldberger was one of the first researchers to
realise the need to integrate some SEM-related ideas into other disciplines (Goldberger,
1971; Goldberger & Duncan, 1973). This integration was one of the turning points in
the evolution of SEM in the 1970s.
2.3.3 FASEM
FASEM is a generic acronym for factor analysis (FA) structural equation
modeling (SEM), a major development of the 1970s and 1980s. The term FASEM was first used
by Bentler (1986) to refer to conceptual approaches to continuous variables in SEM.
The Conference on Structural Equation Models in 1970 contributed greatly to
the integration of SEM disciplines. The conference was an interdisciplinary forum of
economists, sociologists, psychologists, statisticians, and political scientists, and the
academic papers were published in the volume Structural Equation Models in the
Social Sciences by Goldberger and Duncan in 1973.
According to Bentler (1986), the major achievements in the 1970s and 1980s
can be categorised into three sections: structural concepts, statistical theory and practical
development. The two key papers published in this period were written by Hauser and
Goldberger (1971) and Jöreskog (1973). Hauser and Goldberger’s (1971) examination
of unobservable variables is an exemplar of cross-disciplinary integration, drawing on
path analysis and moment estimators from Wright, as well as work by sociologists. It
also incorporates factor-analytic models from psychometrics, efficient estimation, and
Neyman-Pearson hypothesis testing from statistics and econometrics. Hauser and
Goldberger used limited-information estimation to gain a better understanding of
structural equations estimated by ML. Jöreskog (1973) presented an ML framework for
estimating the parameters of these SEM models, developed a computer program for
empirical applications, and showed how the general model could be applied to a myriad
of important substantive models.
2.3.4 Nonlinear SEM
The turning point in the application of SEM in psychology dates back to the
1970s and 1980s, primarily through the work of Bentler and, more particularly, his
development of the EQS SEM software (Matsueda, 2012). Using such analytical
software for evaluating SEMs allows researchers to make better use of their data and to
study the empirical applications of some of the methods proposed by certain scholars in
the literature (Bentler, 1986). During the 1980s some researchers paid attention to
nonlinear SEMs, which helped to extend the overall scope of SEM. Some important
developments in nonlinear latent variable SEM, particularly those for categorical data,
emerged in the 1980s, mainly in the works of Bock and Aitkin (1981), Mislevy (1984)
and Muthén (1984).
2.3.5 Formative models
The first appearance of formative measures probably dates back to the Berkson
error model for radiation epidemiology studies described below. In the 1950s, the U.S.
carried out nuclear testing in the state of Nevada. Due to the sudden increase in thyroid
disease in surrounding areas, a major epidemiological study was carried out at the
University of Utah to evaluate the outcomes of radiation on health. The researchers
found that the main exposure to radiation came from milk and vegetable consumption.
Based on that finding, the people in the study who had similar milk intake were
assigned to the same dose group. Because the effect of radiation on the thyroid cannot
be observed directly, Berkson designed a method in which the true exposure to radiation
(true score) was a function of the amount of food consumption (observed score) with
some degree of uncertainty (measurement error).
In Classical Test Theory, the observed score is equal to the true score plus
measurement error, whereas in the Berkson error model the opposite holds: the true
score is equal to the observed score plus measurement error (Carroll, Ruppert, Stefanski, &
Crainiceanu, 2006). The Berkson measurement error concept has become the
cornerstone of what is today known as formative models. Although the concept of
formative measures was introduced by Berkson in 1950, it did not attract attention until
the late 1960s. Influenced by principal component and composite-like ideas, attention to
using formative measures in SEM has since increased. The biggest surge in the use of
formative models in certain situations occurred in the early 2000s. Many scholars (e.g.
Blalock, 1971; Diamantopoulos and Winklhofer, 2001; Jarvis, MacKenzie & Podsakoff,
2003; Petter, Straub & Rai, 2007) have alerted researchers to the relevance of formative
models in specific situations; however, this fact has unfortunately been
underemphasised in the literature.
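The contrast between the classical and Berkson error structures described above can be illustrated with a small simulation. This is only a minimal sketch under stated assumptions: the function names, the sample size, the seed and the unit variances are hypothetical choices made for illustration and are not taken from the Utah study or from Carroll et al. (2006).

```python
import random
from statistics import variance


def simulate_classical(n, rng):
    """Classical test theory: observed score = true score + error."""
    true = [rng.gauss(0.0, 1.0) for _ in range(n)]
    observed = [t + rng.gauss(0.0, 1.0) for t in true]
    return true, observed


def simulate_berkson(n, rng):
    """Berkson model: true score = observed (assigned) score + error."""
    observed = [rng.gauss(0.0, 1.0) for _ in range(n)]
    true = [x + rng.gauss(0.0, 1.0) for x in observed]
    return true, observed


rng = random.Random(2015)  # arbitrary seed, assumed for reproducibility
true_c, obs_c = simulate_classical(5000, rng)
true_b, obs_b = simulate_berkson(5000, rng)

# Compare the spread of true and observed scores under each model.
print("classical: var(true) =", round(variance(true_c), 2),
      "var(observed) =", round(variance(obs_c), 2))
print("Berkson:   var(true) =", round(variance(true_b), 2),
      "var(observed) =", round(variance(obs_b), 2))
```

Under these assumptions the observed scores are more variable than the true scores in the classical case, while in the Berkson case the true scores are more variable than the observed (assigned) scores, which is the reversal of roles noted in the text.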
2.3.6 Multiple-indicators multiple-causes model (MIMIC)
One of the techniques implemented by Wright in the 1920s using path analysis
is similar to what is now known as a MIMIC model (Matsueda, 2012). The main
advancement in MIMIC was achieved through the works of Jöreskog, Hauser, and
Goldberger in the 1970s. They introduced ML as the estimation method for over-
identified MIMIC models.
The release of the LISREL statistical software by Jöreskog in the 1970s
produced the greatest advancement in estimating MIMIC models. LISREL is still
popular among scholars because of its ability to incorporate factor analysis, path
analysis, and FASEMs into a general covariance structure model (Jöreskog and Sörbom
2001; Matsueda, 2012). Using the MIMIC model, identification and estimation of
formative models has become feasible in some circumstances.
2.4 Current Developments in SEM
Some of the most recent developments in SEM include multilevel models,
generalised linear latent and mixed modeling (GLLAMM), partial least squares (PLS)
and SEM-based meta-analysis.
2.4.1 Multilevel models
Multilevel models can be estimated using multiple indicator measurement
models in SEM. Using this approach, separate models for within-group and between-group
covariances are considered. Further, by using a multiple group analysis, the parameters
can be estimated simultaneously for both levels (Muthén, 1994). Although this
estimation method can be applied using almost any SEM software, it is generally feasible only
for a few specific models.
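The separation into within-group and between-group covariances mentioned above can be sketched as follows. This is a hedged illustration of the decomposition only, not of the model estimation itself. The data, the function names and the divisors (N - G for the pooled within-group matrix, G - 1 for the between-group matrix) are assumptions made for this example.

```python
def mean_vector(rows):
    """Column means of a list of equal-length rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sscp(rows, center):
    """Sums of squares and cross-products of rows around `center`."""
    p = len(center)
    out = [[0.0] * p for _ in range(p)]
    for row in rows:
        d = [x - c for x, c in zip(row, center)]
        for a in range(p):
            for b in range(p):
                out[a][b] += d[a] * d[b]
    return out


def within_between(groups):
    """Return (pooled within-group, between-group) covariance matrices."""
    all_rows = [row for g in groups for row in g]
    n, n_groups = len(all_rows), len(groups)
    grand = mean_vector(all_rows)
    p = len(grand)
    within = [[0.0] * p for _ in range(p)]
    between = [[0.0] * p for _ in range(p)]
    for g in groups:
        g_mean = mean_vector(g)
        w = sscp(g, g_mean)  # deviations from the group's own mean
        d = [m - c for m, c in zip(g_mean, grand)]
        for a in range(p):
            for b in range(p):
                within[a][b] += w[a][b]
                between[a][b] += len(g) * d[a] * d[b]
    s_pw = [[v / (n - n_groups) for v in row] for row in within]
    s_b = [[v / (n_groups - 1) for v in row] for row in between]
    return s_pw, s_b


# Hypothetical data: three groups, two indicators per respondent.
GROUPS = [
    [[2.0, 3.0], [3.0, 5.0], [4.0, 4.0]],
    [[6.0, 7.0], [5.0, 9.0], [7.0, 8.0], [6.0, 8.0]],
    [[1.0, 2.0], [2.0, 2.0], [3.0, 5.0]],
]
S_PW, S_B = within_between(GROUPS)
print("pooled within:", S_PW)
print("between:", S_B)
```

In a two-level analysis each of the two matrices would then be given its own measurement model. The sketch stops at the decomposition because the estimation step depends on the SEM software used.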
2.4.2 GLLAMM
As mentioned above, the use of multilevel models is limited to specific models
and cannot be applied to all models. In response to this limitation, a more advanced and
general estimation method, GLLAMM, was introduced by Rabe-Hesketh, Skrondal &
Pickles (2004), and further developed by Skrondal and Rabe-Hesketh (2004).
GLLAMM has three main components: a generalised linear model, a structural equation
model for latent variables, and distributional assumptions for these latent variables
(Matsueda, 2012). The generalised linear model is capable of analysing all types of
data: continuous, ordinal, dichotomous and discrete. The GLLAMM program is now
part of the Stata program. Many of the GLLAMM models can also be analysed by
Mplus, a powerful software package developed by Muthén and Muthén (2004).
2.4.3 PLS
The roots of PLS, as well as graphical models, can be traced to Herman Wold in
1977 (Geladi, 1988). Wold's PLS modeling was enhanced by the idea of principal
component analysis as well as Jöreskog's LISREL software program.
Originally, PLS was developed to solve the problem of multicollinearity in
multiple regression analysis. According to Wold (1979), PLS regression was an
appropriate estimation method for complex models with undeveloped theoretical
backgrounds. The original application of PLS was more for predictive models (Barclay,
Higgins, & Thompson, 1995). Later, as an alternative to Jöreskog’s covariance-based
SEM approach, Wold introduced SEM based on PLS. Because PLS-based SEM imposes
fewer underlying restrictions, such as requirements for normally distributed data and a
large sample size, it is known as soft modeling. Despite its less restrictive nature,
PLS-based SEM did not become as popular as covariance-based SEM. The main reason
for this was a lack of software for model estimation.
However, since 1984, and especially from the early 2000s, more user-friendly
software has been introduced for the estimation of PLS-based SEM, adding to the
popularity of the method. Software such as PLS-GUI (Li, 2005), Visual PLS (Fu,
2006a), PLS-Graph (Chin, 2004), SmartPLS (Ringle et al., 2005), SPAD-PLS (Test &
Go, 2006) and XLSTAT (Addinsoft, 2008) are among the recent developments
(Morales, 2011).
There has been much debate among scholars about the application of PLS
and the lack of a goodness-of-fit test. These issues are discussed in Chapter 3.
2.4.4 SEM-based meta-analysis
The concept of SEM-based meta-analysis was introduced by Cheung (2008) to
integrate SEM results from different studies. As a result, studies in meta-analysis are
relevant to SEM. Although Cheung's proposed approach added a new and important
methodological development in SEM, it is not yet fully incorporated into the current
popular SEM software, limiting its further application in practice.
2.5 Conclusion
SEM is rapidly growing in popularity as a major research tool in psychology.
The early foundation of SEM can be traced back to factor analysis, principal component
analysis, regression and path analysis. It started in various disciplines such as
psychometrics, sociology, econometrics and biometry. The Interdisciplinary Conference
on Structural Equation Models in 1970 greatly influenced the integration of SEM work
in these disciplines. The work of Bentler and, especially, his development of EQS in
the 1970s and 1980s, was another turning point for the application of SEM in psychology. Since then,
SEM has rapidly developed. In particular, MIMIC models were developed for the fitting
of formative models in some circumstances. Other recent developments such as PLS,
GLLAMM and multilevel models have extended the application of SEM techniques to a
higher level.
Although this area is progressing rapidly, there is a risk that the technique will
be misused due to its complexity or lack of knowledge among psychological
researchers. Some of the most controversial debates relate to model-based reliability,
model misspecification (formative vs. reflective) and the use of Partial Least Squares
SEM (vs. covariance-based SEM). These three issues are described in more detail in the
following three chapters to highlight their importance to researchers.
3
THE EVOLUTION OF MODEL-BASED RELIABILITY ESTIMATES
3.1 Introduction
The literature on reliability has developed following the introduction of classical
test theory in the 1900s. Coefficient alpha (α) has been widely utilised as a coefficient
of internal consistency for tests (measurement scales) where overall scores are
generated from the summation of test items (Bollen, 1989; Miller, 1995). Despite its
popularity, the application of coefficient alpha as a reliability estimate has been
contentious and has been subjected to numerous criticisms by scholars such as Green
and Hershberger (2000), Green and Yang (2009a) and Sijtsma (2009a). Some scholars
argue that coefficient alpha has been commonly misinterpreted as a measure of test
homogeneity or unidimensionality (Green & Yang, 2009b; Miller, 1995). Other scholars
such as Miller (1995) as well as Rogers, Schmitt, and Mullins (2002) claim that
coefficient alpha may not be suitable for multidimensional composites. From a differing
viewpoint, others claim that the conventional coefficient alpha can overestimate
or underestimate true reliability (Raykov, 1997, 1998; Miller,
1995). According to Raykov (1998) and Bentler (2009), coefficient alpha estimates
reliability correctly only when the error terms are uncorrelated and the assumption of
essential tau-equivalence is met. This term is explained below.
Due to the limitations of coefficient alpha, over the past decades attention has
shifted to a model-based internal consistency coefficient for measuring test score
reliability. Some scholars have embraced Structural Equation Modeling (SEM)
approaches for the estimation of model-based reliability as an alternative to coefficient
alpha to improve the reporting of psychometric internal consistency (Sijtsma, 2009b).
Unlike classical test theory which considers true-score variance, the SEM approach to
model-based reliability focuses on the composition of the true score. This means that
the true-score variance is partitioned into variance components allowing the researcher
to consider the importance of the different variance components that contribute to test-
score reliability (Sijtsma, 2009b).
In this chapter, the evolution of model-based reliability estimates
will be explored. The focus will initially be on model-based reliability assuming
a unidimensional reliability coefficient (Jöreskog, 1971; McDonald, 1985; Bentler,
2007). However, reliability coefficients for multidimensional and bifactor models will
also be considered, in the form of the Omega, Omega total, Omega hierarchical and
Omega subscale reliability coefficients (McDonald, 1978, 1999; Zinbarg, Revelle,
Yovel, & Li, 2005; Reise, Bonifay, & Haviland, 2012). The newer theory of covariate-dependent
and covariate-free reliability of Bentler (2014) will also be discussed. The
above mentioned model-based reliability coefficients are estimated using CB-SEM. For
completeness, the chapter will also briefly discuss composite reliability (CR) using
PLS-SEM. The application of composite reliability in scale models, involving formative
constructs, will be elaborated upon in a later chapter.
3.2 Classical Test Theory and Coefficient Alpha
Constructs or latent variables are commonly used to classify or group similar
behaviours or attributes. However, constructs in psychology are usually measured
indirectly, through tests, surveys, or tasks. Designing such measurement instruments
(scales) for measuring constructs is challenging. The test developer must deal with
many measurement problems. The study of measurement problems, including the extent
to which they influence the measurements and methods for dealing with these problems,
has evolved into a specialised discipline known as Test Theory. Test Theory “provides a
general framework for viewing the process of instrument development” (Crocker and
Algina, 1986; p. 7).
Historically the roots of Test Theory were developed mainly by psychologists
from Europe and the United States. In Europe, the early development of Test Theory
dates back to the mid-1800s with the work of Wilhelm Wundt, Ernst Weber, Gustav
Fechner and their colleagues in Germany. In Great Britain, scientists including Sir
Francis Galton, Charles Darwin and Karl Pearson were among the main scholars who
significantly contributed to the development of Test Theory.
Figure 3.1. A pseudo path diagram and timeline for some of the developments in historical review of the conceptualisation and estimation of model-based reliability.
[Figure: timeline marked at 1900, 1950, 1970 and 2000, running from "Early Roots" to "Recent Developments". Internal consistency reliability: Kuder and Richardson (1937); Hoyt (1941); Guttman (1945); coefficient alpha, Cronbach (1951). Covariance-based SEM (CB-SEM): unidimensional reliability (Jöreskog, 1971; Heise & Bohrnstedt, 1970); multidimensional Omega reliability (McDonald, 1978); Omega hierarchical, Omega subscales and Omega total (McDonald, 1999; Zinbarg, Revelle, Yovel, & Li, 2005; Reise, Bonifay, & Haviland); latent variable model reliability rho (Bentler, 2007); covariate-dependent and covariate-free reliability (Bentler, 2014). Partial Least Squares SEM (PLS-SEM): composite reliability (ρc) (Werts, Linn & Jöreskog, 1974).]
Between 1905 and 1908, the French psychologists Alfred Binet and Theophile
Simon established an example of psychological assessment that has stood the test of
time. They successfully created an intelligence test (IQ) to measure the level of
intellectual ability in children. Empirical test analysis and the advanced concept of norms
can be attributed to the work of Binet and are still used by modern test developers.
Early in the 20th century, American scientists made some progress developing
Test Theory further. In 1904, E. L. Thorndike published the first textbook on Test
Theory and James McKeen Cattell acknowledged the significance of norms and errors
in observations. The founding of the Psychometric Society in the 1930s promoted
further advancements for the establishment of Test Theory. Through the journals
Psychometrika and Educational and Psychological Measurement, more opportunities were
available for scholars to exchange ideas and theories in this field.
In 1869, Galton's study of students from Cambridge University showed that
mental abilities could be distributed as a normal curve allowing the application of
statistical techniques to psychological test data. Karl Pearson’s work with the
computational formula of the correlation coefficient followed. The procedure known as
factor analysis was originally developed based on an advanced set of correlational
procedures designed by Charles Spearman, later becoming one of the most popular
statistical procedures for assessing the validity of measurement instruments.
The importance of Test Theory in research and evaluation is well recognised. In
order to achieve accurate, comparable outcomes, it is crucial for researchers to adhere to
the principles of Test Theory for developing or testing measurement instruments and to
evaluate the accuracy and sensitivity of these tools before utilising them for research
purposes.
The reliability coefficient is defined as “… the degree to which
individuals’ deviation scores, or z scores, remain relatively consistent over repeated
administration of the same test or alternate test forms” (Crocker & Algina, 1986; p.
105). In Test Theory, different types of reliability are introduced: ‘Test-retest’, ‘Parallel
Forms’ and ‘Internal Consistency’.
‘Parallel Forms’ reliability is based on creating two scales which provide
composite scores for measuring the same construct (Nunnally & Bernstein, 1994). This
reliability measure is calculated as the squared correlation between the composite scores
of the two scales. However, although this is a good procedure for identifying
sources of error variance (Nunnally & Bernstein, 1994), it is rarely used in the
psychology literature.
‘Test-retest’ reliability is more commonly used. Instead of creating two different
scales and comparing the results as in parallel forms reliability, the consistency of the
responses over different time points is considered. Random measurement errors are
one of the main sources of inconsistency in the responses of individuals over time.
However, given the often limited time interval between test and retest, the accuracy of
the procedure has been criticised in the literature (Nunnally & Bernstein, 1994).
‘Internal Consistency’ reliability is less complex than parallel forms and test-
retest, in that a single scale is measured at only one time point. Two popular procedures
for estimating internal consistency reliability are ‘split-half’ and ‘Cronbach’s alpha’
(hereafter called coefficient alpha).
In the ‘split-half’ procedure, the scale is split into two parts and the
correlation between the two parts is calculated. The stronger the positive correlation
between the two parts of the scale, the better the internal consistency of the scale. There
are a few limitations with the split-half procedure. Firstly, there is no clear procedure or
justification for splitting the scale into halves. Secondly, for time-limited testing, such
as ability or IQ measurement, with items arranged from easy to hard, the reliability
estimates may be upwardly biased (Cronbach, 1960). Because of these limitations,
Cronbach (1951) introduced coefficient alpha, the average of all
possible split-half reliability estimates, for estimating internal consistency.
Coefficient alpha was first presented in Cronbach’s famous article in Psychometrika
(1951). Other scholars (e.g. Kuder and Richardson, 1937; Miller, 1995) are credited
with related developments of this measure. In particular, Kuder and Richardson
generated variance estimates for this measure using the mean of a series of reliability
coefficients calculated from a single study using a random split of items. Later, Hoyt
(1941) proposed a conservative estimation procedure for assessing the reliability of a
scale based on an analysis of variance decomposition of the data. This estimation
procedure delivers similar results to the KR20, described below, but underestimates
reliability.
As explained above, the coefficient alpha formula was proposed as the mean of
all possible split-half coefficients for a particular scale, with

$$\alpha = \frac{n}{n-1}\left(1 - \frac{\sum_{i=1}^{n} s_i^2}{s_x^2}\right)$$ Equation 3.1

where the number of items is $n$, the estimated variance of item $i$ is $s_i^2$ and the estimated
variance of the scale ($X$) is $s_x^2$. A value closer to one suggests a scale with better
internal consistency.
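As an illustration of Equation 3.1, the following minimal Python sketch computes coefficient alpha from a respondents-by-items array. The function name and the small data set are hypothetical and are not taken from the studies in this thesis; NumPy is assumed to be available.

```python
import numpy as np

def coefficient_alpha(scores):
    """Coefficient alpha (Equation 3.1) for an (n_respondents, n_items) array."""
    scores = np.asarray(scores, dtype=float)
    n_items = scores.shape[1]
    item_variances = scores.var(axis=0, ddof=1)      # s_i^2 for each item
    scale_variance = scores.sum(axis=1).var(ddof=1)  # s_x^2 of the sum score
    return n_items / (n_items - 1) * (1.0 - item_variances.sum() / scale_variance)

# Hypothetical responses: 6 respondents by 4 items.
example_scores = np.array([
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
    [4, 4, 4, 3],
    [3, 3, 2, 3],
])
alpha_example = coefficient_alpha(example_scores)
```

The same value can be obtained from the item covariance matrix as the squared number of items multiplied by the average off-diagonal covariance and divided by the variance of the sum score, which is the form used later when alpha is partitioned.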
3.3 Major Problems with Using a Coefficient Alpha Reliability Analysis
Cronbach's approach (1951, pp. 331-332) rests on some essential
assumptions that should be checked before applying coefficient alpha to evaluate
internal consistency. Unfortunately, these assumptions are ignored by many researchers,
resulting in concerns regarding the validity of coefficient alpha results for internal
consistency evaluation. The three assumptions are “essential
tau equivalency”, “uncorrelated errors” and “uni-dimensionality”.
Essential Tau Equivalency assumes that each item makes an equal contribution
of variance to the true scale variance (Green & Yang, 2009a). However, equal factor
loadings are seldom found in a scale and, moreover, the majority of scales are
multidimensional with unequal variances explained by each dimension. Thus, the
Essential Tau Equivalency assumption is often violated (Sijtsma, 2009). When the
assumption is violated, the reliability coefficient is negatively biased, so that the
alpha coefficient often underestimates true reliability (Green & Yang, 2009;
Sijtsma, 2009).
Uncorrelated Errors assumes no correlation between the item errors ($e_i$) when
the $i$th item ($x_i$) is expressed as a linear function of the factor ($f$):

$$x_i = \lambda_i f + e_i$$ Equation 3.2

This assumption is also commonly invalid (for details see Green & Yang, 2009;
Sijtsma, 2009; Bentler, 2009). Violating the uncorrelated errors assumption leads to
several problems. For example, Bentler (2009) explains that violating the assumption
results in overestimated alpha coefficients because of unwanted systematic variance.
Other scholars (e.g. Green & Yang, 2009a; Sijtsma, 2009) argue that violating the
assumption can lead to either overestimation or underestimation of coefficient alpha.
Due to the common violation of its assumptions, the use of
coefficient alpha is criticised by several researchers (Bentler, 2009; Green &
Hershberger, 2000; Green & Yang, 2009; Sijtsma, 2009). This has led to several
improvements being suggested. These suggestions include reporting the greatest lower
bound (glb) to reliability as a measure of internal consistency (Sijtsma, 2009),
and Bentler’s (1972) dimension-free lower bound for reliability (blb; see footnote 1).
These recommendations are not always appropriate for
computing reliability coefficients, for several reasons. The blb and glb are
defined at the population level (as lower bounds to reliability) and take no account of
sampling variability. As stated by Bentler (2009), “in practice, sample covariance and correlation
matrices must be used in the computation instead of their population counterparts,
which are essentially never available” (p. 141).
In addition, they assume uncorrelated error terms and/or no known
dimension for the factor model. Thus, in the presence of correlated errors or strong
theoretical and empirical knowledge on the dimensionality of the model, it does not
seem appropriate to use the blb or glb.
Therefore, coefficient alpha and the above blb or glb related measures are not
appropriate when:
a) the assumptions of coefficient alpha are violated,
b) the dimensionality of the measurement model is already established (as
unidimensional or multidimensional), and
c) the model fits the data well.

Footnote 1: Bentler’s dimension-free lower bound reliability ($\rho_{blb}$) was proposed by Bentler (1972) and makes no assumption about the number of factors. Under the same assumptions, glb and blb are equal.

In many situations a model-based reliability measure is preferable to coefficient alpha
and the above blb or glb related measures. The unidimensional and multidimensional
model-based reliability estimates discussed in the next section are based on sample
covariance matrices and have weaker assumptions than coefficient alpha.
3.4 Unidimensional Model-based Reliability
In response to coefficient alpha’s limitations and within the setting of
confirmatory factor analysis, the analysis of congeneric measures was introduced by
Jöreskog (1971) to calculate the uni-dimensional model-based reliability coefficient $\rho_{11}$.
This reliability coefficient is perhaps one of the earliest proposals for assessing
the reliability of 1-factor models which does not require equal item reliabilities
(Gerbing & Anderson, 1988). Using Maximum Likelihood (ML) estimation, $\rho_{11}$ can
be estimated in SEM using the following formula when item residuals ($e_i$) are assumed
independent and $k$ items with loadings $\lambda_i$ are included in a scale.

$$\rho_{11} = \frac{\left(\sum_{i=1}^{k} \lambda_i\right)^2}{\left(\sum_{i=1}^{k} \lambda_i\right)^2 + \sum_{i=1}^{k} \mathrm{Var}(e_i)}$$ Equation 3.3

The assumption of essential tau equivalency, or of equal variance among items, is
less important for this coefficient: it is not affected by
large differences between item variances. However, in the presence of equal factor loadings
and item variances, $\rho_{11}$ is equal to coefficient alpha.
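The relationship between Equation 3.3 and coefficient alpha can be checked numerically. The sketch below is illustrative only: it assumes a one-factor model with the factor variance fixed at 1, so that the model-implied covariance matrix is the outer product of the loadings plus a diagonal matrix of error variances. The function names and parameter values are hypothetical.

```python
import numpy as np

def rho_11(loadings, error_variances):
    """Congeneric reliability (Equation 3.3), factor variance fixed at 1."""
    loadings = np.asarray(loadings, dtype=float)
    error_variances = np.asarray(error_variances, dtype=float)
    true_variance = loadings.sum() ** 2
    return true_variance / (true_variance + error_variances.sum())

def alpha_from_covariance(cov):
    """Coefficient alpha (Equation 3.1) from an item covariance matrix."""
    cov = np.asarray(cov, dtype=float)
    n_items = cov.shape[0]
    return n_items / (n_items - 1) * (1.0 - np.trace(cov) / cov.sum())

def implied_covariance(loadings, error_variances):
    """One-factor model-implied covariance: lambda lambda' + diag(Var(e))."""
    loadings = np.asarray(loadings, dtype=float)
    return np.outer(loadings, loadings) + np.diag(error_variances)

# Hypothetical parameter values, not estimates from the thesis data.
equal = ([0.7, 0.7, 0.7, 0.7], [0.51, 0.51, 0.51, 0.51])
unequal = ([0.9, 0.7, 0.5, 0.3], [0.19, 0.51, 0.75, 0.91])

rho_equal = rho_11(*equal)
alpha_equal = alpha_from_covariance(implied_covariance(*equal))
rho_unequal = rho_11(*unequal)
alpha_unequal = alpha_from_covariance(implied_covariance(*unequal))
```

With equal loadings the two coefficients coincide, whereas with the unequal loadings chosen here alpha falls below $\rho_{11}$, in line with the underestimation discussed in Section 3.3.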
Similarly to the reliability coefficient $\rho_{11}$ proposed by Jöreskog (1971), the $\rho_t$
coefficient of Zimmerman (1972), defined below, is useful for estimating the
model-based reliability of 1-factor models when the assumption of equal factor loadings across
all items is not met or when error terms are correlated (McDonald, 1978; Raykov,
2001). However, as with $\rho_{11}$, when we have a unidimensional construct with equal
factor loadings and error variances for all items, with no correlation between the
residuals, the numerical value of $\rho_t$ will be equivalent to that of coefficient
alpha (Raykov & Shrout, 2002).

$$\rho_t = \frac{\left(\sum_{i=1}^{k} \lambda_i\right)^2}{\left(\sum_{i=1}^{k} \lambda_i\right)^2 + \sum_{i=1}^{k} \mathrm{Var}(e_i) + 2\sum_{1 \le i < j \le k} \mathrm{Cov}(e_i, e_j)}$$ Equation 3.4

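A minimal sketch of Equation 3.4 is given below. It assumes that the full error covariance matrix is supplied, so that summing all of its elements yields the error variances plus twice the error covariances. The values are hypothetical, and in practice the loadings would be re-estimated when error covariances are added to a model.

```python
import numpy as np

def rho_t(loadings, error_cov):
    """Reliability of a one-factor scale with possibly correlated errors (Equation 3.4).

    error_cov is the full k x k error covariance matrix; the sum of all its
    elements equals sum Var(e_i) + 2 * sum_{i<j} Cov(e_i, e_j).
    """
    loadings = np.asarray(loadings, dtype=float)
    error_cov = np.asarray(error_cov, dtype=float)
    true_variance = loadings.sum() ** 2
    return true_variance / (true_variance + error_cov.sum())

# Hypothetical values: four items, with the errors of items 1 and 2 correlated.
loadings = np.array([0.7, 0.7, 0.7, 0.7])
error_cov = np.diag([0.51, 0.51, 0.51, 0.51])
error_cov[0, 1] = error_cov[1, 0] = 0.20

rho_with_correlated_errors = rho_t(loadings, error_cov)
# Same loadings, but the error covariance is ignored (diagonal matrix only).
rho_ignoring_correlation = rho_t(loadings, np.diag(np.diag(error_cov)))
```

Holding the loadings fixed, ignoring a positive error covariance gives the larger of the two values, which illustrates how unmodelled systematic error variance can inflate a reliability estimate.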
Like coefficient alpha, the above mentioned methods involve a one-factor
model which explains a set of items and are therefore not suitable for instruments and
scales that are multidimensional. However, it should be noted that the uni-dimensional
rho (ρ) reliability coefficients mentioned previously can also be interpreted as a
unidimensional measure that quantifies the proportion of variance due to the most
reliable single dimension in a multidimensional space (Bentler, 2007).
In order to address the need for multi-dimensional reliability measures the
Omega (ω) procedure was developed using multi-dimensional measurement models
fitted using SEM. McDonald’s (1978) coefficient omega (ω) is defined below in the
context of a 2-factor model with loadings (λij) for the ith item on the jth factor (ηj) and
errors (ei) for the ith item, i=1, 2,…,k. It represents the ratio of the true variance to the
observed variance for this measurement model.
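
$$\omega = \frac{\mathrm{Var}\left(\sum_{i=1}^{k}\sum_{j=1}^{2} \lambda_{ij}\eta_j\right)}{\mathrm{Var}\left(\sum_{i=1}^{k}\sum_{j=1}^{2} \lambda_{ij}\eta_j\right) + \sum_{i=1}^{k} \mathrm{Var}(e_i)}$$ Equation 3.5
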
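Equation 3.5 can be evaluated directly from the parameter matrices of a fitted model. The sketch below assumes a loading matrix $\Lambda$ (items by factors) and a factor covariance matrix $\Phi$, so that the variance of the factor-based part of the sum score is $\mathbf{1}'\Lambda\Phi\Lambda'\mathbf{1}$. The function name and the two-factor values are hypothetical.

```python
import numpy as np

def coefficient_omega(loadings, factor_cov, error_variances):
    """Coefficient omega (Equation 3.5) for a multi-factor measurement model."""
    loadings = np.asarray(loadings, dtype=float)      # k items x m factors
    factor_cov = np.asarray(factor_cov, dtype=float)  # m x m factor covariances
    error_variances = np.asarray(error_variances, dtype=float)
    column_sums = loadings.sum(axis=0)                # 1' Lambda
    true_variance = column_sums @ factor_cov @ column_sums
    return true_variance / (true_variance + error_variances.sum())

# Hypothetical 2-factor model: three items per factor, factor correlation 0.5.
loadings = np.array([
    [0.8, 0.0],
    [0.7, 0.0],
    [0.6, 0.0],
    [0.0, 0.8],
    [0.0, 0.7],
    [0.0, 0.6],
])
factor_cov = np.array([[1.0, 0.5], [0.5, 1.0]])
error_variances = 1.0 - (loadings ** 2).sum(axis=1)  # standardised items

omega = coefficient_omega(loadings, factor_cov, error_variances)
```

Because the denominator is the sum of all elements of the model-implied covariance matrix, omega here is the share of the sum-score variance that the two factors jointly explain.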
More recent developments have addressed more complex measurement models
such as the bi-factor models (Gignac, 2013; Reise et al., 2012) described below. This is
an under-investigated area in psychometrics, in which a general factor exists alongside
sub-factors (Revelle & Zinbarg, 2008; Zinbarg, Revelle, Yovel, & Li, 2005).
3.5 Recent Developments
3.5.1 Omega Hierarchical and Omega Subscale for Bi-factor Models
Using the same Omega formula described above, Omega hierarchical ($\omega_h$) and
Omega subscale ($\omega_s$) together indicate how reliably a test measures the general
factor and the subscales of a hierarchical or bi-factor model. Omega hierarchical ($\omega_h$)
assesses the reliability attributable to the general factor loadings of a bifactor
model only. More specifically, it measures the proportion of variance in total scores that arises
from the general factor running across all the items (Reise, Bonifay & Haviland, 2013).
The degree of reliability of the proposed subscale scores can then be evaluated
after controlling for the variance generated from the general factor. This procedure
creates reliability measures known as Omega subscale ($\omega_s$) for each subscale (Reise et
al., 2012) using the same Omega formula for each subscale. Reise et al. (2012)
advocate the reporting of these reliability indices for all subscales (see Figure 5 for an
example). Reporting the Omega subscales is also very useful in bifactor models when
the plausibility of subscales is of special interest. Omega hierarchical and omega
subscales can be easily estimated using the R psych package (Revelle, 2013) and
AMOS. In addition, calculating a confidence interval for the omega reliability
coefficients yields more useful estimates.
A bi-factor model with 5 subscales is illustrated in Figure 3.2. The extent to
which multidimensionality affects both the general factor and subscale scores can be
appraised more accurately when the corresponding $\omega_h$ and $\omega_s$ values are reported in the
case of bifactor models (for more details on bifactor models, please see Chapter 4).
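The formulae for Omega hierarchical ($\omega_h$) and Omega subscale ($\omega_s$) are
provided below for the case of $k$ items ($i = 1, 2, \ldots, k$) contributing to a general factor with
loadings $\lambda_{gi}$ and $P$ subscales with loadings $\lambda_{S_j i}$ for $i = 1, 2, \ldots, S_j$ in each subscale $S_j$,
$j = 1, 2, \ldots, P$.

$$\omega_h = \frac{\left(\sum_{i=1}^{k} \lambda_{gi}\right)^2}{\left(\sum_{i=1}^{k} \lambda_{gi}\right)^2 + \left(\sum_{i=1}^{S_1} \lambda_{S_1 i}\right)^2 + \left(\sum_{i=1}^{S_2} \lambda_{S_2 i}\right)^2 + \ldots + \left(\sum_{i=1}^{S_P} \lambda_{S_P i}\right)^2 + \sum_{i=1}^{k} \mathrm{Var}(e_i)}$$ Equation 3.6

$$\omega_{s_1} = \frac{\left(\sum_{i=1}^{S_1} \lambda_{S_1 i}\right)^2}{\left(\sum_{i=1}^{S_1} \lambda_{gi}\right)^2 + \left(\sum_{i=1}^{S_1} \lambda_{S_1 i}\right)^2 + \sum_{i=1}^{S_1} \mathrm{Var}(e_i)}$$ Equation 3.7

$$\omega_{s_2} = \frac{\left(\sum_{i=1}^{S_2} \lambda_{S_2 i}\right)^2}{\left(\sum_{i=1}^{S_2} \lambda_{gi}\right)^2 + \left(\sum_{i=1}^{S_2} \lambda_{S_2 i}\right)^2 + \sum_{i=1}^{S_2} \mathrm{Var}(e_i)}$$ Equation 3.8

etc.
where the items $i = 1, 2, \ldots, S_1$ all belong to the $S_1$ scale and the items $i = 1, 2, \ldots, S_2$ all
belong to the $S_2$ scale, etc. Combining these reliabilities, the total reliability of the
P-factor measurement model is obtained using Omega total ($\omega_t$)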
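
$$\omega_t = \frac{\left(\sum_{i=1}^{k} \lambda_{gi}\right)^2 + \left(\sum_{i=1}^{S_1} \lambda_{S_1 i}\right)^2 + \left(\sum_{i=1}^{S_2} \lambda_{S_2 i}\right)^2 + \ldots + \left(\sum_{i=1}^{S_P} \lambda_{S_P i}\right)^2}{\left(\sum_{i=1}^{k} \lambda_{gi}\right)^2 + \left(\sum_{i=1}^{S_1} \lambda_{S_1 i}\right)^2 + \left(\sum_{i=1}^{S_2} \lambda_{S_2 i}\right)^2 + \ldots + \left(\sum_{i=1}^{S_P} \lambda_{S_P i}\right)^2 + \sum_{i=1}^{k} \mathrm{Var}(e_i)}$$ Equation 3.9
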
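Equations 3.6 to 3.9 can be sketched in a few lines of Python. The sketch assumes orthogonal factors with unit variances and items that each load on the general factor and on exactly one subscale. The function name, the subscale labels and the loadings are hypothetical, and this is not the routine implemented in the R psych package or AMOS.

```python
import numpy as np

def bifactor_omegas(general, specific, membership, error_variances):
    """Omega hierarchical, omega subscale and omega total (Equations 3.6 to 3.9).

    general: general-factor loading of each item
    specific: loading of each item on its own subscale factor
    membership: subscale label of each item
    error_variances: error variance of each item
    """
    general = np.asarray(general, dtype=float)
    specific = np.asarray(specific, dtype=float)
    membership = np.asarray(membership)
    error_variances = np.asarray(error_variances, dtype=float)
    labels = list(dict.fromkeys(membership.tolist()))  # unique labels, in order

    general_part = general.sum() ** 2
    specific_parts = {s: specific[membership == s].sum() ** 2 for s in labels}
    total = general_part + sum(specific_parts.values()) + error_variances.sum()

    omega_h = general_part / total
    omega_t = (general_part + sum(specific_parts.values())) / total
    omega_s = {}
    for s in labels:
        in_s = membership == s
        denominator = (general[in_s].sum() ** 2 + specific_parts[s]
                       + error_variances[in_s].sum())
        omega_s[s] = specific_parts[s] / denominator
    return omega_h, omega_s, omega_t

# Hypothetical standardised solution: six items in two subscales of three items.
general = np.full(6, 0.6)
specific = np.full(6, 0.5)
membership = ["S1", "S1", "S1", "S2", "S2", "S2"]
error_variances = 1.0 - general ** 2 - specific ** 2

omega_h, omega_s, omega_t = bifactor_omegas(general, specific, membership, error_variances)
```

In this hypothetical example omega hierarchical is smaller than omega total, and each omega subscale shows how much reliable variance remains in a subscale score once the general factor is taken into account.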
Figure 3.2. Demonstrating Omega reliability coefficients for WOAQ
Note: $\omega_h$ = Omega hierarchical; $\omega_s$ = Omega subscale
3.5.2 Covariate-dependent and covariate free reliability
Bentler’s approach to covariate-dependent and covariate-free reliability based
on coefficient rho is the next new development to be discussed. To establish an
acceptable level of reliability in measurement instruments, there should be positive
intercorrelations among indicators, especially when these indicators are supposed to
represent a single latent construct (Nunnally, 1978; Zeller & Carmines, 1980).
Reliability is usually quantified with a reflective measurement model that implies that
the latent factor generates the systematic responses to a set of items (Bollen & Lennox
1991). In such a model, since error and specific scores are confounded, an increase in
error variance or specific item variance implies a decrease in internal consistency
(Green & Yang, 2009).
More recently, Bentler (2014) introduced the concept of covariate-dependent
and covariate-free reliability that partitions total reliability into parts based on external
covariates and a part which is unaffected by such covariates. The following material on
covariate-dependent reliability is adapted from either personal conversations with
Bentler (2012, 2013) or Bentler (2014). Only the practical application of this concept is
assessed in this study.
Suppose that the covariance matrix for a given set of $p$ variables $X_i$ can be
modelled as $\Sigma = \Sigma_c + \Psi$, where $\Sigma_c$ is the part of the covariance matrix that contains all
influences of latent common factors on the observed variables and $\Psi$ is the covariance
matrix of the error (unique or residual) variation. In a confirmatory factor model with
$\Sigma = \Lambda\Phi\Lambda' + \Psi$, $\Sigma_c = \Lambda\Phi\Lambda'$ represents the factor-implied covariances of the variables.
In this case the reliability coefficient rho, describing the internal consistency of the sum
score $X = \sum_{i=1}^{p} X_i$, is calculated using a unit-weighting vector $\mathbf{1}$ as:

$$\rho_{XX} = 1 - \frac{\mathbf{1}'\Psi\mathbf{1}}{\mathbf{1}'\Sigma\mathbf{1}}$$ Equation 3.10

where $\mathbf{1}'\Psi\mathbf{1}$ is the sum of the unique or error variances associated with the X-variables,
and $\sigma_x^2 = \mathbf{1}'\Sigma\mathbf{1}$ is the sum of all the elements in the model-reproduced covariance matrix
of the X-variables. Clearly, $\rho_{XX}$ represents the proportion of construct-based variance to
the total variance of the sum score (Figure 6).
Now suppose a model contains latent variables that are influenced by a set of
covariates that we may call Z-variables. For simplicity, we consider only models that
have a single latent factor, say F. Then, assuming that the covariates Z predict F, i.e.,
that the model contains one or more Z→F paths, the covariate-free rho is:

$$\rho_{XX}^{\perp Z} = \frac{\Delta \cdot (\mathbf{1}'\Lambda)^2}{\mathbf{1}'\Sigma\mathbf{1}}$$ Equation 3.11

where $\Delta$ is the variance of the residuals in the regression of F on the Z-variables and
$\mathbf{1}'\Lambda$ is the sum of the factor loadings in the unstandardised solution.
The covariate-dependent rho can then be defined as
$\rho_{XX}^{(Z)} = \rho_{XX} - \rho_{XX}^{\perp Z}$. Equivalently, it can be calculated as:

$$\rho_{XX}^{(Z)} = \frac{(\gamma'\phi\gamma)(\mathbf{1}'\Lambda)^2}{\mathbf{1}'\Sigma\mathbf{1}}$$ Equation 3.12

where $\gamma'$ is the row vector of regression coefficients of F on the Z covariates and $\phi$ is
the covariance matrix of the Z’s. In EQS, F may be called F1 and the residual in the
regression Z→F may be called D1. Then $\gamma'\phi\gamma$ can be most simply computed as var(F1)
– var(D1), where var(F1) comes from the model reproduced covariance matrix and
var(D1) is a parameter estimate obtained from the residual variance of the model
(Bentler, 2006; 2014; personal communications 2012, 2013).
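The partition in Equations 3.10 to 3.12 can be illustrated with a short sketch. It assumes a single factor whose variance is $\gamma'\phi\gamma + \Delta$ and uncorrelated errors, so that $\Sigma$ equals the factor variance times the outer product of the loadings plus a diagonal $\Psi$. The function name and all parameter values are hypothetical and do not come from EQS output.

```python
import numpy as np

def covariate_partitioned_rho(loadings, error_variances, gamma, phi, delta):
    """Return rho, covariate-free rho and covariate-dependent rho
    (Equations 3.10, 3.11 and 3.12) for a single-factor model."""
    loadings = np.asarray(loadings, dtype=float)
    psi = np.diag(np.asarray(error_variances, dtype=float))
    gamma = np.asarray(gamma, dtype=float)
    phi = np.asarray(phi, dtype=float)

    explained = gamma @ phi @ gamma        # gamma' phi gamma = var(F1) - var(D1)
    factor_variance = explained + delta    # var(F1)
    sigma = factor_variance * np.outer(loadings, loadings) + psi
    total = sigma.sum()                    # 1' Sigma 1
    loading_sum_sq = loadings.sum() ** 2   # (1' Lambda)^2

    rho = 1.0 - psi.sum() / total
    rho_free = delta * loading_sum_sq / total
    rho_dependent = explained * loading_sum_sq / total
    return rho, rho_free, rho_dependent

# Hypothetical unstandardised estimates: four indicators and two covariates.
rho, rho_free, rho_dependent = covariate_partitioned_rho(
    loadings=[1.0, 0.9, 0.8, 0.7],
    error_variances=[0.5, 0.6, 0.7, 0.8],
    gamma=[0.4, 0.3],
    phi=[[1.0, 0.2], [0.2, 1.0]],
    delta=0.7,
)
```

Under these assumptions the covariate-free and covariate-dependent parts add up to the total rho, which matches the definition of the covariate-dependent rho as a difference.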
3.5.2.1 Interpretation of Covariate-dependent Reliability. Assuming a group
covariate, Bentler (2014; personal communications, 2012, 2013) defines the covariate-
dependent reliability as “… a measure of the effect of group differences on the trait
being measured relative to total variation, while the covariate-free reliability is a
measure of the reliable individual difference variance freed from any mean differences
due to the covariate(s)”.
In the traditional view, reliability is defined as a measure of stable individual
difference variation (if the data come from individuals) relative to total variation. But
in Bentler’s view (personal communications, 2013), “… any individual score might be
influenced by other sources, including group and other individual differences”. For
example, if nurses measure a wound size using a wound-measurement device, how
much of the accuracy in the measurement is due to individual differences, and how
much is due to factors such as level of experience or training? In this case
accuracy/reliability in wound measurement is measured with many indicators, and the
latent factor (= true score) is the trait of interest, with the researcher obtaining reliability
measures in the usual way.
A path diagram of four indicators, V1-V4, as measures of a construct (F1),
shown in Figure 3.3, can be used to illustrate the standard and covariate-dependent
reliability measures discussed above. In the standard case, one may assume the
unidimensional model described below.
Figure 3.3. A unidimensional construct with four indicators
The coefficients $\lambda_1$ to $\lambda_4$ are constants representing the strengths of the effect
of F1 on the various indicators (observed variables V1-V4). Typically these are called
factor loadings. Letting the factor F1 be $F$ and the random measurement errors E1-E4 be
$E_1$ to $E_4$, the diagram corresponds to the following measurement equations:

$$\begin{aligned} V_1 &= \lambda_1 F + E_1 \\ V_2 &= \lambda_2 F + E_2 \\ V_3 &= \lambda_3 F + E_3 \\ V_4 &= \lambda_4 F + E_4 \end{aligned}$$ Equation 3.13

Standard internal consistency reliability coefficients attempt to provide the
proportion of variance in the scale (sum) score V1+V2+V3+V4 that is due to F. There is
no further variance partitioning.
Covariate-dependent reliability is illustrated in Figure 3.4 by adding two covariates
(V5 and V6) to the model that predict the latent factor F.
Figure 3.4. A covariate-dependent construct with four indicators and two covariates
By path tracing, one can determine from Figure 3.4 that the variance of F1 (F)
can be partitioned into the variance due to the covariates V5 and V6, plus the variance
due to the residual D1. The former is used to yield the covariate dependent factor
variance, while the latter represents the part of the variance of F1 that is covariate-free.
These variances are then used in the model-based reliability formulae given above for
covariate-free and covariate-dependent reliability. Coefficient alpha can also
be partitioned in this way.
3.5.2.2 Covariate-dependent and Covariate-free Partition of Coefficient Alpha.
Coefficient alpha, previously given in equation 3.1, represents an estimate of
the reliability of $X = \sum_{i=1}^{p} X_i$. Partitioning coefficient alpha into a part due to
covariates and a part unaffected by covariates requires another approach. As
presented by Bentler (2014; personal communications, 2012, 2013), the
joint covariance matrix of covariates ($Z_i$) and variables of interest ($X_i$) can be
presented as:

$$\begin{pmatrix} \Sigma_{xx} & \Sigma_{xz} \\ \Sigma_{zx} & \Sigma_{zz} \end{pmatrix}$$ Equation 3.14

where $\Sigma_{xx}$ is the covariance matrix of the original $p$ variables X, $\Sigma_{zz}$ is the covariance
matrix of the set of $q$ covariates (Z-variables), and $\Sigma_{xz}$ gives their joint covariances.
In order to calculate a covariate-dependent alpha coefficient, the computations
essentially require regressing X on Z. It is well-known in the regression literature that
such a regression partitions the covariance matrix $\Sigma_{xx}$ into two parts, the part
$\Sigma_{xz}\Sigma_{zz}^{-1}\Sigma_{zx}$ predictable from Z and the part $\Sigma_{xx} - \Sigma_{xz}\Sigma_{zz}^{-1}\Sigma_{zx}$ not predictable from Z, that
is,

$$\Sigma_{xx} = \left(\Sigma_{xx} - \Sigma_{xz}\Sigma_{zz}^{-1}\Sigma_{zx}\right) + \left(\Sigma_{xz}\Sigma_{zz}^{-1}\Sigma_{zx}\right)$$ Equation 3.15

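As a consequence, $\bar{\sigma}_{ij}$, the average covariance in $\Sigma_{xx}$, can also be partitioned as

$$\bar{\sigma}_{ij} = \bar{\sigma}_{ij}^{\perp Z} + \bar{\sigma}_{ij}^{(Z)}$$ Equation 3.16

where $\bar{\sigma}_{ij}^{\perp Z}$ is the average off-diagonal element of the 1st right-hand term in equation
(3.15) and $\bar{\sigma}_{ij}^{(Z)}$ is the corresponding average of the 2nd right-hand term. Substituting
equation (3.16) into the defining formula for alpha given in equation (3.1), we have

$$\alpha = \frac{p^2 \bar{\sigma}_{ij}}{\sigma_x^2} = \frac{p^2\left(\bar{\sigma}_{ij}^{\perp Z} + \bar{\sigma}_{ij}^{(Z)}\right)}{\sigma_x^2} = \frac{p^2 \bar{\sigma}_{ij}^{\perp Z}}{\sigma_x^2} + \frac{p^2 \bar{\sigma}_{ij}^{(Z)}}{\sigma_x^2} = \alpha^{\perp Z} + \alpha^{(Z)}$$ Equation 3.17
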
Hence, coefficient alpha can be partitioned into two additive parts, where one
part is free of the covariates and the other part is covariate-dependent. Two major
applications of this procedure will be discussed in chapters 9 to 11. One of the main
applications concerns the effect of a covariate on the reliability of a scale. The second
application concerns applying this method for demonstrating the effect of Common
Method Bias (CMB) on reliability.
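The partition of coefficient alpha in Equations 3.14 to 3.17 can be sketched as follows. The joint covariance matrix is assumed to be ordered with the X-variables first and the Z-variables last. The function name and the population values (four items, one covariate) are hypothetical and serve only to show the arithmetic.

```python
import numpy as np

def partition_alpha(joint_cov, n_items):
    """Return alpha, covariate-free alpha and covariate-dependent alpha (Equation 3.17).

    joint_cov is the joint covariance matrix of Equation 3.14, with the
    n_items X-variables first and the Z covariates last.
    """
    joint_cov = np.asarray(joint_cov, dtype=float)
    s_xx = joint_cov[:n_items, :n_items]
    s_xz = joint_cov[:n_items, n_items:]
    s_zz = joint_cov[n_items:, n_items:]
    predictable = s_xz @ np.linalg.solve(s_zz, s_xz.T)  # S_xz S_zz^-1 S_zx
    residual = s_xx - predictable                       # part not predictable from Z
    scale_variance = s_xx.sum()                         # variance of the sum score

    def alpha_part(matrix):
        # p^2 times the average off-diagonal element, over the scale variance
        mean_offdiag = (matrix.sum() - np.trace(matrix)) / (n_items * (n_items - 1))
        return n_items ** 2 * mean_offdiag / scale_variance

    return alpha_part(s_xx), alpha_part(residual), alpha_part(predictable)

# Hypothetical population values: one covariate Z predicting the factor F.
loadings = np.array([0.8, 0.7, 0.6, 0.5])
gamma, var_z, delta = 0.6, 1.0, 0.64
var_f = gamma ** 2 * var_z + delta
s_xx = var_f * np.outer(loadings, loadings) + np.diag(1.0 - loadings ** 2)
s_xz = (gamma * var_z * loadings).reshape(-1, 1)
s_zz = np.array([[var_z]])
joint_cov = np.block([[s_xx, s_xz], [s_xz.T, s_zz]])

alpha, alpha_free, alpha_dependent = partition_alpha(joint_cov, n_items=4)
```

In this example the covariate accounts for 36% of the factor variance, and the covariate-dependent part is the same share of alpha, because the off-diagonal covariances here arise from the factor alone.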
3.5.3 Composite Reliability using PLS
All the above mentioned model-based reliability assessments require the use
of reflective measurement models and covariance-based SEM (CB-SEM). However,
CB-SEM is not the only appropriate method for assessments of model-based reliability.
Partial Least Squares (PLS) SEM provides an alternative approach. CB-SEM uses
Maximum Likelihood (ML) estimation while PLS-SEM uses partial least squares
estimation. PLS-SEM has fewer underlying restrictions than CB-SEM, which usually
requires normally distributed data and large sample sizes. The composite reliability
measure obtained using PLS-SEM will be fully explored in study 3 (chapters 12-14).
3.6 Summary
In this chapter the history of model-based reliability using SEM was critically
explored. When Cronbach’s famous article on his coefficient alpha was published in
Psychometrika in 1951, a single general coefficient for assessing internal consistency
and reliability became available. Since then, the alpha coefficient has been widely used
by researchers in many fields. However, it has been recently criticised by several
researchers (Bentler, 2009; Green & Hershberger, 2000; Green & Yang, 2009; Sijtsma,
2009), resulting in recommendations for improvements. Although these
recommendations may be useful, there are other methods (such as model-based
reliability) that should also be considered.
Measures for model-based reliability calculated using SEM include one-factor
model coefficients such as rho or $\rho_{11}$ (Jöreskog, 1971), multi-factor model coefficients
such as McDonald’s Omega ($\omega$) (1978), and model coefficients such as Omega
hierarchical ($\omega_h$), Omega subscales ($\omega_s$) and Omega total ($\omega_t$) for bi-factor
models (Revelle et al., 2009). Finally, the covariate-free and covariate-dependent
reliability coefficients of Bentler (2014) are recent practical methods developed to
examine the effects of covariates on the internal consistency of scales using SEM.
Two major recent model-based reliability measurement developments were
discussed in more detail in this chapter. They included a) Omega hierarchical and
subscale, with a focus on their application in bifactor models, and b) the
covariate-dependent and covariate-free reliability coefficients of Bentler (2014). A third development, the
PLS-SEM procedure for computing Composite Reliability, will be introduced in a later
chapter.
Unfortunately, software for calculating these new model-based reliabilities has
not routinely been available to scholars, and despite the importance of multidimensional
model-based reliability measurement, there is a lack of empirical studies where these
coefficients are estimated. Either scholars in the disciplines do not recognise the
importance of model-based reliability coefficients based on latent constructs over the
classical alpha coefficient, or the appropriate statistical software is still not readily
available. For example, except for EQS (Bentler, 2006), which calculates the model-based
reliability rho, and the R psych package (Revelle, 2013), which computes omega hierarchical
and subscales, most packaged software (e.g., SPSS) only provides the classical
alpha coefficient calculation.
Model-based reliability estimation provides a more accurate representation of
the true relative magnitude of systematic variance to total variance in a scale or
instrument. Therefore, once a SEM model fits well with its proposed constructs and
measured variables, a more accurate representation of its reliability can be obtained
using model-based reliability measures.
The following chapters apply these recent
developments in practice, with a special focus on bifactor and reflective-formative
models. Chapters 4-6 lay the theoretical groundwork for these applications.
4
THE VALIDITY OF BIFACTOR VERSUS HIGHER-ORDER
MEASUREMENT MODELS
This chapter opens with an introduction to bifactor models. The chapter then
considers the application of the bifactor model in an organisational study of Work
Organisation Assessment (WOAQ).
Bifactor Models
Constructs are often operationalised as multidimensional units (Diamantopoulos,
2010; Edwards & Bagozzi, 2000). When a number of dimensions or related attributes form
a latent factor, it is considered multi-dimensional. In a multi-dimensional construct,
dimensions can be conceptualised under an overall concept or a second-order (higher-
order) construct (Law, Wong, & Mobley, 1998). In second-order constructs, two levels of
constructs exist: the first-order level constructs composed of indicators and the second-
order level constructs composed of first-order constructs (Jarvis et al., 2003). Such models
are known as hierarchical (higher-order) models.
The majority of researchers in behavioural sciences use higher-order
modeling by default to evaluate multidimensionality. However, this is not the only procedure for
evaluating multidimensionality and it may not always be the best way to evaluate a
multidimensional model. The use of other approaches, such as bifactor (direct hierarchical
order) modeling, is not commonly found in the literature (Gignac, 2007; Reise, Moore, &
Haviland, 2010). In a bifactor model, all latent variables are modelled as first-order
constructs, in which first-order factors are nested within a general factor (Gignac, 2007;
Gustafsson & Balke, 1993; Holzinger & Swineford, 1937).
Perhaps the early roots for the development of bifactor (nested factors) models can
be traced back to the work of Holzinger and Swineford (1937) (for full history of SEM
development, see Karimi & Meyer, 2014). However, the bifactor approach is not well
appreciated in the literature, although there are some important advantages in using this
model as an alternative to the conventional higher-order modeling (Gignac, 2007, 2013).
As discussed by Gignac (2007) “the advantages are that bifactor (nested factors) models:
(a) tend to be associated with non-negligibly higher level of model fit; (b) allow for
statistical significance testing for all parameter estimates, and (c) allow for less ambiguous
interpretations of the factor loadings and the narrow factors ‘nested’ within the higher-order
factor(s)” (p. 40). As asserted, one can achieve a better model fit using bifactor modeling.
Imposing fewer restrictions on parameter estimates (as opposed to the number of
restrictions required in conventional CFA procedures) improves the validity of the reported
results in bifactor models. In addition, the bifactor model provides some evidence regarding
the plausibility of the subfactors and the extent of their contribution in a practical sense.
However, the bi-factor procedure is not without disadvantages. One of the main
limitations of using nested factor models is that they have fewer degrees of freedom which
may lead to model identification problems (Gignac, 2007). However, this problem can be
managed simply by constraining some of the parameters in the model (Gignac, 2007,
2013). In this study all the latent variable variances are constrained to 1.0 in order to
achieve an identified model.
It is evident that, if the aim for the proposed model is to present both
multidimensionality and a general single factor at the same time, then a bifactor model is an
appropriate procedure to present the model (Reise et al., 2010). Using a bifactor model not
only demonstrates the contribution of the items to a general factor (broad construct) but
also provides information on the item contributions to sub-dimensions (narrow constructs)
(Reise et al., 2010).
4.1 Bifactor Model of WOAQ
The Work Organisation Assessment Questionnaire (WOAQ) was developed as part
of a risk assessment procedure for stress-related exposures inherent in the manufacturing
sector. For a widely-used measure like the WOAQ, using a bifactor model is deemed to be
appropriate for several reasons.
First, having a broad or macro level assessment (using a general factor) would help
to get an overall picture of the organisation. Conversely, being able to assess the
organisation at a narrow or micro level (using subfactors) has practical implications in that
specific problematic areas can be identified and addressed. Evaluating the plausibility of
subfactors is very important in such contexts, making a direct hierarchical model for
WOAQ a good choice.
Second, as highlighted in recent studies (e.g. Wynne-Jones, Varnaya, Buck,
Karanika-Murray, Griffiths, Phillips, & Main, 2009), the latent structure of the WOAQ in
non-manufacturing sectors did not demonstrate a good fit to the model, suggesting that
conventional models are inadequate. Model fit is often a problem for the WOAQ when
conventional second-order models are considered. In the context of risk assessment in
organisations, a tool like WOAQ presents the overall work condition as a general single
factor. Additionally, it adds further benefit by highlighting the different subsections of work
organisation characteristics. Thus, evaluating the plausibility of subfactors is very important
in such contexts, suggesting that a direct hierarchical model for WOAQ would be an
appropriate choice.
One of the aims of study 1, therefore, is to compare a bifactor (nested factor) model
with a conventional second-order (higher-order) model of WOAQ. This is done in a health
setting. This study is expected to open up some empirical and methodological avenues for
further developments in this area.
A higher-order model (or full mediation model) and a bifactor model (partial
mediation model) of WOAQ can be distinguished, statistically (Gignac, 2007, 2008, 2013;
Yung, Thissen, & McLeod, 1999) and diagrammatically, as illustrated below.
Model 1. Higher-order model of WOAQ
Model 2: Bifactor model of WOAQ
Figure 4.1. Higher-order vs. Bifactor model of WOAQ
4.2 Summary
The distinction between a bifactor and a higher-order measurement model was the
focus of this chapter. It is evident that a bifactor model is superior to a higher-order
model when the aim of validating a measurement model is to present not only the
multidimensionality and plausibility of the subfactors but also the underlying general factor
of the scale on its own. A comprehensive measure of WOAQ, using a bifactor model, offers
multiple benefits. Firstly, it demonstrates the contribution of the items to a general factor of
WOAQ. Secondly, it provides information on the item contributions to subscales and
indicates the relative importance of the subscales. This procedure has practical implications
in organisational studies as it provides the researchers/practitioners with both a broad and a
detailed picture of WOAQ in a given setting. The general factor of WOAQ highlights if any
problems exist within the organisation. If so, the subscales of WOAQ would highlight the
more critical points that need attention.
In Chapters 6 to 8, a bifactor model of WOAQ will be validated and cross validated
across gender in a nursing and paramedics setting. The results will be compared with a
higher-order model.
5
THE VALIDITY OF FORMATIVE MEASUREMENT MODELS VERSUS
REFLECTIVE MODELS
By default, many researchers use reflective models, usually without precise
evaluation of the model (Diamantopoulos & Winklhofer, 2001). As a result of ensuing
model misspecification, two types of error may be caused (Type I and II errors). Recently
researchers in information systems (IS), leadership, management and marketing have
highlighted problems of misspecification in measurement model construction
(Diamantopoulos & Winklhofer, 2001; Jarvis, MacKenzie & Podsakoff, 2003; Podsakoff,
Shen, & Podsakoff, 2006).
As a result of this type of misspecification, some of the findings in the literature
may be misleading (Jarvis et al., 2003; MacKenzie et al., 2005; Petter, Straub, & Rai, 2007).
As mentioned by Jarvis et al. (2003), construct misspecification issues can lead to “serious
consequences for the theoretical conclusions drawn from the model” (p. 212). The extent of
this misspecification problem has been studied in several areas but never in organisational
psychology (Diamantopoulos & Winklhofer, 2001; Jarvis et al., 2003; Petter et al., 2007).
Therefore the four key aims of this chapter are to:
a) Distinguish between reflective and formative SEM models,
b) Review some of the literature to identify the extent of possible SEM measurement
model misspecification problems in the area of organisational psychology,
c) Present an empirical example of misspecification using the Work Ability Scale
(WAS), and
d) Propose a framework for distinguishing formative from reflective models.
It is hoped that the findings will assist researchers to distinguish which types of
measurement models to use for their research. This chapter is organised into four sections
based on the above aims.
5.1 Differences between Formative and Reflective Models
In 1973, the Swedish statistician Karl Jöreskog combined the factor analytic work of
two psychometricians, Charles Spearman and Louis Thurstone, and the path analysis work
of the geneticist Sewall Wright, to develop what is now known as SEM (Cunningham,
2008). For more than four decades, the SEM technique and the computer program LISREL,
which was the result of Jöreskog's work, have aided the mapping of interrelated constructs in
broad areas of study. More programs have since been developed (AMOS, EQS, Mplus),
extending the scope and simplifying the application of this technique.
SEM distinguishes between two different measurement models: reflective and
formative. When indicators are affected by a latent variable, reflective models are
appropriate. Yet in many settings, indicators may be considered as the cause of latent
variables, making formative models more appropriate.
By default, most researchers assume that models are reflective, although many
scholars (e.g. Blalock, 1971; Bollen, 1984; Diamantopoulos and Winklhofer, 2001; Jarvis
et al., 2003; Petter et al., 2007) have alerted researchers to the relevance of formative
models in some specific situations. This advice is unfortunately ignored in much of the
research literature.
The following section compares reflective and formative models conceptually.
5.1.1 First-order Reflective and Formative Models
In classical test theory, indicators (items) are considered to be dependent on a latent
variable, in which case:
xᵢ = λᵢξ + δᵢ Equation 5.1
where xᵢ (the ith indicator) is defined by the latent variable (ξ), the measurement error (δᵢ),
and the expected coefficient (λᵢ).
Such measures can be called reflective in that the items are indicators of a latent
factor (Fornell & Bookstein, 1982). Such models provide the trigger for reliability
evaluation and common/confirmatory factor analysis (Bollen 1989; Long, 1983; Nunnally,
1978). A simple first-order model of reflective measurement is represented in Figure 5.1, in
which the latent variable (ξ) is conceptualised as the common cause of three items or
indicators, identified as x₁, x₂, and x₃.
x₁= λ₁ξ+ δ₁
x₂= λ₂ξ+ δ₂
x₃= λ₃ξ+ δ₃
Figure 5.1. First-order reflective model
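As a minimal sketch of what Equation 5.1 implies, the simulation below generates three indicators from one latent variable and compares their observed correlations with the model-implied values λᵢλⱼ. The loadings, sample size and seed are hypothetical choices made for illustration, and the variables are standardized by assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
lam = np.array([0.8, 0.7, 0.6])   # hypothetical standardized loadings

xi = rng.standard_normal(n)       # latent variable with unit variance
# Error variances chosen so that each indicator also has unit variance
delta = rng.standard_normal((n, 3)) * np.sqrt(1.0 - lam ** 2)
x = xi[:, None] * lam + delta     # x_i = lambda_i * xi + delta_i

observed = np.corrcoef(x, rowvar=False)

# Under the reflective model, corr(x_i, x_j) = lambda_i * lambda_j for i != j
implied = np.outer(lam, lam)
np.fill_diagonal(implied, 1.0)

max_gap = np.abs(observed - implied).max()
print(np.round(observed, 2))
print(np.round(implied, 2))
```

Because the indicators share a common cause, they are necessarily correlated, which is why internal consistency is a sensible check for reflective measures.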
Conversely, based on the nature of the model, the indicators might cause the
construct (Bollen & Lennox, 1991). When the construct is formed from its indicators, a
formative model is suggested (Fornell & Bookstein, 1982). Equation 5.2 presents an
example of a formative model in which a weighted sum of the indicators (Σᵢλᵢxᵢ) represents
the construct (ξ) with an error (ζ):
ξ = Σᵢλᵢxᵢ + ζ Equation 5.2
Figure 5.2 presents an example of a first-order formative model in which the causal action
flows from the indicators (x₁, x₂, x₃) to the composite variable (ξ).
ξ=λ₁x₁+λ₂x₂ +λ₃x₃+ζ Equation 5.3
Figure 5.2. First-order formative model
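The formative case in Equation 5.3 can be sketched in the same way. In the illustration below the three indicators are generated independently and the construct is built as their weighted sum plus a disturbance; the weights, disturbance size and seed are hypothetical. The sketch shows two properties of formative measures: the indicators need not correlate with each other, and dropping an indicator changes the composite itself.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
weights = np.array([0.5, 0.3, 0.2])    # hypothetical indicator weights

x = rng.standard_normal((n, 3))        # indicators generated independently
zeta = 0.1 * rng.standard_normal(n)    # disturbance at the construct level
xi = x @ weights + zeta                # xi = sum_i lambda_i * x_i + zeta

# The indicators form the construct but are essentially uncorrelated
indicator_corr = np.corrcoef(x, rowvar=False)
max_offdiag = np.abs(indicator_corr - np.eye(3)).max()

# Removing x1 yields a different composite, not a noisier copy of the same one
xi_without_x1 = x[:, 1:] @ weights[1:] + zeta
corr_full_vs_reduced = np.corrcoef(xi, xi_without_x1)[0, 1]

print(round(max_offdiag, 3), round(corr_full_vs_reduced, 3))
```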
5.1.2 Higher-order Reflective and Formative Models
The reflective and formative models specified in Equation 5.1 and 5.2 are examples
of first-order reflective and formative measurement models. However, constructs are often
operationalised as multidimensional units (Diamantopoulos, 2010; Edwards and Bagozzi,
2000). When a number of dimensions or related first-order constructs form a latent factor, it
is considered a multi-dimensional construct. In a multi-dimensional construct, dimensions
can be conceptualised under an overall concept or a second-order construct (Law, Wong,
and Mobley, 1998), or both, as was seen in the bi-factor model in the last chapter.
In second-order constructs, two levels of constructs exist: the first-order level with
indicators and the second-order level with first-order constructs (Jarvis et al., 2003). As
illustrated in Figure 5.3, a reflective-reflective higher-order model has a reflective model
for each of the first-order constructs as well as a reflective model for the second-order
construct (η):
ξᵢ = γᵢη + rᵢ Equation 5.4
where the construct η is conceptualised as a second-order latent variable upon which the
first-order latent constructs (ξᵢ) are dependent, with measurement error (rᵢ) for each of these
first-order constructs and expected coefficients (γᵢ).
Figure 5.3. Higher-order reflective-reflective measurement model
(Path diagram: η points to ξ₁ and ξ₂ with coefficients γ₁, γ₂ and residuals r₁, r₂; ξ₁ points to X₁ to X₃ and ξ₂ to X₄ to X₆ with loadings λ₁ to λ₆ and errors δ₁ to δ₆.)
Higher-order effects between constructs can also be incorporated as a formative
model in which:
η = Σᵢγᵢξᵢ + ζ
A higher-order formative-formative model is presented in Figure 5.4 as an example.
In the model each first-order construct is represented as a formative model while the
second-order construct (η) is also represented as a formative construct.
Figure 5.4. Higher-order formative-formative measurement model
(Path diagram: X₁ to X₃ form ξ₁ and X₄ to X₆ form ξ₂ with weights λ₁ to λ₆; ξ₁ and ξ₂ form η with weights γ₁, γ₂; disturbance terms are attached to the constructs.)
5.2 Applications of Formative Models
The most common uses of formative models include:
- Creating an induced latent variable
- Creating a block variable
- Illustrating the influence of an experimental intervention on a construct (Edwards &
Bagozzi, 2000).
Creating an induced latent variable is one of the common uses of formative models.
An example of an induced latent variable is presented by Crossley, Bennett, Jex and Burnfield
(2007) in their study concerning the development of a measure of job embeddedness. Job
embeddedness represents “a broad array of influences on employee retention. The critical
aspects of job embeddedness are (a) the extent to which the job and community are similar
to, or fit with, the other aspects in a person's life space, (b) the extent to which this person
has links to other people or activities and, (c) what the person would sacrifice if he or she
left”. These aspects are important both on the job and off the job (Holtom, Mitchell, & Lee,
2006, p. 320). Composite job embeddedness in this study is operationalised by three main
measures: organisation- and community-fit (“an employee's perceived compatibility or
comfort with an organisation and with his or her environment”, p. 320); links (“formal or
informal connections between an employee and institutions or people”, p. 320); and sacrifice
(“the perceived cost of material or psychological benefits that are forfeited by
organizational departure”, p. 320). Each construct represents a different aspect of job
embeddedness. In this example, all three constructs define the construct of job embeddedness,
allowing the construction of a job-embeddedness index.
Some other examples of induced latent variables are social support indices, which
include items that capture different aspects of social support (MacCallum & Browne, 1993),
and a socioeconomic status (SES) index, created as a function of education, income and job
status (Bollen & Lennox, 1991). In the latter instance, the combination of three diverse variables
(income, education, occupation) allows the construction of an SES index.
First-order formative models can also be used for creating block variables. A block
variable is a single construct which summarises the influence of a block of several variables
on one or more outcome variables (Edwards & Bagozzi, 2000). In such cases, variables which
constitute the block variable usually illustrate the distinctive causes of the outcome. This
type of formative model was well-illustrated by Howell, Breivik, & Wilcox (2007), using
the study of family socialisation by Heise (1972). A block variable called “family
socialisation” was introduced by Heise (1972), which was a construct formed by the
mother/father’s liberalism, and other unspecified (disturbance) variables (Edwards &
Bagozzi, 2000).
Finally, another common application of formative modeling can be seen in studies
which involve intervention and the assessment of intervention effects on a construct
(Bagozzi, 1977; Costner, 1971). For instance, in an experimental study (Costner, 1971), a
fatigue construct was manipulated by depriving participants of sleep (indicator). In such
experimental studies that involve intervention, the measures can be considered as formative
constructs (Edwards & Bagozzi, 2000; Bagozzi, 1977; Costner, 1971).
5.3 Developing a Framework for Distinguishing Reflective-Formative Models
In this section an attempt is made to develop a clear and well-defined decision-
making framework for assessing whether a reflective or a formative model is appropriate.
Unfortunately, there is little information and few practical guidelines for distinguishing
reflective and formative models. The major work in this area was introduced by Jarvis et al.
(2003) and Diamantopoulos and Winklhofer (2001), and then extended by Petter, Straub,
and Rai (2007) and Coltman, Devinney, Midgley, and Venaik (2008).
What is presented here is a practical decision-making tree for evaluating reflective
and formative models of measurement, based mainly on a review of the works of Jarvis et
al (2003), Petter, Straub, and Rai (2007) and Diamantopoulos and Winklhofer (2001).
The background theory. The first step in identifying formative vs. reflective models
is to refer to the relevant background theory, to determine whether a construct is typically
viewed as a formative or reflective construct. This is usually considered to be the best way
of distinguishing between formative and reflective models. If there is doubt in the literature
or there are no solid theoretical frameworks available, then the following criteria might help
researchers in distinguishing between formative and reflective models. These criteria are
based mainly on the guidelines proposed by Jarvis, MacKenzie and Podsakoff (2003).
Direction of causality. The next step involves consideration of the direction of
causality between each construct and their indicators. As suggested by Jarvis et al (2003),
the researchers need to know, in the first instance:
1) Whether the items explain the latent factor, or the latent factor explains the
indicators. In formative models, the indicators influence the latent factor or
“composite” variable (MacKenzie et al., 2005). But if the indicator items
manifest or represent the latent factor, so that they are driven by it, a reflective
model is suggested.
2) The nature of changes in the latent factor. In formative models, the measurement
error is at the factor level; the latent factor is partially explained by random error
and is not fully explainable by its items. Any change in an item would lead to a
change in the latent factor, but not vice versa. The opposite is true in reflective
models; the measurement errors are at the item level, therefore, any change in the
indicator does not necessarily result in a change in the latent factor. However, any
change in the latent factor would result in a change in the items (Jarvis et al, 2003;
Petter, Straub, & Rai, 2007).
The interchangeability of the measures. The third step involves examining the
interchangeability of the measures (Jarvis et al, 2003):
1) The similarity of contents of the indicators. In reflective models, measures are
interchangeable and follow a common theme. Employing different themes suggests
formative measures which are not interchangeable.
2) Changes in the indicators. In formative measures, the latent factor is explained by
its items; removing any item of a formative factor would influence the meaning of
the latent or composite factor. In reflective models, however, removing an indicator
would not affect the meaning of the latent factor because they are outcomes of the
construct and not the cause (Jarvis et al., 2003; Petter, Straub, & Rai, 2007).
Co-variation among measures. The fourth step involves consideration of the
correlations among indicators; in other words, would variation in one indicator be
correlated with the variations in other indicators (Jarvis et al, 2003). In formative models,
because a construct is formed by different indicators, high correlations between the
indicators are not expected. The indicators in such models might represent totally different
content. However, with reflective models, because the indicators are manifestations of the latent
factor, high correlation between indicators is required. This implies collinearity among the indicators,
which is desirable for reflective measures. That is why establishing an acceptable
level of internal consistency is required for reflective models while it is not really
appropriate for formative models.
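The point about internal consistency can be sketched with coefficient alpha computed on simulated data. This is an illustration under stated assumptions, not an analysis from the thesis: the function name, loadings, sample size and seed are hypothetical. Alpha is high for indicators that share a common cause and close to zero for unrelated formative indicators, even though the latter may form a perfectly valid composite.

```python
import numpy as np

def cronbach_alpha(items):
    """Coefficient alpha for an (n_respondents, k_items) array of scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    sum_item_vars = items.var(axis=0, ddof=1).sum()   # sum of item variances
    total_var = items.sum(axis=1).var(ddof=1)         # variance of the sum score
    return k / (k - 1) * (1.0 - sum_item_vars / total_var)

rng = np.random.default_rng(2)
n = 50_000

# Reflective block: four indicators driven by one latent factor (loading 0.8)
factor = rng.standard_normal(n)
reflective = factor[:, None] * 0.8 + 0.6 * rng.standard_normal((n, 4))

# Formative block: four unrelated causes of a composite
formative = rng.standard_normal((n, 4))

alpha_reflective = cronbach_alpha(reflective)
alpha_formative = cronbach_alpha(formative)
print(round(alpha_reflective, 2), round(alpha_formative, 2))
```

A low alpha for the formative block is therefore not evidence of poor measurement, which is why the criterion is applied to reflective models only.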
Nomological net of the latent factor indicators. The final decision rule is based on
the following criterion for reflective models: the indicators share the same antecedents and consequences.
With formative constructs, it is not expected that the observed variables have similar
predictors or outcomes. This is because the composite factors are formed by indicators that
are not necessarily correlated nor do they necessarily share the same content. Conversely,
with reflective models, due to the interchangeability of reflective indicators, the same
patterns of antecedents and consequences are expected for all indicators (Jarvis et al., 2003;
Petter, Straub, & Rai, 2007). Depending on the extent to which this criterion is met, the
researcher will be able to decide if it is a reflective or formative construct.
A summary of the above is presented in Figure 5.5 in the form of a decision tree.
These decision rules will be used in an examination of the organisational psychology
literature in the next section. However, although using this guideline helps to identify a
reflective or formative construct, in practice many constructs are mixed. In other words, a
construct may have some items consistent with formative constructs and other items which are
consistent with reflective constructs.
Figure 5.5. The developed framework for assessing formative vs. reflective measurement models. The decision tree works through four criteria in turn (direction of causality, the interchangeability of the indicators, co-variation among measures, and nomological net of the factor) and ends in either a reflective model or a formative model. Acknowledgement: The main contents of this framework are based on the guidelines proposed by
Jarvis et al. (2003), Journal of Consumer Research, 30.
5.4 Measurement Model Misspecification in Organisational Psychology Literature
5.4.1 Empirical Evidence on Measurement Model Misspecification
In recent times, some researchers have focussed on misspecification in
measurement models. One of the earliest studies which highlighted misspecification in
formative constructs is that of Jarvis et al. (2003). In this study, the extent of
misspecification was assessed by reviewing four marketing journals (Journal of Marketing,
Journal of Marketing Research, Marketing Science and Journal of Consumer Research). A
29 per cent model misspecification rate was reported.
Fassott followed in 2006 (cited in Diamantopoulos, Riefler, & Roth, 2008),
reviewing three German management journals (Zeitschrift für Betriebswirtschaft,
Zeitschrift für betriebswirtschaftliche Forschung and Die Betriebswirtschaft). A 35 per cent
misspecification rate was reported.
In a similar process, Podsakoff et al. (2006) reported a misspecification rate of 62
per cent after reviewing the three most important strategic management journals (Academy
of Management Journal, Administrative Science Quarterly and Strategic Management
Journal). Similar results were also reported for leadership research (47 per cent
misspecification) by Podsakoff, MacKenzie, Podsakoff and Lee (2003) based on articles
published in The Leadership Quarterly, Journal of Applied Psychology, and Academy of
Management Journal.
Petter et al. (2007) examined complete volumes of MIS Quarterly and Information
Systems Research over three years. The study reported a 30 per cent misspecification for
formative constructs.
In more recent times, Roy et al. (2012) reviewed four journals in the area of
production, manufacturing and operations management (Journal of Management Science,
Journal of Operations Management, Decision Sciences Journal, and Journal of Production
and Operations Management Society) published between 2002 and 2006. They reported a
misspecification rate of 42.5 per cent.
In summary, the existing studies show a significant degree of misspecification in the
disciplines of information systems (IS), leadership, management and marketing. The
question is: “To what extent does misspecification exist in other disciplines such as
psychology?” To the researcher’s knowledge, no such study has been conducted in the area
of organisational psychology, hence the need for this study. The hypothesis of this study is
that:
Hypothesis 5.1: There is some degree of misspecification in formative vs reflective
measurement models in the organisational psychology literature.
5.4.2 Literature review strategy
The methodology for identifying misspecification in the recent organisational
psychology literature is discussed first. To assess the prevalence of measurement
model misspecification, articles published within a nine-year period between 2006 and
2014 in two high profile journals - Journal of Applied Psychology and Personnel
Psychology - were reviewed. While this is not a broad review of the literature, it is
reasonable to assume that a problem exists if construct misspecifications are found in these
highly cited journals in the discipline. Misspecified models carry a likelihood of Type I or II
errors in the reported results. If a problem of misspecification exists, then
there is a need to take action and pay more attention to this neglected area of study.
The following inclusion criteria were followed in this review:
- Papers with measurement models
- Constructs measured by two or more items.
The exclusion criteria were as follows:
- Papers consisting only of single-item measures
- Papers that did not report their measurement items.
Based on these criteria, a total of 301 studies were considered in the analysis (See
Appendix H). The measurement items for each construct were examined by two researchers
independently, using the decision making framework provided in Figure 5.5. If both
researchers agreed that at least one construct was misspecified (e.g. modelled as reflective
while it should be formative or vice versa), the study was coded as misspecified, on the
grounds that misspecification for any construct could lead to error.
5.4.3 Inter-rater Reliability
IBM SPSS Statistics (SPSS) for MS Windows Release 21.0 (SPSS Inc., Chicago,
IL) was used to analyse these data. Cohen’s Kappa was applied to measure the inter-rater
reliability of the decisions of the two researchers (the student and her principal supervisor),
both experts in SEM and organisational psychology studies. Using only two researchers to
rate the measures is considered to be one of the limitations of this review. All the papers
were examined and the appropriateness of formative and reflective models was judged by
both raters in each case. Cohen’s Kappa measures the level of agreement between
raters corrected for chance agreement, with a result higher than 0.70 indicating good agreement between raters.
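For readers who want to see how the statistic is formed, the sketch below computes Cohen's Kappa directly from two lists of codes. The thesis used SPSS for this step; the function, the category labels ("R", "F", "M" for reflective, formative and mixed) and the ten example ratings are all hypothetical and are not the review data.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's Kappa for two raters coding the same items into categories."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the raters agree
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    categories = set(freq_a) | set(freq_b)
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / n ** 2
    return (observed - expected) / (1.0 - expected)

# Hypothetical codes for ten constructs (not the thesis data)
rater_a = ["R", "R", "R", "R", "R", "R", "F", "F", "F", "M"]
rater_b = ["R", "R", "R", "R", "R", "F", "F", "F", "F", "M"]

kappa = cohens_kappa(rater_a, rater_b)
print(round(kappa, 3))
```

In this example the raters agree on 9 of 10 items while chance agreement is 0.43, so Kappa is lower than the raw agreement rate of 0.90. The sketch does not handle the degenerate case where chance agreement equals 1.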
5.4.4 Results of the Review
A high level of agreement was obtained between the raters (Cohen’s Kappa=0.89),
suggesting that the classification of the articles based on the guidelines provided in Figure
5.5 was reliable. The findings of this review are summarised in Table 5.1.
Table 5.1
Measurement Model Classification
                         Should be     Should be     Should be     Total
                         Reflective    Formative     Mixed
Modeled as Reflective    215           39            16            270 (90%)
Modeled as Formative     0             21            0             21 (7%)
Modeled as Mixed         0             0             10            10 (3%)
Total                    215 (71%)     60 (20%)      26 (9%)       301 (100%)
* A total of 301 studies from articles published in the Journal of Applied
Psychology and Personnel Psychology between 2006 and 2014 were reviewed.
A misspecification level of 18 per cent (55/301) was found in this review. The
misspecification involved misspecifying a formative model as reflective (39 studies) or a mixed model
as a fully reflective model (16 studies). Not surprisingly, the majority of the studies (90%) by default
considered measurement models as reflective. Unfortunately, there is no similar
misspecification study in this area to allow a comparison. However, higher percentages have
been found in other disciplines, as explained previously. As mentioned previously, the
results of such misspecification in measurement models can lead to Type I or II errors.
5.4.5 Discussion
The issue of measurement model misspecification is a critical topic in
measurement modelling. As mentioned before, the majority of scholars by default consider
measurement models as reflective, which leads to misspecification. As indicated in
previous studies (Jarvis et al., 2003; Petter et al., 2007; Roy et al., 2012), model
misspecification can bias the parameter estimation leading to Type I and II errors and
incorrect conclusions. Although a higher degree of misspecification has been reported in
other disciplines (e.g. Jarvis et al., 2003; Podsakoff et al., 2006; Petter, Straub and Rai,
2007; Roy et al., 2012), the finding of an 18 per cent misspecification rate in two
prestigious organisational psychology journals is nevertheless substantial. If such a
percentage of misspecification is found in top-ranked journals, higher
misspecification rates might be expected in journals with less influence.
Given the reported problem of misspecification in the field, greater attention to
measurement model specification is imperative. A lack of awareness about the nature of
formative constructs could be one of the reasons for misspecification. As demonstrated by
previous studies (e.g. Jarvis et al., 2003; Petter et al., 2007) and as shown in Table 5.1, in
all the misspecified studies, researchers had miscategorised formative or mixed constructs as
reflective rather than the reverse.
What is needed is a simple but comprehensive framework to distinguish formative
and reflective measures. Also, it is important to ask why formative models are frequently
misspecified as reflective models. The problems that occur with the fitting of formative
models are partly to blame. Overall it is easier to fit reflective models. This topic is
discussed later using empirical examples in the context of a work ability measurement
model.
The review, however, is not without limitations. One of the main limitations of this
review is the use of only two researchers for rating the measurement models. Reviewing only
two journals is another limitation of the review, which limits the
generalizability of the results.
5.5 Summary and Conclusion
In this chapter an introduction was provided to distinguish formative
from reflective measurement models. A simple and easy to understand framework for
distinguishing formative models from reflective models was proposed. The
misspecification of formative vs reflective models, along with the possible outcomes of
misspecification, was discussed. Then it was demonstrated how big the problem of
formative model misspecification is in the organisational psychology discipline. Using a
literature review of misspecification over a nine-year period in two highly
ranked journals in the discipline of organisational psychology, the misspecification rate was
demonstrated for the first time in this discipline.
In study 3 an example of misspecification involving the measurement of work
ability using the WAS measure is presented. In this study it will be empirically
demonstrated how different model specification/misspecification can yield different results
for a measurement model. The initial second-order WAS model will be re-examined using
reflective-reflective, formative-formative and reflective-formative models. Based on the
guidelines provided in Figure 5.5, and theoretical background, the model should be fitted as
reflective-formative (reflective for first-order constructs and formative for the second-order
construct). In Chapters 12 to 14, therefore, the validity and reliability assessments of the
correctly specified reflective-formative model of WAS are conducted using Partial
Least Squares SEM. The results will be compared and discussed with those obtained from
the misspecified reflective-reflective and formative-formative models of WAS, along with
the implications for the discipline.
6
STUDY 1: MODEL-BASED RELIABILITY, VALIDITY AND CROSS
VALIDITY OF BIFACTOR MODEL FOR WOAQ
In Chapters 6 to 8 the validity, cross validity and model-based reliability of the
Work Organisation Assessment Questionnaire (WOAQ) are assessed for a sample of nurses
and a sample of paramedics using a bi-factor model. This chapter introduces the data and
the relevant theory and hypotheses, Chapter 7 reports the results and Chapter 8 discusses
and summarises the implications of the results.
The Work Organisation Assessment Questionnaire (WOAQ) was previously
validated by Griffiths, et al. (2006) in a study involving manufacturing workers. In this
study, WOAQ was viewed as a bi-factor measure, including a general measure of the Work
Organisation Assessment Questionnaire (WOAQ) and five nested subfactors, with each
subfactor representing different dimensions of work organisation risk assessment. The five
nested subfactors are: quality of relationships with management, reward and recognition,
workload issues, quality of relationships with colleagues, and quality of the physical
environment.
This study is the first of its kind to be conducted on two groups of employees in an
Australian health setting. As mentioned in Chapter 4, in recent years bifactor modeling has been
gaining popularity among scholars of different disciplines. However, applications of the
bifactor model in the field of organisational psychology have been very limited. Lack of
knowledge or lack of information on the advantages of a bifactor model over a higher order
model in specific contexts may be among the main reasons for this neglect (Reise, 2012).
The focus of study one is validation of the Work Organisation Assessment Questionnaire
(WOAQ) which was originally proposed by Griffiths et al. in 2006. Although the scale has
been used in many studies since its development, there is little work evaluating its validity.
Among these studies, a poor fit is reported for WOAQ using a second-order model and
none of these studies employed bifactor modeling for validity assessment. In this research,
Study 1 consisted of three sub-studies which are discussed below.
a) Validation of the Work Organisation Assessment Questionnaire (WOAQ) for
nurses. This study presents a validity assessment of the WOAQ for nurses, in
which a conventional second-order model of WOAQ will be compared with a
bifactor model. No other such study has been undertaken in an Australian health
setting using such a broad and rigorous examination of model-based reliability and
validity procedures. In particular, the bifactor model of Work Organisation
Assessment Questionnaire (WOAQ), including its general factor and five sub-
factors, will be assessed in terms of construct validity.
b) Model-based reliability of WOAQ. The conventional coefficient alpha reliability
measures for WOAQ will be compared with the model-based reliability of Omega
total, Omega hierarchical and Omega subscales. Based on the literature on
multidimensional scales, the coefficient alpha overestimates reliability. Model-
based reliability coefficients are expected to provide more accurate reliability
measures for multidimensional models and/or when the assumptions for coefficient
alpha are not met.
c) Cross validation of the Work Organisation Assessment Questionnaire (WOAQ)
across gender among paramedics. The final section of study one considers the
cross-validity of the Work Organisation Assessment Questionnaire (WOAQ) across
gender considering only the paramedics sample. The best fitting model of Work
Organisation Assessment Questionnaire (WOAQ), obtained in the assessment of
cross-validity across the nurse and paramedic samples, will be tested for invariance
for males and females using the means and covariance structures (MACS) procedure. This allows a statistical
comparison of factor structures and observed means. MACS was first introduced
by Sörbom (1974) for the cross validation of SEM models. However, practical use
of this procedure is often neglected in the literature.
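The model-based coefficients named in (b) can be sketched for a bifactor model with standardized loadings and orthogonal factors. The code below is a minimal illustration under those assumptions: the function name and the six loadings are hypothetical, not WOAQ estimates, and the subscale coefficient is taken here as the share of subscale-score variance due to the specific factor alone (some authors call this omega hierarchical subscale).

```python
import numpy as np

def bifactor_omegas(general, specific, groups):
    """Omega total, omega hierarchical and per-subscale omega for a bifactor model.

    general, specific: standardized loadings of each item on the general factor
                       and on its own subfactor
    groups:            subscale label for each item
    """
    general = np.asarray(general, dtype=float)
    specific = np.asarray(specific, dtype=float)
    groups = np.asarray(groups)
    labels = np.unique(groups)

    # Residual variances, assuming standardized items and orthogonal factors
    unique = 1.0 - general ** 2 - specific ** 2

    gen_var = general.sum() ** 2
    spec_var = sum(specific[groups == g].sum() ** 2 for g in labels)
    total_var = gen_var + spec_var + unique.sum()

    omega_total = (gen_var + spec_var) / total_var   # all common variance
    omega_h = gen_var / total_var                    # general factor only

    omega_sub = {}
    for g in labels:
        mask = groups == g
        s = specific[mask].sum() ** 2
        omega_sub[str(g)] = s / (general[mask].sum() ** 2 + s + unique[mask].sum())
    return omega_total, omega_h, omega_sub

# Hypothetical loadings for six items in two subscales
general = [0.6, 0.6, 0.6, 0.5, 0.5, 0.5]
specific = [0.4, 0.4, 0.4, 0.5, 0.5, 0.5]
groups = ["A", "A", "A", "B", "B", "B"]

omega_total, omega_h, omega_sub = bifactor_omegas(general, specific, groups)
print(round(omega_total, 3), round(omega_h, 3), omega_sub)
```

The gap between omega total and omega hierarchical indicates how much of the reliable variance in the total score is due to the subfactors rather than the general factor.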
This chapter along with Chapters 7 and 8 present the rationale and objectives, the
methodology used, the results and discussion of the findings, the unique strengths of the
research, and possible directions for future related studies.
6.1 Rationale and Objectives
6.1.1 Validity of Bifactor Model of WOAQ
One of the greatest challenges for society is sustaining an individual’s health and
quality of life in the workplace (Cox, 1997). There is a broad body of research revealing
damage to health and wellbeing in workplaces. Increasing awareness of the possible
deleterious effects of work related factors on health has led to the enforcement of
regulations and the introduction of legislation in many developed countries to ensure that
organisations put the health of their employees as a high priority (Faragher, Cooper &
Cartwright, 2004). As a result, management has also been encouraged to conduct risk
assessments for psychosocial hazards with a view to ensuring employees’ health and safety
in the workplace (Rick & Briner, 2000).
There are several driving forces which contribute to making the workplace a less
convivial place for employees to work in. For instance, the growing competitiveness of the
marketplace, the constant need to improve organisation efficiency and profitability and
radical changes in employment conditions are amongst the major driving forces responsible
for increasing stress in the workplace (Faragher, Cooper & Cartwright, 2004). But in
particular, an inability to incorporate proper work design in the workplace leads to a
negative effect on both employees and organisations (Griffiths, et al., 2006).
Much of the attention in the occupational health and safety (OH&S) literature has
been focusing on linking this inability to incorporate a suitable work design with the right
assessment tools and decreasing negative work related outcomes for individuals and
organisations (Griffiths, et al., 2006).
The efficacy of an OH&S tool in assessing the risk factors in the workplace
environment depends on how well it is designed, implemented, and developed. A more
practical approach is required in order to obtain information from relevant respondents,
taking into consideration the nature of their work (LaMontagne, 2004).
Adapting such approaches to a specific work context provides a benchmark which
can be used to identify the main organisational hazards and to progressively improve OH&S
by improving safe work design and practices. The main challenge is to use a suitable
instrument to improve the capture of OH&S indicators.
Based on recommendations from previous studies, a good risk assessment process
can only be achieved by using multiple methods of assessment. A well-designed
assessment should recognize the risks in the workplace and also the employees at risk (The
Health and Safety Executive Guidelines, 2000). The organisational risk assessment is
obtained using questionnaire/survey scales. In order to evaluate risk and stress effectively,
this questionnaire must meet some important criteria such as being reliable and valid; easy
to complete; measuring the possible risks, their predictability of outcomes related to the
employees’ health, their size and impact on the target population; and applicable to both
organisations as a whole and at different work levels. To be able to meet such criteria, the
questionnaires are usually quite lengthy. As a result, the large amount of time it takes to
complete a questionnaire leads to a low response rate (Faragher, Cooper & Cartwright,
2004).
A short yet comprehensive risk assessment questionnaire is desirable. One such
instrument, the Work Organisation Assessment Questionnaire (WOAQ) developed by
Griffiths et al. (2006), may be able to overcome problems identified in previously validated
measures because it is short yet comprehensive in content. The methodology
developed in WOAQ was based on identifying and collecting employees’ opinions on their
work, health, and their workplace design and management (Griffiths, et al., 2006). It was
designed to measure risk factors pertaining to the work design and management which may
influence employee health and health related behaviours in a manufacturing setting
(Griffiths, et al., 2006; Wynne-Jones et al., 2009). The overall score on WOAQ indicates
the extent to which the respondents believe that these dimensions of work are good and can
be used as predictors of wellbeing, subjective health and job satisfaction. A high score on
WOAQ indicates that the respondents perceive dimensions of work as good, and a low
score on WOAQ indicates that the respondents perceive dimensions of work as problematic
(Griffiths, et al., 2006).
The WOAQ was initially developed for a manufacturing setting and implemented in
the private sector; however the comprehensive approach to the risk assessment means that
this questionnaire may be used in other settings including non-manufacturing or health
settings.
It is therefore important to check if the WOAQ can be implemented effectively in
other work settings or professions (Wynne-Jones et al., 2009). Only a few studies have
evaluated the application of WOAQ in other workplaces. For example, Wynne-Jones et al.
(2009), in their study of two large public sector organisations in South Wales, evaluated
the validity and reliability of WOAQ in the public sector. Using a higher order CFA, the
researchers only found a marginal fit for the original five subfactors of WOAQ. In the end
they identified a two-factor structure linked to four of the five scales of the WOAQ,
assessing Management and Work Design, and Work Culture. One of the aims in this study
is therefore to find out if the general and five subfactors of WOAQ can be implemented in a
non-manufacturing, health setting in Australia. Also, in addition to the conventional higher
order model of CFA used frequently by other scholars in the field (including the
Wynne-Jones et al. study), a more practical bifactor model will be used to assess the general factor of
WOAQ and the plausibility of its five subfactors in an Australian community nursing
setting. As fully discussed in Chapter 4, a bifactor model of WOAQ is deemed to deliver a
better fit and more valuable information in such contexts. It is therefore hypothesised that:
Hypothesis 6.1: A bifactor model of WOAQ has acceptable construct validity in a
non-manufacturing, health setting in Australia.
Hypothesis 6.2: A bifactor model of WOAQ has superior fit over the conventional
higher order, five-factor model of the WOAQ.
In this study covariance-based SEM is used to fit reflective models to the WOAQ.
This allows the evaluation of model fit using conventional goodness of fit measures. In
addition it allows the extraction of model-based measures of reliability. It also allows the
use of invariance tests for comparing the cross-validity of models for different groups (e.g.
nurses and paramedics, males and females) as described below.
6.1.2 Model-based Reliability
One of the commonly used measures for reliability is coefficient alpha which was
proposed originally by Cronbach in 1951. Coefficient alpha was developed for only one-
dimensional scales and is therefore not appropriate for multidimensional constructs as
discussed previously (Sijtsma, 2009; Zinbarg, Revelle, Yovel, & Li, 2005). In the case of
multidimensional scales, coefficient alpha may lead to overestimation of the reliability
(Cortina, 1993).
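To make the conventional estimate concrete, coefficient alpha can be computed from item scores as k/(k - 1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. The Python sketch below is an illustration only and is not part of the original analysis; the function name and the example responses are hypothetical.

```python
from statistics import variance


def cronbach_alpha(items):
    """Coefficient alpha for a list of item score lists (one list per item)."""
    k = len(items)
    n = len(items[0])
    if k < 2 or any(len(scores) != n for scores in items):
        raise ValueError("need at least two items with equal numbers of respondents")
    # Total score of each respondent across all items.
    totals = [sum(scores[r] for scores in items) for r in range(n)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


# Hypothetical responses of five people to three items on a 5-point scale.
example_items = [
    [2, 3, 3, 4, 5],
    [1, 3, 3, 4, 4],
    [2, 2, 4, 4, 5],
]
alpha = cronbach_alpha(example_items)
```

Because alpha uses only item and total-score variances, it carries no information about which factor produces the shared variance, which is why the model-based coefficients are considered alongside it.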
However, model-based reliability assessments for multi-dimensional scales were
provided many years ago by Bentler (1968) and Heise and Bohrnstedt (1970) and, more
recently, by Bentler (2007, 2009) for factor analytic types of models, and, in a generalised
form, for any structural equation model with additive errors. Although reliability for a
general SEM model is rationalised based on the model’s multidimensional structure, it
should be noted that a uni-dimensional model-based coefficient, which we will call ρ or
rho, still quantifies the proportion of variance due to the most reliable single dimension in
multidimensional space (Bentler, 2007). However, there are a few empirical studies that
have also reported reliability coefficients such as omega hierarchical, omega subscale and
omega total that are suitable for bifactor models with multiple subscales (e.g. Gignac &
Watkins 2013; Reise, Bonifay, & Haviland, 2012; Zinbarg et al., 2012).
For the purpose of this study, both traditional (conventional) estimates (i.e. the
conventional Coefficient alpha) and more modern model-based reliability estimates of
Omega (i.e. omega hierarchical, omega subscale and omega total) will be assessed and
compared for the bifactor WOAQ model.
Omega hierarchical (ωh) estimates how well the total test score measures the general
factor in a bifactor model (Revelle & Zinbarg, 2008; Zinbarg, Revelle, Yovel, & Li,
2005). It is a measure of the variance in total scores that arises from the general factor
running across all the items (Reise, Bonifay, & Haviland, 2013). Omega subscale (ωs) is
used to determine the degree of reliability of the proposed subscale scores after controlling
for the variance generated from the general factor (Reise et al., 2012). Omega total (ωt)
estimates the combined reliability for the general factor and the subscales (McDonald,
1978).
Reporting both conventional item based reliability estimates as well as all three
types of omega model-based reliability measures for a bifactor model will provide a more
detailed and comprehensive evaluation of the reliability for WOAQ. All the previous
studies on WOAQ reported only single item-based reliability measures which are not
appropriate given the multidimensional nature of the scale. These reliability measures will
be calculated and compared for the sample of paramedics.
It is hypothesised that:
Hypothesis 6.3: The model-based Omega reliability coefficients will provide
acceptable levels of internal consistency for the bifactor model of WOAQ for a sample of
paramedics.
Hypothesis 6.4: The conventional internal consistency reliability of alpha
coefficient overestimates the reliability of the WOAQ scale compared to the model-based
reliability coefficients of Omega for a sample of paramedics.
6.1.3 Cross Validation of Bifactor Model of WOAQ
An important aspect of a tool’s psychometric properties is its cross validity, or
whether the tool has good fit in other groups of individuals or populations. Once the
validity of a tool is established, it is time to assess its cross-validity. The main issue of
cross-validity is whether the validated tool fits well in more specific populations.
Measurement Model Invariance can be tested for two or more distinct samples using CFA.
If the model fits well in all the samples then it can be concluded that the model is
acceptable and valid across the corresponding populations. However, it is also necessary to
test whether the population parameters can be considered equal for all the samples. There
are several procedures for evaluating cross-validity in CFA models, each relating to a
hypothesis for a different set of key population parameters (e.g. Meredith, 1993; Widaman
and Reise, 1997; Byrne, 1995; Byrne & Watkins, 2003; Cheung & Rensvold, 2002). Little
(1997) categorises invariance testing into two major categories. The first type of invariance
procedure refers to evaluating the psychometric characteristics of the model parameters
(e.g. factor loading, measured-variable loading, variance/covariance of errors or factor
residuals) using analysis of covariance (COVS). This type of invariance must be
established before progressing to category two of invariance analysis, relating to invariance
in factor means (Cheung & Rensvold, 2002; Widaman and Reise, 1997). The invariance
analysis of mean and covariance structure (MACS) procedures for latent constructs was
first introduced by Sörbom (1974) for the cross validation of SEM models. Although
category one invariance testing (COVS) of measured (observable) parameters described by
Little (1997) is widely demonstrated by researchers, few scholars have paid attention to the
category two invariance testing of MACS (Cheung & Rensvold, 2002; Chen, Sousa, &
West, 2005; Vandenberg & Lance, 2000). In this study the invariance of all parameters for
the bifactor WOAQ model will be assessed between males and females in the paramedics’
sample.
The main technical aspects of invariance procedures, at both measurement and
construct levels, were introduced by Meredith (1993), Widaman and Reise (1997) and
Meredith and Horn (2001). The main limitation of the techniques they have introduced is
their design for only first-order models. For more complex bifactor or higher order models,
the literature on practical assessment techniques is more limited. As mentioned by Chen,
Sousa, and West (2005), previous scholars have not paid enough attention to the invariance
testing of bifactor or higher order models (e.g. Byrne, 1995; Byrne & Campbell, 1999;
Marsh & Hocevar, 1985). In particular, this is true in the case of MACS invariance tests for
bifactor models.
Thus, in this part of the study, invariance testing will be conducted on a bifactor
model of the WOAQ at both parameter and construct levels, building on the
recommendations of previous scholars (Byrne, 1995; Byrne & Watkins, 2003; Chen, Sousa, &
West, 2005; Cheung & Rensvold, 2002; Meredith, 1993; Meredith & Horn, 1997; Widaman &
Reise, 1997; Yap et al., 2014).
In order to assess the validity and the cross-validity of the WOAQ across the data in
this study, three analyses will be carried out across gender for the bifactor model WOAQ
with its five nested factors. In the first set of analyses, the validity of the bifactor model of
WOAQ will be independently tested for male and female employees from a paramedic
organisation. Invariance testing for the measures will be carried out at the second step. If
the model shows satisfactory invariance of the measures, then in the final analysis the
construct means can be tested for invariance using the MACS approach. In this analysis
only the paramedic sample is considered because there were too few males in the nursing
sample to make a valid comparison.
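The stepwise logic described above, in which each more constrained model is examined only if the previous one holds, can be sketched as follows. This Python example is illustrative and is not the procedure as implemented in the SEM software; the function name, the NNFI values and the use of an NNFI drop of .010 as the stopping rule are assumptions made for the sketch.

```python
def established_invariance_steps(models, max_nnfi_drop=0.010):
    """Return the labels of the nested models retained before fit first worsens.

    models: list of (label, nnfi) pairs ordered from least to most constrained.
    The first model is assumed to have already shown acceptable fit.
    """
    if not models:
        return []
    retained = [models[0][0]]
    for (_, previous_nnfi), (label, nnfi) in zip(models, models[1:]):
        # A drop in NNFI at or beyond the threshold is read as non-invariance.
        if previous_nnfi - nnfi >= max_nnfi_drop:
            break
        retained.append(label)
    return retained


# Hypothetical NNFI values for a multi-group bifactor model.
hypothetical_models = [
    ("configural invariance", 0.930),
    ("invariant factor loadings", 0.927),
    ("invariant factor means", 0.905),
]
retained_steps = established_invariance_steps(hypothetical_models)
```

With these hypothetical values the loadings constraint is retained but the factor means constraint is not, so testing would stop before concluding that the construct means are equal.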
It is evident that any occupational safety and health interventions should be
beneficial for both males and females. However, gender mainstreaming, or the gender-
sensitive approach to occupational health and safety (OH&S), has been recognised as
important to the OH&S agenda by the European Commission in its community safety and
health strategy 2002-06 (EU-OSHA – European Agency for Safety and Health at Work,
2014). Both male and female employees can benefit more from health interventions
that are developed using a gender-sensitive approach. To obtain such
equality in OH&S, it is critical to recognise the gender differences and as a result the
differences in their work organisations and the way they perceive working conditions. This
should not just be limited to the physical elements (such as designing safety gear
specifically fitted for women) but should also take into account the psychosocial elements
of the work setting. Therefore, in recognition of the importance of a gender-sensitive
approach, the main goal of this study is to examine whether work characteristics are
experienced in the same way by both genders and, in this way, validate the WOAQ across
male and female paramedics.
Therefore it is hypothesised that:
Hypothesis 6.5: Baseline invariance. There is a baseline invariance of the bifactor
CFA model of the WOAQ in that the model describes both female and male paramedics.
Hypothesis 6.6: Configural invariance. There is a configural invariance of the
bifactor CFA model of the WOAQ across gender in that the model describes the combined
data set well.
Hypothesis 6.7: Invariant factor loadings. The bifactor CFA model of the WOAQ
exhibits invariance across gender, even after constraining the factor loadings on observed
variables to be equal for males and females.
Hypothesis 6.8: Invariant factor means. The factor (construct) means of the bifactor
CFA model of the WOAQ are invariant for male and female paramedics.
6.2 Method
The data collection for both studies of nurses and paramedics along with the
measures, ethical considerations and data analysis are described below.
6.2.1 Nursing Participants
Data were collected from a sample of Australian nurses for the validation of the
WOAQ. The study design was cross-sectional. A self-report questionnaire was used to
capture demographic-work characteristics and the WOAQ described below.
A questionnaire package that included a cover letter, information sheet, consent
form, questionnaires, and reply-paid envelopes was forwarded to all potential participants.
Three weeks after the mail-out, a letter was forwarded to the employees to thank them for
their participation, or to ask if they could complete and return the questionnaire if they had
not already done so. A total of 334 surveys were returned. Some of the returned surveys
were incomplete with a high percentage of missing data, therefore the decision was made to
remove these incomplete surveys. After data cleaning and removing the incomplete data,
312 surveys were included in the final data analysis.
6.2.2 Paramedic Participants.
The paramedic data were collected from a large Australian health organisation
employing paramedics³. The study design was cross-sectional. A self-report electronic
questionnaire was used to capture the variables of interest anonymously. Nine hundred and
seventy nine responses were received from the paramedics. Of these, 33 were from
volunteer paramedics, and these were excluded from the final database.
6.2.3 Measures
The measurement scale used was the comprehensive Work Organisation
Assessment Questionnaire (WOAQ) consisting of 28 items pertinent to aspects of the
respondents’ work organisation (Griffiths, et al. 2006). Respondents were asked to rate how
problematic or good each of the items was for them in the last six months, with higher
scores representing better quality of work environment. It was assumed that the WOAQ
consisted of a general 28-item summative factor with a five sub-factor structure. The five-
factor structure of the scale included: workload issues, reward and recognition, quality of
relationships with management, relationships with colleagues, and physical environment.
3 The data were collected as part of a study on the prevention of work-related musculoskeletal disorders and the development of a tool kit for workplace users. For that study, psychosocial workplace hazards were recognised as a significant predictor of discomfort/pain levels and absenteeism due to sickness (Jodi & Macdonald, 2012). Therefore, the WOAQ data collected to quantify psychosocial workplace hazards in that study were also used in this study for evaluating covariate-dependent reliability and cross-validity.
6.2.4 Ethics
Human Research Ethics Committee approval was obtained from both the lead
university and the participating nursing and paramedic organisations.
6.2.5 Overview of Statistical Analysis
Normality of the data was assessed before conducting the CFA at both item level
and group level. At the first step of the validation process, the construct validity of a
bifactor model of WOAQ was compared with a higher order model. Although the responses
are captured on a 5-point ordinal scale they are treated as continuous normally distributed
variables. This is a limitation of the analysis, although ordinal variables with five categories
are usually treated as “continuous”. There is some evidence that this is unlikely
to have any significant practical impact on the results (e.g., Babakus, Ferguson,
& Jöreskog, 1987; Dolan, 1994; Johnson & Creech, 1983; Hutchinson & Olmos, 1998;
Rhemtulla, Brosseau-Liard, & Savalei, 2012). As demonstrated by some simulation
studies (Rhemtulla, Brosseau-Liard, & Savalei, 2012), for five to seven categories, robust
continuous methods of estimation, such as Maximum Likelihood (ML), will deliver similar
outcomes to categorical methods of estimation such as categorical Least Squares (cat-LS).
Also, as asserted by Rhemtulla, Brosseau-Liard, and Savalei (2012), the continuous
methods of estimation are very familiar to researchers while there is limited knowledge of
estimation methods for categorical data.
An important factor that was considered in choosing the suitable fit indices was the
degree of penalty included for model complexity. Based on the suggestion by scholars (e.g.
Gignac, 2013), for evaluation of bifactor models, it is better to choose those close-fit
indices that include relatively greater penalties for model complexity (i.e. RMSEA,
NNFI/TLI, & AIC).
The fit indices reported in this study are summarised as follows:
- The root mean square error of approximation (RMSEA)
- The Tucker-Lewis Index (TLI) or Non-normed fit index (NNFI)
- The Akaike Information Criterion (AIC)
RMSEA values of less than .08 and .05 (MacCallum, Browne, and Sugawara, 1996)
and NNFI values of greater than 0.90 and 0.95 (Hu & Bentler, 1999) were considered as
marginal and good fit levels respectively. The model comparisons will be performed
based on a practical improvement in NNFI. NNFI differences of at least .010 are taken to
show a significant difference in model fit, following Vandenberg & Lance (2000).
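The cutoffs above can be expressed as a small classification rule. The following Python sketch is illustrative only; the function names, the "below cutoff" label and the example values are hypothetical, while the thresholds are those stated in this section.

```python
def classify_rmsea(rmsea):
    """Label an RMSEA value using the .05 (good) and .08 (marginal) cutoffs."""
    if rmsea < 0.05:
        return "good"
    if rmsea < 0.08:
        return "marginal"
    return "below cutoff"


def classify_nnfi(nnfi):
    """Label an NNFI/TLI value using the 0.95 (good) and 0.90 (marginal) cutoffs."""
    if nnfi > 0.95:
        return "good"
    if nnfi > 0.90:
        return "marginal"
    return "below cutoff"


# Hypothetical fit statistics for a single model.
example_fit = {"rmsea": 0.06, "nnfi": 0.93}
example_labels = (
    classify_rmsea(example_fit["rmsea"]),
    classify_nnfi(example_fit["nnfi"]),
)
```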
The Akaike Information Criterion (AIC) is a comparative measure of fit which is
meaningful only when two different models are estimated. A smaller value of AIC and a
reduction (ΔAIC) of more than 10 indicates a superior model fit (Akaike, 1973; Raftery,
1995; Schwarz, 1978).
The chi-square goodness of fit test was also reported as the conventional, commonly
reported measure of fit in the literature. Traditionally, a chi square statistic is used for
assessing if the proposed model describes the data adequately. However, as acknowledged
by Hu & Bentler (1999), the chi square statistic is highly dependent on sample size and is
not appropriate for complex or non-normal data. The relative chi square (Chi-Square/DF) is
therefore preferred as a measure of model fit. For this statistic a value of 1 to 2 reflects
good fit, less than 3 represents acceptable fit (Kline, 1998), and less than 5 represents
adequate fit (Schumacker & Lomax, 2004).
The other commonly reported fit indices are the Standardized Root Mean Square
Residual (SRMSR) and Comparative Fit Index (CFI); however, neither of these fit indices
was considered in this study because they do not adequately penalise for model
complexity (Marsh, Hau, & Grayson, 2005; Gignac, 2013).
6.2.6 Model-based reliability
Model-based reliability coefficients of omega total, omega hierarchical and omega
subscales, and conventional item-based coefficient alpha will be used for testing the
reliability of the WOAQ. Only the R psych package (Revelle, 2013) calculates these omega
coefficients directly. In other SEM software such as AMOS and EQS, omega coefficients
can be calculated indirectly using what is known as a reliability index (Fan, 2003), which is
in fact the implied correlation between a latent variable and its corresponding composite
score (Gignac, 2007).
As recommended by Gignac (2014), a practical approach for the estimation of ωh
and ωs is “to estimate the (squared) correlation between latent variables within a bifactor
model and their corresponding equally weighted composites scores (known as phantom
variables) within structural equation modeling programs” (p. 9). Figure 6.1, demonstrates
an example for this procedure using EQS. The confidence intervals associated with the
reliability coefficients in this procedure can also be evaluated using a combination of the
phantom variable squared correlation approach and bootstrapping. Due to an identification
problem (having only two indicators for the ‘relationship with colleagues’ construct), the
method could not be used in this study. Instead, the omega coefficients were calculated
manually in an Excel spreadsheet, applying the formulae for the omega coefficients to the
factor loadings and error variances of the well-fitting WOAQ model.
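The link between the phantom variable approach and the formula-based calculation can be illustrated numerically. The Python sketch below builds the model-implied covariance matrix of the items, forms the equally weighted composite, and computes its squared correlation with the general factor. It is not the EQS procedure itself; it assumes orthogonal factors with unit variances and uncorrelated errors, and all names and loadings are hypothetical.

```python
def implied_covariance(general, specific, groups, errors):
    """Model-implied item covariance matrix for an orthogonal bifactor model."""
    k = len(general)
    cov = [[0.0] * k for _ in range(k)]
    for a in range(k):
        for b in range(k):
            value = general[a] * general[b]
            if groups[a] == groups[b]:
                # Items in the same subscale also share their group factor.
                value += specific[a] * specific[b]
            if a == b:
                value += errors[a]
            cov[a][b] = value
    return cov


def squared_corr_composite_general(general, specific, groups, errors):
    """Squared correlation between the unit-weighted composite and the general factor."""
    cov = implied_covariance(general, specific, groups, errors)
    composite_variance = sum(sum(row) for row in cov)
    # With unit weights and unit factor variance, the covariance is the loading sum.
    covariance_with_general = sum(general)
    return covariance_with_general ** 2 / composite_variance


# Hypothetical four-item example with two subscales.
general_loadings = [0.6, 0.5, 0.7, 0.4]
specific_loadings = [0.3, 0.4, 0.2, 0.5]
subscale_of_item = [0, 0, 1, 1]
error_variances = [0.55, 0.59, 0.47, 0.59]
omega_h_from_correlation = squared_corr_composite_general(
    general_loadings, specific_loadings, subscale_of_item, error_variances
)
```

Under these assumptions the squared correlation equals the squared sum of general factor loadings divided by the composite variance, which is the quantity defined as omega hierarchical.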
The formulae for omega hierarchical (ωh) and omega subscale (ωs) are provided
below for items i = 1, 2, …, k (k = 28) contributing to a general factor with loadings
λgi, and to five subscales with loadings λSj,i for items i = 1, 2, …, Sj in each subscale Sj, j = 1, 2, …, 5.
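$$
\omega_h = \frac{\left(\sum_{i=1}^{k} \lambda_{gi}\right)^{2}}{\left(\sum_{i=1}^{k} \lambda_{gi}\right)^{2} + \left(\sum_{i=1}^{S_1} \lambda_{S_1,i}\right)^{2} + \left(\sum_{i=1}^{S_2} \lambda_{S_2,i}\right)^{2} + \dots + \left(\sum_{i=1}^{S_5} \lambda_{S_5,i}\right)^{2} + \sum_{i=1}^{k} \mathrm{Var}(e_i)}
$$
Equation 6.1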
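$$
\omega_{s_j} = \frac{\left(\sum_{i=1}^{S_j} \lambda_{S_j,i}\right)^{2}}{\left(\sum_{i=1}^{S_j} \lambda_{gi}\right)^{2} + \left(\sum_{i=1}^{S_j} \lambda_{S_j,i}\right)^{2} + \sum_{i=1}^{S_j} \mathrm{Var}(e_i)}
$$
Equation 6.2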
where the items i = 1, 2, …, Sj all belong to subscale Sj for j = 1, 2, …, 5. Combining these
reliabilities, the total reliability of the five-factor measurement model is measured using
omega total (ωt):
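$$
\omega_t = \frac{\left(\sum_{i=1}^{k} \lambda_{gi}\right)^{2} + \left(\sum_{i=1}^{S_1} \lambda_{S_1,i}\right)^{2} + \left(\sum_{i=1}^{S_2} \lambda_{S_2,i}\right)^{2} + \dots + \left(\sum_{i=1}^{S_5} \lambda_{S_5,i}\right)^{2}}{\left(\sum_{i=1}^{k} \lambda_{gi}\right)^{2} + \left(\sum_{i=1}^{S_1} \lambda_{S_1,i}\right)^{2} + \left(\sum_{i=1}^{S_2} \lambda_{S_2,i}\right)^{2} + \dots + \left(\sum_{i=1}^{S_5} \lambda_{S_5,i}\right)^{2} + \sum_{i=1}^{k} \mathrm{Var}(e_i)}
$$
Equation 6.3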
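Equations 6.1 to 6.3 reduce to sums of loadings and error variances, so the spreadsheet calculation can be mirrored in a few lines of code. The Python sketch below is an illustration rather than the calculation actually used in this study; it assumes uncorrelated errors, and the function name, subscale labels and loadings are hypothetical.

```python
def omega_coefficients(general, specific, groups, errors):
    """Return (omega_h, omega_t, omega_s by subscale) for a bifactor model."""
    labels = sorted(set(groups))
    general_term = sum(general) ** 2
    # Squared sum of group-factor loadings within each subscale.
    group_terms = {
        label: sum(s for s, g in zip(specific, groups) if g == label) ** 2
        for label in labels
    }
    total_variance = general_term + sum(group_terms.values()) + sum(errors)
    omega_h = general_term / total_variance                                # Equation 6.1
    omega_t = (general_term + sum(group_terms.values())) / total_variance  # Equation 6.3
    omega_s = {}
    for label in labels:
        members = [i for i, g in enumerate(groups) if g == label]
        general_in_subscale = sum(general[i] for i in members) ** 2
        error_in_subscale = sum(errors[i] for i in members)
        omega_s[label] = group_terms[label] / (                            # Equation 6.2
            general_in_subscale + group_terms[label] + error_in_subscale
        )
    return omega_h, omega_t, omega_s


# Hypothetical four-item example with two subscales, "A" and "B".
omega_h, omega_t, omega_s = omega_coefficients(
    general=[0.6, 0.5, 0.7, 0.4],
    specific=[0.3, 0.4, 0.2, 0.5],
    groups=["A", "A", "B", "B"],
    errors=[0.55, 0.59, 0.47, 0.59],
)
```

In this hypothetical example omega total exceeds omega hierarchical because the subscale variance counts as reliable variance only in omega total.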
6.2.7 Cross-validation of WOAQ
The WOAQ was initially validated using the nursing data and was then cross-
validated on the paramedics data. Finally invariance was tested for males and females in the
paramedics sample. At the first step of this invariance analysis, the baseline bifactor model
was tested separately for males and females and at the second step the cross validity of the
WOAQ was assessed using invariance testing across gender.
Figure 6.1. A demonstration of Bifactor model of WOAQ with phantom variables for calculating omega coefficients on the right-hand side (Note that the phantom variable
paths are constrained equal to 1 creating equally weighted composite scores).
6.3 Summary
This study has both theoretical and empirical implications. The WOAQ was
originally developed and used in manufacturing settings. In addition, the previous studies
used a higher order model of WOAQ and some reported poor fit for the scale. To the best
of the researchers’ knowledge, no study has been conducted in a non-manufacturing, health
setting in Australia using a bifactor modeling procedure. The present study used data
collected from a group of Australian nurses and a group of paramedics to assess the
validity, cross-validity and model-based reliability of WOAQ, a well-designed instrument
for assessing work and organisational factors as potential risks to employee health. The
main aims of the study were:
1) To assess the validity of WOAQ in an Australian health setting.
2) To compare a bifactor model (nested factor models) with a conventional higher
order model of WOAQ using Confirmatory Factor Analysis (CFA).
3) To assess and compare model-based reliability coefficients of Omega
hierarchical, Omega subscales and Omega total with the conventional
coefficient alpha.
4) To assess the cross-validity of the Work Organisation Assessment
Questionnaire (WOAQ) on a group of paramedics.
5) To assess the cross-validity of the Work Organisation Assessment
Questionnaire (WOAQ) on male and female paramedics.
Unlike previous studies which used a higher-order Confirmatory Factor
Analysis (CFA) model for WOAQ, a bifactor modeling procedure was used in
this study. There is very limited literature on the invariance testing of bifactor
models, which makes this a novel research study.
7
STUDY 1: RESULTS
In this chapter, the study involving 312 nurses is used to validate the bifactor
WOAQ model. This model is then fitted for a sample of 945 paramedics and a test of
invariance is used to evaluate the cross-validity of this model for male and female
paramedics.
7.1 Results: Study of Nurses-Validation of Bifactor Model of WOAQ
In this chapter, the bifactor model for WOAQ is validated for the sample of nurses
described in Chapter 6. Descriptive statistics are presented and then goodness of fit
statistics and reliability measures are derived using a higher order model and a bifactor
model. The results indicate that the bifactor model is a more valid representation of
WOAQ for this sample.
7.1.1 Descriptive Statistics for Demographics
Table 7.1 presents the frequencies, means and standard deviations for the
demographic variables. The majority of the participants were female (94.5%) with an
average age of 45.19. The majority had more than 4 years’ experience working in a nursing
setting (97.1%). About 40% of the participants were working full-time with the remaining
60% part-time employees.
Table 7.1 Descriptive Statistics of the Demographic Variables*
Variable                               Category     Frequency (%)
Gender                                 Male         17 (5.5)
                                       Female       290 (94.5)
Contacts with clients (hrs/workday)    < 2          22 (7.1)
                                       2-4          30 (9.7)
                                       4-6          122 (39.4)
                                       6-8          134 (43.2)
                                       > 8          2 (0.6)
Years of experience (years)            < 1          2 (0.7)
                                       1-3          7 (2.3)
                                       4-6          16 (5.2)
                                       > 6          282 (91.9)
Employment status                      Part-time    183 (60)
                                       Full-time    123 (40)
Mean Age (SD)                                       45.19 (9.54)
* n varies between 306-312 due to some missing responses
7.1.1.1 Descriptive Statistics at Item Level.
The twenty eight WOAQ items (Griffiths et al., 2006) are shown in Table 7.2,
grouped according to the five subscales.
Table 7.2 Subscales and WOAQ Items
WOAQ Subscales and WOAQ Items
Quality of relationships with management
    3. Clear roles and responsibilities
    5. Support from supervisor
    7. Feedback on your performance
    11. Appreciation or recognition of your efforts by supervisors
    16. Senior management attitudes
    17. Clear reporting lines
    22. Communication with supervisor
    26. Status/recognition in the company
    27. Clear company objectives, values, procedures
Reward & recognition
    12. Consultation about changes in your job
    13. Sufficient training for this job
    14. Amount of variety in the work you do
    21. Opportunities for promotion
    23. Opportunities for learning new skills
    24. Flexibility of working hours
    25. Opportunities to use your skills
Workload issues
    6. Pace of work
    8. Your workload
    15. Impact of family/social life on work
    19. Impact of work on family/social life
Quality of relationships with colleagues
    10. How you get on with your co-workers (personally/socially)
    28. How well you work with your co-workers (as a team)
Quality of physical environment
    1. Facilities for taking breaks (places for breaks, meals)
    2. Work surroundings (noise, light, temperature, etc.)
    4. Exposure to physical danger
    9. Health and safety at work
    18. Equipment, tools, IT or software that you use
    20. Work stations and work space
As shown in Table 7.3, all the skewness and kurtosis coefficients were no greater than
one in absolute value, demonstrating behaviour reasonably close to normality at item level
(West, Finch, & Curran, 1995).
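The item-level screening described above can be sketched as follows. This Python example is illustrative; it uses simple moment-based estimators, which may differ slightly from the bias-corrected values reported by statistical packages, and the function names and responses are hypothetical.

```python
def skewness(values):
    """Moment-based skewness: third central moment over the 1.5 power of the second."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so that a normal distribution scores zero."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3


def close_to_normal(values, limit=1.0):
    """Apply the rule that both coefficients stay within the limit in absolute value."""
    return abs(skewness(values)) <= limit and abs(excess_kurtosis(values)) <= limit


# Hypothetical responses of ten people to one 5-point item.
item_scores = [1, 2, 2, 3, 3, 3, 3, 4, 4, 5]
item_is_close_to_normal = close_to_normal(item_scores)
```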
Table 7.3 Item Characteristics of WOAQ
Items                                              Mean   SD     Skew   Kurtosis
WOAQ - quality of relationships with management    3.43   1.03   -.33   -.48
    3                                              3.60   1.02   -.38   -.49
    5                                              3.60   1.21   -.55   -.71
    7                                              3.15   1.04   -.11   -.55
    11                                             3.29   1.11   -.22   -.85
    16                                             3.12   1.15   -.09   -.78
    17                                             3.55   .91    -.40   .09
    22                                             3.49   1.05   -.41   -.45
    26                                             3.39   .96    -.38   -.13
    27                                             3.69   .88    -.48   .33
WOAQ - reward & recognition                        3.37   .80    -.21   -.34
    12                                             2.99   1.01   .01    -.50
    13                                             3.52   .97    -.36   -.24
    14                                             3.63   .83    -.18   -.16
    21                                             3.06   .90    -.03   .11
    23                                             3.63   .92    -.47   -.29
    24                                             3.20   1.0    -.14   -.61
    25                                             3.61   .89    -.28   -.39
WOAQ - workload issues                             2.79   .98    .23    .64
    6                                              2.79   1.16   .17    -1.0
    8                                              2.68   1.0    .28    -.85
    15                                             2.94   .83    .15    .58
    19                                             2.75   .93    .34    .13
WOAQ - quality of relationships with colleagues    3.94   .83    .58    .47
    10                                             3.83   .82    -.31   -.23
    28                                             4.06   .84    -.85   .72
WOAQ - quality of physical environment             2.97   1.07   .24    -.64
    1                                              2.80   1.27   .26    -1.0
    2                                              2.84   1.09   .27    -.61
    4                                              3.00   .90    .62    .18
    9                                              3.35   .99    .03    -.60
    18                                             2.88   1.14   .21    -.99
    20                                             3.00   1.04   -.05   -.49
Total                                              3.30   .94    -.31   -.51
7.1.1.2 Test of Model Assumptions
Although the normality assumptions were reasonably valid at item level, the
multivariate distribution of the items also needs to be checked. In this study CFA is testing
a multivariate statistical model using Maximum Likelihood (ML) estimation, assuming
multivariate normality (Hoyle, 2000). Multivariate normality can be evaluated using
Mardia’s multivariate skewness and kurtosis tests (Bentler & Wu, 2002). Although the
preliminary assessment at item level showed a relatively normal distribution for the data,
Mardia’s Multivariate Kurtosis coefficient (Mardia's coefficient (G2,P) = 109.40;
normalised estimate = 23.57) is a little high, indicating violation of the multivariate
normality assumption. A value below 20 is usually required for Mardia’s
Multivariate Kurtosis coefficient (Byrne, 2010).
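The relationship between the two reported values can be checked with a short calculation. The Python sketch below assumes that the reported coefficient is Mardia's kurtosis centred at its expected value p(p + 2), that its large-sample standard error is the square root of 8p(p + 2)/n, and that p = 28 items and n = 312 cases apply; the function name is hypothetical.

```python
import math


def normalised_mardia_kurtosis(coefficient, n_variables, n_cases):
    """Divide the centred kurtosis coefficient by its large-sample standard error."""
    standard_error = math.sqrt(8 * n_variables * (n_variables + 2) / n_cases)
    return coefficient / standard_error


normalised_estimate = normalised_mardia_kurtosis(109.40, n_variables=28, n_cases=312)
print(round(normalised_estimate, 2))  # 23.57
```

Under these assumptions the calculation reproduces the normalised estimate of 23.57 reported above.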
Robust test statistics were therefore used for evaluating the model. As described in
the literature (Hu, Bentler, & Kano, 1992; Curran, West, & Finch, 1996), the Satorra-Bentler
(1988, 1994) scaled chi-square test should be used when the assumption of normality is
violated. Satorra and Bentler (1988; 1994) suggest combining ML estimation with a scaled
chi-square statistic and robust standard errors, which appears to be a good general
approach for dealing with departures from normality. As noted previously, ideally the
relative scaled chi-square (χ2/df) has a value of between 1 and 2.
7.1.2 Model fit evaluation
The dimensionality of the general score of WOAQ and its five nested subfactors
was assessed using confirmatory factor analysis (CFA). Both the second-order model
(higher-order model) (Figure 7.1, model 1) and the bifactor model (Figure 7.1,
model 2) were assessed. The modification indices suggested correlations between three
pairs of measurement errors: one pair of environmental factor items (health and safety
at work with exposure to physical danger) and two pairs of workload factor items (impact of
family/social life on work with impact of work on family/social life; pace of work
with workload). As suggested by Kenny (2011), if some items have similar content and are
theoretically meaningful, one may correlate the errors for these items.
Model 1. Higher order model of WOAQ
Model 2: Bifactor model of WOAQ
Figure 7.1. The proposed bifactor model of WOAQ vs. higher order
The results indicated that the higher order model provides a marginally acceptable
model for WOAQ and its five subfactors (SB Scaled χ2=2.14, RMSEA=0.06, NNFI=0.89).
The factor loadings for the subfactors suggest well-defined subfactors. In addition, the
factor loadings of the five subscales over the higher order factor of WOAQ were strong and
significant. The path coefficients were 0.71 for ‘quality of physical environment’, 0.57 for
‘quality of relationship with colleagues’, 0.93 for ‘quality of relationship with
management’, 0.99 for ‘reward and recognition’ and 0.76 for ‘workload issues’.
For meaningful comparison of the higher order model with the bifactor model, the
Schmid-Leiman transformation was conducted to obtain loadings for all items on the higher
order factor. Table 7.4 provides the Schmid–Leiman transformed factor loadings for the
higher order factor. As suggested by Gignac (2007), the Schmid–Leiman (S-L)
transformations were calculated by multiplying the first-order factor loadings with their
respective second-order factor loadings.
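The transformation described above can be sketched in a few lines of Python. The general-factor loading is the product named in the text. The residualised subfactor loading uses the standard Schmid-Leiman formula, which the text does not spell out and is included here as an assumption. The item loading of 0.70 is hypothetical; 0.71 is the second-order loading reported above for ‘quality of physical environment’.

```python
import math

def schmid_leiman(first_order_loading, second_order_loading):
    """Return (general, residualised subfactor) loadings for one item."""
    # Loading on the higher order factor: product of the two loadings
    general = first_order_loading * second_order_loading
    # Loading on the subfactor after the higher order factor is partialled out
    group = first_order_loading * math.sqrt(1.0 - second_order_loading ** 2)
    return general, group

# Hypothetical item with a first-order loading of 0.70 on a subfactor
# whose second-order loading is 0.71
general, group = schmid_leiman(0.70, 0.71)
```

The two transformed loadings preserve the item's communality: the sum of their squares equals the squared first-order loading (here 0.49).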
The results of the bifactor model suggest an acceptable fit (SB Scaled χ2=1.71,
RMSEA =0.04, NNFI=0.93). Table 7.5 presents the results of the CFA evaluation. Based
on these results, the bifactor model of WOAQ provides a superior fit with a smaller AIC
value (AIC=-89.66) compared to the conventional higher order model (AIC=50.44). The
ΔNNFI is bigger than 0.04 and ΔAIC is -140.10 indicating significant superiority of the
bifactor model over the higher order model.
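The reported differences follow directly from the fit statistics of the two models, as the short Python sketch below shows. It uses the rounded values reported in the text, so the NNFI difference is itself rounded; the variable names are illustrative.

```python
# Fit statistics as reported for the two competing models
fit = {
    "higher_order": {"nnfi": 0.89, "aic": 50.44},
    "bifactor": {"nnfi": 0.93, "aic": -89.66},
}

# Differences are taken as model 2 (bifactor) minus model 1 (higher order)
delta_nnfi = round(fit["bifactor"]["nnfi"] - fit["higher_order"]["nnfi"], 2)
delta_aic = round(fit["bifactor"]["aic"] - fit["higher_order"]["aic"], 2)

print(f"Delta NNFI = {delta_nnfi:.2f}, Delta AIC = {delta_aic:.2f}")
```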
Important differences were found in the factor loadings of the bifactor model
compared to the higher order model. The most important difference was found for the
‘quality of relationship with management’. The S-L solution of the higher order model
showed positively defined fairly uniform factor loadings for this factor, while the bifactor
model detected differentially directed loadings. In addition, in the bifactor model for the
two subscales ‘the quality of relationship with management’ and ‘the reward and
recognition’, items were highly loaded on the general WOAQ but poorly loaded on their
nested group constructs. However, in the higher order model the items for ‘the reward and
recognition’ subscale had low loadings in both cases.
Table 7.4 Completely Standardized Maximum Likelihood (ML) Solutions of Higher order Model and the Bifactor Model
              Higher order (S-L)      Bifactor
Item number   G       Subfactor       G       Subfactor
1             0.47    0.55            .32     .60
2             0.54    0.62            .35     .78
3             0.57    0.23            .65     -.18
4             0.29    0.34            .24     .29
5             0.74    0.30            .72     .42
6             0.52    0.46            .47     .37
7             0.71    0.28            .73     .12
8             0.52    0.46            .48     .37
9             0.42    0.49            .44     .33
10            0.37    0.55            .35     .65
11            0.79    0.31            .78     .33
12            0.00    0.02            .74     -.11
13            0.00    0.02            .64     .06
14            0.00    0.02            .48     .43
15            0.43    0.38            .37     .58
16            0.73    0.29            .72     .29
17            0.72    0.28            .72     .18
18            0.34    0.39            .31     .28
19            0.48    0.42            .44     .52
20            0.54    0.62            .49     .52
21            0.57    0.02            .56     .06
22            0.78    0.31            .76     .40
23            0.74    0.02            .69     .30
24            0.60    0.02            .55     .09
25            0.69    0.02            .66     .44
26            0.73    0.29            .80     -.05
27            0.72    0.28            .79     -.11
28            0.45    0.67            .45     .55
Subfactor = loading of the item on its own nested subfactor (QPE, QRC, QRM, RR or WI).
Note: S-L= Schmid-Leiman Transformation of Item Loadings, G=General factor of WOAQ, QPE=Quality of physical environment, QRC=Quality of relationship with colleagues, QRM=Quality of relationship with management, RR=reward and recognition, WI=Workload issues.
Table 7.5 Summary of Model Fit Statistics of the CFA Models of WOAQ
Model                   SB χ2/df   RMSEA               NNFI   AIC      ΔNNFI†   ΔAIC†
0. Independent model    11.48
1. Higher order model   2.14       0.06 (0.05, 0.06)   0.89   50.44
2. Bifactor model       1.71       0.04 (0.04, 0.05)   0.93   -89.66   0.04     -140.10
Note: SB=Scaled χ2. RMSEA= Root Mean Square Error of Approximation; NNFI=Non-normed fit index; AIC = Akaike Information Criterion; †=differences model 2 - model 1.
7.1.3 Model-based reliability
Further analysis was carried out to assess the reliability of the well-fitting bifactor
model of WOAQ. The results of the model-based reliability evaluation of the
multidimensional WOAQ using the omega total (ωt) reliability coefficient (combined true score variance
across the general factor of WOAQ and its five nested subfactors) indicated excellent total
reliability for this scale (0.92). It seems that 92% of the WOAQ variance is true variance,
leaving 8% for error.
The omega hierarchical (ωh) reliability coefficient demonstrates that the general factor of
WOAQ explains 87% of the variance while the total contribution of the subfactors is minimal
in the presence of the general WOAQ. In other words, a substantial proportion of internal
consistency belongs to the general factor of WOAQ rather than its nested five subfactors.
To better understand the individual reliability of each nested subfactor, the omega subscale
(ωs) reliability coefficient was calculated for each nested subfactor, controlling for the effects of
the general factor of WOAQ. The results show that among the five nested subfactors,
‘physical environment’ (ωs = .52), ‘workload issues’ (ωs = .39), and ‘relationships with
colleagues’ (ωs = .36) demonstrated higher reliability than the other two nested subscales,
independent of general WOAQ. The lowest omega subscale reliability coefficients belonged
to ‘the quality of relationship with management’ and ‘reward and recognition’,
demonstrating more dependency on general WOAQ for these two subscales.
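The three omega coefficients can be computed from the standardised bifactor loadings, as in the following minimal Python sketch. It assumes uncorrelated residuals, whereas the fitted WOAQ model included three correlated error pairs that would add to the total variance. The function name and the four-item example are hypothetical and do not reproduce the WOAQ estimates.

```python
def omega_coefficients(general, group, uniqueness, membership):
    """
    general:    standardised general-factor loading of each item
    group:      standardised loading of each item on its own subfactor
    uniqueness: residual variance of each item
    membership: label of the subfactor each item belongs to
    Returns (omega total, omega hierarchical, dict of omega subscale values).
    """
    subfactors = sorted(set(membership))
    gen_var = sum(general) ** 2
    grp_var = {
        s: sum(l for l, m in zip(group, membership) if m == s) ** 2
        for s in subfactors
    }
    total_var = gen_var + sum(grp_var.values()) + sum(uniqueness)
    # Omega total: all common (general + subfactor) variance over total variance
    omega_total = (gen_var + sum(grp_var.values())) / total_var
    # Omega hierarchical: general-factor variance only over total variance
    omega_hier = gen_var / total_var
    omega_sub = {}
    for s in subfactors:
        idx = [i for i, m in enumerate(membership) if m == s]
        sub_gen = sum(general[i] for i in idx) ** 2
        sub_err = sum(uniqueness[i] for i in idx)
        # Omega subscale: subfactor variance over the subscale's total variance
        omega_sub[s] = grp_var[s] / (sub_gen + grp_var[s] + sub_err)
    return omega_total, omega_hier, omega_sub

# Hypothetical four-item scale with two subfactors, A and B
omega_total, omega_hier, omega_sub = omega_coefficients(
    general=[0.7, 0.7, 0.6, 0.6],
    group=[0.3, 0.3, 0.5, 0.5],
    uniqueness=[0.42, 0.42, 0.39, 0.39],
    membership=["A", "A", "B", "B"],
)
```

In this hypothetical example the general factor dominates subscale A, so its omega subscale value is much lower than that of subscale B, which mirrors the pattern described above for the subscales that depend heavily on general WOAQ.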
As expected, the conventional coefficient alpha reported an overestimation of the
reliability (α =.94) which was probably caused by a violation of the unidimensionality and
independent residuals assumptions.
Table 7.6 The Reliability Coefficients of WOAQ among Nursing Sample (n=312)
Constructs                                  α     ωt    ωh    ωs
General WOAQ                                .94   .92   .87   -
Physical environment                                          .51
Relationships with colleagues                                 .35
Quality of relationships with management                      .16
Reward & Recognition                                          .15
Workload issues                                               .39
This bifactor model was then fitted to the paramedics data.
7.2 Results: Study of Paramedics-Cross Validation of Bifactor Model WOAQ
After establishing the validity of the bifactor model of WOAQ in one health
population, it is important to cross validate the model in a different health population.
Therefore, using a different sample (here paramedics), the invariance of the WOAQ
bifactor model was assessed, first for the combined sample and then across gender. The demographic
characteristics of the total sample as well as the male and female samples are presented in
Table 7.7.
7.2.1 Descriptive Statistics for Demographics
As indicated in Table 7.7 this was a much larger sample with a much better
representation of males than the nursing sample. Also, there were very few part-time
employees compared to the nursing sample and average age was younger than for the nursing
sample. These results suggest therefore that this is a very different population, making it
appropriate for this sample to be used for the cross-validation of the bifactor model for
WOAQ in a health setting.
Table 7.7 Characteristics of Paramedic Participants
Characteristic        Category       Total n=945     Males n=623     Females n=322
                                     Frequency (%)
Gender                Male           623 (65.9)      -               -
                      Female         322 (34.1)      -               -
Employment status Ϯ   Part-time      895 (94.7)      610 (97.91)     287 (89.13)
                      Full-time      48 (5.1)        13 (2.09)       35 (10.87)
Years of experience   < 1 year       92 (9.8)        38 (6.1)        54 (16.9)
                      1-3 years      127 (13.5)      55 (8.9)        72 (22.5)
                      4-6 years      133 (14.1)      62 (10.0)       71 (22.2)
                      > 6 years      588 (62.6)      465 (75.0)      123 (38.4)
Age (years)           Mean (Range)   40.15 (21-65)   43.72 (22-65)   33.24 (21-56)
Note: Ϯ Due to their very low percentage (1.3 per cent in the paramedic organisation),
casual employees have been allocated to the part-time category.
As a first step, the baseline bifactor model of WOAQ evaluated in the
previous section was assessed separately for the male and female paramedic groups. The
results in Table 7.8 show adequate model fit for the baseline bifactor model for males
(RMSEA=0.04, NNFI=0.94) and females (RMSEA=0.05, NNFI=0.92).
Table 7.8 Summary of Model Fit Statistics for the Baseline Bifactor Model of WOAQ across Gender
Model             SB χ2/df   CFI    RMSEA   NNFI
Baseline model:
  Male            2.47       0.94   0.04    0.94
  Female          1.87       0.93   0.05    0.92
Note: RMSEA= Root Mean Square Error of Approximation; NNFI=Non-normed fit index.
Model 1: Configural model with no constraints. At the next step, a configural bifactor
model was fitted for the male and female groups simultaneously to determine if the model is
appropriate when there are no constraints. Based on the results, the configural model
(RMSEA =0.03, NNFI=0.97) showed good model fit (Table 7.9). This model shows that the
bifactor model is appropriate for paramedics as well as nursing staff, suggesting that this
model may have general applicability in health.
Model 2: Invariant loadings. After constraining the loadings to be equal for males and
females, the results still showed good fit (RMSEA=0.03, NNFI=0.96). To test for evidence of
invariance, the differences between the NNFI and AIC of Model 2 and Model 1 were
considered. The NNFI showed no significant deterioration in model fit for constrained loadings
compared to the configural model (unconstrained loadings) (Table 7.9),
but the AIC increased by more than 10. In these circumstances it was unclear whether
invariance could be claimed across gender.
As previously explained, reaching full invariance for all the parameters, or even the
most important ones, is very rare in most models (e.g. Byrne, Shavelson & Muthen, 1989). In
view of the conflicting results obtained above, a decision was therefore made to proceed to
the next stage of the invariance analysis considering the differences in construct means for
males and females.
Table 7.9 Invariance Testing Across Gender for the Bifactor Model of WOAQ.
Models                                       SB χ2/df   CFI    RMSEA   NNFI   AIC       Δ*NNFI   Δ*AIC
Model 1 - Configural model-no constraints    1.49       0.97   0.03    0.97   -315.84   -        -
Model 2 - M1+loadings invariance             1.60       0.96   0.03    0.96   -274.65   0.007    -41.19
Note: RMSEA= Root Mean Square Error of Approximation; NNFI=Non-normed fit index; AIC = Akaike Information Criterion; Δ=change
The Group Differences for Construct Means. Gender differences in the mean values
for the general factor of WOAQ and for the five nested factors were
considered, with the female group selected as the reference level. The construct means
for the female group were therefore set to zero while the construct means for the
male group were estimated, providing an estimate of the mean differences between groups
for the constructs.
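The reference-group scheme described above can be written compactly as follows. The notation (τ for item intercepts, Λ for loadings, ξ for the constructs, κ for construct means) is assumed here for illustration and is not taken from the thesis. Equal intercepts and loadings across groups are the usual conditions for comparing construct means.

```latex
% g indexes the group: f = female (reference group), m = male
\begin{align*}
x^{(g)} &= \tau + \Lambda\,\xi^{(g)} + \varepsilon^{(g)}, \qquad g \in \{f, m\} \\
\kappa^{(f)} &= E\big[\xi^{(f)}\big] = 0, \qquad
\kappa^{(m)} = E\big[\xi^{(m)}\big] \ \text{(freely estimated)} \\
\kappa^{(m)} - \kappa^{(f)} &= \kappa^{(m)}, \qquad
z_j = \frac{\hat{\kappa}^{(m)}_j}{\mathrm{SE}\big(\hat{\kappa}^{(m)}_j\big)}
\end{align*}
```

Because the female means are fixed at zero, each estimated male mean is itself the male-female difference on that construct, and its z statistic tests whether the difference departs from zero.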
After setting equality constraints on loadings, and intercepts for the measured
variables, with factor intercepts of zero for female employees, the results showed a marginal
fit for the model (RMSEA=0.05, NNFI=0.917). The mean differences between the male and
female groups were significant on two of the nested constructs (‘co-worker’ and ‘reward-
recognition’) and also for the general factor of WOAQ. The Z score results showed that the
mean scores on these two nested constructs and the general factor of WOAQ are significantly
higher for male employees than for female employees. In the next chapter the implications of
these results are discussed.
8
STUDY 1: DISCUSSION
In this chapter we start by considering the previous WOAQ bifactor model results
obtained for the nursing sample. We then consider this model in the case of the paramedics
sample and, in particular, we probe the implications of the gender differences that were
exposed.
8.1 Discussion: Study of Nurses-Validation of Bifactor Model of WOAQ
The most common problems detected in the literature on full risk assessment relate to
the length of questionnaires. These questionnaires are either very long or detailed, or they are
unable to detect the hazardous nature of any identified problems in a work setting. In response to the
evidenced need for a short, valid risk assessment, the WOAQ was developed.
The Work Organisation Assessment Questionnaire (WOAQ) (Griffiths et al., 2006) seems
to overcome these problems with its short length (28 items) and yet comprehensive content.
The WOAQ seeks to identify and collect employees’ opinions on their work and health
(Griffiths, Cox, Karanika, Khan, & Toma, 2006). The WOAQ was originally developed for a
manufacturing setting but it is widely used in non-manufacturing settings without having been
properly validated in these new settings.
The present research examines the validity and model-based reliability of WOAQ for
a group of Australian employees using the conventional higher order model and a bifactor
model using CFA. The WOAQ higher order model included a second-order factor and five
first order subfactors, each representing different dimensions of work organisation risk
assessment. The five subfactors are: ‘quality of relationships with management’, ‘reward and
recognition’, ‘workload issues’, ‘quality of relationships with colleagues’, and ‘quality of
physical environment’. The bifactor model of WOAQ included a general measure of WOAQ
and the above five subfactors.
Previous studies, using higher order modeling of WOAQ, failed to validate the model,
reporting a poor fit (e.g. Waynne-Jones, 2009). The present study therefore considered a
bifactor model of WOAQ, and compared this model with the conventional higher order model
of WOAQ. Previous studies show that bifactor models generally demonstrate superior fit
over higher order models (e.g. Gignac, 2007, 2014; Reise, Bonifay, & Haviland, 2012;
Reise, 2012). In spite of their importance in organisational studies, the evaluation
of bifactor models such as that of WOAQ is quite new for the organisational psychology
discipline. While a conventional model of WOAQ provides an indirect relationship between
the higher order construct and the items, a bifactor model of WOAQ provides a full first order
multidimensional model, where both the general factor of WOAQ and its nested subfactors
are evaluated with a direct relationship to the WOAQ items. In addition, the model-based
omega reliability coefficients provide more valuable information about the internal
consistency of each construct. The results of this study revealed the superiority of the bifactor
model over the conventional higher order model. In addition, very important differences were
found between the higher order model and the bifactor model. The most important difference
was detected when the conventional higher order model failed to recognise the low and
differentially directed loadings of the ‘quality of relationship with management’ items. The
results of the bifactor model showed that the subfactor for ‘quality of relationship with
management’ was poorly defined independently of the general measure of WOAQ. The
subfactor of ‘reward and recognition’ was found to be implausible in both models. Given the
high correlations observed between these two subscales and their high
dependency on the general factor of WOAQ, the reliability of these two subscales clearly relies
more on the variation in the general factor of WOAQ than on any subfactor.
These results have important practical implications. They show that in the context of
community nursing, although the general measure of WOAQ is a valid and reliable measure
for organisational risk assessment, the most important plausible subscales are ‘quality of
physical environment’, ‘the workload issues’ and ‘quality of relationship with the colleagues’
respectively. Based on the findings, focusing on the two subscales of ‘the quality of
relationship with management’ and ‘reward and recognition’ without considering the general
WOAQ indicators will be unlikely to lead to any significant improvements. In contrast, the
other three sub-constructs, especially ‘the quality of work environment’, seem to have
significant unique reliability, independent of the general factor of WOAQ. In practice, this
means that any intervention to improve only the work environment would still have
significant effects on the level of perceived risks in workplaces.
Unfortunately, the lack of previous studies makes it difficult to compare the findings with
those in other health areas. The majority of previous studies on WOAQ have been conducted in
manufacturing settings using the conventional higher order model procedure (e.g. Griffiths et
al., 2006; Waynne-Jones, 2009). However, close evaluation of the work setting of nurses
indicates that these findings should not be much of a surprise. These findings fit with the
nature of the community-nursing work environment. The reason behind this is that although
the nurses belong to a large organisation, they work in different, small branches with their
own immediate managers/supervisors. In such an environment, there is a more informal
relationship between the nurse and manager/supervisor. The relationships in community
nursing settings are more colleague-colleague relationships rather than nurse-manager
relationships so it should be expected that ‘the quality of relationship with management’
would be unimportant. Also, ‘the reward and recognition’ factor is strongly tied to the
management relationship, and only items representing a variety of tasks, opportunity for
learning and using new skills appeared as important indicators of this subscale. Thus, in
practice, if an organisation wants to make risk management improvements, the main
plausible subfactors to look at are the physical work environment, the relationship with
colleagues and the management of workloads.
The bifactor model of WOAQ could also have some critical cost and efficacy
implications in the workplace. For example, consider the situation where there are limited
budgets or resources to be allocated to improve the overall quality of the work organisation,
or, if it is not feasible or realistic to change all the subconstructs of risk in the workplace
simultaneously. Using a bifactor model one can separate the specific effects of each subfactor
from the general factor of WOAQ and determine the most plausible construct for an
immediate, more feasible intervention. In more costly or complicated situations,
practitioners or policy makers could take advantage of such bifactor modeling to determine
the most plausible sub-constructs for achieving improvement in the short-term.
Unsurprisingly, the results also indicated that the conventional coefficient
alpha was (slightly) overestimated relative to the omega total and omega
hierarchical coefficients. This is consistent with the results of previous studies in other disciplines
(Gignac, 2007, 2014; Reise, Bonifay, & Haviland, 2012). These scholars indicated that in
models violating the assumptions required for a reliable alpha, the coefficient will often be
overestimated. Therefore, it is highly recommended that future studies, especially
those involving complicated multidimensional models, use model-based
reliability coefficients by default. Although this is deemed to be more critical in clinical or health
studies, it is important for scholars of all disciplines to use these more accurate
reliability assessments in order to avoid serious errors.
8.2 Discussion: Study of Paramedics-Cross Validation of Bifactor Model of WOAQ
The cross validation results obtained using the paramedics sample indicate that the
bifactor WOAQ model is a valid tool in another very different health setting. It appears that
a bifactor model not only presents a better fit than a higher order model, but also
highlights the importance of the subscales relative to a single general factor in a health setting.
Moreover the results suggest that this model can be used with both male and female
paramedics. However, although these results demonstrate good validity for the WOAQ across
gender, the mean differences between males and females were found to be significant. The
results showed that the scores on two of the five nested constructs and the general factor of
WOAQ were significantly higher for male employees compared to female employees,
demonstrating that male employees are happier with the ‘quality of relationships with
co-workers’ and the system of ‘reward and recognition’, as well as the general quality of WOAQ,
than female employees in this paramedic organisation. The results have important
implications in practice, and specifically in relation to the occupational health and safety of
paramedics, particularly female paramedics.
Overall, the WOAQ, especially the general WOAQ measure, appears to be a superior
instrument for assessing risk factors associated with employees’ health and health-related
behaviour, due to its satisfactory psychometric properties and short length. More importantly,
based on the results from the bifactor analysis, it was shown that some of the subscales are
more important than others in a health setting. This indicated that concerns relating to the
importance of an identified problem in the work setting can be solved by fitting the bifactor
WOAQ model to assess the importance of various risk factors in the workplace. Ultimately,
this will assist management in identifying problem areas which may cause harm to their
employees and the organisation, and thus allow proper action to be undertaken in order to
improve the work environment.
The WOAQ tool can be especially useful when access to specialist occupational health
support is limited. This is because WOAQ is short and easy to use and, in more practical
terms, can directly inform workplace interventions to improve employee health and well-
being. Furthermore, by directly informing the development of targeted workplace
interventions to improve the psychosocial factors and work conditions for paramedics, the
WOAQ offers a potential to help avoid the structural labour force shortages experienced in
this area, especially in the developed world.
8.3 Strengths and Limitations
The strengths of the study lie in the context of the research and the methodology used.
These can be elaborated as five key points.
Firstly, to the best of the researchers’ knowledge, this study is one of the first to
compare a conventional higher order model with a bifactor model of WOAQ in a health
setting. The methodology used also has theoretical and practical implications in other
organisational studies. The conventional higher order modeling is based on full mediation of
item effects by first-order sub-constructs. In practice and real life situations, and especially in
organisational studies such as WOAQ, this has limited applicability. Bifactor modeling
assumes partial mediation which is much closer to the reality.
Depending on their nature of work (e.g. manufacturing vs. non-manufacturing) and
occupation types, organisations will have significant differences in regard to WOAQ. A risk
assessment tool like WOAQ is a very useful tool for assessing the organisational risk factors.
However in practice, not all of the WOAQ subfactors are plausible or important, as was found
in this study. Therefore, in the work setting, bifactor modelling of WOAQ is deemed to be
more appropriate as the results relate well to real life expectations.
The 2nd key point is that this study has considered only the most suitable fit indices,
based on the degree of penalty included for model complexity. These indices (i.e. RMSEA,
NNFI/TLI, & AIC) and differences between these indices have been demonstrated in this
empirical study for interpreting the complex model of WOAQ.
The 3rd key point is that the study has used model-based reliability coefficients.
Taking into account the multidimensional nature of WOAQ, i.e. both the general and the five-
factor model of WOAQ, omega reliability coefficients have been used to assess measurement
reliability. Using omega model-based reliability measures rather than the conventional
coefficient alpha is recommended for multidimensional models such as the WOAQ.
The 4th key point is that this is one of the first studies to be conducted for a
group of Australian employees in a health setting as opposed to the manufacturing setting for
which the original WOAQ was developed. No previous studies have been completed in a
health setting using a comprehensive, short scale of risk assessment similar to the WOAQ.
This study therefore initiates a critical avenue for more research in WOAQ.
As the final key point, the WOAQ is a useful tool in practice because of its ability to
provide organisational risk assessment using only 28 items. This meets workplace
requirements in terms of cost, time and resources. Using bifactor modeling the most plausible
subfactors were identified for improving the organisational risk environment in a health
setting.
One of the limitations in this study is that it focuses only on health professionals.
Further studies are needed to expand the concept to other non-manufacturing or ‘blue collar’
occupations.
In spite of the importance of the omega reliability coefficient, there is still no detailed
guideline on the cutoff points for interpreting omega for general scales and for subscales.
Reise et al. (2012) suggested a minimum cutoff point of greater than .50, but this is not yet
backed up by any significant evidence. Further studies are needed to shed more light on this.
The lack of background literature in an Australian context for the use of bifactor
modeling of WOAQ makes it difficult to evaluate or compare the results with other studies.
Further studies are needed to fill this gap.
8.4 Summary and Conclusion
In this study, attempts were made to assess the validity and reliability of WOAQ in an
Australian health setting, using robust methodological procedures. Based on the literature,
several robust procedures were adopted for assessing the validity of WOAQ, including a
comparison of the conventional higher order model of WOAQ with a bifactor model of WOAQ and
the testing of model-based reliability.
In general, results showed that the WOAQ appears to be a superior instrument for
assessing risk factors associated with employees’ health and health-related behaviour due to
its satisfactory psychometric properties and short length. Although the general factor of
WOAQ seems to be the dominant factor, some evidence of multidimensionality was found
and some subfactors appeared to play more critical roles in risk assessment in a nursing
setting. The cross validity of the scale on a paramedic sample was demonstrated when these
results were replicated in another very different health setting. However, interesting
differences in mean values for male and female paramedics indicated that this was a gender-
sensitive assessment tool.
In conclusion, this study adds to the evidence supporting the feasibility of the WOAQ
for both research and practice in a range of settings. However, future research should continue
to validate the WOAQ with other occupational groups and sectors using a bifactor model.
9
STUDY 2: APPLICATIONS OF COVARIATE-DEPENDENT RELIABILITY
The purpose of Chapters 9 to 11 is to demonstrate empirically Bentler’s (2014)
approach to covariate-dependent reliability. There are two main proposed applications: the
first application considers the effects of potential covariates on scale reliability, the second
demonstrates the effects of Common Method Bias (CMB) on scale reliability. Different data
sets were used for these two applications. The WOAQ data was used for the first application.
For the second application, a student study of social desirability, emotional intelligence,
wellbeing and alcohol drinking behaviour was used. These applications are described in this
chapter but the actual analyses are left until Chapter 10 with the discussion following in
Chapter 11.
9.1 Rationale and Objectives
9.1.1 Application of Covariate-dependent Reliability in Reliability Assessments
In 2012 (personal conversations), Bentler introduced the concepts of covariate-dependent
and covariate-free reliability, which partition total reliability into two parts. The first
part relates to external covariates (covariate-dependent reliability) and the second part is
unaffected by such covariates (covariate-free reliability). The approach was officially
presented in 2014 (Bentler, 2014).
The following material on covariate-dependent reliability was adapted from either personal
conversations with Bentler (2012, 2013) or Bentler (2014). Only the practical application of
this concept was assessed in this study, using the data previously described in Chapter 7 and a
second student data set relating to social desirability, emotional intelligence, wellbeing and
alcohol drinking behaviour.
9.1.1.1 First Application of Covariate-dependent Reliability.
Based on the above development, covariate-dependent reliability and covariate-free
reliability can be evaluated for the bifactor model of WOAQ, using the nursing and paramedic
group variables as a covariate. Although the model-based reliability of WOAQ has been
found to be acceptable in the nursing and paramedic organisations (within organisation
assessment), an evaluation of reliability across organisations has yet to be established. It is
hypothesised that although both organisations are health related, due to differences in the
nature of work and different demographic characteristics of the paramedics and home-based
nursing organisations, the type of organisation will affect the reliability of the WOAQ. Hence,
the home-based nursing organisation and the paramedic organisation must be compared in a
reliability assessment of the bifactor model as illustrated in Figure 9.1.
Figure 9.1. Covariate-dependent reliability assessment with the bifactor model of WOAQ across the nursing and paramedic organisations.
Both nursing and paramedic occupations can be categorised as providing clinical care;
however, the nature and demands of these two occupations are very different. A clinician in
the home-based nursing service provides services over a period of time that is defined by a
client’s needs. While these clients may have acute clinical needs, they are generally medically
stable, and often have been discharged from hospital as they no longer require the acute
clinical care provided in the hospital setting. In contrast, paramedical practitioners are called
to respond quickly to clients in need of urgent medical care. The paramedics have short,
intensive interactions with their clients who are often acutely ill. Clearly, the demands and
expectations are different for each of these professions, and consequently for the organisations
in which they work. While both professional groups complete most of their work away from
their formal organisational settings, the nature of their interactions with their clients is
fundamentally different. Typically, the nurses interact with the clients in their own homes
while paramedics work with patients in a wide variety of settings where urgent medical
response is required. The nurses have the opportunity to ‘get to know’ the clients and interact
with them over time, while the paramedics normally interact for only a single short term
episode, during which the clients may not even be responsive.
In addition to the different nature of work and workplace demands, the two organisations
have different demographic characteristics. For example, in this study the majority of
home-based nurses are female while the majority of paramedics are male. Also, in comparison to
the community nursing organisation, the paramedics’ organisation has more part-time workers
and a significantly lower average age of workers.
When there is a group covariate, such as organisation type, that affects a latent factor
(WOAQ in this case), the question is whether there are mean differences in the latent factor as
a function of the group covariate. As mentioned previously, covariate-dependent reliability is
a measure of the group differences in the trait being measured relative to total variation
(Equation 3.12). Covariate-free reliability is a measure of the individual differences relative to
total variation, freed from any mean differences due to the covariate(s) (Equation 3.11). None
of the Omega reliability coefficients based on the WOAQ bifactor model introduced
previously have partitioned the variance into its covariate-dependent and covariate-free parts.
Based on the above information, it is reasonable to argue that, owing to differences in the
nature of the work and the demographic characteristics of paramedic and home-based nursing
organisations, the type of organisation will influence the reliability of work organisation
assessments such as the Work Organisation Assessment Questionnaire (WOAQ), that
measure psychosocial/physical aspects of an organisation. As a result, it was hypothesised
that:
Hypothesis 9.1: The type of organisation (home-based nursing vs. paramedic) will be
one of the possible covariates affecting model-based reliability coefficients of the WOAQ.
Method. The data used for Study 1 (home-based nursing and paramedics) were used
to demonstrate the application of a covariate-dependent (here organisation-dependent)
reliability assessment of WOAQ.
The procedure proposed by Bentler (2014), and fully discussed in Chapter 3, was used
to calculate the covariate-dependent and covariate-free coefficients of WOAQ in this study.
This procedure is only available in EQS, and only for higher order models. The calculation for
bifactor models is not yet implemented in EQS; therefore, all the calculations for the bifactor
model were conducted manually.
9.1.2 Second Application of Covariate-dependent Reliability for Demonstrating CMB
There is a general belief among scholars that measurement error is a source of many
problems in research (Podsakoff, MacKenzie, Lee, & Podsakoff, 2003). Measurement error
has the capacity to misrepresent and confound the empirical findings of research, causing
erroneous conclusions to be drawn (Bagozzi & Yi, 1991). This issue becomes more salient
when researchers rely on a single source of data collection and self-report measures (Glick,
Jenkins, & Gupta, 1986; Meade, Watson, & Kroustalis, 2007). The widespread use of single-
source data collection tools is a potential concern in relation to common method variance
(CMV), which has been of interest to psychology since the 1950s.
One of the more popular and most convenient procedures for collecting data in
psychology is the self-rated questionnaire (Malhotra, Kim, & Patil, 2006). It is also a common
practice in psychology that data is gathered from a single self-rated questionnaire (Avolio,
Yammarino, & Bass, 1991). As a result, CMV appears to be a common problem in
psychological studies (Malhotra, et al., 2006). Yet, despite its long history in the field of
psychology, there seems to be a gap in the literature on CMV. The literature has paid more
attention to post-hoc statistical remedies, while the causes of the bias have been neglected.
Although a few researchers have investigated the consequences of CMV for
measurement models, only a limited number of studies have tried to
determine the effects of CMV on reliability (Williams et al., 2010). The goal of the present
study was to address the current CMV gap in the literature, using the newly developed
procedure termed ‘covariate-dependent and covariate-free reliability’ (Bentler, 2014). Using
this procedure, CMV was introduced as a covariate for the study scales. If any covariate-
dependence is identified among the study scales, then we can conclude that CMV exists and
needs to be controlled for in the data analysis.
Acquiescence is a potential source of the common method bias (CMB) that results
from self-report surveys (Spector, 2006). According to Winkler, Kanouse and Ware (1982),
acquiescence response sets refer to the propensity of the respondent to indicate agreement
with items on the questionnaire independent of content.
Results from self-report measures are also susceptible to social desirability bias. Social
desirability bias describes the inclination of the respondent to complete the questionnaire in a
way which enables them to be presented in a positive light and to be in line with the norms
and standards defined by their culture (Donaldson & Grant-Vallone, 2002; Ganster,
Hennessey, & Luthans, 1983; Podsakoff & Organ, 1986). Their responses to questions are
usually determined by the level of social desirability (Schriesheim, Kinicki, & Schriesheim,
1979) inherent in the items of a questionnaire. This form of bias usually serves to hide the true
bivariate relationships between variables and interferes with the interpretation of average
tendencies as well as individual differences (Ganster, et al., 1983; Podsakoff, et al., 2003).
A preventive technique for detecting and controlling CMB can be used when the
assumed cause of the method bias is known to the researcher and can be identified and
measured. For example, this is commonly the case for social desirability. This preventive
technique involves the inclusion of the CMB measure as a covariate with the study variables.
This also allows the effects of the surrogate measure for CMB (e.g. a social desirability scale)
on the reliability of the study measures to be assessed.
9.1.3 The Effects of CMB and CMV on reliability of measures
CMB and CMV may affect the validity of a study (Doty & Glick, 1998). CMB and
CMV have the ability to confound the true relationship between variables, resulting in a bias
between the observed and the true relationships by either inflating or deflating the estimates
(Doty & Glick, 1998). Although assessing the presence and quality of CMB provides
important information on the effects on the parameter estimates, it can also be used to
demonstrate the effects on scale reliability. Williams et al. (2010) proposed that the estimation
of the reliability should be achieved by evaluating the decomposition of the overall reliability,
both with and without a “marker variable” to reflect CMB. The reliability decomposition
(composite reliability) formula was originally proposed by Werts, Linn, and Joreskog (1974).
Overall reliability = $\dfrac{F}{F + E}$          (Equation 9.1)
[F = the sum of the factor loadings, squared; E = the sum of the error variances]
The composite reliability in this instance is equal to coefficient rho for a single factor
model and Omega total (Equation 3.9). For the present study, Bentler’s (2014) approach was
used to compare the reliability of models with and without the presence of a CMB marker.
The procedure is simple and easy to calculate using EQS (see Chapter 3 for full details on this
procedure). Note that, as explained earlier, when rho is used to represent the reliability of a
multi-dimensional model it is quantifying the proportion of variance due to the most reliable
single dimension in this multidimensional space. Using this procedure, one can obtain
estimates of the CMB-dependent reliability, the CMB-free reliability, and the total reliability
in one calculation.
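As a minimal sketch of Equation 9.1, the composite reliability can be computed directly from a set of factor loadings and error variances. The function name and the three standardised loadings below are hypothetical and serve only to illustrate the arithmetic; they are not estimates from this study.

```python
def composite_reliability(loadings, error_variances):
    """Composite reliability F / (F + E), following Equation 9.1."""
    f = sum(loadings) ** 2        # F: the sum of the factor loadings, squared
    e = sum(error_variances)      # E: the sum of the error variances
    return f / (f + e)


# Hypothetical standardised loadings (illustration only, not study estimates).
example_loadings = [0.7, 0.8, 0.6]
# Assumption: indicators are standardised, so each error variance is 1 - loading**2.
example_errors = [1 - loading ** 2 for loading in example_loadings]

example_reliability = composite_reliability(example_loadings, example_errors)
print(round(example_reliability, 2))
```

Because F grows with the squared sum of the loadings, a scale with more or stronger indicators yields a higher coefficient for the same error variances.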
This method was used to assess the influence of social desirability on the reliability of
the constructs used in a student study of emotional intelligence, wellbeing and alcohol
drinking behaviour. It was hypothesised that:
Hypothesis 9.2: A covariate-dependent reliability assessment, using social desirability as a
potential source of bias, will demonstrate the effects of CMB on the reliability of the
constructs in the study (Figure 9.2).
The main constructs in the data described below contained sensitive questions
(emotional intelligence, wellbeing, and alcohol drinking habits), sometimes prompting
participants to demonstrate socially acceptable responses rather than presenting their true
opinions. Socially desirable responses could therefore lead to bias for the estimated
relationships between the constructs of the study (Ganster, Hennessey & Luthans, 1983).
Based on the literature, if factor or item responses are highly correlated with social
desirability, then social desirability could be a potential source of bias that needs to be
controlled (Podsakoff et al., 2003; Thomas & Kilmann, 1975).
However, the above model can be adapted to control for CMV due to a single survey
source as well as CMB, using SEM procedures. By integrating both an unmeasured latent
variable for CMV and a directly measured latent method factor representing CMB into the
SEM model, CMV and CMB can be evaluated simultaneously. In the context of the above
student survey, this method can be used to test for rater bias, including social desirability, as
well as a single source survey bias. It was therefore hypothesised that:
Hypothesis 9.3: A SEM integrated approach, including an unmeasured latent common
method factor and a directly measured method factor (social desirability), can be used to
evaluate the presence of CMV in the above context.
9.2 Method
The data for this study were collected from a group of undergraduate students in one
faculty of the participating university. Participant groups were randomly selected, using a list
of all the active subjects in the faculty for Semester Two in 2011. Upon receiving the
lecturer’s consent, a questionnaire package that included a cover letter, information sheet,
consent form, and a questionnaire, was provided to each student during their lecture break.
The information sheet provided assurances that all participant information would remain
confidential. Upon completion of the survey, students were asked to place their questionnaires
in the locked box provided in the classroom. After discarding the incomplete surveys, the
final number of surveys included in the analysis was 341.
9.2.1 Measures. The questionnaire contained questions relating to wellbeing, alcohol
drinking behaviour, emotional intelligence, social desirability and demographics.
Emotional Intelligence. Emotional intelligence was measured using the 33-item Self-
Report Emotional Intelligence Test (SREIT) (Schutte, Malouff, Hall, Haggerty, Cooper,
Golden, & Dornheim, 1998). On a five-point Likert scale, respondents were asked to self-
report their preferences on a scale from 1 (strongly agree) to 5 (strongly disagree). The
reliability and validity evidence for this scale has been positively assessed in previous studies
(e.g., Schutte & Malouff, 1999; Abraham, 1999; Ciarrochi, Chan, & Caputi, 2000; Petrides &
Furnham, 2000).
General Wellbeing. General wellbeing was tested using the General Wellbeing
Questionnaire (GWBQ) (Cox, Thirlaway, Gotts, & Cox, 1983). The GWBQ is a 24-item
instrument used to measure sub-optimal health, using self-reported symptoms of general
malaise. It includes a set of general non-specific symptoms of ill-health, including reportable
aspects of cognitive, emotional, behavioural, and physiological function, none of which are
clinically significant in themselves. The GWBQ consists of two 12-item subscales of sub-
optimum health: (a) worn-out/exhausted and (b) tense/nervous. Respondents were asked to
indicate how often they had experienced the listed 24 symptoms within the previous six
months on a scale from 0 (never) to 4 (all the time).
Social Desirability. The 16-item Social Desirability Scale (SDS-16) (Stöber, 2001)
was used to measure the social desirability of the respondents. The scale is presented with six
reverse-keyed items. The original scale has 17 items, but the item “I have tried illegal drugs”
(e.g., marijuana, cocaine, etc.) was excluded because it is not suitable for the measurement of
social desirability (Stöber, 2001). The items were parcelled into three scales in order to
achieve SEM model identification.
Alcohol Drinking Behaviour Screening. The World Health Organisation’s Alcohol
Use Disorders Identification Test (AUDIT) is a tool used for screening alcohol drinking
behaviour. AUDIT was originally developed by Saunders, Aasland, Babor, de la Fuente and
Grant in 1993 and has been validated extensively across different populations. It consists of
three items on alcohol consumption, three on drinking behaviour and dependency, and four on
the consequences or problems related to drinking. The items were parcelled into three items to
achieve model identification.
9.2.2 Overview of analysis. Confirmatory factor analysis (CFA) was conducted to evaluate
the proposed models. EQS 6.2 (build 100) and standard fit indices (CMIN/DF, CFI, NNFI,
and RMSEA) were used to evaluate the model fit. For reliability assessment and comparison,
coefficient Omega and the covariate-dependent and covariate-free reliability coefficients were
calculated as described in Chapter 3.
To evaluate hypothesis 9.3, both constrained (equal-method factor loadings) and
unconstrained (free-method factor loadings) models were assessed to find out if CMV exists
and whether it has equal effects on the constructs of the study. Recently, a partial correlation
technique was introduced by Lindell and Whitney (2001) that can be used to test for CMV. In
this procedure, a ‘marker variable’ representing CMV is included in the analysis. Using a
partial correlation procedure, the association between the marker variable and any construct in
the model is used as an estimate of CMV. This allows all correlations among the constructs of
the study to be corrected for CMV using a partial correlation adjustment (Williams, Hartman,
& Cavazotte, 2010). This method is called the correlational marker technique. Building on the
partial correlation procedure of Lindell and Whitney (2001), further development has been
carried out by Richardson et al. (2009) and Williams, Hartman, & Cavazotte (2010) using a
structural equation modelling procedure for capturing and adjusting for CMV. This marker
variable procedure using CFA was employed to evaluate hypothesis 9.3.
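The partial correlation adjustment of Lindell and Whitney (2001) can be sketched as follows. The sketch assumes the usual form of the adjustment, in which the marker correlation is subtracted from an observed correlation and the result is rescaled by one minus the marker correlation. The function name and the input values are hypothetical and are not results from this study.

```python
def cmv_adjusted_correlation(r_observed, r_marker):
    """Partial out a marker-variable estimate of CMV from an observed correlation.

    Assumed form of the adjustment: (r_observed - r_marker) / (1 - r_marker).
    """
    if not -1.0 < r_marker < 1.0:
        raise ValueError("r_marker must lie strictly between -1 and 1")
    return (r_observed - r_marker) / (1.0 - r_marker)


# Hypothetical values: an observed correlation of .50 between two constructs
# and a marker-based CMV estimate of .10.
adjusted = cmv_adjusted_correlation(0.50, 0.10)
print(round(adjusted, 2))
```

The SEM marker procedure used for hypothesis 9.3 models the method effect as a latent factor rather than applying this correction to each correlation in turn.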
In SEM by default, ML is used for parameter estimation. When the sample size is
large and data is normally distributed, ML provides the most accurate estimation with the
smallest standard errors (Bentler, 2006). However, ML is sensitive to departures from
normality. Therefore, assessment of normality is an essential requirement when using this
procedure. Although the preliminary assessment showed a relatively normal
distribution for the data, Mardia’s normalised coefficient was high ((G2, P) = 216.79),
indicating a violation of normality assumptions. Outliers were detected in a further analysis;
however, the deletion of these observations did not result in a significant improvement in the
fit indices. As a result, all cases were kept and a suitable, non-parametric test was used to
evaluate the model. The Satorra-Bentler (1988, 1994) chi-square test delivers a more accurate
assessment of model fit when the data does not have a normal distribution.
10
STUDY 2: RESULTS
In this chapter the two applications described in the previous chapter are applied. The
application relating to reliability assessment is illustrated using the WOAQ data and the
application relating to CMB (in the form of social desirability) and CMV (in the form of a
single survey source), is illustrated using the student data for emotional intelligence, well-
being, alcohol drinking behaviour and social desirability.
10.1 Results of Application for Reliability Assessments – The study of WOAQ
Because hypothesis 9.1 states that the type of organisation will affect the reliability of
the WOAQ, organisation was added to the validated bifactor model of the WOAQ as a
covariate, allowing the evaluation of the effect of organisation on the reliability of the WOAQ
assessment tool (see Figure 9.1).
10.1.1 Descriptive statistics at item level
At the first stage, the validity of the model was assessed before proceeding with the
reliability assessment. As shown in Table 10.1, the data at item level is relatively normal. All
skewness and kurtosis coefficients were less than two and seven respectively, demonstrating
reasonable normality at item level (West, Finch, & Curran, 1995).
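The item-level screening rule can be sketched as a simple check against the two cut-offs. The sketch assumes that the limits apply to absolute values; the helper name is hypothetical, and the three items shown are a small subset of the values reported in Table 10.1.

```python
SKEWNESS_LIMIT = 2.0   # cut-off for skewness used in the text
KURTOSIS_LIMIT = 7.0   # cut-off for kurtosis used in the text


def within_normality_limits(skewness, kurtosis):
    """Return True when both coefficients fall inside the cut-offs (absolute values assumed)."""
    return abs(skewness) < SKEWNESS_LIMIT and abs(kurtosis) < KURTOSIS_LIMIT


# (skewness, kurtosis) for three WOAQ items, copied from Table 10.1.
items = {
    "item 5": (-0.06, -1.09),
    "item 15": (0.48, 1.69),
    "item 24": (0.82, -0.38),
}

all_items_ok = all(within_normality_limits(s, k) for s, k in items.values())
print(all_items_ok)
```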
10.1.2 Descriptive statistics at group level
As discussed before, the multivariate normality in EQS can be evaluated using
Mardia’s multivariate skewness and kurtosis tests (Bentler & Wu, 2002). Because Mardia’s
coefficient ((G2, P) = 116.64; normalised estimate = 50.44) indicated violation of the
normality assumptions, non-parametric tests were used to evaluate the model.
Table 10.1 The Descriptive Characteristics of the Main Study Constructs and Parameters (n=1255)

Constructs / items                                  Mean    SD     Skewness   Kurtosis
WOAQ - quality of relationships with management     2.72    .90      .27       -.52
   3                                                3.18   1.04     -.09       -.57
   5                                                3.10   1.27     -.06      -1.09
   7                                                2.63   1.13      .20       -.76
   11                                               2.42   1.21      .44       -.86
   16                                               2.15   1.19      .79       -.41
   17                                               2.85   1.06      .08       -.60
   22                                               3.06   1.13     -.03       -.79
   26                                               2.52   1.12      .18       -.89
   27                                               2.65   1.13      .19       -.75
WOAQ - reward & recognition                         2.60    .82      .41       -.41
   12                                               2.23   1.09      .54       -.58
   13                                               2.62   1.18      .29       -.90
   14                                               3.32    .97     -.17       -.26
   21                                               2.35   1.02      .29       -.60
   23                                               2.63   1.18      .26       -.93
   24                                               2.08   1.18      .82       -.38
   25                                               2.96   1.07     -.07       -.62
WOAQ - workload issues                              2.65    .90      .09       -.49
   6                                                2.47   1.14      .38       -.82
   8                                                2.30   1.08      .46       -.70
   15                                               2.56    .79      .48       1.69
   19                                               2.90   1.12      .00      -1.2
WOAQ - quality of relationships with colleagues     3.92    .85     -.46       -.49
   10                                               3.93    .88     -.42       -.54
   28                                               3.92    .94     -.60       -.25
WOAQ - quality of physical environment              2.54    .81      .45       -.16
   1                                                2.60   1.26      .39       -.94
   2                                                2.73   1.12      .31       -.69
   4                                                2.46    .97      .75        .40
   9                                                2.55   1.10      .47       -.54
   18                                               2.42   1.05      .56       -.37
   20                                               2.49   1.08      .38       -.53
Total                                               2.73    .71      .32       -.29
Table 10.2 summarises the characteristics of both the nursing and paramedic
organisations showing some demographic differences between the organisations. In terms of
gender, there was a greater percentage of females in the nursing organisation (94.5%), while
there were more males in the paramedic group (65.9%). The majority of the paramedics were
part-time employees (94.7%) and had more than six years of experience (62.6%), and their
average age was lower than that of the nurses (40 vs. 45 years old).
Table 10.2 Nursing and Paramedic Demographic Characteristics

                                      Nursing (n=312*)     Paramedics (n=945)
                                      Frequency (%)        Frequency (%)
Gender                 Male           17 (5.5)             623 (65.9)
                       Female         290 (94.5)           322 (34.1)
Employment status†     Part-time      183 (60.0)           895 (94.7)
                       Full-time      123 (40.0)           48 (5.1)
Years of experience    < 1 year       2 (.7)               92 (9.8)
                       1-3 years      7 (2.3)              127 (13.5)
                       4-6 years      16 (5.2)             133 (14.1)
                       > 6 years      282 (91.9)           588 (62.6)
Age                    Mean (Range)   45 (22-77)           40 (21-65)

Note. * Due to some missing data, n varies between 306 and 312. † Due to their very low percentage (7% in the nursing organisation and 1.3% in the paramedic organisation), casual employees have been allocated to part-time categories.
Further analysis was conducted to see if the demographic differences between
organisations were statistically significant. Chi-Squared tests of association were carried out
to compare gender ratios, employment status ratios (part-time vs. full-time), and years of
experience between the two organisations. The results showed significant differences between
the paramedic and nursing organisations in terms of the gender of employees, level of
experience and employment status (p<0.05).
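The test of association for gender can be sketched from the frequencies in Table 10.2. The thesis does not state which software was used for these tests or whether a continuity correction was applied, so the sketch below assumes the uncorrected Pearson chi-square statistic and compares it with the standard critical value for one degree of freedom at the 0.05 level. The function name is hypothetical.

```python
def pearson_chi_square(table):
    """Uncorrected Pearson chi-square statistic and degrees of freedom for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, degrees_of_freedom


# Rows: nursing, paramedics. Columns: male, female (frequencies from Table 10.2).
gender_counts = [[17, 290], [623, 322]]
chi_square, df = pearson_chi_square(gender_counts)

CRITICAL_VALUE_05_DF1 = 3.841  # standard chi-square critical value, df = 1, alpha = .05
print(df, chi_square > CRITICAL_VALUE_05_DF1)
```

The same function can be applied to the employment status and years of experience rows of Table 10.2, using the critical value that matches the resulting degrees of freedom.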
Table 10.3 Mean Age Differences between Nursing and Paramedic Organisations

Organisation    N      Mean    SD       t        p
Nursing         308    45      9.54     7.77*    0.001
Paramedic       942    40      10.88

Note. * Equal variances were not assumed.
The results of the t-test showed that the mean age difference was significant, with the
paramedics being on average younger than the nurses (Table 10.3). It is therefore evident that
the two organisations have significantly different demographic characteristics.
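Because equal variances were not assumed, the t statistic in Table 10.3 corresponds to the unequal-variance (Welch) form of the test. The sketch below recomputes it from the summary statistics in the table. Since the table reports the means rounded to whole years, the result only approximates the reported value of 7.77. The function name is hypothetical.

```python
import math


def welch_t(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Unequal-variance (Welch) t statistic computed from summary statistics."""
    standard_error = math.sqrt(sd_1 ** 2 / n_1 + sd_2 ** 2 / n_2)
    return (mean_1 - mean_2) / standard_error


# Nursing versus paramedic age, using the rounded values from Table 10.3.
t_value = welch_t(45, 9.54, 308, 40, 10.88, 942)
print(round(t_value, 1))
```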
10.2 Model Fit Evaluation
In the next step, the model fit was evaluated for the whole population (combined
nursing and paramedics) and separately for each organisation. Only if the fitted models
described the data well could the reliability assessment proceed.
The bifactor model of WOAQ was assessed separately for each organisation. Table
10.4 shows adequate model fit for the bifactor model for the nursing organisation (RMSEA =
0.04, NNFI = 0.93 as reported in Chapter 7) and the paramedics organisation (RMSEA =
0.05, NNFI = 0.93).
Table 10.4 Summary of Model Fit Statistics for the Baseline Bifactor Model of WOAQ across Organisations
Model SB χ2/df AIC CFI RMSEA NNFI
Nurses 1.71 89.66 0.94 0.04 (0.04-.05) 0.93
Paramedics 3.77 565.32 0.93 0.05 (0.05-0.06) 0.93
Note. RMSEA = Root Mean Square Error of Approximation; NNFI = Non-normed fit index.
The model fit for both organisations combined (Table 10.5) was also good (RMSEA =
0.03, NNFI = 0.96, CFI = 0.97), so it was appropriate to proceed with the reliability
assessment of the model.
Table 10.5 Summary of Model Fit Statistics of the Bifactor Models of WOAQ (n=1257)
Model SB χ2/df RMSEA NNFI CFI
0. Independent Model 52.91
1. Bifactor model 2.83 0.03 (.03-.04) .96 .97
Note. RMSEA = Root Mean Square Error of Approximation; NNFI = Non-normed Fit Index.
The model-based reliability coefficients of Omega total, Omega hierarchical, and
Omega subscales were calculated for each organisation separately. As shown in Table 10.6,
both organisations demonstrated high reliability for both Omega total and Omega
hierarchical, with Omega hierarchical representing the reliability of the general WOAQ
factor. The reliability for the Omega subscales of ‘quality of relationship with management’
and ‘reward and recognition’ were also similar for the two samples. However, the reliabilities
for ‘the quality of physical environment’, ‘the workload issues’ and ‘the relationships with the
colleagues’ were quite different in these two samples. The ‘quality of physical environment’
and ‘workload issues’ reliabilities were higher for the nurses, while the reliability for the
‘relationships with the colleagues’ construct for the paramedics was almost double that for the
nurses.
Table 10.6 WOAQ Reliability Statistics for Nursing (as reported in Chapter 7) and Paramedic Organisations

                                             Nursing                Paramedics
Constructs                                   ω_t    ω_h    ω_s      ω_t    ω_h    ω_s
General WOAQ                                 .92    .87    -        .94    .89    -
Physical environment                                       .51                    .42
Relationships with colleagues                              .35                    .70
Quality of relationships with management                   .16                    .22
Reward & recognition                                       .15                    .13
Workload issues                                            .39                    .29

Note. ω_t = Omega total; ω_h = Omega hierarchical; ω_s = Omega subscale.
The covariate-free and covariate-dependent reliability coefficients are given in Table
10.7. The model-based reliability coefficient rho shows that although the WOAQ is very
reliable for the whole sample (coefficient rho = 0.95), some part of the reliability is
dependent on organisational type (covariate-dependent coefficient rho = 0.32). Based on the
results, the type of organisation accounts for around 33% of the reliability. This indicates that
once the organisation type is controlled, there is less consistency left in the WOAQ
(covariate-free coefficient rho = .63). This result suggests that different parameter estimates
might be required for the nursing and paramedic samples. This will be tested below.
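The partition reported for the WOAQ can be checked with a few lines of arithmetic. The sketch below uses the three coefficients reported in Table 10.7 and assumes, as those values suggest, that the covariate-dependent and covariate-free parts sum to the total coefficient. The variable names are hypothetical.

```python
# Coefficients reported for the combined organisations (Table 10.7).
total_rho = 0.95
covariate_dependent_rho = 0.32
covariate_free_rho = 0.63

# The two parts should add up to the total coefficient (within rounding).
parts_sum = covariate_dependent_rho + covariate_free_rho

# Share of the total reliability that depends on organisation type.
dependent_share = covariate_dependent_rho / total_rho

print(round(parts_sum, 2), round(dependent_share, 2))
```

On these figures, roughly one third of the total reliability is attributable to organisation type.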
Table 10.7 WOAQ Reliability Statistics across Organisations (n=1257)

Bifactor WOAQ               Combined organisations
Coefficient rho             .95
Covariate-dependent rho     .32
Covariate-free rho          .63

Model 1: Configural model with no constraints. The next step included fitting a
configural bifactor model for the nursing and paramedic organisations simultaneously, to
determine if the model was appropriate across organisations when no constraints were
imposed. Based on the results, the configural model (Table 10.8) showed marginal model fit
across organisations (RMSEA = 0.06, NNFI = 0.88), suggesting that there are indeed
significant differences in the parameter estimates for these two samples.
Model 2: Invariant loadings. However, after constraining the loadings to be equal for
both nursing and paramedics, the results showed good fit (RMSEA = 0.04, NNFI = 0.92).
Table 10.8 Invariance Testing Across Organisations for the Bifactor Model of WOAQ

Models                                       SB χ2/df    AIC        CFI    RMSEA            NNFI
Model 1 - Configural model-no constraints    3.73        1086.76    .90    .06 (.06-.07)    .88
Model 2 - M1+loadings invariance             2.68        478.16     .93    .04 (.04-.05)    .92
Construct mean differences                   2.99        698.89     .95    .05 (.05-.06)    .94

Note. RMSEA = Root Mean Square Error of Approximation; NNFI = Non-normed Fit Index; CFI = Comparative Fit Index.
Group Differences for Construct Mean. The mean differences for the nursing and
paramedic organisations were therefore considered for the general factor of WOAQ and the
nested five sub-factors, with the paramedic organisation means selected as the reference
category. The construct means for the paramedic organisation were therefore set to zero,
while the construct means associated with the nursing were tested, providing an estimate for
the mean differences between the groups for all the constructs.
After setting equality constraints on loadings and intercepts for the measured
variables, with factor intercepts of zero only for paramedics, the results showed a good fit for
the model (RMSEA = 0.05, NNFI = 0.94). The mean differences between nurses and
paramedics demonstrated differences primarily on two of the nested constructs (‘co-worker’
and ‘workload issues’). The results showed that the mean scores on ‘the relationships with the
co-workers’ were higher for the paramedics than for the nurses, while the mean scores on
‘workload issues’ were lower for the paramedics, confirming that the parameter estimates do
differ for these two samples and explaining why the covariate-dependent reliability is so high.
The reasons for these differences will be explored in the discussion in Chapter 11.
But now we return to the second application in which the effects of CMB, measured
using a social desirability scale, and CMV due to a single survey source, are evaluated for a
student study of emotional intelligence, wellbeing and alcohol drinking behaviour.
10.3 Application in Demonstrating CMB using Social Desirability
As explained in the previous chapter, covariate-dependent reliability can be used for
common method bias evaluation. This section reports the results for a different sample
(students) with the purpose of demonstrating possible CMB. The demographic
characteristics of the participants are presented in Table 10.9.
Table 10.9
Summary of the Demographic Characteristics of the Participants (n=341)

                                 %
Gender            Male           18.18
                  Female         81.82
Study status      Part-time      2.64
                  Full-time      97.36
Age (Mean/SD)     20 (3.98)
The overall reliability of the model, with CMB (social desirability) as a covariate, is
illustrated in Figure 10.1.
Figure 10.1. The effects of latent factor bias (social desirability) on the reliability of the constructs. Note. Due to the identification issue, item parcelling was used for the emotional intelligence (EI) construct, allowing four EI subfactors to load as observed variables for this construct.
The effect of social desirability as a source of CMB was assessed by conducting
Bentler’s covariate-dependent and covariate-free reliability assessment procedures. If there
is a difference in the reliability coefficients of the constructs after including CMB as a
covariate in the model, this means that there is some degree of covariate-dependent reliability,
and we can therefore conclude that CMB, in the form of social desirability, has biased the
reliability of the constructs.
[Figure 10.1 diagram. Labels: CMB-Social Desirability; F; Alcohol Habits; Emotional Intelligence; Wellbeing.]
Table 10.10
Comparison of the Covariate-dependent and Covariate-free Reliability Coefficients of the
Scales after Including CMB

                Reliability coefficients rho (ω_t)
                Overall reliability    Covariate-dependent reliability    Covariate-free reliability
All Scales      .84                    .18                                .66

As presented in Table 10.10, the CMB variable (social desirability) inflated the
reliability of the scales by around 27%. Removing the effect of CMB reduced the reliability of
the scales to only 0.66. This result suggests that CMB had a remarkable effect on reliability,
providing support for its existence in this study. Further analysis has been conducted in order
to also test for common method variance (CMV) using the model shown in Figure 10.2.
Figure 10.2. The proposed model for evaluating CMB/CMV.
Note. CMV = common method variance, CMB = common method bias (social desirability).
[Figure 10.2 diagram. Latent variables: WELLBEING, ALCOHOL, EMOTIONAL INTELLIGENCE, CMV, CMB*. Indicators: WORN, NERVE; ALC1, ALC2, ALC3; EI-FAC, EI-REG, EI-UND, EI-PER; Social Des 1, Social Des 2, Social Des 3. Disturbance: D5.]
Model 1. This is a baseline model illustrated in Figure 10.3, where the three study
constructs (well-being, alcohol drinking behaviour, and emotional intelligence) are correlated
with each other, but CMV and CMB weights are constrained to zero (i.e., are not controlled
for). This is used as a comparison model when there is no control for method bias and
variance.
Figure 10.3. Model 1: Baseline model when all the study constructs are correlated
without controlling for CMV and CMB.
[Diagram. Latent variables: WELLBEING, ALCOHOL, EMOTIONAL INTELLIGENCE, CMV, CMB*. Indicators: WORN, NERVE; ALC1, ALC2, ALC3; EI-FAC, EI-REG, EI-UND, EI-PER; Social Des 1, Social Des 2, Social Des 3. Disturbance: D5.]
Model 2. The second model, illustrated in Figure 10.4, was compared with the
baseline model. This is a constrained model in which CMV and CMB were included but the
loadings from CMV to the study indicators were constrained to have equal effects. CMB
(social desirability) was included in the model as a predictor of CMV. It is expected that
social desirability would be the main source of bias for the study’s self-rated questionnaire
when asking about alcohol drinking behaviour and emotional intelligence skills. However,
this model controls specifically for CMB caused by social desirability, as well as other
random sources of CMV (e.g. single survey source).
Figure 10.4. Model 2. Constrained equal loadings from CMV to the study indicators.
[Diagram. Latent variables: WELLBEING, ALCOHOL, EMOTIONAL INTELLIGENCE, CMV, CMB*. Indicators: WORN, NERVE; ALC1, ALC2, ALC3; EI-FAC, EI-REG, EI-UND, EI-PER; Social Des 1, Social Des 2, Social Des 3. Disturbance: D5.]
Model 3. The third model, illustrated in Figure 10.5, is the same as the previous model
but the loadings of CMV to the indicators were allowed to differ. A comparison of the
constrained Model 2 and unconstrained Model 3 with the baseline model tests the amount of
CMV for each of the study constructs individually. A comparison of Model 2 (constrained)
with Model 3 examines whether the effects of CMV are equal for all three constructs
(wellbeing, alcohol drinking behaviour, and emotional intelligence).
As shown in Table 10.11, both the constrained CMV (Model 2) (SBχ2/df (48) = 2.44,
RMSEA = .07, CFI = .86) and unconstrained CMV (Model 3) (SBχ2/df (39) = 1.45, RMSEA
= .04, CFI = .96) describe the data significantly better than the baseline model, which does not
account for any common method variance or bias (SBχ2/df (52) = 3.61, RMSEA = .09, CFI =
.73). These results suggest that social desirability accounts for part of the method bias.
A comparison of Model 2 and Model 3 also showed that the latter, with varying
weights from CMV to the indicators, describes the data significantly better than Model 2 with
equal indicator loadings for CMV. The results suggest that CMV has different effects on the
indicator loadings for the three constructs (wellbeing, alcohol drinking behaviour, and
emotional intelligence), perhaps suggesting that social desirability may not be the only source
of CMV.
Figure 10.5. Model 3. Free loadings from CMV to the study indicators.
[Diagram. Latent variables: WELLBEING, ALCOHOL, EMOTIONAL INTELLIGENCE, CMV, CMB*. Indicators: WORN, NERVE; ALC1, ALC2, ALC3; EI-FAC, EI-REG, EI-UND, EI-PER; Social Des 1, Social Des 2, Social Des 3. Disturbance: D5.]
Table 10.11
Summary of Fit Indices of Comparison Models

Model                  SB χ2 (df)*    χ2/df    CFI    RMSEA (CI)        Comparison models                          Δχ2       Δdf    p
1. Baseline            188.16 (52)    3.61     .73    .09 (.08, .11)
2. Constrained CMV†    117.19 (48)    2.44     .86    .07 (.05, .08)    Baseline vs. Constrained CMV (1 vs. 2)     70.97     4      <0.001
3. CMB-CMV             56.62 (39)     1.45     .96    .04 (.01, .06)    Baseline vs. CMB-CMV (1 vs. 3)             131.54    13     <0.001
                                                                        Constrained CMV vs. CMB-CMV (2 vs. 3)      60.57     9      <0.001

Note. * Satorra-Bentler scaled chi-square; † loadings from CMV set to be equal in this model.
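The model comparisons in Table 10.11 can be reproduced with simple arithmetic. The sketch below takes the chi-square values and degrees of freedom from the table, forms the differences, and compares each difference with the standard chi-square critical value at the 0.001 level. It reproduces only the simple differences shown in the table; a formally scaled difference test for Satorra-Bentler statistics also needs the scaling corrections, which are not reported here and are not attempted. The dictionary keys and the function name are hypothetical.

```python
# Satorra-Bentler chi-square and degrees of freedom for each model (Table 10.11).
MODELS = {
    "baseline": (188.16, 52),
    "constrained_cmv": (117.19, 48),
    "cmb_cmv": (56.62, 39),
}

# Standard chi-square critical values at alpha = .001 for the relevant df.
CRITICAL_001 = {4: 18.467, 9: 27.877, 13: 34.528}


def chi_square_difference(restricted, general):
    """Simple chi-square and df differences between two nested models."""
    chi_restricted, df_restricted = MODELS[restricted]
    chi_general, df_general = MODELS[general]
    return round(chi_restricted - chi_general, 2), df_restricted - df_general


comparisons = [
    ("baseline", "constrained_cmv"),
    ("baseline", "cmb_cmv"),
    ("constrained_cmv", "cmb_cmv"),
]
for restricted, general in comparisons:
    delta_chi, delta_df = chi_square_difference(restricted, general)
    significant = delta_chi > CRITICAL_001[delta_df]
    print(restricted, "vs", general, delta_chi, delta_df, significant)
```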
Table 10.12 presents the differences between the standardised loadings of the three
constructs (wellbeing, emotional intelligence and alcohol drinking behaviour) when
CMV/CMB are controlled. As can be seen, in Model 3, CMV does not have equal effects
on the indicator loadings, and, comparing Model 1 and Model 3, emotional intelligence and
alcohol drinking behaviours have the most inflated weights when CMV and CMB are not
controlled.
Table 10.12 Standardised Factor Loadings for Different Models Compared to Baseline
Indicators                      Baseline     Constrained CMV-CMB   CMB-CMV
                                (Model 1)    (Model 2)             (Model 3)
Wellbeing: worn-out/exhausted   0.71         0.68                  0.70
Wellbeing: nervous/tense        0.85         0.62                  0.74
EI-Facilitation                 0.40         0.36                  0.19
EI-Regulation                   0.92         0.71                  0.69
EI-Understanding                0.43         0.56                  0.26
EI-Perceiving                   0.46         0.62                  0.29
Alcohol-1                       0.55         0.39                  0.11
Alcohol-2                       0.93         0.66                  0.25
Alcohol-3                       0.43         0.37                  0.09
Social Desirability-1                        0.50                  0.61
Social Desirability-2                        0.44                  0.37
Social Desirability-3                        0.55                  0.48
CMB>CMV path                                 -0.003                0.45
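The claim that some constructs carry more inflated loadings than others can be checked directly from Table 10.12. The sketch below, offered as an illustration rather than as part of the original analysis, averages the fall in standardised loading between the baseline (Model 1) and Model 3 for each construct. The names `LOADINGS`, `mean_drop` and `ranked_drops` are hypothetical, and a simple unweighted mean is an assumption of this sketch.

```python
# Standardised loadings from Table 10.12: (baseline Model 1, Model 3).
LOADINGS = {
    "wellbeing": [(0.71, 0.70), (0.85, 0.74)],
    "emotional_intelligence": [(0.40, 0.19), (0.92, 0.69),
                               (0.43, 0.26), (0.46, 0.29)],
    "alcohol": [(0.55, 0.11), (0.93, 0.25), (0.43, 0.09)],
}


def mean_drop(pairs):
    """Average fall in loading once CMV and CMB are controlled."""
    return sum(base - ctrl for base, ctrl in pairs) / len(pairs)


def ranked_drops(loadings=LOADINGS):
    """Constructs sorted from the largest to the smallest mean drop."""
    drops = {name: round(mean_drop(p), 3) for name, p in loadings.items()}
    return sorted(drops.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    for construct, drop in ranked_drops():
        print(construct, drop)
```

On these figures the alcohol indicators fall the most, followed by emotional intelligence, with wellbeing changing least, which matches the ordering described in the text.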
This example supports the hypothesis that the SEM-integrated approach allows
control for the effects of common method variance and bias due to social desirability as
well as other possible sources of CMV.
11
STUDY 2: DISCUSSION
An overview of the history of reliability assessment was presented in Chapter 3,
starting with the single general coefficient for assessing internal consistency reliability
which was published by Cronbach in 1951. The critique of this coefficient and
recommendations for improvements were discussed. Although these recommendations may
be useful, there are other methods that could be considered. In particular, the newly
developed covariate-free and covariate-dependent coefficients (Bentler, 2014) provide
insight into the internal consistency of scales when covariates are controlled.
The influence of covariates on rho (a model-based reliability coefficient) and on the
development of covariate-free coefficients of reliability was described in chapter 9. An
empirical study demonstrated the role of organisational type on the reliability of WOAQ in
Chapter 10. The WOAQ is a widely used measure for risk assessment in organisations,
based on the identification and collection of employee opinion regarding their work and
health (Griffiths, Cox, Karanika, Khan, & Toma, 2006). The scale is relatively short with
28 items. A separate student data set (Chapter 10) was also used to demonstrate how the effect of
CMB on reliability could be evaluated using the covariate-dependent and covariate-free
reliability measures. In addition, the effects of CMV and CMB on each of the constructs
emotional intelligence, wellbeing and alcohol drinking behaviour were compared in
Chapter 10. This chapter provides a discussion for these two applications.
11.1 Discussion: Application in Reliability Assessment of WOAQ
In this section, the reliability and covariate-dependent reliability of the WOAQ are
discussed for Australian employees in two separate organisations – a community nursing
organisation and a paramedic organisation. The WOAQ was validated as a bifactor measure
in Chapter 7, including a general measure of WOAQ and five nested subfactors, each
representing different dimensions of work organisation risk assessment. The five nested
subfactors were: ‘quality of relationships with management’, ‘reward and recognition’,
‘workload issues’, ‘quality of relationships with colleagues’, and ‘quality of physical
environment’. Although the employees in both organisations provide clinical services, the
nature and demands of the occupations are very different. In chapter 10 the results of
Bentler’s covariate-dependent approach to reliability assessment showed that almost one
third of the reliability was accounted for by the type of organisation.
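The statement that part of the reliability is accounted for by a covariate can be read through a variance decomposition. The following is a sketch under stated assumptions and is not Bentler's (2014) own notation: an observed composite $X = T + E$ with true score $T$ and error $E$, and a linear regression of $T$ on the covariates $z$ whose residual $\zeta$ is uncorrelated with $z$. The labels $\rho_{cd}$ and $\rho_{cf}$ are introduced here for illustration only.

```latex
\begin{align}
T &= \beta^{\top} z + \zeta, \qquad \operatorname{Cov}(z, \zeta) = 0 \\
\rho_{XX} &= \frac{\operatorname{Var}(T)}{\operatorname{Var}(X)}
  = \underbrace{\frac{\operatorname{Var}(\beta^{\top} z)}{\operatorname{Var}(X)}}_{\rho_{cd}}
  + \underbrace{\frac{\operatorname{Var}(\zeta)}{\operatorname{Var}(X)}}_{\rho_{cf}}
\end{align}
```

On this reading, the share of reliability attributable to the covariate is $\rho_{cd} / \rho_{XX}$, so a ratio close to one third would correspond to the organisation-type result described above.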
The invariance testing supported this conclusion. It was found that ‘relationship
with colleagues’ was much more important to paramedics than to nurses. This can be
explained by the nature of work done by the paramedics. The service delivery for
paramedics is based on teamwork; without teamwork the quality of work would be
affected. Therefore, it makes complete sense that for paramedics their relationship with
their colleagues had higher loadings than was the case for home-based nurses.
In contrast, workload issues were less important to paramedics than to nurses. The
items captured for the workload issue construct consisted of pace of work, workload, and
the impact of work on family and family on work. When comparing the demographic
characteristics of both organisations, it is not surprising that the workload issue had less
importance for paramedics than for nurses. One main reason is working hours. The
majority of paramedics work part-time, while the majority of nurses work full-time. Also,
the majority of the nurses were female and the majority of the paramedics were male. The
literature on work-family conflict demonstrates that women, both as employees and
caregivers, tend to be more exposed to the experience of conflict in juggling their work and
family demands. It is therefore likely that the impact of work-family conflicts and workload
are less overwhelming for paramedics, who are mainly male and work part-time, than for
nurses, who are mainly female and work full-time.
‘Pace of work’ was another item included in the workload construct. Because
paramedics work in a fast-paced work environment, it is possible that individuals interested
in such working conditions tend to become paramedics. This suggests that the pace of work
does not bother them as much as it does home-based nurses. Both the home-based nursing
staff and the paramedics provide their services outside of their formal organisational
settings. However, the nurses’ work sites tend to be within the homes of their clients. The
atmosphere is one of trust, as the nurses and their clients interact over a number of clinical
treatments over an extended period of time. In contrast, the paramedics have a variety of
work sites, ranging from road, school and workplace accidents, through to nursing homes
and clients’ homes. Their interaction is by definition urgent and filled with emotion. Often,
their clients are unable to interact with the paramedics. For these two seemingly similar
organisational types, there are very significant differences in the work environment that
influence the responses to both the ‘Your Work’ and ‘Your Well-being’ components of the
WOAQ.
The above comparisons show how the type of organisation and the nature of work
impact on the way work organisation is assessed. As the example of WOAQ has
demonstrated, using Bentler’s covariate-dependent and covariate-free reliability
assessment could have many benefits in practice, allowing scholars and researchers to
extract meaningful information from these measures. The results of Study 1 and Study 2
clearly show that the WOAQ is a useful tool for assessing different aspects of work
organisations in health settings. However, different types of organisations put different
weightings on the parameters assessed in the WOAQ. When assessing WOAQ in a
paramedic organisation, more attention should be paid to teamwork and fostering a spirit of
teamwork in the organisation. If WOAQ outcomes need to be improved in a home-based nursing
setting, the focus should be on workload issues and managing work-life balance.
Therefore, it is very important to consider the possible covariates of reliability to get
more precise and meaningful outcomes in assessments. Study 2 presented an example
relating to the application of WOAQ, but this procedure has many other potential uses in
educational, clinical and/or health settings that need further investigation.
11.2 Discussion: Application in Demonstrating CMB
The other application of Bentler’s covariate-dependent and covariate-free reliability
procedure covered in chapter 10 offers new and comprehensive techniques for controlling
CMB and CMV which are common problems in psychological studies. The issue of CMV
becomes more noticeable when researchers rely on a single source of data collection and
CMB becomes more noticeable when self-report measures are used. In Chapter 10 the
covariate-dependent reliability procedure was used to demonstrate the effect of CMB and
CMV on scale reliability. The effect of social desirability was evaluated as a covariate in
the student study of emotional intelligence, wellbeing and alcohol drinking behaviour. The
results showed that around 27% of the scale reliability was influenced by CMB as
measured by social desirability. A SEM approach introduced by Williams, Hartman, and
Cavazotte (2003) and Podsakoff et al. (2003), and further developed by other researchers
(e.g., Richardson et al., 2009; Williams et al., 2010), was then used for assessing the effect
of CMV and CMB on each of the study constructs, emotional intelligence, wellbeing and
alcohol drinking behaviour.
The results produced two main findings:
a) CMB (due to social desirability) appears to inflate the reliability measures of the
scales.
b) The measures of emotional intelligence and alcohol drinking habits were more
influenced by CMB and CMV than wellbeing.
Consistent with previous findings (e.g., Podsakoff et al., 2003; Richardson et al.,
2009; Williams et al., 2010), it seems that the CFA approach provides a practical method
for controlling for method variances and biases. The findings also demonstrate that forcing
equal CMV effects for all measures is not appropriate because it adversely affects the
model fit. The findings showed that CMV effects differ depending on the nature of each
measure; CMV weights should therefore be allowed to vary. This result is similar to the
finding of Williams et al. (2010) but contradicts the equivalent method effects technique
proposed by Lindell and Whitney (2001).
11.3 Strengths
The research conducted in Study 2 and reported in chapters 9 to 11 was underpinned
by several strengths. First of all, to the best of the researcher’s knowledge, this study was
the first of its kind to demonstrate covariate-dependent reliability empirically. While
methods of reliability generalization (Vacha-Haase, 1998; Vacha-Haase & Thompson,
2011) have been proposed to study variation in reliability, reliability generalization is a
meta-analysis methodology requiring data from a large number of studies. In contrast,
Bentler’s (2014) new procedure for covariate-dependent and covariate-free reliability
coefficients requires only a single study and can provide more accurate coefficients of
reliability with estimates of how these are affected by group characteristics. This approach
can be adopted for the conventional internal consistency measure (coefficient α), as well as
model-based reliability for multi-dimensional studies. In Study 2 this method was initially
adopted in the context of a bifactor model. Using omega hierarchical and omega subscale
coefficients, the reliability of the general factor and its subfactors was assessed. Then, using a
covariate-dependent reliability assessment, the between-group variation in reliability was
assessed. This is a novel approach and has application potential whenever the reliability of
a given scale might be affected by grouping variations.
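For readers less familiar with the omega coefficients mentioned above, the commonly used formulas for a bifactor model can be sketched as follows. This is an illustrative Python sketch and not the EQS computation used in the study. It assumes standardised loadings, orthogonal general and group factors, and items that each load on the general factor and on exactly one group factor. The function names are hypothetical, and the loadings in the usage example are invented for illustration; they are not WOAQ estimates.

```python
def omega_hierarchical(general, groups, uniquenesses):
    """Share of total-score variance due to the general factor.

    general: general-factor loadings, one per item.
    groups: list of lists, each holding one group factor's loadings.
    uniquenesses: residual variances, one per item.
    """
    gen_var = sum(general) ** 2
    grp_var = sum(sum(g) ** 2 for g in groups)
    total = gen_var + grp_var + sum(uniquenesses)
    return gen_var / total


def omega_subscale(general_k, group_k, uniquenesses_k):
    """Share of one subscale's variance due to its own group factor,
    using only the items that belong to that subscale."""
    gen_var = sum(general_k) ** 2
    grp_var = sum(group_k) ** 2
    return grp_var / (gen_var + grp_var + sum(uniquenesses_k))


if __name__ == "__main__":
    # Hypothetical standardised loadings: four items in two subscales.
    general = [0.6, 0.6, 0.6, 0.6]
    groups = [[0.4, 0.4], [0.4, 0.4]]
    uniq = [1 - 0.6 ** 2 - 0.4 ** 2] * 4
    print(omega_hierarchical(general, groups, uniq))
    print(omega_subscale(general[:2], groups[0], uniq[:2]))
```

Keeping the general and group contributions in separate terms makes it easy to see how much of a subscale score reflects the general factor rather than the subscale itself.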
Secondly, this study has shown that the type of organisation influences the
reliability of assessments such as the WOAQ that measure the psychosocial and physical
aspects of an organisation. This finding introduces a novel area of research that needs
further exploration.
Thirdly, this study could be considered as one of the first of its kind that
demonstrates the application of covariate-dependent and covariate-free reliability in
assessing CMB effects on reliability. The procedure appears to provide a very
comprehensive and simple quantification of the method effects in self-reported studies of
this kind. More studies of this type are needed to provide a comprehensive understanding of
common method effects on the reliability of measures. The proposed covariate-dependent
and covariate-free reliability procedure can be easily calculated using EQS.
Fourthly, this is one of the first studies that integrates a measured and unmeasured
latent variable procedure (Podsakoff et al., 2003), controlling for CMV and CMB. The
procedure appears to provide a very comprehensive way of controlling method effects in
self-reported studies. However, there is a need for further studies in order to shed more
light on this procedure.
11.4 Limitations and Directions for Future Research
Despite the above strengths, this study is not without weaknesses and limitations.
One of the limitations in this study is the lack of background literature on covariate-
dependent reliability. This makes it difficult to evaluate and compare the results with other
studies. Further studies are required to expand this new, practical area of research.
In this study, only two occupations in the health field were compared in order to
assess the covariate-dependent reliability of the WOAQ. Further studies are needed to
expand the concept to white, blue, and pink collar workers, as well as other health
professional occupations.
Covariate-dependent reliability may have practical implications for cultural
comparisons using the WOAQ or other similar scales. Therefore, future studies could
consider the role of culture as a covariate in the assessment of reliability of such scales.
The marker variable choice for CMB (in this study, social desirability) is
controversial. Some scholars believe that the marker variable should not have any
relationship with other substantive constructs while others believe the opposite (e.g.,
Lindell and Whitney, 2001; Richardson et al., 2009; Williams et al., 2010). In their review
of previous studies, Williams et al. (2010) demonstrated that researchers use a broad range
of variables as marker variables for CMB. They concluded that “... no consideration has
been given to the role of theory associated with method processes to guide the selection of
marker variables and the understanding of their effects” (p. 505). Further studies are needed
to determine the best criteria for choosing an appropriate marker variable when controlling
for CMB.
12
STUDY 3: MODEL-BASED RELIABILITY AND VALIDITY OF
REFLECTIVE-FORMATIVE MODEL OF WAS USING PLS-SEM
All studies discussed in the previous chapters had constructs measured using
reflective models. The literature rarely reports formative-formative or reflective-formative
measurement models. The purpose of Study 3 is to demonstrate the validity and model-
based reliability of a reflective-formative model of the Work Ability Scale (WAS). The
results are compared with the misspecified reflective forms of the WAS model to highlight
the possible errors that occur because of misspecification of measurement models. The
results of Covariance-Based Structural Equation Modelling (CB-SEM) and Partial Least
Squares SEM procedures are presented in Chapter 13. The Discussion in Chapter 14
presents the practical implications of these findings. This chapter introduces this study.
12.1 Rationale and Objectives
In Australia, the Work Ability Survey (WAS) and the Work Ability Survey
Revised (WAS-R) were developed by the Business, Work and Ageing Research Centre, at
Swinburne University of Technology, Melbourne (Taylor & McLoughlin, Dec 2011). As
described in Chapter 5, a decision-making tree has been developed for distinguishing
formative from reflective models (Figure 5.5, Chapter 5).
As explained below, this decision-making tree suggests that the WAS should be
specified as reflective in the first-order and formative at higher orders (i.e. a reflective-
formative model).
12.1.1 Empirical Example: The Study of Work Ability
This section presents background on an empirical example (study 3) comparing the
results of a reflective-reflective higher-order measurement model for the work ability scale,
fitted using CB-SEM, and corresponding reflective-reflective, reflective-formative and
formative-formative work ability models fitted using the PLS-SEM procedure. Before
discussing the methodology used, there is a need to fully explain the theoretical background
on which this model is based. As demonstrated in the decision-making framework in
Chapter 5 (Figure 5.5), the first step is to evaluate the background theory of the measure to
find out if the work ability model has previously been considered as reflective or formative.
Therefore, a review of the empirical and theoretical background of work ability will be
provided next.
12.1.2 History of work ability research
More than thirty years of international research on work ability and age
management have provided proof that the working life can be improved and extended.
Work ability is predominantly about work-life balance. In the early 1980s, research on
work ability started in Finland to determine the length of people’s working life and how
this is affected by work contentment and job demands (Ilmarinen, 2009). Through the
years, work ability conceptualisation has progressed to become more holistic. The history
of work ability research can be divided into three phases: (1) Evolution (1980 – 1989), (2)
Conceptualisation and Implementation (1990 – 1999), and (3) Internationalisation (2000 –
present). A brief description of each phase will be presented next.
Evolution (1980 – 1989). Work ability in the 1980s was defined as “How good is the
worker at present and in the near future, and how able is he or she to do his or her work
with respect to work demands, health and mental resources?” (Ilmarinen, 2003, p. 3).
This was a period of longitudinal research, driven primarily by the question as to
what would happen employment-wise to the post-war baby boomers in the 1990s as they
started approaching retirement. The research also examined the extent of people’s working
life span and retirement (Ilmarinen, 2010).
Based on this positive approach, a multidisciplinary team of scientists constructed
and validated the Work Ability Index (WAI) and, using a stress/strain concept, they applied
and evaluated the work ability index on a large number of participants between 1981 and
2009 (Ilmarinen, 1991).
Conceptualisation and Implementation (1990 – 1999). The main characteristic of
the research during this period was the large number of longitudinal studies of men and
women who worked in the same occupation throughout the entire study period. The aim of
these longitudinal studies was to find a way to prevent disease and disability among
workers who were approaching retirement. Concurrently, the researchers were seeking a
way to maintain workers’ health and work ability (Tuomi, Ilmarinen, Klockars, Nygård,
Seitsamo, & Huuhtanen, 1997). Emphasis was on changes in work, lifestyle, health, stress
symptoms and work ability, as well as on the causes of any change. Changes were analysed
based on age, gender, work contentment and work profile. The occupations of the
participants were divided into physical work, mental work, and both physically and
mentally demanding work (Tuomi, Ilmarinen, Seitsamo, Huuhtanen, Martikainen &
Klockars, 1997).
The results highlighted that the different interactions between biological ageing,
health, lifestyle and work strongly affect work ability. But it appeared that, in general, work
ability decreases with age (Ilmarinen, Tuomi, & Klockars, 1997).
Even though a decline was observed in the work ability of the participants with age,
the initial age did not explain observed differences in the magnitude of these changes in the
participants’ work ability. The authors suggested that, in order to improve work ability,
there is a need for better supervisor attitudes, increasing variety at work, leisure and
physical activity (Tuomi, Ilmarinen, Martikainen, Aalto, & Klockars, 1997). It seemed that
while the work ability of senior employees usually declined with age, the work ability of
employees could be improved regardless of their age.
It was also found that the mean WAI improved among 10 per cent of the
participants and declined dramatically among 30 per cent. For 60 per cent of participants,
the index was steady at a good or excellent level (Tuomi, 1997). Based on a logistic
regression analysis, it was found that factors relating to lifestyle, management, and
ergonomics explained both positive and negative changes in work ability (Ilmarinen et al.,
1997).
The outcomes of the research had a profound impact in Finland. The Finnish social
partners made an agreement to promote and maintain work ability in workplaces. A work
ability measure was created and validated, and health professionals including physicians
and nurses were trained in the application of the WAI (Ilmarinen, 2009).
The study showed that the behaviours of managers and supervisors are among the
most critical factors influencing work ability. Also, improved work ability of ageing
employees and workers was directly related to age awareness (Ilmarinen, 2010). Based on
the study results, a focus on age management became popular in the early 1990s, and
training in age management started shortly afterwards. This developed into an international
course on age management which is still running (Ilmarinen, 2010).
Internationalisation (2000 – present). The original WAI was translated into many
languages in the early 1990s. The international validation of the index showed good results.
The psychometric properties of the scale and its predictive ability and cultural
appropriateness have been acknowledged to be constant across Europe (Gould, Ilmarinen,
Järvisalo, & Koskinen, 2008).
The global use of the original WAI provides excellent possibilities for international
networks and databases related to the index. This allows new possibilities for research,
which will strengthen WAI networks worldwide.
However, the work ability concept has changed over time. Current
multidimensional work ability theory focuses on the promotion of longer and healthier
careers with employment growth and improved wellbeing of the population until retirement
and beyond. Today, work ability is related to nearly all factors of work and life including
work-related, individual and social factors (Gould et al., 2008). These connections to most
aspects of daily living make the definition of work ability challenging and its promotion
demanding.
Since the 1980s, a large amount of research which focused on work ability and its
related factors has helped in the understanding of work ability and its complex relationship
with these factors. The growing importance of work ability research and applications is also
due to changes in the organisation of work and wider societal and population trends across
the world. In order to preserve work ability, it is essential to strive for a healthy work-life
balance (Gould et al., 2008).
There are several other indicators of work ability available; however, the original
Work Ability Index is by far the most widely used measure. In a three-level assessment of
work ability, participants evaluate their current work ability regardless of whether they
work. They may be completely fit for work, partially disabled for work, or completely
disabled for work. The score is usually referred to as the ‘work ability estimate’ and ranges
from 0 (full work disability) to 10 (best work ability) (Gould et al., 2008).
In the next section the current WAI, which incorporates the
original work ability estimate, is explained briefly.
The current Work Ability Index (WAI). The Finnish Institute of Occupational
Health originally developed the current index as a tool to predict retirement age and to
record the work ability of employees. It was designed to identify the health risks of
employees at an early stage and to highlight the risks of early retirement so as to avoid
these risks (Morschhäuser & Sochert, 2006). The WAI validity was tested using clinical
studies for many years. It has been used for years in occupational health and safety research
and practice in order to investigate the association between human resources and other
work-life factors, as well as to compare work ability in different age groups (Ilmarinen,
Tuomi, & Seitsamo, June 2005).
The index involves a self-assessment questionnaire and has a strong focus on health
status, resources and the subjective estimation of work ability (Gould et al., 2008). It is
based on questions that incorporate both the physical and mental demands of an employee’s
work (Tuomi, Ilmarinen, Jahkola, Katajarinne, & Tulkki, 2006). In the original study, after
completing the questionnaire, each employee was interviewed by an occupational health
professional. Based on the assessment, an evaluation was made as to whether there could be
any restriction or improvement on the employee’s current work ability in the future (Tuomi
et al., 2006).
The WAI has seven items (see Table 12.1), with a total score ranging from 7 to 49.
There are four categories derived from the WAI score, reflecting poor work ability (7 – 27
points), moderate work ability (28 – 36 points), good work ability (37 – 43 points) and
excellent work ability (44 – 49 points) (Martus, Jakob, Rose, Seibt, & Freude, 2010). The
score refers not only to the employee’s current status of work ability but also provides some
information on health-related risk factors. The results give an indication as to whether the
appropriate strategy is to maintain the current work ability, improve it and support it, or re-
establish it. According to Ilmarinen, the WAI is capable of reliably predicting work
disability, retirement and mortality (Ilmarinen, 2007).
Table 12.1
Items of the Work Ability Index
Items Range
1 Current work ability compared with the lifetime best 0 – 10
2 Work ability in relation to the demands of the job 2 – 10
3 Number of current diseases diagnosed by a physician 1 – 7
4 Estimated work impairment due to diseases 1 – 6
5 Sick leave during the past year (12 months) 1 – 5
6 Own prognosis of work ability two years from now 1 – 7
7 Mental resources 1 – 4
Note: Reprinted from Ilmarinen, J. (2007). The Work Ability Index (WAI). Occupational
Medicine, 57, p. 160
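The scoring rules described above (item ranges in Table 12.1 and the four category bands) can be expressed as a short check. The following Python sketch is illustrative only and is not an official WAI scoring routine. The names `ITEM_RANGES`, `BANDS`, `wai_total` and `wai_category` are hypothetical, and the sketch assumes whole-number totals, so a total falling between two bands is rejected.

```python
# Item score ranges from Table 12.1 (item number -> (min, max)).
ITEM_RANGES = {1: (0, 10), 2: (2, 10), 3: (1, 7), 4: (1, 6),
               5: (1, 5), 6: (1, 7), 7: (1, 4)}

# Category bands as quoted in the text (Martus et al., 2010).
BANDS = [(7, 27, "poor"), (28, 36, "moderate"),
         (37, 43, "good"), (44, 49, "excellent")]


def wai_total(scores):
    """Sum the seven item scores after checking each against its range."""
    if set(scores) != set(ITEM_RANGES):
        raise ValueError("scores must cover items 1 to 7")
    for item, value in scores.items():
        low, high = ITEM_RANGES[item]
        if not low <= value <= high:
            raise ValueError(f"item {item} out of range: {value}")
    return sum(scores.values())


def wai_category(total):
    """Map a total score to its work ability category."""
    for low, high, label in BANDS:
        if low <= total <= high:
            return label
    raise ValueError(f"total outside 7-49: {total}")


if __name__ == "__main__":
    lowest = {item: low for item, (low, _) in ITEM_RANGES.items()}
    highest = {item: high for item, (_, high) in ITEM_RANGES.items()}
    print(wai_total(lowest), wai_category(wai_total(lowest)))
    print(wai_total(highest), wai_category(wai_total(highest)))
```

The item minimums and maximums sum to 7 and 49 respectively, which agrees with the total score range stated in the text.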
The index is analysed based on two factors. The first factor reflects a subjective
assessment of current and future work ability. The second factor reflects objective data
regarding health status and sick leave (Radkiewicz, Widerszal-Bazyl, & the NEXT-Study
group, 2005). Items one, two, six and seven measure the subjective component of
the index. The third, fourth and fifth items measure the objective component of the index,
based on the occurrence or absence of different illnesses listed in the questionnaire.
The WAI is easy to use. It takes about ten to fifteen minutes to administer the
questionnaire and a further three to five minutes for evaluation (Ilmarinen & Tuomi, 2004).
It is highly recommended that participation is voluntary because the WAI surveys provide
confidential data on an employee’s illnesses and work ability. This means that data
protection must be strictly observed.
There are several other instruments designed to assess work ability or health-related
risk factors. While most instruments focus on labour and human resources policies, the
advantage of the WAI is that it concentrates directly on the employee's self-assessed work
ability (Morschhäuser & Sochert, 2006).
The reliability of the index was further analysed in the Netherlands using a test-
retest evaluation within a four-week interval. The results indicated that 25 per cent of
participants achieved the same WAI score on both measurements. The average test and
retest results were also similar indicating scale reliability (de Zwart, Frings-Dresen & van
Duivenbooden, 2002).
Validity and reliability for this index have been assessed using correlation analyses.
Other psychometric properties of the WAI have also been tested, including internal and
predictive validity, and the results have been published in peer-reviewed literature and
reports (European Network for Workplace Health Promotion (ENWHP) & National Work
Ability Index Network, 2012).
For example, de Zwart, Frings-Dresen, & van Duivenbooden (2002) explored work
ability among well-educated professionals, while another study (Pensola, Järvikoski, &
Järvisalo, 2008) looked at unemployment and work ability. Long unemployment, poverty,
and a lack of education are well known risks for marginalisation, and the results showed
clearly that unemployed individuals had more limited work ability than those who were
employed. Work ability scores were directly related to the extent of unemployment such as
its length and frequency. It was noted that part of the relationship between unemployment
and limited work ability was linked to economic difficulties, especially among the long-
term unemployed. In a 1991 study, subjective assessments reported via the WAI
questionnaire were compared with clinical examinations including cardiovascular,
musculoskeletal, and psychological measurements. The clinical examinations for both male
and female workers were selected according to health and subjective work ability as
reported by the questionnaire. The researchers found that the results suggested a
relationship between the level of work ability and other clinically assessed factors. There
were some individual differences observed, but they were explained on the basis of the
available data (Eskelinen, Kohvakka, Merisalo, Hurri, & Wagar, 1991).
The WAI can be used for individuals and groups, or even an entire company
(Morschhäuser & Sochert, 2006). A selected review panel from European countries stated
that the index is a useful, valid, and reliable tool that addresses a very relevant issue in the
workplace (European Network for Workplace Health Promotion (ENWHP) & National
WAI Network, 2012). They also viewed the index as a powerful predictive tool for
premature retirement. Organisations can implement strategies to moderate the risk of early
retirement based on the item responses for this index. However, the panel highlighted a few
challenges in terms of practical applicability and this has led to the development of new
multidimensional work ability models.
Multidimensional work ability model. With a large amount of research
undertaken internationally, there have been substantial changes to the work ability concept
and the conceptual models used to describe work ability. At the beginning of this
development, the aim was to predict retirement age and to try to find out how long people
are able to continue working after retirement, and what role work satisfaction and job
demands play in determining these factors. Health status was viewed as the most important
component of an individual’s functional capacity. With the development of the concepts of
work ability in a more holistic direction, consensus grew that work ability could not be
analysed individually and that there was a need for a conceptual shift to more of a life-work
balance model of work ability (Gould et al., 2008).
In early 2000, the Finnish Institute of Occupational Health in Helsinki introduced a
more advanced model of work ability. It is based on studies and development projects
conducted in the 1990s on occupational wellbeing in different industrial sectors and among
different age groups. The multidimensional image of work ability includes both individual
resources as well as work-related and personal factors (Finnish Institute of Occupational
Health, 2011). The dimensions of work ability are presented in the form of a ‘work ability
house’.
The factors influencing work ability represent four floors in a house (Figure 12.1).
The first floor includes human resources such as health - physical, mental, and social
functioning. If the first floor is strong, the chances are that a person will have stronger work
ability throughout his or her working life.
The second floor of the house contains knowledge and skills and their constant
updating, including education and relevant training. The third floor refers to the inner
values and attitudes and also to circumstances that motivate people at work. Work
environment is located on the fourth floor, right above attitudes because it directly affects
attitudes. When a person is exposed to good experiences, his or her positive values and
attitudes towards work are strengthened. On the other hand, bad experiences weaken both
attitudes and values (Finnish Institute of Occupational Health, 2011; Ilmarinen, 2010). As
clearly presented in the work ability house, work ability is formed by the work environment
as well as personal health and abilities.
Figure 12.1. Multidimensional work ability model. Reprinted from Finnish Institute of Occupational Health. (2011). Multidimensional work ability model. Helsinki, Finland, p. 1.
Outside the work ability house are additional influences on work ability.
Community organisations that support work, occupational health care and safety, as well as
the immediate social environment (family, friends, relatives, etc.) are also important.
Finally, the operational environment of work is added, including society, culture, social and
health policies and legislation. Government policies contribute to creating significant
prerequisites for work ability, but they also create challenges for work ability, such as
demanding a higher employment rate. Evidence shows that the core structure of work
ability is very dynamic and can change greatly during a person’s career. Any conflict
between family life and work life will have an impact on work ability. Also, support or lack
of support from the community will affect one’s work ability. Likewise, the introduction of
new technologies, the impact of globalisation, or changes in retirement/health/welfare
systems and legislation status will make a difference to work ability (Gould et al., 2008).
The multidimensional work ability model is very versatile and can be applied to
planning research and developmental projects, as well as training and education programs
(Ilmarinen, 2010).
Work ability in Australia. The work ability index has been used in Australia for
more than ten years. The major reason researchers are interested in its application is the
ageing population, and the need to enhance health and labour systems (Taylor, 2010). Such
considerations have caused policymakers to rethink the length of working lives. As
Australia faces a skills shortage and an ageing workforce, the focus is on finding answers to
the following question: “How can we tap into the available talent in the workforce and
remove the barriers to a life in work?” (Australian Government, Compare, 2011).
Researchers in Australia have used the WAI for different purposes such as
predicting employees’ retirement intentions (Oakman & Wells, 2009), predicting work
ability of employees (Palermo, Webber, Smith, & Khor, 2009), and examining the
relationship between age and work ability (Webber, Smith, & Scott, 2006).
However, while individual factors remain significant predictors of work ability,
Palermo (2010) has found that other organisational factors such as occupational stress, job
satisfaction, leadership effectiveness and the nurturing of workers are significant positive
predictors of work ability. The outcomes of this study strongly support the Finnish findings
that managers and supervisors play key roles in influencing work ability (Ilmarinen, 2010).
Organisations that advocate and endorse caring values for others are more likely to return a
better work ability score.
Figure 12.2. WAI scores: Australia and Finland. Reprinted from Taylor, P., & McLoughlin, C. C. (Dec 2011). Pilot Study on Workability. Monash University.
Unpublished presentation. Melbourne, p. 10.
Australian studies have compared the predictive power of retirement intentions,
investigated the connection between age, injury proneness and work ability, and assessed
the influence of organisational values on work ability. Three predictors of the WAI
accounted for 42 per cent of its variation. These variables were “management respects
you”, “working beyond physical capacity” and “unevenly distributed work” (Brooke,
Goodall, & Mawren, 2010). A surprising finding in these studies was an extremely high
mean work ability score (Figure 12.2) compared to the Finnish population (Taylor &
McLoughlin, Dec 2011; Palermo et al., 2009). These scores had a more negatively skewed
distribution compared to the Finnish distribution, even though the population studied varied
from private to public organisations, across locations and across different industries.
In Australia, the Work Ability Survey (WAS) and the Work Ability Survey Revised
(WAS-R) were developed in four companies by the Business, Work and
Ageing Research Centre, Melbourne (Taylor & McLoughlin, Dec 2011). These authors
have kindly provided their data for use in this study. WAS is an organisational survey that
is aligned with the four levels of the multidimensional work ability model, as well as the
WAI described above. It consists of physical and psychosocial work demand measures. The
original model (McLoughlin, 2009; Taylor & McLoughlin, Dec 2011) was specified as a
higher order reflective-reflective model as illustrated in Figure 12.4.
The personal and organisational capacities are two independent constructs that
jointly form the WAS. However, the six factors contributing to the organisational capacity
construct and the five factors contributing to the personal capacity construct are expected to
be correlated, suggesting that a reflective-formative specification of this model would have
been preferable to the originally specified reflective-reflective format. Considering the
theoretical background and other criteria demonstrated in Figure 5.5, it is confirmed that
the WAS should be modeled as a reflective-formative model.
The major aim of this study was to demonstrate the validity and model-based
reliability of a correctly specified reflective-formative model of WAS. It was hypothesised
that:
Hypothesis 12.1: A reflective-formative higher order model of WAS has acceptable validity
and model-based reliability.
The review of the misspecification literature in Chapter 5 showed that
misspecification of measurement models is common. As mentioned in that chapter,
misspecification can lead to Type I and II errors. A Type I error is a false positive error that
occurs when a path is declared significant when it is not (incorrect rejection of a true null
hypothesis). A Type II error is a false negative error that occurs when declaring a path to be
nonsignificant when it is significant (failure to reject a false null hypothesis). In SEM, Type
I errors may result from the erroneous application of a reflective model instead of a
formative model, while a Type II error can occur with the erroneous application of
formative models in place of reflective models (Jarvis et al., 2003; MacKenzie, Podsakoff,
& Jarvis, 2005).
A study by Petter et al. (2007) considered a series of simulations for structural
models that contained no significant paths. They found that when the formative construct
was misspecified as reflective, upward bias in the parameter estimates often produced a
Type I error. Roy, Tarafdar, Ragu-Nathan and Marsillac (2012) presented similar results.
They reported that misspecifying a reflective model as formative leads to a deflation of path
coefficients and R-square values (Type II error). Conversely, Petter et al. (2007)
found that misspecifying a formative model as reflective results in the inflation of path
coefficients and R-square values (Type I error). Petter et al. (2007, p. 631) stated that “The
danger of Type I error is that we, as researchers, may build new theories and models based
on prior research that finds support for a given relationship that does not actually exist. This
may affect the implications of our research for both academia and practice. The danger of
Type II error is that some interesting, valuable research may not be published if many of
the relationships within the model are found to be nonsignificant”. In a misspecified model
such as the reflective-reflective WAS (Figure 12.4), the variance of the constructs will
increase due to shared error. As a result, the path coefficients to the higher order constructs
will be increased, creating an upward bias in the result. The opposite will happen in a
misspecified formative-formative model of WAS, resulting in a downward bias. To the best
of this researcher’s knowledge, no study has investigated the consequences of
misspecifying a mixed model (reflective-formative and formative-reflective).
The secondary aim of this study was to increase the awareness of the
misspecification problem by demonstrating the possible consequences of model
misspecification in an empirical study. The correctly specified reflective-formative model
for WAS was therefore compared with the misspecified reflective-reflective and formative-
formative models, fitted using Partial Least Squares SEM in order to quantify any Type I or
II errors.
Partial Least Squares SEM was used to evaluate the WAS models for several
reasons. First, evaluating the WAS reflective-formative model using the conventional
Covariance-Based SEM procedure was very problematic due to identification problems. As
asserted by Bollen and Lennox (1991), a reflective measure can be easily identified and
evaluated using Covariance-Based SEM, while a formative measure cannot be easily
identified, except by placing the measure in a larger path structure with other variables that
can be evaluated (e.g. using a MIMIC model). The Partial Least Squares SEM procedure is a
better alternative for evaluating models with formative measures; these can be simply
evaluated in isolation using this procedure. There is also less restriction in terms of
normality or sample size compared to Covariance-Based SEM (Roy et al., 2012).
The PLS-SEM results for the correctly specified reflective-formative model were
compared with the misspecified reflective-reflective WAS model evaluated using CB-SEM,
to evaluate the consequences of the misspecification, allowing the testing of the following
hypothesis:
Hypothesis 12.2: The results for a misspecified reflective-reflective model (fitted
using Covariance-Based SEM) will demonstrate inflated loadings, compared to a correctly
specified reflective-formative WAS model (fitted using Partial Least Squares SEM).
The methodological procedure for fitting a reflective-formative model in Partial
Least Squares SEM is demonstrated in detail below.
Figure 12.3. The correctly specified reflective-formative model of WAS.
[Path diagram not reproduced. Labels: indicators Q1 to Q34; first-order constructs CONTROL, TRUST, RESPECT, SUPPORT, HARASSMENT, TRAINING, MENTAL H, PHYSICAL H, WORK-HOME, HOME-WORK and LEISURE; second-order constructs ORGANISATIONAL and PERSONAL; higher-order construct WAS.]
Figure 12.4. The misspecified reflective-reflective model of WAS.
[Path diagram not reproduced. Labels: indicators Q1 to Q34; first-order constructs CONTROL, TRUST, RESPECT, SUPPORT, HARASSMENT, TRAINING, MENTAL H, PHYSICAL H, WORK-HOME, HOME-WORK and LEISURE; second-order constructs ORGANISATIONAL and PERSONAL; higher-order construct WAS.]
Figure 12.5. The misspecified formative-formative model of WAS.
[Path diagram not reproduced. Labels: indicators Q1 to Q34; first-order constructs CONTROL, TRUST, RESPECT, SUPPORT, HARASSMENT, TRAINING, MENTAL H, PHYSICAL H, WORK-HOME, HOME-WORK and LEISURE; second-order constructs ORGANISATIONAL and PERSONAL; higher-order construct WAS.]
12.1.3 Composite reliability using PLS
All the model-based reliability assessments described in Studies 1 and 2 require the
use of reflective models and covariance-based SEM (CB-SEM). CB-SEM typically uses maximum
likelihood (ML) estimation, whereas Partial Least Squares SEM (PLS-SEM) uses an alternative
estimation approach called partial least squares estimation.
In the absence of normality or when the sample size is small, PLS-SEM seems to be
an appropriate alternative to CB-SEM for computing model-based reliability coefficients.
PLS-SEM is considered to be a correct and feasible method for estimating formative or
reflective-formative models. Usually models involving formative constructs present
identification problems and are difficult to evaluate using CB-SEM, while PLS-SEM is
commonly regarded as a good tool for evaluating such models. An additional advantage of
PLS-SEM is acknowledged when developing measurements with new theoretical or
empirical backgrounds (Rigdon, 2012); PLS-SEM seems to provide a more appropriate
procedure for reliability assessments in this case. Pro-PLS scholars believe that
research data can help in building an empirical background and unobservable conceptual
variables (Rigdon, 2012). On the other hand, CB-SEM followers believe that one should
specify a conceptual structure and seek evidence regarding whether this structure is
consistent with empirical evidence, so that results can challenge, support, or modify those
conceptualizations (see the previous chapter for more details on PLS-SEM vs.
CB-SEM).
Despite the less restrictive nature of PLS-based SEM, it is still not as popular as
covariance-based SEM in model-based reliability assessments. The main reason for this
previously was a lack of software for model estimation, but this problem is now being
addressed. Since 1984, and especially from the early 2000s, more user-friendly software
has been introduced for the estimation of PLS-based SEM, adding to the popularity of the
method.
Built on classical test theory and using PLS-SEM, Composite Reliability can be
estimated for constructs (Werts, Linn & Jöreskog, 1974). Composite reliability (CR) is the
reliability of a composite (summated scale) of multiple similar items. In other words, CR is
the ratio of the total true score variance to the total scale variance.
The CR will be equal to coefficient alpha when essential tau-equivalence holds for all
items; otherwise, CR is usually higher than coefficient alpha.
The reliability of reflective measures using PLS can be tested using Composite
Reliability ($\rho_c$) (Werts, Linn & Jöreskog, 1974). Composite reliability takes into account
the different outer loadings of the indicator variables in a model and therefore seems to
better reflect the model-based reliability compared to internal consistency coefficients such
as Coefficient alpha (Hair et al., 2014). Values of 0.60-0.70 or higher are acceptable for CR
in the early stages of scale development, and values of 0.80 and higher are satisfactory for
more developed (established) measures (Nunnally & Bernstein, 1994).
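The relationship between CR and coefficient alpha described above can be checked numerically. The following is a minimal sketch, assuming a single reflective construct with standardized loadings and uncorrelated errors; the loading values are hypothetical and the function names are introduced here for illustration only.

```python
import numpy as np

def composite_reliability(loadings):
    """CR from standardized loadings of one reflective construct."""
    lam = np.asarray(loadings, dtype=float)
    true_var = lam.sum() ** 2             # square of the summed loadings
    error_var = (1.0 - lam ** 2).sum()    # summed indicator error variances
    return true_var / (true_var + error_var)

def implied_covariance(loadings):
    """Model-implied covariance matrix of standardized indicators (one factor)."""
    lam = np.asarray(loadings, dtype=float)
    return np.outer(lam, lam) + np.diag(1.0 - lam ** 2)

def coefficient_alpha(cov):
    """Coefficient alpha computed from an item covariance matrix."""
    cov = np.asarray(cov, dtype=float)
    k = cov.shape[0]
    return k / (k - 1) * (1.0 - np.trace(cov) / cov.sum())

# Hypothetical loadings: equal loadings versus unequal loadings.
equal = [0.7, 0.7, 0.7, 0.7]
unequal = [0.5, 0.6, 0.8, 0.9]

cr_equal = composite_reliability(equal)
alpha_equal = coefficient_alpha(implied_covariance(equal))
cr_unequal = composite_reliability(unequal)
alpha_unequal = coefficient_alpha(implied_covariance(unequal))
```

With equal loadings the two coefficients coincide, while with unequal loadings CR exceeds alpha, which is the pattern described in the text.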
To test the reliability of constructs, some scholars (Chin 1998; Hair et al., 2014;
Fornell & Larcker, 1981) suggest reporting not only the Composite Reliability of the scale
but also the reliability of each indicator (since the reliability of each indicator may differ) as
well as the Average Variance Extracted (AVE), which measures how much indicator
variance is explained by the common factors.
As before, the convergent validity of the indicators of a construct is defined as “the
extent to which a measure correlates positively with alternative measures of the same
construct” (Hair et al., 2014, p. 102), and the average variance extracted (AVE) can be used
to test for convergent validity (Fornell & Larcker, 1981), with a value greater than
0.50 required. In addition, if the square root of the AVE exceeds the estimates of the
intercorrelation of the construct with other constructs, discriminant validity is supported
(Chin 1998; Fornell & Larcker 1981). Reporting the Composite reliability as the reliability
of a summated scale is needed as much as the average variance extracted (Fornell &
Larcker, 1981).
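The AVE cutoff and the square-root-of-AVE comparison described above can be sketched as follows. This is an illustration under stated assumptions: loadings are standardized, so the AVE is the mean squared loading, and the loadings and the construct correlation are hypothetical values, not WAS results.

```python
import numpy as np

def average_variance_extracted(loadings):
    """AVE as the mean squared standardized loading of a construct."""
    lam = np.asarray(loadings, dtype=float)
    return float(np.mean(lam ** 2))

def fornell_larcker(ave_values, construct_corr):
    """For each construct, check sqrt(AVE) > largest |correlation| with the others."""
    root_ave = np.sqrt(np.asarray(ave_values, dtype=float))
    corr = np.abs(np.asarray(construct_corr, dtype=float))  # new array, input untouched
    np.fill_diagonal(corr, 0.0)                             # ignore self-correlations
    return root_ave > corr.max(axis=1)

# Hypothetical loadings for two constructs and their hypothetical correlation.
loadings = {"TRUST": [0.80, 0.75, 0.70], "RESPECT": [0.60, 0.65, 0.70]}
corr = [[1.0, 0.7],
        [0.7, 1.0]]

ave = {name: average_variance_extracted(lam) for name, lam in loadings.items()}
convergent_ok = {name: value > 0.50 for name, value in ave.items()}
discriminant_ok = fornell_larcker(list(ave.values()), corr)
```

In this made-up example the first construct passes both checks, while the second fails the 0.50 cutoff and the discriminant validity comparison.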
LISREL does not output the CR directly and some manual calculation needs to be
done in order to obtain CR. However, SmartPLS reports not only the reliability at item
level and the CR at scale level but also the AVEs, all in one single analysis. In addition,
using SmartPLS, the confidence interval of the composite reliability can be estimated by a
bootstrapping procedure. This allows the testing of the hypothesis that the reliability
coefficient exceeds a specified value in the population.
12.2 Method
12.2.1 Participants
The data for the present study was obtained from the Redesigning Work for an
Ageing Society research program conducted by the Business, Work & Ageing Centre for
Research (BWA) at Swinburne University of Technology in Melbourne. The data was
collected from four case study organisations during 2007-2008 with an overall response
rate of around 40% (a total sample of 1687 respondents). The final data used in the present
study contained 1344 respondents after the removal of 343 incomplete survey
responses.
12.2.2 Measure
The Redesigning Work for an Ageing Society research program developed the
Work Ability Survey (WAS), through the work of McLoughlin (2009) and Taylor and
McLoughlin (2011). The WAS has two main sub-constructs entitled personal and
organisational capacities. The organisational capacities scale consists of six subconstructs:
control, respect, trust, support, harassment, and training. The personal capacities scale has five
subconstructs: leisure, work-home balance, home-work balance, mental health, and
physical health. A version of the questionnaire is presented in Appendix E with
permission from the researchers involved in the original study.
12.2.3 Ethics
The original study obtained ethics clearance from the participating organisations
and permission to reuse the database in similar studies.
12.2.4 Overview of analysis
Covariance-Based SEM using AMOS software and Partial Least
Squares SEM using SmartPLS (v2.0) were used to assess the reflective-reflective model of
WAS. An overview of the similarities and differences between Covariance-Based SEM and
Partial Least Squares SEM follows.
Debates regarding the superiority of Covariance-Based SEM over Partial Least
Squares SEM have existed since the early years of development of these procedures (see
Chapter 2 for details on the origins of each procedure). In particular, some scholars have
questioned the practicality and generalisation of the PLS method for factor estimation.
In spite of the wide criticism of Partial Least Squares in the literature, PLS has
specific strengths in specific situations that some Covariance-Based SEM scholars have
misunderstood or ignored. A comparison of some of the main features of both approaches,
along with some of the criticisms follows.
Predicting validity. The literature shows that PLS has capability as a prediction tool,
a fact that has not been fully appreciated. As such PLS provides a correct method for
evaluating formative constructs and for developing measurements with new theoretical or
empirical backgrounds (Rigdon, 2012). Scholars supporting PLS believe that using
research data allows the building of empirically-based theory and constructs (Rigdon,
2012). On the other hand, Covariance-Based SEM followers believe that theory is needed
to specify a conceptual structure, while research data is needed to test whether these
structures are consistent with empirical evidence. The argument is that the results can
challenge, support, or modify those conceptualisations.
Fit assessment test. Covariance-Based SEM assesses the overall fit of the model
using the covariance among the items, assuming that all measures are reflective, with less
interest in the individual effects of construct or path coefficients. In contrast, PLS does not
rely on item covariance and overall goodness-of-fit; instead, the focus is on the variances of
predicted variables or construct variances (Chin, 2010). In practice, in the presence of
formative constructs, PLS might be a better choice than Covariance-Based SEM. Indeed, as
explained below, Covariance-Based SEM cannot be used for third-order models, such as
the WAS model considered here.
Theoretical background. Due to the holistic and confirmatory approach of
Covariance-Based SEM, it is more appropriate when there is solid theoretical and
background knowledge of the model. In contrast, a Partial Least Squares approach, with its
exploratory nature and focus on the significance and strengths of individual paths and
constructs, seems to be an appropriate procedure for new models. It is particularly useful in
social and behavioural sciences when the background knowledge of the expected model is
limited (Chin & Newsted, 1999; Chin, 2010; Roldán & Sánchez-Franco, 2012).
Normality assumption. Covariance-Based SEM commonly uses ML estimation
assuming a normal distribution for the data, while for PLS there is no underlying
assumption for the data distribution. This indicates that for non-normal data, the use of
variance-based PLS is justified when sample sizes are too small to allow asymptotically
distribution-free Covariance-Based SEM or bootstrap analyses.
Sample size. One of the requirements for using Covariance-Based SEM is to have a
relatively large sample size, while PLS can be conducted with small sample sizes.
However, in PLS, the estimators are inconsistent and biased in that standard errors do not
decline with increasing sample size and expected parameter estimates do not converge to
their true values. This lack of consistency means that increasing sample size does not
provide a more reliable analysis in the case of Partial Least Squares SEM. However, in
Covariance-Based SEM models, if the underlying assumptions are met, consistency is
ensured and larger sample sizes do provide a more reliable analysis.
Reflective and Reflective-formative models. Partial Least Squares SEM and
Covariance-Based SEM are two different approaches for estimating a SEM model and both
can be used to fit reflective models. PLS can also be used to fit reflective-formative models
and formative-formative models. However, Covariance-Based SEM can only be used to fit
reflective-formative models when there is a reliable measure for the higher-order latent
constructs (using MIMIC models). Each approach is suitable for a specific context.
Researchers need to appreciate the characteristics of each method to be able to choose the
most suitable approach (Hair et al., 2010; Hair, Ringle, & Sarstedt, 2011; Hair, Hult,
Ringle, & Sarstedt, 2014). As acknowledged by Hair et al. (2011), neither method is superior
to the other. They further state that “depending on the specific empirical context and
objectives of a SEM study, PLS‑SEM’s distinctive methodological features make it a
valuable and potentially better-suited alternative to the more popular Covariance-Based
SEM approach” (p. 149).
The attempt to use the conventional Covariance-Based SEM procedure in this study,
using the MIMIC model to evaluate a formative-formative WAS model, failed to identify
the model. When partial least squares (PLS) Structural Equation Modelling (SEM) was
used instead of the conventional CB-SEM to evaluate a formative-formative WAS model,
model identification was achieved. In order to evaluate the correctly specified
reflective-formative WAS model, Partial Least Squares SEM was also needed.
12.2.4.1 Building a higher-order reflective-formative model of WAS using PLS-SEM.
To the researcher's best knowledge, there are only a handful of studies (Becker,
Klein, & Wetzels, 2012; Wetzels, Odekerken-Schroder, & van Oppen, 2009) that recommend
guidelines for fitting a higher-order model in Partial Least Squares SEM. Wetzels et al.
(2009) developed guidelines for building such a higher-order ‘reflective’ model. “PLS path
modeling can also be used for higher-order models with formative constructs or a mix of
formative and reflective constructs” (Wetzels et al., 2009, p. 189). In this study, mixed
approaches suggested by Wetzels et al. (2009) and Becker et al. (2012) were used to fit the
proposed reflective-formative WAS model, with some amendments to the guidelines as
proposed by Wetzels et al. (2009) for reflective models. To clarify the approaches used in
this study, a brief description of each approach including their advantages and
disadvantages, is provided below.
In the reflective-formative WAS model, the first-order constructs are reflective but
the higher-order constructs are formative (Figure 12.3). In the formative-formative WAS
model (Figure 12.5), the first-order and second-order constructs are both formative. The
repeated indicators approach and the two-stage approach are recommended to test a
higher-order reflective-formative model in Partial Least Squares SEM (Becker et al., 2012; Hair,
Hult, Ringle, & Sarstedt, 2014; Wold, 1982). In the repeated indicators approach, all the
indicators of the first-order constructs are allocated to the second-order construct. This is
called the repeated indicators approach (Wold, 1982) because the indicator variables are
repeated in the model (i.e., for the first and second-order constructs). The two-stage
approach requires two steps in the model analysis. The first-order constructs are evaluated
at the first stage and the predicted values for the first-order constructs are then used in the
second stage as indicators for the second-order constructs (Becker et al., 2012; Hair et al.,
2014; Wetzels et al., 2000). According to the simulation study by Becker et al. (2012) and
recommendations by other researchers in this area (Ringle, Sarstedt, Straub, 2012; Hair et
al., 2014), these approaches are only appropriate in specific circumstances.
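The difference between the two approaches described above lies in which variables serve as indicators of the higher-order constructs. The sketch below shows only that bookkeeping, not PLS estimation itself. The item names and the item-to-construct assignment are hypothetical and shortened for illustration; they are not the actual WAS questionnaire mapping, and the `score_` prefix is an invented naming convention.

```python
from typing import Dict, List

# Hypothetical, shortened specification (illustrative item names only).
FIRST_ORDER: Dict[str, List[str]] = {
    "CONTROL": ["c1", "c2", "c3"],
    "TRUST": ["t1", "t2"],
    "LEISURE": ["l1", "l2"],
    "MENTAL_H": ["m1", "m2", "m3"],
}
HIGHER_ORDER: Dict[str, List[str]] = {
    "ORGANISATIONAL": ["CONTROL", "TRUST"],
    "PERSONAL": ["LEISURE", "MENTAL_H"],
}

def repeated_indicator_blocks(first_order, higher_order):
    """Repeated indicators: each higher-order construct reuses every
    indicator of the first-order constructs assigned to it."""
    return {
        ho: [item for fo in children for item in first_order[fo]]
        for ho, children in higher_order.items()
    }

def two_stage_blocks(higher_order):
    """Two-stage: each higher-order construct is measured by the stage-one
    scores of its first-order constructs (one score variable per construct)."""
    return {
        ho: [f"score_{fo}" for fo in children]
        for ho, children in higher_order.items()
    }

repeated = repeated_indicator_blocks(FIRST_ORDER, HIGHER_ORDER)
two_stage = two_stage_blocks(HIGHER_ORDER)
```

Under the repeated indicators approach the higher-order block grows with the number of first-order items, whereas under the two-stage approach it contains one score per first-order construct.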
The benefit of the repeated indicator procedure is the estimation of all constructs in
a single analysis. However, there are some weaknesses with this approach. First,
misspecifying the repeated loadings of higher-order constructs (reflective vs. formative)
could lead to incorrect results. Becker et al. (2012) advise that for reflective
higher order models (reflective-reflective and formative-reflective models), the inner
indicators of the higher-order constructs should be reflective; while for any type of higher-order
formative model the repeated indicators of the higher-order constructs should be
specified as formative. Another weakness of this procedure is that unequally important
indicators of first-order constructs could lead to biased results (Chin et al., 2003; Ringle et
al., 2012), although simulation studies indicate this is a concern for reflective models
only, not for formative models (Becker et al., 2014). A further weakness is the production
of incorrectly correlated residuals due to repeated use of the same indicators for the first
and second-order constructs (Becker et al., 2012). A final weakness of this procedure is that
most of the variance is explained by the lower-order constructs. As a result, the path
coefficients of higher-orders are usually zero or non-significant (Ringle et al., 2012).
The two-stage approach also has advantages and disadvantages. In this approach, a
higher-order model is estimated separately from the first-order model, resulting in no risk
of misspecification of the repeated indicators for higher-order constructs. For reflective
models with unequally important indicators, this approach delivers a more reliable result
compared with the repeated indicator approach (Becker et al., 2012). Most importantly,
applying the two-stage approach and estimating the first-order constructs in an analysis
separate from that of the higher-order constructs allows other variables to emerge to explain some of the
variance contributing to the higher-order formative constructs (Ringle et al., 2012). The
disadvantage is that the first-order and higher-order constructs are not estimated
simultaneously. Therefore, the model estimators might not be as precise as those obtained
with the repeated indicator approach.
For this study, a reflective-formative model was fitted using a mixture of ‘repeated
indicator’ and ‘two-stage approaches’. Based on the above recommendations, the repeated
indicator approach was used at the first stage, with the construct scores of the first-order
constructs used at the second stage as the manifests/indicators of the higher-order
constructs. Applying the repeated indicator approach at two stages creates less bias, more
reliable parameter estimates/scores, and a more precise estimation of path coefficients of
constructs (Becker, et al., 2012).
Figures 12.6 to 12.8 present the PLS model building process used in this study.
Figure 12.6. Step one: constructing the first-order sub-constructs of both personal and organisational capacities of the reflective-formative model of WAS using PLS path modeling.
At step one of the PLS path modeling of WAS, the first-order sub-constructs of both
personal capacity (mental and physical health, leisure, work-home and home-work factors)
and organisational capacity (control, trust, training, respect, support, harassment) were
constructed individually (Figure 12.6). Although a small number of the constructs (e.g.,
physical health) had a formative structure, for consistency purposes all were considered as
reflective.
Figure 12.7. Step two: building the second-order formative constructs (organisational and personal capacities) for the reflective-formative model of WAS.
Note. Due to limited space, only some of the indicators of organisational capacity
are shown.
At step two, the second-order formative constructs (organisational and personal
capacities) were built by relating them to their first-order reflective sub-constructs and the
first-order indicators. Both personal and organisational constructs were estimated separately
to obtain the scores for the first-order latent factors (Figure 12.7).
Figure 12.8. Step three: the scores of the first-order latent factors are used as the manifests of the second-order factors (i.e., organisational and personal capacities), forming the higher-order construct (WAS).
At step three (Figure 12.8), the scores of the first-order latent factors were used as
the manifests of the second-order factors (i.e., organisational and personal capacities) and
the higher-order construct (WAS) was built by relating it to the second-order constructs
(organisational and personal capacities). The inner and outer loadings were built on
repeated predictors of the first-order observed scores that were obtained at Step 1.
The third-order model was assessed in the final step (Figure 13.1). The inner and outer
models of first-, second-, and third-order loadings were estimated using SmartPLS.
The inner model in SEM refers to the relationships between the independent and dependent
latent variables, while the outer model describes the relationships between the latent
variables and their observed indicators. Because Partial Least Squares SEM does not
require any normality assumptions for the data, parametric test results could not be used for
inferential decision-making. Instead, to evaluate the significance of the coefficients, a
nonparametric bootstrapping procedure was applied (Chin 1998; Efron & Tibshirani, 1993;
Tenenhaus, Vinzi, Chatelin, & Lauro, 2005) in order to draw an inferential conclusion. The
number of bootstrapping subsamples needs to be higher than the number of valid
observations in the original dataset (in this study, higher than 1344). As a rule, 5000
bootstrap samples are recommended in Partial Least Squares SEM (Hair et al., 2014). The
number of cases used for each randomly chosen bootstrap sample is the same as the number
of cases used in the analysis (1344 cases in this study).
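The bootstrapping procedure described above can be sketched as follows. This is not the SmartPLS implementation: it assumes two already-computed construct scores and a single predictor, in which case the standardized path coefficient equals the Pearson correlation. The data are simulated and hypothetical, and the function name is introduced here for illustration.

```python
import numpy as np

def bootstrap_path_coefficient(x, y, n_boot=5000, seed=0):
    """Bootstrap the standardized path coefficient between two construct scores.

    Returns the original estimate, its bootstrap standard error and the
    resulting t-value (estimate divided by standard error).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.shape[0]
    rng = np.random.default_rng(seed)
    estimate = np.corrcoef(x, y)[0, 1]
    boot = np.empty(n_boot)
    for b in range(n_boot):
        # Each bootstrap sample draws n cases with replacement,
        # i.e. as many cases as in the original analysis.
        idx = rng.integers(0, n, size=n)
        boot[b] = np.corrcoef(x[idx], y[idx])[0, 1]
    std_error = boot.std(ddof=1)
    return estimate, std_error, estimate / std_error

# Simulated construct scores (hypothetical data, not the WAS sample).
rng = np.random.default_rng(1)
x_scores = rng.normal(size=300)
y_scores = 0.5 * x_scores + rng.normal(size=300)

estimate, std_error, t_value = bootstrap_path_coefficient(x_scores, y_scores)
```

A t-value above 1.96 would indicate significance at the 5% level (two-tailed); a percentile interval taken from the bootstrap estimates is an alternative way to draw the same inference.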
12.2.4.2 Measurement Model Evaluation Criteria for WAS Model
Reliability, convergent and discriminant validity of the WAS. The reliability of the
reflective measures at first-order was tested using model-based reliability coefficients, or
what is known in Partial Least Squares SEM as composite reliability (Chin 1998; Fornell &
Larcker 1981). The conventional Coefficient alpha was compared with these values.
Composite reliability takes into account the different outer loadings of the indicator
variables and therefore better reflects the reliability compared to internal consistency
coefficients such as Coefficient alpha (Hair et al., 2014). Similarly to Omega, the
composite reliability coefficient is defined as the ratio F/(F+E), where F is the squared
sum of the factor loadings and E is the sum of the error variances. As was the case for Omega,
composite reliability refers to a model-based reliability coefficient and values between
0.70-0.90 are satisfactory (Nunnally & Bernstein, 1994).
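As a worked illustration of the ratio F/(F+E), the following minimal sketch computes composite reliability from standardised outer loadings. It assumes standardised indicators, so that each error variance equals one minus the squared loading; the function name is hypothetical. The two loadings used are those reported for the HOME-WORK construct in Table 13.1.

```python
def composite_reliability(loadings):
    """Composite reliability F / (F + E) for one reflective construct.

    F is the squared sum of the loadings and E is the sum of the error
    variances, assumed here to be 1 - loading**2 (standardised indicators).
    """
    f = sum(loadings) ** 2
    e = sum(1.0 - l ** 2 for l in loadings)
    return f / (f + e)

# Outer loadings of the HOME-WORK indicators (Q51c, Q51d) from Table 13.1
home_work = [0.92, 0.89]
cr = composite_reliability(home_work)
print(round(cr, 2))
```

Because the loadings in the table are rounded to two decimals, a value recomputed this way may differ from the published coefficient in the second decimal for some constructs.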
Internal consistency reliability coefficients such as Coefficient alpha are not
appropriate for second- and third-order formative constructs. For the higher order model of
WAS, where the first-order constructs are reflective, the model-based reliability was
calculated only for the first-order constructs; however, as explained by Edward (2001),
reliability is not an issue for the higher order formative constructs. Instead, the validity for
formative constructs is critical. According to Bollen and Lennox (1991), if the path from
each subconstruct, considered as an indicator of its corresponding formative construct, is
significant, then the validity of the formative construct is confirmed. The significance of all
these coefficient paths demonstrates the validity of the formative model for this construct.
Another part of the validation process was to find out how distinguishable the
constructs were (discriminant validity). As emphasised by Campbell and Fiske (1959, p.
84), “One cannot define without implying distinctions, and the verification of these
distinctions is an important part of the validation process.” One procedure for evaluating
discriminant validity requires assessing the intercorrelation of the constructs. If the square
root of the average variance extracted (AVE) exceeds the estimates of the intercorrelation
of a construct with the other constructs, discriminant validity is supported (Chin 1998;
Fornell & Larcker 1981).
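The criterion just described can be checked mechanically. The sketch below is a minimal illustration, assuming the square roots of AVE and the construct intercorrelations are already available; the function name is hypothetical. The example values are those reported for SUPPORT, RESPECT and TRUST in Table 13.2.

```python
import numpy as np

def fornell_larcker(sqrt_ave, corr):
    """Flag, per construct, whether sqrt(AVE) exceeds every
    absolute correlation with the other constructs."""
    sqrt_ave = np.asarray(sqrt_ave, dtype=float)
    corr = np.abs(np.asarray(corr, dtype=float))  # np.abs returns a copy
    np.fill_diagonal(corr, 0.0)                   # ignore self-correlations
    return sqrt_ave > corr.max(axis=1)

# SUPPORT, RESPECT, TRUST (values taken from Table 13.2)
sqrt_ave = [0.89, 0.90, 0.89]
corr = [
    [1.00, 0.53, 0.52],
    [0.53, 1.00, 0.78],
    [0.52, 0.78, 1.00],
]
supported = fornell_larcker(sqrt_ave, corr)
print(supported)
```

A construct is flagged `False` as soon as one of its correlations with another construct reaches or exceeds its own square root of AVE.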
In a previous chapter, convergent validity of a construct was defined as “the extent
to which a measure correlates positively with alternative measures of the same construct”
(Hair et al., 2014, p. 102). The average variance extracted (AVE) can be used to test for
convergent validity (Fornell & Larcker, 1981) with a cut-off point of greater than 0.50
required for demonstrating an acceptable convergent validity.
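For standardised indicators, the AVE of a construct is the mean of its squared outer loadings. The following sketch, with a hypothetical function name, applies this to the HOME-WORK and TRAINING loadings reported in Table 13.1 and compares the result with the 0.50 cut-off. Values recomputed from rounded loadings may differ slightly from the published AVE.

```python
def average_variance_extracted(loadings):
    """Mean of the squared standardised loadings of one construct."""
    return sum(l ** 2 for l in loadings) / len(loadings)

CUT_OFF = 0.50  # convergent validity threshold (Fornell & Larcker, 1981)

# Outer loadings from Table 13.1
home_work = [0.92, 0.89]
training = [0.53, 0.77, 0.63, 0.68]

ave_home_work = average_variance_extracted(home_work)
ave_training = average_variance_extracted(training)

for name, ave in [("HOME-WORK", ave_home_work), ("TRAINING", ave_training)]:
    verdict = "supported" if ave > CUT_OFF else "not supported"
    print(f"{name}: AVE={ave:.2f}, convergent validity {verdict}")
```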
13
STUDY 3: RESULTS
In this chapter the correctly specified reflective-formative model of WAS was fitted
for evaluation using the Partial Least Squares SEM procedure described in chapter 12. The
results were then compared with the misspecified models of WAS to demonstrate the
consequences.
13.1 Results of Model Fit Evaluation
SmartPLS was employed to estimate the inner and outer first-, second-, and third-
order loadings. Tables 13.1 and 13.2 show the reliability results and convergent-
discriminant validity of the constructs at the first-order of WAS. Table 13.1 demonstrates
the standardised coefficients, Average Variance Extracted (AVE) for first-order constructs,
the model-based reliability at construct level, and the conventional coefficient alpha
reliability at item level. The model-based reliability measures for the constructs are higher
than the conventional coefficient alpha. The model-based reliability of all constructs
exceeds the minimum acceptable level of .70 (range .75 to .94), demonstrating acceptable
reliability of the constructs. The AVE of all constructs, with the exception of ‘training’ and
‘harassment’, reaches the cut-off point of .50, suggesting convergent validity in all but these two
constructs. Most importantly, all the lower order loadings were significant.
Table 13.2 presents the intercorrelations of the first-order constructs along with their
Square Roots of Average Variance extracted (AVE) for assessing the discriminant validity
of the constructs. The results confirm that discriminant validity of the first-order constructs
exists because the square root of AVE for each construct is higher than any intercorrelation
with the other constructs.
Table 13.1 Quality Criteria of the Reflective-formative WAS First-order Constructs using PLS-SEM

Latent Variable      Indicator   Loading   AVE    Model-based   Coefficient   Convergent
                                                  reliability   alpha         validity
LEISURE              Q56d        0.68      0.52   0.77          0.55          Yes
                     Q56h        0.68
                     Q56i        0.78
HOME-WORK            Q51c        0.92      0.82   0.90          0.78          Yes
                     Q51d        0.89
WORK-HOME BALANCE    Q51a        0.94      0.88   0.94          0.86          Yes
                     Q51b        0.93
PHYSICAL HEALTH      Diagnosis   0.64      0.61   0.75          0.39          Yes
                     Q37         0.89
MENTAL HEALTH        Q47a        0.89      0.74   0.90          0.83          Yes
                     Q47b        0.89
                     Q47c        0.79
CONTROL              Q15a        0.67      0.50   0.85          0.79          Yes
                     Q15b        0.68
                     Q15c        0.65
                     Q19a        0.68
                     Q19b        0.72
                     Q19c        0.76
TRAINING             Q30a        0.53      0.44   0.76          0.59          No
                     Q30c        0.77
                     Q30d        0.63
                     Q30e        0.68
HARASSMENT           Q42a        0.74      0.49   0.79          0.65          No
                     Q42c        0.63
                     Q42d        0.73
                     Q42h        0.68
SUPPORT              Q27a        0.89      0.80   0.92          0.87          Yes
                     Q27e        0.90
                     Q27f        0.89
RESPECT              Q22c        0.89      0.81   0.93          0.89          Yes
                     Q22d        0.93
                     Q22e        0.87
TRUST                Q24e        0.85      0.78   0.92          0.86          Yes
                     Q24f        0.89
                     Q24g        0.90
Table 13.2 Intercorrelation Analysis and the Square Roots of AVE of First-order Constructs of Reflective-formative PLS-SEM Model †

Construct             Discriminant   1      2      3      4      5      6      7      8      9      10     11
                      validity
1. LEISURE            YES            0.72
2. HOME-WORK          YES            0.11   0.91
3. WORK-HOME          YES            0.21   0.23   0.94
4. PHYSICAL HEALTH    YES            0.20   0.11   0.21   0.78
5. MENTAL HEALTH      YES            0.29   0.20   0.31   0.34   0.95
6. CONTROL            YES            0.10   -0.01  0.19   0.10   0.19   0.71
7. TRAINING           YES            0.05   -0.07  0.01   0.01   0.09   0.21   0.67
8. HARASSMENT         YES            0.04   0.05   0.24   0.15   0.20   0.16   0.07   0.70
9. SUPPORT            YES            0.10   0.00   0.20   0.09   0.15   0.30   0.37   0.24   0.89
10. RESPECT           YES            0.10   0.09   0.30   0.20   0.26   0.37   0.20   0.41   0.53   0.90
11. TRUST             YES            0.11   0.10   0.30   0.18   0.24   0.32   0.19   0.35   0.52   0.78   0.89
† The square roots of the average variance extracted (AVE) are on the diagonal. Column numbers refer to the numbered constructs in the rows.
Upon satisfying the validity and reliability of the first-order constructs, the next step
involved the assessment of the validity of the second- and third-order formative constructs. As
mentioned previously, the issue of reliability is meaningless for formative constructs; instead, the
significance of the predictors’ paths (the path from the subconstructs to their corresponding
formative construct) is important. Tables 13.3 and 13.4 present the path coefficients of
subconstructs for the higher-order construct/s, confirming that all these paths are significant, and
hence all these formative constructs are valid, supporting hypothesis 12.1.
Table 13.3 The Standardised Mean Coefficients of the Second-order Formative Constructs of Reflective-formative PLS-SEM Model (n=5000 bootstrap)

Path                              Standardised path     T value   Support
                                  coefficient (mean)
LEISURE -> PERSONAL               0.31                  56.82     YES
HOME-WORK -> PERSONAL             0.34                  55.95     YES
WORK-HOME -> PERSONAL             0.23                  53.81     YES
PHYSICAL HEALTH -> PERSONAL       0.20                  50.33     YES
MENTAL HEALTH -> PERSONAL         0.63                  59.63     YES
CONTROL -> ORGANISATIONAL         0.74                  64.46     YES
TRAINING -> ORGANISATIONAL        0.19                  54.31     YES
HARASSMENT -> ORGANISATIONAL      0.33                  58.62     YES
SUPPORT -> ORGANISATIONAL         0.34                  67.59     YES
RESPECT -> ORGANISATIONAL         0.37                  61.33     YES
TRUST -> ORGANISATIONAL           0.39                  62.67     YES
Note: p<0.05
Table 13.4 Results for Third-order formative Constructs of Reflective-formative WAS (n=5000 bootstrap samples)

Path                              Standardised path     T value   Support
                                  coefficient (mean)
ORGANISATIONAL -> WAS             0.67                  74.79     YES
PERSONAL -> WAS                   0.50                  52.48     YES
Note: p<0.05
Both Table 13.4 and Figure 13.1 demonstrate the significant path coefficients for the
reflective-formative WAS model performed using bootstrapping (n=5000). The path coefficients
for organisational capacities (β=0.67) and personal capacities (β=0.50) suggest that
organisational capacity is a slightly stronger component of WAS than personal capacity.
Figure 13.1. The final model of reflective-formative WAS development using PLS path
modeling.
To evaluate the next hypothesis (12.2) and to demonstrate the possible Type I and II
errors resulting from measurement model misspecification, the misspecified models of WAS (i.e.,
reflective-reflective and formative-formative) were evaluated using PLS-SEM. In addition,
Covariance-based SEM was applied to evaluate the reflective-reflective model of WAS to
establish whether the difference in model or the difference in estimation method was responsible
for the differences in the results. Unfortunately, due to an identification problem, the formative-
formative model of WAS could not be evaluated using the MIMIC method with Covariance-
based SEM procedure. The full details of the results and the step-by-step guide to evaluating the
misspecified models are presented in Appendix E. The next section presents the comparison of
path coefficients and reliability coefficients of the misspecified models with the correctly
specified model of WAS, fitted using PLS-SEM.
13.2 Comparison of the Misspecified Models with the Correctly Specified WAS Model
In this analysis, the results of all four sets of coefficients were compared - misspecified
reflective-reflective using both CB-SEM and PLS-SEM, formative-formative and correctly
specified reflective-formative WAS models. The full analysis and results are presented in
Appendix B. Table 13.5 presents a comparison of the path coefficients of all four analyses. The
results showed that, in comparison with the correctly specified reflective-formative model, the
paths of the misspecified reflective-reflective models were highly inflated, regardless of the
evaluation procedure used (i.e., CB-SEM or Partial Least Squares SEM). Conversely, in the
misspecified formative-formative model, the path coefficients were highly deflated, especially for
the lower order construct. The results indicate that the inflated (in reflective misspecified models)
and deflated (in formative misspecified model) path coefficients lead to Type I and II errors
respectively.
Table 13.5 Comparing the Standardized Path Coefficients of Misspecified and Correctly Specified WAS Models

                                  Misspecified models                                  Correctly specified model
Path                              1) Reflective-     2) Reflective-     3) Formative-  4) Reflective-
                                  Reflective         Reflective         Formative      formative
                                  CB-SEM             PLS-SEM            PLS-SEM        PLS-SEM
LEISURE -> PERSONAL               0.44               0.55               -0.02†         0.31
HOME-WORK -> PERSONAL             0.34               0.46               -0.01†         0.34
WORK-HOME -> PERSONAL             0.57               0.65               0.70           0.23
PHYSICAL HEALTH -> PERSONAL       0.68               0.53               0.20           0.20
MENTAL HEALTH -> PERSONAL         0.71               0.80               0.40           0.63
CONTROL -> ORGANISATIONAL         0.53               0.58               0.24           0.74
TRAINING -> ORGANISATIONAL        0.32               0.39               -0.15          0.19
HARASSMENT -> ORGANISATIONAL      0.52               0.50               0.36           0.33
SUPPORT -> ORGANISATIONAL         0.65               0.74               0.04†          0.34
RESPECT -> ORGANISATIONAL         0.95               0.87               0.33           0.37
TRUST -> ORGANISATIONAL           0.92               0.84               0.36           0.39
ORGANISATIONAL -> WAS             0.72               0.90               0.68           0.67
PERSONAL -> WAS                   0.62               0.71               0.51           0.50
Note: †- non-significant paths.
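To make the comparison in Table 13.5 concrete, the short sketch below tallies how many paths of each misspecified reflective-reflective model exceed the corresponding coefficient of the correctly specified reflective-formative model. The coefficients are copied from the table; the tally itself is an illustrative addition and not part of the original analysis.

```python
# (reflective-reflective CB-SEM, reflective-reflective PLS-SEM,
#  correctly specified reflective-formative PLS-SEM), from Table 13.5
paths = {
    "LEISURE -> PERSONAL": (0.44, 0.55, 0.31),
    "HOME-WORK -> PERSONAL": (0.34, 0.46, 0.34),
    "WORK-HOME -> PERSONAL": (0.57, 0.65, 0.23),
    "PHYSICAL HEALTH -> PERSONAL": (0.68, 0.53, 0.20),
    "MENTAL HEALTH -> PERSONAL": (0.71, 0.80, 0.63),
    "CONTROL -> ORGANISATIONAL": (0.53, 0.58, 0.74),
    "TRAINING -> ORGANISATIONAL": (0.32, 0.39, 0.19),
    "HARASSMENT -> ORGANISATIONAL": (0.52, 0.50, 0.33),
    "SUPPORT -> ORGANISATIONAL": (0.65, 0.74, 0.34),
    "RESPECT -> ORGANISATIONAL": (0.95, 0.87, 0.37),
    "TRUST -> ORGANISATIONAL": (0.92, 0.84, 0.39),
    "ORGANISATIONAL -> WAS": (0.72, 0.90, 0.67),
    "PERSONAL -> WAS": (0.62, 0.71, 0.50),
}

# Count paths whose misspecified coefficient is strictly larger
inflated_cb = sum(cb > correct for cb, _, correct in paths.values())
inflated_pls = sum(pls > correct for _, pls, correct in paths.values())

print(f"CB-SEM:  {inflated_cb} of {len(paths)} paths inflated")
print(f"PLS-SEM: {inflated_pls} of {len(paths)} paths inflated")
```

With the values as tabulated, 11 of 13 paths are inflated under CB-SEM and 12 of 13 under PLS-SEM; the CONTROL path is the exception in both models, and the HOME-WORK path is unchanged under CB-SEM.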
In contrast, the model-based reliability coefficients of a misspecified reflective-reflective
CB-SEM show a downward (deflating) bias compared to the correctly specified reflective-
formative WAS fitted using PLS-SEM (Table 13.6). This is primarily due to the shared
measurement errors in reflective second- and third-order models. In the reflective-formative
models, the first-order constructs predict the second-order construct, preventing the sharing of
measurement error with the reliability of the first-order loadings. As expected, the results of the
reflective-reflective WAS model, evaluated with Partial Least Squares SEM, showed the same
reliability coefficients as the correctly specified reflective-formative WAS model. This occurred
because in Partial Least Squares SEM, the reliability coefficients of the first-order constructs
were evaluated in isolation. Therefore, the misspecification of second or third-order constructs
does not affect the reliability of the first-order constructs. The reliability coefficients for the
misspecified formative-formative model of WAS were not calculated. As stated previously, the
issue of reliability is meaningless for formative constructs where the indicators are predictors of
the construct.
Table 13.6 Comparing the Model-based Reliability Coefficients of a Misspecified Reflective-reflective WAS (CB-SEM) with the Correctly Specified Reflective-formative Model of WAS

Latent Variable      Indicators                            Model-based reliability   Model-based reliability
                                                           Reflective-reflective     Reflective-formative
                                                           WAS (CB-SEM)              WAS (PLS-SEM)
LEISURE              Q56d, Q56h, Q56i                      0.59                      0.77
HOME-WORK            Q51c, Q51d                            0.80                      0.90
WORK-HOME BALANCE    Q51a, Q51b                            0.86                      0.94
PHYSICAL HEALTH      No of conditions, Q37                 0.42                      0.75
MENTAL HEALTH        Q47a, Q47b, Q47c                      0.83                      0.90
CONTROL              Q15a, Q15b, Q15c, Q19a, Q19b, Q19c    0.89                      0.85
TRAINING             Q30a, Q30c, Q30d, Q30e                0.59                      0.76
HARASSMENT           Q42a, Q42c, Q42d, Q42h                0.66                      0.79
SUPPORT              Q27a, Q27e, Q27f                      0.87                      0.92
RESPECT              Q22c, Q22d, Q22e                      0.89                      0.93
TRUST                Q24e, Q24f, Q24g                      0.86                      0.92
14
STUDY 3: DISCUSSION
This chapter provides a discussion of the results obtained in Chapter 13. While
validation of reflective models is common in the literature, there is little work on the
validation and assessment of model-based reliability for measurement models containing
formative constructs. The purpose of Study 3 was to illustrate empirically the fitting,
validation, and model-based reliability assessments of a reflective-formative model for the
Work Ability Scale (WAS), using Partial Least Squares SEM. The Work Ability Scale is
misspecified in the literature as a full reflective model. The proposed reflective-formative
model of WAS is a correctly specified model based on the theory described in Chapter 5
and the related decision-making tree that was developed in Chapter 5 (Figure 5.5). The
secondary aim of this study was to demonstrate the likelihood of Type I and II errors
occurring. This was achieved by comparing the correctly specified model with a
misspecified reflective-reflective model (fitted using Partial Least Squares SEM and
Covariance-Based SEM), and a misspecified full formative-formative model (fitted using
Partial Least Squares SEM). Unfortunately, due to identification issues, evaluation of the
formative-formative model using Covariance-Based SEM was not achieved, allowing only
an evaluation of the formative-formative model fitted using Partial Least Squares SEM.
The proposed revised reflective-formative WAS model was based on the work of the
Redesigning Work for an Ageing Society research program at Swinburne University of
Technology (2009) and was evaluated using AMOS and SmartPLS. The three different
models of WAS (reflective-reflective, formative-formative and reflective-formative) were
built using the PLS path modelling approach, employing the repeated indicators approach
and the two-stage approach (Becker et al., 2012; Hair, Hult, Ringle, & Sarstedt, 2014;
Wold, 1982). Based on the literature, applying the repeated indicator approach at two stages
creates less bias, more reliable parameter estimates/scores, and a more precise estimation of
path coefficients of constructs (Becker et al., 2012).
The results of the Partial Least Squares SEM approach to the fitting of the
reflective-formative model for WAS showed that the proposed second-order WAS model in
this empirical illustration contained relatively valid indicators and predictors for the WAS
model. The t-statistics generated by bootstrapping also showed significant paths for both
organisational and personal capacities. The correctly specified model demonstrated
acceptable discriminant and convergent validity (with exceptions in the case of the training
and harassment subconstructs).
The model-based reliability measures were acceptable for the first-order reflective
constructs. Coefficient alpha clearly underestimated reliability compared to the
model-based reliability coefficients of the first-order constructs. The main reason for
this underestimation is thought to be the assumption of essential tau-equivalence made in
calculating coefficient alpha. When this assumption is violated, coefficient alpha is only
a lower bound for reliability and therefore underestimates the true reliability
(Graham, 2006).
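The lower-bound property can be illustrated numerically. The sketch below is a minimal example under stated assumptions: a one-factor congeneric model with standardised items and hypothetical loadings, so that the model-implied covariance matrix has unit variances and off-diagonal elements equal to the products of loadings. Coefficient alpha and the model-based coefficient (Omega) are both computed from that implied matrix.

```python
import numpy as np

def implied_covariance(loadings):
    """Model-implied covariance of standardised items under one factor."""
    lam = np.asarray(loadings, dtype=float)
    sigma = np.outer(lam, lam)
    np.fill_diagonal(sigma, 1.0)  # unit item variances
    return sigma

def coefficient_alpha(sigma):
    k = sigma.shape[0]
    return k / (k - 1) * (1 - np.trace(sigma) / sigma.sum())

def omega(loadings, sigma):
    # Squared sum of loadings over the total variance of the sum score
    return np.sum(loadings) ** 2 / sigma.sum()

# Hypothetical loadings: unequal (congeneric) versus equal (tau-equivalent)
unequal = [0.9, 0.7, 0.5, 0.3]
equal = [0.7, 0.7, 0.7, 0.7]

for label, lam in [("unequal", unequal), ("equal", equal)]:
    sigma = implied_covariance(lam)
    print(label, round(coefficient_alpha(sigma), 3), round(omega(lam, sigma), 3))
```

With equal loadings the two coefficients coincide; with unequal loadings alpha falls below the model-based coefficient, which is the pattern discussed above.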
Comparisons of the model-based reliability coefficients of the correctly specified
model of WAS with its misspecified reflective-reflective model produced thought-provoking
results. The model-based reliability for the misspecified model underestimated
the reliability compared to the correct model. These results are important for several
reasons. Comparing the correlations between the first-order constructs, it seems that the
misspecified model overestimated the correlations among the constructs compared to the
correct model. The results support the literature claim
stating that underestimated reliability coefficients increase the correlations between the
constructs (e.g., Fan, 2003; Revelle & Zinbarg, 2008). As determined by these scholars,
underestimating the reliability results in an overestimation of correlation among constructs
and vice versa. Revelle and Zinbarg (2008) stress that selecting the proper reliability
coefficient is important in multidimensional studies. The findings of this study add to their
recommendation that researchers should also pay attention to the specification of the
measurement models. Even if proper model-based reliability coefficients were used, if the
model is misspecified (in terms of reflective vs formative nature) it leads to bias in the
reliability estimations.
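One way to see why lower reliability estimates go together with higher estimated correlations among constructs is the classical correction for attenuation, in which an observed correlation is divided by the square root of the product of the two reliabilities. The sketch below uses hypothetical values and is offered only as an illustration of that general mechanism; it is not a reproduction of the estimation carried out in this study.

```python
import math

def disattenuate(r_observed, rel_x, rel_y):
    """Classical correction for attenuation (Spearman)."""
    return r_observed / math.sqrt(rel_x * rel_y)

r_observed = 0.40  # hypothetical observed correlation between two scales

# Same observed correlation, two different sets of reliability estimates
high_reliability = disattenuate(r_observed, 0.90, 0.90)
low_reliability = disattenuate(r_observed, 0.60, 0.60)

print(round(high_reliability, 3), round(low_reliability, 3))
```

For a fixed observed correlation, the lower the reliabilities entered into the correction, the larger the implied correlation between the underlying constructs.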
The results of the comparison of the correct and misspecified models showed
inflated and deflated path coefficients when the model was misspecified as reflective or
formative respectively. When comparing reflective-reflective misspecified models (fitted
using Covariance-Based SEM and Partial Least Squares SEM) with the correctly specified
reflective-formative model, inflated path coefficients reported in the majority of the paths
presented a higher likelihood of Type I error. Based on the above comparison in terms of
reliabilities, this inflation of inter-relationships is related to the lower reliabilities found in
the misspecified reflective-reflective model. Nonsignificant and deflated coefficients were
reported for some paths when a full formative-formative model was compared with the
reflective-formative model. This demonstrated the presence of Type II error (failing to
detect a relationship that truly exists). Based on the simulation study of Jarvis et al. (2003),
when the structural paths originate from a misspecified construct in reflective models, there
is a high possibility of inflated path estimates, resulting in Type I error.
As mentioned in the reliability discussion, since reliability is under-estimated in
reflective constructs, the path coefficients are inflated compared to the formative constructs
(Fan 2003). In both misspecified reflective-reflective models fitted using Covariance-
Based SEM and Partial Least Squares SEM, the path coefficients were therefore inflated
compared to the correctly specified reflective-formative model. This is consistent with
previous empirical and simulation studies (Aguirre-Urreta and Marakas 2008, 2012; Jarvis
et al., 2003; Law and Wong 1999; MacKenzie et al., 2005; Petter et al., 2007). Similar to
previous studies, when a formative model was misspecified as reflective, the misspecified
constructs biased the coefficients of the model upward (Petter et al., 2011).
Any bias in a study leads to misleading conclusions and therefore it is critical to pay
attention to model specification (Petter et al., 2011). It is evident that the significant level of
misspecification in the area of psychology identified in Chapter 5 (18%) demonstrates the
need to pay greater attention to model specification in order to achieve reliable results.
These findings are important not only for this specific example but also for future studies,
opening a new area of study necessitating further research and development.
14.1 Implications for Work Ability Assessments
A validated scale of work ability would have many practical benefits. The work
ability concept and the Work Ability Index have far-reaching and strategic benefits for
work organisations, resulting in better productivity. Specific benefits of the concept include
early prediction of work disability, initiation of preventive procedures, recognition of work
ability status and the need for promotion (Daws, 2012; Ilmarinen, 2010).
The concept of work ability has advanced significantly from the original research on
the Work Ability Index due to the multidimensional holistic view provided by the work
ability model. According to scholars, work ability research in the future will include some
of the following (Daws, 2012; Ilmarinen, 2010):
• utilisation of a multidimensional work ability model with a link between research and practice;
• development of new work ability measures with better capacities for the identification of problems;
• comprehensive evaluation of effects of interventions;
• development of national and international work ability networks;
• development of national surveys and the creation of datasets;
• international studies of long-term effects on the Third Age (silent and boom generations) using a generational framework of analysis;
• improvement of tools for training; and
• development of curricula for occupational gerontology at universities.
The concept of work ability provides an all-inclusive and evidence-based concept
for quality of work life as well as positive ageing. However, major attitudinal, managerial
and occupational health and safety (OH&S) reforms are needed in the modern work-life
environment (Ilmarinen, 2010; Taylor, Sep 2008).
The importance of workplace as a component of quality of life is well known.
Effective evaluation of work ability, appropriate management and supervision of workers,
and the improvement of work ability and occupational well-being to achieve a win-win
situation are the key ingredients. While the work ability concept is primarily concerned
with the working population, it is equally important to maintain the work ability of the
unemployed.
Population ageing in many countries has led to concerns about labour supply, thus
giving rise to an increasing emphasis on prolonging working life (Taylor, Sep 2008). The
creation of a ‘golden age’ for older workers requires overcoming an early retirement
mentality, changing business behaviour and attitudes among the social actors, and
instituting new public policies.
In the meantime, we need reliable information based on follow-up studies or data
from workplaces. We also need international comparisons of the work ability of
populations and, more particularly, we need to identify the factors that maintain and
promote work ability (Gould et al., 2008). Estimates of the work ability of different
populations are required to support decision-making on health, work, and pension policies.
One of the critical challenges is for studies to focus on the future – “How can we find the
best predictors for the development of the population’s (future) work ability?”
14.2 Limitations and Directions for Future Research
Part of the study focus was to clarify the difference between reflective and
formative measurement models. The literature review has revealed that there are some
serious misspecification problems in the field of organisational psychology. It seems that
lack of knowledge could be one of the main reasons for misspecified models. Based on the
literature, a framework for identifying formative vs. reflective models has been presented in
Chapter 5 to help researchers to better identify the most appropriate type of measurement
models for constructs. The proposed decision making framework is easy to understand and
at the same time very comprehensive, and should therefore be of benefit to researchers.
However, some important issues regarding the identification of formative vs.
reflective models still need to be resolved. In some cases, the relevance of
reflective/formative constructs may differ according to the group (e.g., gender, occupation
level, etc.) or situation. Further studies are required to shed more light on such specific
group/situation complexities.
The difficulties encountered when fitting models for formative constructs using
Covariance-Based SEM are another hurdle in choosing formative constructs. Some of the
well-known solutions include using Monte Carlo simulations and MIMIC models in which
the reflective-formative models are expanded by adding reflective indicators for the higher-
order latent constructs. Despite MIMIC being a suitable procedure for the identification of
formative measures in most cases, this solution is criticised in the literature. With this
procedure the formative ƒ construct is replaced with F (represented by another standard
common factor), resulting in the deterioration of the intended meaning of the formative
construct, which is formed by its antecedents (Treiblmaier, Bentler & Maira, 2011). More
importantly, it is not clear how to use this method with a third-order model, such as that
considered in this study. This problem is solved by using PLS-based modelling for
formative constructs instead of the more popular variance-covariance-based SEM. A more
recent solution for the estimation of reflective-formative models was proposed by Treiblmaier,
Bentler and Maira (2011). They proposed substituting ƒ with F with minimal manipulation.
In this procedure, using canonical correlation in a two-step approach, the items belonging
to each formative construct are split into two (or more) composites. The newly developed
canonical constructs can then be treated as common reflective factors and can be placed
into any reflective SEM model (Treiblmaier, Bentler, & Maira, 2011). However, further
studies are needed to shed more light on the estimation problems encountered when fitting
formative constructs.
The results of both studies (Study 1 and Study 2) showed that conventional
coefficient alpha is not the best method for the estimation of internal consistency. Further
studies should report model-based reliability coefficients especially for multidimensional
scales like WAS.
15
SUMMARY
In the final chapter of this thesis, a summary of each study along with their main
contributions to the literature as well as a general concluding summary will be presented.
15.1 Study 1: Model-Based Reliability, Validity and Cross Validity of the Bifactor Model
for WOAQ
In this study, attempts were made to assess the validity, cross-validity and reliability of
the WOAQ in two Australian health settings, the community nursing and paramedic industries.
Based on the literature, a robust procedure of bifactor modeling was adopted for assessing the
validity of the WOAQ, which was then compared with a higher order model. Cross-validity
procedures, using mean and covariance structures (MACS), were adopted to evaluate the
invariance across gender in regard to covariance structure and observed means. The means
at construct level in the bifactor model were also evaluated. This is a neglected area in the literature.
To estimate robust and more accurate reliability coefficients, instead of relying on the
conventional internal consistency measure (Coefficient α), model-based omega reliability
coefficients were used.
In general, the results showed that the WOAQ appears to be a superior instrument for
assessing risk factors due to its satisfactory psychometric properties and short length. Also it was
demonstrated that a bifactor model of WOAQ fits the data better than a higher order model. A
bifactor model of WOAQ provides more information not only for the general (overall) WOAQ
factor but also for its nested factors and their relative importance in a given setting. In this study,
it was documented that a general WOAQ factor has more importance than its nested factors in a
health setting. This result may be related to the differences in model structure for a
nursing/paramedics setting compared to a manufacturing setting.
WOAQ was initially developed as part of a risk assessment in the manufacturing sector
where direct line management is important. However, relationships with colleagues are more
important in the health sector. Therefore the nested factors on management can be expected to be
less important than relationships with colleagues in health settings.
The path loadings for some of the nested constructs were low indicating that a dominant
proportion of the variation within each indicator is attributable to a general factor of WOAQ
rather than its nested sub factor. Therefore, it is recommended that future studies should consider
WOAQ as a single general score even though it contains five nested factors. Methodologically,
calculating only a separate single score for each of the nested factors does not appear to be a good
choice in such settings.
This recommendation is further supported by the results of omega model-based reliability
at both general level and subscales’ level. These model-based reliabilities were acceptable though
demonstrating more reliability for the general factor of WOAQ than its subscales. The general
factor of WOAQ accounted for the largest portion of the variance compared to the nested factors,
especially ‘the relationship with the management’ and ‘reward and recognition’ subscales. It is
therefore suggested that for the current sample of community nursing service, it is more
appropriate to use and report the general factor of WOAQ rather than its nested factors in
isolation. When they were compared with the conventional coefficient alpha, the results showed
overestimated reliability for coefficient alpha compared to the omega coefficients. It is therefore
recommended that in any future studies researchers should by default use only model-based
reliability coefficients.
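The distinction between the reliability of the general factor and that of the total score in a bifactor model can be sketched as follows. This is an illustration only, assuming orthogonal general and nested factors and standardised items; the loadings are hypothetical and are not the WOAQ estimates, and the function name is an assumption. Omega total counts the variance due to all common factors, whereas omega hierarchical counts only the variance due to the general factor.

```python
def bifactor_omegas(general, groups, error_variances):
    """Return (omega total, omega hierarchical) for a total score.

    general:         general-factor loadings of all items
    groups:          one list of nested-factor loadings per nested factor
    error_variances: unique variance of each item
    Assumes orthogonal factors.
    """
    g = sum(general) ** 2                      # general-factor variance
    s = sum(sum(grp) ** 2 for grp in groups)   # nested-factor variance
    total = g + s + sum(error_variances)       # variance of the sum score
    return (g + s) / total, g / total

# Hypothetical six-item scale with two nested factors of three items each
general = [0.7] * 6
groups = [[0.3] * 3, [0.3] * 3]
errors = [1 - 0.7 ** 2 - 0.3 ** 2] * 6

omega_total, omega_h = bifactor_omegas(general, groups, errors)
print(round(omega_total, 3), round(omega_h, 3))
```

When omega hierarchical is close to omega total, most of the reliable variance in the total score is due to the general factor, which is the situation in which reporting a single general score is most defensible.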
15.2 Study 2: Applications of Covariate-dependent Reliability
Study two presented two applications of Bentler’s newly proposed covariate-dependent and
covariate-free reliability approach. The applications demonstrated in this study were
the reliability assessment of WOAQ in relation to occupation type and the effects of CMB
on reliability. Using the covariate-dependent and covariate-free approach, it was demonstrated that
although WOAQ showed acceptable reliability in a nursing and a paramedic organisation
separately, when these samples were combined a considerable proportion of the WOAQ reliability was
attributable to the organisation type. Surprisingly, the results showed that although ‘within’
organisation reliability exists for the WOAQ, ‘between’ organisation assessments failed to
demonstrate a high degree of reliability between the nursing and paramedic samples. The reasons
for such differences in reliability were explained in terms of the differences between these
organisations, their demographic characteristics, the different pace of work, different work
settings and different ways of interacting with the patient and providing service delivery. Often
scholars neglect to perform reliability assessments for their scales even when being used for the
first time in a new setting. WOAQ is one example of many scales that are highly influenced by
the type of organisation and/or the demographic characteristics of the population.
The second application of Bentler’s covariate-dependent approach was demonstrated
in the context of CMB. This new procedure was proposed for assessing the effects of CMB on
the reliability of a model. It appears that CMB has a marked effect on the reliability of the
model considered in this study. This seems to be an interesting area to be explored in further
research. The presence of CMB was backed up by further analysis using a confirmatory factor
analysis (CFA) marker approach for controlling for CMV/CMB. The results supported the
presence of CMB in this application therefore supporting the influence of CMB on the reliability
of the scales. This is an important finding; if scholars using covariate-dependent reliability
assessment can show any CMB effect, then they must control for CMB in the rest of their
analysis using a marker variable or some other approach. Covariate-dependent reliability
assessment therefore provides a new quick and easy method for testing for CMB/CMV.
Better ways to prevent and control for the possible effects of CMV/CMB are recommended, focusing
on the causes and consequences of CMB and using preventive as well as statistical
procedures. In this study, using a preventive procedure, one of the potential common
method biases (social desirability) was detected and measured. Using statistical procedures,
unmeasured sources of possible bias (CMV) were also controlled and evaluated.
15.3 Study 3: Model-based Reliability and Validity of Reflective-formative Model of WAS
In study 3, a comprehensive statistical and theoretical explanation of the differences
between formative and reflective models was presented. Then, based on the literature, a
comprehensive, simple and easy to follow decision making tree was proposed to easily
distinguish reflective from formative models. The next aim was to illustrate how big the
misspecification problem is in the area of organisational psychology. Although a few literature
reviews have been carried out in other disciplines highlighting the misspecification rate, no study
has been carried out in the psychology area. Given that scholars in the psychology discipline
usually hold strong statistical knowledge, there was an expectation of a lower level of error in this
area. Using this decision making tree, a broad literature review of two top journals of
Organisational Psychology was undertaken over a 9-year period (2006-2014). The two
researchers showed a high level of agreement (Kappa=.89) in identifying misspecified
models using the proposed decision making tree. An 18 percent misspecification rate was found
in the articles reviewed from these two organisational psychology journals.
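As a brief illustration of how an inter-rater agreement statistic such as the Kappa reported above can be computed, the following Python sketch implements Cohen's kappa for two raters. It is assumed here that the reported statistic is Cohen's kappa; the ratings shown are hypothetical and are not the classifications made in this review, and the function name cohens_kappa is introduced only for illustration.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters classifying the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions per label.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if p_chance == 1:
        # Both raters used a single identical label for every item.
        return 1.0
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical classifications of ten measurement models
# ("C" = correctly specified, "M" = misspecified).
rater_1 = ["C", "C", "M", "C", "M", "C", "C", "M", "C", "C"]
rater_2 = ["C", "C", "M", "C", "C", "C", "C", "M", "C", "C"]
kappa = cohens_kappa(rater_1, rater_2)
print(f"kappa = {kappa:.2f}")
```

Kappa corrects the raw percentage agreement for the agreement expected by chance, which matters when one category (here, correctly specified models) is much more common than the other.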
One of the main reasons for misspecification could be a lack of knowledge or problems in
fitting formative models. The majority of the readily available software for SEM is designed for
fitting reflective models. Therefore the main aim of study 3 was to empirically fit and evaluate
the validity and model-based reliability of a mixed reflective-formative model for WAS. WAS is
misspecified in the literature as a reflective model. The second aim was to compare and to
demonstrate the outcomes of model misspecification and the likelihood of Type I and II errors.
As a first step, it is important to design and distinguish the structure of a measurement
model before commencing the data collection. Using the literature and conceptual background of
the constructs, it should be determined at the outset whether the constructs are formative or
reflective. The decision flowchart was applied in the context of a work ability survey (WAS).
Using empirical data, an evaluation of the reflective-formative, formative-formative and
reflective-reflective higher order models was performed. In this evaluation, two model fitting
procedures (CB-SEM and PLS-SEM) were used for all models. Unfortunately, due to
identification problems, the evaluation of a formative-formative model using CB-SEM was not
possible. Two common procedures, repeated indicators and a two-stage path-modeling approach
were used for fitting the three models using PLS-SEM with SmartPLS software. The fitted
models showed major differences between the correctly specified reflective-formative model and
the incorrectly specified reflective-reflective and formative-formative models. For the incorrectly
specified reflective-reflective model, the structural paths were significantly inflated compared to
the correctly specified reflective-formative model, suggesting a higher probability of Type I
errors. Interestingly, this was more of a problem when PLS-SEM was used to fit the reflective-reflective
model than when CB-SEM was used. The comparison of the incorrectly specified fully
formative-formative model with the correctly specified reflective-formative model showed some
deflated/nonsignificant loadings. This was more evident for the lower order constructs,
suggesting a higher probability for Type II error. These findings exhibited empirically the
dangers of model misspecification.
It is highly recommended that scholars specify their measurement models with more
caution in order to avoid Type I or II errors. The nature of constructs needs to be identified before
the model fitting software is chosen. The theoretical background should always be considered as
a first step to identify and conceptualise the nature of constructs (reflective vs formative or
mixed).
15.4 Thesis contributions to SEM
The contribution of this thesis and directions for future research are discussed in detail at
the end of each chapter. A summary of the contribution of the thesis, specifically in regard to the
SEM discipline, is presented below. The findings of the three studies undertaken in this project
contribute to SEM by:
- Path diagrams showing history of SEM and model based reliability. An overview
of the literature on SEM and model-based reliability in psychology was provided using
two simple, yet comprehensive path diagrams. In Chapter 2, an overview of the
development of SEM in psychology was presented using a path diagram (Figure 2.1).
Similarly, in Chapter 3 a history of model-based reliability was presented using a path
diagram (Figure 3.1). This diagram illustrates the history, recent developments and
current gaps in the literature, and highlights some justifications for carrying out the studies in
this thesis. These diagrams can be utilized as effective training tools for
both Statistics and Psychology students to better understand the early roots of SEM and
how more recent SEM developments relate to each other.
- Validating a bifactor model in a health setting using SEM. A comprehensive
procedure for the validation of a bifactor measurement model was assessed in study 1
using SEM. Study of bifactor models and their implications is a neglected area in the
literature and especially in the psychology discipline. These findings shed more light on
this poorly investigated area.
- Model-based reliability of a bifactor scale. Calculating and comparing the model-based
reliability coefficients of a bifactor model with the overestimated conventional
coefficient alpha demonstrated the importance of using model-based reliability for
multidimensional constructs or complex scales.
- Cross-validation of a bifactor scale using latent factor means and covariance
structures (MACS) procedure in SEM. In study 1, cross-validation of a bifactor
measurement scale WOAQ was assessed across gender using MACS. The conventional
procedure for the cross-validation of measurement models considers only covariances and
observed means. Using MACS, the cross-validation goes beyond the assessment of observed parameter
invariance and looks at the mean differences at the construct level. Applying this procedure
to a bifactor model is a contribution to the SEM literature that provides a more
comprehensive assessment of the validity of a scale in different populations.
- Presenting an empirical application for the novel concept of covariate-dependent
reliability using SEM. Two new applications of covariate-dependent reliability were
introduced for the first time in study 2. Using an empirical example, it is shown how ‘type
of occupation’ can affect the reliability of a scale. As such, a tool that is highly reliable in
one specific organisation might show very poor reliability in another organisation after
controlling for confounding variables. For example, controlling for ‘organisation type’
reduces the reliability of a scale considerably. This procedure is expected to have many
implications in the SEM discipline and for issues related to model-based reliability.
- Demonstrating the novel application of covariate-dependent reliability in the
evaluation of CMB using SEM. A novel approach is proposed in study 2 by drawing
attention to the possible effects of CMB on the reliability of a model. In this study the
covariate-dependent reliability procedure was applied to assess the effects of CMB
(measured using a social desirability scale) on the reliability of a model. The results
highlight clearly how CMB can influence the reliability of scales. This is a novel area of
study and will have many applications in further studies.
- Developing a flowchart for distinguishing formative versus reflective models.
Providing a simple, yet comprehensive guideline using a flowchart (Figure 5.5) for
distinguishing between formative and reflective SEM measurement models was another
contribution to SEM. By having a clear guideline or procedure, researchers gain better
confidence in specifying the nature of new appropriate measurement models (e.g.
formative models). Lack of knowledge or clear rules/principles may otherwise create a
high risk of misspecification in the field.
- Demonstrating the misspecification rate of SEM measurement models (formative
vs. reflective) in the Organisational Psychology literature. As mentioned in the literature
review of Chapter 5, lack of knowledge is one of the reasons for the misspecification in
the case of the measurement models used in the field of Organisational Psychology.
Presenting a misspecification review for a 9-year period in the Organisational Psychology
literature will create some awareness of the extent of the problem. The results showed an
18% misspecification rate, suggesting a problem in this discipline, as is the case in some
other disciplines. Misspecification of measurement models may lead to incorrect findings
and the development of misleading theories.
- Presenting an empirical example for fitting a reflective, a formative and a
reflective-formative model using a partial least squares-based SEM approach. The
majority of the SEM software on the market is built mainly for conducting CB-SEM
evaluations. However, fitting a formative model of any type using CB-SEM is deemed to
be difficult and usually results in identification problems. One of the solutions in such
situations is the use of a PLS-SEM program to fit formative models. This is still a new
area of study and as a result there is limited knowledge on fitting and evaluation
approaches. In study 3, three different types of SEM measurement model were fitted and
evaluated using PLS-SEM. The procedures developed in this thesis for fitting higher-
order models for mixed models using PLS-SEM therefore represent a significant
methodological advance.
- Empirical comparisons of correctly specified versus misspecified measurement
models. The majority of the studies in the SEM discipline compare different types of
measurement models using simulation studies. In study 3, empirical data was used to
evaluate different types of measurement model using two common approaches (CB-SEM
and PLS-SEM).
- Assessing the likelihood of Type I and Type II errors as a result of measurement
model misspecification. The results of study 3 clearly showed how misspecified models
can lead to inflated, deflated or non-significant results, thereby increasing the risk of Type
I and/or Type II errors. These findings highlight the importance of correctly specifying
measurement models, thereby avoiding fundamental biases or errors. The danger of Type
I and II errors was well highlighted in study 3. This contributes to increasing awareness
among scholars about the consequences of measurement model misspecification.
These findings are all based on solid SEM theory and they are illustrated with
empirical analyses, which are of interest in their own right.
15.5 Summary
Overall this thesis has shown interesting applications where SEM is used for evaluating
model-based reliability and validity using both CB-SEM and PLS-SEM procedures. It has
also highlighted the procedures and applications of model-based reliability and validity
for under-investigated measurement models, such as the bifactor and mixed reflective-formative
models. In particular it has been shown how SEM makes possible the
estimation of model-based reliability, covariate-dependent reliability and covariate-free
reliability. In addition this thesis has demonstrated the need for careful identification of
the nature of constructs as formative or reflective in Organisational Psychology and the
usefulness of PLS-SEM for fitting formative models. Recent SEM developments suggest
that the importance of SEM will be growing in the future as its capabilities become more
powerful. It is hoped that this thesis has contributed to this growth in a small way.
16 APPENDICES
PLEASE NOTE: The articles listed below are not able to be reproduced online. Please consult the print copy of this thesis held in the Swinburne library.
Karimi, L & Meyer, D 2015, ‘Validity and model-based reliability of the work organisation assessment questionnaire among nurses’, Nursing Outlook, vol. 63, no. 3, pp. 318-330, doi: 10.1016/j.outlook.2014.09.003
Karimi, L & Meyer, D 2014, ‘Structural equation modelling in psychology: the history, development and current challenges’, International Journal of Psychology Studies, vol. 6, no. 4, pp. 123-133, doi: 10.5539/ijps.v6n4p123
Karimi, L 2015 (in press), ‘Cross-validation of the work organization assessment questionnaire across gender: a study of Australia health organization’, Journal of Occupational and Environmental Medicine.
16.5 EVALUATING A HIGHER-ORDER MISSPECIFIED REFLECTIVE MODEL OF
WAS USING CB-SEM.
Using the AMOS and CB-SEM approach, the misspecified reflective models of WAS
were assessed using ML estimation, assuming normally distributed data. However, Mardia’s
Multivariate Kurtosis coefficient of 120.6 suggests that normality assumptions are not supported
(DeCarlo, 1997). Therefore bootstrapping methods were used to determine bias-corrected
confidence intervals for the parameter estimates. The bootstrap analysis indicated that the
structural paths were significant. Based on the results from the second-order model in Figure 16.1
and according to Byrne (2009), this reflective model of WAS describes the data well (χ2/df =
2.54, CFI=.95, RMSEA=.03). The standardised path parameter estimates are presented in Figure
16.1 and Table 16.1. The loading for Organisational Capacity is clearly stronger than the loading
for Personal Capacity, suggesting that Organisational Capacity is a more important component of
work ability than Personal Capacity in the Australian context.
Figure 16.1. The standardised path parameter estimates of the misspecified reflective WAS model
Table 16.1 The Standardised Path Parameter Estimates for the Misspecified Reflective WAS Model Using the CB-SEM Procedure
Estimate
LEISURE <--- PERSONAL CAPACITY .44
HOME-WORK BALANCE <--- PERSONAL CAPACITY .34
WORK-HOME BALANCE <--- PERSONAL CAPACITY .57
PHYSICAL HEALTH <--- PERSONAL CAPACITY .68
MENTAL HEALTH <--- PERSONAL CAPACITY .71
CONTROL <--- ORG. CAPACITY .53
TRAINING <--- ORG. CAPACITY .32
HARASSMENT <--- ORG. CAPACITY .52
SUPPORT <--- ORG. CAPACITY .65
RESPECT <--- ORG. CAPACITY .95
TRUST <--- ORG. CAPACITY .92
ORG. CAPACITY <--- WORK ABILITY INDEX (WAI) .72
PERSONAL CAPACITY <--- WORK ABILITY INDEX (WAI) .62
The reliability of the reflective WAS subfactors was assessed. The results are
presented in Table 16.2. In summary, the majority of the subfactors produced acceptable
levels of model-based reliability (CR>0.60) (Byrne, 2009) with the exception of “leisure”
(CR=0.59), “physical health” (CR=0.42), and “training” (CR=0.59). Construct reliability will
be discussed in more detail in the next chapter.
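The composite reliability values in Table 16.2 can be illustrated with a short Python sketch. It assumes the commonly used formula for standardised loadings with uncorrelated errors, CR = (Σλ)² / ((Σλ)² + Σ(1 − λ²)), which is not stated explicitly in this appendix; the function name is hypothetical. Applied to the three leisure loadings from Table 16.2, the result rounds to the reported CR of .59.

```python
def composite_reliability(loadings):
    """Composite reliability from standardised loadings (uncorrelated errors assumed)."""
    sum_loadings_sq = sum(loadings) ** 2
    # With standardised indicators, each error variance is 1 - loading^2.
    error_variance = sum(1 - l ** 2 for l in loadings)
    return sum_loadings_sq / (sum_loadings_sq + error_variance)


# Standardised loadings of Q56d, Q56h and Q56i on LEISURE (Table 16.2).
leisure = [0.351, 0.586, 0.743]
cr_leisure = composite_reliability(leisure)
print(f"CR (leisure) = {cr_leisure:.2f}")
```

Unlike coefficient alpha, this coefficient weights each indicator by its own loading, so a single weak indicator (such as Q56d) lowers the estimate without requiring the items to be equally reliable.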
Convergent validity is defined as “the extent to which a measure correlates positively
with alternative measures of the same construct” (Hair et al., 2014, p 102). A procedure to
evaluate convergent validity uses the average variance extracted (AVE) (Fornell & Larcker,
1981). A cutoff of greater than 0.50 is recommended, meaning that the construct accounts for more than 50
per cent of the variance of its indicators (Fornell & Larcker, 1981). Discriminant validity
was assessed by comparing the intercorrelations between subfactors with the
square root of each construct’s average variance extracted (AVE) (Table 16.3). If the square root
of the AVE was higher than the construct’s highest correlation with other constructs, then
discriminant validity was supported. Based on the results, discriminant validity was shown for all
factors, with no cross-loadings, apart from the “trust” construct. There is a high cross-loading
between the “trust” and “respect” constructs, showing a lack of discriminant validity for these two
subfactors.
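The AVE calculation and the square-root-of-AVE comparison described above can be sketched in Python as follows. The AVE is taken as the mean of the squared standardised loadings, and the comparison follows the Fornell and Larcker (1981) criterion; the function names are hypothetical, and only three organisational capacity subfactors are included to keep the example short. The input values are those reported in Tables 16.2 and 16.3.

```python
import math


def average_variance_extracted(loadings):
    """AVE: mean of the squared standardised loadings of a construct."""
    return sum(l ** 2 for l in loadings) / len(loadings)


def fornell_larcker(ave, correlations):
    """Return, per construct, whether sqrt(AVE) exceeds its highest correlation."""
    result = {}
    for name, value in ave.items():
        highest = max(abs(r) for pair, r in correlations.items() if name in pair)
        result[name] = math.sqrt(value) > highest
    return result


# AVE from the leisure loadings in Table 16.2.
ave_leisure = average_variance_extracted([0.351, 0.586, 0.743])

# AVE values (Table 16.2) and inter-construct correlations (Table 16.3)
# for three of the organisational capacity subfactors.
ave = {"SUPPORT": 0.69, "RESPECT": 0.72, "TRUST": 0.67}
correlations = {
    frozenset({"SUPPORT", "RESPECT"}): 0.61,
    frozenset({"SUPPORT", "TRUST"}): 0.59,
    frozenset({"RESPECT", "TRUST"}): 0.87,
}
check = fornell_larcker(ave, correlations)
print(round(ave_leisure, 2), check)
```

In this sketch the correlation of .87 between respect and trust exceeds the square root of the AVE of either construct, so both are flagged, in line with the conclusion above about these two subfactors.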
Table 16.2 The Parameter Estimates for First-order Reflective Model Using the CB-SEM Procedure
Path                            Estimate*  AVE  CR   Cronbach’s alpha  Convergent validity
Q56d <--- LEISURE               .351       .34  .59  .79               NO
Q56h <--- LEISURE               .586
Q56i <--- LEISURE               .743
Q5 <--- HOMEWORK_BALANCE        .922       .67  .80  .89               YES
Q51d <--- HOMEWORK_BALANCE      .700
Q51b <--- WORKHOME_BALANCE      .824       .76  .86  .93               YES
Q51a <--- WORKHOME_BALANCE      .922
Diagnosis <--- PHYSICAL_HEALTH  .371       .28  .42  .67               NO
Q37 <--- PHYSICAL_HEALTH        .652
Q47c <--- MENTAL_HEALTH         .658       .90  .83  .91               YES
Q47b <--- MENTAL_HEALTH         .850
Q47a <--- MENTAL_HEALTH         .850
Q19c <--- CONTROL               .827       .56  .89  .86               YES
Q19b <--- CONTROL               .719
Q19a <--- CONTROL               .710
Q15a <--- CONTROL               .734
Q15b <--- CONTROL               .797
Q15c <--- CONTROL               .735
Q30e <--- TRAINING              .545       .26  .59  .80               NO
Q30d <--- TRAINING              .543
Q30c <--- TRAINING              .580
Q30a <--- TRAINING              .381
Q42h <--- HARASSMENT            .538       .32  .66  .83               NO
Q42d <--- HARASSMENT            .621
Q42c <--- HARASSMENT            .493
Q42a <--- HARASSMENT            .624
Q27a <--- SUPPORT               .829       .69  .87  .93               YES
Q27e <--- SUPPORT               .851
Q27f <--- SUPPORT               .827
Q22c <--- RESPECT               .858       .72  .89  .94               YES
Q22d <--- RESPECT               .898
Q22e <--- RESPECT               .804
Q24e <--- TRUST                 .757       .67  .86  .92               YES
Q24f <--- TRUST                 .873
Q24g <--- TRUST                 .838
Note: *= all loadings are significant at P<0.05. CR=composite reliability (model-based reliability)
Table 16.3 Intercorrelation analysis and the square roots of AVE for all subfactors
(Columns 1-5: Personal Capacity subfactors; columns 6-11: Org. Capacity subfactors)
Subfactor             Discriminant Validity  1    2    3    4    5    6    7    8    9    10   11
PERSONAL CAPACITY     .45
1. LEISURE            YES                    .58
2. HOME-WORK BALANCE  YES                    .14  .82
3. WORK-HOME BALANCE  YES                    .25  .19  .87
4. PHYSICAL HEALTH    YES                    .29  .23  .39  .53
5. MENTAL HEALTH      YES                    .30  .24  .40  .48  .95
ORG. CAPACITY
6. CONTROL            YES                    .10  .08  .13  .16  .16  .75
7. TRAINING           YES                    .06  .04  .08  .09  .10  .16  .51
8. HARASSMENT         YES                    .10  .08  .13  .16  .16  .27  .16  .57
9. SUPPORT            YES                    .12  .09  .16  .19  .20  .33  .20  .33  .83
10. RESPECT           YES                    .18  .14  .24  .29  .30  .50  .30  .49  .61  .85
11. TRUST             NO                     .18  .14  .23  .28  .29  .48  .29  .47  .59  .87  .82
† The square roots of the average variance extracted (AVE) are presented in bold.
16.5.1 Measurement Model Evaluation Results for the PLS-SEM Misspecified
Reflective Model:
To be able to compare the path coefficients of a correctly specified reflective-
formative model with a misspecified reflective model, another analysis was run using PLS-
SEM for a full reflective model (Figure 16.2). A similar procedure to the one described
above was used for model construction in PLS, but modified to demonstrate a reflective
model. The results are presented in the following section with Table 16.4 and Table 16.5
indicating significant paths in all cases.
Figure 16.2. The reflective model of WAS using PLS-SEM
Table 16.4 The Path Coefficients Results for Second-order Reflective Constructs of Reflective PLS-SEM Model (n=5000 bootstrap samples).
Path                            Standardised Path Coefficients (M)  T Value  Support*
LEISURE -> PERSONAL             .55                                 18.15    YES
HOME-WORK -> PERSONAL           .46                                 11.26    YES
WORK-HOME -> PERSONAL           .65                                 29.32    YES
PHYSICAL HEALTH -> PERSONAL     .53                                 17.58    YES
MENTAL HEALTH -> PERSONAL       .80                                 60.53    YES
CONTROL -> ORGANISATIONAL       .58                                 21.77    YES
TRAINING -> ORGANISATIONAL      .39                                 13.24    YES
HARASSMENT -> ORGANISATIONAL    .50                                 17.08    YES
SUPPORT -> ORGANISATIONAL       .74                                 55.38    YES
RESPECT -> ORGANISATIONAL       .87                                 125.56   YES
TRUST -> ORGANISATIONAL         .84                                 99.62    YES
Note: * p<0.05
Table 16.5 The Path Coefficients Results for Higher-order Reflective Constructs of Reflective PLS-SEM Model (n=5000 bootstrap samples).
Path                     Standardised Path Coefficients (M)  T Value  Support*
ORGANISATIONAL -> WAS    0.90                                145.71   YES
PERSONAL -> WAS          0.71                                38.36    YES
Note: * p<0.05
As demonstrated in Figure 16.3 and Table 16.5, the WAS reflective-reflective
model presents significant regression paths for higher-order constructs of the reflective
WAS. As before, the path coefficients for organisational capacities (β=0.90) represent a
stronger path than those for personal capacities (β=0.71). However, both these paths are
much stronger than those obtained for the reflective-formative model, suggesting that the
risk of Type I errors has been increased as a result of the second-order model
misspecification. These paths are also much stronger than those for the reflective-reflective
model fitted using CB-SEM, suggesting perhaps that models fitted using PLS-SEM are
more sensitive to model misspecification than models fitted using CB-SEM.
Figure 16.3. The reflective WAS development using PLS path modeling.
16.5.2 Measurement Model Evaluation Results for the PLS-SEM Misspecified Full
Formative Model:
To be able to compare the path coefficients of a correctly specified reflective-formative
model with a misspecified formative-formative model, another analysis was
carried out using PLS-SEM for a full formative model (Figure 12.5). Steps similar to those used in
building the reflective-formative model were adopted, with one main difference: throughout the
model building process using PLS, all the indicators at first and second order and the
repeated measures were regarded as formative. A snapshot of the process is presented in
Figure 16.4.
Step 1: Building the repeated measures of the personal capacities construct
Step 2: Building the repeated measures of the organisational capacities construct
Step 3: Building the repeated measures of WAS
Figure 16.4. The model building process for full formative model of WAS using PLS-SEM.
The path coefficients from the first-order constructs are presented in
Table 16.6. As shown in the table, all path coefficients are significant except for two
sub-constructs of personal capacities (leisure and home-work) and one sub-construct of
organisational capacities (support). However, the T values are much smaller than was the
case for the correctly specified reflective-formative model, suggesting the occurrence of
Type II error as a result of the misspecification of the first-order model.
Table 16.6 The Path Coefficients Results for Second-order Reflective Constructs of Full Formative PLS-SEM Model (n=5000 bootstrap samples).
Path                            Standardised Path Coefficients (M)  T Value  Support
LEISURE -> PERSONAL             -0.02                               0.40     NO†
HOME-WORK -> PERSONAL           -0.01                               0.16     NO†
WORK-HOME -> PERSONAL           0.70                                11.94    YES
PHYSICAL HEALTH -> PERSONAL     0.20                                2.72     YES
MENTAL HEALTH -> PERSONAL       0.40                                5.23     YES
CONTROL -> ORGANISATIONAL       0.24                                3.43     YES
TRAINING -> ORGANISATIONAL      -0.15                               2.08     YES
HARASSMENT -> ORGANISATIONAL    0.36                                4.97     YES
SUPPORT -> ORGANISATIONAL       0.04                                0.58     NO†
RESPECT -> ORGANISATIONAL       0.33                                2.87     YES
TRUST -> ORGANISATIONAL         0.36                                3.12     YES
Note: † p>0.05, hence not significant.
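The link between the bootstrap T values and the Support column in Table 16.6 can be illustrated with a short Python sketch. It assumes a two-tailed test at the 5% level using a standard normal approximation, which is reasonable with 5000 bootstrap samples; the exact reference distribution used by the software is not stated here, and the function name is hypothetical.

```python
import math


def two_tailed_p(t_value):
    """Two-tailed p-value for a T value under a standard normal approximation."""
    return math.erfc(abs(t_value) / math.sqrt(2))


# T values for selected paths in Table 16.6.
t_values = {
    "LEISURE -> PERSONAL": 0.40,
    "HOME-WORK -> PERSONAL": 0.16,
    "TRAINING -> ORGANISATIONAL": 2.08,
    "SUPPORT -> ORGANISATIONAL": 0.58,
}
significant = {path: two_tailed_p(t) < 0.05 for path, t in t_values.items()}
for path, flag in significant.items():
    print(f"{path}: p = {two_tailed_p(t_values[path]):.3f}, significant = {flag}")
```

Under this approximation a path is supported when its absolute T value exceeds roughly 1.96, so a deflated T value in a misspecified model can move a genuinely non-zero path into the unsupported category, which is the Type II error risk discussed above.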
Figure 16.5. The full formative model of WAS using PLS-SEM
As shown in Figure 16.5 and Table 16.7, the WAS full formative-formative model
demonstrates significant regression paths for the higher-order constructs. The path
coefficients for organisational capacities (β=0.68) and for personal capacities (β=0.51) are
similar to those for the reflective-formative model.
Table 16.7 The Path Coefficient Results for Higher-order Reflective Constructs (n=5000 bootstrap samples).
Path                     Standardised Path Coefficients (M)  T Value  Support
ORGANISATIONAL -> WAS    0.68                                75.29    YES
PERSONAL -> WAS          0.51                                50.60    YES
Note: p<0.05
16.6 DEFINITIONS OF IMPORTANT TERMS
Measure. A measure in this study is defined as “a quantified record, or datum, taken as an
empirical analogy to a construct” (Edwards and Bagozzi, 2000, p. 156). In this definition,
measure is not a tool for data gathering, instead it is considered to be an observed score
gathered through self-report, interview, observation or some other means (e.g. Messick,
1995).
Construct. A construct is a conceptual term used to describe a phenomenon of theoretical
interest (Cronbach & Meehl, 1955; Nunnally, 1978 and Schwab, 1980 as cited by Edwards
& Bagozzi, 2000). In this definition construct refers to a real phenomenon (observable or
unobservable) in an abstract sense. As acknowledged by Edwards and Bagozzi (2000),
these phenomena involve some degree of measurement error and must be viewed through
an imperfect epistemological lens.
Measurement error. Measurement error is defined “as that part of an observed variable
that is not 'determined by' a construct” (Lord and Novick, 1968, p. 531).
Reflective and formative models. Reflective and formative models are known also as effect and cause
indicators respectively (Blalock, 1971). A reflective model considers effect indicators and
a formative model considers cause indicators. Detailed characteristics of both models will
be discussed in chapter two.
Covariate-dependent reliability. Bentler (in press; personal communications, 2013)
defines (group) covariate-dependent reliability as “… a measure of the group differences
on the trait being measured relative to total variation, while covariate-free reliability is a
measure of the reliable individual difference variance freed from any mean differences due
to the covariate(s)”.
Higher-order model. In multidimensional measurement constructs, a minimum of two
levels of constructs exist: the first-order level with indicators and the second-order (higher-
order) level with first-order constructs (Jarvis et al., 2003). Such models are known as
hierarchical (higher-order or second-order in this example) models.
Bifactor model. Among the higher-order models, a bifactor model is defined as one in which all
latent variables are modelled as first-order constructs, with the first-order factors nested
within the general factor (Gignac, 2007; Gustafsson & Balke, 1993; Holzinger &
Swineford, 1937).
Reliability rho (ρ11). Within the setting of model-based reliability, Jöreskog (1971)
introduced the analysis of congeneric measures to calculate the reliability coefficient
ρ11. This coefficient is perhaps one of the earliest
proposals for assessing the reliability of a 1-factor model without assuming equal item reliabilities
(Gerbing & Anderson, 1988).
Omega total reliability coefficient. Similar to reliability rho (ρ11), omega total (ωt)
estimates the combined proportion of true score variance from the general factor and any
subscales (McDonald, 1978).
Omega hierarchical reliability coefficient. Omega hierarchical (ωh) estimates the degree
of proficiency of a test measure in assessing the reliability of a hierarchical model
(Revelle & Zinbarg, 2008; Zinbarg, Revelle, Yovel, & Li, 2005).
Omega subscales reliability coefficient. Omega subscale (ωs) determines the degree of
reliability of the subscale scores of a bifactor model after controlling for the reliable
variance generated from the general factor (Reise et al., 2012).
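The three omega coefficients defined above can be illustrated with a minimal Python sketch. It assumes an orthogonal bifactor model with standardised loadings in which every item loads on the general factor and on exactly one group factor, and it uses the commonly cited formulas for these coefficients; the loadings and the function name are hypothetical and are not estimates from this thesis.

```python
def bifactor_omegas(general, specific):
    """Omega total, omega hierarchical and omega subscale for a bifactor model.

    general:  {item: standardised loading on the general factor}
    specific: {subscale: {item: standardised loading on that group factor}}
    """
    # Unique variance of each standardised item: 1 - general^2 - specific^2.
    unique = {}
    for items in specific.values():
        for item, loading in items.items():
            unique[item] = 1 - general[item] ** 2 - loading ** 2

    general_part = sum(general.values()) ** 2
    specific_parts = {s: sum(items.values()) ** 2 for s, items in specific.items()}
    total_variance = general_part + sum(specific_parts.values()) + sum(unique.values())

    # Omega total: all common variance (general plus group factors) over total variance.
    omega_total = (general_part + sum(specific_parts.values())) / total_variance
    # Omega hierarchical: general factor variance only, over total variance.
    omega_hierarchical = general_part / total_variance

    # Omega subscale: group factor variance over the variance of that subscale's score.
    omega_subscale = {}
    for s, items in specific.items():
        g_sub = sum(general[i] for i in items) ** 2
        u_sub = sum(unique[i] for i in items)
        omega_subscale[s] = specific_parts[s] / (g_sub + specific_parts[s] + u_sub)
    return omega_total, omega_hierarchical, omega_subscale


# Hypothetical standardised loadings for six items and two group factors.
general = {"x1": 0.6, "x2": 0.6, "x3": 0.6, "x4": 0.5, "x5": 0.5, "x6": 0.5}
specific = {
    "A": {"x1": 0.4, "x2": 0.4, "x3": 0.4},
    "B": {"x4": 0.5, "x5": 0.5, "x6": 0.5},
}
omega_t, omega_h, omega_s = bifactor_omegas(general, specific)
print(round(omega_t, 3), round(omega_h, 3), {k: round(v, 3) for k, v in omega_s.items()})
```

Because omega hierarchical counts only the general factor variance, it cannot exceed omega total; a large gap between the two indicates that a substantial share of the reliable variance belongs to the group factors.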
Common method bias. Common method bias (CMB) is a type of error inherent in a
measure attributed to the particular method used for data collection (Bagozzi and Yi,
1990).
Common method variance. Common method variance (CMV) is a major type of
systematic measurement error (Bagozzi & Yi, 1990). It represents the variance in
measurement generated from the specific instrument used to collect the data (Spector,
1987).
Type I error. Type I error is a false positive error that occurs when a path is declared as
significant when it is not really significant (rejecting the null hypothesis when it is true)
(Jarvis et al., 2003; MacKenzie, Podsakoff, & Jarvis, 2005).
Type II error. Type II error is a false negative that occurs when a path is declared nonsignificant when
it is really significant (failure to reject the null hypothesis when it is not true) (Jarvis et al.,
2003; MacKenzie, Podsakoff, & Jarvis, 2005). MacKenzie et al. (2005) reported that the
primary cause of Type II errors is when both constructs (i.e. exogenous and endogenous)
are misspecified as reflective instead of formative, resulting in a higher standard error for
the parameter being reported.
Convergent validity of constructs. This type of validity evaluation refers to convergent
validity where the individual loadings of each indicator on its own construct are the focus
(Mackenzie, Podsakoff & Podsakoff, 2011). Variable correlations within their related
factors refer to the convergent validity of the tool. If the t-ratios for the loadings are
significant and the factor loadings are above the recommended level (0.40), the convergent
validity of the scale is supported (Hair et al., 2014).
Analysis of covariance (COVS) in invariance testing. This type of invariance procedure
refers to comparing the covariance structure of the model parameters (e.g. factor loading,
measured-variable loading, variance/covariance of errors or factor residuals) across groups
using analysis of covariance (COVS) (Byrne, 2009).
The invariance analysis of mean and covariance structure (MACS). This procedure
refers to invariance testing of construct means. Once the invariance of the covariance
structure has been established, the invariance of construct means can be evaluated using
MACS (Cheung & Rensvold, 2002; Widaman and Reise, 1997). MACS was first
introduced by Sörbom (1974) for the cross validation of SEM models.
16.7 THE WOAQ AND ITS SUBFACTORS ITEMS.
Item number/Factor
Quality of physical environment
1- Facilities for taking breaks
2- Work surroundings
4- Exposure to physical danger
9- Safety at work
18- The equipment/IT that you use
20- Work station/work space
Quality of relationship with colleagues
10- Your relationship with your co-workers (socially)
28- How well you work with your co-workers (as a team)
Quality of relationship with management
3- Clear roles and responsibilities
5- Support from line manager/supervisor
7- Feedback on your performance
11- Appreciation of efforts from line managers/supervisors
16- Senior management attitudes
17- Clear reporting line(s)
22- Communication with line manager/supervisor
26- Status/recognition in the workplace
27- Clear workplace objectives, values, procedures
Reward and recognition
12- Consultation about changes in your job
13- Adequate training for your current job
14- Variety of different tasks
21- Opportunities for promotion
23- Opportunities for learning new skills
24- Flexibility of working hours
25- Opportunities to use your skills
Workload issues
6- Pace of work
8- Your work load
15- Impact of family/social life on work
19- Impact of your work on family/social life
16.8 The R-WAS questionnaire
The complete R-WAS questionnaire (copied with permission from Prof Philip
Taylor, the Redesigning Work for an Ageing Society research program conducted by the
Business, Work & Ageing Centre for Research (BWA) at Swinburne University of
Technology) (2009)
16.9 List of items used in construction of WAS
The list of items of WAS copied with permission from the Redesigning Work for
an Ageing Society research program conducted by the Business, Work and Ageing Centre
for Research (BWA) at Swinburne University of Technology (2009)
16.10 Ethics clearance
a) Letter of approval
To: Dr Denny Meyer, FLSS/Ms Leila Karimi
[BC: Ms Leila Karimi]
Dear Dr Meyer,
SUHREC Project 2011/175 The effects of common method variance on structural equation
modeling
Dr Denny Meyer, FLSS/Ms Leila Karimi
Approved Duration: 22/09/2011 To 28/02/2014
I refer to the ethical review of the above project protocol undertaken on behalf of
Swinburne's Human Research Ethics Committee (SUHREC) by SUHREC Subcommittee
(SHESC2) at a meeting held on 5 September 2011. Your response to the review as e-mailed
on 16 September 2011 was reviewed by a SHESC2 delegate.
I am pleased to advise that, as submitted to date, the project has approval to
proceed in line with standard on-going ethics clearance conditions here outlined.
- All human research activity undertaken under Swinburne auspices must conform to
Swinburne and external regulatory standards, including the National Statement on Ethical
Conduct in Human Research and with respect to secure data use, retention and disposal.
- The named Swinburne Chief Investigator/Supervisor remains responsible for any
personnel appointed to or associated with the project being made aware of ethics clearance
conditions, including research and consent procedures or instruments approved. Any
change in chief investigator/supervisor requires timely notification and SUHREC
endorsement.
- The above project has been approved as submitted for ethical review by or on behalf of
SUHREC. Amendments to approved procedures or instruments ordinarily require prior
ethical appraisal/ clearance. SUHREC must be notified immediately or as soon as possible
thereafter of (a) any serious or unexpected adverse effects on participants and any redress
measures; (b) proposed changes in protocols; and (c) unforeseen events which might affect
continued ethical acceptability of the project.
- At a minimum, an annual report on the progress of the project is required as well as at the
conclusion (or abandonment) of the project.
- A duly authorised external or internal audit of the project may be undertaken at any time.
Please contact me if you have any queries about on-going ethics clearance. The SUHREC
project number should be quoted in communication. Chief Investigators/Supervisors and
Student Researchers should retain a copy of this e-mail as part of project record-keeping.
Best wishes for the project.
Yours sincerely
XXXX
Secretary, SHESC2
*******************************************
XXXX
Administrative Officer (Research Ethics)
Swinburne Research (H68)
Swinburne University of Technology
P O Box 218
HAWTHORN VIC 3122
Tel +61 3 9214 8468
MEMORANDUM
RESEARCH SERVICES
To: Dr Leila Karimi, School of Public Health, Faculty of Health
Sciences
From: Secretary, La Trobe University Human Ethics Committee
Subject: Review of Human Ethics Committee Application No. 11-054
Title: The effects of common method variance on structural equation
modeling
Thank you for your recent correspondence in relation to the research project
referred to above. The project has been assessed as complying with the National
Statement on Ethical Conduct in Human Research. I am pleased to advise that your
project has been granted ethics approval and you may commence the study.
The project has been approved from the date of this letter until 31 December
2012.
Please note that your application has been reviewed by a sub-committee of the
University Human Ethics Committee (UHEC) to facilitate a decision about the study
before the next Committee meeting. This decision will require ratification by the full
UHEC at its next meeting and the UHEC reserves the right to alter conditions of approval
or withdraw approval. You will be notified if the approval status of your project changes.
The UHEC is a fully constituted Ethics Committee in accordance with the National
Statement on Ethical Conduct in Research Involving Humans- March 2007 under Section
5.1.29.
The following standard conditions apply to your project:
• Limit of Approval. Approval is limited strictly to the research proposal as
submitted in your application while taking into account any additional conditions advised
by the UHEC.
• Variation to Project. Any subsequent variations or modifications you wish
to make to your project must be formally notified to the UHEC for approval in advance
of these modifications being introduced into the project. This can be done using the
appropriate form: Ethics - Application for Modification to Project which is available on
the Research Services website at http://www.latrobe.edu.au/research-services/ethics/HEC_human.htm.
If the UHEC considers that the proposed changes are
significant, you may be required to submit a new application form for approval of the
revised project.
• Adverse Events. If any unforeseen or adverse events occur, including
adverse effects on participants, during the course of the project which may affect the
ethical acceptability of the project, the Chief Investigator must immediately notify the
UHEC Secretary on telephone (03) 9479 1443. Any complaints about the project
received by the researchers must also be referred immediately to the UHEC Secretary.
• Withdrawal of Project. If you decide to discontinue your research before its
planned completion, you must advise the UHEC and clarify the circumstances.
• Annual Progress Reports. If your project continues for more than 12
months, you are required to submit an Ethics - Progress/Final Report Form annually, on
or just prior to 12 February. The form is available on the Research Services website (see above
address). Failure to submit a Progress Report will mean approval for this project will
lapse. An audit may be conducted by the UHEC at any time.
• Final Report. A Final Report (see above address) is required within six months of the
completion of the project or by 30 June 2013.
If you have any queries on the information above or require further clarification
please contact me through Research Services on telephone (03) 9479-1443, or e-mail at:
On behalf of the University Human Ethics Committee, best wishes with your
research!
XXXX
Administrative Officer (Research Ethics) University Human Ethics Committee
Research Compliance Unit / Research Services
La Trobe University Bundoora, Victoria 3086
P: (03) 9479 – 1443 / F: (03) 9479 - 1464
http://www.latrobe.edu.au/research-services/ethics/HEC_human.htm
16.11 A List of Articles Included in the Review
No Authors Title Journal / year/ issue / page
1 John E. Mathieu and Lucy L. Gilson, Thomas M. Ruddy
Empowerment and Team Effectiveness: An Empirical Test of an Integrated Model Journal of Applied Psychology, 2006, Vol. 91, No. 1, 97–108
2 Yaping Gong and Jinyan Fan Longitudinal Examination of the Role of Goal Orientation in Cross-Cultural Adjustment Journal of Applied Psychology, 2006, Vol. 91, No. 1, 176–184
3 Christopher C. Rosen, Paul E. Levy, and Rosalie J. Hall
Placing Perceptions of Politics in the Context of the Feedback Environment, Employee Attitudes, and Job Performance
Journal of Applied Psychology, 2006, Vol. 91, No. 1, 211–220
4 Bradley J. Alge, Gary A. Ballinger, Subrahmaniam Tangirala, and James L. Oakley
Information Privacy in Organizations: Empowering Creative and Extrarole Performance Journal of Applied Psychology, 2006, Vol. 91, No. 1, 221–232
5 Sabine Sonnentag, Fred R. H. Zijlstra Job Characteristics and Off-Job Activities as Predictors of Need for Recovery, Well-Being, and Fatigue Journal of Applied Psychology, 2006, Vol. 91, No. 2, 330–350
6 Kimberly A. Eddleston, John F. Veiga and Gary N. Powell
Explaining Sex Differences in Managerial Career Satisfier Preferences: The Role of Gender Self-Schema Journal of Applied Psychology, 2006, Vol. 91, No. 2, 437–445
7 Sharon K. Parker, Helen M. Williams Modeling the Antecedents of Proactive Behavior at Work Journal of Applied Psychology, 2006, Vol. 91, No. 3, 636–652
8 J. Craig Wallace, Eric Popp and Scott Mondore
Safety Climate as a Mediator Between Foundation Climates and Occupational Accidents: A Group-Level Investigation Journal of Applied Psychology, 2006, Vol. 91, No. 3, 681–688
9 Douglas J. Brown, Richard T. Cober, Kevin Kane, Paul E. Levy and Jarrett Shalhoop
Proactive Personality and the Successful Job Search: A Field Investigation With College Graduates Journal of Applied Psychology, 2006, Vol. 91, No. 3, 717–726
10 Dishan Kamdar, Daniel J. McAllister and Daniel B. Turban
“All in a Day’s Work”: How Follower Individual Differences and Justice Perceptions predict OCB Role Definitions and Behavior
Journal of Applied Psychology, 2006, Vol. 91, No. 4, 841–855
11 Christine L. Jackson, Jason A. Colquitt, Michael J. Wesson and Cindy P. Zapata-Phelan
Psychological Collectivism: A Measurement Validation and Linkage to Group Member Performance Journal of Applied Psychology, 2006, Vol. 91, No. 4, 884–899
12
13 Debra A. Major, Jonathan E. Turner, and Thomas D. Fletcher
Linking Proactive Personality and the Big Five to Motivation to Learn and Development Activity Journal of Applied Psychology, 2006, Vol. 91, No. 4, 927–935
14 Vivien K. G. Lim and Qing Si Sng Does Parental Job Insecurity Matter? Money Anxiety, Money Motives, and Work Motivation Journal of Applied Psychology, 2006, Vol. 91, No. 5, 1078–1087
15 Alannah E. Rafferty and Mark A. Griffin Perceptions of Organizational Change: A Stress and Coping Perspective Journal of Applied Psychology, 2006, Vol. 91, No. 5, 1154–1162
16 Frederick P. Morgeson, Stephen E. Humphrey The Work Design Questionnaire (WDQ): Developing and Validating a Comprehensive Measure for Assessing Job Design and the Nature of Work
Journal of Applied Psychology, 2006, Vol. 91, No. 6, 1321–1339
17 Laura M. Graves, Patricia J. Ohlott and Marian N. Ruderman
Commitment to Family Roles: Effects on Managers’ Attitudes and Performance Journal of Applied Psychology, 2007, Vol. 92, No. 1, 44–56
18 Jonathon R. B. Halbesleben, Wm. Matthew Bowler
Emotional Exhaustion and Job Performance: The Mediating Role of Motivation Journal of Applied Psychology, 2007, Vol. 92, No. 1, 93–106
19 Samuel Aryee, Zhen Xiong Chen, Li-Yun Sun, and Yaw A. Debrah
Antecedents and Outcomes of Abusive Supervision: Test of a Trickle-Down Model Journal of Applied Psychology, 2007, Vol. 92, No. 1, 191–201
20
21 Gilad Chen, Bradley L. Kirkman, Ruth Kanfer, Don Allen, Benson Rosen
A Multilevel Study of Leadership, Empowerment, and Performance in Teams Journal of Applied Psychology, 2007, Vol. 92, No. 2, 331–346
22 Mo Wang Profiling Retirees in the Retirement Transition and Adjustment Process: Examining the Longitudinal Change Patterns of Retirees’ Psychological Well-Being
Journal of Applied Psychology, 2007, Vol. 92, No. 2, 455–474
23 Hui Liao Do It Right This Time: The Role of Employee Service Recovery Performance in Customer-Perceived Justice and Customer Loyalty After Service Failures
Journal of Applied Psychology, 2007, Vol. 92, No. 2, 475–489
24 Adam B. Butler Job Characteristics and College Performance and Attitudes: A Model of Work–School Conflict and Facilitation Journal of Applied Psychology, 2007, Vol. 92, No. 2, 500–510
25 Jo Silvester, Fiona Patterson, Anna Koczwara, Eamonn Ferguson
“Trust Me. . .”: Psychological and Behavioral Predictors of Perceived Physician Empathy Journal of Applied Psychology, 2007, Vol. 92, No. 2, 519–527
26 Richard D. Arvey, Zhen Zhang, Bruce J. Avolio, Robert F. Krueger
Developmental and Genetic Determinants of Leadership Role Occupancy Among Women Journal of Applied Psychology, 2007, Vol. 92, No. 3, 693–706
27 Seokhwa Yun, Riki Takeuchi, Wei Liu Employee Self-Enhancement Motives and Job Performance Behaviors: Investigating the Moderating Effects of Employee Role Ambiguity and
Journal of Applied Psychology, 2007, Vol. 92, No. 3, 745–756
28 Hilary J. Gettman and Michele J. Gelfand When the Customer Shouldn’t Be King: Antecedents and Consequences of Sexual Harassment by Clients and Customers Journal of Applied Psychology, 2007, Vol. 92, No. 3, 757–770
29 James M. Diefendorff, Kajal Mehta The Relations of Motivational Traits With Workplace Deviance Journal of Applied Psychology, 2007, Vol. 92, No. 4, 967–977
30 Craig D. Crossley, Rebecca J. Bennett, Steve M. Jex and Jennifer L. Burnfield
Development of a Global Measure of Job Embeddedness and Integration Into a Traditional Model of Voluntary Turnover Journal of Applied Psychology, 2007, Vol. 92, No. 4, 1031–1042
31 Michael Frese, Harry Garst, Doris Fay Making Things Happen: Reciprocal Relationships Between Work Characteristics and Personal Initiative in a Four-Wave Longitudinal Structural Equation Model
Journal of Applied Psychology, 2007, Vol. 92, No. 4, 1084–1102
32 Marie S. Mitchell and Maureen L. Ambrose Abusive Supervision and Workplace Deviance and the Moderating Effects of Negative Reciprocity Beliefs Journal of Applied Psychology, 2007, Vol. 92, No. 4, 1159–1168
33 Christian Vandenberghe, Kathleen Bentein, Richard Michon, Jean-Charles Chebat, Michel Tremblay, Jean-François Fils
An Examination of the Role of Perceived Support and Employee Commitment in Employee–Customer Encounters Journal of Applied Psychology, 2007, Vol. 92, No. 4, 1177–1187
34 Daniel J. McAllister, Dishan Kamdar, Elizabeth Wolfe Morrison, Daniel B. Turban
Disentangling Role Perceptions: How Perceived Role Breadth, Discretion, Instrumentality, and Efficacy Relate to Helping and Taking Charge
Journal of Applied Psychology, 2007, Vol. 92, No. 5, 1200–1211
35 Kathi Miner-Rubino, Lilia M. Cortina Beyond Targets: Consequences of Vicarious Exposure to Misogyny at Work Journal of Applied Psychology, 2007, Vol. 92, No. 5, 1254–1269
36 Dishan Kamdar, Linn Van Dyne The Joint Effects of Personality and Workplace Social Exchange Relationships in Predicting Task Performance and Citizenship Performance
Journal of Applied Psychology, 2007, Vol. 92, No. 5, 1286–1298
37 Mo Wang, Riki Takeuchi The Role of Goal Orientation During Expatriation: A Cross-Sectional and Longitudinal Investigation Journal of Applied Psychology, 2007, Vol. 92, No. 5, 1437–1445
38 Christine A. Sprigg, Christopher B. Stride, Toby D. Wall, David J. Holman, and Phoebe R. Smith
Work Characteristics, Musculoskeletal Disorders, and the Mediating Role of Psychological Strain: A Study of Call Center Employees
Journal of Applied Psychology, 2007, Vol. 92, No. 5, 1456–1466
39
40 Michael Frese, Stefanie I. Krauss, Nina Keith, Susanne Escher, Rafal Grabarkiewicz, Siv Tonje Luneng, Constanze Heers, Jens Unger, and Christian Friedrich
Business Owners’ Action Planning and Its Relationship to Business Success in Three African Countries Journal of Applied Psychology, 2007, Vol. 92, No. 6, 1481–1498
41
42 Wei-Chi Tsai, Chien-Cheng Chen, Hui-Lu Liu
Test of a Model Linking Employee Positive Moods and Task Performance Journal of Applied Psychology, 2007, Vol. 92, No. 6, 1570–1583
43 Brent A. Scott, Jason A. Colquitt and Cindy P. Zapata-Phelan
Justice as a Dependent Variable: Subordinate Charisma as a Predictor of Interpersonal and Informational Justice Perceptions Journal of Applied Psychology, 2007, Vol. 92, No. 6, 1597–1609
44
45 Julia Levashina, Michael A. Campion Measuring Faking in the Employment Interview: Development and Validation of an Interview Faking Behavior Scale Journal of Applied Psychology, 2007, Vol. 92, No. 6, 1638–1656
46 David G. Allen, Raj V. Mahto, Robert F. Otondo
Web-Based Recruitment: Effects of Information, Organizational Brand, and Attitudes Toward a Web Site on Applicant Attraction
Journal of Applied Psychology, 2007, Vol. 92, No. 6, 1696–1708
47
48 Zhi-Xue Zhang, Paul S. Hempel, Yu-Lan Han, Dean Tjosvold
Transactive Memory System Links Work Team Characteristics and Performance Journal of Applied Psychology, 2007, Vol. 92, No. 6, 1722–1730
49
50 Henry Moon, Dishan Kamdar, David M. Mayer, Riki Takeuchi
Me or We? The Role of Personality and Justice as Other-Centered Antecedents to Innovative Citizenship Behaviors Within Organizations
Journal of Applied Psychology, 2008, Vol. 93, No. 1, 84–94
51 Sandy Lim, Lilia M. Cortina, Vicki J. Magley Personal and Workgroup Incivility: Impact on Work and Health Outcomes Journal of Applied Psychology, 2008, Vol. 93, No. 1, 95–107
52 Bradford S. Bell, Steve W. J. Kozlowski, Active Learning: Effects of Core Training Design Elements on Self-Regulatory Processes, Learning, and Adaptability Journal of Applied Psychology, 2008, Vol. 93, No. 2, 296–316
53 Lillian T. Eby, Jaime R. Durley, and Sarah C. Evans, Belle Rose Ragins
Mentors’ Perceptions of Negative Mentoring Experiences: Scale Development and Nomological Validation Journal of Applied Psychology, 2008, Vol. 93, No. 2, 358–373
54 James R. Detert, Linda Klebe Treviño, Vicki L. Sweitzer
Moral Disengagement in Ethical Decision Making: A Study of Antecedents and Outcomes Journal of Applied Psychology, 2008, Vol. 93, No. 2, 374–391
55
56 Severin Hornung, Denise M. Rousseau, Jürgen Glaser
Creating Flexible Work Arrangements Through Idiosyncratic Deals Journal of Applied Psychology, 2008, Vol. 93, No. 3, 655–664
57 Dov Zohar and Orly Tenne-Gazit Transformational Leadership and Group Interaction as Climate Antecedents: A Social Network Analysis Journal of Applied Psychology, 2008, Vol. 93, No. 4, 744–757
58 Mahesh Subramony, Nicole Krause, Jacqueline Norton, and Gary N. Burns
The Relationship Between Human Resource Investments and Organizational Performance: A Firm-Level Examination of Equilibrium Theory
Journal of Applied Psychology, 2008, Vol. 93, No. 4, 778–788
59
60 Arnold B. Bakker, Evangelia Demerouti, Maureen F. Dollard
How Job Demands Affect Partners’ Experience of Exhaustion: Integrating Work–Family Conflict and Crossover Theory Journal of Applied Psychology, 2008, Vol. 93, No. 4, 901–911
61
62
Shaul Oreg, Mahmut Bayazıt, Maria Armenakis, Rasa Barkauskiene, Nikos Bozionelos, Yuka Fujimoto, Luis Gonzalez, Jian Han, Martina Hřebíčkova, Nerina Jimmieson, Jana Kordacova, Hitoshi Mitsuhashi, Boris Mlačić, Ivana Ferić, Marina Kotrla Topic, Sandra Ohly, Per Øystein Saksvik, Hilde Hetland and Ingvild Saksvik and Karen van Dam
Dispositional Resistance to Change: Measurement Equivalence and the Link to Personal Values Across 17 Nations Journal of Applied Psychology, 2008, Vol. 93, No. 4, 935–944
63 Greg L. Stewart and Susan L. Dustin, Murray R. Barrick, Todd C. Darnold Exploring the Handshake in Employment Interviews Journal of Applied Psychology, 2008, Vol. 93, No. 5, 1139–1146
64 David J. Henderson and Sandy J. Wayne, Lynn M. Shore, William H. Bommer, Lois E. Tetrick
Leader–Member Exchange, Differentiation, and Psychological Contract Fulfillment: A Multilevel Examination Journal of Applied Psychology, 2008, Vol. 93, No. 6, 1208–1219
65 Fiona A. White, Margaret A. Charles, and Jacqueline K. Nelson
The Role of Persuasive Arguments in Changing Affirmative Action Attitudes and Expressed Behavior in Higher Education Journal of Applied Psychology, 2008, Vol. 93, No. 6, 1271–1286
66 Daniel P. Skarlicki, Danielle D. van Jaarsveld, and David D. Walker
Getting Even for Customer Mistreatment: The Role of Moral Identity in the Relationship Between Customer Interpersonal Injustice and Employee Sabotage
Journal of Applied Psychology, 2008, Vol. 93, No. 6, 1335–1347
67 Samantha D. Montes, P. Gregory Irving Disentangling the Effects of Promised and Delivered Inducements: Relational and Transactional Contract Elements and the Mediating Role of Trust
Journal of Applied Psychology, 2008, Vol. 93, No. 6, 1367–1381
68 Brent A. Scott, Timothy A. Judge The Popularity Contest at Work: Who Wins, Why, and What Do They Receive? Journal of Applied Psychology, 2009, Vol. 94, No. 1, 20–33
69 Eric Kearney, Diether Gebert Managing Diversity and Enhancing Team Outcomes: The Promise of Transformational Leadership Journal of Applied Psychology, 2009, Vol. 94, No. 1, 77–8
70 Hans-Georg Wolff and Klaus Moser Effects of Networking on Career Success: A Longitudinal Study Journal of Applied Psychology, 2009, Vol. 94, No. 1, 196–206
71 Susan M. Stewart, Mark N. Bing and H. Kristl Davison, David J. Woehr and Michael D. McIntyre
In the Eyes of the Beholder: A Non-Self-Report Measure of Workplace Deviance Journal of Applied Psychology, 2009, Vol. 94, No. 1, 207–215
72 Jin Nam Choi, Jae Yoon Chang Innovation Implementation in the Public Sector: An Integration of Institutional and Collective Dynamics Journal of Applied Psychology, 2009, Vol. 94, No. 1, 245–253
73 Yaping Gong, Kenneth S. Law and Song Chang, Katherine R. Xin
Human Resources Management and Firm Performance: The Differential Role of Managerial Affective and Continuance Commitment
Journal of Applied Psychology, 2009, Vol. 94, No. 1, 263–275
74 Peter W. Hom and Anne S. Tsui, Joshua B. Wu, Thomas W. Lee, Ann Yan Zhang, Ping Ping Fu, Lan Li
Explaining Employment Relationships With Social Exchange and Job Embeddedness Journal of Applied Psychology, 2009, Vol. 94, No. 2, 277–297
75 Greet Van Hoye and Filip Lievens Tapping the Grapevine: A Closer Look at Word-of-Mouth as a Recruitment Source Journal of Applied Psychology, 2009, Vol. 94, No. 2, 341–352
76 Tove Helland Hammer, Mahmut Bayazit, David L. Wazeter
Union Leadership and Member Attitudes: A Multi-Level Analysis Journal of Applied Psychology, 2009, Vol. 94, No. 2, 392–410
77 Steven L. Blader and Tom R. Tyler Testing and Extending the Group Engagement Model: Linkages Between Social Identity, Procedural Justice, Economic Outcomes, and Extrarole Behavior
Journal of Applied Psychology, 2009, Vol. 94, No. 2, 445–464
78 Maureen L. Ambrose and Marshall Schminke The Role of Overall Justice Judgments in Organizational Justice Research: A Test of Mediation Journal of Applied Psychology, 2009, Vol. 94, No. 2, 491–500
79 Lei Lai, Denise M. Rousseau, Klarissa Ting Ting Chang Idiosyncratic Deals: Coworkers as Interested Third Parties Journal of Applied Psychology, 2009, Vol. 94, No. 2, 547–556
80 Gregory M. Hurtz, Kevin J. Williams Attitudinal and Motivational Antecedents of Participation in Voluntary Employee Development Activities Journal of Applied Psychology, 2009, Vol. 94, No. 3, 635–653
81 Chad H. Van Iddekinge, Gerald R. Ferris, Alexa A. Perryman, Fred R. Blass, Thomas D. Heetderks
Effects of Selection and Training on Unit-Level Performance Over Time: A Latent Growth Modeling Approach Journal of Applied Psychology, 2009, Vol. 94, No. 4, 829–843
82 D. Scott DeRue and Ned Wellman Developing Leaders via Experience: The Role of Developmental Challenge, Learning Orientation, and Feedback Availability
Journal of Applied Psychology, 2009, Vol. 94, No. 4, 859–875
83 Remus Ilies, Ingrid Smithey Fulmer, Matthias Spitzmuller, Michael D. Johnson
Personality and Citizenship Behavior: The Mediating Role of Job Satisfaction Journal of Applied Psychology, 2009, Vol. 94, No. 4, 945–959
84 Karin A. Orvis, Sandra L. Fisher and Michael E. Wasserman
Power to the People: Using Learner Control to Improve Trainee Reactions and Learning in Web-Based Instructional Environments
Journal of Applied Psychology, 2009, Vol. 94, No. 4, 960–971
85 Jessica B. Rodell and Jason A. Colquitt, Looking Ahead in Times of Uncertainty: The Role of Anticipatory Justice in an Organizational Change Context Journal of Applied Psychology, 2009, Vol. 94, No. 4, 989–1002
86 Brian C. Holtz, Crystal M. Harold Fair Today, Fair Tomorrow? A Longitudinal Investigation of Overall Justice Perceptions Journal of Applied Psychology, 2009, Vol. 94, No. 5, 1185–1199
87 Jerry W. Grizzle, Alex R. Zablah, Tom J. Brown, and John C. Mowen, James M. Lee
Employee Customer Orientation in Context: How the Environment Moderates the Influence of Customer Orientation on Performance Outcomes
88 David R. Hekman, H. Kevin Steensma, Gregory A. Bigley, and James F. Hereford
Effects of Organizational and Professional Identification on the Relationship Between Administrators’ Social Influence and Professional Employees’ Adoption of New Work Behavior
Journal of Applied Psychology, 2009, Vol. 94, No. 5, 1325–1335
89
90 Martha C. Andrews, K. Michele Kacmar, Kenneth J. Harris
Got Political Skill? The Impact of Justice on the Importance of Political Skill for Job Performance Journal of Applied Psychology, 2009, Vol. 94, No. 6, 1427–1437
91 Abraham Carmeli and Batia Ben-Hador, David A. Waldman, Deborah E. Rupp
How Leaders Cultivate Social Capital and Nurture Employee Vigor: Implications for Job Performance Journal of Applied Psychology, 2009, Vol. 94, No. 6, 1553–1561
92 Timothy A. Judge, Remus Ilies and Nikolaos Dimotakis
Are Health and Happiness the Product of Wisdom? The Relationship of General Mental Ability to Educational and Occupational Attainment, Health, and Well-Being
Journal of Applied Psychology, 2010, Vol. 95, No. 3, 454–468
93 Leigh Anne Liu, Chei Hwee Chua, Günter K. Stahl
Quality of Communication Experience: Definition, Measurement, and Implications for Intercultural Negotiations Journal of Applied Psychology, 2010, Vol. 95, No. 3, 469–487
94 Richard G. Netemeyer and James G. Maxham III, Donald R. Lichtenstein
Store Manager Performance and Satisfaction: Effects on Store Employee Performance and Satisfaction, Store Customer Satisfaction, and Store Customer Spending Growth
Journal of Applied Psychology, 2010, Vol. 95, No. 3, 530–545
95 Tracy D. Hecht, Julie M. McCarthy Coping With Employee, Family, and Student Roles: Journal of Applied Psychology, 2010, Vol. 95, No. 4, 631–647
96 Thomas W. H. Ng, Daniel C. Feldman, Simon S. K. Lam
Psychological Contract Breaches, Organizational Commitment, and Innovation-Related Behaviors: A Latent Growth Modeling Approach
Journal of Applied Psychology, 2010, Vol. 95, No. 4, 744–751
97 Elizabeth E. Umphress, John B. Bingham, Marie S. Mitchell
Unethical Behavior in the Name of the Company: The Moderating Effect Of Organizational Identification and Positive Reciprocity Beliefs on Unethical Pro-Organizational Behavior
Journal of Applied Psychology, 2010, Vol. 95, No. 4, 769–780
98 Robert Eisenberger, Gokhan Karagonlar, Florence Stinglhamber, Pedro Neves, Thomas E. Becker, M. Gloria Gonzalez-Morales, Meta Steiger-Mueller
Leader–Member Exchange and Affective Organizational Commitment: The Contribution of Supervisor’s Organizational Embodiment
Journal of Applied Psychology, 2010, Vol. 95, No. 6, 1085–1103
99 Xiao-Hua (Frank) Wang, Jane M. Howell Exploring the Dual-Level Effects of Transformational Leadership On Followers Journal of Applied Psychology, 2010, Vol. 95, No. 6, 1134–1144
100 Murray R. Barrick and Brian W. Swider, Greg L. Stewart
Initial Evaluations in the Interview: Relationships with Subsequent Interviewer Evaluations and Employment Offers Journal of Applied Psychology, 2010, Vol. 95, No. 6, 1163–1172
101 John P. Trougakos, Christine L. Jackson, Daniel J. Beal
Service Without a Smile: Comparing the Consequences of Neutral and Positive Display Rules Journal of Applied Psychology, 2010
102 Myriam N. Bechtoldt, Sonja Rohrmann, Irene E. De Pater and Bianca Beersma
The primacy of perceiving: Emotion regulation buffers negative effects of emotional labor Journal of Applied Psychology, 2011, Vol. 96, No. 5, 1087-1094
103 Pamela Tierney and Steven M. Farmer Creative self-efficacy development and creative performance over time Journal of Applied Psychology, 2011, Vol. 96, No. 2, 277-293
104 Bradley L. Kirkman, John E. Mathieu, John L. Cordery, Benson Rosen and Michael Kukenberger
Managing a new collaborative entity in business organizations: Understanding organizational communities of practice effectiveness
Journal of Applied Psychology, 2011, Vol. 96, No. 6, 1234-1245
105 Tal Yaffe and Ronit Kark Leading by example: The case of leader OCB Journal of Applied Psychology, 2011, Vol. 96, No. 4, 806-826
106 Chad H. Van Iddekinge, Dan J. Putka and John P. Campbell
Reconsidering vocational interests for personnel selection: The validity of an interest-based selection test in relation to job knowledge, job performance, and continuance intentions
Journal of Applied Psychology, 2011, Vol. 96, No. 1, 13-33
107 J. Craig Wallace, Paul D. Johnson, Kimberly Mathe and Jeff Paul
Structural and psychological empowerment climates, performance, and the moderating role of shared felt accountability: A managerial perspective
Journal of Applied Psychology, 2011, Vol. 96, No. 4, 840-850
108 Scott E. Seibert, Gang Wang and Stephen H. Courtright
Antecedents and consequences of psychological and team empowerment in organizations: A meta-analytic review Journal of Applied Psychology, 2011, Vol. 96, No. 5, 981-1003
109 Debra L. Shapiro, Alan D. Boss, Silvia Salas, Subrahmaniam Tangirala and Mary Ann Von Glinow
When are transgressing leaders punitively judged? An empirical test Journal of Applied Psychology, 2011, Vol. 96, No. 2, 412-422
110 Jason D. Shaw, Jing Zhu, Michelle K. Duffy, Kristin L. Scott, Hsi-An Shih and Ely Susanto A contingency model of conflict and team effectiveness Journal of Applied Psychology, 2011, Vol. 96, No. 2, 391-400
111 Stefan Diestel and Klaus-Helmut Schmidt Costs of simultaneous coping with emotional dissonance and self-control demands at work: Results from two German samples
Journal of Applied Psychology, 2011, Vol. 96, No. 3, 643-653
112 Jia Hu and Robert C. Liden Antecedents of team potency and team effectiveness: An examination of goal and process clarity and servant leadership Journal of Applied Psychology, 2011, Vol. 96, No. 4, 851-862
113 Ronald Bledow, Antje Schmitt, Michael Frese and Jana Kuhnel The affective shift model of work engagement Journal of Applied Psychology, 2011, Vol. 96, No. 6, 1246-1257
114 Gilad Chen, Payal Nangia Sharma, Suzanne K. Edinger, Debra L. Shapiro and Jiing-Lih Farh
Motivating and demotivating forces in teams: Cross-level influences of empowering leadership and relationship conflict Journal of Applied Psychology, 2011, Vol. 96, No. 3, 541-557
115 Spencer H. Harrison, David M. Sluss, Blake E Ashforth
Curiosity adapted the cat: The role of trait curiosity in newcomer adaptation Journal of Applied Psychology, 2011, Vol. 96, No. 1, 211-220
116 John Schaubroeck , Simon S. K. Lam and Ann Chunyan Peng
Cognition-based and affect-based trust as mediator of leader behaviour influences on team performance Journal of Applied Psychology, 2011, Vol. 96, No. 4, 863-871
117 John P. Hausknecht, Michael C. Sturman and Quinetta M. Roberson
Justice as a dynamic construct: Effects of individual trajectories on distal work outcomes Journal of Applied Psychology, 2011, Vol. 96, No. 4, 872-880
118 Sven Gross, Norbert K. Semmer, Laurenz L. Meier, Wolfgang Kalin, Nicola Jacobshagen and Franziska Tschan
The effect of positive events at work on after-work fatigue: They matter most in face of adversity Journal of Applied Psychology, 2011, Vol. 96, No. 3, 654-664
119 Filip Lievens and Fiona Patterson The validity and incremental validity of knowledge tests, low-fidelity simulations, and high-fidelity simulations for predicting job performance in advanced-level high-stakes selection
Journal of Applied Psychology, 2011, Vol. 96, No. 5, 927-940
120 Maria L. Kraimer, Scott E. Seibert, Sandy J. Wayne, Robert C. Liden and Jesus Bravo
Antecedents and outcomes of organizational support for development: The critical role of career opportunities Journal of Applied Psychology, 2011, Vol. 96, No. 3, 485-500
121 Jessica Lang, Paul D. Bliese, Jonas W. B. Lang and Amy B. Adler
Work gets unfair for the depressed: Cross-lagged relations between organizational justice perceptions and depressive symptoms
Journal of Applied Psychology, 2011, Vol. 96, No. 3, 602-618
122 Huy Le, In-Sue Oh, Steven B. Robbins, Remus Ilies, Ed Holland and Paul Westrick
Too much of a good thing: Curvilinear relationships between personality traits and job performance Journal of Applied Psychology, 2011, Vol. 96, No. 1, 113-133
123 Ning Li, T. Brad Harris, Wendy R. Boswell and Zhitao Xie
The role of organizational insiders’ developmental feedback and proactive personality on newcomers’ performance: An interactionist perspective
Journal of Applied Psychology, 2011, Vol. 96, No. 6, 1317-1327
124 Dong Liu and Xiao-Ping Chen and Xin Yao From autonomy to creativity: A multilevel investigation of the mediating role of harmonious passion Journal of Applied Psychology, 2011, Vol. 96, No. 2, 294-309
125 Dong Liu and Ping-ping Fu Motivating proteges’ personal learning in teams: A multilevel investigation of autonomy support and autonomy orientation Journal of Applied Psychology, 2011, Vol. 96, No. 6, 1195-1208
126 Christopher D. Nye and Fritz Drasgow Effect size indices for analyses of measurement equivalence: Understanding the practical importance of differences between groups
Journal of Applied Psychology, 2011, Vol. 96, No. 5, 966-980
127 Dong Liu, Shu Zhang, Lei Wang and Thomas W. Lee
The effects of autonomy and empowerment on employee turnover: Test of a multilevel model in teams Journal of Applied Psychology, 2011, Vol. 96, No. 6, 1305-1316
128 Nora Madjar, Ellen Greenberg and Zheng Chen
Factors for radical creativity, incremental creativity, and routine, noncreative performance Journal of Applied Psychology, 2011, Vol. 96, No. 4, 730-743
129 Jake G. Messersmith, Pankaj C. Patel, David P. Lepak and Julian Gould-Williams
Unlocking the black box: Exploring the link between high-performance work systems and performance Journal of Applied Psychology, 2011, Vol. 96, No. 6, 1105-1118
130 Elizabeth Wolfe Morrison, Sara L. Wheeler-Smith and Dishan Kamdar
Speaking up in groups: A cross-level study of group voice climate and voice Journal of Applied Psychology, 2011, Vol. 96, No. 1, 183-191
131 Kok-Yee Ng, Christine Koh, Soon Ang, Jeffrey C. Kennedy, and Kim-Yin Chan
Rating leniency and halo in multisource feedback ratings: Testing cultural assumptions of power distance and individualism-collectivism
Journal of Applied Psychology, 2011, Vol. 96, No. 5, 1033-1044
132 Muammer Ozer A moderated mediation model of the relationship between organizational citizenship behaviors and job performance Journal of Applied Psychology, 2011, Vol. 96, No. 6, 1328-1336
133 S. Douglas Pugh, Markus Groth and Thorsten Hennig-Thurau
Willing and able to fake emotions: A closer examination of the link between emotional dissonance and employee well-being Journal of Applied Psychology, 2011, Vol. 96, No. 2, 377-390
134 Simon Lloyd D. Restubog, Kristin L. Scott and Thomas J. Zagenczyk
When distress hits home: The role of contextual factors and psychological distress in predicting employees’ responses to abusive supervision
Journal of Applied Psychology, 2011, Vol. 96, No. 4, 713-729
135 Zhaoli Song, Maw-Der Foo, Marilyn A. Uy and Shuhua Sun
Unraveling the daily stress crossover between unemployed individuals and their employed spouses Journal of Applied Psychology, 2011, Vol. 96, No. 1, 151-168
136 Sabine Sonnentag, Eva J. Mojza, Evangelia Demerouti and Arnold B. Bakker
Reciprocal relations between recovery and work engagement: The moderating role of job stressors Journal of Applied Psychology, 2012, Vol. 97, No. 4, 842-853
137 Andreas W. Richter, Giles Hirst, Daan van Knippenberg and Markus Baer
Creative self-efficacy and individual creativity in team contexts: Cross-level interactions with team informational resources Journal of Applied Psychology, 2012, Vol. 97, No. 6, 1282-1290
138 Anat Rafaeli, Amir Erez, Shy Ravid, Rellie Derfler-Rozin, Dorit Efrat Treister and Ravit Scheyer
When customers exhibit verbal aggression, employees pay cognitive costs Journal of Applied Psychology, 2012, Vol. 97, No. 5, 931-950
139 Steffen Raub and Hui Liao Doing the right thing without being told: Joint effects of initiative climate and general self-efficacy on employee proactive customer service performance
Journal of Applied Psychology, 2012, Vol. 97, No. 3, 651-667
140 Steven W. Whiting, Timothy D. Maynes, Nathan P. Podsakoff and Philip M. Podsakoff
Effects of message, source, and context on evaluations of employee voice behaviour Journal of Applied Psychology, 2012, Vol. 97, No. 1, 159-182
141 Chia-Huei Wu and Mark A. Griffin Longitudinal relationships between core self-evaluations and job satisfaction Journal of Applied Psychology, 2012, Vol. 97, No. 2, 331-342
142 Thomas W. H. Ng and Daniel C. Feldman The effects of organizational and community embeddedness on work-to-family and family-to-work conflict Journal of Applied Psychology, 2012, Vol. 97, No. 6, 1233-1251
143 Karsten Mueller, Kate Hattrup, Sven-Oliver Spiess and Nick Lin-Hi
The effects of corporate social responsibility on employees’ affective commitment: A cross-cultural investigation Journal of Applied Psychology, 2012, Vol. 97, No. 6, 1186-1200
144 Lisa Schurer Lambert, Bennett J Tepper, Jon C. Carr, Daniel T. Holt and Alex J. Barelka
Forgotten but not gone: An examination of fit between leader consideration and initiating structure needed and received Journal of Applied Psychology, 2012, Vol. 97, No. 5, 913-930
145 Hannes Leroy, Bart Dierynck, Frederik Anseel, Tony Simons, Jonathon R. B. Halbesleben, Deirdre McCaughey, Grant T. Savage and Luc Sels
Behavioral integrity for safety, priority of safety, psychological safety, and patient safety: A team-level study Journal of Applied Psychology, 2012, Vol. 97, No. 6, 1273-1281
146 Jason A. Colquitt, Jeffery A. LePine, Ronald F. Piccolo, Cindy P. Zapata, and Bruce L. Rich
Explaining the justice-performance relationship: Trust as exchange deepener or trust as uncertainty reducer? Journal of Applied Psychology, 2012, Vol. 97, No. 1, 1-15
147 Deanne N. Den Hartog and Frank D. Belschak When does transformational leadership enhance employee proactive behaviour? The role of autonomy and role breadth self-efficacy
Journal of Applied Psychology, 2012, Vol. 97, No. 1, 194-202
148 Bart A. de Jong and Kurt T. Dirks Beyond shared perceptions of trust and monitoring in teams: Implications of asymmetry and dissensus Journal of Applied Psychology, 2012, Vol. 97, No. 2, 391-406
149 Marne L. Arthaud-Day, Joseph C. Rode and William H. Turnley
Direct and contextual effects of individual values on organisational citizenship behaviour in teams Journal of Applied Psychology, 2012, Vol. 97, No. 4, 792-807
150 Samuel Aryee, Fred O. Walumbwa, Emmanuel Y. M. Seidu and Lilian E. Otaye
Impact of high-performance work systems on individual- and branch-level performance: Test of a multilevel model of intermediate linkages
Journal of Applied Psychology, 2012, Vol. 97, No. 2, 287-300
151 Richard G. Netemeyer, Carrie M. Heilman and James G. Maxham, III
Identification with the retail organization and customer-perceived employee similarity: Effects on customer spending Journal of Applied Psychology, 2012, Vol. 97, No. 5, 1049-1058
152 Richard P. Bagozzi, Massimo Bergami, Gian Luca Marzocchi, and Gabriele Morandin
Customer-organization relationships: Development and test of a theory of extended identities Journal of Applied Psychology, 2012, Vol. 97, No. 1, 63-76
153 Uta K. Bindl, Sharon K. Parker, Peter Totterdell and Gareth Hagger-Johnson
Fuel of the self-starter: How mood relates to proactive goal regulation Journal of Applied Psychology, 2012, Vol. 97, No. 1, 134-150
154 Xiao-Ping Chen, Dong Liu and Rebecca Portnoy
A multilevel investigation of motivational cultural intelligence, organizational diversity climate, and cultural sales: Evidence from U.S. real estate firms
Journal of Applied Psychology, 2012, Vol. 97, No. 1, 93-106
155 Lisa Dragoni and Maribeth Kuenzi Better understanding work unit goal orientation: Its emergence and impact under different types of work unit structure Journal of Applied Psychology, 2012, Vol. 97, No. 5, 1032-1048
156 D. Scott DeRue, Jennifer D. Nahrgang, John R. Hollenbeck and Kristina Workman
A quasi-experimental study of after-event reviews and leadership development Journal of Applied Psychology, 2012, Vol. 97, No. 5, 997-1015
157 Crystal I. C. Chien Farh, Myeong-Gu Seo and Paul E. Tesluk
Emotional intelligence, teamwork effectiveness, and job performance: The moderating role of job context Journal of Applied Psychology, 2012, Vol. 97, No. 4, 890-900
158 David M. Fisher, Suzanne T. Bell, Erich C. Dierdorff, and James A. Belohlav
Facet personality and surface-level diversity as team mental model antecedents: Implications for implicit coordination Journal of Applied Psychology, 2012, Vol. 97, No. 4, 825-841
159 Ravi S. Gajendran and Aparna Joshi Innovation in globally distributed teams: The role of LMX, communication frequency, and member influence on team decisions
Journal of Applied Psychology, 2012, Vol. 97, No. 6, 1252-1261
160 Michele J. Gelfand, Lisa M. Leslie, Kirsten Keller and Carsten de Dreu
Conflict cultures in organizations: How leaders shape conflict cultures and their organizational-level consequences Journal of Applied Psychology, 2012, Vol. 97, No. 6, 1131-1147
161 Dvora Geller and Peter A. Bamberger The impact of help seeking on individual task performance: The moderating effect of help seekers’ logics of action Journal of Applied Psychology, 2012, Vol. 97, No. 2, 487-497
162 Robert T. Keller Predicting the performance and innovativeness of scientists and engineers Journal of Applied Psychology, 2012, Vol. 97, No. 1, 225-233
163 Karoline Strauss, Mark A. Griffin and Sharon K. Parker
Future work selves: How salient hoped-for identities motivate proactive career behaviors Journal of Applied Psychology, 2012, Vol. 97, No. 3, 580-598
164 Sharon Toker and Michal Biron Job burnout and depression: Unraveling their temporal relationship and considering the role of physical activity Journal of Applied Psychology, 2012, Vol. 97, No. 3, 699-710
165 Le Zhou, Mo Wang, Gilad Chen and Junqi Shi Supervisors’ upward exchange relationships and subordinate outcomes: Testing the multilevel mediation role of empowerment
Journal of Applied Psychology, 2012, Vol. 97, No. 3, 668-680
166 Herman H. M. Tse, Catherine K. Lam, Sandra A. Lawrence and Xu Huang
When my supervisor dislikes you more than me: The effect of dissimilarity in leader-member exchange on coworkers’ interpersonal emotion and perceived help
Journal of Applied Psychology, 2013, Vol. 98, No. 6, 974-988
167 Subrahmaniam Tangirala, Dishan Kamdar, Vijaya Venkataramani and Michael R. Parke
Doing right versus getting ahead: The effects of duty and achievement orientations on employees’ voice Journal of Applied Psychology, 2013, Vol. 98, No. 6, 1040-1050
168 Daniel J. Beal, John P. Trougakos, Howard M. Weiss, and Reeshad S. Dalal Affect spin and the emotion regulation process at work Journal of Applied Psychology, 2013, Vol. 98, No. 4, 593-605
169 Junqi Shi, Russell E. Johnson, Yihao Liu and Mo Wang
Linking subordinate political skill to supervisor dependence and reward recommendations: A moderated mediation model Journal of Applied Psychology, 2013, Vol. 98, No. 2, 374-384
170 Mindy K. Shoss, Robert Eisenberger, Simon Lloyd D. Restubog and Thomas J. Zagenczyk
Blaming the organization for abusive supervision: The roles of perceived organizational support and supervisor’s organizational embodiment
Journal of Applied Psychology, 2013, Vol. 98, No. 1, 158-168
171 Aaron M. Watson, Lori Foster Thompson, Jane V. Rudolph, Thomas J. Whelan, Tara S. Behrend, and Amanda L. Gissel
When big brother is watching: Goal orientation shapes reactions to electronic monitoring during online training Journal of Applied Psychology, 2013, Vol. 98, No. 4, 642-657
172 Julie Holliday Wayne, Wendy J. Casper, Russell A. Matthews, and Tammy D. Allen
Family-supportive organization perceptions and organizational commitment: The mediating role of work-family conflict and enrichment and partner attitudes
Journal of Applied Psychology, 2013, Vol. 98, No. 4, 606-622
173 James W. Beck and Aaron M. Schmidt State-level goal orientations as mediators of the relationship between time pressure and performance: A longitudinal study Journal of Applied Psychology, 2013, Vol. 98, No. 2, 354-363
174 D. Lance Ferris, Russell E. Johnson, Christopher C. Rosen, Emilija Djurdjevic, Chu-Hsiang Chang and James A Tan
When is success not satisfying? Integrating regulatory focus and approach/avoidance motivation theories to explain the relation between core-self-evaluation and job satisfaction
Journal of Applied Psychology, 2013, Vol. 98, No. 2, 342-353
175 Adam M. Grant and Nancy P. Rothbard When in doubt, seize the day? Security values, prosocial values, and proactivity under ambiguity Journal of Applied Psychology, 2013, Vol. 98, No. 5, 810-819
176 Nina Gupta, Daniel C. Ganster and Sven Kepes Assessing the validity of sales self-efficacy: A cautionary tale Journal of Applied Psychology, 2013, Vol. 98, No. 4, 690-700
177 Sean T. Hannah, John M. Schaubroeck, Ann C Peng, Robert G Lord, Linda K Trevino, Steve W. J. Kozlowski, Bruce J. Avolio, Nikolaos Dimotakis and Joseph Doty
Joint influences of individual and work unit abusive supervision on ethical intentions and behaviors: A moderated mediation model
Journal of Applied Psychology, 2013, Vol. 98, No. 4, 579-592
178 Daniel S. Stanhope, Samuel B. Pond III and Erica A. Surface
Core self-evaluations and training effectiveness: Prediction through motivational intervening mechanisms Journal of Applied Psychology, 2013, Vol. 98, No. 5, 820-831
179 Kristin L. Scott, Simon Lloyd D. Restubog and Thomas J. Zagenczyk
A social exchange-based model of the antecedents of workplace exclusion Journal of Applied Psychology, 2013, Vol. 98, No. 1, 37-48
180 Scott E. Seibert, Maria L. Kraimer, Brooks C. Holtom and Abigail J. Pierotti
Even the best laid plans sometimes go askew: Career self-management processes, career shocks, and the decision to pursue graduate education
Journal of Applied Psychology, 2013, Vol. 98, No. 1, 169-182
181 Guo-hua Huang, Helen Hailin Zhao, Xiong-ying Niu, Susan J. Ashford and Cynthia Lee
Reducing job insecurity and increasing performance ratings: Does impression management matter? Journal of Applied Psychology, 2013, Vol. 98, No. 5, 852-862
182 Laura Huang, Marcia Frideger and Jone L. Pearce
Political skill: Explaining the effects of non-native accent on managerial hiring and entrepreneurial investment decisions Journal of Applied Psychology, 2013, Vol. 98, No. 6, 1005-1017
183 Michael G. Hughes, Eric Anthony Day, Xiaoqian Wang, Matthew J. Schuelke, Matthew L. Arsenault, Lauren N. Harkrider, and Olivia D. Cooper
Learner-controlled practice difficulty in the training of a complex task: Cognitive and motivational mechanisms Journal of Applied Psychology, 2013, Vol. 98, No. 1, 80-98
184 Ryan C. Johnson and Tammy D. Allen Examining the links between employed mothers’ work characteristics, physical activity, and child health Journal of Applied Psychology, 2013, Vol. 98, No. 1, 148-157
185 Timothy A. Judge, Jessica B. Rodell, Ryan L. Klinger, Lauren S. Simon and Eean R. Crawford
Hierarchical representations of the five-factor model of personality in predicting job performance: Integrating three organizing frameworks with two theoretical perspectives
Journal of Applied Psychology, 2013, Vol. 98, No. 6, 875-925
186 Jun Liu, Cynthia Lee, Chun Hui, Ho Kwong Kwan and Long-Zeng Wu
Idiosyncratic deals and employee outcomes: The mediating roles of social exchange and self-enhancement and the moderating role of individualism
Journal of Applied Psychology, 2013, Vol. 98, No. 5, 832-840
187 Lisa M. Leslie, Mark Snyder and Theresa M. Glomb
Who gives? Multilevel effects of gender and ethnicity on workplace charitable giving Journal of Applied Psychology, 2013, Vol. 98, No. 1, 49-62
188 Wu Liu, Subrahmaniam Tangirala and Rangaraj Ramanujam The relational antecedents of voice targeted at different leaders Journal of Applied Psychology, 2013, Vol. 98, No. 5, 841-851
189 Julie M. McCarthy, Chad H. Van Iddekinge, Filip Lievens, Mei-Chuan Kung, Evan F. Sinar and Michael A. Campion
Do candidate reactions relate to job performance or affect criterion-related validity? A multistudy investigation of relations among reactions, selection test scores, and job performance
Journal of Applied Psychology, 2013, Vol. 98, No. 5, 701-719
190 Laurenz L. Meier and Paul E. Spector Reciprocal effects of work stressors and counterproductive work behaviour: A five-wave longitudinal study Journal of Applied Psychology, 2013, Vol. 98, No. 3, 529-539
191 Nathan T. Carter, Dev K. Dalal, Anthony S. Boyce, Matthew S. O’Connell, Mei-Chuan Kung and Kristin M. Delgado
Uncovering curvilinear relationships between conscientiousness and job performance: How theoretically appropriate measurement makes an empirical difference
Journal of Applied Psychology, 2014, Vol. 99, No. 4, 564-586
192 Song Chang, Liangding Jia, Riki Takeuchi and Yahua Cai
Do high-commitment work systems affect creativity? A multilevel combinational approach to employee creativity Journal of Applied Psychology, 2014, Vol. 99, No. 4, 665-680
193 Jinseok S. Chun and Jin Nam Choi Members’ needs, intragroup conflict, and group performance Journal of Applied Psychology, 2014, Vol. 99, No. 3, 437-450
194 Stephen H. Courtright, Amy E. Colbert and Daejeong Choi
Fired up or burned out? How developmental challenge differentially impacts leader behaviour Journal of Applied Psychology, 2014, Vol. 99, No. 4, 681-696
195 Jeroen P. de Jong, Petru L. Curseu and Roger Th. A. J. Leenders
When do bad apples not spoil the barrel? Negative relationships in teams, team performance, and buffering mechanisms Journal of Applied Psychology, 2014, Vol. 99, No. 3, 514-522
196 Lisa Dragoni, Haeseen Park, Jim Soltis and Sheila Forte-Trammell
Show and tell: How supervisors facilitate leader development among transitioning leaders Journal of Applied Psychology, 2014, Vol. 99, No. 1, 66-86
197 Lisa Dragoni, In-Sue Oh, Paul E. Tesluk, Ozias A. Moore, Paul VanKatwyk and Joy Hazucha
Developing leaders’ strategic thinking through global work experience: The moderating role of cultural distance Journal of Applied Psychology, 2014, Vol. 99, No. 5, 867-882
198 Crystal I. C. Farh and Zhijun Chen Beyond the individual victim: Multilevel consequences of abusive supervision in teams Journal of Applied Psychology, 2014, Vol. 99, No. 6, 1074-1095
199 David M. Fisher Distinguishing between taskwork and teamwork planning in teams: Relations with coordination and interpersonal processes Journal of Applied Psychology, 2014, Vol. 99, No. 3, 423-436
200 David M. Fisher A multilevel cross-cultural examination of role overload and organizational commitment: Investigating the interactive effects of context
Journal of Applied Psychology, 2014, Vol. 99, No. 4, 723-736
201 Erik Gonzalez-Mule, David S. DeGeest, Brian W. McCormick, Jee Young Seong and Kenneth G. Brown
Can we get some cooperation around here? The mediating role of group norms on the relationship between team personality and individual helping behaviors
Journal of Applied Psychology, 2014, Vol. 99, No. 5, 988-999
202 Vicente Gonzalez-Roma and Ana Hernandez Climate uniformity: Its influence on team communication quality, task conflict, and team performance Journal of Applied Psychology, 2014, Vol. 99, No. 6, 1042-1058
203 Rebecca L. Greenbaum, Matthew J. Quade, Mary B. Mawritz, Joongseo Kim and Durand Crosby
When the customer is unethical: The explanatory role of employee emotional exhaustion onto work-family conflict, relationship conflict with coworkers, and job neglect
Journal of Applied Psychology, 2014, Vol. 99, No. 6, 1188-1203
204 Julia E. Hoch and Steve W. J. Kozlowski Leading virtual teams: Hierarchical leadership, structural supports, and shared team leadership Journal of Applied Psychology, 2014, Vol. 99, No. 3, 390-403
205 Xu Huang, JJ Po-An Hsieh, and Wei He Expertise dissimilarity and creativity: The contingent roles of tacit and explicit knowledge sharing Journal of Applied Psychology, 2014, Vol. 99, No. 5, 816-830
206 Jaclyn M. Jensen, Pankaj C. Patel and Jana L. Raver
Is it better to be average? High and low performance as predictors of employee victimization Journal of Applied Psychology, 2014, Vol. 99, No. 2, 296-309
207 Howard J. Klein, Joseph T. Cooper, Janice C. Molloy and Jacqueline A. Swanson
The assessment of commitment: Advantages of a unidimensional, target-free approach Journal of Applied Psychology, 2014, Vol. 99, No. 2, 222-238
208 Alex Ning Li and Hui Liao How do leader-member exchange quality and differentiation affect performance in teams? An integrated multilevel dual process model
Journal of Applied Psychology, 2014, Vol. 99, No. 5, 847-866
209 Wen-Dong Li, Doris Fay, Michael Frese, Peter D. Harms and Xiang Yu Gao
Reciprocal relationship between proactive personality and work characteristics: A latent change score approach Journal of Applied Psychology, 2014, Vol. 99, No. 5, 948-965
210 Huiwen Lian, D. Lance Ferris, Rachel Morrison and Douglas J. Brown
Blame it on the supervisor or the subordinate? Reciprocal relations between abusive supervision and organizational deviance
Journal of Applied Psychology, 2014, Vol. 99, No. 4, 651-664
211 Sandy Lim and Kenneth Tai Family incivility and job performance: A moderated mediation model of psychological distress and core self-evaluation Journal of Applied Psychology, 2014, Vol. 99, No. 2, 351-359
212 Songqi Liu, Mo Wang, Hui Liao and Junqi Shi Self-regulation during job search: The opposing effects of employment self-efficacy and job search behaviour self-efficacy Journal of Applied Psychology, 2014, Vol. 99, No. 6, 1159-1172
213 Russell A. Matthews, Julie Holliday Wayne and Michael T. Ford
A work-family conflict/subjective well-being process model: A test of competing theories of longitudinal effects Journal of Applied Psychology, 2014, Vol. 99, No. 6, 1173-1187
214 M. Travis Maynard, Margaret M. Luciano, Lauren D’Innocenzo, John E. Mathieu and Matthew D. Dean
Modeling time-lagged reciprocal psychological empowerment-performance relationships Journal of Applied Psychology, 2014, Vol. 99, No. 6, 1244-1253
215 Timothy D. Maynes and Philip M. Podsakoff Speaking more broadly: An examination of the nature, antecedents, and consequences of an expanded set of employee voice behaviors
Journal of Applied Psychology, 2014, Vol. 99, No. 1, 87-112
216 Susan Mohammed and Sucheta Nadkarni Are we all on the same temporal page? The moderating effects of temporal team cognition on the polychronicity diversity-team performance relationship
Journal of Applied Psychology, 2014, Vol. 99, No. 3, 404-422
217 Inbal Nahum-Shani, Melanie M. Henderson, Sandy Lim and Amiram D. Vinokur
Supervisor support: Does supervisor support buffer or exacerbate the adverse effects of supervisor undermining? Journal of Applied Psychology, 2014, Vol. 99, No. 3, 484-503
218 Christopher D. Nye, Bradley J. Brummel and Fritz Drasgow
Understanding sexual harassment using aggregate construct models Journal of Applied Psychology, 2014, Vol. 99, No. 6, 1204-1221
219 Jerel E. Slaughter, Daniel M. Cable and Daniel B. Turban
Changing job seekers’ image perceptions during recruitment visits: The moderating role of belief confidence Journal of Applied Psychology, 2014, Vol. 99, No. 6, 1146-1158
220 Gergana Todorova, Julia B. Bear and Laurie R. Weingart
Can conflict be energizing? A study of task conflict, positive emotions, and job satisfaction Journal of Applied Psychology, 2014, Vol. 99, No. 3, 451-467
221 Prajya R. Vidyarthi, Berrin Erdogan, Smriti Anand, Robert C. Liden and Anjali Chaudhry
One member, two leaders: Extending leader-member exchange theory to a dual leadership context Journal of Applied Psychology, 2014, Vol. 99, No. 3, 468-483
222 David D. Walker, Danielle D. van Jaarsveld and Daniel P. Skarlicki
Exploring the effects of individual customer incivility encounters on employee incivility: The moderating roles of entity (in)civility and negative affectivity
Journal of Applied Psychology, 2014, Vol. 99, No. 1, 151-161
223 Kai Chi Yam, Ryan Fehr and Christopher M. Barnes
Morning employees are perceived as better employees: Employees’ start times influence supervisor performance ratings Journal of Applied Psychology, 2014, Vol. 99, No. 6, 1288-1299
224 Craig Wallace, Gilad Chen A multilevel integration of personality, climate, self-regulation, and performance Personnel Psychology, 2006, 59, 529–557
225 Michael Mount, Remus Ilies, Erin Johnson Relationship of personality traits and counterproductive work behaviors: The mediating effects of job satisfaction Personnel Psychology, 2006, 59, 591–622
226 David A. Hofmann, Barbara Mark An investigation of the relationship between safety climate and medication errors as well as other nurse and patient outcomes Personnel Psychology, 2006, 59, 847–869
227 Patrick F. McKay, Derek R. Avery, Scott Tonidandel, Mark A. Morris, Morela Hernandez, Michelle R. Hebl
Racial differences in employee retention: Are diversity climate perceptions the key? Personnel Psychology, 2007, 60, 35–62
228 Erich C. Dierdorff, Eric A. Surface Placing peer ratings in context: Systematic influences beyond ratee performance Personnel Psychology, 2007, 60, 93–126
229 Peng Wang, Fred O. Walumbwa Family-friendly programs, organizational commitment, and work withdrawal: The moderating role of transformational leadership
Personnel Psychology, 2007, 60, 397–427
230 Fred Luthans, Bruce J. Avolio, James B. Avey, Steven M. Norman
Positive psychological capital: Measurement and relationship with performance and satisfaction Personnel Psychology, 2007, 60, 541–572
231 Hao Zhao, Sandy J. Wayne, Brian C. Glibkowski, Jesus Bravo
The impact of psychological contract breach on work-related outcomes: A meta-analysis Personnel Psychology, 2007, 60, 647–680
233 Colin M. Gill, Gerard P. Hodgkinson Development and validation of the Five-Factor Model Questionnaire: An adjectival-based personality inventory for use in occupational settings
Personnel Psychology, 2007, 60, 731–766
234 Mel Fugate, Angelo J. Kinicki, Gregory E. Prussia
Employee coping with organizational change: An examination of alternative theoretical perspectives and models Personnel Psychology, 2008, 61, 1–36
235 Paul J. Taylor, Wen-Dong Li, Kan Shi, Walter C. Borman The transportability of job information across countries Personnel Psychology, 2008, 61, 69–111
236 Michael K. Mount, In-Sue Oh and Melanie Burns
Incremental validity of perceptual speed and accuracy over general mental ability Personnel Psychology, 2008, 61, 113–139
237 Jeffery A. LePine, Ronald F. Piccolo, Christine L. Jackson, John E. Mathieu, Jessica R. Saul
A meta-analysis of teamwork processes: Tests of a multidimensional model and relationships with team effectiveness criteria
Personnel Psychology, 2008, 61, 273–307
238 Lisa H. Nishii, David P. Lepak, Benjamin Schneider
Employee attributions of the “why” of HR practices: Their effects on employee attitudes and behaviors, and customer satisfaction Personnel Psychology, 2008, 61, 503–545
239 Ronald Bledow and Michael Frese A situational judgment test of personal initiative and its relationship to performance Personnel Psychology, 2009, 62, 229–258
240 Jixia Yang, James M. Diefendorff The relations of daily counterproductive workplace behavior with emotions, situational antecedents, and personality moderators: A diary study in Hong Kong
Personnel Psychology, 2009, 62, 259–295
241 Janet H. Marler, Sandra L. Fisher, Weiling Ke Employee self-service technology acceptance: A comparison of pre-implementation and post-implementation relationships Personnel Psychology, 2009, 62, 327–358
242 Herman Aguinis, Mark D. Mazurkiewicz, Eric D. Heggestad
Using web-based frame-of-reference training to decrease biases in personality-based job analysis: An experimental field study Personnel Psychology, 2009, 62, 405–438
243 Chad H. Van Iddekinge, Gerald R. Ferris, Tonia S. Heffner
Test of a multistage model of distal and proximal antecedents of leader performance Personnel Psychology, 2009, 62, 463–495
244 Daniel B. Turban, Cynthia K. Stevens Effects of conscientiousness and extraversion on new labor market entrants’ job search: The mediating role of metacognitive activities and positive emotions
Personnel Psychology, 2009, 62, 553–573
245 Brian J. Hoffman, David J. Woehr Disentangling the meaning of multisource performance rating source and dimension factors Personnel Psychology, 2009, 62, 735–765
246 Chih-Hsun Chuang, Hui Liao Strategic human resource management in service context: Taking care of business by taking care of employees and customers
Personnel Psychology, 2010, 63, 153–196
247 Connie R. Wanberg, Zhen Zhang, Erica W. Diehn
Development of the “Getting Ready for Your Next Job” inventory for unemployed individuals Personnel Psychology, 2010, 63, 439–478
248 Gary J. Greguras, James M. Diefendorff Why does proactive personality predict employee life satisfaction and work behaviors? A field investigation of the mediating role of the self-concordance model
Personnel Psychology, 2010, 63, 539–560
249 María del Carmen Triana, María Fernanda García, Adrienne Colella
Managing diversity: How organizational efforts to support diversity moderate the effects of perceived racial discrimination on affective commitment
Personnel Psychology, 2010, 63, 817–843
250 Dawn S. Carlson, Merideth Ferguson, Pamela L. Perrewe and Dwayne Whitten
The fallout from abusive supervision: An examination of subordinates and their partners Personnel Psychology, 2011, 64, 937-961
251 Shoshana R. Dobrow and Jennifer Tosti-Kharas Calling: The development of a scale measure Personnel Psychology, 2011, 64, 1001-1049
252 Lisa Dragoni, In-Sue Oh, Paul VanKatwyk and Paul E. Tesluk
Developing executive leaders: The relative contribution of cognitive ability, personality, and the accumulation of work experience in predicting strategic thinking competency
Personnel Psychology, 2011, 64, 829-864
253 J. Robert Baum, Barbara Jean Bird and Sheetal Singh
The practical intelligence of entrepreneurs: Antecedents and a link with new venture growth Personnel Psychology, 2011, 64, 397-425
254 Sean T. Hannah, Fred O. Walumbwa and Louis W. Fry
Leadership in action teams: Team leader and members’ authenticity, authenticity strength, and team outcomes Personnel Psychology, 2011, 64, 771-802
255 Theresa M. Glomb, Devasheesh P. Bhave, Andrew G. Miner and Melanie Wall
Doing good, feeling good: Examining the role of organizational citizenship behaviors in changing mood Personnel Psychology, 2011, 64, 191-223
256 Brian J. Hoffman, Klaus G. Melchers, Carrie A. Blair, Martin Kleinmann and Robert T. Ladd
Exercises and dimensions are the currency of assessment centers Personnel Psychology, 2011, 64, 351-395
257 Jason L. Huang and Ann Marie Ryan Beyond personality traits: A study of personality states and situational contingencies in customer service jobs Personnel Psychology, 2011, 64, 451-488
258 Brian K. Griepentrog, Crystal M. Harold, Brian C. Holtz, Richard J. Klimoski and Sean M. Marsh
Integrating social identity and the theory of planned behaviour: Predicting withdrawal from an organizational recruitment process
Personnel Psychology, 2012, 65, 723-753
259 Scott B. Mackenzie, Philip M. Podsakoff, Nathan P. Podsakoff
Challenge-oriented organizational citizenship behaviors and organizational effectiveness: Do challenge-oriented behaviors really have an impact on the organization’s bottom line?
Personnel Psychology, 2011, 64, 559-592
260 Shaul Oreg and Yair Berson Leadership and employees’ reactions to change: The role of leaders’ personal attributes and transformational leadership style Personnel Psychology, 2011, 64, 627-659
261 Suzanne J. Peterson, Fred Luthans, Bruce J. Avolio, Fred O. Walumbwa and Zhen Zhang
Psychological capital and employee performance: A latent growth modelling approach Personnel Psychology, 2011, 64, 427-450
262 Christopher R. Plouffe and Yany Gregoire Intraorganizational employee navigation and socially derived outcomes: Conceptualization, validation, and effects on overall performance
Personnel Psychology, 2011, 64, 693-738
263 John J. Sumanth and Daniel M. Cable Status and organizational entry: How organizational and individual status affect justice perceptions of hiring systems Personnel Psychology, 2011, 64, 963-1000
264 Fred O. Walumbwa, Russell Cropanzano and Barry M. Goldman
How leader-member exchange influences effective work behaviors: Social exchange and internal-external efficacy perspectives
Personnel Psychology, 2011, 64, 739-770
265 Mo Wang and Elizabeth McCune Understanding newcomers’ adaptability and work-related outcomes: Testing the mediating roles of perceived P-E fit variables
Personnel Psychology, 2011, 64, 163-189
266 Riki Takeuchi, Zhijun Chen and Siu Yin Cheung
Applying uncertainty management theory to employee voice behaviour: An integrative investigation Personnel Psychology, 2012, 65, 283-323
267 Subrahmaniam Tangirala and Rangaraj Ramanujam
Ask and you shall hear (but not always): Examining the relationship between manager consultation and employee voice Personnel Psychology, 2012, 65, 251-282
268 Belle Rose Ragins, Jorge A. Gonzalez, Kyle Ehrhardt and Romila Singh
Crossing the threshold: The spillover of community racial diversity and diversity climate to the workplace Personnel Psychology, 2012, 65, 755-787
269 Mary Bardes Mawritz, David M. Mayer, Jenny M. Hoobler, Sandy J. Wayne and Sophia V. Marinova
A trickle-down model of abusive supervision Personnel Psychology, 2012, 65, 325-357
270 Celia Moore, James R. Detert, Linda Klebe Trevino, Vicki L. Baker and David M. Mayer
Why employees do bad things: Moral disengagement and unethical organizational behaviour Personnel Psychology, 2012, 65, 1-48
271 Brian J. Hoffman, C. Allen Gorman, Carrie A. Blair, John P. Meriac, Benjamin Overstreet and E. Kate Atchley
Evidence for the effectiveness of an alternative multisource performance rating methodology Personnel Psychology, 2012, 65, 531-563
272 Yuanyuan Huo, Wing Lam, Ziguang Chen Am I the only one this supervisor is laughing at? Effects of aggressive humor on employee strain and addictive behaviors Personnel Psychology, 2012, 65, 859-885
273 Suzanne J. Peterson, Benjamin M. Galvin and Donald Lange
CEO servant leadership: Exploring executive characteristics and firm performance Personnel Psychology, 2012, 65, 565-596
274 Myeong-Gu Seo, M. Susan Taylor, N. Sharon Hill, Xiaomeng Zhang, Paul E. Tesluk and Natalia M. Lorinkova
The role of affect and leadership during organizational change Personnel Psychology, 2012, 65, 121-165
275 Sabine Sonnentag and Adam M. Grant Doing good at work feels good at home, but not right away: When and why perceived prosocial impact predicts positive affect
Personnel Psychology, 2012, 65, 495-530
276 Stanley M. Gully, Jean M. Phillips, William G. Castellano, Kyongji Han and Andrea Kim
A mediated moderation model of recruiting socially and environmentally responsible job applicants Personnel Psychology, 2013, 66, 935-973
277 Derek R. Avery, Mo Wang, Sabrina D. Volpone and Le Zhou
Different strokes for different folks: The impact of sex dissimilarity in the empowerment-performance relationship Personnel Psychology, 2013, 66, 757-784
278 Erik R. Eddy, Scott I. Tannenbaum and John E. Mathieu
Helping teams to help themselves: Comparing two team-led debriefing methods Personnel Psychology, 2013, 66, 975-1008
279 Alicia A. Grandey, Nai-Wen Chi and Jennifer A. Diamond
Show me the money! Do financial rewards for performance enhance or undermine the satisfaction from emotional labor? Personnel Psychology, 2013, 66, 569-612
280 Angelo J. Kinicki, Kathryn J. L. Jacobson, Suzanne J. Peterson and Gregory E. Prussia
Development and validation of the performance management behaviour questionnaire Personnel Psychology, 2013, 66, 1-45
281 Ning Li, Dan S. Chiaburu, Bradley L. Kirkman and Zhitao Xie
Spotlight on the followers: An examination of moderators of relationships between transformational leadership and subordinates’ citizenship and taking charge
Personnel Psychology, 2013, 66, 225-260
282 Thomas W. H. Ng and Daniel C. Feldman Changes in perceived supervisor embeddedness: Effects on employees’ embeddedness, organizational trust, and voice behaviour
Personnel Psychology, 2013, 66, 645-685
283 Gera Noordzij, Edwin A. J. Van Hooft, Heleen Van Mierlo, Arian Van Dam and Marise Ph. Born
The effects of a learning-goal orientation training on self-regulation: A field experiment among unemployed job seekers Personnel Psychology, 2013, 66, 723-755
284 Robert S. Rubin, Erich C. Dierdorff and Daniel G. Bachrach
Boundaries of citizenship behaviour: Curvilinearity and context in the citizenship and task performance relationship Personnel Psychology, 2013, 66, 377-406
285 Deborah E. Rupp, Ruodan Shao, Meghan A. Thornton and Daniel P. Skarlicki
Applicants’ and employees’ reactions to corporate social responsibility: The moderating effects of first-party justice perceptions and moral identity
Personnel Psychology, 2013, 66, 895-933
286 Daniel B. Turban, Felissa K. Lee, Serge P. Da Motta Veiga, Dana L. Haggard and Sharon Y. Wu
Be happy, don’t wait: The role of trait affect in job search Personnel Psychology, 2013, 66, 483-514
287 Devasheesh P. Bhave The invisible eye? Electronic performance monitoring and employee job performance Personnel Psychology, 2014, 67, 605-635
288 Stephan A. Boehm, Florian Kunze and Heike Bruch
Spotlight on age-diversity climate: The impact of age-inclusive HR practices on firm-level outcomes Personnel Psychology, 2014, 67, 667-704
289 Wendy R. Boswell, Julie B. Olson-Buchanan and T. Brad Harris
I cannot afford to have a life: Employee adaptation to feelings of job insecurity Personnel Psychology, 2014, 67, 887-915
290 Amy E. Colbert, Murray R. Barrick and Bret H. Bradley
Personality and leadership composition in top management teams: Implications for organizational effectiveness Personnel Psychology, 2014, 67, 351-387
291 Hong Deng and Kwok Leung Contingent punishment as a double-edged sword: A dual-pathway model from a sense-making perspective Personnel Psychology, 2014, 67, 951-980
292 Graham Brown, Craig Crossley and Sandra L. Robinson
Psychological ownership, territorial behaviour, and being perceived as a team contributor: The critical role of trust in the work environment
Personnel Psychology, 2014, 67, 463-485
293 T. Brad Harris, Ning Li, Wendy R. Boswell, Xin-An Zhang and Zhitao Xie
Getting what’s new from newcomers: Empowering leadership, creativity, and adjustment in the socialization context Personnel Psychology, 2014, 67, 567-604
294 Dong Liu, Morela Hernandez and Lei Wang The role of leadership and trust in creating structural patterns of team procedural justice: A social network investigation Personnel Psychology, 2014, 67, 801-845
295 Jean M. Phillips, Stanley M. Gully, John E. McCarthy, William G. Castellano and Mee Sook Kim
Recruiting global travellers: The role of global travel recruitment messages and individual differences in perceived fit, attraction, and job pursuit intentions
Personnel Psychology, 2014, 67, 153-201
296 Belle Rose Ragins, Karen S. Lyness, Larry J. Williams and Doan Winkel
Life spillovers: The spillover of fear of home foreclosure to the workplace Personnel Psychology, 2014, 67, 763-800
297 B. Sebastian Reiche, Pablo Cardona, Yin-teen Lee, Miguel Angel Canela, Esther Akinnukawe, et al.
Why do managers engage in trustworthy behaviour? A multilevel cross-cultural study in 18 countries Personnel Psychology, 2014, 67, 61-98
298 Hong Ren, Margaret A. Shaffer, David A. Harrison, Carmen Fu and Katherine M. Fodchuk
Reactive adjustment or proactive embedding? Multistudy, multiwave evidence for dual pathways to expatriate retention Personnel Psychology, 2014, 67, 203-239
299 Ruodan Shao and Daniel P. Skarlicki Service employees’ reactions to mistreatment by customers: A comparison between North America and East Asia Personnel Psychology, 2014, 67, 23-59
300 Jerel E. Slaughter, Michael S. Christian, Nathan P. Podsakoff, Evan F. Sinar and Filip Lievens
On the limitations of using situational judgement tests to measure interpersonal skills: The moderating influence of employee anger
Personnel Psychology, 2014, 67, 847-885
301 Jeffrey R. Spence, Douglas J. Brown, Lisa M. Keeping and Huiwen Lian
Helpful today, but not tomorrow? Feeling grateful as a predictor of daily organizational citizenship behaviors Personnel Psychology, 2014, 67, 705-738
17 REFERENCES
Abraham R. (1999) Emotional intelligence in organisations: a conceptualization.
Genetic, Social and General Psychology Monographs 125, 209–227.
Akaike, H. (1973). Information theory and an extension of the maximum
likelihood principle. In B. N. Petrov & F. Csaki (Eds.), Second International
Symposium on Information Theory (pp. 267-281). Budapest: Akademiai Kiado.
Anderson, T. W., & Rubin, H. (1956). Statistical inference in factor analysis.
Proceedings of the Third Berkeley Symposium on Mathematical Statistics and
Probability (pp. 111-150). Berkeley: University of California Press.
Australian Government, Comcare. (2011). Work ability—Professor Juhani
Ilmarinen. Retrieved 1 March, 2013, from
http://www.comcare.gov.au/news__and__media/news_listing/work_abilityprofessor_juhani_ilmarinen
Avolio, B. J., Yammarino, F. J., & Bass, B. M. (1991). Identifying common
methods variance with data collected from a single source: An unresolved sticky issue.
Journal of Management, 17, 571-586.
Bagozzi, R. P. (1977). Structural equation models in experimental research.
Journal of Marketing Research, 14, 209-226.
Bagozzi, R. P. (1980). Causal Modeling in Marketing. New York, NY: Wiley &
Sons.
Bagozzi, R. P., & Yi, Y. (1990). Assessing method variance in multitrait-
multimethod matrices: The case of self-reported affect and perceptions at work. Journal
of Applied Psychology, 75(5), 547-560.
Bagozzi, R. P., & Yi, Y. (1991). Multitrait-Multimethod matrices in consumer
research. Journal of Consumer Research, 17(4), 426-439.
Barclay, D., Higgins, C., & Thompson, R. (1995). The Partial Least Squares
(PLS) Approach to Causal Modeling: Personal Computer Adoption and Use as an
Illustration. Technology Studies, 2, 285-309.
Becker, J. M., Klein, K., & Wetzels, M. (2012). Hierarchical latent variable models
in PLS-SEM: Guidelines for using reflective-formative type models. Long Range
Planning, 45(6), 359-394.
Bentler, P. M. (1968). Alpha-maximized factor analysis (Alphamax): Its relation
to alpha and canonical factor analysis. Psychometrika, 33, 335-345.
Bentler, P. M. (1972). A lower-bound method for the dimension-free
measurement of internal consistency. Social Science Research, 1, 343-357.
Bentler, P. M. (1986). Structural modeling and Psychometrika: An historical
perspective on growth and achievements. Psychometrika, 51(1), 35-51.
Bentler, P. M. (2006). EQS 6 structural equations program manual. Encino, CA:
Multivariate Software (www.mvsoft.com). Los Angeles.
Bentler, P. M. (2007). Covariance structure models for maximal reliability of
unit-weighted composites. In S.-Y. Lee (Ed.), Handbook of latent variable and related
models (pp. 1-19). Amsterdam: North-Holland.
Bentler, P. M. (2009). Alpha, dimension-free, and model-based internal
consistency reliability. Psychometrika, 74, 137-143.
Bentler, P. M. (2014). Covariate-free and Covariate-dependent Reliability. The
79th Annual Meeting of the Psychometric Society. Madison, Wisconsin. July 22-25.
Blakeley, J. A. & Ribeiro, V. E. S. (2008). Early retirement among registered
nurses: Contributing factors. Journal of Nursing Management, 16, 29-37.
Blalock, H. M. (1971). Causal models in the social sciences. Chicago: Aldine-
Atherton.
Bock, R. D., and Aitkin, M. (1981). Marginal maximum likelihood estimation of
item parameters: Application of an EM algorithm. Psychometrika, 46, 443-459.
Bollen, K. A. (1984). Multiple indicators: Internal consistency or no necessary
relationship? Quality and Quantity, 18, 377-385.
Bollen, K. A. (1989). Structural Equations with Latent Variables, New York:
John Wiley & Sons.
Bollen, K., & Lennox, R. (1991). Conventional wisdom on measurement: A
structural equation perspective. Psychological Bulletin, 110, 305-314.
Boumans, N. P. G., de Jong, A. H. J., & Vanderlinden, L. (2008). Determinants
of early retirement intentions among Belgian nurses. Journal of Advanced Nursing,
63(1), 64-74.
Brooke, E., Goodall, J., & Mawren, D. (2010). Retaining older workforces in
aged care work. Paper presented at the 4th International Symposium on Work Ability:
Age Management during the Life Course, pp. 187-197, Tampere, Finland.
Browne, M. W. (1968). A comparison of factor analytic techniques.
Psychometrika, 33, 267-334.
Buck, R., Varnava, A., Wynne-Jones, G., Phillips, C., Farewell, D., Porteous, C.,
Webb, K., Button, L., Cooper, L., & Main, C. (2008). Health and well-being in work in
Merthyr Tydfil: A biopsychosocial approach. Well-being in Work Stage 2: Final Report
to the Wales Centre for Health and Welsh Assembly Government.
www.wellbeinginwork.org
Burnham, K. P., & Anderson, D. R. (2004). Multimodel Inference: Understanding
AIC and BIC in Model Selection. Sociological Methods & Research, 33(2).
BWA Centre for Research. (2007a). The redesigning work for an ageing society
project: Fact Sheet 1. Retrieved 2 March, 2013, from
http://www.swinburne.edu.au/business/business-work-ageing/documents/ARC_FactSheet1_16Mar07.pdf
BWA Centre for Research. (2007b). What is work ability?: Fact Sheet 2.
Retrieved 2 March, 2013, from
http://www.swinburne.edu.au/business/business-work-ageing/documents/ARC_FactSheet2_10Sep07.pdf
Byrne, B. M. (2006). Structural equation modeling with EQS: Basic concepts,
application, and programming. New Jersey: Lawrence Erlbaum Associates.
Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation
by the multitrait-multimethod matrix. Psychological Bulletin, 56(2), 81-105.
Carroll, R. J., Ruppert, D., Stefanski, L. A. and Crainiceanu, C. M. (2006).
Measurement Error in Nonlinear Models: A Modern Perspective, 2nd ed. Chapman and
Hall/CRC Press: Boca Raton.
Cattell, R. B. (1978). The scientific use of factor analysis in behavioural and life
sciences. New York: Plenum.
Cheung, G. W., & Rensvold, R. B. (2002). Evaluating Goodness-of-Fit Indexes
for Testing Measurement Invariance. Structural Equation Modeling: A Multidisciplinary
Journal, 9(2), 233-255.
Cheung, M. W. (2008). A Model for Integrating Fixed-, Random-, and Mixed-
Effects Meta-Analyses Into Structural Equation Modeling. Psychological Methods,
13(3), 182–202.
Chin, W. W., & Newsted, P. R. (1999). Structural equation modeling analysis
with small samples using partial least squares. In Hoyle, R. (Ed.), Statistical strategies
for small samples research (pp. 307–341). Thousand Oaks, CA: Sage.
Chin, W.W. (1998). The partial least squares approach to structural equation
modeling. In Marcoulides, G.A. (Ed.), Modern Methods for Business Research
(pp. 295-358). Mahwah, NJ: Erlbaum.
Chin, W.W., Marcolin, B.L., & Newsted, P.R. (2003). A partial least squares
latent variable modeling approach for measuring interaction effects: Results from a
Monte Carlo simulation study and an electronic-mail emotion/adoption study.
Information Systems Research, 14(2), 189–217.
Chin, W. W. (2010). How to write up and report PLS analyses. In Esposito, V., et
al. (Eds.), Handbook of Partial Least Squares (pp. 655-690).
Ciarrochi J., Chan A.Y.C. & Caputi P. (2000) A critical evaluation of the
emotional intelligence concept. Personality and Individual Differences 28, 1477–1490.
Copertano, A., Bevilacqua, G., Barbaresi, M., Barchiesi, F., & Copertano, B.
(2010). Work-related stress: Risk assessment in the local regional health service unit of
Ancona [La valutazione dello stress lavoro-correlato nell’azienda sanitaria di Ancona].
Giornale Italiano di Medicina del Lavoro ed Ergonomia, 29(4), 128-129.
Cortina, J. M. (1993). What is coefficient alpha? An examination of theory and
applications. Journal of Applied Psychology, 78, 98-104.
Costner, H. L. (1971). Utilizing causal models to discover flaws in experiments.
Sociometry, 34, 398-410.
Cox T, Thirlaway M, Gotts G, Cox S. (1983). The nature and assessment of
general wellbeing. Journal of Psychosomatic Research, 27, 353-359.
Cox T. (1997). Workplace health promotion. Work & Stress, 11, 1-5.
Crocker, L., & Algina, J. (1986). Introduction to classical and modern test
theory. New York: Holt, Rinehart, and Winston.
Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests.
Psychometrika, 16, 297-334.
Cronbach, L. J., & Meehl, P. E. (1955). Construct validity in psychological
tests. Psychological Bulletin, 52, 281-302.
Crossley, C. D., Bennett, R. J., Jex, S. M., & Burnfield, J. L. (2007).
Development of a global measure of job embeddedness and integration into a traditional
model of voluntary turnover. Journal of Applied Psychology, 92(4), 1031-1042.
Cunningham, E. (2008). A practical guide to Structural Equation Modeling using
AMOS. Melbourne: Statsline.
Daws, J. (2012). Finnish history of work ability research
and age management. Retrieved 6 March, 2013, from
http://www.ngssuper.com.au/assets/Images/Supermembers/NGS-SA-2011-12-WinnerJimDaws-1102-1012.pdf
D’Errico, A., Viotti, S., Baratti, A., Mottura, B., Barocelli, A.P., Tagna, M.,
Sgambelluri, B., Battaglino, P., & Converso, D. (2013). Low back pain and associated
presenteeism among hospital nursing staff. Journal of Occupational Health, 55, 276-
283.
de Zwart, B. C., Frings-Dresen, M. H., & van Duivenbooden, J. C. (2002). Test–
retest reliability of the Work Ability Index questionnaire. Occupational Medicine, 52(4),
177-181.
DeCarlo, L. T. (1997). On the meaning and use of kurtosis. Psychological
Methods, 2, 292-307.
DeMars, C. (2010). Item Response Theory. New York: Oxford University Press.
Diamantopoulos, A., & Winklhofer, H. M. (2001). Index Construction with Formative
Indicators: An Alternative to Scale Development. Journal of Marketing Research,
38(2), 269-277.
Diamantopoulos, A. (2010). Reflective and Formative Metrics of Relationship
Value: Response to Baxter’s Commentary Essay, Journal of Business Research, 63(1),
91-93.
Diamantopoulos, A., Riefler, P., and Roth, K. P. (2008). Advancing Formative
Measurement Models, Journal of Business Research, 61(12), pp. 1203-1218.
Donaldson, S. I., & Grant-Vallone, E. J. (2002). Understanding self-report bias in
organizational behavior research. Journal of Business and Psychology, 17(2), 245-260.
Doty, D. H., & Glick, W. H. (1998). Common method bias: Does common
methods variance really bias results? Organizational Research Methods, 1(4), 374-406.
Edwards, J. R., and Bagozzi, R. P. (2000). On the Nature and Direction of
Relationships Between Constructs and Measures. Psychological Methods, 5(2), 155-174.
Efron, B., & Tibshirani, R. J. (1993). An introduction to the bootstrap.
Monographs on Statistics and Applied Probability, no. 57. New York, NY: Chapman
and Hall.
Eskelinen, L., Kohvakka, A., Merisalo, T., Hurri, H., & Wagar, G. (1991).
Relationship between the self-assessment and clinical assessment of health status and
work ability. Scand J Work Environ Health, 17(Suppl 1), 40-47.
European Network for Workplace Health Promotion (ENWHP) & National
Work Ability Index (WAI) Network. (2012). Work Ability Index - Europe.
Retrieved 27 Feb, 2013, from http://www.thcu.ca/workplace/sat/pubs/tool_159.pdf
Faragher, E. B., Cooper, C. L., & Cartwright, S. (2004). A shortened stress
evaluation tool (ASSET). Stress and Health, 20, 189-201.
Finnish Institute of Occupational Health. (2011). Multidimensional work ability
model. Helsinki, Finland.
Fochsen, G., Josephson, M., Hagberg, M., Toomingas, A., & Lagerström, M.
(2006). Predictors of leaving nursing care: a longitudinal study among Swedish nursing
personnel. Occupational and Environmental Medicine, 63(3), 198-201.
doi:10.1136/oem.2005.021956
Fornell, C., & Larcker, D. F. (1981). Evaluating structural equation models with
unobservable variables and measurement error. Journal of Marketing Research, 18, 39–
50.
Fornell, C., and Bookstein, F. L. (1982). A Comparative Analysis of Two
Structural Equation Models: LISREL and PLS Applied to Market Data, in C. Fornell
(ed.), A Second Generation of Multivariate Analysis, New York: Praeger, 289-324.
Frisch, R. (1934). Statistical confluence analysis by means of complete
regression systems. Oslo: Oslo University.
Frisch, R. and Waugh, F. (1933). Partial Time Regressions as Compared with
Individual Trends. Econometrica, 1 (4), 387-401.
Ganster, D. C., Hennessey, H. W., & Luthans, F. (1983). Social desirability
response effects: Three alternative models. The Academy of Management Journal 26(2),
321-331.
Geladi, P. (1988) "Notes on the History and Nature of Partial Least Squares
(PLS) Modeling", Journal of Chemometrics, 2, 231-246
Gerbing, D. W., & Anderson, J. C. (1988). An updated paradigm for scale
development incorporating unidimensionality and its assessment. Journal of Marketing
Research, 25, 186-192.
Gignac, G. E. (2007). Multifactor modeling in individual differences research:
Some recommendations and suggestions. Personality and Individual Differences, 42, 37-
48.
Gignac, G. E. (2008). Higher-order models versus direct hierarchical models: g as
superordinate or breadth factor? Psychology Science, 50, 21-43.
Gignac, G. E. (2013). Modeling the Balanced Inventory of Desirable
Responding: Evidence in favour of a revised model of socially desirable responding.
Journal of Personality Assessment, 95, 645-656.
Gignac, G. E., & Watkins, M. W. (2013). Bifactor modeling and the estimation
of model-based reliability in the WAIS-IV. Multivariate
Behavioral Research, 48, 639-662.
Gilbreath, B., & Frew, E. J. (2008). The stress-related presenteeism scale.
Hasan School of Business, Colorado State University - Pueblo, Pueblo, CO.
Glick, W. H., Jenkins, G. D., Jr., & Gupta, N. (1986). Method versus substance:
How strong are underlying relationships between job characteristics and attitudinal
outcomes? Academy of Management Journal, 29(3), 441-464.
Goldberger, A. S. (1971). Econometrics and psychometrics: A survey of
communalities. Psychometrika, 36, 83-107.
Goldberger, A. S., & Duncan, O. D. (Eds.). (1973). Structural equation models in
the social sciences. New York: Academic Press.
Gould, R., Ilmarinen, J., Järvisalo, J., & Koskinen, S. (Eds.). (2008). Dimensions
of Work Ability: Results of the Health 2000 Survey. Helsinki, Finland.
Grandey, A. A. (2000). Emotion regulation in the workplace: a new way to
conceptualize emotional labor. Journal of Occupational Health Psychology 5, 95-110.
Green, S. B., & Hershberger, S. L. (2000). Correlated errors in true score models
and their effect on coefficient alpha. Structural Equation Modeling, 7, 251-270.
Green, S. B., & Yang, Y. (2009a). Commentary on coefficient alpha: A
cautionary tale. Psychometrika, 74, 121-135.
Green, S. B., & Yang, Y. (2009b). Reliability of summed item scores using
structural equation modeling: An alternative to coefficient alpha. Psychometrika, 74,
155-167.
Griffiths, A., Cox, T., Karanika, M., Khan, S., & Tomas, J.M. (2006). Work
design and management in the manufacturing sector: development and validation of the
work organisation assessment questionnaire. Occupational and Environmental Medicine,
63, 669-675.
Guo, K.H., Yuan, Y., Archer, N.P., & Connelly, C.E. (2011). Understanding
nonmalicious security violations in the workplace: A composite behavior model. Journal
of Management Information Systems, 28(2), 203-236.
Gustafsson, J. E., & Balke, G. (1993). General and specific abilities as predictors
of school achievement. Multivariate Behavioral Research, 28, 407–434.
Guttman, L. (1952). Multiple group methods for common factor analysis: Their
basis, computation, and interpretation. Psychometrika, 17, 209-222.
Guttman, L. A. (1945). A basis for analyzing test-retest reliability.
Psychometrika, 10, 255-282.
Haavelmo, T. (1943). The Statistical Implications of a System of Simultaneous
Equations. Econometrica, 11, 1-12.
Hair, J. F., Hult, G., Ringle, C., & Sarstedt, M. (2014). A primer on partial least
squares structural equation modeling (PLS-SEM). Thousand Oaks, CA: SAGE
Publications.
Hair, J.F., Black, W.C., Babin, B.J., & Anderson, R.E. (2010). Multivariate Data
Analysis, seventh ed. Prentice Hall, Englewood Cliffs.
Hair, J.F., Ringle, C.M., & Sarstedt, M. (2011). PLS-SEM: Indeed a silver bullet.
Journal of Marketing Theory and Practice, 19(2), 139-151.
Hasselhorn, H-M., Muller, B.H., & Tackenberg, P. (2005, July). NEXT Scientific
Report. Retrieved 22 September 2015 from
http://www.econbiz.de/archiv1/2008/53602_nurses_work_europe.pdf
Hauser, R. M., and Goldberger, A. S. (1971). The Treatment of Unobservable
Variables in Path Analysis. Chapter 4 in Sociological Methodology, edited by H.L.
Costner. San Francisco: Jossey-Bass.
Health and Safety Executive Guidelines (2010). Retrieved from:
http://www.hse.gov.uk/guidance/ on 12/10/2011.
Heise, D. R., & Bohrnstedt, G. W. (1970). Validity, invalidity, and reliability. In
E. F. Borgatta (Ed.), Sociological methodology 1970 (pp. 104-129). San Francisco:
Jossey-Bass.
Heise, D.R. (1972). Employing nominal variables, induced variables, and block
variables in path analysis. Sociological Methods & Research, 1, 147–173.
Hendry, D. F., and Morgan, M. (1989). A Re-Analysis of Confluence Analysis.
Oxford Economic Papers, 41, 35-52.
Holtom, B. C., Mitchell, T. R., & Lee, T. W. (2006). Increasing human and
social capital by applying job embeddedness theory. Organizational Dynamics, 35(4),
316–331.
Holzinger, K. J. (1941). Factor Analysis. Chicago: University of Chicago Press.
Holzinger, K. J., & Swineford, F. (1937). The bi-factor method. Psychometrika,
2, 41-54.
Hotelling, H. (1933). Analysis of a Complex of Statistical Variables into
Principal Components. Journal of Educational Psychology, 24, 498-520.
Howell, R.D., Breivik, E., & Wilcox, J.B. (2007). Reconsidering formative
measurement. Psychological Methods, 12, 205–218.
Hoyt, C. (1941). Test Reliability Estimated by Analysis of Variance,
Psychometrika, 6, 153-160.
Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance
structure analysis: Conventional criteria versus new alternatives. Structural Equation
Modeling, 6, 1-55.
Ilmarinen, J. (2003). Work Ability Index: a tool for occupational health research
and practise. Paper presented at the 11th Annual EUPHA meeting, Rome, Italy.
Ilmarinen, J. (2007). The Work Ability Index (WAI). Occupational Medicine,
57, 160.
Ilmarinen, J. (2009). Work ability: a comprehensive concept for occupational
health research and prevention; Editorial. Scand J Work Environ Health, 35(1), 1-5.
Ilmarinen, J. (2010). 30 years’ work ability and 20 years’ age management.
Paper presented at the 4th International Symposium on Work Ability: Age Management
during the Life Course, pp. 12-22, Tampere, Finland.
Ilmarinen, J., & Tuomi, K. (2004). Past, present and future of work ability.
Helsinki, Finland: Finnish Institute of Occupational Health.
Ilmarinen, J., Tuomi, K., & Klockars, M. (1997). Changes in the work ability of
active employees as measured by the work ability index over an 11-year period. Scand J
Work Environ Health 23(Suppl 1), 49-57.
Ilmarinen, J., Tuomi, K., & Seitsamo, J. (June 2005). New dimensions of work
ability. International Congress Series, 1280, 3-7.
Irwin, J. O. (1935). On the indeterminacy in the estimate of g. British Journal of
Psychology, 25, 393-394.
Jackson, P., & Agunwamba, C. (1977). Lower bounds for the reliability of the
total score on a test composed of non-homogeneous items: I: Algebraic lower bounds.
Psychometrika, 42(4), 567-578.
Jarvis, C. B., MacKenzie, S. B., and Podsakoff, P. M. (2003). A Critical Review
of Construct Indicators and Measurement Model Misspecification in Marketing and
Consumer Research, Journal of Consumer Research 30 (2), 199-218
Jennrich, R. I. & Sampson, P.F. (1966). Rotation for simple loadings.
Psychometrika, 31, 313-323.
Jennrich, R. I., & Clarkson, D. B. (1980). A Feasible Method for Standard Errors of
Estimate in Maximum Likelihood Factor Analysis. Psychometrika, 45, 237-247.
Johnson, S., Cooper, C., Cartwright, S., Donald, I., Taylor, P., & Millet, C. (2005).
The experience of work-related stress across occupations. Journal of Managerial
Psychology, 20(2), 178-187.
Jöreskog, K. G. (1969). A general approach to confirmatory maximum likelihood
factor analysis. Psychometrika, 34, 183-202.
Jöreskog, K. G. (1971). Simultaneous factor analysis in several populations.
Psychometrika, 36, 409–426.
Jöreskog, K. G. (1973). A General Method for Estimating a Linear Structural
Equation System. In Structural Equation Models in the Social Sciences. Edited by A.
Goldberger and O.D. Duncan. (pp. 85-112), New York: Academic Press.
Jöreskog, K. G., and Sörbom, D. (2001). LISREL 8 User's Reference Guide.
Chicago: Scientific Software International.
Kaiser, H. F. (1958). The Varimax Criterion for Analytic Rotation in Factor
Analysis. Psychometrika, 23, 187-200.
Karimi, L., & Bentler, P. M. (under review). Application of covariate-free and
covariate-dependent reliability.
Karimi, L., & Meyer, D. (2014). Validity and Model-Based Reliability of the
Work Organisation Assessment Questionnaire (WOAQ) Among Nurses. Nursing Outlook.
Klein, L., and Goldberger, A. S. (1955). An Econometric Model of the United
States 1929-1952. Amsterdam: North-Holland.
Klein, L. (1950). Economic Fluctuations in the United States 1921-1941. New
York: John Wiley.
Kline, R. B. (1998). Principles and Practice of Structural Equation Modeling.
New York: The Guilford Press.
Koopmans, T. (1945). Statistical Estimation of Simultaneous Economic
Relations. Journal of the American Statistical Association, 40, 448-466.
Kuder, G. F., & Richardson, M. W. (1937). The theory of estimation of test
reliability. Psychometrika, 2, 151-160.
LaMontagne, A. (2004). Improving OHS policy through intervention research.
Journal of Occupational Health and Safety, 20(2), 107-113.
Laschinger, H. K. S. (2012). Job and career satisfaction and turnover intentions of
newly graduated nurses. Journal of Nursing Management, 20, 472–484.
Law, K. S., Wong, C. S., & Mobley, W. H. (1998). Towards a Taxonomy of
Multidimensional Constructs. Academy of Management Review, 23(4), 741-755.
Lawley, D. N. (1940). The Estimation of Factor Loadings by the method of
Maximum Likelihood. Proceedings of the Royal Society of Edinburgh, 60, 64-82.
Lindell, M. K., & Whitney, D. J. (2001). Accounting for common method
variance in cross-sectional research designs. Journal of Applied Psychology, 86, 114-
121.
Ilmarinen, J. (1991). The aging worker. Editorial. Scand J Work Environ Health,
17 (Suppl 1), 141 p.
Long, J. S. (1983). Confirmatory Factor Analysis, Beverly Hills, CA: Sage
Publications.
Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores.
Reading, MA: Addison-Wesley.
MacCallum, R. C., & Browne, M.W. (1993). The use of causal indicators in
covariance structure models: Some practical issues. Psychological Bulletin, 114, 533-
541.
MacCallum, R. C., Browne, M. W., & Sugawara, H. M. (1996). Power analysis
and determination of sample size for covariance structure modeling. Psychological
Methods, 1, 130-149.
MacKenzie, S. B., Podsakoff, P. M., & Podsakoff, N. P. (2011). Construct
Measurement and Validation Procedures in MIS and Behavioral Research: Integrating
New and Existing Techniques. MIS Quarterly, 35(2), 293–334.
MacKenzie, S. B., Podsakoff, P. M., and Jarvis, C. B. (2005). The Problem of
Measurement Model Misspecification in Behavioral and Organizational Research and
Some Recommended Solutions, Journal of Applied Psychology, 90 (4), 710-730.
Magnavita N, Mammi F, Roccia K, & Vincenti F (2007). WOA: un
questionnario per la valutazione dell’ organizzazione del lavoro. Traduzione e
validazione della versione italiana. [WOA: a questionnaire for the evaluation of work
organization. Translation and validation of the Italian version]. Giornale Italiano di
Medicina del Lavoro ed Ergonomia, 29, 663-665.
Malhotra, N. K., Kim, S. S., & Patil, A. (2006). Common method variance in IS
research: A comparison of alternative approaches and a reanalysis of past research.
Management Science, 52(12), 1865-1883.
Mann, H. B., and Wald, A. (1943). On the Statistical Treatment of Linear
Stochastic Difference Equations. Econometrica 11, 173-220.
Marsh, H. W., Hau, K. T., & Grayson, D. (2005). Goodness of fit in structural
equation models. In A. Maydeu-Olivares & J. J. McArdle (Eds.), Contemporary
psychometrics: A Festschrift for Roderick P. McDonald (pp.225-340). Mahwah, NJ:
Erlbaum.
Martus, P., Jakob, O., Rose, U., Seibt, R., & Freude, G. (2010). A comparative
analysis of the Work Ability Index. Occupational Medicine, 60(7), 517-524.
Matsueda, R. L. (2012). Key Advances in the History of Structural Equation
Modeling. In R. Hoyle (Ed.), Handbook of Structural Equation Modeling. New York,
NY: Guilford Press.
McDonald, R. P. (1970). The theoretical foundations of principal factor analysis,
canonical factor analysis, and alpha factor analysis. British Journal of Mathematical and
Statistical Psychology, 23, 1-21.
McDonald, R. P. (1999). Test Theory: A unified treatment. Mahwah, NJ:
Erlbaum.
Meade, A. W., Watson, A. M., & Kroustalis, C. M. (2007). Assessing common
methods bias in organisational research. Paper presented at the 22nd Annual Meeting of
the Society for Industrial and Organizational Psychology, New York.
Messick, S. (1995). Validity of psychological assessment: Validation of
inferences from persons’ responses and performances as scientific inquiry into score
meaning. American Psychologist, 50, 741-749.
Miller, M. (1995). Coefficient alpha: A basic introduction from the perspectives
of classical Test Theory and structural equation modeling. Structural Equation
Modeling, 2, 255-273.
Mislevy, R. J. (1984). Estimating latent distributions. Psychometrika, 49, 359-
381.
Morales, M. G. (2011). Partial Least Squares (PLS) Methods: Origins, Evolution,
and Application to Social Sciences. Communications in Statistics Theory and Methods,
40 (13), 2305-2317.
Morschhäuser, M., & Sochert, R. (2006). Healthy Work in an Ageing Europe:
Strategies and Instruments for Prolonging Working Life. Essen, Germany.
Mosier, C. I. (1939). Determining a simple structure when loadings for certain
tests are known. Psychometrika, 4, 149-162.
Muthén, B. (1984). A General Structural Equation Model with Dichotomous,
Ordered Categorical, and Continuous Latent Variable Indicators. Psychometrika, 49,
115-32.
Muthén, B. (1994). Multi-Level Covariance Structure Analysis. Sociological
Methods and Research, 22, 376-98.
Muthén, B., and Muthén, L. K. (2004). Mplus User’s Guide. Los Angeles, CA:
Nelson, C. R. (1972). The Prediction Performance of the FRB-MIT-PENN
Model of the U.S. Economy. American Economic Review, 62, 902-917.
Nunnally, J. C. (1978). Psychometric Theory (2nd ed.), McGraw-Hill, New
York.
Nunnally, J. C., and Bernstein, I. H. (1994). Psychometric Theory (3rd ed.), New
York: McGraw Hill.
Oakman, J., & Wells, Y. (2009). Can organizations influence employees’
intentions to retire? Paper presented at the 3rd International Symposium on Work
Ability: Promotion of Work Ability Towards Productive Aging, pp. 133 -138, Hanoi,
Vietnam.
Palermo, J. (2010). Investigating modifiable organizational factors relating to
workability: a focus on gendered culture. Paper presented at the 4th International
Symposium on Work Ability: Age Management during the Life Course, pp. 365-377,
Tampere, Finland.
Palermo, J., Webber, L., Smith, K., & Khor, A. (2009). Factors that predict work
ability: Incorporating a measure of organizational values towards ageing. Paper
presented at the 3rd International Symposium on Work Ability: Promotion of Work
Ability Towards Productive Aging, pp. 45 -58, Hanoi, Vietnam.
Pearl, J. (2000). Causality: Models, Reasoning, and Inference. Cambridge, UK:
Cambridge University Press.
Pearson, K. (1901). On Lines and Planes of Closest Fit to Systems of Points in
Space. Philosophical Magazine, 6, 559-72.
Pensola, T., Järvikoski, A., & Järvisalo, J. (2008). Unemployment and Work
Ability. In Gould, R., Ilmarinen, J., Järvisalo, J., & Koskinen, S. (Eds.) Dimensions of
Work Ability: Results of the Health 2000 Survey. Helsinki, Finland.
Petrides K.V. & Furnham A. (2000) On the dimensional structure of emotional
intelligence. Personality and Individual Differences 29, 313–320.
Petter, S., Straub, D., and Rai, A. (2007). Specifying Formative Constructs in
Information Systems Research, MIS Quarterly, 31 (4), 623-656.
Podsakoff, N. P., Shen, W., and Podsakoff, P. M. (2006). The Role of Formative
Measurement Models in Strategic Management Research: Review, Critique, and
Implications for Future Research, Research Methodology in Strategy and Management
(3), 197-252.
Podsakoff, P. M., & Organ, D. W. (1986). Self-reports in organizational research:
Problems and prospects. Journal of Management, 12, 531-544.
Podsakoff, P. M., MacKenzie, S. B., Lee, J.-Y., & Podsakoff, N. P. (2003).
Common method biases in behavioral research: A critical review of the literature and
recommended remedies. Journal of Applied Psychology, 88(5), 879-903.
Rabe-Hesketh, S., Skrondal, A., & Pickles, A. (2004). Generalized Multilevel
Structural Equation Modeling. Psychometrika 69, 167-90.
Radkiewicz, P., Widerszal-Bazyl, M., & the NEXT-Study group. (2005).
Psychometric properties of Work Ability Index in the light of comparative survey study.
International Congress Series, 1280, 304–309.
Raftery, Adrian E. 1995. Bayesian Model Selection in Social Research.
Sociological Methodology 25:111-95.
Raykov, T., & Marcoulides, G. A. (2011). Chapter 7: Procedures for Estimating
Reliability. In Introduction to Psychometric Theory (pp. 160–196). Abingdon, Oxon:
Routledge.
Reise, S. P., Moore, T. M., & Haviland, M. G. (2010). Bifactor Models and
Rotations: Exploring the Extent to which Multidimensional Data Yield Univocal Scale
Scores. Journal of Personality Assessment, 92(6), 544–559.
Reise, S. P., Bonifay, W. E., & Haviland, M. G. (2012). Scoring and modeling
psychological measures in the presence of multidimensionality. Journal of Personality
Assessment, 95, 129-140.
Revelle, W., & Zinbarg, R. E. (2008). Coefficient alpha, beta, omega, and the
glb: Comments on Sijtsma. Psychometrika, 74, 145-154.
Richardson, H. A., Simmering, M. J., & Sturman, M. C. (2009). A tale of three
perspectives: Examining post hoc statistical techniques for detection and correction of
common method variance. Organizational Research Methods, 12, 762-800.
Rick, J., & Briner, R.B. (2000). Psychosocial risk assessment: problems and
prospects. Occupational Medicine, 50(5), 310-314.
Rigdon, E. E. (2012). Rethinking partial least squares path modeling: In praise of
simple methods. Long Range Planning, 45 (5–6), 341-358.
Ringle, C. M., Sarstedt, M., & Straub, D. W. (2012). A critical look at the use of
PLS-SEM in MIS Quarterly. MIS Quarterly, 36 (1), iii–xiv.
Ringle, C.M., Wende, S., & Will, S. (2005). SmartPLS 2.0 (M3) Beta,
Hamburg http://www.smartpls.de.
Rogers, W. M., Schmitt, N., & Mullins, M. E. (2002). Correction for unreliability
of multifactor measures: Comparison of alpha and parallel forms approaches.
Organizational Research Methods, 5, 184-199.
Roldán, J. L. and Sánchez-Franco, M. J. (2012). Variance-Based Structural
Equation Modeling: Guidelines for Using Partial Least Squares in Information Systems
Research. Research Methodologies, Innovations and Philosophies in Software Systems
Engineering and Information Systems. IGI Global, 193-221.
Roy, S., Tarafdar, M., Ragu-Nathan, T. S., & Marsillac, E. (2012). The Effect of
Misspecification of Reflective and Formative Constructs in Operations and
Manufacturing Management Research. Journal of Business Research Methods, 10 (1),
34-52.
Satorra, A., & Bentler, P. M. (1988). Scaling corrections for statistics in
covariance structure analysis (UCLA Statistics Series 2). Los Angeles: UCLA,
Department of Psychology.
Satorra, A., & Bentler, P. M. (1994). Corrections to test statistics and standard
errors in covariance structure analysis. In A. von Eye & C. C. Clogg (Eds), Latent
variables analysis: Applications for developmental research (399-419).
Saunders, J.B., Aasland, O.G., Babor, T.F., de la Fuente, J.R. and Grant, M.
(1993). Development of the Alcohol Use Disorders Identification Test (AUDIT): WHO
collaborative project on early detection of persons with harmful alcohol consumption. II.
Addiction, 88, 791-804.
Schmid, J., & Leiman, J. (1957). The development of hierarchical factor
solutions. Psychometrika, 22, 53-61.
Schriesheim, C. A., Kinicki, A. J., & Schriesheim, J. F. (1979). The effect of
leniency on leader behavior descriptions. Organizational Behavior and Human
Performance 23, 1-29.
Schumacker, R. E., & Lomax, R. G. (2004). A beginner's guide to structural
equation modeling, Second edition. Mahwah, NJ: Lawrence Erlbaum Associates.
Schutte N.S. & Malouff J.M. (1999) Measuring Emotional Intelligence and
Related Constructs. E. Mellen Press, Lewiston, NY.
Schutte N.S., Malouff J.M., Hall L.E., Haggerty D.J., Cooper J.T., Golden C.J. &
Dornheim L. (1998) Development and validation of a measure of emotional intelligence.
Personality and Individual Differences 25, 167–177.
Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 6,
461-464.
Shipley, B. (2000). Cause and Correlation in Biology; A User’s Guide to Path
Analysis, Structural Equations and Causal Inference. Cambridge, UK: Cambridge
University Press.
Sijtsma, K. (2008). On the use, the misuse, and the very limited usefulness of
Cronbach’s alpha. Psychometrika, 74, 107-120.
Sijtsma, K. (2009b). Reliability beyond theory and into practice. Psychometrika,
74, 169-173.
Skrondal, A., and Rabe-Hesketh, S. (2004). Generalized Latent Variable
Modeling: Multilevel, Longitudinal, and Structural Equation Models. Boca Raton:
Chapman and Hall.
Sočan, G. (2000). Assessment of reliability when test items are not essentially
τ-equivalent. Developments in Survey Methodology, 15, 23-35.
Sörbom, D. (1974). A general method for studying differences in factor means
and factor structures between groups. British Journal of Mathematical and Statistical
Psychology, 27, 229-239.
Spearman, C. (1904). General Intelligence, Objectively Determined and
Measured. American Journal of Psychology, 15, 201-93.
Spector, P. E. (1987). Method variance as an artifact in self-reported affect and
perceptions at work: Myth or significant problem. Journal of Applied Psychology, 72(3),
438-443.
Steiger, J. H., & Schönemann, P. H. (1978). A history of factor indeterminacy. In
S. Shye (Ed.), Theory construction and data analysis. Chicago: University of Chicago
Press.
Stober, J. (2001). The social desirability scale-17 (SDS17): Convergent validity,
discriminant validity, and relationship with age. European Journal of Psychological
Assessment, 17(3), 222–232.
Taylor, P. (2010). Planning for an Ageing Workforce. Paper presented at the 4th
International Symposium on Work Ability: Age Management during the Life Course,
pp. 23-33, Tampere, Finland.
Taylor, P. (Sep 2008). Assessing Workability in the Workplace. Unpublished
presentation. OHSIG. Aotea Centre, Auckland, New Zealand.
Taylor, P., & McLoughlin, C. C. (Dec 2011). Pilot Study on Workability.
Monash University. Unpublished presentation. Melbourne, Australia.
Tenenhaus, M., Vinzi, V. E., Chatelin, Y.-M., & Lauro, C. (2005). PLS path
modeling. Computational Statistics & Data Analysis, 48(1), 159–205.
Thomas, K. W., & Kilmann, R. H. (1975). The social desirability variable in
organizational research: An alternative explanation for reported findings. The Academy
of Management Journal, 18(4), 741-752.
Thomson, G. H. (1916). A hierarchy without a general factor. British Journal of
Psychology, 1904-1920, 8, 271–281.
Thomson, G. H. (1935). The definition and measurement of "g" (general
intelligence). Journal of Educational Psychology, 26, 241-262.
Thurstone, L. L. (1935). The Vectors of Mind. Chicago: University of Chicago
Press.
Thurstone, L. L. (1947). Multiple factor analysis. Chicago: Chicago University
Press.
Treiblmaier, H., Bentler, P., and Mair, P. (2011). Formative Constructs
Implemented via Common Factors. Structural Equation Modeling, 18 (1), 1-17.
Tucker, R. (1955). The Objective Definition of Simple Structure in Linear Factor
Analysis. Psychometrika, 20, 209-225.
Tuomi, K. (1997). Eleven-year follow-up of aging workers; Editorial. Scand J
Work Environ Health, 23(Suppl 1), 66–71.
Tuomi, K., Ilmarinen, J., Jahkola, M., Katajarinne, L., & Tulkki, A. (2006). Work
Ability Index. 2nd revised edition. Helsinki, Finnish Institute of Occupational Health.
Tuomi, K., Ilmarinen, J., Klockars, M., Nygård, C.-H., Seitsamo, J., &
Huuhtanen, P. (1997). Finnish research project on aging workers in 1981-1992. Scand J
Work Environ Health, Suppl 1, 7-11.
Tuomi, K., Ilmarinen, J., Martikainen, R., Aalto, L., & Klockars, M. (1997).
Aging, work, life-style and work ability among Finnish municipal workers in 1981-
1992. Scand J Work Environ Health, 23(Suppl 1), 58-65.
Tuomi, K., Ilmarinen, J., Seitsamo, J., Huuhtanen, P., Martikainen, R., Nygård, C.-H.,
& Klockars, M. (1997). Summary of the Finnish research project (1981-1992) to
promote the health and work ability of aging workers. Scand J Work Environ Health,
Suppl 1, 66-71.
Vacha-Haase, T. (1998). Reliability generalization: Exploring variance in
measurement error affecting score reliability across studies. Educational and
Psychological Measurement, 58, 6–20.
Vacha-Haase, T., & Thompson, B. (2011). Score reliability: A retrospective look
back at 12 years of reliability generalization studies. Measurement and Evaluation in
Counseling and Development, 44, 159-168.
Van der Heijden, B.I.J.M., Van Dam, K., & Hasselhorn, H.M. (2009). Intent to
leave nursing. The importance of interpersonal work context, work-home interference,
and job satisfaction beyond the effect of occupational commitment. Career Development
International, 14(7), 616-635.
Vandenberg, R. J., & Lance, C. E. (2000). A review and synthesis of the
measurement invariance literature: suggestions, practices, and recommendations for
organizational research. Organizational Research Methods, 3(1), 4–70.
Warr, P., Cook, J., & Wall, T. (1979). Scales for the measurement of some work
attitudes and aspects of psychological well-being. Journal of Occupational
Psychology, 52(2), 129-148.
Webber, L., Smith, K., & Scott, K. (2006). Age, work ability and plans to leave
work. Paper presented at the Joint Conference of the Australian Psychological Society
and the New Zealand Psychological Society, pp. 479-483, Auckland, New Zealand.
Werts, C. E., Linn, R. L., & Jöreskog, K. G. (1974). Intraclass reliability
estimates: Testing structural assumptions. Educational and Psychological Measurement,
34, 25-33.
Wetzels, M., Odekerken-Schröder, G., & van Oppen, C. (2009). Using PLS path
modeling for assessing hierarchical construct models: Guidelines and empirical
illustration. MIS Quarterly, 33 (1), 177-195.
Widaman, K. F., & Reise, S. P. (1997). Exploring the measurement invariance of
psychological instruments: Applications in the substance use domain. In K. J. Bryant, M.
Windle, & S. G. West (Eds.), The science of prevention: Methodological advances from
alcohol and substance abuse research (pp. 281–324). Washington, DC: American
Psychological Association.
Williams, L. J., Cote, J. A., & Buckley, M. R. (1989). Lack of method variance
in self-reported affect and perceptions at work: Reality or artifact? Journal of Applied
Psychology, 74(3), 462-468.
Williams, L. J., Hartman, N., & Cavazotte, F. (2010). Method
Variance and Marker Variables: A Review and Comprehensive CFA Marker Technique.
Organizational Research Methods, 13 (3), 477-514.
Wilson, E. B. (1928a) Review of 'The abilities of man, their nature and
measurement' by C. Spearman. Science, 67, 244-248.
Wilson, E. B. (1928b). On hierarchical correlation systems. Proceedings of the
National Academy of Sciences, 14, 283-291.
Wilson, E. B. (1929). Review of 'Crossroads in the mind of man: A study of
differentiable mental abilities' by T. L. Kelley. Journal of General Psychology, 2, 153-
169.
Wilson, E. B., & Worcester, J. (1939). A note on factor analysis. Psychometrika,
4, 133-148.
Winkler, J. D., Kanouse, D. E., & Ware, J. E., Jr. (1982). Controlling for
acquiescence response set in scale development. Journal of Applied Psychology, 67(5),
555-561.
Wold, H. (1979). Model construction and evaluation when theoretical knowledge
is scarce: Theory and application of partial least squares. In J. Kmenta & J. B. Ramsey
(Eds.), Evaluation of econometric models (pp. 47-74). New York: Academic.
Wold, H. (1982). Soft modeling: The basic design and some extensions. In
K. G. Jöreskog & H. Wold (Eds.), Systems Under Indirect Observations: Part II
(pp. 1–54). Amsterdam: North-Holland.
Wolfe, A. W. (1966). Factor analysis to 1940. Psychometric Monograph, 3.
Woodhouse, B., & Jackson, P. (1977). Lower bounds for the reliability of the
total score on a test composed of non-homogeneous items: H: A search procedure to
locate the greatest lower bound. Psychometrika, 42(4), 579-591.
Wright, Sewall. (1920). The relative importance of heredity and environment in
determining the piebald pattern of guinea-pigs. Proceedings of the National Academy of
Sciences, 6, 320-332.
Wynne-Jones, G., Buck, R., Varnava, C.J., & Main, C. (2011). Impacts on work
performance; what matters 6 months on? Occupational Medicine, 61, 205-208.
Wynne-Jones, G., Varnava, A., Buck, R., Karanika-Murray, M., Griffiths, A.,
Phillips, C., & Main, C.J. (2009). Examination of the work organisation assessment
questionnaire in public sector workers. Journal of Occupational & Environmental
Medicine, 51(5), 586-593.
Yang, Y., & Green, S. B. (2011). Coefficient alpha: A reliability coefficient for
the 21st century. Journal of Psychoeducational Assessment, 29, 377-392.
Yung, Y. F., Thissen, D., & McLeod, L. D. (1999). On the relationship between
the higher-order factor model and the hierarchical factor model. Psychometrika, 64,
113–128.
Zeller, R. A., & Carmines, E. G. (1980). Measurement in the Social Sciences: The
Link Between Theory and Data. Cambridge University Press.
Zellner, A., and Theil, H. (1962). Three-Stage Least Squares: Simultaneous
Estimation of Simultaneous Equations. Econometrica, 30, 54-78.
Zellner, A. (1962). An Efficient Method of Estimating Seemingly Unrelated
Regressions and Tests of Aggregation Bias. Journal of the American Statistical
Association, 57, 348-68.
Zimmerman, D. W. (1972). Test reliability and the Kuder-Richardson formulas:
Derivation from probability theory. Educational and Psychological Measurement, 32,
939-954.
Zinbarg, R. E., Revelle, W., Yovel, I., & Li, W. (2005). Cronbach’s α, Revelle’s
β, and McDonald’s ωh: Their relations with each other and two alternative
conceptualizations of reliability. Psychometrika, 70, 123-133.