Page 1: PERSONALITY, FAKING AND THE ABILITY TO IDENTIFY CRITERIA

PERSONALITY, FAKING AND THE ABILITY TO IDENTIFY CRITERIA:

CAN FORCED CHOICE FORMATS UNTANGLE THEIR RELATIONSHIPS?

by

LI GUAN

(Under the Direction of Nathan T. Carter and Gary J. Lautenschlager)

ABSTRACT

Although personality testing has been used for many years in the workplace because of

its ability to reduce adverse impact and provide incremental validity for predicting job

performance, the impact of intentional response distortion (i.e., faking) is always considered due

to the fact that faking alters the internal validity of the test and misrepresents rank ordering. In

fact, the factorial structure in personality tests may change under different administration

conditions (e.g., honest condition vs. job applicant condition) (Schmit & Ryan, 1993). Previous

research (Kleinmann, Ingold, Lievens, Jansen, Melchers & Konig, 2011) indicates that to the

extent job applicants are able to recognize the relevant schema of an evaluative situation (i.e.,

ability to identify criteria; ATIC) they are capable of faking their responses in a manner that

increases their chance of being hired. A within-subjects design study was conducted and

suggested that faking effect of the ATIC was reduced by the forced-choice measure when the

single-stimulus measure failed. Mixed results of the FC format measure were found and

suggestions were made for future research.

INDEX WORDS: personality, faking, response distortion, ability to identify criteria (ATIC),

testing format, forced-choice

PERSONALITY, FAKING AND THE ABILITY TO IDENTIFY CRITERIA:

CAN FORCED-CHOICE FORMATS UNTANGLE THEIR RELATIONSHIPS?

by

LI GUAN

Bachelor of Science, University of Illinois Urbana-Champaign, 2012

A Thesis Submitted to the Graduate Faculty of the University of Georgia in Partial Fulfillment of

the Requirements for the Degree

ATHENS, GEORGIA

2015

© 2015

Li Guan

All Rights Reserved

PERSONALITY, FAKING AND THE ABILITY TO IDENTIFY CRITERIA:

CAN FORCED CHOICE FORMATS UNTANGLE THEIR RELATIONSHIPS?

by

LI GUAN

Major Professor: Gary J. Lautenschlager

Committee: Brian J. Hoffman

Gary J. Lautenschlager

Nathan T. Carter

Robert Mahan

Electronic Version Approved:

Julie Coffield

Interim Dean of the Graduate School

The University of Georgia

May 2015

To my mom and dad

ACKNOWLEDGEMENTS

First, I would like to thank the members of my thesis committee, Nathan Carter, Gary

Lautenschlager, Brian Hoffman and Robert Mahan. Special thanks go to Nathan and Gary, who

directed me and helped me greatly to make this research project happen.

Also, I would like to thank my dad, who encouraged me to pursue a degree in the U.S. and

has supported my education for the past eight years. He has always been a great role model to me.

I also want to thank Dr. Fritz Drasgow, who helped me with my undergraduate thesis,

and opened the door to the IRT world for me.

TABLE OF CONTENTS

Page

ACKNOWLEDGEMENTS .............................................................................................................v

LIST OF TABLES ......................................................................................................................... ix

LIST OF FIGURES .........................................................................................................................x

CHAPTER

1 INTRODUCTION .........................................................................................................1

2 IDEAL EMPLOYEE FACTOR AND ABILITY TO IDENTIFY CRITERIA.............6

3 DOES THE MEASUREMENT FORMAT MATTER TO REDUCE FAKING

EFFECTS OF THE ATIC? ..........................................................................................10

The appropriateness of using forced-choice personality items ..............................10

Why multidimensional forced-choice format? ......................................................12

4 MEASUREMENT METHODS AND ASSUMPTIONS ............................................15

Measurement methods: ipsative scoring and partially ipsative scoring ................15

Scoring assumptions: dominance and ideal point ..................................................17

Ideal point model to score single-stimulus measure ..............................................20

Ideal point model to score multidimensional forced-choice measure ...................20

5 METHODS ..................................................................................................................25

Participants .............................................................................................................25

Measures ................................................................................................................25

Procedures ..............................................................................................................28

6 DATA ANALYSES.....................................................................................................30

Unidimensionality and model fit ...........................................................................30

Item and person parameter estimation ...................................................................30

Assessing the ideal-employee factor ......................................................................31

Assessing the ability to identify criteria ................................................................32

Assessing construct validity ..................................................................................33

Participant reaction towards single-stimulus and forced-choice format testing ...34

7 RESULTS ....................................................................................................................35

Unidimensionality and model fit ...........................................................................35

Item and person parameter estimation ...................................................................35

Assessing the ideal-employee factor ......................................................................36

Assessing the ability to identify criteria ................................................................37

Assessing construct validity ..................................................................................37

Participant reaction towards single-stimulus and forced-choice format testing ...40

8 DISCUSSIONS ............................................................................................................42

Conclusions regarding validity related issues ........................................................42

Conclusions regarding testing formats...................................................................45

Remaining concerns with the FC measure and future research directions ............46

Contributions to the field .......................................................................................48

Limitations of the current study .............................................................................50

REFERENCES ..............................................................................................................................52

APPENDICES

A SINGLE-STIMULUS MEASURE ..............................................................................61

B FORCED-CHOICE MEASURE ..............................................................................................................62

C JOB FLYER ................................................................................................................63

D ABILITY TO IDENTIFY CRITERIA MEASURE..................................................64

E PARTICIPANTS’ REACTION MEASURES ............................................................65

LIST OF TABLES

Page

Table 1: Descriptive Statistics and Coefficient Alpha of Each Scale……………………………66

Table 2: Factor Loadings of Each Single-Stimulus Scale of the Honest and Applicant Conditions .

……………..……………………………………………………………………………………..67

Table 3: Model-Data Fit Statistics of the Single-Stimulus Personality Measure under Both the

Honest and the Applicant Conditions……………………………………………………… ……68

Table 4: Item Parameters for the SS Measure under the Honest and Applicant Conditions…….69

Table 5: Item Parameters of the FC Measure under the Honest Condition……………………...71

Table 6: Item Parameters of the FC Measure under the Applicant Condition…………………...72

Table 7: Goodness-of-Fit Indices for the Ideal-Employee Confirmatory Factor Analyses Models

Tested…...………………………………………………………………………………………..73

Table 8: Goodness-of-Fit Indices for the Structural Equation Models Tested…………………..74

Table 9: Correlation of the Honest and the Applicant Conditions in the Single-Stimulus

Measure…………………………………………………………………………………………..75

Table 10: Correlation of the Honest and the Applicant Conditions in the Forced-Choice

Measure…………………………………………………………………………………………..76

Table 11: Correlations of the Trait Estimates Obtained from Single-Stimulus and Forced-Choice

Measures under the Honest and the Applicant Conditions………………………………77

Table 12: Correlations of the ATIC with Each Trait Estimated between Two Testing Formats

across the Honest and the Applicant Conditions………………………………….......................78

Table 13: Descriptive Statistics for Reaction Measures…………………………………………79

Table 14: ANOVA Analyses that Compare the Effect of Testing Formats on Five Reaction

Measure Scales………………………………………..…………………………………………80

LIST OF FIGURES

Page

Figure 1: Example of a Multidimensional Forced-Choice Item…………………………………81

Figure 2: Example of a Multidimensional Pairwise Preference Item Representing Order and

Self-Control………………………………………………………………………………………82

Figure 3: Example of the Dominance Response Process ………………………………………..83

Figure 4: Example of the Ideal Point Response Process…………………………………………84

Figure 5: A Hypothetical Item Response Surface of a MUPP Item……………………………..85

Figure 6: Model 2...........................................................................................................................86

CHAPTER 1

INTRODUCTION

Personality testing is used frequently in the workplace for various purposes, such as

personnel selection and placement, because of its ability to provide incremental validity for

predicting job performance over other predictors (Schmidt & Hunter, 1998) and to reduce adverse

impact (Ryan, Ployhart & Friedel, 1998). According to the National Broadcasting Corporation,

approximately one-third of employers currently use personality tests for hiring and promotion

purposes, and pre-hire testing has grown by approximately 20% annually in the past few years

and continues to grow like a “wildfire” (Tahmincioglu, 2011). Despite its wide acceptance, there

are ongoing discussions regarding the credibility of personality tests (Morgeson, Campion,

Hollenbeck, Murphy & Schmitt, 1997). In particular, many researchers have expressed concern

over potential response distortions by applicants that result in overestimation of desirable

personality traits for applicants who distort their responses (McFarland & Ryan, 2000). Research

has also shown that job applicants frequently lie on their pre-hire personality tests to get

jobs even though they are not qualified to hold the positions (Ross, 1998, Los Angeles Times).

Response distortions in personality testing usually occur under two circumstances.

Carelessness, disinterest, mood changes, overconfidence or changes in the depth of cognitive

processing about the self lead to unintentional response distortion (Dunning, Griffin, Milojkovic

& Ross, 1990), whereas intentional response distortion (i.e., faking) occurs when respondents

tend to either "fake bad" to obtain resources such as disability compensation in clinical settings

or "fake good" to make good impressions or to hide their sensitive personal information from

others in organizational settings (Richman, Kiesler, Weisband & Drasgow, 1999). Many current

personality testing studies in an applicant context (high-stakes) focus on intentional response

distortion (i.e., faking good), which is a tendency of job applicants to answer questions in a more

socially desirable direction under some administrations than they would under administrations

with no tangible outcome (Richman, Kiesler, Weisband & Drasgow, 1999).

Most studies have shown that intentional response distortion adversely impacts hiring

decisions. One of the first faking studies was conducted by Wesman (1952), who detected score

variations across different simulated employment situations and found that the distorted scores

appeared more favorable to an employer. Later studies indicated that job applicants inflated

test scores by one-half standard deviation (Ones, Viswesvaran & Korbin, 1995), which changed

the rank ordering, specifically in the upper quartile of the score distribution. These possible

response distortions have dramatic impacts on hiring decisions (e.g., Jackson, Wroblewski

and Ashton, 2000). In practice, one bad hiring decision could easily cost a company $40,000

(Goltz, 2011, the New York Times). Therefore, it is worthwhile for organizations to understand

intentional response distortion better in order to prevent such unnecessary costs, and a better

understanding would allow applied psychologists to serve practitioners with more confidence.

Personality tests in an organizational setting are supposed to measure respondents’

standing on job-relevant personality traits, indicating that the construct validity of personnel

selection procedures is based on job-related knowledge, skills, abilities and personal

characteristics (Kleinmann, Ingold, Lievens, Jansen, Melchers & Konig, 2011). However, Guan,

Carter, Tryba and Griffith (2014) found that correlations between honest and applicant conditions

of the same trait dropped drastically when comparing the correlation of the full sample with the

correlation of the top 100 applicants. For example, the correlation for the trait of Conscientiousness

between the honest and applicant conditions was .585 for the full sample and .060 for the top 100

applicants. Theoretically, the scores from the two conditions should be highly correlated with each

other if job applicants do not engage in response distortion. However, the .060 correlation on

the trait of Conscientiousness clearly indicates that the scores for the top 100 applicants in the

applicant condition are not consistent with their scores in the honest condition. In real selection

situations, although these top 100 applicants may simply be the ones who were able to fake the most,

they are the most likely to be hired or advanced to the next stage of the selection process based on

their high scores. Research suggests that the factorial structure of personality tests may change under

different administration conditions (Schmit & Ryan, 1993). In this case, recruiters may not

even know what selection criteria are actually being used to make hiring decisions when applicants

fake, which seriously calls into question the construct-related validity of personality tests in hiring

scenarios and raises the question: What do personality tests actually measure in job application

settings?
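
The contrast between a full-sample correlation and a top-scorer correlation can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the analysis used by Guan et al. (2014): honest scores are drawn from a standard normal distribution, each simulated applicant adds a hypothetical, person-specific faking increment, and the function names and parameter values are introduced here purely for illustration.

```python
import random
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulate_honest_vs_applicant(n=1000, top_k=100, seed=0):
    """Return (full-sample r, top-k r) between honest and applicant scores."""
    rng = random.Random(seed)
    # Assumption: honest trait scores are standard normal.
    honest = [rng.gauss(0, 1) for _ in range(n)]
    # Assumption: faking adds a non-negative amount that differs by person.
    applicant = [h + abs(rng.gauss(0, 1)) for h in honest]
    full_r = pearson_r(honest, applicant)
    # Select the top_k simulated applicants on the (possibly faked) score.
    top = sorted(range(n), key=lambda i: applicant[i], reverse=True)[:top_k]
    top_r = pearson_r([honest[i] for i in top], [applicant[i] for i in top])
    return full_r, top_r


full_r, top_r = simulate_honest_vs_applicant()
print(f"full sample r = {full_r:.3f}, top 100 r = {top_r:.3f}")
```

Because the top scorers are selected on the applicant score itself, the selected group in this sketch tends to show a weaker honest-applicant correlation than the full sample. The size of the drop depends entirely on the assumed faking distribution, so the output should not be read as an estimate of the values reported above.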

Little is known about what is actually being measured when job applicants fake their

responses to personality tests. Kleinmann et al. (2011) introduced the concept of ability to

identify criteria (ATIC), which represents the ability level of an individual to correctly identify

the relevant schema in an evaluative situation. Individuals high in ATIC are expected to be able to

distort their responses in line with the personality profile desired by employers and increase their

chances of being hired. Kleinmann et al. (2011) also indicate that the ATIC may be one type of

individual difference that is measured indirectly in the job applicant setting. In practice, the

ATIC relates to social intelligence in predicting future job performance (Viswesvaran & Ones,

1999). The ATIC is positively related to self-reported social skills (Schollaert & Lievens, 2008),

which are positively related to impression management (Klehe, Kleinmann, Hartstein, Melchers,

König, Heslin & Lievens, 2012); the ATIC also shows a modest correlation with general mental

ability (Melchers et al., 2009).

In this study, we first propose that in job application scenarios, self-report personality tests

may be contaminated with the ATIC due to the ability of high-ATIC persons to identify the

desired responses on the personality tests. Second, we explore whether a

potentially more appropriate testing format, namely the forced-choice (FC) format, can be used

to more accurately tap personality traits in such scenarios. Currently, most personality tests are

designed using a Likert-type response format in which one item is evaluated at a time

and the option (e.g., Totally Disagree, Disagree, Neutral, Agree and Totally Agree) that is closest

to respondents’ standing on the trait continuum is endorsed (Brown & Maydeu-Olivares,

2013). However, job applicants are able to recognize the favorable option on the Likert scale

and relate their responses to the evaluative theme. Further, those with high ATIC should be able

to distort their answers better than others. In contrast, the FC format is designed to be a

remedy for intentional response distortion because it forces individuals to make comparative

judgments among several statements with similar social desirability, so that job applicants are

hard pressed to recognize the most favorable one among the options. Thus, FC format

testing is thought to be more faking resistant (McCloy, Heggestad & Reeve, 2005). An example

of an FC item can be found in Figure 1, in which all four statements are quite

favorable to endorse. In this case, the FC format may be a better format to use in personality testing,

especially in job applicant settings.

This study is the first attempt to reduce the ATIC effects using the FC format. Overall,

the study presents two major contributions to the field. First, the study indicates whether or not

the ATIC is actually being measured from the Likert format personality test in job applicant

settings. Second, the study introduces the FC format to the ATIC literature for the first time

because of its resistant-to-faking nature. Therefore, both researchers and practitioners could

benefit from this research by measuring personality traits as they are intended to be

measured. A more detailed discussion of past research and the proposed hypotheses is

presented in the following sections.

CHAPTER 2

IDEAL EMPLOYEE FACTOR AND ABILITY TO IDENTIFY CRITERIA (ATIC)

A latent variable called the ideal-employee factor has been detected in various faking

studies (e.g., Schmit & Ryan, 1993; Pauls & Crost, 2005), and serves as a function of the ATIC. The

ideal-employee factor in general is a positive factor (e.g., thoughtful, considerate, active,

self-controlled, etc.) that usually loads across different desirable personality dimensions, and these

positive factors can be easily identified by respondents (Schmit & Ryan, 1993). Klehe and

colleagues (2012) used a dataset containing multiple simulated applicant conditions to

compare a traditional measurement model that underlies the Big Five personality traits with

another model that duplicated the first model but added an extra latent variable: the

ideal-employee factor. Model-data fit results of confirmatory factor analysis (CFA) showed that the

model with the ideal-employee factor was superior to the one without the ideal-employee factor.

Therefore, they concluded that the ideal-employee factor is a hidden factor that is measured by

the personality test in addition to the designated personality traits, especially in job applicant

settings.

More research results reveal the possible existence of the ideal-employee factor. Guan et

al. (2014) conducted a within-subjects study in which participants completed a self-report

personality measure twice under two responding conditions: an honest condition and a job applicant

condition. They found that traits were more highly correlated with each other under the applicant

condition: Agreeableness correlated .583 with Conscientiousness, Agreeableness correlated .521 with

Extraversion, and Conscientiousness correlated .541 with Extraversion. However, these three traits

were less correlated with each other under the honest condition: Agreeableness correlated .321 with

Conscientiousness, Agreeableness correlated .362 with Extraversion and Conscientiousness

correlated .394 with Extraversion. Theoretically, Five Factor Model traits should be uncorrelated

with each other because they are supposed to be a set of five relatively orthogonal traits, and

in practice correlations among those five traits should range from -.20 to .25 (Michel, 2010). In

this case, moderately high correlations among traits under the applicant condition could be an

indicator of the ideal-employee factor, because such a factor may cross-load on the different

personality traits and be recognized by respondents. Therefore, we propose the following

hypotheses:

Hypothesis 1a: The use of an ideal-employee factor will exhibit a good fit to the

personality data under the applicant condition.

Hypothesis 1b: The use of an ideal-employee factor will not exhibit a good fit to the

personality data under the honest condition.
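
The reasoning behind Hypotheses 1a and 1b, that a factor shared across scales raises the correlations among otherwise orthogonal traits, can be sketched numerically. The example below is an illustration under stated assumptions rather than the confirmatory factor analysis tested in this study: trait scores are simulated as independent standard-normal variables, a single hypothetical shared factor is added to every scale with weight `shared_weight`, and all names and values are introduced here for illustration only.

```python
import random
from itertools import combinations
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mean_intertrait_r(shared_weight, n=2000, n_traits=3, seed=1):
    """Average correlation among trait scales that share one common factor."""
    rng = random.Random(seed)
    scales = [[] for _ in range(n_traits)]
    for _ in range(n):
        # Hypothetical shared (ideal-employee-like) factor for this person.
        shared = rng.gauss(0, 1)
        for scale in scales:
            # Each observed score = independent trait + weighted shared factor.
            scale.append(rng.gauss(0, 1) + shared_weight * shared)
    return mean(pearson_r(a, b) for a, b in combinations(scales, 2))


print("no shared factor:    ", round(mean_intertrait_r(0.0), 3))
print("strong shared factor:", round(mean_intertrait_r(1.0), 3))
```

With a weight of zero the scales stay close to uncorrelated, whereas a nonzero weight pushes every pairwise correlation upward. In this simple setup the expected correlation is w² / (1 + w²) for weight w, which follows from the unit variances assumed in the sketch.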

Generally speaking, people are able to control how others perceive them during social

interactions to some extent, especially when strong positive impressions lead to a desired job

offer. Furnham (1990) asked participants to present themselves as ideal candidates for three

different jobs: advertising executive, banker and librarian. He found that participants were

able to successfully create different profiles for each job. Individuals have to identify what

is being assessed and then demonstrate responses and behaviors in line with the

corresponding assumptions of what is being assessed in order to achieve better scores in the

applicant condition (Kleinmann et al., 2011). This ability to identify what is being assessed by

the measurement is called the ability to identify criteria (ATIC). More specifically, the ATIC reflects

the ability level of respondents who are able to correctly capture the relevant schema (i.e.,

ideal-employee factor) in an evaluative situation. Theoretically, the ideal-employee factor in a

self-report measure is positively related to the ATIC and serves as a function of the ATIC

(Klehe et al., 2012). Job applicants are able to conceptualize what an ideal-employee factor for a

given selection situation might look like via their ATIC. Knowles (1988) points out that job

applicants usually inspect each item and compare the inspection to their own constructed

ideal-employee factor by using their ATIC. After the inspection, they usually provide a consistent

self-presentation once their constructed ideal-employee factors are confirmed. More specifically, the

ATIC could be a possible explanation of why high correlations among traits were detected under

the applicant condition in the Guan et al. (2014) study, because respondents were able to

identify the testing theme and tried to present themselves in line with their assumptions through the

testing questions. In sum, we propose the following hypothesis:

Hypothesis 2a: The ideal-employee factor will be positively related to the ATIC under

the applicant condition.

It is worthwhile to understand the ATIC because it bears on the psychometric

properties of psychological tests that are subject to response distortion. For example, the ATIC is

believed to be a remedy for criterion-related validity when a test is compromised by faking-related

behavior. Komar et al.’s (2008) Monte Carlo simulation study investigated the impact of response

distortions and found significant changes in criterion-related validity; the change was as

large as .226. Douglas, McDaniel and Snell (1996) presented consistent findings: their

multitrait-multimethod (MTMM) investigation showed a .22 criterion-related validity change

using Agreeableness and Conscientiousness personality scales. In contrast, Kleinmann et al.

(2011) believe that the ATIC could be a potential complementary explanation for

criterion-related validity in the selection process because perceptions of an individual about what is relevant

to a situation are related to perceivable cues of the situation. The ATIC is related to individuals’

understanding of the job position and job selection procedures, so most likely their

corresponding behaviors will be shown in those selection situations. In sum, I hypothesize that:

Hypothesis 2b: Under the applicant condition, all self-report personality test scores will

be positively related to ATIC.

CHAPTER 3

DOES THE MEASUREMENT FORMAT MATTER TO REDUCE FAKING EFFECTS OF

THE ATIC?

Currently, most personality test items are constructed using a single-stimulus (SS)

self-report format (i.e., a rating scale), in which one item is evaluated at a time and the option that

is closest to respondents’ standing on the trait continuum is endorsed (Brown &

Maydeu-Olivares, 2013). The SS format is widely used in current psychological testing, particularly

for promotion and selection purposes. Nevertheless, respondents are able to distort their

responses because the favorability of each option in the SS measure is easily identified.

Furthermore, individuals may show different tendencies in interpreting and endorsing items, such

as central tendency bias (i.e., the tendency to use the middle of the scale) or extreme

responding (i.e., the tendency to endorse the high or low options in a response scale) (Brown &

Maydeu-Olivares, 2013). Interpretation of SS measures’ results becomes problematic because

different types of distortion are easily involved. Therefore, there has been a resurgence in

research devoted to reducing such distortions using the forced-choice (FC) testing format

because of its resistant-to-faking nature (e.g., Vasilopoulos et al., 2006; Stark & Chernyshenko,

2007; McCloy, Heggestad & Reeve, 2005).

The appropriateness of using forced-choice (FC) personality items

In general, the FC format is designed to be a remedy for response distortions, because it

forces individuals to compare judgments among several statements with similar social

desirability. All options seem positive to endorse, so it is difficult for respondents to identify the

response most likely to raise their chances of being hired. For instance, mean increases were

detected in both the SS and FC measures using a Dependability scale under honest and job

applicant conditions; mean inflation in the SS measure was three times greater than the inflation

in the FC measure (Jackson, Wroblewski and Ashton, 2000). In a between-subject study

conducted by Christiansen, Burns and Montgomery (2005), participants completed both the SS

and FC inventories in honest and instructed-to-fake conditions. FC inventory scores

were higher across traits in the instructed-to-fake condition [d = .43] than in the honest

condition, but the SS inventory scores showed an even larger difference [d = .71] between

conditions. The effect size comparison in this study indicated that the FC measure was superior

to the SS measure in resisting faking. If respondents are motivated to make the best possible

impression, being forced to choose between items with similar social desirability in perceived

relevance to the job tends to reduce impression management (Jackson, Wroblewski and Ashton,

2000). Thus, respondents have to endorse the option that comes closest to their own personality

traits. In particular, I propose the following hypothesis:

Hypothesis 3: The same trait will correlate more highly between conditions (honest condition

versus applicant condition) in the FC measure than in the SS measure, because the

FC format is designed to be more resistant to faking-related behaviors. Moreover, the FC

measure will show lower correlations among traits within conditions compared to the SS

measure.
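
The d values cited in the preceding paragraph are standardized mean differences between conditions. As a minimal sketch of how such an effect size can be computed, the function below uses the pooled-standard-deviation form of Cohen's d for two independent groups. This is an assumption made for illustration: the exact formula used by Christiansen et al. (2005) is not stated here, and the function name and the toy scores are hypothetical.

```python
from statistics import mean, variance


def cohens_d(faking_scores, honest_scores):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    n1, n2 = len(faking_scores), len(honest_scores)
    # Pooled variance weights each group's sample variance by its degrees of freedom.
    pooled_var = (
        (n1 - 1) * variance(faking_scores) + (n2 - 1) * variance(honest_scores)
    ) / (n1 + n2 - 2)
    return (mean(faking_scores) - mean(honest_scores)) / pooled_var ** 0.5


# Toy data (hypothetical): scale scores in a faking and an honest group.
print(cohens_d([2, 3, 4], [1, 2, 3]))
```

A larger d for one format than for another means that scores on that format shifted further, in standard deviation units, when respondents were motivated to fake.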

Because the FC measure is more faking-resistant, job applicants need a

higher level of ATIC to accurately identify the ideal-employee factors from the test. In this case,

it is hard for job applicants to identify the desired responses on the FC measure, which implies

that the FC measure assesses personality traits under both honest and applicant conditions.

However, the SS measure makes it easier for job applicants to identify the ideal-employee factor via

their ATIC, because they are only required to identify the most favorable option on

the rating scale. Thus, the SS measure assesses personality traits in the honest condition, whereas

it measures both the ATIC and the designated personality traits in the applicant condition. In sum, I

propose the following hypotheses:

Hypothesis 4a: The FC measure and SS measure should correlate under the honest condition;

however, these two measures will show a smaller correlation under the applicant condition

due to the ATIC effect.

Hypothesis 4b: The SS measure will correlate more strongly with ATIC than the FC measure will under the applicant condition.

In spite of these positive results, FC format personality tests have received relatively little attention. In applied settings, only one of the fourteen most commonly used personality tests is based on the FC format (Goffin & Christiansen, 2003). Further, little has been published on this promising topic. A search of PsycInfo for the terms “forced choice”, “personality” and “selection” returned only 45 articles. The results of the current study may promote the use of the FC format in personality testing.

Why multidimensional forced-choice format?

The FC measure is designed to be more faking-resistant and may reduce the ATIC effect, but FC measures come in different formats, such as the unidimensional forced-choice format and the multidimensional forced-choice format. A unidimensional forced-choice (unidimensional-FC) item usually consists of multiple statements from one personality dimension (e.g., an FC item with two statements is a dyad, with three a triad, and with four a tetrad). For example, a “triad” unidimensional-FC item may assess one personality dimension with three statements that respectively reflect low, moderate and high standing on one trait continuum (McCloy, Heggestad & Reeve, 2005). Respondents are required to select the one statement that is most reflective of their own behaviors, feelings, or thoughts. The process of endorsing a unidimensional-FC item is similar to endorsing an SS item: a respondent is most likely able to recognize the most desirable statement in a unidimensional-FC item, just as he or she can recognize the most desirable option of an SS item, and is most likely to endorse it. Therefore, the unidimensional-FC format will not be used in the current study because of its similarity to the SS format measure.

A multidimensional forced-choice (multidimensional-FC) item can be more complicated because it contains two or more statements from multiple personality dimensions. Multidimensional-FC items require respondents either to indicate the most/least reflective statement or to rank order the statements with respect to their own behavior (McCloy, Heggestad & Reeve, 2005). The multidimensional-FC format is considered superior to the unidimensional-FC format because all statements in a multidimensional-FC item are constructed to have similar social desirability, so that the possibility of endorsing a statement based on social desirability is reduced. An example of a multidimensional-FC item can be seen in Figure 1, where all four statements come from different personality dimensions and are all quite favorable to endorse.

There are two types of multidimensional-FC items. A multistatement multidimensional-FC item usually involves more than two statements from different personality dimensions, and the multidimensional pairwise preference (MUPP) item is considered to be a special case of the multistatement multidimensional-FC item. Chernyshenko and his colleagues (2009) indicate that a multistatement multidimensional-FC item requires a more complex judgment process that may increase the cognitive load on respondents and lead to more errors, so scores may be less reliable. Moreover, a complex model is required to link item and person parameters. However, there is no proper model to evaluate the quality of a multistatement multidimensional-FC item at this time, and therefore multistatement multidimensional-FC items will not be used in the current study. In contrast, a MUPP item seems to be a relatively better format because of its simplicity: it is composed of two statements with a similar social desirability level, each representing one personality dimension. The MUPP item is even simpler to endorse than the SS item because it does not require judgments concerning degrees of assent (Böckenholt, 2004). It is considered to be one of the most faking-resistant testing formats so far. An example of a MUPP item can be found in Figure 2, which represents the personality dimensions of Order and Self-Control (Chernyshenko et al., 2009). Both statements in this example appear to be quite favorable to respondents, so they are “forced” to choose the one that reflects themselves the most. In sum, the MUPP is superior to other types of FC formats, so MUPP items were used in the FC format measure for the current study.

CHAPTER 4

MEASUREMENT METHODS AND ASSUMPTIONS

Measurement Methods: Ipsative Scoring and Partially Ipsative Scoring

Measurement issues largely determine which rating scale format can be used in a personnel selection setting. An assessment for selection purposes usually requires interindividual comparison, which is supported by normative data (Hicks, 1970): a higher score reflects a higher standing on the trait continuum, so recruiters can make selection decisions based on the corresponding rank ordering information. Most organizations use the traditional measurement approach to make selection decisions (i.e., the best job candidate may have the highest trait sum score); meanwhile, item scoring based on a more robust method, item response theory (IRT), has been developed. This section first explains the different measurement methods (i.e., the ipsative approach and the partially ipsative approach) and their effectiveness for the FC format, as well as the measurement approach used in the current study.

The ipsative measurement method used in multidimensional-FC provides intraindividual comparison when fixed points are distributed to each construct and the total score on the instrument is constant for all respondents (McCloy, Heggestad & Reeve, 2005). For example, suppose person A and person B each select the most self-descriptive statement from one FC item that consists of two statements, one from each of two personality dimensions: Conscientiousness and Extraversion. Person A endorses the Conscientiousness statement and person B endorses the Extraversion statement. I can easily conclude that person A is relatively high on Conscientiousness, whereas person B is relatively high on Extraversion. However, I cannot conclude who has the higher Conscientiousness level on the trait continuum. Suppose that person A is at the 50th percentile on Conscientiousness and the 10th percentile on Extraversion, and person B is at the 60th percentile on Conscientiousness and the 80th percentile on Extraversion. In this circumstance, even though person A endorses the Conscientiousness statement, he/she still has a lower Conscientiousness level than person B. However, if person A were at the 70th percentile on Conscientiousness, then he/she would have a higher Conscientiousness level than person B. The ipsative measurement method is therefore limited: it only indicates the relative standing on the trait continuum, with the absolute standing unknown. Thus, this method might not be useful in a personnel selection scenario because normative information is required for selection purposes (e.g., Hicks, 1970; Chernyshenko, 2009).
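The percentile example above can be made concrete with a short sketch. The code is illustrative only: the `people` dictionary reuses the hypothetical percentiles from the text, and the rule that a respondent endorses the statement for his or her relatively highest trait is an assumption introduced for this sketch, not a scoring procedure from the study.

```python
# Hypothetical percentile standings taken from the worked example in the text.
people = {
    "A": {"Conscientiousness": 50, "Extraversion": 10},
    "B": {"Conscientiousness": 60, "Extraversion": 80},
}


def endorsed_statement(traits):
    """Assumption: the respondent picks the statement of the relatively highest trait."""
    return max(traits, key=traits.get)


def ipsative_scores(traits):
    """One point goes to the endorsed dimension, so every person's total equals 1."""
    choice = endorsed_statement(traits)
    return {dimension: int(dimension == choice) for dimension in traits}


scores = {name: ipsative_scores(traits) for name, traits in people.items()}

for name, person_scores in scores.items():
    print(name, person_scores, "total =", sum(person_scores.values()))

# The ipsative Conscientiousness score ranks A above B, although B's
# (normative) percentile on Conscientiousness is higher than A's.
print(scores["A"]["Conscientiousness"] > scores["B"]["Conscientiousness"])
print(people["A"]["Conscientiousness"] < people["B"]["Conscientiousness"])
```

Because every respondent receives the same total, the scores only order traits within a person; the between-person ordering on Conscientiousness is reversed in this example.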

The FC measure does not always result in ipsative properties; for example, the partially ipsative measurement method allows interindividual comparison. Measures that “allow total score variability but maintain the property that score elevation on one scale produces score depression on another scale” are called partially ipsative (Hicks, 1970). Whether a partially ipsative measurement approach provides normative scores largely depends on how the statements composing an item are constructed and scored (McCloy, Heggestad & Reeve, 2005). Hicks (1970) listed seven criteria for partially ipsative measures: items are only partially ordered; scales differ in number of items; not all alternatives ranked by respondents are scored; scales are scored differently for respondents with different characteristics; scored alternatives are weighted differently; the test contains normative sections; and no ipsative predictor set scale is included in the analysis. Thus, partially ipsative scoring is a superior measurement method for the FC measure; moreover, it provides better predictive validity (e.g., White & Young, 1998; Jackson et al., 2000). The partially ipsative scoring method takes on characteristics of both ipsative and normative scores, so recent researchers tend to adopt this measurement method in their research designs (e.g., Heggestad, Morrison, Reeve and McCloy, 2006). For instance, Jackson and his colleagues (2000) required participants to select two of four statements, which results in a partially ipsative measurement. Because of these advantages, I employed the partially ipsative measurement method in the current study.

Scoring Assumptions: Dominance and Ideal Point

It is also imperative to know which response assumption the test developer employs in order to select an appropriate model and score responses accurately; doing so is also necessary to demonstrate the relative and absolute accuracy of scale scores (Stark, Chernyshenko & Drasgow, 2005). Both Likert (1932) and Thurstone (1927, 1928) developed methods of scale construction and scoring in the context of research on attitudes, and their methods have been widely applied in personality assessment.

The Likert (1932) scale, which is based on classical test theory, rests on the dominance assumption. Usually, a large pool of homogeneous items is first developed and administered to a group of individuals, who indicate their level of agreement on a 1 to 5 scale (i.e., 1=Totally Disagree, 2=Disagree, 3=Moderate, 4=Agree, and 5=Totally Agree). After reverse coding of negatively worded items, the scale is created from items that show high item-total correlations (Chernyshenko, Stark, Drasgow & Roberts, 2007). Items with low item-total correlations are removed from the item pool because they are assumed to discriminate poorly among individuals. Respondents tend to endorse an item positively when their standing on the trait continuum is higher than the item’s location, and negatively when their standing is lower. Most current personality tests apply this scoring assumption, which presumes a monotonically increasing relationship between the trait level (θ) and the probability of endorsement that can be represented by an item response curve (IRC). An example of an IRC that follows the dominance assumption is shown in Figure 3, indicating that the higher the standing on the trait, the higher the probability of endorsing the item.
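A monotonically increasing IRC of this kind can be sketched numerically. The text does not name a specific dominance model, so the two-parameter logistic form used below, the function name `dominance_irc`, and the parameter values are assumptions chosen only for illustration.

```python
import math


def dominance_irc(theta, a=1.5, b=0.0):
    """Two-parameter logistic curve (assumed form): P(endorse) rises with theta.

    a is a hypothetical discrimination parameter, b a hypothetical item location.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


thetas = [-3, -2, -1, 0, 1, 2, 3]
probs = [dominance_irc(theta) for theta in thetas]

for theta, prob in zip(thetas, probs):
    print(f"theta = {theta:+d}  P(endorse) = {prob:.3f}")
```

Under this assumed form the probability of endorsement never decreases as θ increases, which is the defining property of the dominance assumption.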

Recent research suggests that the Likert scale can be problematic because of its lack of intermediate items (e.g., Cao, Drasgow & Cho, 2014): a scale with only extreme items is less likely to capture a respondent’s ideal point standing on the trait continuum. For example, a respondent with an extremely high level of Extraversion is unlikely to disagree with an item such as “I am the life of party.” Items in an assessment need to have greater variability in order to distinguish individuals’ standing on the trait continuum; however, a scale that retains only extreme items may underrepresent such variability, especially since intermediate items can accurately differentiate moderately high and high levels of personality traits (e.g., Carter et al., 2014; Chernyshenko et al., 2007).

Thurstone (1927) first proposed the law of comparative judgment, which holds that responding to a psychological measure involves comparing a series of stimuli on the same trait continuum. The ideal point response assumption applies to this law because respondents tend to endorse an item only when it is close to their standing on the trait continuum. For example, respondents could reject an ideal point item such as “I tend to do just enough work to get by” for two reasons: their standing on the trait continuum is either too far above the item (i.e., they tend to work extremely hard on every task) or too far below it (i.e., they tend not to work at all). This comparison process leads to a “bell-shaped” relationship between the trait level (θ) and the probability of endorsement, because respondents reject the item when it does not match their trait level and endorse it only when it is close to their own standing. An example of such an IRC is shown in Figure 4; the peak of the curve represents the trait level measured by the item.

From a psychometric perspective, recent research suggests that the ideal point assumption overcomes some shortcomings of the dominance assumption and is a better basis for constructing scales and obtaining accurate results from psychological testing. For example, Drasgow, Chernyshenko and Stark (2010) note that under the ideal point assumption scores can be calculated as the mean of the endorsed item locations, whereas statistical strategies used under the dominance assumption, such as sum scores, item-total correlations and factor analytic methodology, can be misleading; in particular, the item-total correlation is not a precise indicator of item quality when intermediate items are present in the item pool. Furthermore, Carter and his colleagues (2014) indicate that scoring under the ideal point assumption captured the curvilinear relationship between Conscientiousness and job performance 100% of the time, whereas sum scoring under the dominance assumption failed to capture such relationships. In another study, Carter, Guan, Williamson, Maples and Miller (2014) compared dominance and ideal point models for scaling Conscientiousness to examine its relationship to work- and health-related outcomes (e.g., self-esteem, job satisfaction). Their results suggest that curvilinear relationships between Conscientiousness and those outcomes can be captured; in particular, the ideal point model captured such relationships 100% of the time and produced more meaningful results than the sum score when the dominance model failed to capture them.
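The contrast between the two scoring strategies can be illustrated with a toy calculation. The item locations and response patterns below are hypothetical, and the helper names are introduced only for this sketch; the ideal point score follows the description above (the mean location of the endorsed items), and the sum score simply counts endorsements.

```python
# Hypothetical item locations on the trait continuum, including intermediate items.
item_locations = [-2.0, -1.0, 0.0, 1.0, 2.0]


def ideal_point_score(responses, locations):
    """Mean location of the endorsed items (1 = endorsed, 0 = not endorsed)."""
    endorsed = [loc for resp, loc in zip(responses, locations) if resp == 1]
    if not endorsed:
        return None  # no item endorsed, so no location information is available
    return sum(endorsed) / len(endorsed)


def sum_score(responses):
    """Dominance-style sum score: the number of endorsed items."""
    return sum(responses)


# A moderate respondent endorses the items near the middle of the continuum;
# a high respondent endorses only the items near the top.
moderate = [0, 1, 1, 1, 0]
high = [0, 0, 0, 1, 1]

print(ideal_point_score(moderate, item_locations), sum_score(moderate))
print(ideal_point_score(high, item_locations), sum_score(high))
```

In this constructed case the sum score places the moderate respondent above the high respondent, while the mean of the endorsed locations preserves the intended ordering.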

To summarize, the ideal point model tends to be a better assumption for constructing and scoring personality items, because a close match between item and person is the fundamental requirement for endorsing personality items in reality, especially in job applicant settings. Under the dominance assumption, respondents tend to endorse an item that has a greater mismatch with their standing on the trait continuum, whereas under the ideal point assumption respondents tend to endorse an item that better matches their standing on the trait continuum (Stark et al., 2006). Therefore, both the SS and FC format measures were scored under the ideal point assumption in this study.

Ideal point model to score single-stimulus measure

The Generalized Graded Unfolding Model (GGUM; Roberts, Donoghue & Laughlin, 2000) is a relatively recently developed IRT model that is based on the ideal point response assumption. Because all items in the SS measures of the current study are rated from 1 to 6 (i.e., 1=Totally Disagree, 2=Disagree, 3=Slightly Disagree, 4=Slightly Agree, 5=Agree, and 6= Totally Agree), the probabilistic function of the GGUM for person j endorsing item i in the polytomous case was applied:

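$$
P[U_i = 1 \mid \theta_j] = \frac{\exp(\alpha_i[(\theta_j - \delta_i) - \tau_{i1}]) + \exp(\alpha_i[2(\theta_j - \delta_i) - \tau_{i1}])}{1 + \exp(\alpha_i[3(\theta_j - \delta_i)]) + \exp(\alpha_i[(\theta_j - \delta_i) - \tau_{i1}]) + \exp(\alpha_i[2(\theta_j - \delta_i) - \tau_{i1}])} \tag{1}
$$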

where θj denotes the location of respondent j on the latent dimension underlying responses, δi

represents the location of item i on the latent continuum, αi refers to the discrimination parameter

for item i, and τi1 indicates the location of the subjective response category threshold on the

latent continuum. The probability of endorsing each option of an item can be obtained from the item location parameter, the discrimination parameter and the locations of the subjective response category thresholds.
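As a rough numerical check of the single-peaked shape implied by Equation (1), the sketch below evaluates the endorsement probability in the two-category form written out there. The function name `ggum_endorse` and the parameter values (α = 1.0, δ = 0.0, τ = -1.0) are hypothetical and are not estimates from this study.

```python
import math


def ggum_endorse(theta, alpha, delta, tau):
    """Endorsement probability in the two-category form of Equation (1)."""
    distance = theta - delta
    agree = math.exp(alpha * (distance - tau)) + math.exp(alpha * (2 * distance - tau))
    disagree = 1.0 + math.exp(alpha * 3 * distance)
    return agree / (agree + disagree)


# Hypothetical item parameters chosen only for illustration.
alpha, delta, tau = 1.0, 0.0, -1.0

for theta in [-3, -2, -1, 0, 1, 2, 3]:
    print(f"theta = {theta:+d}  P(endorse) = {ggum_endorse(theta, alpha, delta, tau):.3f}")
```

With these values the probability is highest where θ equals δ and falls off symmetrically on both sides, in contrast to a monotonic dominance curve.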

Ideal point model to score multidimensional forced-choice measure

Previous research suggests that the IRT approach can produce partially ipsative scores and is an efficient method (e.g., Heggestad, Morrison, Reeve and McCloy, 2006; Stark, 2005; Brown and Maydeu-Olivares, 2013), and there has therefore been growing interest in using IRT as a new scoring method for the FC measure. The probability of endorsing a statement depends on the individual’s trait level (θ) as well as the model chosen to characterize the response process (Stark, 2005). Coombs (1950) indicates that rank-order responses to a unidimensional-FC item follow the ideal point model, and researchers later extended this idea to the multidimensional ideal point model (e.g., Hays & Bennett, 1961), demonstrating that responses to a multidimensional-FC item can be captured by the ideal point model.

Currently, three models have been proposed to capture the partially ipsative property of an FC measure. Brown and Maydeu-Olivares (2011) proposed a Thurstonian IRT model that uses factor analytic methodologies to derive item and person scores from the multidimensional-FC measure. This model was not used in the current study because the underlying assumption of this procedure is based on the dominance model. McCloy, Heggestad and Reeve (2005) proposed a multidimensional-FC model that retrieves normative information from the test based on the ideal point assumption; this model was also not applied in the current study because there is no well-established guideline for obtaining item/person parameters with it. Stark (2005) proposed a multi-unidimensional pairwise preference (MUPP) model, and this scoring method has been shown to support accurate trait score recovery in simulation studies (Chernyshenko et al., 2009). The United States Army (Drasgow et al., 2012) used the MUPP in the Tailored Adaptive Personality Assessment System (TAPAS) to support army selection and classification decisions, and the TAPAS work suggests that the MUPP can assess personality validly and efficiently in a more faking-resistant way. Therefore, the multidimensional case of this model was employed in the current study.

Under the MUPP model, individuals are required to choose, from a pair of statements, the one that reflects themselves the most. The two statements in each pair are denoted stimulus s and stimulus t. The individual evaluates each stimulus (statement) independently. A preference can be represented by one of two joint outcomes: {agree (1), disagree (0)}, which represents a preference for stimulus s over stimulus t, or {disagree (0), agree (1)}, which represents a preference for stimulus t over stimulus s. The probabilistic function of endorsing stimulus s over stimulus t is:

$$
P_{(s>t)_i}(\theta_{d_s}, \theta_{d_t}) = \frac{P_{st}\{1,0\}}{P_{st}\{1,0\} + P_{st}\{0,1\}} \approx \frac{P_s\{1\}\,P_t\{0\}}{P_s\{1\}\,P_t\{0\} + P_s\{0\}\,P_t\{1\}}, \tag{2}
$$

where:

i = index for items (pairs of stimuli), where i = 1 to I,

d = index for dimensions, where d = 1, …, D,

s, t = indices for first and second stimuli, respectively, in a pairing,

θds, θdt = latent trait values for a respondent on dimensions ds and dt, respectively,

Ps{1}, Ps{0} = probability of endorsing/not endorsing stimulus s at θds,

Pt{1}, Pt{0} = probability of endorsing/not endorsing stimulus t at θdt,

Pst{1, 0} = joint probability of endorsing stimulus s and not endorsing stimulus t at (θds, θdt),

Pst{0, 1} = joint probability of not endorsing stimulus s and endorsing stimulus t at (θds, θdt),

and

P(s>t)i(θds, θdt) = probability of a respondent preferring stimulus s to stimulus t in pairing i.

To implement the MUPP model, FC person scoring by equation (2) includes three steps. First, the parameters of each statement (i.e., location parameters, discrimination parameters and the locations of the thresholds on the trait continuum) were estimated by applying the GGUM to the item responses from the FC measure; the probability function in the dichotomous case can be expressed as:

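$$
P[U_i = 1 \mid \theta_j] = \frac{\exp(\alpha_i[(\theta_j - \delta_i) - \tau_i]) + \exp(\alpha_i[2(\theta_j - \delta_i) - \tau_i])}{1 + \exp(\alpha_i[3(\theta_j - \delta_i)]) + \exp(\alpha_i[(\theta_j - \delta_i) - \tau_i]) + \exp(\alpha_i[2(\theta_j - \delta_i) - \tau_i])} \tag{3}
$$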

where θj denotes the location of respondent j on the latent dimension underlying responses, δi represents the location of item i on the latent continuum, αi refers to the discrimination parameter for item i, and τi indicates the location of the subjective response category threshold of item i. The GGUM not only rests on the ideal point response assumption, but also meets several assumptions similar to those of the FC format (McCloy, Heggestad & Reeve, 2005). For instance, the GGUM assumes that individuals endorse a statement only when the statement location is close to their true trait level on the trait continuum (Roberts et al., 2000).

Second, the parameters of each statement, along with θds (i.e., the trait level on the dimension measured by stimulus s) and θdt (i.e., the trait level on the dimension measured by stimulus t), were used to calculate response probabilities for each individual using equations (4) and (5). Equations (4), (5) and (6), written here for stimulus s (the expressions for stimulus t are analogous), are as follows:

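$$
P[Z_s = 0 \mid \theta_{d_s}] = \frac{1 + \exp(\alpha_s[3(\theta_{d_s} - \delta_s)])}{\gamma_s}, \tag{4}
$$

and

$$
P[Z_s = 1 \mid \theta_{d_s}] = \frac{\exp(\alpha_s[(\theta_{d_s} - \delta_s) - \tau_{s1}]) + \exp(\alpha_s[2(\theta_{d_s} - \delta_s) - \tau_{s1}])}{\gamma_s}, \tag{5}
$$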

where

Zs = an observable response to stimulus s,

θds = the location of respondent j on the latent dimension represented by stimulus s,

δs = the location of stimulus s on the latent continuum,

αs = the discrimination parameter for stimulus s,

τsk = the location of the kth subjective response category threshold on the latent continuum, and

γs = a normalizing factor that is required to make the observable response probabilities, summed over response options, add to 1. Thus, γs is the sum of the numerators of P[Zs = 0|θds] and P[Zs = 1|θds], as shown below:

$$
\gamma_s = 1 + \exp(\alpha_s[3(\theta_{d_s} - \delta_s)]) + \exp(\alpha_s[2(\theta_{d_s} - \delta_s) - \tau_{s1}]) + \exp(\alpha_s[(\theta_{d_s} - \delta_s) - \tau_{s1}]). \tag{6}
$$

Lastly, once the response probabilities for individuals were computed, equations (4), (5) and (6) were substituted into the MUPP equation (2) for scoring purposes. The relationship between trait levels and the parameters of the statements composing a MUPP item can be displayed as an item response surface (IRS). Figure 5 shows the IRS of a hypothetical MUPP item for endorsing stimulus t over stimulus s (αs=0.7, δs=1.5, τs=-0.3; αt=2.0, δt=0.9, τt=0.1). The IRS of a MUPP item tends to be a “saddle-shaped” surface, which visually represents the probability of endorsement for different combinations of levels of the two traits. Values on the vertical axis indicate the probability of endorsing statement t over statement s, conditional on the trait levels, which are the values on the two horizontal axes. The trait levels usually range from -3 to 3 for both stimuli s and t.
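The substitution of equations (4) to (6) into equation (2) can be sketched as follows. This is a minimal illustration rather than the scoring code used in the study: the function names are hypothetical, and the statement parameters are the hypothetical values quoted for Figure 5.

```python
import math


def ggum_probs(theta, alpha, delta, tau):
    """Return (P[Z = 0], P[Z = 1]) for one statement, following equations (4)-(6)."""
    distance = theta - delta
    not_endorse = 1.0 + math.exp(alpha * 3 * distance)
    endorse = math.exp(alpha * (distance - tau)) + math.exp(alpha * (2 * distance - tau))
    gamma = not_endorse + endorse  # normalizing factor, equation (6)
    return not_endorse / gamma, endorse / gamma


def mupp_prefer(theta_first, theta_second, first_params, second_params):
    """Probability of preferring the first statement to the second, equation (2)."""
    first_0, first_1 = ggum_probs(theta_first, *first_params)
    second_0, second_1 = ggum_probs(theta_second, *second_params)
    return (first_1 * second_0) / (first_1 * second_0 + first_0 * second_1)


# Hypothetical statement parameters (alpha, delta, tau) quoted for Figure 5.
s_params = (0.7, 1.5, -0.3)
t_params = (2.0, 0.9, 0.1)

# Probability of preferring t over s on a coarse grid of trait levels.
for theta_s in [-3, 0, 3]:
    for theta_t in [-3, 0, 3]:
        p_t_over_s = mupp_prefer(theta_t, theta_s, t_params, s_params)
        print(f"theta_s = {theta_s:+d}, theta_t = {theta_t:+d}: P(t > s) = {p_t_over_s:.3f}")
```

Evaluating the same function on a finer grid of trait levels from -3 to 3 would give the values needed to draw an item response surface for the pair.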

CHAPTER 5

METHOD

Participants

A total of 1565 participants were recruited across the United States through Amazon Mechanical Turk. As an initial screening question, all participants were asked whether they were previously or currently employed, and they were paid $.75 for their participation. Because the Qualtrics settings forced participants to respond to all questions, no missing data were observed. Participants who were not full-time employees, completed the survey in less than 10 minutes, worked fewer than 20 hours a week, or did not respond to the consent form were removed from the dataset. The final dataset contains 1130 participants after this data cleaning procedure. The age of participants in the final dataset ranges from 18 to 86 years (Mean=35.16, SD=11.10); 41.9% were male and 58.1% were female. The demographic information showed that 76.5% were Caucasian (non-Hispanic), 9.6% were African-American, 5.8% were Asian, 5.5% were Hispanic/Latino, 0.2% were Middle Eastern, 0.5% were Native American and 1.9% indicated Other. Participants worked on average 40.62 hours (SD=8.08) per week.

Measures

1. Single-stimulus measure

The International Personality Item Pool-NEO (IPIP-NEO; Goldberg, 2001) was used to construct the SS measure for each personality domain. The IPIP-NEO is a 300-item self-report inventory of the Five Factor Model (FFM) personality traits that assesses the five domains and the six lower-level facets of each domain. The short version (10 items for each domain) of the IPIP-NEO was used in this study to measure the Big Five domains: Agreeableness, Conscientiousness, Extraversion, Emotional Stability, and Openness to Experience. Alphas for the domains range from .77 to .86. Two inappropriate items (“Tend to vote for liberal political candidates” and “Tend to vote for conservative political candidates”) that ask about individual political preference were discarded from the Openness to Experience domain. Also, because one unidimensional-FC item (i.e., “Have a good word for everyone” and “Respect others” from the Agreeableness domain) was discarded from the multidimensional FC measure, the two statements in this item were dropped from the final SS measure in order to keep the item content consistent across both formats. Thus, a 46-item SS measure was created, with 8 items each for the Agreeableness and Openness to Experience domains and 10 items each for the Conscientiousness, Extraversion and Emotional Stability domains. The response scale ranged from 1 to 6 (1=Strongly disagree, 2=Disagree, 3=Slightly Disagree, 4=Slightly Agree, 5=Agree and 6=Strongly Agree) to fulfill the ideal point response assumption and allow the subsequent IRT analyses. The full content of this SS measure can be found in Appendix A.

2. Forced-choice measure

Nine graduate students in an Industrial/Organizational program served as Subject Matter Experts (SMEs) for this study. They rated the social desirability of each statement from the SS measure on a 1 to 5 scale (1=Not Socially Desirable At All, 2=Somewhat Not Socially Desirable, 3=Neutral, 4=Somewhat Socially Desirable, and 5=Very Socially Desirable). Items from different personality domains but with similar social desirability were paired, and twenty-three MUPP pairs were created. The content of the FC measure can be found in Appendix B.
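The text does not state how the pairing was carried out, so the sketch below shows only one possible way to pair statements from different domains that have similar mean desirability ratings. The greedy rule, the `max_gap` cutoff, the statement labels and the ratings are all assumptions made for illustration and do not reproduce the procedure or data of this study.

```python
from itertools import combinations


def pair_statements(statements, max_gap=0.5):
    """Greedily pair statements from different domains with similar ratings.

    statements: list of (label, domain, mean_desirability) tuples (hypothetical).
    max_gap: largest rating difference allowed within a pair (assumed cutoff).
    """
    candidates = sorted(
        (abs(a[2] - b[2]), a[0], b[0])
        for a, b in combinations(statements, 2)
        if a[1] != b[1] and abs(a[2] - b[2]) <= max_gap
    )
    used, pairs = set(), []
    for _gap, first, second in candidates:
        if first not in used and second not in used:
            pairs.append((first, second))
            used.update((first, second))
    return pairs


# Hypothetical statements with hypothetical mean SME desirability ratings (1 to 5).
statements = [
    ("C1", "Conscientiousness", 4.6),
    ("E1", "Extraversion", 4.5),
    ("A1", "Agreeableness", 3.2),
    ("O1", "Openness to Experience", 3.0),
    ("C2", "Conscientiousness", 3.1),
]

pairs = pair_statements(statements)
print(pairs)
```

Any statement left without a partner under the chosen cutoff would simply remain unpaired in this sketch.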

3. Ability to identify criteria self-report measure

An ATIC self-report measure was created for a sales job position. The sales job (Appendix C; created from O*Net information on the sales job position) is an entry-level job that suits a wide range of participants from Amazon Mechanical Turk; in addition, sales positions have been used in various faking studies and are well-suited for the purpose of the current study. Items in the ATIC measure were created based on the O*Net website information on job skills and job abilities for this sales job position. Twenty items were selected, each rated on a scale from 1 to 4 (1=Not Apply, 2=Apply Somewhat, 3=Apply, 4=Apply Perfectly); the content can be found in Appendix D.

4. Reaction measure

In addition, participants’ perceptions of fairness and of the different testing formats were assessed with a reaction measure. The chance to perform scale (4 items) and the propriety of questions scale (3 items) in the reaction measure were derived from the Selection Procedural Justice Scale (SPJS; Bauer, Truxillo, Sanchez, Craig, Ferrara & Campion, 2001), which is intended to measure perceptions of fairness dimensions related to Gilliland’s (1993) rules of procedural justice. The face validity scale (5 items) and the perceived predictive validity scale (5 items) were used to measure applicant reactions to selection procedures (Smither, Reilly, Millsap, Pearlman & Stoffey, 1993). The reactions to honesty test scale (10 items) is intended to measure test takers’ perceptions of employers using different tests (Ryan & Sackett, 1987). The reaction measure with these five scales (1=Totally Disagree, 2=Disagree, 3=Slightly Disagree, 4=Slightly Agree, 5=Agree, and 6=Totally Agree) can be found in Appendix E; alphas of the five subscales range from .73 to .83 based on previous research.

Procedure

This study used a within-subjects design in which participants were instructed to complete the SS and FC measures twice under two instruction conditions (i.e., honest condition and applicant condition). First, all participants were instructed to respond to the SS measure as honestly as possible, followed by the FC measure, and the instruction was:

Please complete this personality inventory as honestly as you can. The results will be completely anonymous and will be used for research purposes only. It is very important that you respond to this survey by describing yourself as you really are (Modified from: Scherbaum, Sabet, Kern & Agnello, 2012).

Next, the job flyer for a sales job position was provided to participants before they responded to the same SS and FC measures that they had completed under the honest condition. They were asked to respond as if they were trying to get this job, and the instruction for the applicant condition was:

Please complete the personality inventory as if you were applying for this sales job you really want. To increase your chances of being hired, you should respond in ways that will make you look good to the organization.

After that, participants were asked to complete the ATIC self-report measure, and an example was provided to help them understand the purpose of the measure. The instruction was as follows:

In the previous sets of personality inventory questions that you just answered, you may have tried to figure out what attributes the employer might find important. Therefore, you may have given specific responses in order to increase your likelihood of get the sales job described. Please rate the attributes below based on what you think as being assessed by the personality inventory questions. Example: a question with the response option of: “don’t talk to people,” could have been a measure of Social Perceptiveness attribute.

Lastly, the reaction measure was provided to the participants to assess their reactions toward

two different testing formats (i.e., the SS and FC measures) of the personality survey. An example

of each testing format was provided, and the instruction was as follows:

In the previous sets of personality inventory questions, you may have noticed that they were

asked in two different formats. We would like to know what your reaction is to this format (as shown) as you

responded in ways to increase your likelihood of getting the sales job.

CHAPTER 6

DATA ANALYSES

1. Unidimensionality and model fit

Implementing an IRT model requires checking two default assumptions before proceeding to further

analyses: unidimensionality and local independence of the items. First, an exploratory factor

analysis (EFA) with principal axis factoring was performed in SPSS version 22.0 on each SS

measure to examine whether a dominant factor existed. As suggested by Reckase (1979), the first

factor should explain at least 20% of the total variance to obtain reasonable item parameter

estimates.
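
The 20% criterion can be illustrated with a minimal sketch. It is not the SPSS principal axis factoring routine used in this study: it approximates the first factor by the largest eigenvalue of the item correlation matrix, and the simulated responses, the function name and the loading of 0.7 are all hypothetical.

```python
import numpy as np

def first_factor_share(responses):
    """Share of total variance carried by the largest eigenvalue of the
    item correlation matrix (a principal-components style approximation)."""
    corr = np.corrcoef(responses, rowvar=False)
    eigenvalues = np.linalg.eigvalsh(corr)  # returned in ascending order
    return eigenvalues[-1] / eigenvalues.sum()

# Hypothetical data: 500 simulated respondents, 10 items sharing one trait.
rng = np.random.default_rng(0)
trait = rng.normal(size=(500, 1))
responses = 0.7 * trait + rng.normal(size=(500, 10))

share = first_factor_share(responses)
meets_reckase_criterion = share >= 0.20  # Reckase (1979) threshold
```

Principal axis factoring works on a reduced correlation matrix, so its first-factor share would generally be somewhat lower than this approximation.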

Chi-square fit indices were calculated using Stark’s (2001) MODFIT computer program to

assess the model fit of each SS measure. Item singlets, doublets and triplets were obtained, and

the singlets were considered. Drasgow, Levine, Tsien, Williams and Mead (1995) recommend that

an Mχ2/df value less than or equal to 3.0 indicates satisfactory model-data fit, and any item or

person that misfits the designated model should be removed prior to further

analysis.

2. Item and person parameter estimation

The program GGUM2004 (Roberts, Fang, Cui & Wang, 2004) was used to estimate item

parameters for both the SS and the FC measures; person parameters of the SS measure were also

estimated using this program. The program uses the marginal maximum likelihood (MML)

method to estimate item parameters, and an expected a posteriori (EAP) procedure to

derive person parameter estimates. Note that the estimation of GGUM parameters does not

require reverse coding of negatively worded items, because the location parameter in the GGUM

directly indicates the standing of the item content. Roberts et al. (2000) report that item and

person parameters can be accurately recovered from a sample of more than 750 participants and

no more than 20 items. In the current study, data from 1130 participants were analyzed and none

of the measures had more than 20 items, so good item and person parameter estimates were

expected. Person parameters of the FC measure were estimated by a large consulting company

using Stark's (2005) approach. The multidimensional Bayes modal approach was used to derive

person parameters of the FC measure once item parameters were available from the GGUM

(Stark, 2005).
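
To show why no reverse coding is needed, the sketch below computes GGUM category probabilities for a single item. It follows the standard published form of the model rather than the GGUM2004 source code, and the function name and all parameter values are hypothetical. Because the probabilities depend only on the distance between the person (theta) and the item location (delta), a respondent above the item and one equally far below it receive the same probabilities.

```python
import numpy as np

def ggum_probs(theta, alpha, delta, taus):
    """Category probabilities P(Z = z | theta), z = 0..C, under the GGUM.

    taus holds the C threshold parameters tau_1..tau_C (tau_0 is fixed at 0).
    """
    taus = np.concatenate(([0.0], np.asarray(taus, dtype=float)))
    C = len(taus) - 1            # highest observable response category
    M = 2 * C + 1                # highest subjective response category
    z = np.arange(C + 1)
    cum_tau = np.cumsum(taus)    # running sum of tau_0..tau_z
    dist = theta - delta
    # Two ways to reach the same observed response: from below or from above.
    below = np.exp(alpha * (z * dist - cum_tau))
    above = np.exp(alpha * ((M - z) * dist - cum_tau))
    numer = below + above
    return numer / numer.sum()

# Hypothetical parameters for one six-option item (five thresholds).
taus = [-2.0, -1.6, -1.2, -0.8, -0.4]
probs_near = ggum_probs(theta=0.5, alpha=1.0, delta=0.5, taus=taus)
probs_far = ggum_probs(theta=4.5, alpha=1.0, delta=0.5, taus=taus)
```

With these values, the strongest agreement category is most likely when theta equals delta, and the strongest disagreement category is most likely when the person is far from the item in either direction.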

3. Assessing the ideal-employee factor

Confirmatory factor analysis (CFA) was conducted using MML estimation in the Linear

Structural Relations (LISREL v8.80; Joreskog & Sorbom, 2007) program to examine whether the

ideal-employee factor exhibited a good fit to the personality data in order to address

Hypotheses 1a and 1b. Model 1a and 1b both represented the traditional measurement model

underlying five personality traits (i.e., Agreeableness, Conscientiousness, Extraversion,

Emotional Stability and Openness to Experience) but with a latent variable, ideal-employee

factor, cross-loaded on different personality traits. The Model 1a fit the personality data from the

SS measure of the honest condition, whereas the Model 1b fit the dataset of the applicant

condition using the same measure. In addition, Model 1c and 1d fit both conditions but using the

personality data from the FC measure.

Goodness-of-fit for these models was evaluated by absolute fit and comparative fit

indices, including: χ2, Tucker-Lewis Index (TLI), Comparative Fit Index (CFI), the root mean

square error of approximation (RMSEA) and the 90% confidence interval of the RMSEA.

Absolute fit indices indicate the ability of the specified model to reproduce the observed

covariance matrix, whereas comparative fit indices compare the theoretical model with the

baseline model. The TLI compares the fit of the proposed model to the fit of a null model. The TLI

depends on the average size of the correlations in the data, and penalizes model complexity.

The CFI is the most widely used SEM fit index; it assumes that no population covariance

exists among the observed variables and compares the sample covariance matrix with its null

model. The CFI penalizes model complexity as well. Values of TLI and CFI

greater than .90 are generally considered an indication of good model-data fit. The RMSEA is

a measure that compares the χ2 of the model to its df to determine whether the fit is significantly

worse than what should be expected. The RMSEA is a “badness of fit” index in that a value of 0

indicates the best fit and higher values indicate worse fit, and values of RMSEA less than .10 are

generally considered to indicate acceptable model-data fit. The RMSEA is regarded as “one of the most

informative fit indices” due to its sensitivity to the number of estimated parameters in the

proposed model (Diamantopoulos & Siguaw, 2000). Further, the 90% confidence interval of the

RMSEA statistic allows researchers to test the null hypothesis more precisely (McQuitty, 2004).

However, Kenny, Kaniskan and McCoach (2014) argue that the RMSEA may not be a precise

indicator of goodness-of-fit when the model has few degrees of freedom, so the RMSEA may not be

a precise indication of model fit in identifying the ideal-employee factors.
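
The indices described above can be sketched from the model and null-model chi-square values. The formulas are the common textbook definitions, not LISREL's internal code; the function names are illustrative, the RMSEA uses N - 1 in the denominator (some programs use N), and the null-model values in the usage lines are hypothetical because they are not reported in this study.

```python
import math

def rmsea(chi2, df, n):
    """Root mean square error of approximation for sample size n."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

def cfi(chi2_model, df_model, chi2_null, df_null):
    """Comparative Fit Index relative to the null (baseline) model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_null = max(chi2_null - df_null, d_model)
    return 1.0 - d_model / d_null if d_null > 0 else 1.0

def tli(chi2_model, df_model, chi2_null, df_null):
    """Tucker-Lewis Index; the chi2/df ratios penalize model complexity."""
    ratio_null = chi2_null / df_null
    ratio_model = chi2_model / df_model
    return (ratio_null - ratio_model) / (ratio_null - 1.0)

# Chi-square, df and N as reported in this study for Model 1b.
rmsea_1b = rmsea(92.21, 5, 1130)  # about 0.12

# Hypothetical null-model values, for illustration only.
cfi_example = cfi(92.21, 5, 2000.0, 10)
tli_example = tli(92.21, 5, 2000.0, 10)
```

With only 5 degrees of freedom, the denominator of the RMSEA is small, which illustrates the concern raised by Kenny, Kaniskan and McCoach (2014) about low-df models.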

4. Assessing the ability to identify criteria

Klehe and her colleagues (2012) suggest that the ideal-employee factor serves as a function

of ATIC under the applicant condition. The higher an individual's ATIC, the more ideal-employee

factors the individual can detect in the evaluative situation. Thus, structural

equation modeling (SEM) analyses were conducted in LISREL (v8.80; Joreskog & Sorbom,

2007). Their model (Klehe et al., 2012) was partially replicated in this study to examine whether

the level of ATIC predicts response distortion (i.e., varying levels of capturing the

ideal-employee factors). Model 2 (Figure 6) mirrored Model 1, but employed the ATIC as a

latent variable to predict the ideal-employee factor. Model 2a and 2b fit the personality data

obtained from the SS measure under the honest and applicant conditions, respectively.

Model 2c and 2d fit the personality data obtained from the FC measure under the two conditions.

Comparisons among Model 2a through 2d were made to investigate whether the

ATIC could be a possible explanation of why high correlations were detected among traits,

particularly under the applicant condition across testing formats. These comparisons were intended to

test Hypothesis 2a, that the ideal-employee factor serves as a function of the ATIC in the

applicant condition. The TLI, CFI, RMSEA and 90% confidence interval of the RMSEA statistic

were also used to compare the overall goodness-of-fit. The beta (β) weight between latent variables

was also obtained for each model to address whether the ATIC is a driving influence on the

ideal-employee factor.

5. Assessing construct validity

Three sets of correlational analyses were conducted among the latent trait scores obtained

from the SS and the FC measures under both conditions; trait scores were also related to the

ATIC. Pearson correlation coefficients were calculated using SPSS version 22.0. First,

correlations among the Big Five traits in the honest and applicant conditions were obtained

separately using the latent trait scores from the SS and FC measures, and Hypothesis 3 was

examined through such comparisons. Second, correlations between the Big Five latent trait scores

obtained from the SS and the FC measures under the honest condition were computed and

compared with the same relationships using latent trait scores from the applicant condition.

The correlational differences between conditions were a good indication of the existence of the ATIC,

and Hypothesis 4a was tested. Third, four groups of correlations between the ATIC and the Big

Five latent trait scores (i.e., trait estimates derived from the SS measure under the honest

condition; trait estimates derived from the SS measure under the applicant condition; trait

estimates derived from the FC measure under the honest condition; trait estimates derived

from the FC measure under the applicant condition) were compared to examine the ATIC effect

(Hypotheses 2b and 4b). Significance tests were conducted to detect whether correlational

differences existed between testing formats.
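
One common way to test whether two correlations differ is Fisher's r-to-z transformation, sketched below. This is an illustration only and is not necessarily the test used in this study: the version shown treats the two correlations as coming from independent samples, whereas a within-subjects design would strictly call for a test of dependent correlations. The function name and the example values are hypothetical.

```python
import math

def fisher_z_difference(r1, n1, r2, n2):
    """z statistic and two-tailed p-value for the difference between two
    correlations, assuming they come from independent samples."""
    z1, z2 = math.atanh(r1), math.atanh(r2)   # Fisher r-to-z transform
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))    # two-tailed normal p-value
    return z, p

# Hypothetical example: r = .50 versus r = .30, 200 cases each.
z_stat, p_value = fisher_z_difference(0.50, 200, 0.30, 200)
```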

6. Participant reaction towards single-stimulus and forced-choice format testing

Five one-way analyses of variance (ANOVAs) were conducted to detect whether participant

reactions regarding chance to perform, face validity, perceived predictive validity, property of

questions and reactions to honesty test scales differed significantly across the two

testing formats. Negatively worded items were reverse coded before proceeding to the

ANOVA and further analyses, and sum scores for each subscale of the reaction measure were

computed. F-values and p-values were obtained from the ANOVAs to indicate whether

differences existed between testing formats for each subscale. Also, the mean difference, 90%

confidence interval of the mean difference, and Cohen’s d were calculated to

support the ANOVA results. Cohen’s d is an effect size that indicates the standardized difference

between two means. The standard interpretation of the effect size offered by Cohen (1988)

is: small effect size is .2, moderate effect size is .5, and large effect size is .8. All analyses were

conducted using SPSS version 22.0.
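
As a minimal sketch of the effect size described above, Cohen's d can be computed from group means and standard deviations, or recovered from a two-group one-way ANOVA F value (where F equals t squared). The function names are illustrative, and the second function assumes the two formats are treated as independent groups, which is a simplification for a within-subjects design.

```python
import math

def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Standardized mean difference using the pooled standard deviation."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var)

def d_from_f(f_value, n1, n2):
    """Absolute d implied by a one-way ANOVA F with two groups (F = t**2)."""
    return math.sqrt(f_value) * math.sqrt(1.0 / n1 + 1.0 / n2)

# Hypothetical subscale summaries for two formats, 100 ratings each.
d_example = cohens_d(3.5, 1.0, 100, 3.0, 1.0, 100)

# F value reported in this study for face validity, 1130 ratings per format.
d_face_validity = d_from_f(65.93, 1130, 1130)  # about 0.34
```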

CHAPTER 7

RESULTS

1. Unidimensionality and local independence

Table 1 displays descriptive statistics of all the SS personality measures under the honest and

applicant conditions. Coefficient alphas ranged from .76 to .93, indicating that all the measures were

generally reliable. The first factor in each measure accounted for at least 20% of the

total variance, meeting the EFA criterion using the principal component analysis approach (as

shown in Table 2), suggesting it was appropriate to proceed to unidimensional IRT analyses

on those measures.
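
For reference, coefficient alpha can be computed from a respondents-by-items score matrix as in the sketch below. The function name and the simulated scores are hypothetical and are not the study data.

```python
import numpy as np

def cronbach_alpha(scores):
    """Coefficient alpha for a (respondents x items) score matrix."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                          # number of items
    item_vars = scores.var(axis=0, ddof=1)       # variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of the sum score
    return (k / (k - 1)) * (1.0 - item_vars.sum() / total_var)

# Hypothetical data: 500 simulated respondents, 10 items sharing one trait.
rng = np.random.default_rng(1)
trait = rng.normal(size=(500, 1))
scores = 0.7 * trait + rng.normal(size=(500, 10))
alpha = cronbach_alpha(scores)
```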

The model fit of the GGUM was evaluated by calculating χ2/df, which is presented in

Table 3. Average adjusted χ2/df ratios met the criterion of χ2/df less than 3 for item singlets in the

honest condition, indicating that the GGUM exhibited satisfactory fit for all five personality

measures. Higher but acceptable χ2/df item singlet ratios were observed in the applicant

condition because participants did not respond honestly to the measures, which led to

slightly worse model fit. These results in general suggested that item parameters were

interpretable in both conditions and therefore substantive investigations were conducted.

2. Item and person parameter estimations

The program GGUM2004 (Roberts, Fang, Cui & Wang, 2004) was used to estimate item

parameters for both the SS and FC measures. Table 4 includes the item parameters estimated

from the SS measure in both conditions. Five τ parameters were obtained for each SS item as a

result of the six-option response scale. Table 5 and Table 6 list the item parameters estimated

for the FC measure in the honest and the applicant conditions, respectively. One τ parameter was

obtained for each FC statement because each statement was modeled with a dichotomous response.
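
The link between statement parameters and a forced-choice response can be sketched as follows. This assumes the standard form of the multidimensional pairwise preference model, in which the probability of preferring one statement is built from dichotomous GGUM agreement probabilities for the two statements; it is not the consulting company's scoring code, and the function names and parameter values are hypothetical.

```python
import math

def ggum_agree_prob(theta, alpha, delta, tau):
    """P(agree) under the dichotomous GGUM (one tau per statement)."""
    d = theta - delta
    disagree = 1.0 + math.exp(alpha * 3.0 * d)
    agree = math.exp(alpha * (d - tau)) + math.exp(alpha * (2.0 * d - tau))
    return agree / (agree + disagree)

def prefer_first_prob(theta_s, theta_t, stmt_s, stmt_t):
    """P(statement s is chosen over statement t) in a pairwise item.

    theta_s and theta_t are the person's standings on the two traits;
    stmt_s and stmt_t are (alpha, delta, tau) tuples.
    """
    p_s = ggum_agree_prob(theta_s, *stmt_s)
    p_t = ggum_agree_prob(theta_t, *stmt_t)
    pick_s = p_s * (1.0 - p_t)      # agree with s, disagree with t
    pick_t = (1.0 - p_s) * p_t      # disagree with s, agree with t
    return pick_s / (pick_s + pick_t)

# Hypothetical statement parameters (alpha, delta, tau).
statement = (1.2, 0.5, -1.0)
p_equal = prefer_first_prob(0.5, 0.5, statement, statement)
p_close_to_s = prefer_first_prob(0.5, -2.0, statement, statement)
```

When both statements and both trait standings are identical, the preference probability is .5; it rises above .5 when the person is closer to the first statement than to the second.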

3. Assessing the ideal-employee factor

To determine whether ideal-employee factors cross-loaded on the different personality traits

under the applicant condition, the CFA analysis was conducted, and results can be found in

Table 7. As expected, values of TLI and CFI were both greater than .95 in

Model 1b (i.e., fit to personality data from the SS measure under the applicant condition), indicating

a good model-data fit (TLI=.96, CFI=.98 for Model 1b), whereas values of these two indices

were comparatively lower in Model 1a (i.e., fit to personality data from the SS measure under the honest

condition), indicating a relatively worse model-data fit (TLI=.93, CFI=.93 for Model 1a). The

RMSEA for Model 1b was .12 and fell within its 90% CI (lower 90%

CI=.10, upper 90% CI=.15), whereas a larger RMSEA of .14 was detected for Model 1a,

which also fell within its 90% CI (lower 90% CI=.12, upper 90% CI=.16).

Even though RMSEAs were reported, they were not strong indicators of

goodness-of-fit in this set of analyses, as Kenny, Kaniskan and McCoach (2014) argue that the

RMSEA is not a good indicator of model fit when degrees of freedom are low, and a model

with 5 degrees of freedom is considered to be a low-df model. Overall, Model 1b, χ2 (5)

=92.21, p<0.01, yielded a slightly better fit than Model 1a, χ2 (5) =114.32, p<0.01. Thus,

Hypotheses 1a and 1b were supported in that applicant data showed a slightly better model fit

than the data from the honest condition.

Notably, Model 1c and 1d fit very poorly, as all goodness-of-fit indices were

unacceptable, TLI=.20, CFI=.60, RMSEA=.28 in Model 1c, χ2 (5) =432.83, p<0.01, and

TLI=.05, CFI=.47, RMSEA=.31 in Model 1d, χ2 (5) =541.29, p<0.01, indicating that the

ideal-employee factor model fit the FC measure dataset poorly, and that such factors were

hardly identified by the respondents in either condition.

4. Assessing the ability to identify criteria

In order to test whether the ATIC influences the ideal-employee factor, a structural equation

model that defined the ATIC as a latent variable to predict the latent ideal-employee factor was

tested within each of the four conditions; results are displayed in Table 8. Both Model 2a (i.e., fit to

personality data from the SS measure under the honest condition) and Model 2b (i.e., fit to

personality data from the SS measure under the applicant condition) achieved satisfactory model

fit: TLI was .93 and CFI was .95 in Model 2a, χ2 (34) = 382.69, p<0.01; TLI was .95

and CFI was .96 in Model 2b, χ2 (34) = 459.52, p<0.01. The RMSEA suggested reasonable

error of approximation: Model 2a (RMSEA=.10) obtained a slightly lower RMSEA than

Model 2b (RMSEA=.11), and the RMSEAs from both models fell within the 90% CIs of

their RMSEAs. The β weights between the ATIC and the ideal-employee factor in Model 2a and

2b were .31 and .36, respectively. These results supported that the ideal-employee factor served

as a function of the ATIC in the applicant condition using the SS measure (Hypothesis 2a). In

addition, poor model fit was observed for Model 2c and 2d (i.e., TLI=.83, CFI=.87 in

Model 2c, χ2 (34) = 613.91, p<0.01, and TLI=.77, CFI=.83 in Model 2d, χ2 (34) = 912.80,

p<0.01), and particularly weak links were found between latent variables (i.e., β=-.06 in Model 2c

and β=-.01 in Model 2d).

5. Assessing construct validity

Single-stimulus and forced-choice measures

Table 9 presents correlations of latent trait scores between conditions from the SS

measure for the full sample. All the correlations were statistically significant at the .01 level. As

can be seen, the correlations of the same latent trait between conditions were comparatively low,

falling between .124 and .338 as highlighted in the table, implying that different constructs

were measured in the two conditions. Additionally, the correlations between traits within

condition were moderately high in the honest condition, falling between .186 and .544, but even

higher correlations were observed in the applicant condition, falling between .523 and .742.

As previously discussed, high-ATIC individuals are able to recognize the ideal-employee factors

from the SS measure easily, and such factors usually cross-load on personality traits. Thus,

high correlations among traits under the applicant condition were good indications of the existence

of the ideal-employee factors and of the SS measure’s lack of resistance to faking.

Table 10 presents correlations among latent trait scores between conditions of the FC

measure for the full sample. Not all the correlations were statistically significant at the .01 level in

this table. Correlations of the same trait between conditions ranged from -.108 to .189 as

highlighted. Strikingly low correlations among traits within condition were also

observed, ranging from -.297 to .010 in the honest condition, and from -.277 to .023 in

the applicant condition. These low correlations among traits within condition implied that

individuals were not able to recognize favorable factors that were commonly shared by all

five personality traits. In sum, even though correlations of the same trait between conditions in the

FC measure showed a trend similar to that in the SS measure, the FC measure showed

significantly lower correlations among traits within conditions than the SS

measure; thus, Hypothesis 3 was partially supported.

Next, correlations of the latent trait scores obtained from the SS and FC measures under

the honest condition were compared with the correlations derived from the applicant condition,

and Table 11 displays the results. The most important result in this table is that correlations of

the same latent trait between formats were significantly higher under the honest condition than

under the applicant condition, except for the trait of Conscientiousness (z =1.18, p=.24).

Specifically, correlations of the same personality traits between formats under the honest condition

were all significant (i.e., r(1128)= .344, p<.01 for Agreeableness, r(1128)=.344 for

Conscientiousness, r(1128)=.575 for Extraversion, r(1128)=.432 for Emotional Stability,

r(1128)=.463, p<.01 for Openness to Experience), and these relationships were relatively small

under the applicant condition (i.e., r(1128)= .172, p<.01 for Agreeableness, r(1128)=.301, p<.01

for Conscientiousness, r(1128)=.469, p<.01 for Extraversion, r(1128)=-.044, p>.01 for Emotional

Stability, r(1128)=.109, p<.01 for Openness to Experience). In addition, small correlations among

the five traits between formats were found under both conditions (r ranged from -.236 to .050 in the

honest condition, and from -.212 to .289 in the applicant condition). Thus, these results

support Hypothesis 4a in that the same personality trait in different testing formats should correlate

under the honest condition, whereas lower correlations should be obtained in the applicant

condition, and the ATIC effect led to this observed correlation pattern.

Ability to identify criteria

In addition to previous findings on the ATIC effect, Table 12 presents four groups of

correlations between the ATIC with latent trait scores obtained from four conditions (i.e., honest

condition in SS measure, applicant condition in SS measure, honest condition in FC measure,

and applicant condition in FC measure) to investigate the existence of the ATIC thoroughly. In

general, the results showed that correlations of the ATIC with the latent trait scores estimated

from the SS measure were all statistically significant at the .01 level, and higher

correlations were detected in the applicant condition of the SS measure (i.e., r(1128)= .315 for

Agreeableness, r(1128)=.344 for Conscientiousness, r(1128)=.291 for Extraversion, r(1128)=.267

for Emotional Stability, r(1128)=.378 for Openness to Experience in the applicant condition;

r(1128)= .162 for Agreeableness, r(1128)=.284 for Conscientiousness, r(1128)=.221 for

Extraversion, r(1128)=.171 for Emotional Stability, r(1128)=.210 for Openness to Experience in

the honest condition). On the other hand, lower correlations of the ATIC with the latent trait

scores from the FC measure were observed, and almost none of them were statistically

significant at the .01 level (i.e., r ranged from -.017 to .080 in the honest condition, and

from -.053 to .074 in the applicant condition).

Comparisons of the differences between correlations across testing formats were

made. Average correlations between the ATIC and personality traits obtained from the SS

measure were significantly higher than those from the FC measure (i.e., .21 in the honest condition using the SS

measure, .29 in the applicant condition using the SS measure, .00 in the honest condition using the FC

measure, and .02 in the applicant condition using the FC measure). In particular, the highest average

correlation was observed in the applicant condition when the SS measure was applied, and the

lowest correlation was observed in the honest condition when the FC measure was applied,

indicating that the ATIC effect was associated with the SS measure but not with the FC measure.

Thus, Hypotheses 2b and 4b were supported.

6. Participant reaction towards single-stimulus and forced-choice format testing

Table 13 presents the descriptive statistics of the five subscales of the reaction measure,

including chance to perform, face validity, perceived predictive validity, property of questions

and participant reactions to honesty test, for the different testing formats. All reaction measure

subscales were reliable, with coefficient alphas ranging from .78 to .95 for the reactions to the SS

measure, and from .83 to .97 for the reactions to the FC measure.

Five one-way ANOVAs were conducted to test whether significant differences existed between

reactions to the different testing formats, and results are displayed in Table 14. Significant effects of

format were detected for all five reaction measure subscales: F(1,2258)=13.65, p<.001 for the

chance to perform scale, F(1,2258)=65.93, p<.001 for the face validity scale, F(1,2258)=85.38, p<.001

for the perceived predictive validity scale, F(1,2258)=17.88, p<.001 for the property of questions

scale, and F(1,2258)=23.91, p<.001 for the reaction to the honesty test scale. The mean difference of each

scale was in a reasonable range, and all mean differences fell within the 95% confidence

interval of the difference. In general, respondents preferred the SS measure over the FC measure on

all five subscales.

However, investigation of effect sizes for these five subscales suggested that only face

validity [d=.34] and perceived predictive validity [d=.39] had moderate effects; small effects were

found for chance to perform [d=.15], property of questions [d=.18] and reactions to honesty test

[d=.21] based on Cohen’s standardized interpretation (Cohen, 1988). A small effect size suggests that

only a trivial difference exists between reactions to different testing formats, even when it is

statistically significant. In general, participants had different opinions toward the two testing

formats, but meaningful differences were only perceived on the face validity and perceived

predictive validity scales.

CHAPTER 8

DISCUSSIONS

Ever since the single-stimulus measure was developed, researchers and practitioners have

used this testing format widely in the field of personality assessment. Applicant response

distortions have been an issue in the literature on personality assessment for years, and research

has begun to question validity- and reliability-related issues of this testing format and to seek

solutions to such concerns (Morgeson et al., 1997; McFarland et al., 2000). This study makes

important contributions to investigating the measurement accuracy of forced-choice personality

testing. What is measured by the traditional personality measure (i.e., single-stimulus

measure) was explored, particularly under the high-stakes situation. Similar to earlier research,

this study shows that an ideal-employee factor can be easily identified by job applicants via their

ability to identify criteria using the SS measure (Klehe et al., 2012). I extended this research

by examining whether FC format testing could reduce the chances of ideal-employee factors

being detected through the ability to identify criteria, resulting in more accurate testing results. Some

important concerns about the FC format under the high-stakes situation were also addressed. In the

following sections, I first discuss findings regarding validity-related issues, and then discuss

separately the implications of results based on different testing formats that were used in the

study.

Conclusions regarding validity related issues

Regarding construct validity, I confirmed that the ideal-employee factor is in part determined

by the level of ATIC for job applicants using an SS measure of personality. The five personality

trait indicators loaded highly onto the ideal-employee factor when the SS measure was employed,

especially under the applicant condition, whereas no such relationships were observed using the

FC measure under either condition. Moderate links between the ATIC and the ideal-employee

factor were found using the SS measure under both honest and applicant conditions; however,

such relationships did not exist when the FC measure was applied.

Individuals with a high level of ATIC were better at capturing the testing theme from

the traditional SS measure than those with lower ATIC. Even though the ATIC was not

intended to be measured by the personality measure, moderate correlations were found

between ATIC and latent trait scores obtained from the SS measure (i.e., ranging from .27 to .38)

under the applicant condition. These results are consistent with previous research (e.g., Klehe

et al., 2012; Kleinmann et al., 2011) in that the ATIC is a hidden construct that personality

assessment measures, particularly under the high-stakes situation, and the ideal-employee factor

loads across the personality traits and can be assessed by the ATIC.

My findings are particularly important because I confirmed that the FC measure was a

remedy to reduce the ATIC effect in personality assessment, which has not been suggested

by previous research. Two positively or negatively worded statements from two personality

dimensions with similar social desirability are usually paired to form one multidimensional pairwise

preference item, and respondents are required to select the option that reflects themselves the

most. Because selecting the more favorable option from two equally good or bad statements

is a challenging task, respondents are likely to respond honestly under the circumstances

(e.g., Jackson et al., 2000; Stark et al., 2005; Chernyshenko et al., 2009). Extremely low

correlations of the ATIC with latent trait scores obtained from the FC measure (i.e., ranging

from -.053 to .062) under the applicant condition suggested that respondents were unable to

detect the ideal-employee factor via their ATIC, indicating that the effect of the ATIC was almost

completely eliminated by the FC measure. In practice, recruiters often use the SS

measure without taking the ATIC effect into account, so unfavorable selection decisions may

occur because they may not even know what constructs their assessments measure.

The ATIC that was unintentionally assessed by the psychological testing may also have

an impact on the criterion-related validity of the selection procedure. Undeniably, the ATIC is a

positive factor that relates to social skills and intelligence, and a person with high ATIC may achieve

acceptable job performance, as previous research argues (e.g., Viswesvaran & Ones, 1999;

Schollaert & Lievens, 2008; Kleinmann et al., 2011). Even though the hypotheses in this study did

not address these previous research results directly, the results indirectly support the ATIC as a

complementary explanation for criterion-related validity (Kleinmann et al., 2011), as

significant links between the ATIC and latent traits obtained from the SS measure were

observed. Respondents' ability to identify ideal-employee factors (i.e., performance-related

criteria) for the sales position in this study is relevant to their understanding of selection

procedures as well as expectations of how to perform the job well. Based on the social

effectiveness research, self-reported social effectiveness is related to performance in selection

procedures as well as in future jobs (e.g., Hochwarter, Witt, Treadway & Ferris, 2006). In

other words, job applicants may perform the job in line with their distorted responses so that the

ATIC becomes a complementary explanation for the criterion-related validity.

To summarize, the emergence of the construct of the ATIC is a sign of response

distortion, especially when the SS measure is utilized under the high-stakes situation. Even

though the ATIC complements criterion-related validity in the selection process, the construct

validity of personality tests is violated due to such response distortions. The goal of

psychological testing is always to create a test that measures what it is designed to measure, and to

use those traits to make accurate inferences about job applicants. Findings in the current study

remind test developers that precision in psychological testing is always necessary in the

personnel selection process.

Conclusions regarding the testing formats

A high score in personality measurement is always favorable when it leads to a desirable

job, and the findings of this study highlight how misleading latent trait score estimates from

the SS measure can be, particularly in high-stakes situations. Theoretically, the Big Five

traits are a set of relatively orthogonal traits, and correlations among them should range from

-.20 to .25 (Michel, 2010). However, the results supported previous findings (Guan et al.,

2014) that the SS measure does not prevent individuals from distorting their responses:

moderately high correlations were found among the Big Five traits (ranging from .186

to .510) under the honest condition, and even higher correlations (ranging from .523 to .742)

were obtained under the applicant condition, indicating a violation of the most basic assumption

of the Five-Factor Model. How can latent trait scores estimated from the SS measure be

accurate when the most basic assumption is violated? As described and demonstrated earlier, the

ideal-employee factor is thought to produce high correlations among FFM traits when it is

recognized by respondents via their ATIC, especially in high-stakes situations.

Nevertheless, trait correlations were significantly lower when using the FC measure in both

conditions, implying that respondents were not able to recognize the ideal-employee factor

from the FC measure even though they were able to identify it from the SS

measure.
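
The range of trait intercorrelations discussed above can be illustrated with a short sketch. The code below is only an illustration under stated assumptions: the function name `offdiag_correlation_range` is hypothetical, the data are synthetic rather than the study's data, and the shared factor is a simple stand-in for an ideal-employee factor that every trait score loads on.

```python
import numpy as np


def offdiag_correlation_range(scores):
    """Return (min, max) of the off-diagonal Pearson correlations.

    scores: 2-D array-like, one row per respondent, one column per trait.
    """
    scores = np.asarray(scores, dtype=float)
    r = np.corrcoef(scores, rowvar=False)      # trait-by-trait correlation matrix
    upper = r[np.triu_indices_from(r, k=1)]    # keep each trait pair once
    return float(upper.min()), float(upper.max())


# Synthetic illustration only (not the study's data).
rng = np.random.default_rng(0)
n_respondents, n_traits = 2000, 5
independent = rng.normal(size=(n_respondents, n_traits))  # roughly orthogonal traits
shared = rng.normal(size=(n_respondents, 1))              # hypothetical common factor
inflated = independent + 1.5 * shared                     # every trait loads on it

print(offdiag_correlation_range(independent))
print(offdiag_correlation_range(inflated))
```

In this sketch the first range stays close to zero, while adding the common factor pushes every pairwise correlation upward, which is the pattern the paragraph attributes to the ideal-employee factor.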

Further, respondents showed broadly similar reactions toward the two testing formats.

They believe that both testing formats allow them to demonstrate their capabilities equally well (e.g.,

“I could really show my skills and abilities through this measure.”), that questions from the two testing

formats are of similar quality (e.g., “The content of the measure seemed appropriate.”), and that both

testing formats are appropriate to administer (e.g., “I would enjoy being asked to take such

a test.”). Interestingly, respondents tend to perceive the SS as a superior format to the FC

because they do not see how performance on the FC measure relates to the future job (e.g., “I did

not understand what the examination had to do with the job.”), even though they

believe both formats are appropriate. The purpose of psychological testing in the selection

process is to assess the true trait standings of potential job candidates, match their trait levels to

future jobs, and make initial screening and/or selection decisions based on how well their trait levels fit

the job. Nevertheless, job applicants are more likely to distort their responses when they

understand how test questions tap into the future job. From this reaction perspective, even

though respondents prefer the SS as a testing format, the FC is the more favorable format for

recruiters to use in selection because its job relevance is less apparent to respondents.

Remaining concerns with the FC format measure and future research directions

Despite the positive results discussed above, the current study brings to light two serious

concerns about FC format testing. Therefore, I do not recommend the use of FC format

testing until these concerns are addressed by future research. First, correlations between same-trait

latent scores from the SS and the FC measures obtained under the honest condition were

lower than those reported in existing research. Chernyshenko and his colleagues (2009)

first presented a standard of "good" trait recovery for the multidimensional FC measure. They

developed three MUPP measures that used the same items as the corresponding SS measures, and

moderate same-trait correlations were found across formats (i.e.,

correlations for Order, Self-Control, and Sociability were .75, .54, and .75, respectively)

under the honest responding condition. My results yielded smaller same-trait correlations between

measures; Extraversion showed the highest correlation (r = .575), suggesting that latent

trait scores estimated from the FC measure were only somewhat consistent with the estimates

from the SS measure under the honest condition. Moreover, relatively lower same-trait correlations were

observed under the applicant condition, with the exception of Conscientiousness,

suggesting that different constructs were measured by the two testing formats. As Carter,

Daniels, and Zickar (2013) argued in the context of traditional versus projective testing,

low correlations between FC and SS scores in honest conditions should not automatically be taken as

an indicator that the FC is a poor measure, because FC format testing may be measuring

something quite different from traditional SS measures. In general, more research is needed

to ensure consistency across the two testing formats, and a formal standard for defining

a good FC measure should be established so that practitioners can be more confident in

incorporating this testing format into their selection processes.
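
The same-trait comparison across formats described above amounts to correlating, trait by trait, the scores that the same respondents obtain on two measures. The following is a minimal sketch of that computation; the function name `same_trait_correlations` and the example scores are hypothetical and are not taken from the study. The same function could be applied to FC scores from the honest and applicant conditions.

```python
import numpy as np


def same_trait_correlations(scores_a, scores_b, trait_names):
    """Correlate column j of scores_a with column j of scores_b for each trait.

    Both inputs are 2-D array-likes with the same respondents in the same
    row order and one column per trait.
    """
    a = np.asarray(scores_a, dtype=float)
    b = np.asarray(scores_b, dtype=float)
    if a.shape != b.shape or a.shape[1] != len(trait_names):
        raise ValueError("score matrices and trait names must align")
    return {
        name: float(np.corrcoef(a[:, j], b[:, j])[0, 1])
        for j, name in enumerate(trait_names)
    }


# Made-up scores for four respondents on two traits (illustration only).
ss_scores = [[1.0, 0.2], [2.0, 0.9], [3.0, 0.4], [4.0, 0.8]]
fc_scores = [[1.1, 0.7], [1.9, 0.1], [3.2, 0.6], [3.8, 0.3]]
print(same_trait_correlations(ss_scores, fc_scores, ["Extraversion", "Openness"]))
```

A high value for a trait suggests the two measures rank respondents similarly on it; a low or negative value is the kind of result that raises the question of whether the same construct is being measured.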

Second, low same-trait correlations for the FC measure were found between the two

responding conditions, which raises the concern that different constructs were measured under the

two conditions, although past research has shown the FC format to be more resistant

to score inflation under applicant conditions (e.g., Chernyshenko et al., 2009; Drasgow et al.,

2012). Latent trait scores from the honest and applicant conditions are expected to be consistent if

the FC measure is truly faking-resistant, and the findings of this study raise an urgent research

question that needs to be answered: what is measured under the applicant condition by the FC

measure if the effect of the ATIC is already reduced? Surprisingly, no past research has reported

intercorrelations between FC scores estimated under honest and applicant responding conditions.

Practitioners cannot make accurate and appropriate selection-related decisions based on

personality traits measured with the FC format when the latent constructs measured under the applicant

condition remain unclear. Regardless of the faking-resistant nature of the FC format, I do

not encourage rushing to implement it; with the SS measure, practitioners at least know

what is being measured under the high-stakes condition and can adjust their selection

strategy accordingly.

To summarize, mixed results for the FC format were found in this research. Despite the

fact that the FC format greatly reduced the effects of the ATIC compared to the SS

format, this study reveals serious concerns about FC format testing, and I do not

encourage the use of the FC format until two questions are addressed by future research: a) how to

ensure consistency across testing formats; and b) what is measured by the FC measure

under the high-stakes situation. Without answers to these questions, the FC measure is an

inappropriate selection tool because researchers and practitioners cannot ensure the quality of

the results obtained from the FC format, and incorrect decisions may follow. Such research

becomes increasingly crucial as more and more organizations, including the United States Army

(Drasgow et al., 2012), incorporate this testing format into their selection processes.

Contributions to the field

Traditional psychological test questions are usually constructed under the dominance

assumption and administered on an SS scale, even though extreme trait levels on the trait

continuum are not covered and respondents have a greater chance to distort their responses,

leading to inappropriate selection decisions. Because a wider range of latent trait scores can be

captured within the ideal point framework and previous research suggests that the FC format is

more faking-resistant (e.g., Jackson et al., 2000; Chernyshenko et al., 2009; Drasgow et al.,

2012), there has been a resurgence in the use of the FC format in selection processes in recent years.

This study brings new insights to previous research results; further, it discourages the

use of the FC format as of now.

This study confirms the potential of FC format testing in organizational settings in

some ways. Even though the faking effect (i.e., ATIC) was detected in previous research (e.g.,

Kleinmann, 1993; Kleinmann et al., 2011; Klehe et al., 2012), this is the first study to

investigate whether the FC testing format can reduce the impact of the ATIC on test scores.

Notably, despite the fact that the ATIC is associated with higher job performance (Kleinmann,

1993), the ATIC contaminates the construct- and criterion-related validity of psychological

assessment. The use of the FC format provides a partial solution to this problem: it

eliminates the faking effect of the ATIC and ensures that the assessment obtains the information it is

intended to measure.

However, this study also shows, for the first time, that the FC format presents serious problems that

researchers need to solve, given that there may be even more ambiguity as to what is

being measured with FC formats of testing. Above all, this study is the first to raise

construct validity-related concerns about the FC format and to encourage research on such topics.

Despite the positive features of the FC measure, it is still premature to implement this testing

format, mainly because what is being measured under the applicant condition remains

unknown. Without knowing what is measured under the applicant condition by FC format

testing, appropriate and accurate selection and promotion decisions cannot be made.

It is worth noting that this study does not aim to discourage the use of the SS

measure entirely in psychological testing, because of the unresolved concerns that were

discussed earlier; in addition, the development of FC format tests can be time-consuming. Unlike

the SS measure, which can be easily administered on a rating scale, FC measure development

requires extra steps: 1) statements need to be administered on a rating scale to obtain the

item parameters; 2) statements from two personality dimensions that share similar social

desirability are grouped as FC pairs; and 3) a small proportion of unidimensional FC pairs, formed by

pairing statements that are similar in social desirability but have different location parameters,

is mixed into the final FC measure (Stark, 2005). The GGUM is able to recover item and person parameters

accurately from the SS measure if response distortions are not involved. I do not encourage the

use of FC format tests as of now; however, if recruiters decide to use an FC measure, I

suggest using it only in low-stakes situations, because more research is

urgently needed to determine what exactly is being measured by FC format tests under the

high-stakes condition.
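
Step 2 of the development procedure above (pairing statements from different dimensions that are similar in social desirability) can be sketched as a simple greedy matching. This is an illustrative sketch only, not the procedure used in this study or in Stark (2005): the `Statement` class, the `pair_statements` function, the `max_gap` tolerance, and the desirability ratings are all hypothetical, and step 3 (adding unidimensional pairs) is not covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    text: str
    dimension: str
    desirability: float  # hypothetical mean social-desirability rating


def pair_statements(statements, max_gap=0.5):
    """Greedily pair statements from different dimensions whose
    desirability ratings differ by at most max_gap."""
    remaining = sorted(statements, key=lambda s: s.desirability)
    pairs = []
    while remaining:
        first = remaining.pop(0)
        match_index = None
        for i, candidate in enumerate(remaining):
            if candidate.desirability - first.desirability > max_gap:
                break  # list is sorted, so later candidates are farther away
            if candidate.dimension != first.dimension:
                match_index = i
                break
        if match_index is not None:
            pairs.append((first, remaining.pop(match_index)))
        # statements without a suitable partner are left unpaired
    return pairs


# Item texts from Appendix A; the ratings are invented for illustration.
pool = [
    Statement("Make friends easily.", "Extraversion", 4.0),
    Statement("Am always prepared.", "Conscientiousness", 4.1),
    Statement("Have little to say.", "Extraversion", 2.0),
    Statement("Keep in the background.", "Extraversion", 2.1),
    Statement("Panic easily.", "Neuroticism", 2.3),
    Statement("Have a vivid imagination.", "Openness", 5.0),
]
for a, b in pair_statements(pool):
    print(a.text, "|", b.text)
```

A greedy pass is used here only because it is short and easy to follow; an operational test would likely need a more careful matching rule and a check of the resulting pairs by subject-matter experts.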

Limitations of the current study

This study is limited in its test development process. Because no

personality measure constructed under the ideal point assumption was available, the FC items in this study

were paired using IPIP-NEO items that were developed under the dominance assumption. Even

though the GGUM is flexible enough to capture both dominance- and ideal point-type

items (Roberts, Laughlin & Wedell, 1999), more precise item and person parameter estimates

could be expected if a fully ideal point-based FC measure were used. Trait estimates are

expected to improve once these two limitations are addressed.

The remaining concerns about the FC measure raised in the previous section can only be

answered after a sound ideal point-based FC measure is developed, along with an ideal point-based

FC format scoring program, since no such program is publicly available at this time.

Researchers can benefit from an ideal point-based FC measure and scoring program because

research can be conducted adequately when trait estimates can be recovered accurately from an

appropriate measure. Practitioners can also benefit from such developments because tests can

be implemented easily and at low cost at a time when there is a growing trend of incorporating this

testing assumption and format into the selection process.

REFERENCES

Barrick, M. R., & Mount, M. K. (1991). The Big Five personality dimensions and job

performance: a meta-analysis. Personnel Psychology, 44(1), 1–26.

Bauer, T. N., Truxillo, D. M., Sanchez, R. J., Craig, J. M., Ferrara, P., & Campion, M. A. (2001).

Applicant reactions to selection: Development of the selection procedural justice scale

(SPJS). Personnel Psychology, 54, 387-419.

Böckenholt, U. (2004). Comparative judgments as an alternative to ratings: identifying the scale

origin. Psychological Methods, 9(4), 453–65.

Brown, A., & Maydeu-Olivares, A. (2011). Item response modeling of forced-choice

questionnaires. Educational and Psychological Measurement, 71(3), 460–502.

doi:10.1177/0013164410375112

Brown, A., & Maydeu-Olivares, A. (2013). How IRT can solve problems of ipsative data in

forced-choice questionnaires. Psychological Methods, 18(1), 36–52.

Byrne, B. M. (1998). Structural equation modeling with LISREL, PRELIS and SIMPLIS: basic

concepts, applications, and programming. Mahwah, NJ: Erlbaum.

Cai, L., Thissen, D., & du Toit, S. H. C. (2011). IRTPRO 2.1 for Windows. Chicago, IL:

Scientific Software International.

Cao, M., Drasgow, F. & Cho, S. (2014). Developing ideal intermediate personality items for the

ideal point model. Organizational Research Methods, 24, 1-24.

Carter, N. T., Daniels, M. A., & Zickar, M. J. (2013). Projective testing: Historical foundations and

uses for human resources management. Human Resource Management Review, 23,

205-218.

Carter, N. T., Dalal, D. K., Boyce, A. S., O’Connell, M. S., Kung, M.-C., & Delgado, K. M.

(2014). Uncovering curvilinear relationships between conscientiousness and job

performance: how theoretically appropriate measurement makes an empirical difference.

The Journal of Applied Psychology, 99(4), 564–586. doi:10.1037/a0034688

Carter, N. T., Guan, L., Maples, J., Williamson, R. L., & Miller, J. D. (2014). The downsides of

extreme conscientiousness: A facet-level examination. Manuscript submitted for

publication.

Chernyshenko, O. S., Stark, S., Drasgow, F., & Roberts, B. W. (2007). Constructing personality

scales under the assumptions of an ideal point response process: toward increasing the

flexibility of personality measures. Psychological Assessment, 19(1), 88–106.

doi:10.1037/1040-3590.19.1.88

Chernyshenko, O. S., Stark, S., Prewett, M. S., Gray, A. A., Stilson, F. R., & Tuttle, M. D.

(2009). Normative scoring of multidimensional pairwise preference personality scales

using IRT: empirical comparisons with other formats. Human Performance, 22(2), 105–

127.

Christiansen, N. D., Burns, G. N., & Montgomery, G. E. (2005). Reconsidering forced-choice

item formats for applicant personality assessment. Human Performance, 18(3), 267–307.

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ:

Erlbaum.

Coombs, C. H. (1950). Psychological scaling without a unit of measurement. Psychological

Review, 57, 145-158.

Diamantopoulos, A., & Siguaw, J. A. (2000). Introducing LISREL. London: Sage Publications.

Douglas, E. F., McDaniel, M. A., & Snell, A. F. (1996). The validity of non-cognitive measures

decays when applicants fake. Academy of Management Proceedings, 16, 127-131.

Dunning, D., Griffin, D. W., Milojkovic, J. D., & Ross, L. (1990). The overconfidence effect in

social prediction. Journal of Personality and Social Psychology, 58, 568-581.

Gilliland, S. W. (1993). The perceived fairness of selection systems: An organizational justice

perspective. Academy of Management Review, 18, 694-734.

Goffin, R. D., & Christiansen, N. D. (2003). Correcting personality tests for faking: A review of

popular personality tests and initial survey of researchers. International Journal of

Selection and Assessment, 11, 340–344.

Drasgow, F., Levine, M. V., Tsien, S., Williams, B., & Mead, A. D. (1995). Fitting

polychotomous item response theory models to multiple-choice tests. Applied

Psychological Measurement, 19, 143–165.

Drasgow, F., Chernyshenko, O. S., & Stark, S. (2010). 75 years after Likert: Thurstone was

right! Industrial & Organizational Psychology: Perspectives on Science and Practice,

3(4), 465-476.

Drasgow, F., Stark, S., Chernyshenko, O. S., Nye, C. D., Hulin, C. L., & White, L. A. (2012, August).

Development of the Tailored Adaptive Personality Assessment

System (TAPAS) to support Army selection and classification decisions (Technical Report 1311).

U.S. Army Research Institute for the Behavioral and Social Sciences.

Drasgow, F. (2013, July 2). Tailored Adaptive Personality Assessment System (TAPAS). Paper

presented for the 4th Annual Meeting of International Personality Assessment

Conference. Columbus, OH.

Furnham, A. (1990). Faking personality questionnaires: Fabricating different profiles for different

purposes. Current Psychology: Research and Reviews, 9, 46–55.

Goldberg, L. R. (2001). International Personality Item Pool. Web address can be obtained from

authors.

Goltz, J. (2011, March 1). The hidden costs of bad hiring. The New York Times. Retrieved

January 13, 2014, from http://boss.blogs.nytimes.com/2011/03/01/the-hidden-costs-of-bad-hiring/

Guan, L., Carter, N. T., Tryba, B. A., & Griffith, R. L. (2014, April). Personality test faking as a

shift in response process. Poster presented at the 29th Annual Meeting of the Society for

Industrial and Organizational Psychology, Honolulu, HI.

Hays, W. L., & Bennett, J. F. (1961). Multidimensional unfolding: Determining configuration from

complete rank order preference data. Psychometrika, 26(2), 221-238.

Heggestad, E. D., Morrison, M., Reeve, C. L., & McCloy, R. A. (2006). Forced-choice

assessments of personality for selection: evaluating issues of normative assessment and

faking resistance. Journal of Applied Psychology, 91(1), 9–24.

Hicks, L. E. (1970). Some properties of ipsative, normative, and forced-choice normative

measures. Psychological Bulletin, 74(3), 167–184.

Hirsh, J. B., & Peterson, J. B. (2008). Predicting creativity and academic success with a “Fake-

Proof” measure of the Big Five. Journal of Research in Personality, 42(5), 1323–1333.

Hochwarter, W. A., Witt, L. A., Treadway, D. C., & Ferris, G. R. (2006). The interaction of

social skill and organizational support on job performance. Journal of Applied

Psychology, 91, 482–489.

Jackson, D. N., Wroblewski, V. R., & Ashton, M. C. (2000). The impact of faking on

employment tests: Does forced choice offer a solution? Human Performance, 13(4), 371–

388.

Jansen, A., Melchers, K. G., König, C. J., Kleinmann, M., Brändli, M., Fraefel, L., ... Lievens, F.

(2010, April). Candidates who correctly identify situational demands show better

performance. Paper presented at the 25th Annual Conference of the Society for Industrial

and Organizational Psychology, Atlanta, GA.

Jöreskog, K., & Sörbom, D. (2007). LISREL 8.80 [computer software]. Chicago: Scientific

Software International, Inc.

Kenny, D. A., Kaniskan, B., & McCoach, D. B. (2014). The performance of RMSEA in models

with small degrees of freedom. Sociological Methods and Research, 1-22.

Klehe, U., Kleinmann, M., Hartstein, T., Melchers, K. G., König, C. J., Heslin, P. A., & Lievens,

F. (2012). Responding to personality tests in a selection context: The role of the ability

to identify criteria and the ideal-employee factor. Human Performance, 25(4), 273-302.

Kleinmann, M. (1993). Are rating dimensions in assessment centers transparent for participants?

Consequences for criterion and construct validity. Journal of Applied Psychology, 78,

988-993.

Kleinmann, M., Ingold, P. V., Lievens, F., Jansen, A., Melchers, K. G., & König, C. J. (2011). A

different look at why selection procedures work: The role of candidates’ ability to

identify criteria. Organizational Psychology Review, 1(2), 128–146.

Knowles, E. S. (1988). Item context effects on personality scales: Measuring changes the

measure. Journal of Personality and Social Psychology, 55(2), 312–320.

Komar, S., Brown, D. J., Komar, J. A., & Robie, C. (2008). Faking and the validity of

conscientiousness: a Monte Carlo investigation. Journal of Applied Psychology,

93(1), 140–54.

Likert, R. (1932). A technique for the measurement of attitudes. Archives of Psychology, 140, 5–

53.

McCloy, R. A. (2005). A silk purse from the sow’s ear: retrieving normative information from

multidimensional forced-choice items. Organizational Research Methods, 8(2), 222–248.

McFarland, L. A., & Ryan, A. M. (2000). Variance in faking across noncognitive measures.

Journal of Applied Psychology, 85(5), 812–21.

McQuitty, S. (2004). Statistical power and structural equation models in business research.

Journal of Business Research, 57(2), 175-83.

Meijer, R. R., & Baneke, J. J. (2004). Analyzing psychopathology items: A case for

nonparametric item response theory modeling. Psychological Methods, 9, 354 –368.

Melchers, K. G., Klehe, U.-C., Richter, G. M., Kleinmann, M., König, C. J., & Lievens, F. (2009).

“I know what you want to know”: The impact of interviewees’ ability to identify criteria

on interview performance and construct-related validity. Human Performance, 22, 355-

374.

Michel, J. S. (2010). Social Values : Emergence of a Five-Factor Model, 100(305), 65–69.

Morgeson, F. P., Campion, M. A., Hollenbeck, J. R., Murphy, K., & Schmitt, N. (2007). Are we

getting fooled again? Coming to terms with limitations in the use of personality tests for

personnel selection. Personnel Psychology, 60, 1029-1049.

Ones, D. S., Viswesvaran, C., & Korbin, W. P. (1995, May). Meta-analyses of fakability

estimates: Between-subjects versus within-subjects designs. Paper presented at the

meeting of the Society of Industrial and Organizational Psychology, Orlando, FL.

Pauls, C. A., & Crost, N. W. (2005). Effects of different instructional sets on the construct

validity of the NEO-PI-R. Personality and Individual Differences, 39, 297-308.

Reckase, M. D. (1979). Unifactor latent trait models applied to multifactor tests: Results and

implications. Journal of Educational and Behavioral Statistics, 4(3), 207-230.

Richman, W. L., Kiesler, S., Weisband, S., & Drasgow, F. (1999). A meta-analytic study of

social desirability distortion in computer-administered questionnaires, traditional

questionnaires, and interviews. Journal of Applied Psychology, 84(5), 754–775.

Roberts, J. S., & Laughlin, J. E. (1996). A unidimensional item response model for unfolding

responses from a graded disagree–agree response scale. Applied Psychological

Measurement, 20, 231–255.

Roberts, J. S., Laughlin, J. E., & Wedell, D. H. (1999). Validity issues in the Likert and

Thurstone approaches to attitude measurement. Educational and Psychological

Measurement, 59, 211–233.

Roberts, J. S., Donoghue, J. R., & Laughlin, J. E. (2000). A general item response theory model

for unfolding unidimensional polytomous responses. Applied Psychological

Measurement, 24, 3–32.

Roberts, J. S., Fang, H., Cui, W., & Wang, Y. (2004). GGUM2004: A Windows based program

to estimate parameters in the generalized graded unfolding model. Applied Psychological

Measurement, 30, 64-65.

Ross, S. (1998, September 20). Who’s more motivated to fake it in a personality test. Los

Angeles Times. Retrieved February 19, 2014, from

http://articles.latimes.com/1998/sep/20/business/fi-24647.

Ryan, A. M., Ployhart, R. E., & Friedel, L. A. (1998). Using personality testing to reduce adverse

impact: A cautionary note. Journal of Applied Psychology, 83(2), 298–307.

Ryan, A. M., & Sackett, P. R. (1987). Pre-employment honesty testing: Fakability, reactions of test

takers, and company image. Journal of Business and Psychology, 1(3), 248-256.

Samejima, F. (1969). Estimation of latent ability using a response pattern of graded scores

(Psychometric Monograph No. 18). Iowa City, IA: Psychometric Society.

Schmidt, F. L., & Hunter, J. E. (1998). The validity and utility of selection methods in personnel

psychology: Practical and theoretical implications of 85 years of research findings.

Psychological Bulletin, 124(2), 262–274.

Schollaert, E., & Lievens, F. (2008). The effects of exercise instructions on the observability of

assessment behavior. International Journal of Psychology, 43, 577-577.

Schmit, M.J., & Ryan, A.M. (1993). The Big-5 in personnel-selection: Factor structure in

applicant and nonapplicant populations. Journal of Applied Psychology, 91, 613-621.

Scherbaum, C. A., Sabet, J., Kern, M. J., & Agnello, P. (2012). Examining faking on personality

inventories using unfolding item response theory models. Journal of Personality

Assessment, (October), 37–41.

Smither, J. W., Reilly, R. R., Millsap, R. E., Pearlman, K., & Stoffey, R. W. (1993). Applicant

reactions to selection procedures. Personnel Psychology, 46, 49-76.

Stark, S. (2001). MODFIT: A computer program for model-data fit. Unpublished manuscript,

University of Illinois at Urbana-Champaign.

Stark, S. (2005). An IRT approach to constructing and scoring pairwise preference items

involving stimuli on different dimensions: The multi-unidimensional pairwise-preference

model. Applied Psychological Measurement, 29(3), 184–203.

Stark, S., Chernyshenko, O. S., Drasgow, F., & Williams, B. (2006). Examining assumptions

about item responding in personality assessment: Should ideal point methods be

considered for scale development and scoring? Journal of Applied Psychology, 91, 25–39.

Tahmincioglu, E. (2011, August 15). Employers turn to tests to weed out job seekers.

NBC News. Retrieved January 8, 2014, from

http://www.nbcnews.com/id/44120975/ns/business-careers/t/employers-turn-tests-weed-out-job-seekers/#.Utbvhzb1ASo

Thurstone, L. L. (1927). A law of comparative judgment. Psychological Review, 34, 273–286.

Thurstone, L. L. (1928). Attitudes can be measured. American Journal of Sociology,

33, 529–554.

Vasilopoulos, N. L., Cucina, J. M., Dyomina, N. V., Morewitz, C. L., & Reilly, R. R. (2006).

Forced-choice personality tests: A measure of personality and cognitive ability? Human

Performance, 19(3), 175–199.

Viswesvaran, C., & Ones, D.S. (1999). Meta-analysis of fakability estimates: Implications for

personality assessment. Educational and Psychological Measurement, 59, 197-210.

Wesman, A. G. (1952). Faking personality test scores in a simulated employment situation.

Journal of Applied Psychology, 36(2), 112-113.

White, L. A., & Young, M. C. (1998, August). Development and validation of the Assessment of

Individual Motivation (AIM). Paper presented at the annual meeting of the American

Psychological Association, San Francisco, CA.

APPENDIX A Single-Stimulus Measure

Response scale: Totally Disagree / Disagree / Slightly Disagree / Slightly Agree / Agree / Totally Agree

Often feel blue.

Feel comfortable around people.

Believe in the importance of art.

Am always prepared.

Dislike myself.

Make friends easily.

Have a vivid imagination.

Believe that others have good intentions.

Pay attention to details.

Am often down in the dumps.

Am skilled in handling social situations.

Get chores done right away.

Have frequent mood swings.

Am the life of the party.

Carry the conversation to a higher level.

Accept people as they are.

Carry out my plans.

Panic easily.

Know how to captivate people.

Enjoy hearing new ideas.

Make people feel at ease.

Make plans and stick to them.

Rarely get irritated.

Have little to say.

Am not interested in abstract ideas.

Have a sharp tongue.

Waste my time.

Seldom feel blue.

Keep in the background.

Do not like art.

Cut others to pieces.

Find it difficult to get down to work.

Feel comfortable with myself.

Would describe my experiences as somewhat dull.

Avoid philosophical discussions.

Suspect hidden motives in others.

Do just enough work to get by.

Am not easily bothered by things.

Don’t like to draw attention to myself.

Do not enjoy going to art museums.

Get back at others.

Don’t see things through.

Am very pleased with myself.

Don’t talk a lot.

Insult people.

Shirk my duties.

APPENDIX B Forced-Choice Measure

Choose the statement that is more like you in each pair.

Pair  Statement A                                 Statement B
1     Have a good word for everyone.              Respect others.
2     Make plans and stick to them.               Am not easily bothered by things.
3     Make friends easily.                        Am always prepared.
4     Am skilled in handling social situations.   Carry out my plans.
5     Feel comfortable with myself.               Feel comfortable around people.
6     Make people feel at ease.                   Am the life of the party.
7     Know how to captivate people.               Accept people as they are.
8     Pay attention to details.                   Believe in the importance of art.
9     Get chores done right away.                 Have a vivid imagination.
10    Rarely get irritated.                       Enjoy hearing new ideas.
11    Seldom feel blue.                           Carry the conversation to a higher level.
12    Believe that others have good intentions.   Am very pleased with myself.
13    Am not interested in abstract ideas.        Don’t like to draw attention to myself.
14    Do not enjoy going to art museums.          Keep in the background.
15    Avoid philosophical discussions.            Would describe my experiences as somewhat dull.
16    Do not like art.                            Don't talk a lot.
17    Suspect hidden motives in others.           Find it difficult to get down to work.
18    Have little to say.                         Do just enough to get by.
19    Have a sharp tongue.                        Often feel blue.
20    Dislike myself.                             Waste my time.
21    Panic easily.                               Cut others to pieces.
22    Am often down in the dumps.                 Get back at others.
23    Have frequent mood swings.                  Don't see things through.
24    Insult people.                              Neglect my duties.

APPENDIX C Job Flyer

Title: Sales Representative

Company Description:

This is an international organization that manufactures and sells home accessories

worldwide.

Specific Duties

Contact regular and prospective customers to demonstrate products, explain product

features, and solicit orders.

Recommend products to customers, based on customers' needs and interests.

Answer customers' questions about products, prices, availability, product uses, and credit

terms.

Estimate or quote prices, credit or contract terms, warranties, and delivery dates.

Consult with clients after sales or contract signings to resolve problems and to provide

ongoing support.

Provide customers with product samples and catalogs.

Identify prospective customers by using business directories, following leads from

existing clients, participating in organizations and clubs, and attending trade shows and

conferences.

Minimum Training and Experience

A high school diploma or equivalent and at least one year of work experience are required.

Hours:

Full-time

APPENDIX D Ability to Identify Criteria Measure

Response scale: Not Apply / Apply Somewhat / Apply / Apply Perfectly

Initiative

Independence

Judgment and decision making

Achievement/Effort

Analytic thinking

Persistence

Dependability

Attention to Detail

Adaptability/Flexibility

Integrity

Stress Tolerance

Self-Control

Speaking/communication

Mathematics

Persuasion

Leadership

Science

Complex problem solving

Openness

Coordination

APPENDIX E Participants’ Reaction Measures

(Wrenn, 2015; Ryan & Sackett, 1987; Smither, Reilly, Pearlman & Stoffey, 1993)

Face validity

1. I did not understand what the examination had to do with the job. (R)

2. I could not see any relationship between the examination and what is required on the job.

(R)

3. It would be obvious to anyone that the examination is related to the job.

4. The actual content of the examination was clearly related to the job.

5. There was no real connection between the examination that I went through and the job. (R)

Perceived predictive validity

1. Failing to pass the examination clearly indicates that you can't do the job.

2. I am confident that the examination can predict how well an applicant will perform on the job.

3. My performance on the examination was a good indicator of my ability to do the job.

4. Applicants who perform well on this type of examination are more likely to perform well

on the job than applicants who perform poorly.

5. The employer can tell a lot about the applicant's ability to do the job from the results of

the examination.

Chance to perform

1. I could really show my skills and abilities through this measure.

2. This measure allowed me to show what my job skills are.

3. This measure gives applicants the opportunity to show what they can really do.

4. I was able to show what I can do on this measure.

Propriety of questions

1. The content of this measure did not appear to be prejudiced.

2. The measure itself did not seem too personal or private.

3. The content of the measure seemed appropriate.

Subjects’ reactions to honesty test

1. It is perfectly appropriate for an employer to administer such a test.

2. I would refuse to take such a test, even if it meant losing a chance at the job. (R)

3. I would enjoy being asked to take such a test.

4. This type of test is an invasion of privacy. (R)

5. If I had two comparable job offers, I'd reject the company that used such a test. (R)

6. I would resent being asked to take such a test. (R)

7. A test such as this is sometimes an appropriate selection procedure.

8. Administering a test such as this reflects negatively on the organization. (R)

9. Being asked to take such a test would not affect my view of the organization.

10. Tests like this are routinely used in the industry today.
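Items marked (R) in the scales above are reverse-keyed. The following is a minimal sketch of how such a scale might be scored. It assumes a 5-point response format (1 to 5), which this appendix does not state; the function name and the example responses are hypothetical and are not taken from the study.

```python
def score_scale(responses, reverse_items, scale_min=1, scale_max=5):
    """Sum a scale after flipping reverse-keyed items.

    responses: dict mapping item number -> raw response.
    reverse_items: set of item numbers marked (R).
    scale_min, scale_max: assumed response range (1 to 5 is an assumption).
    """
    total = 0
    for item, raw in responses.items():
        if not scale_min <= raw <= scale_max:
            raise ValueError(f"item {item}: response {raw} is out of range")
        # A reverse-keyed response r is rescored as (min + max) - r.
        total += (scale_min + scale_max - raw) if item in reverse_items else raw
    return total


# Face validity: items 1, 2 and 5 are marked (R) in the list above.
face_validity_reverse = {1, 2, 5}
example = {1: 2, 2: 1, 3: 4, 4: 5, 5: 2}  # hypothetical responses
face_validity_total = score_scale(example, face_validity_reverse)
```

With these hypothetical responses, the three reverse-keyed items are rescored as 4, 5 and 4 before summing, so higher totals consistently indicate more favorable reactions.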

Table 1

Descriptive Statistics and Coefficient Alpha of Each Scale

Condition  Scale                 Mean     SD     α
Honest     Agreeableness        34.05   5.83   .79
           Conscientiousness    40.74   8.06   .90
           Extraversion         32.70   9.43   .90
           Emotional Stability  36.25   9.84   .90
           Openness             32.60   6.37   .83
Applicant  Agreeableness        38.06   5.00   .81
           Conscientiousness    51.35   5.95   .93
           Extraversion         46.58   7.03   .86
           Emotional Stability  48.07   6.59   .76
           Openness             37.00   5.41   .83

Note. N=1130.
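The α column reports coefficient alpha. As a minimal sketch, the standard formula is α = k/(k − 1) × (1 − Σ item variances / variance of the total score), where k is the number of items. The function name and the demonstration data below are hypothetical; this is not the author's analysis code.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Coefficient alpha for a respondents-by-items score matrix.

    rows: list of respondents, each a list of k item scores (k >= 2).
    """
    k = len(rows[0])
    if k < 2 or any(len(r) != k for r in rows):
        raise ValueError("need at least two items and equal-length rows")
    # Sample variance of each item (column) and of the summed scale score.
    item_vars = [variance(col) for col in zip(*rows)]
    total_var = variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses from four people to a three-item scale.
demo = [[1, 2, 1], [2, 3, 3], [4, 4, 5], [5, 5, 4]]
alpha_demo = cronbach_alpha(demo)
```

Alpha approaches 1 as the items covary more strongly relative to their individual variances.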

Table 2

Factor Loadings of Each Single-Stimulus Scale of the Honest and Applicant Conditions

Condition  Loadings  Agreeableness  Conscientiousness  Extraversion  Emotional Stability  Openness
Honest     Factor 1  3.30           5.27               5.33          5.34                 3.75
           Factor 2  1.22           0.98               1.02          1.05                 1.06
           Factor 3  0.95           0.65               0.80          0.88                 0.83
           Factor 4  0.63           0.63               0.62          0.67                 0.65
           Factor 5  0.58           0.49               0.56          0.51                 0.60
Applicant  Factor 1  3.88           6.17               4.88          4.05                 3.77
           Factor 2  0.99           0.96               1.32          1.35                 0.86
           Factor 3  0.82           0.49               0.86          0.94                 0.78
           Factor 4  0.64           0.45               0.62          0.76                 0.71
           Factor 5  0.48           0.40               0.57          0.72                 0.64

Note. N=1130

Table 3

Model-Data Fit Statistics of the Single-Stimulus Personality Measure under Both the Honest and the Applicant Conditions

Condition  Trait                           <1  1<2  2<3  3<4  4<5  5<7   >7   Mχ2/df  SDχ2/df
Honest     Agreeableness        Singlets    8    0    0    0    0    0    0    0.025     0.07
                                Doublets    0    0    3    1    5    8   11    7.241    4.488
                                Triplets    0    0    0    1    6   24   25    7.339    2.723
           Conscientiousness    Singlets   10    0    0    0    0    0    0    0.215      0.3
                                Doublets    0    2    3   10    5    8   17    7.019    4.499
                                Triplets    0    0    0    4   15   43   58    7.869    3.409
           Extraversion         Singlets   10    0    0    0    0    0    0        0        0
                                Doublets    0    2    4    7    3   14   15    6.635    3.703
                                Triplets    0    6   17   25   17   34   21     4.91    2.249
           Emotional Stability  Singlets    6    1    1    0    1    1    0    1.547    2.369
                                Doublets    0    0    0    2    2    9   32     11.9    8.351
                                Triplets    0    0    0    2    6   33   79    9.027    3.436
           Openness             Singlets    7    0    1    0    0    0    0    0.318    0.712
                                Doublets    0    2    3    1    6    6   10     6.71     4.15
                                Triplets    0    0    1    4    6   15   30     7.89    3.168
Applicant  Agreeableness        Singlets    2    1    0    0    1    1    3    5.005    4.281
                                Doublets    0    0    0    0    0    1   27   19.888   11.179
                                Triplets    0    0    0    0    0    0   56   30.492   11.235
           Conscientiousness    Singlets    0    0    0    0    1    0    9   10.026    3.007
                                Doublets    0    0    0    0    0    0   45   19.354    9.585
                                Triplets    0    0    0    0    0    0  120   29.975   13.366
           Extraversion         Singlets    2    1    1    0    2    0    4     4.78    3.823
                                Doublets    0    0    0    0    2    5   38   13.939    6.846
                                Triplets    0    0    0    0    0    0  120   20.052    7.377
           Emotional Stability  Singlets    4    0    0    0    0    1    5    8.895    8.725
                                Doublets    0    0    0    0    0    0   45   32.913   13.474
                                Triplets    0    0    0    0    0    0  120   58.073   17.805
           Openness             Singlets    2    2    0    0    1    1    2    3.882    3.318
                                Doublets    0    0    0    0    0    2   26   16.488    8.732
                                Triplets    0    0    0    0    0    0   56   23.727   10.001

Note. N=1130.

Table 4

Item Parameters of the SS Measure under the Honest and Applicant Conditions

Item    Honest (first seven values in each row)    Applicant (last seven values in each row)

Often feel blue. 2.92 -2.82 -3.55 -2.51 -2.47 -1.69 -0.82 2.31 -3.57 -2.88 -1.68 -2.04 -1.88 -1.57

Feel comfortable around people. 1.29 2.24 -5.13 -3.44 -3.28 -2.66 -0.64 2.23 -1.94 -3.81 -3.40 -3.85 -3.84 -2.83

Believe in the importance of art. 1.90 2.60 -4.83 -3.98 -4.17 -3.11 -1.90 1.55 2.60 -5.07 -4.07 -4.79 -3.63 -2.11

Am always prepared. 1.24 3.13 -6.35 -4.91 -4.47 -3.24 -1.56 3.27 -2.07 -4.28 -4.06 -3.93 -3.85 -2.95

Dislike myself. 1.78 -3.53 -3.87 -2.98 -2.89 -1.95 -1.11 3.44 -3.54 -2.68 -1.80 -1.85 -2.09 -1.87

Make friends easily. 1.42 2.24 -4.56 -3.21 -3.07 -2.06 -0.75 2.64 -2.40 -4.65 -4.48 -4.14 -4.02 -2.93

Have a vivid imagination. 0.65 3.51 -6.66 -6.01 -5.20 -3.97 -2.53 0.86 2.84 -4.30 -4.99 -5.72 -3.86 -2.62

Believe that others have good intentions. 0.60 3.70 -6.77 -5.58 -5.64 -3.30 -0.41 1.25 2.38 -4.71 -4.64 -4.62 -3.80 -2.18

Pay attention to details. 1.09 3.20 -6.47 -5.42 -5.23 -4.17 -2.39 2.95 -2.04 -4.24 -4.38 -3.90 -3.82 -3.06

Am often down in the dumps. 3.49 -3.32 -3.87 -2.98 -2.75 -2.16 -1.52 4.01 -3.48 -2.62 -1.81 -2.04 -1.81 -1.74

Am skilled in handling social situations. 1.48 1.85 -3.79 -3.26 -2.76 -1.79 -0.42 3.04 -2.11 -3.72 -4.83 -3.83 -3.73 -2.91

Get chores done right away. 0.85 3.40 -6.36 -4.56 -4.64 -3.24 -1.69 1.95 -1.86 -4.50 -3.72 -3.84 -3.67 -2.57

Have frequent mood swings. 1.05 -3.94 -4.98 -3.57 -3.60 -2.56 -1.83 3.53 -3.50 -2.74 -1.88 -2.14 -1.96 -1.75

Am the life of the party. 1.08 2.58 -3.94 -2.73 -2.38 -1.02 -0.34 0.60 -3.44 -4.91 -5.10 -5.84 -3.44 -2.57

Carry the conversation to a higher level. 0.35 4.59 -7.84 -6.77 -7.02 -3.65 -0.06 1.05 2.63 -4.24 -5.56 -4.62 -4.31 -2.89

Accept people as they are. 0.58 4.33 -8.28 -7.26 -7.74 -5.17 -2.72 1.60 2.15 -4.49 -4.05 -4.39 -3.91 -2.56

Carry out my plans. 1.74 3.03 -5.37 -5.48 -4.70 -3.51 -1.85 3.99 -2.28 -4.36 -4.08 -4.28 -3.94 -3.03

Panic easily. 0.89 -1.81 -2.49 -1.16 -1.22 -0.56 0.44 2.98 -3.60 -2.70 -1.95 -1.93 -2.33 -1.78

Know how to captivate people. 1.09 2.65 -4.88 -3.64 -3.36 -1.89 -0.60 2.25 -2.28 -4.60 -3.98 -4.10 -3.79 -2.87

Enjoy hearing new ideas. 0.95 3.12 -7.32 -6.02 -5.82 -4.09 -1.93 1.82 2.45 -5.37 -4.20 -4.67 -4.33 -2.99

Make people feel at ease. 0.42 4.57 -9.317 -6.98 -8.07 -5.46 -1.30 2.14 2.15 -4.27 -4.61 -4.29 -4.00 -3.10

Make plans and stick to them. 1.50 3.13 -5.56 -5.24 -4.68 -3.56 -1.94 3.19 -2.13 -3.97 -4.41 -4.17 -3.90 -2.91

Rarely get irritated. 0.59 1.88 -4.73 -2.64 -2.17 -1.39 1.81 0.33 4.46 -3.55 -3.61 -5.53 -11.45 -6.06

Have little to say. 1.05 -3.02 -4.86 -3.13 -2.46 -1.58 -0.48 1.75 5.61 -5.63 -4.11 -4.20 -3.96 -3.37

Am not interested in abstract ideas. 0.77 -4.86 -6.11 -4.33 -3.28 -3.21 -2.29 1.10 -5.86 -5.97 -4.43 -4.22 -4.08 -4.25

Have a sharp tongue. 0.58 -3.00 -4.94 -3.07 -3.47 -1.22 -0.16 0.40 -9.00 -7.00 -6.00 -8.71 -7.14 -7.39

Waste my time. 1.31 -4.72 -5.78 -4.52 -4.27 -3.02 -2.15 2.33 7.15 -6.19 -5.19 -5.42 -5.31 -5.10

Seldom feel blue. 1.04 1.31 -3.00 -2.09 -1.34 -1.48 0.28 0.18 7.76 -3.28 -4.11 -6.79 -20.05 -9.94

Keep in the background. 1.82 -3.37 -4.96 -4.00 -3.51 -2.59 -1.63 1.68 5.61 -5.48 -4.18 -4.38 -3.84 -3.69

Do not like art. 3.11 -4.45 -4.68 -3.72 -3.24 -2.75 -2.29 2.34 -5.91 -5.77 -4.68 -4.15 -4.24 -3.80

Cut others to pieces. 1.48 -4.13 -4.14 -2.93 -2.79 -1.87 -1.82 3.20 -5.98 -5.15 -4.23 -4.30 -4.31 -3.92

Find it difficult to get down to work. 1.34 -5.08 -5.84 -4.51 -4.25 -3.22 -2.86 2.95 7.16 -6.19 -5.31 -5.48 -5.24 -5.47

Feel comfortable with myself. 1.15 2.38 -4.35 -4.03 -3.50 -2.86 -1.05 1.20 2.08 -2.71 -3.95 -4.25 -4.33 -2.90

Would describe my experiences as somewhat dull. 0.68 -4.33 -5.67 -3.98 -4.06 -2.17 -0.95 1.07 5.98 -5.67 -4.23 -4.28 -3.88 -3.61

Avoid philosophical discussions. 0.55 -5.23 -6.51 -3.94 -4.42 -3.45 -1.68 0.61 -5.92 -6.06 -4.51 -4.59 -4.21 -3.38

Suspect hidden motives in others. 0.64 -4.23 -6.80 -4.63 -4.54 -2.28 -1.52 0.93 -6.04 -6.07 -4.50 -4.87 -3.96 -3.71

Do just enough work to get by. 0.99 -5.21 -6.17 -4.54 -4.13 -3.20 -2.74 2.36 7.05 -6.12 -5.21 -5.54 -5.02 -5.44

Am not easily bothered by things. 0.60 1.75 -4.31 -2.91 -2.01 -1.64 1.24 0.30 4.80 -2.68 -3.82 -7.15 -10.82 -5.70

Don’t like to draw attention to myself. 0.63 -3.96 -6.79 -5.34 -5.10 -3.80 -1.82 0.51 7.17 -7.53 -6.44 -6.60 -5.76 -5.21

Do not enjoy going to art museums. 1.54 -4.08 -4.65 -3.57 -3.07 -2.71 -2.13 2.10 -5.74 -5.86 -4.68 -4.01 -3.95 -3.69

Get back at others. 1.30 -3.84 -4.57 -3.35 -2.92 -1.88 -1.75 2.70 -5.85 -5.20 -4.09 -4.32 -4.28 -3.89

Don’t see things through. 1.44 -5.32 -6.14 -4.60 -4.10 -3.08 -3.30 3.25 7.07 -6.16 -5.31 -5.43 -5.21 -5.10

Am very pleased with myself. 1.09 2.58 -5.01 -3.97 -3.50 -2.26 -0.59 0.57 3.15 -4.14 -5.44 -6.58 -4.97 -3.08

Don’t talk a lot. 1.17 -3.55 -5.05 -3.88 -3.67 -2.49 -1.75 1.32 5.55 -5.49 -4.36 -4.14 -3.95 -3.24

Insult people. 1.75 -3.87 -4.28 -3.06 -2.61 -1.83 -1.46 3.09 -6.12 -5.14 -4.23 -4.52 -4.13 -4.21

Neglect my duties. 1.70 -5.63 -5.90 -4.63 -4.22 -3.62 -3.05 3.88 7.34 -6.23 -5.46 -5.64 -5.46 -5.37

Note. N=1130

Table 5

Item Parameters of the FC Measure under the Honest Condition

Pair Trait 1 Statement 1 Trait 2 Statement 2

1 C5 Make plans and stick to them 0.92 1.88 -2.42 ES9 Am not easily bothered by things 0.17 10.65 -6.78

2 E2 Make friends easily 1.20 4.02 -3.52 C1 Am always prepared 2.08 0.82 -1.27

3 E3 Am skilled in handling social situations 1.38 3.72 -3.22 C4 Carry out my plans 2.39 0.65 -1.18

4 ES8 Feel comfortable with myself 0.19 2.55 -6.61 E1 Feel comfortable around people 0.86 5.23 -3.88

5 A5 Make people feel at ease 0.81 1.73 -4.54 E4 Am the life of the party 1.17 5.41 -3.16

6 E5 Know how to captivate people 0.79 5.30 -3.21 A4 Accept people as they are 0.62 2.94 -5.29

7 C2 Pay attention to details 0.60 3.38 -4.20 O1 Believe in the importance of art 0.72 3.57 -2.60

8 C3 Get chores done right away 0.83 2.46 -1.76 O2 Have a vivid imagination 0.66 3.53 -4.08

9 ES6 Rarely get irritated 0.37 6.48 -2.08 O5 Enjoy hearing new ideas 0.67 3.38 -5.78

10 ES7 Seldom feel blue 0.37 6.46 -4.80 O4 Carry the conversation to a higher level 0.48 4.23 -5.18

11 A2 Believe that others have good intentions 0.77 3.31 -3.30 ES10 Am very pleased with myself 0.85 1.34 -1.10

12 O6 Am not interested in abstract ideas 1.17 -3.91 -2.07 E9 Don’t like to draw attention to myself 1.39 -1.97 -3.58

13 O9 Do not enjoy going to art museums 3.36 -3.25 -2.17 E7 Keep in the background 2.24 -1.93 -3.16

14 O8 Avoid philosophical discussions 0.76 -4.20 -3.79 E8 Would describe my experiences as somewhat dull 0.52 -3.76 -4.02

15 O7 Do not like art 2.08 -2.91 -1.98 E10 Don’t talk a lot 1.90 -1.97 -2.94

16 A8 Suspect hidden motives in others 0.43 -2.27 -3.08 C7 Find it difficult to get down to work 1.60 -0.95 -0.52

17 E6 Have little to say 0.99 -1.96 -3.07 C8 Do just enough work to get by 1.01 -1.61 -0.28

18 A6 Have a sharp tongue 1.89 -2.69 -2.95 ES1 Often feel blue 1.90 -1.95 -1.66

19 ES2 Dislike myself 1.29 -0.97 -0.24 C6 Waste my time 0.58 -1.46 -2.58

20 ES5 Panic easily 1.21 -2.45 -3.07 A7 Cut others to pieces 1.46 -3.78 -3.19

21 ES3 Am often down in the dumps 2.61 -1.91 -2.22 A9 Get back at others 2.46 -3.31 -2.99

22 ES4 Have frequent mood swings 0.96 -0.45 -0.88 C9 Don’t see things through 1.61 -0.79 -0.71

23 A10 Insult people 0.78 -4.26 -3.66 C10 Neglect my duties 1.57 -0.69 -1.13

Note. N=1130. A=Agreeableness, C=Conscientiousness, E=Extraversion, ES=Emotional Stability, O=Openness to Experience

Table 6

Item Parameters of the FC Measure under the Applicant Condition

Pair Trait 1 Statement content Trait 2 Statement content

1 C5 Make plans and stick to them 0.80 1.54 -3.53 ES9 Am not easily bothered by things 0.17 12.08 -2.15

2 E2 Make friends easily 0.37 5.74 -2.88 C1 Am always prepared 0.82 2.92 -4.10

3 E3 Am skilled in handling social situations 0.84 3.23 -4.14 C4 Carry out my plans 0.29 6.62 -3.57

4 ES8 Feel comfortable with myself 0.63 6.18 -2.62 E1 Feel comfortable around people 1.13 1.74 -3.91

5 A5 Make people feel at ease 0.27 -7.83 -2.00 E4 Am the life of the party 0.29 7.88 -0.12

6 E5 Know how to captivate people 1.40 1.62 -2.95 A4 Accept people as they are 0.84 1.23 -3.94

7 C2 Pay attention to details 1.79 1.20 -3.15 O1 Believe in the importance of art 1.38 4.50 -2.25

8 C3 Get chores done right away 1.29 2.04 -3.35 O2 Have a vivid imagination 0.75 4.04 -1.99

9 ES7 Seldom feel blue 0.14 13.79 -2.16 O4 Carry the conversation to a higher level 0.39 0.81 -4.64

10 A2 Believe that others have good intentions 0.41 5.18 -6.46 ES10 Am very pleased with myself 0.60 4.83 -3.64

11 O6 Am not interested in abstract ideas 1.45 -2.10 -2.00 E9 Don’t like to draw attention to myself 1.71 -1.43 -1.48

12 O9 Do not enjoy going to art museums 4.14 -1.73 -2.21 E7 Keep in the background 3.51 -1.48 -1.04

13 O8 Avoid philosophical discussions 0.51 -4.03 -5.30 E8 Would describe my experiences as somewhat dull 0.47 -4.96 -3.24

14 O7 Do not like art 2.68 -1.39 -2.10 E10 Don’t talk a lot 3.26 -2.55 -1.91

15 A8 Suspect hidden motives in others 53.65 0.25 -1.56 C7 Find it difficult to get down to work 1.76 -5.05 -3.38

16 E6 Have little to say 0.90 -0.59 -2.38 C8 Do just enough work to get by 1.13 -5.26 -3.72

17 A6 Have a sharp tongue 1.24 -0.76 -1.23 ES1 Often feel blue 0.83 -2.97 -2.30

18 ES2 Dislike myself 0.65 -3.80 -4.86 C6 Waste my time 0.86 -4.55 -3.61

19 ES5 Panic easily 1.10 -1.60 -1.84 A7 Cut others to pieces 1.33 -1.21 -0.89

20 ES3 Am often down in the dumps 2.81 -2.12 -2.48 A9 Get back at others 4.73 -1.04 -0.83

21 ES4 Have frequent mood swings 0.38 -3.89 -6.50 C9 Don’t see things through 1.03 -4.36 -3.04

22 A10 Insult people 1.31 -0.29 -0.98 C10 Neglect my duties 1.04 -4.19 -3.74

Note. N=1130. A=Agreeableness, C=Conscientiousness, E=Extraversion, ES=Emotional Stability, O=Openness to Experience

Table 7

Goodness-of-Fit Indices for the Ideal-Employee Confirmatory Factor Analysis Models Tested

                                                          RMSEA 90% CI
Models                χ2   df     p   TLI   CFI   RMSEA    Low   High
1a. Honest SS     114.32    5   .00   .93   .93     .14    .12    .16
1b. Applicant SS   92.21    5   .00   .96   .98     .12    .10    .15
1c. Honest FC     432.83    5   .00   .20   .60     .28    .25    .30
1d. Applicant FC  541.29    5   .00   .05   .47     .31    .29    .33

Note. N = 1130. SS = single-stimulus measure; FC = forced-choice measure; TLI = Tucker-Lewis Index; CFI = Comparative Fit Index; RMSEA = root mean square error of approximation.
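The RMSEA point estimates in Table 7 can be checked against the reported χ2, df, and N. The sketch below assumes the common formula RMSEA = sqrt(max(χ2 − df, 0) / (df × (N − 1))). The thesis does not state which software or formula variant was used, so this is an illustration under that assumption, and the function name is hypothetical.

```python
import math


def rmsea(chi2, df, n):
    """RMSEA point estimate from a model chi-square, its df, and sample size.

    Assumes the (N - 1) form of the formula; some programs use N instead.
    """
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


# Chi-square values as reported in Table 7 (df = 5, N = 1130 for every model).
table7_chi2 = {
    "1a. Honest SS": 114.32,
    "1b. Applicant SS": 92.21,
    "1c. Honest FC": 432.83,
    "1d. Applicant FC": 541.29,
}
table7_rmsea = {name: round(rmsea(x, 5, 1130), 2) for name, x in table7_chi2.items()}
```

Under this assumption the rounded values agree with the RMSEA column of Table 7.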

Table 8

Goodness-of-Fit Indices for the Structural Equation Models Tested

                                                          RMSEA 90% CI
Models                χ2   df     p   TLI   CFI   RMSEA    Low   High      β
2a. Honest SS     382.69   34   .00   .93   .95     .10    .09    .10    .31
2b. Applicant SS  459.52   34   .00   .95   .96     .11    .10    .11    .36
2c. Honest FC     613.91   34   .00   .83   .87     .12    .11    .13   -.06
2d. Applicant FC  912.80   34   .00   .77   .83     .15    .14    .16    .01

Note. N = 1130. SS = single-stimulus measure; FC = forced-choice measure; TLI = Tucker-Lewis Index; CFI = Comparative Fit Index; RMSEA = root mean square error of approximation.

Table 9

Correlation of the Honest and the Applicant Conditions in the Single-Stimulus Measure

                                     1      2      3      4      5      6      7      8      9     10
1. Agreeableness Honest              1
2. Conscientiousness Honest       .510      1
3. Extraversion Honest            .288   .376      1
4. Emotional Stability Honest     .468   .544   .489      1
5. Openness Honest                .338   .325   .319   .186      1
6. Agreeableness Faking           .338   .274   .112   .149   .231      1
7. Conscientiousness Faking       .251   .289   .119   .143   .230   .742      1
8. Extraversion Faking            .210   .213   .124   .117   .213   .655   .667      1
9. Emotional Stability Faking     .261   .282   .144   .265   .202   .655   .709   .649      1
10. Openness Faking               .270   .308   .207   .156   .368   .670   .600   .633   .523      1

Note. N=1130. Correlations are all significant at the 0.01 level (2-tailed).

Table 10

Correlation of the Honest and the Applicant Conditions in the Forced-Choice Measure

                                     1      2      3      4      5      6      7      8      9     10
1. Agreeableness Honest              1
2. Conscientiousness Honest      -.149      1
3. Extraversion Honest           -.297  -.110      1
4. Emotional Stability Honest    -.116  -.099  -.078      1
5. Openness Honest                .010  -.155  -.129  -.189      1
6. Agreeableness Faking           .152  -.013  -.060  -.096   .017      1
7. Conscientiousness Faking       .056   .030  -.008  -.113   .034   .082      1
8. Extraversion Faking            .097   .006  -.108  -.003   .021   .081   .023      1
9. Emotional Stability Faking    -.183  -.041   .061   .189  -.032  -.107  -.244  -.277      1
10. Openness Faking              -.050   .021   .006  -.021   .056  -.010  -.161  -.213  -.121      1

Note. N=1130. Correlations greater than or equal to +/- .077 are significant at p<0.01.

Table 11

Correlations of the Trait Estimates Obtained from Single-Stimulus and Forced-Choice Measures under the Honest and the Applicant Conditions

                                 Forced-Choice
Condition  Single-Stimulus      Agreeableness  Conscientiousness  Extraversion  Emotional Stability  Openness
Honest     Agreeableness         .344          -.128              -.085         -.016                -.111
           Conscientiousness    -.167           .344              -.065          .050                -.189
           Extraversion         -.289          -.149               .575          .107                -.134
           Emotional Stability  -.226          -.039               .034          .432                -.236
           Openness             -.026          -.184              -.008         -.062                 .463
Applicant  Agreeableness         .172           .166              -.212         -.212                -.017
           Conscientiousness     .118           .301              -.135         -.135                -.084
           Extraversion          .065           .101               .469         -.081                -.168
           Emotional Stability   .071           .202               .289         -.044                -.116
           Openness              .043           .119               .121         -.055                 .109

Note. N=1130. Correlations greater than or equal to +/- .077 are significant at p<0.01.

Table 12

Correlations of the ATIC with Each Trait Estimated between Two Testing Formats across the

Honest and the Applicant Conditions

                                ATIC
Condition  Trait                Single-Stimulus  Forced-Choice  Difference
Honest     Agreeableness         .162            -.065          Yes
           Conscientiousness     .284             .017          Yes
           Extraversion          .221             .080          Yes
           Emotional Stability   .171            -.001          Yes
           Openness              .210            -.017          Yes
Applicant  Agreeableness         .315             .008          Yes
           Conscientiousness     .291             .074          Yes
           Extraversion          .267            -.053          Yes
           Emotional Stability   .236             .062          Yes
           Openness              .378             .056          Yes

Note. N=1130. Correlations greater than or equal to +/- .077 are significant at p<0.01.

Table 13

Descriptive Statistics for Reaction Measures

                               Single-Stimulus       Forced-Choice
Scale                           Mean     SD     α     Mean     SD     α
Chance to perform              11.66   5.23   .95    10.84   5.35   .97
Face validity                  19.47   4.43   .80    17.07   5.30   .86
Perceived predictive validity  17.04   6.45   .88    14.81   5.97   .93
Propriety of questions         12.54   3.09   .78    11.95   3.49   .83
Reactions to honesty test      35.42   8.99   .87    33.52   9.50   .87

Note. N=1130.

Table 14

ANOVAs Comparing the Effect of Testing Format on the Five Reaction Measure Scales

                               Mean         95% CI of the Difference
Scale                          Difference   Low     High        F      p   Cohen’s d
Chance to Perform                     .83    .39    1.27    13.65    .00         .15
Face Validity                        1.58   1.20    1.96    65.93    .00         .34
Perceived Predictive Validity        2.22   1.75    2.70    85.48    .00         .39
Propriety of Questions                .59    .32     .86    17.88    .00         .18
Reactions to Honesty Test            1.90   1.14    2.67    23.91    .00         .21

Note. N=1130.

a. I manage to relax easily

b. I am careful over detail

c. I enjoy working with others

d. I set high personal standards

Figure 1

Example of a Multidimensional Forced-Choice Item (Adapted from Brown and Maydeu-Olivares, 2013, p. 36)

Which of the following is more like you?

a. Usually, my notes are so jumbled, even I have a hard time reading them.

b. My social skills are about average.

Figure 2

Example of a Multidimensional Pairwise Preference Item Representing Order and Self-Control (Adapted from Chernyshenko et al., 2009)

Figure 3

Example of the Dominance Response Process

[Figure: expected item score (vertical axis, 1 to 5) plotted over a horizontal axis running from -4 to 4]

Figure 4

Example of the Ideal Point Response Process

[Figure: expected item score (vertical axis, 1 to 5) plotted over a horizontal axis running from -4 to 4]

Figure 5

A Hypothetical Item Response Surface of a MUPP Item (αs=0.7, δs=1.5, τs=-0.3; αt=2.0, δt=0.9,

τt=0.1)

[Figure: response surface with P(x) on the vertical axis (0 to 1) and θs and θt on the horizontal axes (each from -3 to 2.6)]
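A surface of this kind can be sketched numerically from the parameters given in the caption. The sketch below assumes the usual MUPP formulation, in which each statement follows the dichotomous generalized graded unfolding model (GGUM) and the probability of preferring statement s over statement t is P_s(1)P_t(0) / [P_s(1)P_t(0) + P_s(0)P_t(1)]. This appendix does not restate the model equations, so the formulas and function names here are assumptions for illustration rather than the author's code.

```python
import math


def ggum_agree(theta, alpha, delta, tau):
    """Probability of endorsing one statement under the dichotomous GGUM."""
    d = theta - delta
    num = math.exp(alpha * (d - tau)) + math.exp(alpha * (2 * d - tau))
    den = 1 + math.exp(alpha * 3 * d) + num
    return num / den


def mupp_prefer_s(theta_s, theta_t, s_params, t_params):
    """Probability of choosing statement s over statement t in a pair."""
    ps = ggum_agree(theta_s, *s_params)
    pt = ggum_agree(theta_t, *t_params)
    return ps * (1 - pt) / (ps * (1 - pt) + (1 - ps) * pt)


# (alpha, delta, tau) for statements s and t, as given in the Figure 5 caption.
S_PARAMS = (0.7, 1.5, -0.3)
T_PARAMS = (2.0, 0.9, 0.1)

# Grid matching the axis ticks: -3, -2.2, ..., 2.6.
grid = [round(-3 + 0.8 * i, 1) for i in range(8)]
surface = [[mupp_prefer_s(ts, tt, S_PARAMS, T_PARAMS) for tt in grid] for ts in grid]
```

Each row of `surface` fixes θs and varies θt. Because the GGUM is an ideal point model, the endorsement probability for a statement peaks where θ equals δ and falls off symmetrically on either side.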

Figure 6

Model 2 (Note: ATIC=ability to identify criteria; IE-F=ideal-employee factor; A=Agreeableness; C=Conscientiousness; E=Extraversion; ES=Emotional Stability; and O=Openness to Experience)