
Forced-Choice Personality Tests: A Measure of Personality and Cognitive Ability?

Nicholas L. Vasilopoulos, Jeffrey M. Cucina, Natalia V. Dyomina, and Courtney L. Morewitz

The George Washington University

Richard R. Reilly
Stevens Institute of Technology

This study examined the effects of item format (single-stimulus vs. forced-choice) and response motivation (honest vs. applicant) on scores for personality scales measuring Conscientiousness and Openness to Experience. Consistent with the hypotheses, cognitive ability was related to forced-choice personality scores in the applicant condition but not in the honest condition. Cognitive ability was unrelated to single-stimulus personality scores in both the applicant and honest conditions. The results suggest that controlling for cognitive ability can reduce the incremental predictive validity of forced-choice personality scales in applicant settings. Findings are discussed in terms of the importance of considering how item format influences the construct and criterion-related validity of personality tests used to make selection decisions.

Personality testing has become an accepted method of employee selection (Mount & Barrick, 1998). The rise of personality testing is linked to the publication of meta-analytic studies showing that personality scales are useful predictors of job performance (see Barrick, Mount, & Judge, 2001) as well as primary studies showing that personality scales can add incremental validity over cognitive ability tests (Mount, Witt, & Barrick, 2000; Schmidt & Hunter, 1998, 2004).

HUMAN PERFORMANCE, 19(3), 175–199
Copyright © 2006, Lawrence Erlbaum Associates, Inc.

Jeffrey M. Cucina is now at U.S. Customs and Border Protection.
Correspondence should be sent to Nicholas L. Vasilopoulos, Department of Organizational Sciences & Communication, George Washington University, 600 21st Street, NW, Washington, DC 20052. E-mail: [email protected]

Despite this evidence, many practitioners have expressed concern that the utility of personality scores used to make selection decisions is compromised by applicant faking (Goffin & Christiansen, 2003).

There are several approaches to dealing with applicant faking on personality tests (see Hough, Eaton, Dunnette, Kamp, & McCloy, 1990). A common approach is to correct scores using a social desirability scale embedded in the personality test. The corrections are expected to enhance criterion-related validity by suppressing variance in personality scores that is related to faking but unrelated to performance. In general, research suggests that correcting personality scores for faking has little impact on criterion-related validity (Barrick & Mount, 1996; Christiansen, Goffin, Johnston, & Rothstein, 1994; Hough, 1998; Hough et al., 1990; Ones & Viswesvaran, 1998; Ones, Viswesvaran, & Reiss, 1996). Some researchers have argued that these results provide evidence that the concerns about applicant faking are overblown (e.g., Barrick & Mount, 1996; Ones et al., 1996). Other researchers have argued that the results are misleading because applicants who fake are overrepresented at the top of the score distribution (e.g., Christiansen et al., 1994; McDaniel, Douglas, & Snell, 1997). This is problematic because the correlational techniques used to make corrections assume that fakers are equally distributed throughout the range of personality scores (Conger & Jackson, 1972; Snell & McDaniel, 1998).

Questions about the utility of correcting personality scores have led to an interest in more proactive approaches to dealing with applicant faking. An old approach that has received renewed attention is the use of a personality test with a forced-choice (FC) item format (e.g., Christiansen, Burns, & Montgomery, 2005; Christiansen, Edelstein, & Fleming, 1998; Jackson, Wroblewski, & Ashton, 2000). In its simplest form, a FC personality item requires the test taker to select between two stems that reflect equally desirable traits. The rationale for using FC personality tests is captured by Jackson et al. (2000), who stated:

Forced-choice questionnaires represent one possible method of minimizing the problem of faking in employment testing by making the task of responding desirably much more difficult. If respondents are motivated to make the best possible impression, being forced to choose between items similar in perceived relevance to the job tends to reduce this type of impression management. (p. 373)

PURPOSE OF THE STUDY

Though intuitively appealing, FC items are substantially different from the single-stimulus (SS) items typically included on personality tests.1


1We use the term single-stimulus to be consistent with the terminology of Jackson et al. (2000). Coombs (1964) used single-stimulus as the label for scale scores falling in Quadrant II of his taxonomy of data. Scale scores in Quadrant II are independent of the test taker’s score on other scales assessed by the test.

This difference in item format can have implications for both the construct and criterion-related validity of personality scales. In this study, the implication for construct validity is assessed through analyses that examine whether personality item format moderates the relationship between personality and cognitive ability test scores. Of particular interest is the effect of item format on the correlation between personality and cognitive ability scores in a selection context. If FC personality items are hard to fake, it seems reasonable to expect that high cognitive ability applicants will be most effective at identifying the appropriate response.

This study also includes analyses that compare the criterion-related validity of honest and applicant FC personality scores. Researchers have suggested that FC personality scales yield higher criterion-related validity than SS personality scales because they are more resistant to applicant faking (Christiansen, 2001; Christiansen et al., 1998; Jackson et al., 2000). An alternative explanation for the higher validity estimates is that the FC personality scales capture differences in cognitive ability as well as the intended personality construct.

THE FORCED-CHOICE ITEM FORMAT

The FC item format was introduced more than half a century ago as a way to address concerns about faking. Initial interest in the FC method is indicated by its use on many notable noncognitive measures developed in the middle of the 20th century (see Zavala, 1965). The popularity of the FC method faded, however, as critics began to point out several limitations (Clemans, 1966; Guilford, 1954; Hicks, 1970; Travers, 1951). The key criticisms included the fact that FC personality tests (a) are difficult and time consuming to develop, (b) are not entirely resistant to faking, and (c) have statistical properties that limit making meaningful score comparisons across applicants. The last two criticisms are elaborated on in this section because they concern the validity of FC personality tests used for personnel selection.

Faking and the Forced-Choice Method

There is ample evidence that motivated test takers can inflate their scores on FC personality scales (Borislow, 1958; Corah, Feldman, Cohen, Meadow, & Ringwall, 1958; Dicken, 1959; Dunnette, McCartney, Carlson, & Kirchner, 1962; Feldman & Corah, 1960; Norman, 1963). These findings have long been viewed as evidence that the FC item format offers an inadequate solution to the problem of applicant faking. Recently, however, researchers have questioned whether this conclusion about the usefulness of FC personality tests is too pessimistic (Baron, 1996; Christiansen et al., 1998; Jackson et al., 2000).

One criticism of the early research on the FC item format concerns the definition of desirability used to form dyads. Many FC personality tests included dyads that paired stems with similar ratings of general desirability.


Waters (1965) argued that this approach to dyad formation is flawed because a stem can reflect a trait that is seen as desirable in general, but neutral or undesirable for a specific job. For example, the stem “I enjoy being around people” is desirable in general, but neutral or undesirable for a job that requires working alone. Waters proposed that forming dyads on the basis of job desirability ratings will lead to more promising results.

A second criticism concerns the appropriateness of the score comparisons used to evaluate whether the FC item format offers an effective solution to applicant faking. Most evaluations of effectiveness involved comparing the difference between honest and applicant FC personality scores. Christiansen et al. (1998) argued that a more appropriate evaluation of effectiveness involves comparing the difference between honest and applicant FC scores with the difference between honest and applicant SS scores. This approach defines effectiveness in relative (more vs. less) rather than absolute (all vs. nothing) terms.

Christiansen et al. (1998) conducted a study that addressed the two concerns raised previously about the initial research on the FC item format. To ensure job relevance, the SS and FC tests included scales measuring Conscientiousness and Extraversion, the two personality dimensions found to predict performance in customer service jobs similar to the one used in the study (Frei & McDaniel, 1998). Second, the effectiveness of the FC item format was evaluated using a relative rather than absolute standard. Consistent with earlier findings, scores on both the FC and SS personality tests were higher in the applicant condition. However, only the difference between honest and applicant SS personality scores was significant. Applying a relative standard of effectiveness leads to the conclusion that FC personality tests are useful in combating applicant faking, a conclusion not reached when applying an absolute standard.

Measurement Issues and the Forced-Choice Method

A major difference between SS and FC personality tests is the measurement properties they possess. SS personality tests have normative properties that yield mathematically independent scales (i.e., a score on one scale has no effect on the scores for other scales) that allow for scores to be interpreted relative to the mean score for the population of test takers. On the other hand, FC personality tests have ipsative properties that yield mathematically dependent scales (i.e., a score on one scale leads to specific scores on other scales) that allow for scores to be interpreted relative to the test taker’s mean scores on the other scales. The ipsative nature of FC personality tests has led many researchers to argue against their use in settings such as personnel selection where normative comparisons are required (Hicks, 1970; Johnson, Wood, & Blinkhorn, 1988; Tenopyr, 1988).

The problem with using FC personality scores to make normative comparisons is illustrated in the following example of a six-item test where all dyads include stems measuring Conscientiousness and Openness to Experience.


The Conscientiousness and Openness to Experience scale scores are dependent because the endorsement of one stem necessarily means the rejection of the other stem. That is, a test taker with a Conscientiousness score of four must have an Openness to Experience score of two, resulting in a –1.0 correlation between Conscientiousness and Openness to Experience scores. This property of a purely ipsative measure violates a basic assumption of the statistics used to make normative comparisons (e.g., yields a singular matrix), making it difficult to estimate reliability and validity (Guilford, 1954; Hicks, 1970; Johnson et al., 1988; Tenopyr, 1988).
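The dependence can be made concrete with a short simulation. The following is a minimal sketch in Python; the sample size, random choices, and variable names are illustrative and are not data or code from the study.

```python
import numpy as np

rng = np.random.default_rng(0)
n_items, n_people = 6, 200

# 1 = the Conscientiousness stem in the dyad was endorsed, 0 = the Openness stem was.
choices = rng.integers(0, 2, size=(n_people, n_items))

conscientiousness = choices.sum(axis=1)   # number of Conscientiousness endorsements
openness = n_items - conscientiousness    # every rejected C stem is an endorsed O stem

# The two scale scores always sum to the number of items ...
assert np.all(conscientiousness + openness == n_items)
# ... so across test takers they correlate at -1.0, as described above.
print(np.corrcoef(conscientiousness, openness)[0, 1])
```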

Though FC personality tests have ipsative properties, Hicks (1970) argued that scales on an FC personality test can have normative properties, especially when only one of the stems in each dyad taps a measured personality construct. Recent research supports this argument. Christiansen et al. (1998) noted that the comparisons they used to evaluate the construct validity of their FC personality test also provided evidence that the test has normative properties. As expected, related scales on the SS and FC personality tests correlated highly in both the honest and applicant conditions. Moreover, scales on the SS and FC personality tests correlated highly with conceptually related scales on the Revised NEO Personality Inventory (NEO-PI-R; Costa & McCrae, 1992). Taken together, these findings indicate that FC personality scores capture some of the normative information provided by SS personality scores in addition to the ipsative information inherent in the use of the FC format. Evidence that FC personality scores have normative properties is also provided by research examining criterion-related validity. For example, Jackson et al. (2000) compared the criterion-related validity for FC and SS Conscientiousness scales completed under honest and applicant instructions. The FC scale predicted self-reported delinquent job behaviors in both the honest and applicant conditions, whereas the SS scale was only predictive in the honest condition. Christiansen (2001) reported similar results, with the major difference being the use of supervisor ratings rather than self-reported delinquency as a criterion.

In sum, research suggests that scales on FC personality tests are more resistant to score inflation due to applicant faking than scales on SS personality tests. Though promising, it is premature to conclude that FC personality scales are always superior in applied settings. It is possible that applicant responses to FC personality items are partly a function of cognitive ability. This would have implications for the construct and incremental validity of FC personality scale(s) included in a selection battery along with a cognitive ability test.

PERSONALITY RESPONSE PROCESS

To understand how personality item format can influence the relationship between personality and cognitive ability, it is helpful to consider the similarities and differences between the honest and faked item response processes (Holden, Kroner, Fekken, & Popham, 1992; Vasilopoulos, Reilly, & Leaman, 2000).


The processes are similar in that item responses involve comparing the trait reflected in the stem(s) to a cognitive structure that defines the image the test taker is trying to present. The processes differ in the type of cognitive structure used to make item responses.

Honest item responses are believed to involve a self-referenced evaluation of the fit between the trait reflected in the stem(s) and a relevant self-schema—a cognitive structure that integrates memories, affective reactions, and inferred traits for behaviors in a specific domain (Markus, 1977). Self-schemas strengthen when domain-relevant personal experiences occur consistently across various situations (see Kunda, 1999). People have many self-schemas; however, only a subset is active in the working self-concept at any moment (Markus & Kunda, 1986). This subset includes core self-schemas that are always active as well as peripheral self-schemas that are temporarily active due to environmental cues and personal motivations.

Item responses made by applicants who fake are believed to involve a semantic evaluation of the fit between the trait reflected in the stem(s) and an adopted applicant schema—a cognitive structure that integrates knowledge about the behaviors and traits of a qualified job candidate (Holden et al., 1992).2 Research suggests that the accuracy of the adopted applicant schema is a function of job knowledge (Kroger & Turnbull, 1975; Vasilopoulos et al., 2000), though there is evidence that accurate schemas can be adopted solely based on the job title (Furnham, 1990; Holden & Jackson, 1981; Mahar, Cologon, & Duck, 1995; Velicer & Weiner, 1975).

ITEM FORMAT AND THE PERSONALITY/COGNITIVE ABILITY RELATIONSHIP

Single-Stimulus Personality Items

A SS item response is a relatively straightforward task because all responses to items on the test are made independent of each other (Coombs, 1964). This means that test takers only need to decide the point on a response scale that best describes the fit between the trait reflected in the stem and the referenced schema. Cognitive ability should not play a major role in honest responses to SS personality items. The formation and strength of a self-schema is based on the consistency of cognitive, affective, and behavioral responses over time.


2An adopted schema is similar to social cognitive constructs such as an “implicit personality theory” or a “person schema.” We use adopted schema to be consistent with the terminology of Holden et al. (1992).

Though most test takers will have at least one self-schema with a cognitive component (e.g., a self-schema for creative or imaginative), they will also have several self-schemas with no link to cognitive ability (e.g., self-schemas for dependable or moral).

It is also unlikely that cognitive ability impacts responses to SS personality items made by applicants who fake. Research has shown that test takers instructed to respond in a manner consistent with an applicant qualified for a specific job are able to do so (Furnham, 1990; Vasilopoulos et al., 2000). Moreover, it appears that applicants have little trouble identifying the social desirability of most SS personality items (Rosse, Stecher, Miller, & Levin, 1998). Finally, compared to nonapplicants, applicants make more extreme, socially desirable responses to SS personality items (Schmit & Ryan, 1993; Stark, Chernyshenko, Chan, Lee, & Drasgow, 2001). Collectively, these findings suggest that inflating scores on SS personality tests is a relatively simple task that applicants motivated to fake can effectively perform regardless of cognitive ability level.

Forced-Choice Personality Items

Comparatively, the response process is more complex for FC personality items than for SS personality items. The added complexity occurs because the evaluation of fit for one stem in the dyad is dependent on the evaluation of fit for the other stem. This leads to a response decision that involves a relative evaluation of fit to determine which of the similarly desirable stems is most consistent with the referenced schema. Cognitive ability should not directly influence honest responses to FC personality items. A test taker answering honestly is motivated to endorse the stem most consistent with an active self-schema. Some of the active self-schemas will link to cognitive ability, whereas others will not.

In contrast, cognitive ability should impact the effectiveness of applicant responses to FC personality items by influencing the extent to which fakers are able to accurately rank order traits in terms of job relevance. The rationale for positing a relationship between cognitive ability and FC personality scores in an applicant setting comes from performance appraisal research linking cognitive ability and components of Cronbach’s (1955) rater accuracy model. Smither and Reilly (1987) reported a significant relationship between cognitive ability and stereotype accuracy—the extent to which the profile of dimension means matches the normative profile. Hauenstein and Alexander (1991) reported a significant relationship between cognitive ability and dimensional accuracy—the extent to which dimensions are correctly rank ordered in terms of the relevance for a single rater.3


3Hauenstein and Alexander (1991) noted that their definition of dimensional accuracy is equivalent to Cronbach’s (1955) differential accuracy, which refers to the correct rank order of multiple ratees on each dimension.

Both stereotype and dimensional accuracy concern the accuracy of inferences of trait covariation made on the basis of a job schema (Nathan & Alexander, 1985). Based on the preceding discussion, the following hypotheses are made:

H1: SS personality scores will be higher in the applicant condition than in the honest condition.

H2: FC personality scores will be higher in the applicant condition than in the honest condition; however, the difference between scores in the applicant and honest conditions will be largest among participants with high levels of cognitive ability.

ITEM FORMAT AND THE PERSONALITY/PERFORMANCE RELATIONSHIP

A second purpose of this study is to explore whether the relationship between personality scores and college grade point average (GPA) is moderated by response motivation. Of particular interest is whether the higher validity estimates for FC personality scores in the applicant condition occur because they tap cognitive ability. There is substantial research showing that cognitive ability tests are highly predictive of performance in academic and organizational settings (Camara & Echternacht, 2000; Schmidt & Hunter, 1998). Therefore, any overlap with cognitive ability scores could limit the utility of a FC personality test used for selection.

Overall, research suggests a relationship between personality and learning in both academic and organizational settings. A consistent finding is the criterion-related validity for the Big Five personality dimensions of Conscientiousness and Openness to Experience (Barrick & Mount, 1991; Colquitt & Simmering, 1998; Costa & McCrae, 1992; Dolinger & Orf, 1991; Farsides & Woodfield, 2003; Musgrave-Marquart, Bromley, & Dalley, 1997; Tross, Harper, Osher, & Kneidinger, 2000; Wolfe & Johnson, 1995). For this reason, Conscientiousness and Openness to Experience scales are included in this study.

In developing the FC personality test, stems reflecting Conscientiousness or Openness to Experience were paired with distracter stems reflecting the Big Five dimensions of Agreeableness or Extraversion. Agreeableness and Extraversion stems served as distracters because there is little evidence that these dimensions are related to learning in applied settings. Stems reflecting Neuroticism were not included in this study because there is some evidence that it is related to learning performance (Hurtz & Donovan, 2000). This approach leads to the development of a FC personality test that allows for normative comparisons in levels of Conscientiousness and Openness to Experience. Based on the preceding discussion, the following hypotheses are made:


H3a: SS personality scores in the honest condition will be stronger predictors of GPA than SS personality scores in the applicant condition.

H3b: FC personality scores in the applicant condition will be stronger predictors of GPA than FC personality scores in the honest condition.

H4a: SS personality scores in the honest condition will add incrementally to the prediction of GPA found using a cognitive ability test alone to a greater extent than SS personality scores in the applicant condition.

H4b: FC personality scores in the honest condition will add incrementally to the prediction of GPA found using a cognitive ability test alone to a greater extent than FC personality scores in the applicant condition.

Note that FC personality scores are expected to account for unique variance in GPA even after controlling for cognitive ability. Recall that Christiansen et al. (1998) reported validity evidence that the FC personality test in their study measured the intended personality constructs. This suggests that FC personality tests completed in an applicant setting will tap both cognitive ability and the intended personality trait. However, the extent to which the overlap with cognitive ability reduces the unique variance in GPA accounted for by the FC personality score is unclear.

METHOD

Participants

Participants were 327 undergraduate students enrolled in psychology courses at a mid-sized university in the eastern United States. Eighty-three percent were White, 8% were Asian, 6% were Hispanic, and 3% were African American. Sixty-two percent were female and 38% were male. The mean age was 19.24 years (SD = 2.12). None of these demographic variables correlated with any of the research variables.

Research Measures

Forced-choice personality test. Stems included on the FC personality test were drawn from Goldberg’s (1999) 300-item Big Five personality test. In the initial step of test development, six upper-level undergraduate students were asked to rate how desirable the trait reflected in each stem would be for a person motivated to gain admittance to a college or university. Ratings were made according to a 7-point scale ranging from 1 (extremely undesirable) to 7 (extremely desirable). Two criteria were used to retain stems for the next step. First, a stem had to have a mean desirability rating less than or equal to three (low desirability), or greater than or equal to four (high desirability).


Second, the standard deviation of the desirability ratings for the stem had to be less than or equal to one. In all, 174 stems met the criteria. The fact that nearly half of the 300 stems did not meet the criteria for use on the FC personality test is not uncommon (Jackson et al., 2000).
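A minimal sketch of this screening rule is shown below; the stems and rating values are hypothetical placeholders rather than the actual Goldberg (1999) items or the six raters' judgments.

```python
import numpy as np

# ratings: rows = stems, columns = the six raters, on the 1-7 desirability scale.
# A hypothetical 4-stem example standing in for the 300-stem pool.
ratings = np.array([
    [6, 7, 6, 7, 6, 6],   # clearly desirable, raters agree       -> retained
    [2, 1, 2, 2, 1, 2],   # clearly undesirable, raters agree     -> retained
    [3, 4, 3, 4, 3, 4],   # mean falls between 3 and 4            -> dropped
    [7, 2, 6, 1, 7, 3],   # raters disagree (SD > 1)              -> dropped
])

means = ratings.mean(axis=1)
sds = ratings.std(axis=1, ddof=1)

# Retain a stem if its mean is <= 3 or >= 4 and its SD is <= 1.
retain = ((means <= 3) | (means >= 4)) & (sds <= 1)
print(retain)  # [ True  True False False]
```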

In the next step, dyads were formed by pairing stems that measured different Big Five factors and had similar desirability ratings (i.e., desirability ratings within 0.5 points of each other). This yielded a total of 64 dyads, of which 50 were retained to ensure that all possible combinations of the Big Five personality factors were equally represented on the test. At least two of the five dyads for each combination were identified as desirable.

In the final step, high-desirability dyads were paired with low-desirability dyads to form 25 quartets. Stems measuring the same Big Five factors were never included in the same quartet. The final version of the FC personality test was created by randomly ordering the quartets. This approach, called the tetrad or dichotomous quartet method, has been used by other researchers developing a FC personality test to avoid making test takers select between two undesirable options (Dunnette et al., 1962; Jackson et al., 2000).

The instructions directed test takers to select the stems in each quartet that are most and least descriptive of their behavior. For desirable stems, a value of +1 was given if it was selected as most descriptive, and –1 was given if it was selected as least descriptive. For undesirable stems, a value of –1 was given if it was selected as most descriptive, and +1 was given if it was selected as least descriptive. Stems not selected were given a value of zero. The Conscientiousness and Openness to Experience scores were computed by summing the values assigned to the 10 quartets. A total score was computed by summing the Conscientiousness and Openness to Experience scores. Examples of the quartets for the Conscientiousness and Openness to Experience scales are presented in the Appendix.
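The scoring rule lends itself to a compact implementation. The sketch below uses hypothetical data structures (a scale tag and a desirability flag per stem) rather than the actual test materials; summing the per-quartet values over the relevant quartets would yield the scale scores described above.

```python
# Score one quartet under the rule described above.
# Each stem is tagged with its scale ("C", "O", "A", "E") and whether it is desirable.
def score_quartet(stems, most_idx, least_idx):
    """Return {scale: value} contributions of one quartet.

    stems: list of four (scale, is_desirable) tuples
    most_idx / least_idx: indices of the stems picked as most / least descriptive
    """
    values = {}
    for i, (scale, desirable) in enumerate(stems):
        if i == most_idx:
            v = 1 if desirable else -1
        elif i == least_idx:
            v = -1 if desirable else 1
        else:
            v = 0               # unselected stems contribute nothing
        values[scale] = values.get(scale, 0) + v
    return values

# Hypothetical quartet: a desirable C/A dyad paired with an undesirable O/E dyad.
quartet = [("C", True), ("A", True), ("O", False), ("E", False)]
print(score_quartet(quartet, most_idx=0, least_idx=2))
# {'C': 1, 'A': 0, 'O': 1, 'E': 0} -- endorsing the desirable C stem and rejecting
# the undesirable O stem each add a point to the corresponding scale.
```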

Single-stimulus personality test. The SS personality test was created using the same 100 stems included on the FC personality test. The instructions directed test takers to indicate the extent to which the behavior depicted in the stem described them using a 5-point scale ranging from 1 (very inaccurate) to 5 (very accurate). Conscientiousness and Openness to Experience scores were computed by summing the values assigned to the 10 items whose stems were also scored on the FC personality test. A total score was computed by summing the values assigned to the 20 items on the Conscientiousness and Openness to Experience scales.

Cognitive ability test. The Wonderlic Personnel Test (WPT; Wonderlic, Inc., 1999) was used to measure cognitive ability. The WPT includes 50 items ordered in terms of increasing difficulty. Participants were given 12 min to complete the test and were required to write in their answers in the space provided.


Academic performance. First semester GPA was used as a measure of academic performance. GPA was obtained from participants’ transcripts provided by the university’s Registrar’s Office with the permission of the participant. We used first semester GPA rather than first year GPA to avoid having to omit participants who had not yet finished their freshman year at the time the study was conducted. In the end, usable GPA data were obtained for 84% of the participants.

Procedure

Data were collected in several group sessions consisting of 10 to 50 participants. Initially, participants completed a background sheet and then were administered the WPT. Participants were then randomly assigned to one of four experimental conditions corresponding to a 2 × 2 Response Instruction × Item Type design. Participants in the honest response condition were instructed to answer honestly, whereas the participants in the applicant response condition were instructed to respond as if they were completing the test as part of the admissions process for a college that they really wanted to attend. Within both response conditions, participants were randomly assigned to complete either the SS or FC personality test.

RESULTS

The descriptive statistics and correlation matrix for all research variables are presented in Table 1. For each variable, the means, standard deviations, and sample sizes are presented separately for participants in the honest and applicant conditions. Differences in the scores observed in the honest and applicant response conditions are shown in terms of effect size estimates (Cohen’s d) and the significance level for an independent samples t test. The correlation matrix presents the results for the honest and applicant response conditions separately above and below the diagonal, respectively, with reliability estimates included in the diagonal. Higher SS and FC personality scores were found in the applicant condition, suggesting that the instruction manipulation was effective. As in earlier research (e.g., Christiansen et al., 2005; Christiansen et al., 1998), the largest differences between scores in the honest and applicant conditions were found on the SS personality test. Finally, for both the SS and FC personality tests, Conscientiousness and Openness to Experience were more strongly correlated in the applicant condition.

Results for H1 and H2

H1 proposed higher SS personality scores in the applicant condition than in the honest condition, whereas H2 proposed higher FC personality scores in the applicant condition, with a larger difference between scores in the two conditions for participants with high cognitive ability.



TABLE 1
Descriptive Statistics and Correlations for Personality and Cognitive Ability Measures

                        Honest                 Applicant
                  n      M      SD       n      M      SD       d    Sig. of t     CON        OPEN       TOTAL        WPT        GPA

Single-stimulus
 CON             81   36.60    4.99     81   39.69    5.62    0.58      .00     (.79, .67)    .44*       .86*         .02        .08
 OPEN            81   33.99    4.68     81   38.09    5.65    0.79      .00       .76*     (.71, .56)    .84*         .09        .13
 TOTAL           81   70.59    8.22     81   77.78   10.59    0.76      .00       .94*       .94*     (.86, .74)      .06        .12
 WPT             81   29.11    4.85     81   28.60    4.73   –0.11      .52      –.05       –.03        –.04     (.82, .94)a     .30*
 GPA             81    3.17     .50     81    3.12     .51   –0.10      .53       .02        .07         .05         .33*         —

Forced-choice
 CON             84    2.98    2.59     83    3.67    2.43    0.27      .07     (.51, .52)    .16        .77*        –.03        .15
 OPEN            84    2.25    2.54     83    2.76    2.86    0.19      .23       .39*     (.66, .54)    .76*         .09        .15
 TOTAL           84    5.23    3.91     83    6.43    4.41    0.29      .06       .80*       .86*     (.70, .58)      .04        .20
 WPT             84   29.69    4.88     83   29.23    4.36   –0.10      .52       .36*       .36*        .43*    (.82, .94)a     .34*
 GPA             84    3.24     .42     83    3.18     .49   –0.13      .13       .24*       .22*        .28*        .32*         —

Note. Sig. of t = the significance level of the t tests for honest versus applicant scores; CON = Conscientiousness scale; OPEN = Openness to Experience scale; TOTAL = Conscientiousness and Openness to Experience scale composite; WPT = Wonderlic Personnel Test; GPA = First semester grade point average. Correlations for the honest condition are presented above the diagonal, whereas correlations for the applicant condition are presented below the diagonal. Coefficient alphas for the personality scales are presented parenthetically in the diagonal for the honest and applicant conditions, respectively.
aRange of test–retest reliabilities reported in the Wonderlic Personnel Test and Scholastic Level Exam (Wonderlic, Inc., 1999).
*p < .01.

Both hypotheses were tested using hierarchical regression analyses, with personality scores regressed on response instruction condition and cognitive ability in Step 1, and the Response Instruction × Cognitive Ability interaction term in Step 2. The results of the regression analyses are presented in Table 2.


TABLE 2
Hierarchical Regression Analyses for Single-Stimulus and Forced-Choice Personality Scores on Response Instruction and Cognitive Ability

                                                   Single-Stimulus                 Forced-Choice
                                              Adjusted R2   ∆R2      β       Adjusted R2   ∆R2      β

Conscientiousness
 Step 1: Main effects                            .067**                         .029*
   Response instruction                                             .279**                          .146*
   Cognitive ability                                               –.019                            .148*
 Step 2: Main effects and two-way interaction    .067**     .000                .062**     .033**
   Response instruction                                             .279**                          .147*
   Cognitive ability                                               –.020                            .170*
   Response instruction × cognitive ability                        –.034                            .197**

Openness to Experience
 Step 1: Main effects                            .126*                          .048**
   Response instruction                                             .370**                          .106
   Cognitive ability                                                .023                            .225**
 Step 2: Main effects and two-way interaction    .126**     .000                .068**     .020*
   Response instruction                                             .370**                          .107
   Cognitive ability                                                .021                            .243**
   Response instruction × cognitive ability                        –.050                            .161*

Total personality test
 Step 1: Main effects                            .116**                         .064**
   Response instruction                                             .356**                          .156*
   Cognitive ability                                                .002                            .234**
 Step 2: Main effects and two-way interaction    .116**     .000                .108**     .044**
   Response instruction                                             .356**                          .157*
   Cognitive ability                                                .001                            .259**
   Response instruction × cognitive ability                        –.046                            .222**

Note. For the analyses involving single-stimulus scales, the df for Step 1 = (2, 159) and the df for Step 2 = (3, 158). For the analyses involving forced-choice scales, the df for Step 1 = (2, 164) and the df for Step 2 = (3, 163).
*p < .05. **p < .01.

Consistent with H1, significant Response Instruction main effects were found for all three SS personality scales, accounting for 6.7% of the variance in Conscientiousness scores, 12.6% of the variance in Openness to Experience scores, and 11.6% of the variance in total scores. In all cases, higher scores were found in the applicant condition. As suggested in H2, significant Response Instruction × Cognitive Ability interaction effects were found for all three FC personality scales, accounting for 3.3% of the variance in Conscientiousness scores, 2.0% of the variance in Openness to Experience scores, and 4.4% of the variance in total personality scores.
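The two-step moderated regression used to test these hypotheses can be sketched as follows (Python with statsmodels; the column names personality, applicant, and wpt are our own placeholders rather than the study's variable labels).

```python
import statsmodels.formula.api as smf

def test_moderation(df):
    """Hierarchical test of the Response Instruction x Cognitive Ability interaction.

    df columns (assumed): 'personality' (an FC or SS scale score),
    'applicant' (0 = honest, 1 = applicant), 'wpt' (cognitive ability).
    """
    # Step 1: main effects of response instruction and cognitive ability only.
    step1 = smf.ols("personality ~ applicant + wpt", data=df).fit()
    # Step 2: add the two-way interaction (the * operator expands to main effects + interaction).
    step2 = smf.ols("personality ~ applicant * wpt", data=df).fit()

    delta_r2 = step2.rsquared - step1.rsquared   # incremental variance from the interaction
    return step1, step2, delta_r2

# Usage (assuming `data` holds the study's data frame):
# _, step2, delta_r2 = test_moderation(data)
# print(step2.params["applicant:wpt"], delta_r2)
```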

The nature of the interaction effects for the FC personality scores can be seen in Table 3, which presents the means, standard deviations, and effect size estimates (Cohen’s d) for all personality scales using a 2 × 2 Cognitive Ability × Response Instruction format. For the purpose of presentation, low and high cognitive ability groups were defined using a median split on the WPT score. It can be seen that the effect of Response Instruction on the FC personality scores was stronger for high cognitive ability participants, with scores in the applicant condition more than three quarters of a standard deviation higher than the scores in the honest condition (all d > .75). Note that a similar effect was found for SS personality scores among high cognitive ability participants (all d > .71). On the other hand, FC personality scores were slightly higher in the honest condition for low cognitive ability participants (all d > .11). This suggests that low cognitive ability applicants attempting to make an overly positive self-presentation might end up with lower FC personality scores than they would if they answered honestly.
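The cell summaries in Table 3 follow from a median split on the WPT and a pooled-SD Cohen's d; a sketch under the same placeholder column names as above is given below.

```python
import numpy as np

def cohens_d(x, y):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * x.var(ddof=1) + (ny - 1) * y.var(ddof=1)) / (nx + ny - 2)
    return (y.mean() - x.mean()) / np.sqrt(pooled_var)

def d_by_ability(df, score_col="personality"):
    """d (applicant minus honest) within low- and high-WPT groups defined by a median split."""
    df = df.assign(high_wpt=df["wpt"] >= df["wpt"].median())
    out = {}
    for is_high, group in df.groupby("high_wpt"):
        honest = group.loc[group["applicant"] == 0, score_col]
        applicant = group.loc[group["applicant"] == 1, score_col]
        out["high" if is_high else "low"] = cohens_d(honest, applicant)
    return out
```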

Results for H3a and H3b

H3a proposed that SS personality scores in the honest condition will be most predictive of GPA, whereas H3b proposed that FC personality scores in the applicant condition will be most predictive of GPA. The hypotheses were tested using hierarchical regression analysis with GPA regressed on personality scores and response instruction condition in Step 1 and the Personality × Response Instruction interaction in Step 2. The results of the hierarchical regression analyses are presented in Table 4. Counter to the hypotheses, none of the Personality × Response Instruction interactions were significant. There were, however, significant main effects for all FC personality scores.

Though no significant Personality × Response Instruction interactions were found, it is instructive to examine the difference between the personality–GPA zero-order correlations in the honest and applicant conditions because of the downward bias that occurs when using moderated regression analysis to test group differences in slopes (Aguinis, Beaty, Boik, & Pierce, 2005; Aguinis, Boik, & Pierce, 2001). The differences in the zero-order correlations between personality and GPA are presented in Table 6. It can be seen that the direction of the differences between correlations is consistent with the hypotheses.



TABLE 3
Means and Standard Deviations for Personality Scale Scores by Instruction and Cognitive Ability

                                              Low WPT                                        High WPT
                                  Honest            Applicant                    Honest            Applicant
                               n    M     SD      n    M     SD      d        n    M     SD      n    M     SD      d

Single-stimulus
 Conscientiousness            42  37.02  4.34    41  39.17  5.58    0.43     39  36.15  5.64    40  40.22  5.69    0.72
 Openness to Experience       42  34.50  4.31    41  37.59  5.45    0.63     39  33.44  5.06    40  38.60  5.88    0.94
 Total personality test       42  71.52  7.21    41  76.76 10.26    0.60     39  69.59  9.18    40  78.83 10.95    0.92

Forced-choice
 Conscientiousness            42   3.02  2.59    42   2.71  2.49   –0.12     42   2.93  2.62    41   4.66  1.94    0.76
 Openness to Experience       42   2.05  2.86    42   1.45  2.96   –0.21     42   2.45  2.18    41   4.10  2.03    0.78
 Total personality test       42   5.07  4.24    42   4.17  4.46   –0.21     42   5.38  3.58    41   8.76  2.92    1.04

Note. WPT = Wonderlic Personnel Test. For scale scores, a positive d value indicates that participants in the applicant condition scored higher than participants in the honest condition.

That is, larger zero-order correlations between personality scores and GPA were found in the honest condition for the SS scales and in the applicant condition for the FC scales.

Results for H4a and H4b

H4a and H4b proposed that SS and FC personality scores in the honest condition would provide greater incremental validity over a cognitive ability test used to predict GPA than scores in the applicant condition.


TABLE 4
Hierarchical Regression Analyses With GPA on Personality Scores and Response Instruction Without Cognitive Ability as a Covariate

                                                   Single-Stimulus                 Forced-Choice
                                              Adjusted R2   ∆R2      β       Adjusted R2   ∆R2      β

Conscientiousness
 Step 1: Main effects                            .005                           .030*
   Personality                                                      .051                            .194*
   Response instruction                                            –.064                           –.093
 Step 2: Main effects and two-way interaction    .005       .000                .030       .000
   Personality                                                      .163                           –.001
   Response instruction                                            –.065                           –.094
   Personality × response instruction                              –.117                            .206

Openness to Experience
 Step 1: Main effects                            .001                           .030*
   Personality                                                      .103                            .193*
   Response instruction                                            –.088                           –.084
 Step 2: Main effects and two-way interaction    .001       .000                .030       .000
   Personality                                                      .223                            .073
   Response instruction                                            –.090                           –.084
   Personality × response instruction                              –.124                            .126

Total personality test
 Step 1: Main effects                            .002                           .051**
   Personality                                                      .085                            .243**
   Response instruction                                            –.080                           –.101
 Step 2: Main effects and two-way interaction    .002       .000                .051*      .044**
   Personality                                                      .250                            .107
   Response instruction                                            –.085                           –.101
   Personality × response instruction                              –.170                            .143

Note. GPA = grade point average.
*p < .05. **p < .01.


TABLE 5
Hierarchical Regression Analyses With GPA on Personality Scores and Response Instruction With Cognitive Ability as a Covariate

                                                   Single-Stimulus                 Forced-Choice
                                              Adjusted R2   ∆R2      β       Adjusted R2   ∆R2      β

Conscientiousness
 Step 1: Covariate                               .093**                         .103**
   Cognitive ability                                                .315**                          .329**
 Step 2: Covariate and main effects              .093**     .000                .116**     .013
   Cognitive ability                                                .314**                          .304**
   Personality                                                      .057                            .149*
   Instruction                                                     –.049                           –.072
 Step 3: Covariate, main effects, and
   two-way interaction                           .093**     .000                .116**     .000
   Cognitive ability                                                .313**                          .303**
   Personality                                                      .136                            .124
   Instruction                                                     –.050                           –.072
   Personality × instruction                                       –.082                            .026

Openness to Experience
 Step 1: Covariate                               .093**                         .103**
   Cognitive ability                                                .315**                          .329**
 Step 2: Covariate and main effects              .093**     .000                .109**     .006
   Cognitive ability                                                .311**                          .298**
   Personality                                                      .095                            .126
   Instruction                                                     –.068                           –.063
 Step 3: Covariate, main effects, and
   two-way interaction                           .093**     .000                .109**     .000
   Cognitive ability                                                .310**                          .297**
   Personality                                                      .150                            .105
   Instruction                                                     –.070                           –.063
   Personality × instruction                                       –.057                            .021

Total personality test
 Step 1: Covariate                               .093**                         .103**
   Cognitive ability                                                .315**                          .329**
 Step 2: Covariate and main effects              .093**     .000                .123**     .020*
   Cognitive ability                                                .313                            .285**
   Personality                                                      .085                            .175*
   Instruction                                                     –.063                           –.077
 Step 3: Covariate, main effects, and
   two-way interaction                           .093**     .000                .123**     .000
   Cognitive ability                                                .311**                          .286**
   Personality                                                      .192                            .190
   Instruction                                                     –.066                           –.077
   Personality × instruction                                       –.110                           –.016

Note. GPA = grade point average.
*p < .05. **p < .01.

The hypotheses were tested using hierarchical regression analyses with GPA regressed on cognitive ability in Step 1, personality scores and response instruction condition in Step 2, and the Personality × Response Instruction interaction in Step 3. The results of the hierarchical regression analyses are presented in Table 5. Counter to the hypotheses, all Personality × Response Instruction interactions were nonsignificant, although significant effects (one-tailed) were found for all of the FC personality scores.

The differences in the semi-partial correlations (controlling for cognitive ability) in the honest and applicant conditions are presented in Table 6. Though consistent with the hypotheses, the differences between the semi-partial correlations in the honest and applicant conditions were small, especially when compared to the differences between the zero-order correlations.
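The semi-partial correlations in Table 6 remove cognitive ability from the personality score but not from GPA. One way such a coefficient could be computed is sketched below, again with placeholder column names (gpa, personality, wpt) that are our own.

```python
import numpy as np
import statsmodels.formula.api as smf

def semipartial_r(df):
    """Correlation between GPA and the part of the personality score
    that is independent of cognitive ability (WPT)."""
    residual = smf.ols("personality ~ wpt", data=df).fit().resid
    return np.corrcoef(df["gpa"], residual)[0, 1]

# Computed separately within the honest and applicant subsamples, the difference
# between the two coefficients corresponds to the "Diff" column of Table 6.
```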

DISCUSSION

The primary goal of this study was to investigate the validity of FC personality tests. Despite the renaissance of personality testing in personnel selection, concerns over the susceptibility of personality tests to applicant faking have not been fully resolved. Though recent research suggests that FC personality tests offer organizations a way to address applicant faking (e.g., Christiansen et al., 1998; Jackson et al., 2000), the results of this study suggest the need to consider whether changing a test characteristic simultaneously alters test validity.

An interesting finding was the fact that low cognitive ability participants in the applicant condition scored slightly lower on the FC personality test than did low cognitive ability participants in the honest condition (d values ranged from –.11 to –.21).


TABLE 6
Difference Between Zero-Order Correlations and Semipartial Correlations Adjusted for Cognitive Ability Across Response Conditions

                              Zero-Order Correlation          Semipartial Correlation
                             Honest   Applicant   Diff       Honest   Applicant   Diff

Single-stimulus
 Conscientiousness             .09       .02       .07         .08       .04       .04
 Openness to Experience        .13       .07       .06         .11       .08       .03
 Total personality test        .13       .05       .08         .11       .06       .05

Forced-choice
 Conscientiousness             .14       .24      –.10         .16       .13       .03
 Openness to Experience        .14       .22      –.08         .12       .12       .00
 Total personality test        .19       .28      –.09         .18       .15       .03

Note. Diff = the difference in the correlation coefficients between the honest and applicant conditions. For the single-stimulus scales, n = 81 in both the honest and applicant conditions. For the forced-choice scales, n = 84 in the honest condition and n = 83 in the applicant condition.

This suggests that low cognitive ability applicants who are motivated to respond to items in a way that makes them look as qualified as possible for a job might inadvertently hurt their chance of getting a job offer. Clearly, the observed effects in this study are small; however, it is important to note the possibility that a larger effect exists in the general population. The average SAT score for freshmen at the university where the data were collected is 1250 (SD = 110), compared to the average national SAT score of 1020 (SD = 209). This suggests a restricted range in the WPT scores in this study, resulting in an underestimate of the true population effect.
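For intuition about how much a restricted predictor range can attenuate an observed relationship, the standard correction for direct range restriction on a predictor (Thorndike's Case II) can be applied to a correlation. The sketch below is purely illustrative: it uses the SAT standard deviations above as a rough proxy for the degree of restriction, which is an assumption on our part and not an analysis reported in the article.

```python
import numpy as np

def correct_range_restriction(r_restricted, u):
    """Thorndike Case II correction for direct range restriction on the predictor.

    r_restricted: correlation observed in the restricted sample
    u: ratio of unrestricted to restricted predictor SD (> 1 when range is restricted)
    """
    return (r_restricted * u) / np.sqrt(1 + r_restricted**2 * (u**2 - 1))

# Illustration only: if the population SD were roughly 1.9 times the sample SD
# (the ratio of the SAT figures above), an observed r of .20 would correspond
# to an unrestricted r of about .36.
print(round(correct_range_restriction(0.20, 209 / 110), 2))
```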

Another goal of the study was to compare the criterion-related validity of FC and SS personality tests used to predict GPA. Counter to expectations, none of the Personality × Response Instruction interactions were significant. However, the pattern of results is consistent with the hypothesis that FC personality scores in the applicant condition are more predictive of GPA than SS personality scores in the applicant condition. Though the absence of statistically significant interactions limits the level of confidence in generalizing results, it is useful to discuss the differences in the correlations between FC personality scores and GPA in the honest and applicant conditions. Table 6 shows that the validity estimates for the FC personality scores were larger in the applicant condition than in the honest condition (71%, 57%, and 47% larger for the Conscientiousness, Openness to Experience, and total personality scales, respectively). However, the utility of using a FC (rather than SS) personality test to predict GPA disappeared when cognitive ability was included in the analysis. Recall that cognitive ability was expected to account for most of the variance in FC personality scores, leading to higher incremental validity in the honest condition. The finding that the FC personality scores in the applicant condition were predictive of GPA after controlling for cognitive ability suggests that the scores reflected individual differences in personality as well as cognitive ability.

The theoretical implications of this study concern the construct validity of personality tests. The results show that FC personality items tap a different set of constructs than the SS personality items. Thus, it seems wise to follow Campbell and Fiske’s (1959) position that evaluating the construct validity of a psychological measure requires a consideration of both the trait and the method. Another concern is the fact that the significant FC personality–cognitive ability correlations in the applicant condition run counter to the assumption of independence underlying Cronbach’s (1949) categorization of personality tests as measures of typical performance and cognitive ability tests as measures of maximal performance. Cronbach makes no attempt to distinguish between personality tests that use different item types. The results of this study suggest that it may be helpful to create taxonomies that differentiate among tests within the broader typical performance and maximal performance categories.


This study also has important implications for practitioners thinking about using FC personality tests. An obvious implication of the finding that FC personality scores correlate with cognitive ability test scores in an applicant setting concerns test fairness and adverse impact. Unfortunately, the homogeneity of the sample prevents us from evaluating this possibility. Another implication has to do with the incremental validity that FC personality scores add over pure cognitive ability measures. It is currently assumed that the variance in performance accounted for by personality tests is independent of the variance accounted for by cognitive ability tests (Schmidt & Hunter, 1998). The results of this study showed that this assumption may not always be correct.

Several areas of future research should be investigated to gain a better understanding of the effect of item format on the validity of personality scales. For example, researchers should investigate if the results of this study hold for different types of intelligence. The WPT is a general measure of intelligence. This prevented us from looking into the role of specific mental abilities in the response process. Future research should also explore if responses to FC personality items are due to individual differences in personal characteristics. It is possible that highly neurotic applicants are more likely to become overwhelmed when completing a FC personality test, resulting in less effective performance than applicants lower in neuroticism. Another possibility is that high self-monitoring applicants are more effective at responding to FC personality items because they are more aware of the relevant traits than are low self-monitors. Future research could also further investigate the cognitive load associated with the use of a FC personality test. In a recent study, Vasilopoulos, Cucina, and McElreath (2005) reported longer response latencies for SS personality items when test takers were given a warning of response verification by others. This was interpreted as evidence that the warning of verification made it more difficult to present oneself in a favorable light. It would be interesting to see if response latencies for FC personality items are longer under an applicant rather than honest condition.

During the review of this article, Christiansen et al. (2005, study 3) replicated our finding that FC personality scales correlate with cognitive ability under applicant conditions. Their study also explored the possibility that the quality of applicants’ implicit job theories—a cognitive representation very similar to our conception of an adopted applicant schema—mediates the relationship between cognitive ability and FC personality scores in applicant settings. The results showed that implicit job theory partially mediated the relationship between cognitive ability and FC personality scores in the applicant condition. Christiansen et al. suggested that the finding of partial, rather than full, mediation occurred because of limitations with the measure of implicit job theory used in the study. It is also possible that there are proximal mechanisms involved in faking that are influenced by cognitive ability but not captured in the current conception of an implicit job theory or adopted schema.


For example, cognitive ability may influence the faking of FC item responses through its effect on the test taker’s ability to identify the dimensions assessed on the FC personality test.

The main limitation of the study is inherent in all laboratory studies. Real college applicants may be more (or less) motivated to fake on college entrance applications than the participants in this study were. Future research should investigate the relationship between cognitive ability and FC personality scores in an applied setting. The generalizability of the results is also restricted in that all applicants were somewhat familiar with the job (i.e., student) because they were already in the role. It is possible that the results would change if a different type of position, such as customer service representative, was used in the applicant condition.

Another limitation of the study is the relatively low scale reliabilities for the FC scales. As suggested by an anonymous reviewer, we assessed the difference in the coefficient alphas for SS and FC scales in both the honest and applicant conditions using procedures outlined by Feldt (1969) and modified by Feldt and Ankenmann (1999) and found several significant differences. The lower reliabilities demonstrate the difficulty in developing FC scales. The reliabilities reported in this study may not be ideal for scales used to select employees. However, it is important to recognize that statistically and practically significant results were obtained despite the low reliabilities.
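A hedged sketch of the independent-samples alpha comparison attributed to Feldt (1969) is given below. It assumes the usual form of the test, in which W = (1 − alpha1) / (1 − alpha2) is referred to an F distribution with (n1 − 1, n2 − 1) degrees of freedom; the degrees of freedom and the two-sided p-value construction are our reading of that procedure, not details reported in the article, and the example alphas are simply the honest-condition Conscientiousness values from Table 1.

```python
from scipy import stats

def feldt_test(alpha1, n1, alpha2, n2):
    """Two-sided test of H0: alpha1 == alpha2 for two independent samples.

    Uses W = (1 - alpha1) / (1 - alpha2), referred to an F(n1 - 1, n2 - 1)
    distribution (our reading of Feldt, 1969).
    """
    w = (1 - alpha1) / (1 - alpha2)
    p_upper = stats.f.sf(w, n1 - 1, n2 - 1)
    p_lower = stats.f.cdf(w, n1 - 1, n2 - 1)
    return w, 2 * min(p_upper, p_lower)

# Example: SS vs. FC Conscientiousness alphas in the honest condition (Table 1).
print(feldt_test(0.79, 81, 0.51, 84))
```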

REFERENCES

Aguinis, H., Beaty, J. C., Boik, R. J., & Pierce, C. A. (2005). Effect size and power in assessing moderating effects of categorical variables using multiple regression: A 30-year review. Journal of Applied Psychology, 90, 94–107.

Aguinis, H., Boik, R. J., & Pierce, C. A. (2001). A generalized solution for approximating the power to detect effects of categorical moderator variables using multiple regression. Organizational Research Methods, 4, 291–323.

Baron, H. (1996). Strengths and limitations of ipsative measurement. Journal of Occupational and Organizational Psychology, 69, 49–56.

Barrick, M. R., & Mount, M. K. (1991). The Big Five personality dimensions and job performance. Personnel Psychology, 44, 1–26.

Barrick, M. R., & Mount, M. K. (1996). Effects of impression management and self-deception on the predictive validity of personality constructs. Journal of Applied Psychology, 81, 261–272.

Barrick, M. R., Mount, M. K., & Judge, T. A. (2001). Personality and performance at the beginning of the new millennium: What do we know and where do we go next? International Journal of Selection and Assessment, 9, 9–29.

Borislow, B. (1958). The Edwards Personal Preference Schedule and fakability. Journal of Applied Psychology, 42, 22–27.

Camara, W. J., & Echternacht, G. (2000). The SAT I and high school grades: Utility in predicting success in college (The College Board Research Note No. RN-10). New York: The College Board. (ERIC Document Reproduction Service No. ED 446592)

Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56, 81–105.


Christiansen, N. D. (2001, April). Utilizing forced-choice item formats to enhance criterion-related validity. Paper presented at the 15th annual conference of the Society for Industrial and Organizational Psychology, New Orleans, LA.

Christiansen, N. D., Burns, G. N., & Montgomery, G. E. (2005). Reconsidering forced-choice item formats for applicant personality assessment. Human Performance, 18, 267–307.

Christiansen, N. D., Edelstein, S., & Fleming, B. (1998, April). Reconsidering forced-choice formats for applicant personality assessment. Paper presented at the 13th annual conference of the Society for Industrial and Organizational Psychology, Dallas, TX.

Christiansen, N. D., Goffin, R. D., Johnston, N. G., & Rothstein, M. G. (1994). Correcting the 16PF for faking: Effects on criterion-related validity and individual hiring decisions. Personnel Psychology, 47, 847–860.

Colquitt, J. A., & Simmering, M. J. (1998). Conscientiousness, goal orientation, and motivation to learn during the learning process: A longitudinal study. Journal of Applied Psychology, 83, 654–665.

Conger, A. J., & Jackson, D. N. (1972). Suppressor variables, prediction, and the interpretation of psychological relationships. Educational and Psychological Measurement, 32, 579–599.

Coombs, C. H. (1964). A theory of data. New York: Wiley.

Corah, M. L., Feldman, M. J., Cohen, I. S., Meadow, A., & Ringwall, E. A. (1958). Social desirability as a variable in the Edwards Personal Preference Schedule. Journal of Consulting and Clinical Psychology, 22, 70–72.

Costa, P. T., Jr., & McCrae, R. R. (1992). Revised NEO Personality Inventory (NEO-PI-R) and NEO Five-Factor Inventory (NEO-FFI) professional manual. Odessa, FL: Psychological Assessment Resources.

Cronbach, L. J. (1949). Essentials of psychological testing. New York: Harper & Row.

Cronbach, L. J. (1955). Processes affecting scores on “understanding of others” and “assumed similarity.” Psychological Bulletin, 52, 177–193.

Dollinger, S. J., & Orf, L. A. (1991). Personality and performance in “Personality”: Conscientiousness and Openness. Journal of Research in Personality, 25, 276–284.

Dunnette, M. D., McCartney, J., Carlson, H. C., & Kirchner, W. K. (1962). A study of faking behavior on a forced-choice self-description checklist. Personnel Psychology, 15, 13–24.

Farsides, T., & Woodfield, R. (2003). Individual differences and undergraduate academic success: The roles of personality, intelligence, and application. Personality & Individual Differences, 34, 1225–1243.

Feldman, M. J., & Corah, M. L. (1960). Social desirability and the forced-choice method. Journal of Consulting Psychology, 24, 480–482.

Feldt, L. S. (1969). A test of the hypothesis that Cronbach’s Alpha or Kuder-Richardson Coefficient Twenty is the same for two tests. Psychometrika, 34(3), 363–373.

Feldt, L. S., & Ankenmann, R. D. (1999). Determining sample size for a test of the equality of Alpha coefficients when the number of part-tests is small. Psychological Methods, 4(4), 366–377.

Frei, R. L., & McDaniel, M. A. (1998). Validity of customer service measures in personnel selection: A review of criterion and construct evidence. Human Performance, 11, 1–27.

Furnham, A. (1990). Faking personality questionnaires: Fabricating different profiles for different purposes. Current Psychology: Research and Reviews, 9, 46–55.

Goffin, R. D., & Christiansen, N. D. (2003). Correcting personality tests for faking: A review of popular personality tests and an initial survey of researchers. International Journal of Selection and Assessment, 11, 340–344.

Goldberg, L. R. (1999). A broad-bandwidth, public-domain, personality inventory measuring the lower-level facets of several five-factor models. In I. Mervielde, I. J. Deary, F. De Fruyt, & F. Ostendorf (Eds.), Personality psychology in Europe (Vol. 7, pp. 7–28). Tilburg, The Netherlands: Tilburg University Press.

Guilford, J. P. (1954). Psychometric methods (2nd ed.). New York: McGraw-Hill.

Hauenstein, N. M., & Alexander, R. A. (1991). Rating ability in performance judgments: The joint influence of implicit theories and intelligence. Organizational Behavior and Human Decision Processes, 50, 300–323.

Hicks, L. E. (1970). Some properties of ipsative, normative, and forced-choice normative measures. Psychological Bulletin, 74, 167–184.

Holden, R. R., & Jackson, D. N. (1981). Subtlety, information, and faking effects in personality assessment. Journal of Clinical Psychology, 37(2), 379–386.

Holden, R. R., Kroner, D. G., Fekken, G. C., & Popham, S. M. (1992). A model of personality test item response dissimulation. Journal of Personality and Social Psychology, 63, 272–279.

Hough, L. M., Eaton, N. K., Dunnette, M. D., Kamp, J. D., & McCloy, R. A. (1990). Criterion-related validities of personality constructs and the effect of response distortion on those validities. Journal of Applied Psychology, 75, 581–595.

Hurtz, G. M., & Donovan, J. J. (2000). Personality and job performance: The Big Five revisited. Journal of Applied Psychology, 85, 869–879.

Jackson, D. N., Wroblewski, V. R., & Ashton, M. C. (2000). The impact of faking on employment tests: Does forced-choice offer a solution? Human Performance, 13(4), 371–388.

Johnson, C. E., Wood, R., & Blinkhorn, S. F. (1988). Spuriouser and spuriouser: The use of ipsative personality tests. Journal of Occupational Psychology, 61, 153–162.

Kroger, R. O., & Turnbull, W. (1975). Invalidity of validity scales: The case of the MMPI. Journal of Consulting and Clinical Psychology, 43, 48–55.

Kunda, Z. (1999). Social cognition: Making sense of people. Cambridge, MA: MIT Press.

Mahar, D., Cologon, J., & Duck, J. (1995). Response strategies when faking personality questionnaires in a vocational setting. Personality and Individual Differences, 18, 605–609.

Markus, H. (1977). Self-schemata and processing information about the self. Journal of Personality and Social Psychology, 35, 63–78.

Markus, H., & Kunda, Z. (1986). Stability and malleability of the self-concept. Journal of Personality and Social Psychology, 51, 858–866.

McDaniel, M. A., Douglas, E. F., & Snell, A. F. (1997, April). A survey of deception among job seekers. Paper presented at the 12th annual conference of the Society for Industrial and Organizational Psychology, St. Louis, MO.

Mount, M. K., & Barrick, M. R. (1998). Five reasons why the “Big Five” article has been frequently cited. Personnel Psychology, 51, 849–858.

Mount, M. K., Witt, L. A., & Barrick, M. R. (2000). Incremental validity of empirically keyed biodata scales over GMA and the five factor personality constructs. Personnel Psychology, 53, 299–323.

Musgrave-Marquart, D., Bromley, S. P., & Dalley, M. B. (1997). Personality, academic attribution, and substance use as predictors of academic achievement in college students. Journal of Social Behavior & Personality, 12, 501–511.

Nathan, B. R., & Alexander, R. A. (1985). The role of inferential accuracy in performance ratings. Academy of Management Review, 10, 109–115.

Norman, W. T. (1963). Personality measurement, faking, and detection: An assessment method for use in personnel selection. Journal of Applied Psychology, 47, 225–241.

Ones, D. S., & Viswesvaran, C. (1998). The effects of social desirability and faking on personality and integrity assessment for personnel selection. Human Performance, 11, 245–269.

Ones, D. S., Viswesvaran, C., & Reiss, A. D. (1996). Role of social desirability in personality testing for personnel selection: The red herring. Journal of Applied Psychology, 81, 660–679.

Rosse, J. G., Stecher, M. D., Miller, J. L., & Levin, R. A. (1998). The impact of response distortion on preemployment personality testing and hiring decisions. Journal of Applied Psychology, 83(4), 634–644.

Schmidt, F. L., & Hunter, J. E. (1998). The validity and utility of selection methods in personnel psychology: Practical and theoretical implications of 85 years of research findings. Psychological Bulletin, 124, 262–274.

Schmidt, F. L., & Hunter, J. (2004). General mental ability in the world of work: Career attainment and job performance. Journal of Personality and Social Psychology, 86, 162–173.

Schmit, M. J., & Ryan, A. M. (1993). The Big Five in personnel selection: Factor structure in applicant and nonapplicant populations. Journal of Applied Psychology, 78, 966–974.

Smith, D. B., Hanges, P. J., & Dickson, M. W. (2001). Personnel selection and the five-factor model: Reexamining the effects of applicant’s frame of reference. Journal of Applied Psychology, 86, 304–315.

Smither, J. W., & Reilly, R. R. (1987). True intercorrelation among job components, time delay in rating, and rater intelligence as determinants of accuracy in performance ratings. Organizational Behavior and Human Decision Processes, 40, 369–391.

Snell, A. F., & McDaniel, M. A. (1998, April). Faking: Getting data to answer the right questions. Paper presented at the 13th annual conference of the Society for Industrial and Organizational Psychology, Dallas, TX.

Stark, S., Chernyshenko, O. S., Chan, K. Y., Lee, W. C., & Drasgow, F. (2001). Effects of the testing situation on item responding: Cause for concern. Journal of Applied Psychology, 86, 943–953.

Tenopyr, M. L. (1988). Artifactual reliability of forced-choice scales. Journal of Applied Psychology, 73, 749–751.

Travers, R. M. (1951). A critical review of the forced-choice technique. Psychological Bulletin, 48, 62–70.

Tross, S. A., Harper, J. P., Osher, L. W., & Kneidinger, L. M. (2000). Not just the usual cast of characteristics: Using personality to predict college performance and retention. Journal of College Student Development, 41, 323–334.

Vasilopoulos, N. L., Cucina, J. M., & McElreath, J. M. (2005). Do warnings of response verification moderate the relationship between personality and cognitive ability? Journal of Applied Psychology, 90, 306–322.

Vasilopoulos, N. L., Reilly, R. R., & Leaman, J. A. (2000). The influence of job familiarity and impression management on self-report measure response latencies and scale scores. Journal of Applied Psychology, 85, 50–64.

Velicer, W. F., & Weiner, B. J. (1975). Effects of sophistication and faking sets on the Eysenck Personality Inventory. Psychological Reports, 37(1), 71–73.

Waters, L. K. (1965). A note on the “fakability” of forced-choice scales. Personnel Psychology, 16, 187–191.

Wolfe, R. N., & Johnson, S. D. (1995). Personality as a predictor of college performance. Educational & Psychological Measurement, 55, 177–185.

Wonderlic, Inc. (1999). Wonderlic personnel test and scholastic level exam. Libertyville, IL: Wonderlic Personnel Test, Inc.

Zavala, A. (1965). Development of the forced-choice rating scale technique. Psychological Bulletin, 63, 117–124.

APPENDIX

Conscientiousness

(a) Fear for the worst (Neuroticism, undesirable)
(b) Take advantage of others (Agreeableness, undesirable)
(c) Set high standards for myself and others (Conscientiousness, desirable)
(d) Take charge (Extraversion, desirable)

Openness to Experience

(a) Keep others at a distance (Extraversion, undesirable)
(b) Feel sympathy for those who are worse off than myself (Agreeableness, desirable)
(c) Prefer to stick with things that I know (Openness to Experience, undesirable)
(d) Seldom get mad (Neuroticism, desirable)
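For readers who want the tetrad structure spelled out, the hypothetical sketch below simply encodes the two example items above, tagging each statement with its Big Five dimension and desirability level and checking that each tetrad pairs two desirable with two undesirable statements drawn from four different dimensions. The data structure and names are illustrative only and are not part of the study's materials.

```python
# Illustrative encoding of the two appendix tetrads (not the study's code).
from dataclasses import dataclass


@dataclass
class Statement:
    text: str
    dimension: str
    desirable: bool


conscientiousness_tetrad = [
    Statement("Fear for the worst", "Neuroticism", False),
    Statement("Take advantage of others", "Agreeableness", False),
    Statement("Set high standards for myself and others", "Conscientiousness", True),
    Statement("Take charge", "Extraversion", True),
]

openness_tetrad = [
    Statement("Keep others at a distance", "Extraversion", False),
    Statement("Feel sympathy for those who are worse off than myself", "Agreeableness", True),
    Statement("Prefer to stick with things that I know", "Openness to Experience", False),
    Statement("Seldom get mad", "Neuroticism", True),
]

# Each tetrad balances social desirability (two desirable, two undesirable
# statements) and samples four different Big Five dimensions.
for tetrad in (conscientiousness_tetrad, openness_tetrad):
    assert sum(s.desirable for s in tetrad) == 2
    assert len({s.dimension for s in tetrad}) == 4
```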
