

Journal of Personality and Social Psychology
1979, Vol. 37, No. 11, 2025-2043

Idiosyncratic Weighting of Trait Information in Impression Formation

Thomas M. Ostrom
Ohio State University

Deborah Davis
University of Nevada, Reno

An analysis is offered of the role of unequal weighting in the averaging model of information integration. A distinction is made between unequal weighting at the normative level (which has been referred to as "differential weighting") and unequal weighting at the level of the individual subject (which we refer to as "idiosyncratic weighting"). Two studies are reported that examine the prevalence of idiosyncratic weighting in the trait-judgment impression formation task. Whereas most past research on the question of unequal weighting in this task involved averaging responses across both subjects and stimulus replications, the present studies were analyzed at the level of an individual subject's repeated responses to separate stimulus replications. Clear evidence of idiosyncratic weighting was obtained from about 50% of the 120 subjects; only 20% of the subjects indicated absolutely no tendency toward unequal weighting. There was no evidence that idiosyncratic weighting was restricted to just a subset of stimuli, since all of the 20 stimulus replications showed idiosyncratic weighting effects. In contrast to previous findings, negative traits did not always receive more weight than positive traits. In more than 20% of the instances of unequal weighting, the more positive trait was accorded a higher weight.

Information integration theory (Anderson, 1974) offers an approach for understanding how people combine stimulus information when making judgments and decisions. The theory seeks to determine the nature of the integration rule (e.g., adding, averaging, multiplying) employed by people in various response domains. In addition, it provides a way to determine whether all stimulus items in the domain contribute equally to the overall judgment or whether they carry different weights.

Integration theory provides no a priori basis for predicting which integration rule or weighting assumption is correct for any particular response domain. It does, however, provide an array of conceptual alternatives along with a diagnostic methodology (functional measurement). With a comprehensive series of studies, it is possible to determine which conceptual alternatives best describe a response domain. Hence, one of the great strengths of integration theory is its ability to uncover different integration rules for different response domains.

This research was supported by National Science Foundation Grant GS-38604. The authors are grateful to Sarah Boysen for her assistance with collection of the data.

Requests for reprints should be sent to Thomas M. Ostrom, 404C W. 17th Avenue, Columbus, Ohio 43210.

In order to apply integration theory to a response domain, it is necessary to specify two features of that domain: the population of information or stimulus items and the nature of the subjective judgment continuum. It is necessary to specify these two features because a shift in either may affect the integration rule or parameter values. For example, one integration rule may apply when traits are combined with traits (e.g., averaging), and a different rule might apply when traits are combined with adverbs (e.g., multiplication). Parameter values may also vary as the nature of the judgment continuum shifts. Whereas friendly may have a higher scale value than energetic on the favorability continuum, it is likely that energetic would have the higher scale value on an activity continuum.

Copyright 1979 by the American Psychological Association, Inc. 0022-3514/79/3711-2025$00.75

Integration theory has been used extensively to analyze the adjective combination task first introduced by Asch (1946) to study impression formation processes. The stimuli used in the impression formation task are personality trait adjectives, and the response continuum refers to overall impression favorability. At the operational level, most investigators use the 555 traits scaled by Anderson (1968) as their domain of stimuli. The judgment scale usually has seven or more intervals and is anchored with the bipolar terms of favorable-unfavorable or like-dislike.

It is widely assumed that for the impression formation response domain, people employ an averaging integration rule and assign approximately equal weights to all stimuli in the stimulus domain. Anderson has offered this view of an equal weighting model in a number of sources (Anderson, 1974, p. 261; Anderson, 1976, pp. 681-682; Anderson & Lopes, p. 69; Leon, Oden, & Anderson, 1973, pp. 301-302; Oden & Anderson, 1971, p. 159). The main objective of the present article is to examine the assumption that all personality trait adjectives, as items of person information, carry approximately equal weight in the impression formation task.

This assumption of equal weighting has been questioned by some investigators (e.g., Birnbaum, 1974). In the following review of research bearing on the equal weighting question, we note that existing data are not conclusive regarding the source or strength of differences in trait weights. One of the problems with this research is that most of the studies are based on group averages. This paper explores the possibility that such group data may "average out" idiosyncratic differences in trait weights that exist separately for each individual.

Differential Weighting

Two kinds of deviations from equal weighting have been examined in previous research; one is related to the lability of a trait's weight, and the other concerns the variation in weight between different traits.

Weight lability. The lability of a trait's weight reflects the extent to which the weight is affected by situational or contextual variables. There can be little question as to whether "differential" weighting occurs in the impression formation domain in this sense of the term.

Characteristics of the judgment task are known to affect the functional weight given any particular item. For example, communicator credibility (Rosenbaum & Levin, 1968) and serial position (Anderson, 1965) can act as situational determinants of trait weight.

Weight can also be influenced by the semantic relationship that a trait holds with other items in the set. The weight parameter is affected by the level of redundancy (Schmidt, 1969) and inconsistency (Anderson & Jacobson, 1965) among traits describing the same person. Also, there are traits that possess two distinct semantic meanings (polysemous homographs), traits such as discriminating and sensitive, that may change their scale value as well as their weight, depending on other stimuli in the information set. Although each of the above features can lead to differential weighting, they all derive from the relationship between the semantic definitions of the different traits and can in principle be specified a priori. They do not represent instances in which particular traits carry different weights when in isolation from one another.

Weight differences between traits. The weights of most, if not all, traits are clearly labile. However, this does not mean that traits differ in their natural or context-free weights. When variables known to influence lability are held constant, it is possible that trait weights are either approximately equal or substantially different from one another. It is this sense of the term differential weighting that is being rejected when the integration process in impression formation is said to conform to an equal-weight averaging model.

Several possible determinants of trait weight have been examined. Both trait novelty (Wyer, 1970) and trait ambiguity (Kaplan, 1971, 1975; McKillip, Barrett, & Dimiceli, 1978; Schumer, 1973; Wyer, 1974b; Wyer & Watson, 1969) have received some attention. In neither case, however, do the findings unequivocally support the conclusion of differential weighting in the integration process.

Other research has been directed toward sources of differential weighting that are associated with trait scale value. It is possible, for example, that traits with extreme scale values carry more weight than do neutral traits (e.g., Warr, 1974; Warr & Jackson, 1975). These findings are difficult to interpret, however, because of confounding between negativity, ambiguity, and extremity (Warr & Jackson, 1976; Wyer, 1973).

In contrast to the above potential sources of differential weighting, the existence of a negativity effect has been unequivocally established. Traits with a negative valence tend to carry more weight than those with positive valence (e.g., Birnbaum, 1974; Hodges, 1974; Oden & Anderson, 1971). This negativity effect is thought to exist for a number of response domains (Kanouse & Hanson, 1972).

Although the negativity findings represent a genuine limitation to the equal weighting assumption, they may not be especially serious. Anderson (1974, p. 261) has speculated, for example, that the magnitude of this negativity effect is not very large and that it may be restricted to only extremely negative stimuli. Another difficulty with this research is that it only employed a limited sample of positive and negative traits. No attempt was made to see how representative the sample was of the entire trait population. The possibility exists that most traits have equal weights, but that a few negative traits carry a higher weight and a few positive traits a lower weight. One objective of the present research was to examine these questions about the negativity effect.

Idiosyncratic Weighting

Most work on the issue of differential weighting has been at the level of normative weights. That is, the research has examined how much weight is given traits by the typical or average subject. Little consideration has been given to the problem of whether or not equal weighting applies to each person individually. It is possible that because of a person's idiosyncratic experiences with different trait adjectives, he or she may assign widely differing weights to the adjectives. However, one person's pattern of weights may be unrelated to another's. One person may weight friendly twice as much as intelligent, whereas another might give friendly only half the weight of intelligent. If we were to obtain normative weights by averaging over these two persons (as is the case in most differential weighting research), we would find that the two traits received the same weight. Differential weighting of traits at the level of a single subject (what we term idiosyncratic weighting) may emerge as equal weighting when measured at the normative level.
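The cancellation in the two-person example can be illustrated with a minimal sketch. The numeric weights below are hypothetical and merely encode the 2:1 and 1:2 ratios of the example; the function name `normative_weights` is likewise an assumption introduced for illustration.

```python
# Hypothetical weights for two judges (only the 2:1 and 1:2 ratios come from the text).
weights = {
    "person_1": {"friendly": 2.0, "intelligent": 1.0},
    "person_2": {"friendly": 1.0, "intelligent": 2.0},
}


def normative_weights(weights):
    """Average each trait's weight over all persons."""
    traits = next(iter(weights.values())).keys()
    n_persons = len(weights)
    return {t: sum(w[t] for w in weights.values()) / n_persons for t in traits}


norm = normative_weights(weights)
print(norm)  # both traits receive the same averaged weight
```

Each person weights the two traits unequally, yet the averaged weights coincide, which is the sense in which idiosyncratic weighting can be invisible at the normative level.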

Only one published study has examined the possibility of idiosyncratic differences in trait weights. In this study (Anderson, 1962), the stimulus sets consisted of three traits, one from each of the three factors in the design. Each factor consisted of a negative, neutral, and positive trait to form its three levels. This 3 × 3 × 3 design resulted in a total of 27 stimulus sets. Subjects judged all 27 sets once a day for 5 days. The resulting data were analyzed to determine whether significant interaction variance emerged. This design was repeated for 12 subjects, with every 2 subjects receiving a different stimulus replication (i.e., six stimulus replications of the three-factor design, with 2 subjects per replication). The test for differential weighting (or nonadditivity) was performed on the pooled (i.e., 20 degrees of freedom) interaction. Of the 12 subjects, 3 were found to produce significant nonadditivity.

Even though the total number of subjects was small, this study suggests that about 25% of the people show differential weighting. That percentage figure would, no doubt, have been higher had each of the two-way interactions and the three-way interaction been separately tested for each subject. It would be possible for a subject to have a significant two-way interaction but not to have the pooled interaction reach significance.

On the other hand, it is possible that the 25% figure is actually an overestimate of the number of people displaying differential weighting. Two of the three subjects with significant interaction variance received the same stimulus replication. It is conceivable that this particular stimulus replication contained inconsistent traits, redundant traits, or polysemous homographs. Anderson (1962) did not report attempting to avoid stimulus combinations of this sort.

There is a second reason that the 25% figure might be an overestimate. There is no way to verify that the response scale that these three subjects used to make their judgments was linearly related to the underlying subjective continuum. It is possible that these three subjects assigned equal weight to all stimuli but that in reporting their subjective impressions they did not use an equal-interval response scale. For example, a statistical interaction of the form implying that negative traits carry more weight than positive traits could be produced by a nonlinear response scale if the categories at the negative end of the scale were wider (i.e., covered a greater range of the subjective continuum) than the categories at the positive end.
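As an illustration of this argument, the following sketch assumes an equal-weight averaging judge whose subjective impressions are reported through a hypothetical convex response function (squaring), which compresses the negative end of the scale. The scale values 2.5 and 5.5 and the squaring function are illustrative assumptions, not quantities estimated from any subject.

```python
def subjective(s_a, s_b):
    """Equal-weight average of two trait scale values."""
    return (s_a + s_b) / 2


def overt(x):
    """Hypothetical nonlinear response function that compresses the low end."""
    return x ** 2


def contrast(cells):
    """2 x 2 interaction contrast over the four cells."""
    return cells["++"] - cells["+-"] - cells["-+"] + cells["--"]


LO, HI = 2.5, 5.5  # assumed scale values for a negative and a positive trait
subj = {"++": subjective(HI, HI), "+-": subjective(HI, LO),
        "-+": subjective(LO, HI), "--": subjective(LO, LO)}
reported = {cell: overt(value) for cell, value in subj.items()}

print(contrast(subj))      # 0.0: no interaction on the subjective continuum
print(contrast(reported))  # 4.5: an interaction appears in the overt ratings
```

In the reported values the second trait has a smaller effect when paired with the negative trait than with the positive one, the same form of interaction that would otherwise be read as greater weight for negative traits.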

The research in the present paper was undertaken to provide a more thorough test of the prevalence of idiosyncratic weighting in the trait judgment domain. Two studies are reported, both of which have the same methodological features. The studies were designed to maximize the likelihood of detecting idiosyncratic weighting. The design used by Anderson (1962) was improved upon in six ways.

First, the stimulus sets used by Anderson contained three traits. According to the averaging model, however, the observable effects of weight differences due to a single trait decrease as the number of other traits in the set increases. To maximize sensitivity to differential weighting, the present studies used sets composed of only two traits.
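The set-size argument can be sketched numerically. The example below assumes a simple weighted-average rule with no initial-impression term, hypothetical scale values of 2.5 and 5.5, a doubled weight for one negative trait, and neutral filler traits (scale value 4.0, weight 1.0) added to every set; all of these values and the function names are assumptions made for illustration.

```python
def averaged_response(traits):
    """Weighted average of (scale_value, weight) pairs."""
    total_weight = sum(w for _, w in traits)
    return sum(s * w for s, w in traits) / total_weight


def interaction(neg_weight, n_extra):
    """2 x 2 interaction contrast when Factor A's negative trait has weight neg_weight.

    n_extra neutral filler traits are appended to every stimulus set.
    """
    a_levels = {"M-": (2.5, neg_weight), "M+": (5.5, 1.0)}
    b_levels = {"M-": (2.5, 1.0), "M+": (5.5, 1.0)}
    extras = [(4.0, 1.0)] * n_extra
    r = {(a, b): averaged_response([a_levels[a], b_levels[b]] + extras)
         for a in a_levels for b in b_levels}
    return r[("M+", "M+")] - r[("M+", "M-")] - r[("M-", "M+")] + r[("M-", "M-")]


for n_extra in (0, 1, 2):
    # set size, interaction contrast
    print(n_extra + 2, round(interaction(2.0, n_extra), 3))
```

Under these assumptions the interaction contrast shrinks from 0.5 with two-trait sets to 0.25 with three-trait sets and 0.15 with four-trait sets, even though the underlying weight difference is unchanged.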

Second, there is a need to increase the number of stimulus replications (Anderson, 1962, used only six). The use of an interaction to test the presence of differential weighting is contingent upon having the two traits with a difference in weight present in the same factor of the design. That is, the two must both be on Factor A or on Factor B. If two highly weighted stimuli are on Factor A and two low weighted stimuli are on Factor B, the averaging model predicts no interaction, and so the effect of the weight differences would go undetected. Increasing the number of stimulus replications increases the likelihood that at least some traits with differential weights will be assigned to the same factor in the design. The present research employed 10 different stimulus replications in the first study and 10 new replications in the second.
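A short sketch may make the point about factor placement concrete. It assumes the two-trait weighted-average rule without an initial-impression term; the scale values (2.5, 5.5) and weights (1.0, 2.0) are hypothetical, as are the function names.

```python
def averaging_model(s_a, w_a, s_b, w_b):
    """Weighted average of two traits' scale values."""
    return (w_a * s_a + w_b * s_b) / (w_a + w_b)


def interaction_contrast(a_weights, b_weights, lo=2.5, hi=5.5):
    """Interaction contrast for a 2 x 2 design.

    a_weights and b_weights are (weight of M- trait, weight of M+ trait).
    """
    scale = (lo, hi)

    def cell(i, j):
        return averaging_model(scale[i], a_weights[i], scale[j], b_weights[j])

    return cell(1, 1) - cell(1, 0) - cell(0, 1) + cell(0, 0)


# Weights differ only between factors: both A traits heavy, both B traits light.
between = interaction_contrast((2.0, 2.0), (1.0, 1.0))
# Weights differ within Factor A: its negative trait is heavier than its positive trait.
within = interaction_contrast((2.0, 1.0), (1.0, 1.0))
print(round(between, 6), round(within, 6))
```

When the weights differ only between factors the denominator is the same in every cell, so the model is additive and the contrast is zero; a nonzero contrast appears only when the unequal weights sit on the same factor.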

Third, most research applying integration theory to the trait-judgment domain has used equal weighting instructions in which subjects are explicitly told that each trait is accurate and equally important and that the subject should pay full attention to each of them. To the extent that subjects actively bear those instructions in mind while making their judgments, it is possible that differential weighting is suppressed in this line of research. Some support for this notion is offered by Anderson and Jacobson (1965), who obtained slightly stronger evidence of inconsistency discounting under naturalistic instructions than under equal weighting instructions. The first study in this paper deleted any mention of equal weighting, and the second study manipulated weighting instructions as a factor.

Fourth, it is possible that the design-wide structure of the stimulus combinations could lead subjects to adopt an equal weighting set regardless of the explicit instructions. Anderson (1962) presented subjects with all 27 combinations in the three-factor design. That meant that each trait appeared nine times, once with every combination of all the traits from the other two factors. Each trait appeared in one third of all the test sets presented. It is possible that such extensive repetition induces a mechanistic judgment set in which the subject actively suppresses any differential weighting that he or she might naturally bring to the task. In the present studies, 40 test sets are presented, and no adjective appears in more than 2 of the 40.

Fifth, if differential weighting applies only to some stimuli and not to others, it is important to design the study so as to maximize the likelihood of identifying those stimulus-subject combinations that display differential weighting. Whereas the basic interaction used in Anderson (1962) was a 3 × 3 (or four degrees of freedom) interaction, the present study used a 2 × 2 (or one degree of freedom) interaction. In the Anderson design, it would be difficult to detect a significant interaction that was due to only one of the six stimuli entering into a 3 × 3 interaction. Although the effect of differential weighting in that design might affect only one of the four components of the interaction, statistically it would be spread out over the four degrees of freedom and, when tested on that basis, might not produce overall significance. In a 2 × 2 design, however, differential weighting contributed by a single trait would be wholly contained in the one-degree-of-freedom interaction term.

Lastly, the Anderson (1962) study employed only 12 subjects. It is difficult to generalize about an estimate of the percent of people displaying differential weighting from such a small sample. In the present paper, 40 subjects were used in the first study and 80 in the second, providing greater confidence in the generality of the findings.

Experiment 1

In the first experiment, each subject was given 40 two-trait stimulus sets to judge on a likeability scale once a day for five days. The stimulus sets contained 10 replications of a 2 × 2 design in which each factor of the design contained a moderately favorable and a moderately unfavorable trait. Each stimulus replication provided one opportunity for differential weighting to emerge, offering 10 opportunities for each subject. Such a design not only allowed an analysis of the prevalence of idiosyncratic weighting but permitted a determination of whether significant interactions emerge from many or only a few stimulus replications. Further, a qualitative analysis of the pattern of means for significant interactions can reveal the proportion of unequal weight due to negative traits carrying more weight than positive traits.

Method

Subjects

Subjects were 40 introductory psychology students, 19 males and 21 females, at Ohio State University, who participated in partial fulfillment of course requirements.

Stimuli

The stimulus trait adjectives were arbitrarily selected from Edward's (Note 1) Ohio State rescaling of Anderson's (1968) personality trait adjectives. Subjects judged 40 stimulus trait pairs, 4 pairs from each of 10 stimulus set replications. The 4 pairs from each replication represented the four cells of a 2 × 2 factorial design. Each factor of this design had two levels of trait likeableness, a moderately favorable (M+) trait and a moderately unfavorable (M−) trait. For each replication there is one M+M+ trait pair, two M+M− trait pairs, and one M−M− trait pair. Mean scale values for the M+ and M− traits were 5.5 and 2.5, rated on a scale of likeability ranging from 1 (dislike very much) to 7 (like very much). In selecting traits for each replication, care was taken to eliminate all instances of redundancy and inconsistency.

Procedure

When subjects arrived for the first day of the experiment they received the following written instructions:

This experiment is concerned with impression formation. What we are interested in is how people form impression of others on the basis of very limited information. . . .

In this particular experiment, you will be seeing pairs of traits that describe various persons. You will be asked to tell how much you would like each person. Imagine that each of the two traits was contributed by a different person who knows him (her) well.

Read each pair carefully, try to imagine the type of person being described, and rate how much you like the person, using the scale given below each pair of traits. Sometimes this may seem hard, but just act naturally and do the best you can.

All subjects judged the same 40 stimulus sets on each of 5 successive days. Stimulus sets were presented in a booklet, 1 set to a page. There were five random orders of stimuli. The five booklet orders were Latin square counterbalanced across the 5 days, so that each subject received each booklet order and all booklet orders appeared equally often over the 5 days. Ratings were made on a 21-point scale ranging from 10 (like very much) to −10 (dislike very much).

Results

Group Data

Most studies that have tested for equal weighting in the adjective judgment task have averaged over both stimulus replications and subjects. In analyzing the present group data, the question of equal weighting can be examined both on this overall basis and within each stimulus replication.

Figure 1. Mean likeability rating of adjective sets composed of M+ and M− (moderately favorable and moderately unfavorable) traits, averaged over days, stimulus replications, and subjects in Experiment 1. (Horizontal axis: normative scale value, M− and M+.)

If traits are equally weighted, there should be no interaction between the M+ and M− levels of the A factor and the M+ and M− levels of the B factor. The present group data bring considerable power to the test of that interaction; each cell mean is the average of 2,000 (10 × 5 × 40) observations. The interaction (see Figure 1) was significant, F(1, 39) = 10.35, p < .005, and its pattern is similar to that obtained by previous investigators. It can be explained by assuming that negative traits carry more weight than positive traits.

Whether the obtained overall interaction represents a serious challenge to the equal weighting assumption is contingent upon several factors, including its magnitude, the proportion of stimulus replications that contribute to it, the presence of similar versus differential interaction patterns across stimulus replications, and whether response scale nonlinearity could account for the significant effects.

The magnitude of the overall interaction was not very substantial. The interaction contributed only one percent to the total between-cells variance. With fewer observations going into each cell mean, this interaction could well have gone undetected.
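For readers who wish to see how such a percentage can be obtained, the following sketch applies the standard orthogonal-contrast decomposition of between-cells variation in a 2 × 2 design with equal cell sizes. The cell means are hypothetical, and whether the percentage reported here was computed in exactly this way is an assumption.

```python
def interaction_share(m):
    """Share of the between-cells sum of squares due to the A x B interaction.

    m maps (a, b), with a and b in {0, 1}, to a cell mean; equal cell sizes
    are assumed, so each sum of squares is proportional to its squared contrast.
    """
    a = (m[(1, 0)] + m[(1, 1)]) - (m[(0, 0)] + m[(0, 1)])
    b = (m[(0, 1)] + m[(1, 1)]) - (m[(0, 0)] + m[(1, 0)])
    ab = m[(1, 1)] - m[(1, 0)] - m[(0, 1)] + m[(0, 0)]
    return ab ** 2 / (a ** 2 + b ** 2 + ab ** 2)


# Hypothetical cell means on the -10 to +10 rating scale (0 = M-, 1 = M+).
means = {(0, 0): -4.0, (0, 1): -1.0, (1, 0): -0.5, (1, 1): 3.5}
share = interaction_share(means)
print(round(100 * share, 2))  # percent of between-cells variation
```

With these illustrative means the interaction accounts for a little under 1% of the between-cells variation, so a visible departure from parallelism can still be small relative to the two main effects.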

It is conceivable that the overall interaction was due to only a few of the 10 stimulus replications. In line with Anderson's (1974) suggestions, it could be the result of the one or two stimulus replications that contained the most extremely negative traits from the M− range. There is a second advantage to examining each of the stimulus replications separately. It is possible that the pattern displayed in Figure 1 does not hold for all significant stimulus replications. If the majority of the significant interaction replications showed the pattern in Figure 1 and were averaged with others that showed different patterns (e.g., convergence rather than divergence to the right of the graph), the observed overall interaction could still have emerged.

Separate analyses were performed on each of the 10 stimulus replications, and only 2 were found to contain significant interaction variance, Fs(1, 1521) > 6.11, ps < .05.¹ The percent of between-cells variance contributed by the interaction was 3.40 and 3.80 for the first and eighth replications, respectively. The solid line portions of Figure 2 display the interactions for those 2 stimulus replications. Both replications showed the same pattern, one that could be explained by assuming that negative traits carry more weight than positive traits do.

A significant interaction could be produced by weight differences between the two traits on the A factor, on the B factor, or on both. Each stimulus replication, then, provides two opportunities for differential weighting to occur. The fact that only two of the replications had significant interactions means that near equal weighting was observed in between 16 and 18 of the 20 opportunities. At this normative level of analysis, it appears that equal weighting was characteristic of 80% to 90% of the trait pairs. The overall negativity effect was produced by at most 20% of the stimulus pairs.

¹ The error term for these analyses was the interaction between subjects (40) and stimulus persons (40).

The data provide some support for the interpretation that differential weighting is restricted to highly negative traits. The lowest marginal mean was computed for each stimulus replication to provide an index of which replications contained the most negative trait. Among the 10 stimulus replications, the 2 replications for which the interaction was significant were the first and fifth lowest.

Response scale linearity. Significant interactions of the form portrayed in Figures 1 and 2 could be obtained even if all traits were subjectively given equal weights. This would occur if subjects used wider categories at the negative end of their response scale than at the positive end. In such a case, subjects' overt responses would not be linearly related to their subjective responses. Two qualitative tests were devised for the purpose of examining the response scale explanation.

Response scale nonlinearity can be dismissed if the qualitative pattern satisfies a test of disordinality. The interaction portrayed in Figure 1 is termed an ordinal interaction because both the lower and upper lines are ordinally related to the horizontal axis in the same direction (i.e., the sign of both slopes is the same). A disordinal interaction would be one in which the slope of one line is positive and the slope of the other is negative. If the means of a 2 × 2 design can be plotted so that disordinality emerges, there exists no continuous monotonic transformation of the response scale that will reduce that interaction to zero. Consequently, the presence of disordinality supports an unequal weighting interpretation. Although none of the stimulus replications in the analysis of group data (see Figure 2) displayed such a pattern, it is useful to point out the implications of such a pattern for response scale linearity prior to discussing the individual data.
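The disordinality check can be expressed as a small sketch. Because the text allows the means to be plotted either way, the hypothetical function `is_disordinal` below examines both orientations of the 2 × 2 plot; the cell means in the usage lines are invented for illustration.

```python
def is_disordinal(m):
    """Return True if the 2 x 2 cell means show a disordinal interaction.

    m maps (a, b), with a and b in {0, 1}, to a cell mean. The pattern is
    disordinal if, in either plotting orientation, the two lines have slopes
    of opposite sign.
    """
    # Slopes of the two lines when Factor A is on the horizontal axis.
    slopes_over_a = (m[(1, 0)] - m[(0, 0)], m[(1, 1)] - m[(0, 1)])
    # Slopes of the two lines when Factor B is on the horizontal axis.
    slopes_over_b = (m[(0, 1)] - m[(0, 0)], m[(1, 1)] - m[(1, 0)])
    return any(s1 * s2 < 0 for s1, s2 in (slopes_over_a, slopes_over_b))


ordinal = {(0, 0): 2.5, (0, 1): 3.5, (1, 0): 4.0, (1, 1): 5.5}
crossed = {(0, 0): 2.0, (0, 1): 5.0, (1, 0): 4.0, (1, 1): 3.0}
print(is_disordinal(ordinal), is_disordinal(crossed))  # False True
```

A pattern flagged as disordinal cannot be removed by any monotonic rescaling of the responses, since such a rescaling preserves the sign of every slope.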

The intersection test involves comparing the data pattern of a significant interaction for one stimulus replication against a baseline provided by a nonsignificant (i.e., interaction F < 1) stimulus replication. If the judgments contributing to both interactions came from comparable portions of the response scale, no continuous monotonic quadratic transformation of the response scale (e.g., using wider categories at the negative than at the positive end) could simultaneously reduce both interactions to zero. The intersection test provides a way of ensuring that both the significant interaction and the nonsignificant control interaction involved responses from similar portions of the rating scale. The control stimulus replication must be selected so that when the two interactions are graphed together, the lower lines of each intersect one another and the upper lines of each intersect one another.

Figure 2. Use of the intersection test to eliminate scale nonlinearity as an explanation of the interactions obtained when averaging over days and subjects in Experiment 1. (M+ = moderately favorable. M− = moderately unfavorable. The significant interactions for Stimulus Replications 1, F(1, 156) = 6.35, and 8, F(1, 156) = 6.12, are in solid lines and the nonsignificant control interaction from Stimulus Replication 3, F(1, 156) = 1.00, is in dashed lines. Panels: Replication No. 1 and Replication No. 8.)

This intersection test was applied to both significant stimulus replications for the group data. It was found that Stimulus Replication 3 (F = 1.00), in which the interaction accounted for only 0.43% of the between-cells variance, provided such a control for both significant replications (see Figure 2). It can be concluded, then, that the two significant interactions portrayed in Figure 2 represent genuine instances of unequal weighting and should not be dismissed on the grounds of simple response scale nonlinearity.

Individual Data

The analyses of group data verified the presence of differential weighting in the trait-judgment paradigm. The deviation from equal weighting, however, appeared to be relatively minor. The overall contribution of differential weighting accounted for only 1% of the between-cells variance, was significant for only 20% of the stimulus replications, and in all cases displayed a pattern that could be interpreted as meaning that negative stimuli carried more weight than positive stimuli did. These restrictions on the equal weighting assumption would be even less consequential if it could be demonstrated that they applied to only a minority of the subjects.

With alpha set at .05 and each 2 × 2 stimulus replication interaction tested against the subject's own Days (5) × Stimulus Persons (40) interaction (on 156 degrees of freedom), a full 82.5% of the subjects showed evidence of nonadditivity for at least 1 of the 10 stimulus replications (see Table 1).

One difficulty with the use of 10 stimulus replications lay in the increasing role of chance as the number of replications increased. Assuming that 5 subjects out of 100 would produce a significant interaction for any given stimulus replication simply by chance, that percentage increases to 40.1% when each subject responds to 10 different stimulus replications. The role of chance returns to 5% when the interaction variance pooled across all replications (i.e., on 10 degrees of freedom) for each subject is tested. This approach is statistically comparable to the pooled test for nonadditivity employed by Anderson (1962). Whereas only 25% of Anderson's subjects showed nonadditivity, Table 1 shows that 60% of the subjects in the present study produced significant overall nonadditivity. These results indicate that when the experiment is designed to be maximally sensitive to the effects of differential weighting, a substantial number of people produce statistically significant interactions in the trait-judgment task.
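The 40.1% chance figure can be reproduced as the probability of at least one significant result in 10 tests at the .05 level. The short sketch below assumes the 10 tests are independent, which is an assumption made for this illustration and is not stated in the text; the variable names are likewise illustrative.

```python
alpha = 0.05          # per-test significance level
n_replications = 10   # tests per subject

# Probability that at least one of the tests is significant by chance,
# assuming the tests are independent.
familywise = 1 - (1 - alpha) ** n_replications

print(f"{familywise:.1%}")
```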

As noted in the presentation of the group data, a number of questions arise in interpreting significant interactions. First, the possibility must be examined that the interactions resulted from response scale nonlinearity rather than from unequal weighting. Second, it is possible that unequal weighting occurred for only a few stimulus replications (e.g., the two that were significant in the group analysis). Third, it may be that all significant interactions are the result of a single stimulus characteristic (e.g., extremely negative traits having a greater weight than all others), in which case all significant interactions for all persons would have the same pattern (as in Figures 1 and 2). The prediction of idiosyncratic weighting, in regard to the second and third questions above, is that all stimulus replications should be involved and that a variety of interaction patterns should be obtained for each replication.

Scale linearity. Although we discarded scale nonlinearity as an explanation for the two significant stimulus replications for the group data, such an explanation could still be viable for a majority of the individuals. Consequently, the pattern of significant and nonsignificant stimulus replication interactions was examined for the 24 subjects (60% of our sample) who displayed overall nonadditivity.

Of these 24 subjects, 14 satisfied the disordinality test by displaying at least one significant disordinal interaction. Of the 10 remaining subjects, 8 satisfied the intersection test for at least one significant interaction. For the remaining 2 subjects (Subjects 28 and 33), it was not possible to dismiss the nonlinearity explanation, since in both cases they showed the same interaction pattern for all 10 stimulus replications. That pattern was consistent with the interpretation that negative stimuli carry more weight than positive stimuli do. However, even after deletion of these 2 subjects, 55% of the sample showed genuine evidence of differential weighting.

Stimulus replications. The possibility remains that such widespread differential weighting is due to a limited number of stimulus replications. Two of the three subjects with significant interaction variance in the Anderson (1962) study were presented with the same stimulus replication.

Out of 400 interactions tested in the present study, 106 were significant. Table 2 shows the distribution of significant interactions over the 10 stimulus replications. It can be seen that the significant interactions were not restricted to just a few stimulus replications. One hundred percent of the replications were involved,


Table 1
Analyses of Individual Data for Experiment 1

Columns: Subject number (1 to 40); Pooled interaction F (a); Stimulus replication interactions 1 to 10 (b), each marked for significance.

[Table body scrambled in extraction; the assignment of values to subjects and the per-replication significance markers could not be recovered. Pooled interaction F values in extraction order: 1.04, 1.51, 3.36, 2.06, 1.19, 4.05, 3.33, 3.15, .95, 1.93, 1.09, .05, 7.03, 3.97, 3.98, 2.86, 3.47, 1.92, 1.22, 3.33, 3.47, 2.63, .67, 5.93, 1.19, 1.37, 18.39, 1.73, 2.27, 6.09, 18.87, 1.09, 25.21, .53, 1.25, 5.50, 2.14, .67, 6.13, 4.06.]

a df = 10, 156. b df = 1, 156. *p < .05. **p < .01.

and all contributed approximately equally. For the distribution of interactions over the 10 stimulus replications, χ²(9) = 5.89, p > .75.
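The χ²(9) value can be checked against the per-replication counts of significant interactions reported in Table 2 (12, 8, 12, 12, 8, 8, 12, 15, 7, 12). The sketch below assumes a goodness-of-fit test against a uniform distribution over the 10 replications; that choice of test is inferred from the reported statistic rather than stated in the text.

```python
# Number of subjects (out of 40) with a significant interaction in each
# of the 10 stimulus replications, as reported in Table 2.
counts = [12, 8, 12, 12, 8, 8, 12, 15, 7, 12]

# Expected count per replication if the significant interactions were
# spread evenly over the replications.
expected = sum(counts) / len(counts)

chi_square = sum((observed - expected) ** 2 / expected for observed in counts)
df = len(counts) - 1

print(sum(counts), round(expected, 1), df, round(chi_square, 2))
```

Under this assumption the computed statistic rounds to the value reported in the text.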

An average of 26.5% of the subjects had a significant interaction for the typical stimulus replication. The number of subjects (out of 40) with significant interactions ranged from a low of 7 to a high of 15 over stimulus replications. Even for the most additive of the stimulus replications, 17.5% of the subjects had a significant interaction. In contrast to the findings of the group data, then, it would appear that the pattern of unequal weighting observed in the individual analyses cannot be dismissed as due to only a few stimulus combinations.

Idiosyncratic weighting. The prediction of


Table 2
Number of Subjects in Each Stimulus Replication With a Significant Interaction for Experiment 1

Stimulus       Number of subjects     Number             Number
replication    with significant       interpretable      interpretable
               interaction            as w_p > w_n       as w_n > w_p
1              12                     1                  11
2              8                      3                  5
3              12                     5                  7
4              12                     6                  5
5              8                      1                  7
6              8                      1                  7
7              12                     1                  10
8              15                     0                  15
9              7                      1                  6
10             12                     2                  9
M              10.6                   2.1                8.2

Note. N = 40. The terms w_p and w_n refer to the weights given the more positive and the more negative traits, respectively.

idiosyncratic weighting derives from the reasoning that trait meaning and importance are acquired on the basis of each individual's personal linguistic experiences with the word and its referents. It follows from this that for most stimulus replications, some people should weight the negative traits more highly and others should weight the positive traits more highly. Also, when examining the significant stimulus replications for each subject, there should be some subjects who show both patterns of trait weighting. Since the group data showed evidence only of negative traits carrying more weight than positive, we must first ask whether there is any evidence for the opposite pattern in the individual data.

Both ordinal and disordinal interactions can be coded in terms of whether the more positive or the more negative trait is being given the most weight. Of the 103 codable² interactions in the present study, 82 (79.6%) were found to have a pattern consistent with the interpretation that negative traits carry more weight than positive traits do. The remaining 20.4% showed exactly the opposite pattern, that is, positive traits carried more weight than negative traits did.

By chance alone, it would be expected that half of the 400 interactions would show a negativity effect and half a positivity effect. At the .05 level of significance, this means that 10 tests should show significant negativity and 10 tests significant positivity. The observed frequencies of 82 and 21 were both significantly greater than the expected frequencies, χ²s(1) = 531.69 and 12.41, ps < .001, respectively. Unlike previous research at the normative level, clear evidence for a positivity effect is found for some subjects. Table 2 shows that in all but one stimulus replication, there existed at least one subject who weighted the positive traits more than the negative traits. An illustration of this diversity of interaction patterns is provided in Figure 3 for Stimulus Replication 7.
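The two chi-square values can be reproduced with a two-category goodness-of-fit computation (significant in the given direction versus not), using an expected frequency of 10 out of 400 tests. This form of the test is inferred from the reported numbers and is offered only as a sketch; the function name is hypothetical.

```python
def chi_square_vs_chance(observed, n_tests, chance_rate):
    """One-df goodness-of-fit statistic comparing an observed count of
    significant tests with the count expected by chance."""
    expected = n_tests * chance_rate
    hit_term = (observed - expected) ** 2 / expected
    miss_term = ((n_tests - observed) - (n_tests - expected)) ** 2 / (n_tests - expected)
    return hit_term + miss_term


# 400 tests; by chance, 2.5% are expected to be significant in each direction.
negativity = chi_square_vs_chance(82, 400, 0.025)
positivity = chi_square_vs_chance(21, 400, 0.025)

print(round(negativity, 2), round(positivity, 2))
```

Both results round to the values given in the text, which suggests, without confirming, that this was the computation used.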

Although there might be individual differences in the propensity to assign greater weight to negative or to positive traits, the notion of idiosyncratic weighting leads us to expect that a substantial number of people will display both weighting patterns over different stimulus replications. Of 23 subjects who provided more than one significant stimulus replication interaction, 9 (or 39%) showed both weighting patterns. This is illustrated for Subject 36 in Figure 4. The remaining subjects split 12 and 2 in showing exclusively higher negative weighting and higher positive weighting, respectively.

Discussion

In contrast to the group data, the individual data indicated that differential weighting in the trait-judgment paradigm was widespread. No stimulus replication was immune and no single pattern of differential weighting (e.g., negative traits receiving more weight) prevailed exclusively. It appears, then, that when a more sensitive experimental design than that used by Anderson (1962) is employed, clear differences emerge between the analyses of group judgments and the analyses of individual judgments. Whereas the equal weighting assumption holds for most stimuli at the normative level, idiosyncratic weighting is the dominant feature at the individual level.

² Three disordinal interactions could not be coded in these terms because the two lines intersected. When such "double disordinality" is present, it is impossible to determine the relative scale values of the stimulus traits for the stimulus replication.

Figure 3. Examples of four different significant interaction patterns obtained for Stimulus Replication 7 in Experiment 1. (The terms w_p and w_n refer to the weights given the more positive and the more negative traits, respectively.) [Panels: Subject no. 10 (ordinal, w_n > w_p); Subject no. 16 (ordinal, w_p > w_n); Subject no. 13 (disordinal, w_n > w_p); Subject no. 40 (intersecting disordinal). Lines: Coolheaded and Unenterprising; horizontal axis: Sly and Conscientious; vertical axis ticks at +5 and -5.]

Experiment 2

Experiment 2 was undertaken for three reasons. Since the outcome of Experiment 1 stood in direct contradiction to previous assumptions regarding equal weighting of items in the trait-judgment task, a need to replicate the findings was evident. Experiment 2 used the same individual-based design as Experiment 1 but employed 10 new stimulus replications and 80 new respondents.

Whereas Experiment 1 used "naturalistic" instructions, the majority of studies with the trait-judgment task have used "equal weighting" instructions. It is possible to argue (see Wyer, 1974a) that people can suppress their idiosyncratic weighting of traits and give them nearly equal weight when instructed to do so by the experimenter. This would be consistent with the finding that other kinds of experimenter instructions regarding weights (e.g., Anderson & Jacobson, 1965, "discounting" instructions) are known to be influential.

Figure 4. Examples of three different interaction patterns obtained from Subject 36 in Experiment 1. (The terms w_p and w_n refer to the weights given the more positive and the more negative traits, respectively.) [Panels: Replication No. 8 (ordinal, w_n > w_p); Replication No. 10 (ordinal, w_p > w_n); Replication No. 1 (disordinal, w_p > w_n). Recovered line labels: High-spirited, Rebellious, Clever, Overconfident; recovered horizontal-axis labels: Squeamish, Congenial, Nervous, Resourceful, Suspicious, Punctual; vertical axis ticks at +5 and -5.]

Two respondents in the first study produced the same form of significant interaction for all 10 stimulus replications. Consequently, it was not possible to eliminate the interpretation that these two were using a noninterval response scale. A procedural modification was introduced in Experiment 2 to reduce the likelihood that subjects would use a noninterval response scale. A total of 20 extreme anchor stimuli preceded and were interspersed among the 40 test stimuli in this study. This leads subjects to use the interior portions of the scale when rating the stimuli (Simpson, Ostrom, & Sloan, 1973) and thereby reduces the influence of any noninterval response tendencies that may intrude into either the positive or negative extremes of the scale. The use of such anchor stimuli has become widely adopted in the trait-judgment paradigm.

Method

Subjects

Subjects were 80 introductory psychology students from Ohio State University, 40 males and 40 females, who participated in partial fulfillment of course requirements.

Stimuli

New stimulus traits were randomly selected from the Edwards (Note 1) list. Subjects judged 40 test stimulus trait pairs from 10 randomly constructed stimulus set replications, where each set consisted of 4 trait pairs produced from a 2 × 2 factorial design. The two levels of likeability for the two trait factors were M+ and M-, as in Experiment 1. No semantically inconsistent or redundant trait pairs were allowed within a stimulus replication.

In order to adequately anchor the response scale, 12 pairs of extreme anchor stimuli preceded the 40 pairs of test stimuli, and 8 pairs of anchor stimuli were evenly interspersed among the 40 test sets. Half of the anchor pairs were positive, and half were negative.

Procedure

Subjects were given either "equal weighting" (N = 40) or "naturalistic" (N = 40) instructions. The naturalistic instructions were the same as in Experiment 1, and the equal weighting instructions are given below.

Each trait is equally important in describing the person. Sometimes, of course, the two words may seem inconsistent. That is to be expected, because each of the two people may see a different part of the person's personality. However, both traits are accurate, and each is equally important. You should pay equal attention to both.

After the experimenter answered any questions, subjects completed a practice booklet containing 15 stimulus pairs sampled from the entire range of the rating scale.

Subjects judged all 60 of the stimulus pairs (20 anchor and 40 test) in the experimental booklets on each of 5 successive days. Ratings were made on a 21-point scale ranging from 0 (dislike very much) to 20 (like very much). Five random orders of test stimuli were Latin square counterbalanced across days. Anchor stimulus pairs always appeared in the same order.

Results

The second experiment differed from the first in two important ways: (a) It employed both equal-weight and naturalistic instructions and (b) it introduced extremely positive and negative anchor stimuli in order to stabilize the response scale. No reliable differences were found between the equal-weight and naturalistic instructions, either for the group data or the individual data. The results in Figure 5 and Table 3 show highly comparable findings for the two conditions.

The introduction of extreme anchor stimuli did have its expected effect. Whereas the means for the M+M+ and M-M- cells deviated 5.0 scale units on the average from the scale midpoint in Experiment 1, they deviated only 3.9 scale units in the second study. Further, the adoption of anchor stimuli apparently succeeded in reducing the number of subjects employing a nonlinear response scale. Whereas in the first study 2 of 40 subjects were identified as possibly using nonlinear scales, none of the 80 subjects in the second study were so identified.


Group Data

As in the first study, the overall interaction was significant, F(1, 78) = 27.97, p < .001, and as Figure 5 shows, was not significantly reduced under equal weighting instructions, F(1, 78) < 1. The total proportion of the between-cells variance contributed by the interaction was .31%. Separate analyses were done for each instruction condition, and the interaction was significant for both, Fs(1, 39) > 13.87, ps < .001, with neither contributing more than .40% of the between-cells variance (see Table 3).

When tested separately, 4 of the 10 stimulus replications (1, 4, 6, and 10) produced significant interactions, Fs(1, 3042) > 3.97, ps < .05.³ Of the 20 stimulus pairs in this study, between 60% and 80% showed no significant departure from equal weighting at this normative level. This is slightly lower than was observed in the first study (for which the comparable estimate was between 80% and 90%). The observed increase in number of significant interactions was due to the doubling of sample size. When each instruction condition was tested separately, only two replications were significant in each (4 and 6).

Three of the four replications satisfied the intersection test and could not therefore be dismissed on the grounds of scale nonlinearity. All significant replications had the same pattern as Figure 5, indicating that when averaged over all subjects, negative stimuli appeared to carry more weight than positive stimuli. There was, however, only modest evidence that the significant stimulus replications contained the most negative traits. Using the lowest marginal mean as an index of each replication's most negative trait, the four significant replications ranked second, third, fourth, and seventh most negative.

Overall, the results for these group data substantially replicate the group results for Experiment 1.

Individual Data

Over three quarters (77.5%) of the subjects had at least one significant interaction. Although this figure was quite comparable to that for Experiment 1, there was a reduction in the percentage of subjects who showed significant pooled interaction variance (on 10 and 156 degrees of freedom), from 60% to 46%. Even though this figure is slightly below 50%, it is still substantially higher than a baseline provided either by chance, χ²(1) = 286.30, p < .001, or by the 25% level observed in Anderson's (1962) study, χ²(1) = 19.27, p < .001.

The discrepancy between the two studies is reduced somewhat when those subjects whose interactions might have been the result of nonlinear response scales are eliminated from consideration. Whereas 2 subjects were deleted on these grounds in the first study (reducing the figure to 55%), none were eliminated from the second study (leaving the figure at 46%). In examining the scale nonlinearity interpretation for the 37 nonadditive subjects in the second study, 22 satisfied the disordinality test, and 15 the intersection test.

Figure 5. Mean likeability ratings averaged over days, stimulus replications, and subjects in Experiment 2. (M+ = moderately favorable. M- = moderately unfavorable. Naturalistic instructions results are in dashed lines, and equal weighting instructions results are in solid lines.) [Horizontal axis: normative scale value, levels M- and M+; vertical axis ticks at +5 and -5.]

Stimulus replications. Once again, all stimulus replications contributed approximately equally to the significant interactions obtained

³ The error term for these analyses was the interaction between stimulus persons (40) and subjects (40) nested within instruction conditions (2).


Table 3
Comparison of Experiments 1 and 2 for Group and Individual Data

                                                                    Experiment 2
Item                                          Experiment 1   Naturalistic   Equal-weight   Combined
                                                             instructions   instructions

Group data
  Percent between-cells variance due to
    interaction                               1.00           .26            .40            .31
  Number of significant stimulus
    replications                              2              2              2              4

Individual data
  Percent of subjects with at least one
    significant interaction (a)               82.5           72.5           82.5           77.5
  Percent of subjects with significant
    pooled interactions (b)                   60.0           45.0           47.5           46.25
  Percent of stimulus replications with
    at least one significant interaction      100            100            100            100
  Total number of significant interactions    106            57             76             133
  Percent of all significant interactions
    in which:
      w_n > w_p                               79.6           74.1           78.4           76.6
      w_p > w_n                               20.4           25.9           21.6           23.4

Note. The total number of subjects in Experiment 1 was 40. Experiment 2 had 40 subjects in each of the two instruction conditions. The terms w_p and w_n refer to the weights given the more positive and the more negative traits, respectively.
a Chance = 40.1%. b Chance = 5%.

at the individual level, χ²(9) = 13.69, p > .10. This occurred, however, despite a substantial decrease in the average percentage of subjects showing a significant interaction for the typical stimulus replication. The figure dropped from 26.5% to 16.6%, with a range, in the second study, from 7.5% to 27.5%.

The decrease in the second study may have resulted from the introduction of 20 extreme anchor stimuli. The anchor stimuli were intended to eliminate some interactions that were due to scale nonlinearity. Since they increased the number of stimulus persons to be judged each day by 50%, however, they may also have induced a more additive integration set by making the task more boring.

Idiosyncratic weighting. Of the 128 codable⁴ interactions, 76.6% showed a pattern interpretable as meaning that negative traits carried more weight than positive traits, and 23.4% showed a pattern favoring positive traits. This was very close to the split observed in Experiment 1 (see Table 3). Also, as in Experiment 1, both the number of significant negativity and positivity interactions exceeded chance, χ²(1) = I624, p < .001, and χ²(1) = 5.13, p < .05, respectively. Instances of both weighting patterns appeared in 8 of the 10 stimulus replications, the remaining 2 containing only interactions in which the negative trait received more weight.

As in the first study, a substantial number of subjects displayed both weighting patterns. Out of the 37 subjects with more than one significant stimulus replication interaction, 35.1% showed both patterns. All but 1 of the remaining 24 subjects gave the negative traits more weight than the positive traits.

Discussion

The results of Experiment 2 replicated the idiosyncratic weighting findings of Experiment 1 and demonstrated as well that idiosyncratic weighting is obtained under both naturalistic and equal weighting instructions. The absence of differences between instruction conditions at the level of group analyses is consistent with several previous studies finding no differences in trait judgments between naturalistic and equal weighting instructional conditions (Anderson & Jacobson, 1965; Gollob & Lugg, 1973; Lampel & Anderson, 1968; Wyer, 1974b). Whereas people appear able to increase differential weighting tendencies when given "discounting" instructions (Anderson & Jacobson, 1965; Kaplan, 1973), they did not reduce their differential weighting tendencies when specifically instructed to weight all traits equally.

⁴ Five disordinal interactions were not codable, four because the lines intersected (see Footnote 2) and one because the marginal means were exactly identical on one of the factors in the 2 × 2 design.

General Discussion

The data reported in this article support the conclusion that it is incorrect to describe the integration process in the trait-judgment paradigm as following an equal weighting averaging rule at the level of the individual subject. This conclusion is in contradiction to previous descriptions of the trait-judgment task, descriptions that were based upon the one previous direct test of individual differences in trait weights (Anderson, 1962). The difference in outcomes between this and the Anderson (1962) study can be attributed to the far greater sensitivity of the present experimental design.

Implications for Information Integration Theory

Generality of idiosyncratic weighting. The trait-judgment paradigm is not the only social judgment domain in which differential weighting occurs at the individual level. Significant nonadditivity has been obtained in several other areas. Leon, Oden, and Anderson (1973) found it for 38% of 16 subjects who judged the "badness" of a group of people guilty of committing various crimes; Anderson (1972) obtained it for 83% of 6 subjects who judged, on the basis of behavior items, the severity of psychiatric disturbances; Troutman and Shanteau (1976) observed it for an average of 28% of 20 subjects who judged the quality of disposable diapers in one replication and infant car seats in the others; and Shanteau and Anderson (1969) obtained it for an average of 25% of 20 subjects over four stimulus replications in which subjects rated their preferences for different food and beverage combinations. In the last mentioned study, 65% of the subjects showed at least one significant interaction over the four replications.

This extensive evidence of differential weighting at the individual level was obtained despite the fact that none of the above studies incorporated all of the precautions for maximizing sensitivity employed in the present research. Yet none of these percentage figures are reasonably close to the five percent chance level (although due to small sample sizes, some may not statistically differ from it). It would appear, then, that at present there exists no social judgment domain that could plausibly be regarded as an instance where an equal weight averaging integration rule held.

The absence of any verified instance of equal weighting encourages speculation that there may, in fact, exist no social judgment domain for which equal weighting holds. Such a state of nature would follow from the premise that items of social information acquire meaning and importance on the basis of highly individualistic experiences with the signs and symbols conveying that information. Certainly on these grounds, it would seem that any social information conveyed lexically would be susceptible to idiosyncratic weighting differences. A more promising domain where equal weighting may prevail would involve purely sensory stimuli responded to through a nonsemantic mode (as is often done in cross-modality matching).

Normative versus individual levels of analysis. The present studies show that findings obtained at the group (or normative) level do not necessarily replicate for each member of that group. Consequently, it is appropriate to question whether other empirical findings obtained at the group level in the trait-judgment task (e.g., set size, serial position, and inconsistency discounting effects) hold for all individuals. There are no adequate data presently available that allow us to estimate what percentage of people employ an averaging rule (as opposed to a summative or multiplicative rule) in the trait-judgment task. This possibility of individual differences in integration rule can be illustrated by a study in another social judgment domain (Leon, Oden, & Anderson, 1973). They conducted analyses on individual subject data and found significant set size effects for some subjects and not for others.

The difference in outcome between group and individual responses should not be dismissed as reflecting mere individual differences around a group average. Instead, the two types of data should be regarded as fundamentally different levels of analysis, each appropriate to different scientific objectives.

Data from an individual, when obtained over repeated observations, are definitive for understanding the integration rule used by that person. If the objective is to establish that the typical person uses a particular integration rule, it is necessary to show that most people, when studied as individuals, employ that integration rule. Such a generalization cannot be made on the basis of group data.

This is not meant to dismiss the importance of the multitude of findings previously obtained at the group level. Group data inform us as to the integration rule that characterizes the modal societal reactions made in a particular response domain. At this level of analysis, information integration theory provides estimates of the normative weights and scale values of information items in that domain. It also establishes a normative integration rule for the subject group. This is clearly a descriptive enterprise that need have no implications for how individuals subjectively integrate information. There are many problem areas for which it is of direct interest to study the integration rule underlying the modal group response, areas such as stereotyping, advertising, political preferences, and jury decision making. For example, it is useful to know that jurors in aggregate can presume innocence (Ostrom, Werner, & Saks, 1978), even though this presumption may not hold for all of the jurors considered individually.

Ambiguity of the "parallelism" test. Up to this point in the paper we have chosen to explain deviations from parallelism in terms of unequal trait weights. But to do so requires us to assume that the theoretical scale value of each trait is unaffected by the nature of the other trait in the pair. Unlike trait weight, the scale value is viewed as nonlabile. This assumption has been made in most integration theory interpretations of differential weighting research and has been defended as a legitimate strategy for model building (Himmelfarb, 1975).

Some recent studies (see Ostrom, 1977) have attacked that assumption and have established the plausibility of scale value lability. The cognitive representation and accompanying evaluative response evoked by a trait appear to be directly affected by the evaluative tone of the context traits. If such meaning shift processes do occur in the trait-judgment domain, the interpretation of deviations from parallelism becomes much more difficult. Significant interactions could be due to differential weighting, to scale value shifts (with equal weighting), or to both.

Most of the significant interactions observed in the present studies had an ordinal or disordinal, nonintersecting pattern. These could all be interpreted either in terms of differential weighting or in terms of scale value shift. However, there is one logically possible interaction pattern that, if obtained, could not be explained by a differentially weighted averaging model, namely, a pattern that is both disordinal and intersecting (see the right panel of Figure 3). Given that people are using an averaging integration rule, this pattern could only be obtained if the scale values of the row traits reversed their relative positivity, depending on which column trait they were paired with. Although such patterns were observed in the present studies (three in the first and four in the second), they represented only 2.93% of all significant interactions.

The major findings of the present studies could be explained as easily in terms of scale value shift as they could by differential weighting. Namely, it could be concluded that a majority of the subjects displayed scale value shifts, that all stimuli were capable of undergoing such shifts, and that these shifts are highly idiosyncratic. For any particular trait pair, some people show greater shift in a negative context, others in a positive context, and other people show no shift at all.

Limitations of a differential weighting averaging model. No fundamental difficulty is created for information integration theory if idiosyncratic weighting proves to be the state of nature in social judgment (or even all judgment) domains. Its objective is, after all, the specification of an integration rule, and determination of whether equal or differential weighting occurs. The problems created are more ones of experimental convenience.

The equal weight averaging model has a remarkably useful feature. When it holds, and when subjects are responding on an interval judgment scale, no interaction should be statistically detectable when orthogonally composed stimulus sets are being judged. This has allowed researchers to draw three important conclusions when such parallelism emerges: (a) An equal weight additive model is appropriate (discarding both unequal weight averaging models and multiplicative models), (b) the interval property of the response scale is validated, and (c) the marginal means offer estimates of stimulus scale values on an interval scale. The empirical procedures needed to discriminate between an averaging model and a multiplicative model, to validate a response scale, and to obtain interval estimates of scale values are much more complicated in the absence of equal weighting.
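The parallelism property can be illustrated numerically. The sketch below uses a simplified weighted average of two trait scale values; it omits any initial-impression term, lets weight depend only on trait valence, and uses hypothetical scale values and weights, so it is an illustration under those assumptions rather than the model as fitted in this research.

```python
def averaged_response(values, weights):
    """Weighted average of the scale values of the traits in a pair."""
    return sum(w * s for w, s in zip(weights, values)) / sum(weights)


def interaction_contrast(scale, weight):
    """Interaction contrast of the 2 x 2 design formed by crossing a
    moderately positive ('+') and a moderately negative ('-') trait level."""
    cell = {
        (row, col): averaged_response([scale[row], scale[col]],
                                      [weight[row], weight[col]])
        for row in "+-" for col in "+-"
    }
    return cell["+", "+"] - cell["+", "-"] - cell["-", "+"] + cell["-", "-"]


# Hypothetical scale values, expressed as deviations from the scale midpoint.
scale = {"+": 5.0, "-": -5.0}

equal = interaction_contrast(scale, {"+": 1.0, "-": 1.0})
unequal = interaction_contrast(scale, {"+": 1.0, "-": 2.0})

print(equal, round(unequal, 3))
```

With equal weights the contrast is exactly zero (parallel lines); giving the negative trait twice the weight of the positive trait produces a nonzero contrast, which is the kind of deviation from parallelism the interaction test detects.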

Another disadvantage is that the trait judgment task cannot be used as a validational baseline for investigating other stimulus domains. For example, Lampel and Anderson (1968) had subjects judge persons described by a photograph and several traits. The traits were factorially constructed, and since no interaction was obtained, the authors concluded that the response scale was validated and that all traits received equal weight. These two conclusions allowed an unambiguous interpretation of an interaction that was obtained between trait valence and photograph attractiveness. The possibility of a nonlinear response scale and the possibility of differential trait weights could both be ruled out because of the absence of a Trait × Trait interaction. This allowed the investigators to conclude (by a process of elimination) that there was a differential weighting of the photographs in which weight was inversely related to photograph attractiveness. The present finding of idiosyncratic weighting would rule out the use of the trait-judgment domain for such purposes in future research at the individual level.

Implications for Person Perception Research

The stimulus domain studied in this research was fairly narrowly defined, being restricted solely to person characteristics in the form of personality trait adjectives. Even that domain, however, was not fully represented in the traits sampled for use in the present studies (from Anderson, 1968). The traits did not include slang terms or adjectives solely descriptive of mood or feeling states (e.g., Bush, 1973). It seems unlikely, however, that either slang or feeling terms would be more characterized by equal weighting than were the sampled adjectives. In fact, slang terms may well be even more prone to idiosyncratic weighting, given respondents who are selected from two or more different social groupings.

There is a wide variety of information items that have been used in person perception research, including personal attitudes, hobbies and interests, demographic characteristics, group memberships, and behavioral acts and intentions. It is, of course, possible (although we do not regard it as probable) that one or more of these categories may represent an equal weighting domain.

Although the present studies were not tailored to investigate the determinants of trait weight, the idiosyncratic weighting findings suggest two possible avenues of exploration. There was some suggestion of individual differences in whether or not weight was associated with scale value. Nearly two-thirds of the subjects who had more than one significant stimulus replication showed the same weighting pattern (either negativity or positivity) for each significant replication. Conceivably, such a dispositional tendency could be related to other individual differences in positivity or negativity in impression judgments (see Kaplan, 1973). A second approach would be to relate trait weight to the personal constructs (Kelly, 1955; Rosenberg, 1977) each individual characteristically uses to describe important others in his or her social world.

Reference Note

1. Edwards, J. D. Revised likeableness ratings of 554 personality trait adjectives. Unpublished manuscript, Ohio State University, 1967.

References

Anderson, N. H. Application of an additive model to impression formation. Science, 1962, 138, 817-818.

Anderson, N. H. Primacy effects in personality impression formation using a generalized order effect paradigm. Journal of Personality and Social Psychology, 1965, 2, 1-9.

Anderson, N. H. Likeableness ratings of 555 personality trait words. Journal of Personality and Social Psychology, 1968, 9, 272-279.

Anderson, N. H. Looking for configurality in clinical judgment. Psychological Bulletin, 1972, 78, 93-102.

Anderson, N. H. Information integration theory: A brief survey. In D. H. Krantz, R. C. Atkinson, R. D. Luce, & P. Suppes (Eds.), Contemporary developments in mathematical psychology (Vol. 2). San Francisco: Freeman, 1974.

Anderson, N. H. How functional measurement can yield validated interval scales of mental quantities. Journal of Applied Psychology, 1976, 61, 677-692.

Anderson, N. H., & Jacobson, A. Effect of stimulus inconsistency and discounting instructions in personality impression formation. Journal of Personality and Social Psychology, 1965, 2, 531-539.

Anderson, N. H., & Lopes, L. L. Some psycholinguistic aspects of person perception. Memory and Cognition, 1974, 2, 67-74.

Asch, S. E. Forming impressions of personality. Journal of Abnormal and Social Psychology, 1946, 41, 258-290.

Birnbaum, M. H. The nonadditivity of personality impressions. Journal of Experimental Psychology, 1974, 102, 543-561.

Bush, L. E., II. Individual differences multidimensional scaling of adjectives denoting feelings. Journal of Personality and Social Psychology, 1973, 25, 50-57.

Gollob, H. F., & Lugg, A. M. Effect of instructions and stimulus presentation on the occurrence of averaging responses in impression formation. Journal of Experimental Psychology, 1973, 98, 217-219.

Himmelfarb, S. On scale value and weight in the weighted averaging model of integration theory. Personality and Social Psychology Bulletin, 1975, 1, 580-583.

Hodges, B. H. Effect of valence on relative weighting in impression formation. Journal of Personality and Social Psychology, 1974, 30, 378-381.

Kanouse, D. E., & Hanson, L. R. Negativity in evaluation. Morristown, N.J.: General Learning Press, 1972.

Kaplan, M. F. Context effects in impression formation: The weighted average versus the meaning-change formulation. Journal of Personality and Social Psychology, 1971, 19, 92-99.

Kaplan, M. F. Stimulus inconsistency and response dispositions in forming judgments of other persons. Journal of Personality and Social Psychology, 1973, 25, 58-64.

Kaplan, M. F. Evaluative judgments are based on evaluative information: Evidence against meaning change in evaluative context effects. Memory and Cognition, 1975, 3, 375-380.

Kelly, G. A. A theory of personality: The psychology of personal constructs. New York: Norton, 1955.

Lampel, A. K., & Anderson, N. H. Combining visual and verbal information in an impression formation task. Journal of Personality and Social Psychology, 1968, 9, 1-6.

Leon, M., Oden, G. C., & Anderson, N. H. Functional measurement of social values. Journal of Personality and Social Psychology, 1973, 27, 301-310.

McKillip, J., Barrett, G., & Dimiceli, A. J. Trait ambiguity and impression formation: Sufficiency tests of the meaning change model. Journal of General Psychology, 1978, 98, 161-171.

Oden, G. C., & Anderson, N. H. Differential weighting in integration theory. Journal of Experimental Psychology, 1971, 89, 152-161.

Ostrom, T. M. Between-theory and within-theory conflict in explaining context effects in impression formation. Journal of Experimental Social Psychology, 1977, 13, 492-503.

Ostrom, T. M., Werner, C., & Saks, M. J. An integration theory analysis of jurors' presumptions of guilt or innocence. Journal of Personality and Social Psychology, 1978, 36, 436-450.

Rosenbaum, M. E., & Levin, I. P. Impression formation as a function of source credibility and order of presentation of contradictory information. Journal of Personality and Social Psychology, 1968, 10, 167-174.

Rosenberg, S. New approaches to the analysis of personal constructs in person perception. In J. K. Cole and A. W. Landsfield (Eds.), Nebraska Symposium on Motivation (Vol. 24). Lincoln: University of Nebraska Press, 1977.

Schmidt, C. F. Personality impression formation as a function of relatedness of information and length of set. Journal of Personality and Social Psychology, 1969, 12, 6-11.

Schumer, R. Context effects in impression formation as a function of the ambiguity of test traits. European Journal of Social Psychology, 1973, 3, 333-338.

Shanteau, J. C., & Anderson, N. H. Test of a conflict model for preference judgment. Journal of Mathematical Psychology, 1969, 6, 312-325.

Simpson, D. D., Ostrom, T. M., & Sloan, L. R. Anchoring effects of trait range in impression formation. Bulletin of the Psychonomic Society, 1973, 2, 383-384.

Troutman, C. M., & Shanteau, J. Do consumers evaluate products by adding or averaging attribute information? Journal of Consumer Research, 1976, 3, 101-106.

Warr, P. B. Inference magnitude, range, and evaluative direction as factors affecting relative importance of cues in impression formation. Journal of Personality and Social Psychology, 1974, 30, 191-197.

Warr, P., & Jackson, P. The importance of extremity. Journal of Personality and Social Psychology, 1975, 32, 278-282.

Warr, P., & Jackson, P. Three weighting criteria in impression formation. European Journal of Social Psychology, 1976, 6, 41-50.

Wyer, R. S. Information redundancy, inconsistency, and novelty and their role in impression formation. Journal of Experimental Social Psychology, 1970, 6, 111-127.

Wyer, R. S. Category ratings as 'subjective expected values': Implications for attitude formation and change. Psychological Review, 1973, 80, 446-467.

Wyer, R. S. Cognitive organization and change: An information-processing approach. Potomac, Md.: Erlbaum Associates, 1974. (a)

Wyer, R. S. Changes in meaning and halo effects in personality impression formation. Journal of Personality and Social Psychology, 1974, 29, 829-835. (b)

Wyer, R. S., & Watson, S. F. Context effects in impression formation. Journal of Personality and Social Psychology, 1969, 12, 22-23.

Received November 27, 1978