

3.1 CONTINGENCY ANALYSIS

Validating Evidence and Process

CHARLES OSGOOD*


INFERENCES AND INDICATORS

Making an inference (or prediction) in content analysis involves at least the following: (1) some indicator or class of indicators that can be identified in the message sequence, (2) some state or process in the individuals producing or receiving the message, and (3) some dependency between these two such that the presence, absence, or degree of the former is correlated more than by chance with the presence, absence, or degree of the latter.** The events in messages that might serve as indicators (correlate unspecified) are practically infinite: the frequency or locus of occurrences of the first-person singular pronoun “I,” the sheer magnitude or rate of output in word or other units, pitch and/or intensity oscillation of the voice in various message segments, the probability level of the syntactical alternatives chosen, and so on ad infinitum. Similarly, the states of individuals that one might make inferences about (again, correlate unspecified) are as infinite as the classificatory ingenuity of all of the members of the American Psychological Association put together: the intelligence, communicative facility, or racial origin of the speaker, his anxiety, aggression, or sexuality level, his association, attitude, or value structure, his semantic or formal language habits, and so on. Most (if not all) of the characteristics of an individual, in one way or another, probably influence what happens in his communications. But the rub lies in (3) above (some indicator having a non-chance relation to the characteristic in which we are interested must be isolated), and so far, psycholinguistics has had little more than suggestions or hunches to offer.

[Content analysts] . . . are likely to be most interested in specific inferences; for instance, does country A intend to attack country B and when? Driven by internalized demands for scientific rigor, the academically oriented user of content analysis is likely to be most interested in general inferences; for instance, is there a general lawful relationship such that increase in the drive level of the speaker is accompanied by simplification and normalization of his semantic and structural choices? . . . [T]here is no necessary incompatibility here: just as the validation of many specific inferences by practically oriented users may provide insights into general relations, so the gradually accumulating generalities of the academician may enrich the inference base for the practical content analyst. The ideal situation is probably that in which “tool makers” and “tool users” work in close association.

*From Osgood, C. K. (1959). The representational model and relevant research methods. In I. de Sola Pool (Ed.), Trends in content analysis (pp. 33–88). Urbana: University of Illinois Press. (Excerpt represents pp. 33–37, 54–71, and 73–77).

**When this text-context relationship is operationalized for use in a content analysis, we call it an “analytical construct.” See Krippendorff, K. (2004:34–35, 171–187). Content analysis: An introduction to its methodology (2nd ed.). Thousand Oaks, CA: Sage.

Of all . . . source or receiver characteristics, which might be inferred from the content of their communications, . . . four (are outstanding. Attention or interest, inferred from the relative frequencies with which lexical items are produced; Attitudes, inferred from the use of evaluative terms; Language correspondences or linguistic habits, inferred from context-dependent expectations; and Association structures, inferred from the contingencies between content items in a source’s messages, regardless of either frequency of usage or evaluation. This chapter concerns the last of these.)

ASSOCIATIONS AND DISSOCIATIONS

An inference about the “association structure” of a source (what leads to what in its thinking) may be made from the contingencies (or co-occurrences of symbols) in the content of a message. This inference is largely independent of either “attention level” (frequency) or “evaluation” (valence). One of the earliest published examples of this type of content analysis is to be found in a paper by Baldwin (1942) in which the contingencies among content categories in the letters of a woman were analyzed and interpreted.

If there is any content analysis technique that has a defensible psychological rationale, it is the contingency method. It is anchored to the principles of association, which were noted by Aristotle, elaborated by the British Empiricists, and made an integral part of most modern learning theories. On such grounds, it seems reasonable to assume that greater-than-chance contingencies of items in messages would be indicative of associations in the thinking of the source. If, in the past experience of the source, events A and B (e.g., references to FOOD SUPPLY and to OCCUPIED COUNTRIES in the experience of Joseph Goebbels) have often occurred together, the subsequent occurrence of one of them should be a condition facilitating the occurrence of the other: the writing or speaking of one should tend to call forth thinking about and hence producing the other. It also seems reasonable to assume that less-than-chance contingencies of items in messages would be indicative of dissociations in the thinking of the source. If, in the experience of the source, events A and B (e.g., MOTHER and SEX in a psychotherapy case) have often been associated, but with fear or anxiety, the occurrence of one of them should lead to the inhibition of the other. Such inhibition might be either central (unconscious and involuntary) or peripheral (conscious and deliberate).

AN EXPERIMENTAL TEST OF THE BASIC ASSUMPTIONS¹

In applying contingency analysis to real problems, such as propaganda study and psychotherapy, we would like to use the data about what things co-occur in messages to make inferences about a person’s association structure and also about what things have gone together in his (or her) experience; that is, about the experiential basis for his or her association structure. Unfortunately, however, in such application situations we seldom if ever have any data with which to validate our inferences. Usually we have only the messages produced, not the source who produced the messages (and who could give us other indices of his association structure) and certainly not the history of his experience. In order to test the basic assumptions of this method, therefore, it is necessary to develop a controlled experimental situation in which (1) the experiential history can be approximately known, and (2) the association structure can be estimated independently of the message structure. The following experiment provides such conditions.

Hypotheses and General Design

Our general assumption is that (1) contingencies in experience come to be represented in (2) an individual’s association structure by patterns of association and dissociation of varying strengths, which help determine (3) the contingencies in messages produced by this individual. We require a simple situation in which we can measure

(1) Fa(b) > Fa(c) > Fa(d) . . . > Fa(n): the varying frequencies (F) in experience with which an event (a) is followed by other events (b, c, d . . . n);

(2) Pa(b), Pa(c), Pa(d) . . . Pa(n): the varying probabilities (P) with which subjects exposed to the above experience will associate items b, c, d . . . n when some other person (the experimenter) gives a, thus providing a measure of association structure (associational probability); and

(3) P*a(b), P*a(c), P*a(d) . . . P*a(n): the varying probabilities with which subjects exposed to the above experience will produce items b, c, d . . . n after they themselves have produced a. This provides a measure of message contingency (transitional probability).

If we think of the subject in this experiment as a communicating unit in the information theory sense, Fa(b) is the input to the unit and P*a(b) is the output. The experimenter determines the input in such a way that Fa(b) > Fa(c) > Fa(d). (It is assumed that before this experience the associations between these items are random across subjects, and materials for the experiment are selected to approximate this condition.) The theory, which we are testing, may be stated more formally as a series of hypotheses.

Hypothesis I. Exposure to a sequence of paired events such that Fa(b) > Fa(c) > Fa(d) will result in a non-chance association structure among these events such that Pa(b) > Pa(c) > Pa(d).

Hypothesis II. Given an association structure such that Pa(b) > Pa(c) > Pa(d) in a set of subjects, sequential messages by these subjects limited to these events will display contingencies (transitional probabilities) such that P*a(b) > P*a(c) > P*a(d).

Hypothesis III. Given exposure to a sequence of paired events such that Fa(b) > Fa(c) > Fa(d) and subsequent production of sequential messages limited to these events, message contingencies will be such that P*a(b) > P*a(c) > P*a(d). This dependency relation between input and output assumes mediation via the subject’s association structure.

Hypothesis IV. The dependency relation between association structure and message contingency (described in Hypothesis II) will be greater than the dependency relation between input contingency and message contingency (described in Hypothesis III). This derives from the assumption that message contingencies depend directly upon the association structure of the subject and only mediately upon his experience; to the extent that individual subjects have prior associative experience with the items, these associations will also influence the final structure.

Hypothesis V. The degree of dependence (1) of association structure upon experiential contingency and (2) of message contingency upon experiential contingency will be a direct function of the frequency of experiential contingency, Fa(b). In other words, we assume that modification of association structure (and hence transitional message structure) varies with the frequency with which events are paired in experience, a straightforward psychological association principle. With respect to measurement, this implies that the more frequent pairings in experience have been, the more significant will be the deviations of associational and transitional probabilities from chance.

Hypothesis VI. The degree of dependence between association structure and message contingency will be relatively independent of the frequency of experiential contingency. This assumes that whatever pre-experimental associations between items exist in individual subjects will determine both associative and transitional (message) contingencies; hence dependency relations here should be relatively independent of experimental inputs.

The burden of this analysis, if substantiated in the results, would be that contingency content analysis provides a valid index of the association patterns of the source, but only a mediate and tenuous index of his life history. It is realized, of course, that this “laboratory” approach side-steps many of the problems that arise in practical applications of contingency analysis; some of these will be considered later under a critique of the method.

Method

Two groups of 100 subjects each were shown 100 successive frames of a single-frame film strip. On each frame was a pair of girls’ names, for example BEATRICE-LOUISE. There were only ten girls’ names altogether, but these were so paired that (1) each name would appear equally often on the left and on the right, (2) the ordering of frames with respect to names was random, and (3) each name appeared with others with different frequencies (the main experimental variable). The pattern of input pairing shown below for JOSEPHINE was duplicated for each of the ten names (with different specific names, of course):


JOSEPHINE-BEATRICE  6        LOUISE-JOSEPHINE  6
JOSEPHINE-CYNTHIA   3        GLADYS-JOSEPHINE  3
JOSEPHINE-HAZEL     1        ESTHER-JOSEPHINE  1
                 with SARAH     0
                 with ISABELLE  0
                 with VALERIE   0
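The arithmetic implied by this pairing pattern can be checked with a short Python sketch. The function name `input_percentages` and the tuple layout are illustrative assumptions, not part of the original study; only the names and frame counts come from the table above.

```python
# Pairing pattern for one name, taken from the table above.
# Each entry: (left name, right name, number of frames).
JOSEPHINE_PAIRS = [
    ("JOSEPHINE", "BEATRICE", 6),
    ("JOSEPHINE", "CYNTHIA", 3),
    ("JOSEPHINE", "HAZEL", 1),
    ("LOUISE", "JOSEPHINE", 6),
    ("GLADYS", "JOSEPHINE", 3),
    ("ESTHER", "JOSEPHINE", 1),
]


def input_percentages(name, pairs):
    """Return (frames showing `name`, percent of those frames shared with each partner).

    Left/right position is ignored when computing the shares.
    """
    total = sum(n for left, right, n in pairs if name in (left, right))
    shares = {}
    for left, right, n in pairs:
        if name in (left, right):
            partner = right if left == name else left
            shares[partner] = shares.get(partner, 0) + 100 * n / total
    return total, shares


total_frames, shares = input_percentages("JOSEPHINE", JOSEPHINE_PAIRS)
print(total_frames)  # 20
print(shares["BEATRICE"], shares["CYNTHIA"], shares["HAZEL"])  # 30.0 15.0 5.0
```

With ten names each appearing on 20 frames, and two names per frame, the design accounts for exactly the 100 frames shown.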

Subjects were asked simply to familiarize themselves with the names. Following viewing of the 100 frames, two different measures were taken. (1) Association test. Each of the girls’ names was shown separately on the screen for eight seconds, and subjects were instructed to write down the first other girl’s name that occurred to them. Here the experimenter provides the stimulus (associative probability). (2) Transitional contingency test. Subjects were given little booklets and instructed to write one girl’s name successively on each page, filling in as many pages as they could and not looking back. Here the stimulus for each response is the subject’s own previous behavior.

The group that had the associational test first and the transitional last will be referred to as Group I; the one that had the transitional test first and the associational last will be called Group II. Three tables were formed for each group. The input table, the same for both groups, gave the relative frequency (percent) with which each name had been paired with every other name on the presentation frames, without regard to the forward or backward direction of association. Since each name appeared on 20 frames, an item paired six times with another would have this noted as 30 percent of its appearances, three times, 15 percent, and one time, 5 percent. The association table gave, for each stimulus name, the relative frequency (percent) of subjects giving each of the ten possible response names. Thus, if 16 of the 100 subjects wrote BEATRICE when they saw JOSEPHINE, 16 percent was entered in the appropriate cell. The transitional table gave, for each self-produced stimulus name, the relative frequency (percent) of subjects giving each of the ten possible response names. Thus, if JOSEPHINE appeared in the booklets of 79 subjects and was followed immediately by the name HAZEL in the booklets of 11 of these subjects, 14 percent was entered in the appropriate cell.²
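A minimal Python sketch of how a transitional table of this kind might be tabulated from the booklets follows. The function name, the sample booklets, and the rule that a subject counts once per pair however often the transition recurs are all assumptions made for illustration; the text does not spell out how repeated transitions were handled.

```python
from collections import defaultdict


def transitional_table(booklets):
    """Percent of subjects who wrote name a and followed it immediately with b.

    `booklets` holds one list of names per subject, in the order written.
    Assumption: a subject counts once per (a, b) pair, even if it recurs.
    """
    produced = defaultdict(int)  # subjects whose booklet contains a
    followed = defaultdict(int)  # subjects who wrote a, then b, at least once
    for names in booklets:
        for a in set(names):
            produced[a] += 1
        for pair in set(zip(names, names[1:])):
            followed[pair] += 1
    return {(a, b): 100 * n / produced[a] for (a, b), n in followed.items()}


# Hypothetical booklets from four subjects.
booklets = [
    ["JOSEPHINE", "HAZEL", "LOUISE"],
    ["JOSEPHINE", "BEATRICE"],
    ["LOUISE", "JOSEPHINE", "BEATRICE"],
    ["HAZEL", "JOSEPHINE"],
]
table = transitional_table(booklets)
print(table[("JOSEPHINE", "BEATRICE")])  # 50.0
print(table[("JOSEPHINE", "HAZEL")])     # 25.0

# The worked figure in the text: 11 of 79 subjects rounds to 14 percent.
print(round(100 * 11 / 79))  # 14
```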

Results

Table 1 gives the correlations obtained among input, association structure, and message contingency. With regard to the first hypothesis, it can be seen that the r between input frequency and associative probability is .58 for Group I and .37 for Group II; these are both significantly greater than zero and in the expected direction. The fact that the correlation is considerably higher for Group I than Group II may reflect the effect of interpolating the transitional test and the consequent greater remoteness of the stimulus input from the act of producing the association in Group II.

Relations between input frequency and transitional (message) contingency are both in the predicted direction but are not significantly different from zero. Hypothesis III is thus not confirmed at a satisfactory level of significance. It is interesting to note, however, that (1) the input/transitional r is actually lower in Group II, where the transitional test immediately followed the input, than in Group I, and (2) the relation between input and transitional probabilities seems to vary with that between input and associational probabilities, as if (as hypothesized) the transitional contingencies depended upon the associative structure.

Regarding Hypothesis II, it may be seen that the r between associational and transitional probabilities is positive and significant for both groups. That the degree of relation between associational and transitional probabilities is approximately the same (particularly when the continuous raw data are correlated, .46 and .42) for both groups substantiates Hypothesis VI; as is shown, despite the gross differences between Groups I and II in degrees of correlation between input frequencies and both measures, the relation between association structure and transitional message structure is the same. With regard to Hypothesis IV, it can be seen that for both groups the correlations between associational probabilities and transitional probabilities are higher than those between input frequencies and transitional probabilities, as anticipated.

Table 1

Dependency Relations            Group I         Group II
Input/Associational             .58ᵃ            .37ᵃ
Input/Transitional              .19             .09
Associational/Transitionalᵇ     .39ᵃ (.46)ᵃ     .48ᵃ (.42)ᵃ

a. Significantly greater than zero at the 5 percent level or better.
b. The bracketed values for associational/transitional in this table were computed from the continuous raw data prior to transformation into discrete stepwise values.

Finally, there is Hypothesis V: that the degree of dependency of both associational and transitional probability upon the input frequencies varies with the absolute frequency of input pairing. To test that we examine whether the predictability of a response name from knowing the stimulus name varies with the frequency of pairing in the input. For each subject the number of “correct” responses given in each frequency category was recorded. (A Dixon-Mood sign test was used to determine significance.) For all conditions except the backward direction of association on the transitional test, six pairings yielded significantly more “correct” associations than either three or one pairings. The lower frequencies of pairing, three and one, were not significantly different from each other or from zero pairing. As might be expected in a culture that reads from left to right, “forward” associations were significantly stronger than “backward” associations.
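For readers unfamiliar with the sign test mentioned above, the following Python sketch shows an ordinary two-sided binomial sign test. Treating the Dixon-Mood procedure as this plain sign test is an assumption, and the counts in the usage lines are hypothetical; the chapter does not report the per-subject tallies.

```python
from math import comb


def sign_test_p(n_plus, n_minus):
    """Two-sided sign-test p-value; ties are assumed to have been dropped.

    n_plus: subjects with more "correct" responses under one condition.
    n_minus: subjects with more "correct" responses under the other.
    """
    n = n_plus + n_minus
    k = min(n_plus, n_minus)
    # Chance of a split at least this lopsided if each sign is equally likely.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical tallies: 15 subjects favor six pairings, 3 favor three pairings.
p_value = sign_test_p(15, 3)
print(p_value < 0.05)      # True
print(sign_test_p(5, 5))   # 1.0
```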

Conclusion

This experiment was designed to test certain assumptions that seem to give value to a contingency method of content analysis: (1) that the association structure of a source depends upon the contingencies among events in his life experience, and (2) that inferences as to the association structure of a source can be made from the contingencies among items in the messages he produces. This experiment provided conditions in which the contingencies among events occurring to human “sources” could be at least partly manipulated and hence known. It also provided conditions in which the resultant association structures of these “sources” could be determined independently of the contingencies in the “messages” (transitional outputs) they produced. Both of the major assumptions above were supported by the results, association being shown to be dependent upon input contingencies and transitional output contingencies upon association structure to significant degrees. The results also indicate that whereas “message” contingencies are dependent upon association structure, they are only remotely dependent upon experienced input within the experiment itself; that is, non-chance associations between items existed prior to the experimental input manipulation and also influenced transitional contingencies. In general, the degree to which input influences both association structure and transitional contingency is a function of the frequency of input pairing.

NATURE OF CONTINGENCY ANALYSIS

In the application of the contingency method as a kind of content analysis, in contrast to the experimental situation just described, we are limited to events in messages, and from them try to make inferences about the association structure of their source. The message is first divided into units, according to some relevant criterion. The coder then notes for each unit the presence or absence of each content category for which he is coding. The contingencies or co-occurrences of categories in the same units are then computed and tested for significance against the null (chance) hypothesis. Finally, patterns of such greater-than- or less-than-chance contingencies may be analyzed. This may be done by a visual model, which gives simultaneous representation to all of the relationships. Let us take up these stages of analysis one by one.

Selection of Units

Often the message materials to be analyzed will fall into natural units. One would normally take each day’s entry in a personal diary, for example, as a single unit. Or in analyzing the association structure of “Republicans” vs. “Democrats,” where a sample of individuals in each class have written letters to an editor, the letter from each individual would be a natural unit. Similarly, in studying the editorials in a certain newspaper, each editorial might be a unit. On the other hand, one may wish to analyze the contingencies in a more or less continuous message, for example in James Joyce’s Ulysses, and here it would be necessary to set up arbitrary units.

If the unit is too small (a single word, for example), then nothing can be shown to be contingent with anything else; if it is too large (the entire text or message, for example), then everything is completely contingent with everything else. There seems to be a broad range of tolerance between these limits within which approximately the same contingency values will be obtained. . . . In one small-scale investigation, . . . we found contingency values to be roughly constant between 120 and 210 words as units.
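Cutting a continuous message into arbitrary units of fixed word length can be sketched in a few lines of Python. The function name and the default of 150 words are assumptions chosen to fall inside the 120 to 210 word range reported above; the text does not prescribe a particular length or a rule for the final, shorter unit.

```python
def split_into_units(text, words_per_unit=150):
    """Cut a continuous text into consecutive units of `words_per_unit` words.

    The last unit holds whatever words remain and may be shorter.
    """
    words = text.split()
    return [
        " ".join(words[i:i + words_per_unit])
        for i in range(0, len(words), words_per_unit)
    ]


# Tiny demonstration with two-word units.
print(split_into_units("a b c d e", 2))  # ['a b', 'c d', 'e']
```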

Selection of Coding Categories

Here, as in most other types of content analysis, the nature, number, and breadth of categories noted depend upon the purposes of the investigator. If the analyst has a very specific purpose, he will select his content categories around this core. In our own work, which has been methodologically oriented, we have merely taken those interesting contents most frequently referred to by the source. The same categorizing problems faced elsewhere are met here as well; for example, whether references to RELIGION in general, CHRISTIANITY, and the CHURCH should be lumped into a single category or kept separate. Of course, the finer the categories used, the larger must be the sample in order to get significant contingencies. We do run into one special categorizing problem with the contingency method, however: if one were to code two close synonyms like YOUNG WOMEN and GIRLS as separate categories, he would probably come to the surprising conclusion that these things are significantly dissociated in the thinking of the source; being semantic alternatives, the source tends to use one in one location and the other in another location. If such closely synonymous alternates are treated as a single category, the problem does not arise.
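One way to treat close synonyms as a single category is a small category dictionary, sketched below in Python. The dictionary contents, the function name, and the whole-word matching rule are hypothetical; in particular, lumping RELIGION, CHRISTIANITY, and the CHURCH together is shown only as one of the two options the text leaves open.

```python
import re

# Hypothetical category dictionary: close synonyms share one category.
CATEGORIES = {
    "YOUNG WOMEN": ["young women", "young woman", "girls", "girl"],
    "RELIGION": ["religion", "christianity", "church"],
}


def categories_present(unit_text, categories=CATEGORIES):
    """Return the categories with at least one whole-word term match in the unit."""
    lowered = unit_text.lower()
    found = set()
    for category, terms in categories.items():
        if any(re.search(r"\b" + re.escape(term) + r"\b", lowered) for term in terms):
            found.add(category)
    return found


print(sorted(categories_present("The girls went to church.")))
# ['RELIGION', 'YOUNG WOMEN']
```

Because "girls" and "young women" map to the same category, a source who alternates between them no longer appears to dissociate two separate topics.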

Raw Data Matrix

Armed with a list of the content categories . . ., the coder inspects each unit of the material and scores it in a raw data table such as that shown as Figure 1A.

A. Raw Data Matrix

               Content Categories
Units     A       B       C       . . .   N
1         +       −       +       etc.
2         −       +       −
3         −       +       −
:         +       +       −
n         etc.
%         0.40    0.20    0.60

B. Contingency Matrix (expected contingencies, pA × pB, in the upper right; obtained contingencies, pAB, in the lower left)

               Content Categories
          A       B       C       . . .   N
A         −       0.08    0.24    etc.
B         0.06    −       0.12
C         0.38    0.02    −
:         etc.                    −
N

Figure 1 (A) Raw Data Matrix and (B) Contingency Matrix

Each row in the table represents a different unit (1, 2 . . . n) and each column a different content category (A, B . . . N). The coder may note merely the presence or absence of references to each content category; if present in unit 1, category A is scored plus, and if absent in unit 1, category A is scored minus; how often A is referred to (within a unit) is irrelevant in this case. One may also score in terms of each category being above or below its own median frequency; if above, plus, if below, minus. This method needs to be used when units are relatively large and many categories tend to occur in most units (as can be seen, the presence/absence method in this case would show everything contingent on everything else). In this case, one first enters the actual frequencies of reference in the cells of Figure 1A, computes the median for each column, and then assigns each cell a plus or a minus depending on whether its frequency is above or below this median.
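The two scoring rules just described can be sketched in Python as follows. The function names and the frequency counts are hypothetical, and scoring a frequency exactly equal to the column median as minus is an assumption, since the text speaks only of values above or below the median.

```python
from statistics import median


def presence_matrix(counts):
    """Score a cell plus (True) if the category is mentioned at all in the unit."""
    return [[c > 0 for c in row] for row in counts]


def median_split_matrix(counts):
    """Score a cell plus (True) if its frequency exceeds its column median.

    Assumption: a frequency equal to the median is scored minus (False).
    """
    medians = [median(col) for col in zip(*counts)]
    return [[c > m for c, m in zip(row, medians)] for row in counts]


# Hypothetical frequencies: rows are units, columns are categories A, B, C.
counts = [
    [3, 0, 5],
    [1, 2, 4],
    [0, 1, 9],
    [4, 1, 2],
]
print(presence_matrix(counts)[0])      # [True, False, True]
print(median_split_matrix(counts)[0])  # [True, False, True]
print(median_split_matrix(counts)[1])  # [False, True, False]
```

In this example category C is present in every unit, so presence/absence scoring cannot discriminate among units for it, while the median split still can.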

Contingency Matrix

The contingency matrix, as illustrated in Figure 1B with entirely hypothetical data, provides the information necessary for comparing expected or chance going-togetherness of categories with actual obtained going-togetherness. The expected or chance contingency for each pair of columns is obtained by simply multiplying together the sheer rates of occurrence of these two categories, that is, pA times pB in analogy with the probability of obtaining both heads (HH) in tossing two unbiased coins whose PH are both .50. We find the probabilities or relative rates of occurrence for each content category in the row labeled “percent” at the bottom of the raw data matrix. Thus, since A occurs in 40 percent of the units and B in 20 percent, we would expect A and B to occur together (be contingent) in only 8 percent of the units on the basis of chance alone. Extending this to all possible pairs of categories, we fill in the upper right cells of the matrix, A/B, A/C, B/C, etc.

In the corresponding lower left cells of this matrix, for example B/A, C/A, C/B, etc., we then enter the actual or obtained contingencies; these are simply the percentages of units where plusses occur in both of the columns being tested. For example, in the part of the matrix shown in Figure 1A there is one such double plus between columns A and B.

If the obtained contingency is greater than the corresponding expected value (e.g., C/A .38, A/C .24), these events are co-occurring more often than by chance; if the obtained contingency is less than the corresponding expected value (e.g., C/B .02, B/C .12), these events are co-occurring less often than by chance.
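The comparison of expected and obtained contingencies can be sketched in Python as below, together with the standard error of a percentage that this chapter uses to judge how large a difference is. The function names and the five-unit matrix are hypothetical; the matrix was chosen only so that the rates of A, B, and C match the .40, .20, and .60 of Figure 1, and the value N = 15 is an assumption picked so that sigma comes out near the .07 of the chapter's worked example.

```python
from itertools import combinations
from math import sqrt


def contingencies(matrix, labels):
    """Return {(x, y): (expected, obtained)} for every pair of category columns.

    `matrix` holds one row per unit and one True/False score per category.
    """
    n_units = len(matrix)
    rates = [sum(col) / n_units for col in zip(*matrix)]
    result = {}
    for i, j in combinations(range(len(labels)), 2):
        expected = rates[i] * rates[j]                       # pA x pB
        obtained = sum(row[i] and row[j] for row in matrix) / n_units  # pAB
        result[(labels[i], labels[j])] = (expected, obtained)
    return result


def sigma_units(expected, obtained, n_units):
    """Difference between obtained and expected, in standard errors of a percentage."""
    sigma = sqrt(expected * (1 - expected) / n_units)
    return (obtained - expected) / sigma


# Hypothetical scores for five units; columns are categories A, B, C.
matrix = [
    [True, False, True],
    [False, True, False],
    [False, False, True],
    [True, False, True],
    [False, False, False],
]
table = contingencies(matrix, ["A", "B", "C"])
print(table[("A", "C")])  # expected about 0.24, obtained 0.4

# Expected .08 and obtained .22 with 15 units: roughly two sigma.
print(round(sigma_units(0.08, 0.22, 15), 1))  # 2.0
```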

Significance of Contingencies

The significance of the deviation of any obtained contingency from the expected value can be estimated in several ways. Baldwin (1942) utilized the chi-square test, in which a two-by-two frequency table (AB, A but not B, B but not A, neither A nor B) is arranged from the data in each pair of columns in the original data matrix and where the total N equals the number of units. This becomes pretty laborious with a large number of units. . . . [F]urthermore, the frequency of entries in the AB cell may often be below five, a number usually given as a lower limit in applying chi-square. We have used the simple standard error of a percentage,

$$\sigma_p = \sqrt{\frac{p(1 - p)}{N}},$$

where p is the expected value in the upper right half of the contingency table and N is the total number of units sampled. This gives us an estimate of how much an obtained percentage may be anticipated to vary about its expected value; for example, if the sigma is .07 then a difference between the expected and obtained of .14 (two sigma) would only occur about five times in a hundred (two-tail test, direction of difference unspecified) by chance alone. . . . [W]ith large numbers of units, the size of p may become so small that some correction (e.g., an arcsine transformation) must be made. . . . This method of estimating significance is not altogether satisfactory, and some work on a better method is needed. . . .

Representation of Results

There are a number of ways in which the results of a contingency analysis can be represented, all of them being matters of convenience and efficiency in communicating rather than rigorous quantitative procedures in themselves. (1) Table of significant contingencies. The simplest summary picture is a table which simply lists, for each category, the other categories with which it has significant associations or dissociations. (2) Cluster analysis. From the total contingency matrix, one may by inspection select sets of categories which form clusters by virtue of all having either significant plus relations with each other or at least no significant minus relations. All such sets may be represented in an ordinary two-dimensional surface as overlapping regions (see Figure 2, the Goebbels diary data).

(3) Models derived from the generalized distance formula. Where the plusses and minuses in the raw data matrix represent frequencies above and below the median frequencies for each column, one may use the generalized distance formula

$D = \sqrt{\sum d^2}$,

where d represents the difference in each unit between values (+ or –) in any two columns (zero where they have the same sign, 2 where different signs). If all signs between two columns are identical, D equals zero; if there is no correspondence, D is maximal.

We may now construct a new matrix similar to the contingency matrix (Fig. 1B) in which we enter D for every pair of categories. If no more than three factors are required to account for the relations in the D matrix, the entire set of distances can be represented in a solid (three-dimensional) model. If more factors are involved, a three-dimensional (representation) can only approximate the true distance relations (even though the values in the D matrix are valid for any number of dimensions) (see Osgood & Suci, 1952). (The reason the D method cannot be applied where mere presence and absence is recorded is that in this case pairing of minuses between columns merely indicates lack of relation or independence between categories.)

ILLUSTRATIVE APPLICATIONS OF THE CONTINGENCY METHOD

Cameron’s Ford Sunday Evening Hour Talks

A sample of 38 talks given by W. J. Cameron on the Ford Sunday Evening Hour radio program, each talk running to about 1,000 words, was studied by this method.3 Each talk was treated as a unit. Based on a preliminary reading, 27 broadly defined content categories were selected in terms of frequency of usage. The analyst then went through these materials noting each reference to these categories. The median frequency of appearance of each category was computed and a matrix of units (rows) against categories (columns) was filled. A plus sign was entered for frequency of category above the median for a unit and a minus for frequencies below. Applying the formula for D given above to each pair of content columns, a D matrix showing the distance of each category from every other category was computed.*

*This can be represented in three dimensions; see Krippendorff, Content Analysis, 207.

. . . [Those] familiar with Cameron’s talks [have acknowledged] that the pattern of relationships produced here had considerable face validity. References to FACTORIES, industry, machines, production, and the like (FAC) tended to cluster with references to PROGRESS (PRO), FORD and Ford cars (FD), free ENTERPRISE and initiative (ENT), BUSINESS, selling, and the like (BUS), and to some extent with references to RUGGED INDIVIDUALISM, independence (RI), and to LAYMEN, farmers, shopkeepers, and so on (LAY). But when Cameron talked about these things he tended not to talk about (i.e., to dissociate them from) categories like YOUTH, our young people (YTH), INTELLECTUALS, “lily-livered bookmen,” etc. (INT), and DISEASE, poisoned minds, unhealthy thoughts, and the like (DIS), which form another cluster. This relation in Cameron’s thinking between YOUTH (always favorable) and DISEASE notions (obviously unfavorable) was unsuspected by the analyst until it appeared in the contingency data, which suggests one of the potential values of the method.

Also tending to be dissociated from the FORD, FACTORIES, ENTERPRISE cluster, and more or less independent of the YOUTH, INTELLECTUALS, and DISEASE one, we find an interesting collection of superficially contrary notions: on one hand we have SOCIETY in abstract, civilization (SOC), CHRISTIAN, God, and church (CHR), our ELDERS, mature minds (ELD), TRADITION and basic values (TRAD), and to some extent the PAST of our forefathers (PAS) and our HOMES, fireside, and families (HOM), all things favorably drawn; but on the other hand, in the same cluster, we find DESTRUCTION and violence (DES), assorted ISMS like Communism, Fascism, and totalitarianism (ISM), FEAR, bewilderment, and dismay (FEAR), and sundry EVILS (EVL). Apparently, when he thinks and writes about the solid, traditional things that hold society together, he immediately tends to associate them with the things he fears, the various isms that threaten destruction of his values. References to the FUTURE (FUT) and to our HOPES and confidence in the New World (HOP) tend to be associated with references to AMERICA (AM), but also ISMS again. The allocation of a few other notions, including references to the general PUBLIC (PUB), to FREEDOM and democracy (FREE), and human NATURE, what is instinctive or natural (NAT), may be studied by the reader himself.
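The median split and the D computation described for the Cameron talks can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the analyst's original procedure: the frequency table is invented, the function names (`median_split`, `distance`, `d_matrix`) are hypothetical, and a frequency exactly at the median is coded as a minus because the text does not say how ties were handled.

```python
import math
from statistics import median

def median_split(freqs):
    """Turn a units-by-categories frequency table into +1 / -1 signs.

    +1 means the frequency is above the column (category) median,
    -1 means it is not (ties coded as minus: an assumption).
    """
    n_cols = len(freqs[0])
    medians = [median(row[j] for row in freqs) for j in range(n_cols)]
    return [[1 if row[j] > medians[j] else -1 for j in range(n_cols)]
            for row in freqs]

def distance(signs, a, b):
    """D = sqrt(sum of d^2), where d is 0 for equal signs and 2 otherwise."""
    return math.sqrt(sum((row[a] - row[b]) ** 2 for row in signs))

def d_matrix(signs):
    """Distance of every category (column) from every other category."""
    k = len(signs[0])
    return [[distance(signs, a, b) for b in range(k)] for a in range(k)]

# Hypothetical data: 4 units (rows) by 3 categories (columns).
freqs = [
    [5, 4, 0],
    [1, 0, 3],
    [6, 5, 1],
    [0, 1, 4],
]
signs = median_split(freqs)
D = d_matrix(signs)
for row in D:
    print(row)
```

In this invented table the first two categories rise and fall together, so their distance is zero, while the third disagrees with them in every unit and sits at the maximal distance of 2 times the square root of the number of units.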

Goebbels’ Diary

Using a table of random numbers to select pages and then lines-on-page, 100 samples, each approximately 100 words in length (beginning and ending with the nearest full sentence), were extracted from the English version of Goebbels’ diary and typed on cards. An example would be:

#38. Spieler sent me a letter from occupied France. He complained bitterly about the provocative attitude of the French, who continue to live exactly as in peacetime and have everything in the way of food that their hearts desire. Even though this is true only of the plutocratic circles, it nevertheless angers our soldiers, who have but meager rations. We Germans are too good-natured in every respect. We don’t yet know how to behave like a victorious people. We have no real tradition. On this we must catch up in the coming decades.

In terms of a rough frequency-of-usage analysis made previously, 21 content categories were selected for analysis. An independent coder went through the 100 units in a shuffled order noting simply the presence or absence of reference to these 21 categories, generating a raw data table like that illustrated in Figure 1A. The data were then transformed into a contingency table of the sort shown as Figure 1B, and significance tests were run (utilizing the arc-sin transformation). References to GERMAN GENERALS were significantly contingent upon references to INTERNAL FRICTIONS (in the inner circle about Hitler) at the one per cent level; references to GERMAN PUBLIC were associated with those to BAD MORALE at the 5 per cent level, as were contingencies between RUSSIA and EASTERN FRONT; negative contingencies, significant at the 5 per cent level, were obtained between RUSSIA and BAD MORALE, between references to ENGLAND and references to GERMAN SUPERIORITY as a race, and between references to the GERMAN PUBLIC and references to RUSSIA. Such negative contingencies are at least suggestive of repressions on Goebbels’ part; that is, avoiding thinking of Russia when he thinks of the bad home-morale situation, avoiding thinking about England when he thinks about the superiority of the German race, and so on. These are merely inferences, of course.
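The step from a presence/absence table to a judgment of significance can be illustrated as follows. The sketch assumes that the chance expectation for two categories co-occurring is the product of their relative frequencies, and it compares the obtained co-occurrence rate with that expectation using the standard error of a proportion, sqrt(p(1 - p)/N). It does not reproduce the arc-sin transformation used for the diary data; the function name `contingency_z` and the data are hypothetical.

```python
import math

def contingency_z(units, a, b):
    """Compare obtained and chance-expected co-occurrence of two categories.

    units: list of sets, each holding the categories present in one unit.
    Returns (expected, obtained, z), where z is the deviation from chance
    in standard-error units. Positive z suggests association, negative z
    suggests dissociation.
    """
    n = len(units)
    p_a = sum(a in u for u in units) / n
    p_b = sum(b in u for u in units) / n
    expected = p_a * p_b                      # chance co-occurrence (assumption)
    obtained = sum((a in u and b in u) for u in units) / n
    sigma = math.sqrt(expected * (1 - expected) / n)
    if sigma == 0:
        raise ValueError("standard error is zero; the test is undefined")
    return expected, obtained, (obtained - expected) / sigma

# Hypothetical data: 100 units; A and B each occur in 20, together in 10.
units = [{"A", "B"}] * 10 + [{"A"}] * 10 + [{"B"}] * 10 + [set()] * 70
expected, obtained, z = contingency_z(units, "A", "B")
print(round(expected, 3), round(obtained, 3), round(z, 2))
```

Here the two invented categories co-occur in 10 per cent of the units against a chance expectation of 4 per cent, a deviation of roughly three standard errors.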


A cluster analysis was made of these data, with the results shown in Figure 2. The content categories included within regions have mainly plus relations and no minus relations. Numerous inferences might be made from this chart. For example: (D) that Goebbels defends himself from thoughts about the HARD WINTER with SELF PRAISE and thoughts about his closeness to DER FUEHRER; (A) that ideas about BAD MORALE lead promptly to rationalizations in terms of the INTERNAL FRICTIONS brought about by GERMAN GENERALS, which in turn bring up conflicts between himself and others in securing the favor of DER FUEHRER; (C) that thoughts about his job of maintaining GOOD MORALE among the GERMAN PUBLIC lead to thoughts about BAD MORALE and INTERNAL FRICTIONS; (H) that his problem-solving ideas about PROPAGANDA MANIPULATIONS may lead him alternatively to the GOOD MORALE cluster of associations, to the dismal RUSSIA-EASTERN FRONT-MILITARY FAILURES cluster, or to the more encouraging cluster in which his ally, JAPAN, is having MILITARY SUCCESSES against ENGLAND and the U.S.; and finally (G and F), that when he thinks about the subject peoples, JEWS and ITALIANS, and FRANCE, he tends also, particularly in the case of FRANCE, to think about difficulties of maintaining FOOD supplies, leading quite naturally to ideas about GERMAN SUPERIORITY in withstanding hardships, and the like. These are inferences, of course; there are alternative interpretations possible as to why any cluster of symbols shows positive or negative contingency. But the inferences have the advantage of resting on demonstrable verbal behavior, which may even be unconscious to the source. They do not necessarily depend upon explicit statements of relation by the source.

Figure 2 Clusters of Contingencies; Goebbels’ Diary
[Diagram of overlapping regions labeled A through H. Categories shown: German Generals, Bad Morale, Internal Frictions, German Public, Good Morale, Propaganda Manipulations, Russia, Eastern Front, Military Failures, U.S., Military Successes, England, Japan, Food, France, Implied German Superiority, Jews, Italians, Hard Winter, Self Praise, Der Fuehrer.]
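The cluster criterion used for Figure 2 (no minus relations inside a region, and mainly plus relations) can be restated mechanically. The clusters in the text were selected by inspection, so the following is only a hedged sketch: `is_cluster` is a hypothetical helper, "mainly" is read as "more than half of the pairs," and the category labels and relations are invented.

```python
from itertools import combinations

def relation(rel, a, b):
    """Look up the significant relation between two categories.

    rel maps an unordered pair to +1 (significant association) or
    -1 (significant dissociation); missing pairs count as 0 (no relation).
    """
    return rel.get(frozenset((a, b)), 0)

def is_cluster(categories, rel):
    """True if the set has no minus relations and mostly plus relations."""
    pairs = list(combinations(categories, 2))
    signs = [relation(rel, a, b) for a, b in pairs]
    if -1 in signs:
        return False                      # any dissociation rules the set out
    return 2 * signs.count(1) > len(pairs)  # "mainly plus": assumed majority

# Hypothetical significant relations among four invented categories.
rel = {
    frozenset(("P", "Q")): 1,
    frozenset(("Q", "R")): 1,
    frozenset(("P", "S")): -1,
}
print(is_cluster(["P", "Q", "R"], rel))  # two plus pairs, one neutral pair
print(is_cluster(["P", "Q", "S"], rel))  # contains a minus pair
```

A check of this kind only confirms that a proposed region satisfies the stated rule; deciding which regions to draw, and how they overlap, remains a matter of inspection and interpretation.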

CRITIQUE OF THE CONTINGENCY METHOD

The use of the contingency method is based upon a very general inference between messages and those who exchange them. . . . [C]ontingencies . . . in messages are indicative of the association structure in the source and predictive of the association structure that may result in the receiver (given sufficient frequency of [exposure]). But under what conditions is this general inference . . . valid?

If we are dealing with spontaneous informal messages from a single known source (e.g., personal diaries, . . . letters to friends and family, . . . extemporaneous speech, as in psychotherapeutic interviews, etc.) then attribution of the association structure to this source is probably most defensible. When dealing with deliberately planned messages, particularly when the source is an institution, as . . . in propaganda (or mass media content) analysis, it would probably be safer to speak of the “policy” of the source rather than its association structure.

(What does the contingency method yield when language use is instrumental or cynical, as can be expected when the analyst faces a clever rhetorician, a propagandist, an advertiser, a client in therapy, or a candidate for a political office? Those who question the use of the contingency method under such conditions assume that the results must represent message contents. This is not so.) The fact that references to YOUTH and DISEASE by Cameron are significantly associated says nothing (about Cameron’s belief system) about the direction of the assertions between them. . . . Cameron’s typical statement would be that “Our young people are not susceptible to the diseased ideologues of our times.” What the method tells us, however, is that there is a greater-than-chance tendency for ideas about DISEASE to occur in the environment of ideas about YOUTH, quite apart from what assertions he may make relating these two.

[We] assume that a significant contingency, whether positive or negative, is evidence for an underlying association (not for whether they are habitual or deliberate). If the contingency is negative (i.e., a significant dissociation) it presumably means that these ideas are associated with some kind of unpleasant affect. (Intentionally avoiding certain associations, for example, in order not to offend somebody, to hide something, or in compliance with a taboo, suggests that the concepts are close in the mind of the source, and not much else.)

The contingency method . . . does not take into account the (expressed) intensity with which assertions are made: if the source says “The French are definitively like the Italians in this respect,” the method only records an instance of contingency between the categories of French and Italian. On the other hand, reflecting the basic psychological principle relating habit strength to frequency of response, the method does indirectly reflect the strength of an association (or dissociation).

. . . [A]ssociation is not indicative of semantic similarity. References to COMMUNISM may frequently lead to reference to CAPITALISM, but this does not necessarily imply that these concepts are either similar in reference or in psychological meaning. (They exhibit a contrast within a common linguistic domain, as in GOD and DEVIL, SOLDIER and SAILOR (or BUYER and SELLER).)*

*Another example of the fact that associations have little to do with semantic similarities is synonyms. Synonyms rarely co-occur near each other, and lacking contingency might give the impression of dissociation. In contingency analysis, as presented here, this problem does not arise as it is applied to categories that subsume synonyms. When applied to raw words (see reading 7.3, this volume) this becomes a distraction.

NOTES

1. This experiment was done by the author in collaboration with Mrs. Lois Anderson (1957).

2. These tables and certain details of statistical treatment will be found in Osgood and Anderson (1957).

3. It is instructive to compare the results of this contingency analysis with an earlier frequency analysis of speeches by the same speaker by Green (1939). The clusters spotted in the present study seem to have been largely overlooked in Green’s more conventional analysis. The studies were independent of each other. The earlier one was not known to the present author.

REFERENCES

Baldwin, A. L. (1942). Personal structure analysis: A statistical method for investigating the single personality. Journal of Abnormal and Social Psychology 37:163–183.

Green, Jr., T. S. (1939). Mr. Cameron and the Ford Hour. Public Opinion Quarterly 3:669–675.

Osgood, C. E., & Anderson, L. (1957). Certain relations between experienced contingencies, association structure, and contingencies in encoded messages. American Journal of Psychology 70:411–420.

Osgood, C. E., & Suci, G. J. (1952). A measure of relation determined by both mean difference and profile information. Psychological Bulletin 49:251–262.


3.2 FOUR TYPES OF INFERENCE FROM DOCUMENTS TO EVENTS

VERNON K. DIBBLE*

Many of the intellectual procedures used by historians can be viewed in terms of the dichotomy between documentary evidence and facts or events that are external to the documents themselves. At some moments, historians work on only one side of this divide. Where the meaning of a document is not clear, for example, they sometimes use one phrase in order to infer the meaning of another. Or, working only on the other side of this dichotomy, once a given fact is established, they use it in order to make inferences about other facts. Although such inferences are almost completely neglected by manuals of historical method, they are very common in the works of historians. Chrimes (1952:15–16) writes, for example, that “from the time of Cnut at least we begin to see men who began their careers as Scribes in the king’s service blossoming forth to be bishops and abbots – a sure sign of their growing importance and favor.” The rules for making inferences of this type, whatever they might turn out to be when adequately codified, are quite different from rules for the use of documents. Once the first fact is established (e.g., the career lines of royal scribes), the historian’s inference to the second fact (e.g., the importance of the royal secretariat) has nothing to do with documentary techniques.

At other moments historians move from one side of the dichotomy to the other. Moving from fact to document, they often use the former in order to make inferences about the provenance, age, authenticity, or authorship of the latter. And, of course, they also move in the opposite direction, using documents in order to make inferences about external events.

Although these various procedures are used in conjunction with one another, and although the resulting conclusions often stand or fall together, this article is concerned only with inferences from documents to events. It identifies four quite different ways in which historians make such inferences, as illustrated in recent historical literature, and discusses some of the problems which each of the four entails. Manuals of historical method also concentrate on inferences from documents to events. But they hardly reflect the procedures that historians actually use. To judge by most manuals, historians establish facts from documents primarily by examining testimony to events, which is recorded by witnesses who have seen or heard about these events. This article is concerned not only with testimony but also with three other categories, which may be termed social bookkeeping, correlates, and direct indicators.

*From Dibble, V. K. (1963). Four types of inference from documents to events. History and Theory 3:203–221.

These four categories are based on two very different criteria of classification. First, the distinction between testimony and social bookkeeping is a classification of sources. Of all documents used by historians, some purport to record information about things that happen, and some do not. Codes of law, pieces of pottery, and poems, for example, do not. Documents that do purport to record information can in general be classified further as testimony or social bookkeeping, depending upon the circumstances under which they are produced. Since the procedures appropriate to testimony are not identical with those appropriate to social bookkeeping, this classification of sources is simultaneously a classification of techniques. Second, the distinction between correlates and direct indicators refers only to techniques. For all documents, both those that purport to record information and those that do not, are potential correlates and potential indicators. This double-edged scheme of classification must be kept in mind as we proceed further.

TESTIMONY

The manuals give a number of familiar rules for evaluating testimony. Many historians do not care for generalizations or formal methodology and prefer to regard masterful documentary criticism as “a sort of sixth sense that will alert you to the tell-tale signs.”1 But most of these rules can be stated as general laws in one or another social science, although they are all probability laws and many are definitely of the armchair variety. Some rules turn out to be general laws governing the psychology of cognition: testimony about specific details is likely to be more accurate than testimony about general conditions.2 Others are laws governing the psychology of memory: testimony recorded shortly after an event took place is likely to be more accurate than testimony recorded long afterwards. Other rules can be stated as general laws that govern communication: testimony about ideologically relevant events that is addressed to people who share the witness’s beliefs and values is likely to be more accurate than testimony addressed to audiences that do not share the witness’s ideology. Some rules turn out to be laws governing cultural processes in cognition: the rule of thumb that the ancients grossly overestimated numbers would be such a law, if we were able to state in what kinds of societies or cultures people overestimate numbers and in what kinds they underestimate them.

In using such rules, historians implicitly construct syllogisms that include probability statements. That is, they begin with premises that are stated in terms of likelihood rather than certainty and, therefore, proceed to conclusions that are likely rather than certain to be true. This logical structure of inferences from testimony to events is seen more clearly when historians choose between n conflicting accounts than when they evaluate a single piece of testimony. . . .

Historians reach an overall conclusion at such points by engaging in a peculiar kind of arithmetic without numbers. They assign weights to each syllogism and to its conclusion, usually assigning greater weight to some than to others. . . . The different weights, or estimates of probable accuracy, are compared with one another, and out of the probabilities ascribed to each syllogism, an aggregate probability for each of the possible overall conclusions is arrived at. These combinations of probabilities, and comparisons between them, are carried out despite the fact that they are never stated with quantitative precision. For the logic of inferences from testimony to events is the logic of qualitative probabilities.

But while the notion of probability or likelihood may apply to the historian’s assessments and conclusions, it does not apply to the single event in question. Patrick Henry either did or did not profess his loyalty to the king, and there is no probability about it. To say historians infer from documents to events by the logic of qualitative probabilities is to say that if they make inferences about hundreds of events by the simultaneous application, to each event in question, of a number of syllogisms which include probability statements, then they will reach the correct conclusion more often than not. But since their premises are probability laws, the most rigorous evaluation of testimony can lead to the incorrect conclusion in any single case. If, as the manuals suggest, historians relied primarily on witnesses, if document equaled testimony and internal criticism equaled the evaluation of testimony, then historians would seldom be able to spot those instances where rigorous evaluation of testimony leads to the wrong inference. But the impression conveyed by the manuals is fortunately incorrect. Historians infer from documents to events in ways that have nothing to do with the evaluation of testimony. Other types of inferences are sometimes used without reference to testimony and sometimes supplement testimony. The first of the three remaining types to be considered here is the use of social bookkeeping.

SOCIAL BOOKKEEPING

When the manuals speak of witnesses and their testimony and of the rules for evaluating testimony, there is always the implicit assumption that documents are produced by individuals and not by social systems. The secluded monk, the diarist alone in his room at the end of the day, and the solitary traveler are the classic examples of the historian’s witness. But if one were to enumerate the sources used by, say, fifty representative historians, then one would probably find that documents produced in individualized circumstances make up only a small percentage of the total and are outnumbered by social bookkeeping. Groups and organizations in all literate societies have institutionalized procedures for recording facts and events. The term social bookkeeping refers to all documents which purport to record information and which are the product of groups and organizations. The term includes such diverse sources as transcripts of parliamentary debates, calendars of saints, bankbooks, tax returns, inventories of estates, the Domesday Book, court records, crime statistics, censuses, reports by subordinates in hierarchies to their superiors, and the list of graduates of Harvard University.

Testimony is the work of individuals. Historians have accordingly evolved a psychology and social psychology of documents that guide them in their use of testimony. Social bookkeeping is the work of social systems. But historians have not yet evolved a sociology of documents to guide them in their use of such sources. One does find in the manuals a few stray reminders that documents of this type must be read in the light of the social system that produces them. Students are reminded, for example, that the Congressional Record is not a literal transcript of Congressional debates. The Record is an inaccurate transcript not because recording clerks have faulty hearing, or political biases, or any other failing to which witnesses are prone. It is an inaccurate transcript because of one simple feature of the social system that produces it: members of Congress are free to amend their remarks before the Record goes to press.

It is possible to state a few general principles for the use of social bookkeeping. The Congressional Record reminds us that different forms of social bookkeeping vary in the extent to which interested parties have a hand in producing the record. In some societies, inventories of estates are compiled by heirs and in others by disinterested parties. The Record also reminds us that different forms of social bookkeeping vary in the extent to which interested parties are likely to check the record after it is first set down. People are more likely to check certificates of inheritance or deeds to their land than the information about themselves that is collected by census enumerators. Different forms of social bookkeeping that are checked by interested parties vary in the extent to which the interested parties are free to alter the record. Lords of manors could presumably alter records that were compiled by their own overseers more easily than they could alter records compiled by tax officers. Different forms of social bookkeeping that can be altered by interested parties vary in the extent to which such alteration makes for greater accuracy or for less. Transcripts of some legal proceedings are less complete than they would be otherwise because interested parties can sometimes have remarks “stricken from the record.” In contrast, alterations initiated by an interested party to his own advantage are likely to make for a more complete and a more accurate record if the record-keeper is in a position to make his own independent check on the accuracy of the suggested alteration. The professor who asks his chair(person) or dean to add missing items to his bibliography is an obvious example. If the record-keeper is not in a position to make an independent check, as with compilations of research allegedly in progress, the interested party is free to embellish veracity.

Some forms of social bookkeeping are provided with built-in checks, apart from interested parties, while others are not. The Bollandist fathers provide the Catholic Church with an institutionalized check on the record of saints; there is no built-in check on the biographies submitted to the editors of Who’s Who. Different forms of social bookkeeping vary in the extent to which the events recorded are visible to the record-keeper or in the extent to which communication between observers and record-keepers is assured. They also differ in the number of steps between observer and record-keeper. There are no steps between observer and record-keeper when, as with court stenographers, the two jobs are performed by the same person. There are many steps in the hierarchy of a corporation between a sales(person)’s weekly reports to an immediate supervisor and the record of sales in the corporation’s annual report, with communication steps in the sales department paralleled by different steps in accounting and billing departments. Different examples of social bookkeeping that do come into being only after many steps between observer and record-keeper vary in the extent to which distortion or suppression of information takes place along the way. Staff officers in contemporary American corporations get line personnel to innovate by agreeing to distort budget reports in order to make line personnel look better than they really are; no matter how many steps there might be between graduate students, dissertation supervisors, departmental secretaries, deans’ offices, and printers of commencement programs, carelessness is not the only thing which might distort or suppress information along the way.

Some of these general observations, or others like them, have been concretely applied in the works of historians who have had to come to terms with particular kinds of social bookkeeping in particular historical societies . . . Kosminsky’s criticism of the survey of 1279 and of certain other forms of social bookkeeping in thirteenth-century England is an example (Kosminsky, 1956). He asks who initiated the survey, and why. To whom would the returns be valuable and who might be hurt by them? Who carried it out, and how? What did the officers of the king do when they arrived in a county? How did the local juries acquire their information? Did the royal officers check on the local juries or simply accept their returns as given? Was new machinery devised for gathering the information required for the survey, or established and tried machinery used? In what respects did the questions presented to the local juries force them to simplify the facts? What was most visible to the juries and what was least visible?

Many of the questions posed by Kosminsky have exact parallels in the criticism of testimony. Comparisons between descriptions of a manor in the survey of 1279 and in an Inquisition post mortem are analogous to comparisons between the testimonies of two witnesses to the same event. In some cases, however, we do not really have two independent records, since the information given by local juries was sometimes copied from another source. This is, of course, parallel to the difference between two independent witnesses and two witnesses, one of whom reports what the other had told him. Kosminsky’s examinations of the vocabulary of the survey and of its internal consistency also have their parallels in the evaluation of testimony. For other questions, however, there is no parallel with the evaluation of testimony. The difference between improvised and established machinery for acquiring information, questions concerning the flow of communications and commands between different people involved in gathering the data, and questions concerning the extent to which some people checked up on other people, all point to the distinctly social character of social bookkeeping. In such questions as these the historian is concerned not with the veracity or eyesight of individuals, but with the operation of social systems.

As with criticism of testimony, historians criticize social bookkeeping in order to make decisions about the probable accuracy or completeness of the record. But, of course, historians are not interested only in accurate social bookkeeping. Inaccurate social bookkeeping can be just as valuable as testimony known to consist of lies and distortions. Whatever the survey of 1279 tells us or fails to tell us about manors and villages, it also tells us something about the administrative mechanisms of the medieval English state. Documents that purport to record information, to be useful to historians, need to be accurate only when historians are concerned with the information they purport to record. But testimony is not always used qua testimony and social bookkeeping is not always used qua social bookkeeping. Documents of both types are used as correlates or direct indicators of facts or events other than those they purport to record. And, of course, documents that do not purport to record information can be used in the same way.

DOCUMENTS AS CORRELATES

Historians are often able to make inferences from documents to events in the absence of testimony or social bookkeeping that tells them about the events in question. One way of doing so is to use documents whose characteristics are known to be correlated with the events in question. Haskins (1918) provides a particularly striking example of the use of documents as correlates, supplementing the use of testimony.

In Norman Institutions, Haskins demonstrates that the governmental machinery of Robert Curthose (1087–1096 and 1100–1106) was weak, ineffective, and underdeveloped, but that the more highly developed institutions of the Conqueror were “in some measure maintained even during the disorder and weakness of Robert’s time” (Haskins, 1918:84). Among his sources are the narratives of Odericus Vitalis, the charters of Robert Curthose, and the charters of William Rufus, who ruled Normandy between 1096 and 1100 while his brother Robert was on a crusade. The narratives of Odericus are, of course, an example of testimony. His descriptions of Normandy under Robert are “a dreary tale of private war, murder, and pillage, of perjury, disloyalty, and revolt . . .” (Haskins, 1918:62). Of William Rufus, in contrast, “Odericus tells us that . . . under his iron heel Normandy at least enjoyed a brief period of order and justice to which it looked back with longing after Robert’s return” (Haskins, 1918:80).

Haskins is less concerned with private war and public peace than with the institutions of Norman government. The testimony of Odericus is not adequate for his purpose, since the witness was not close to Robert’s governmental machinery and since his perceptions were colored by his geographical location and by his position as a monk. Haskins makes inferences about organs of government from the testimony of Odericus and from certain other narratives but then gives reasons not to rely on these inferences:

Amidst these narratives of confusion and revolt, there is small place for the machinery of government, and we are not surprised that the chroniclers are almost silent on the subject. Robert’s reliance on mercenaries [reference to Odericus and to another witness] shows the breakdown of the feudal service, which may also be illustrated by an apparent example of popular levies [reference to Odericus]; his constant financial necessities [reference to Odericus and to another witness] point to the demoralization of the revenue. The rare mention of his curia [reference to Odericus] implies that it met but rarely. Still, these inferences are negative and to that extent inconclusive, and even the detailed account of Odericus is largely local and episodic, being chiefly devoted to events in the notoriously troubled region of the south, and is also colored by the sufferings and losses of the church. (Haskins, 1918:64)


Thus far, Haskins could be following the injunctions of the manuals concerning the use of testimony. His next step, however, has not been dreamt of in the manuals. Having found his star witness wanting, he turns to the charters and similar documents of Robert’s reign and uses certain of their characteristics as correlates of the nature of Robert’s government. He notes that the number of surviving charters is small, relative to the length of the reign and in comparison with other Norman dukes. Perhaps only thirty-nine survive because “later times were indifferent to preserving charters of Robert Curthose, but it is even more likely that his own age was not eager to secure them. As confirmation at his hands counted for little, none of these charters consist of general liberties or comprehensive enumerations of past grants; they are all specific and immediate. Furthermore, so far as can now be seen, the surviving documents are all authentic; privileges of the Conqueror, Henry I, or Henry II were worth fabricating but no one seems to have thought it worth while to invent a charter of Robert” (Haskins, 1918:71).3 Seventeen of the existing charters were issued in Robert’s name, while twenty-two were drawn up by interested parties for him to attest. The seventeen issued in Robert’s name are not uniform in size, style, or method of authentication. Of the seven that are preserved in the original, each is in a different handwriting. Seals were used on only some of the charters, and were not used in uniform fashion. There are nine variations on the title dux Normannorum. To these varying titles there is sometimes added one of three variations of filius Willelmi gloriosi regis Anglorum. Robert signs sometimes as dux and sometimes as comes. Some charters invoke the Trinity while others do not.

From these and similar observations, Haskins concludes that “the range of variation in style and form precludes the existence of an effective chancery and indicates that the duke’s charters were ordinarily drawn up by the recipients” (Haskins, 1918:74). The decline of the ducal chancery is accompanied by a decline in the curia. The lists of witnesses on charters show little continuity in the ducal entourage and “still less any clearly marked official element” (Haskins, 1918:76), and a meeting of the curia is mentioned only once in the surviving documents. In short, the characteristics of Robert’s charters confirm the inferences about the nature of his rule made on the basis of testimony in the chronicles. A similar examination of the writs and charters issued in or about Normandy by William Rufus during his reign there also confirms the testimony of Odericus. Under William Rufus, Haskins infers, we see “the regular mechanism of Anglo-Norman administration at work” (Haskins, 1918:83).

Haskins’s use of ducal charters has nothing to do with the evaluation of testimony. And although charters are a form of social bookkeeping, since they record information about grants and privileges given by the crown, Haskins is not using them as such. He is not primarily concerned with the information about grants and privileges they contain. Haskins’s procedure illustrates the making of inferences from documents to events by the use of documents as correlates of the events in question. As with testimony and social bookkeeping, the logic of such inferences can be stated syllogistically: Norman dukes known to have effective organs of government issued charters with the characteristics a, b, and c; Duke Robert’s charters have characteristics which are the opposite of a, b, and c; therefore, Robert’s rule must have been weak.

There are, of course, a number of syllogisms here, one for each characteristic in question. And, as with testimony and social bookkeeping, historians somehow combine the differently weighted conclusions of each syllogism in order to reach an overall conclusion, even though the weights to be given to each conclusion are not precisely known. It should be noted further that Haskins’s major premises are not strictly adequate to the conclusion. Ideally, he should have grounds for his major premise that Norman dukes known to have effective organs of government issued charters with given characteristics while Norman dukes known to have ineffective organs issued charters with the opposite characteristics. Haskins actually has grounds only for making the first part of the statement, simply because there were not enough weak dukes to provide the evidence for the second part. The problem is hardly serious, however. Our knowledge of Norman government and society, and of the functions of charters in such a system, allows us to state why the characteristics of ducal charters should be correlated with the strength of ducal governments.

It may some day be possible to state general rules for the use of documents as correlates, just as there are rules in the manuals for the use of testimony. But most of these rules are likely to be quite different from the rules concerning testimony. The latter are, in effect, psychological or social psychological laws governing such phenomena as cognition, memory, and communication. Although a few similarly general principles might one day be stated for the use of documents as correlates, it is usually possible to use documents in this way only because of historically specific knowledge about the institutions of particular societies or types of societies. It was such knowledge, and not general laws, which enabled Haskins to use charters as correlates of the nature of Robert’s regime.

DOCUMENTS AS DIRECT INDICATORS

The fourth type of inference from document to event might appear to entail no inference at all. This is the case when all or part of the document itself, as opposed to external events which are recorded by or correlated with the document, is the datum under investigation. If one wants to know, say, what the British ambassador in Berlin reported to the Foreign Office on the day Bismarck moved against France, then one simply finds and reads whatever messages he sent. His cables or dispatches are direct indicators. What is more, if no records have been lost and if there were no oral messages that were never recorded, then the documents themselves provide an exhaustive answer to the historian’s question. There is no need to infer from documents to events. Direct indicators, surely, have nothing to do with inferences.

The matter is not quite so simple, however. The example given illustrates the methodologically uninteresting case in which the content of the documents and the answer to the historian’s question are completely coterminous with each other. Documentary research comes closer to absolute certainty in such cases than in any other. There is no need for probabilistic syllogisms, which might lead to incorrect conclusions even when most rigorously applied. But this certainty is possible only in special cases, and sometimes requires that only trivial questions be asked. The documents at hand and the answer to the historian’s question are coterminous only when two conditions are met. First, it must be possible to answer the question by reference to the documents themselves, as the historian’s subjects happened to produce them, and without reference to their accuracy concerning, or correlation with, events external to the documents. Two sorts of questions meet this condition. (1) To continue with the example of Norman charters, one might ask questions about the formal characteristics of documents. What were Robert Curthose’s charters like? (2) One might ask questions that, in effect, simply state the content of the documents in interrogative form. What rights or privileges were granted to what monasteries in which of Robert’s charters? Since a ducal grant of privileges in Norman society may be defined as the emission or attestation by the duke of a charter which states that he is making such a grant, charters can be used as direct indicators of the events in question. There is no need to argue the contention that historians cannot limit themselves to questions of these two types.

Second, even though the historian might be posing questions of these two types, the documents at hand and the answers to his questions are rarely if ever coterminous unless the questions call for purely descriptive answers. Such answers are not sufficient when historians conceptualize. And, as Marc Bloch has taught us, historians conceptualize all the time, even when they deny all interest in concepts and claim to deal only with unique particulars. A glance at one example of conceptualization in the work of a historian will indicate why conceptualization usually makes it impossible to pose questions whose answers are coterminous with the documents themselves.


As everyone knows, to conceptualize is to classify a number of relatively less general items under some relatively more general rubric. Instead of limiting themselves to specific items, on the one hand, and a single general rubric, on the other, historians and other social scientists often find it useful to specify conceptual rubrics of intermediate generality. Heckscher, for example, defines “mercantilist economic policy” in terms of five intermediate rubrics: mercantilism as an agent of unification, as a system of power, as a protectionist system, as a monetary system, and as a conception of society (Heckscher, 1955). Each of these intermediate rubrics or “five aspects of mercantilism,” not to speak of the more general rubric under which they are all subsumed, sums up an enormous variety of concrete details. Each of them refers to a vast number of books or treatises, acts of parliament, royal decrees, instructions sent down through administrative hierarchies, and arguments over economic policy. If the problem were one of economic practice, such sources would not necessarily tell us what people actually did and it would be necessary to make inferences from documents to events external to the documents. But since policy is defined by what people say and not only by what people do, such sources may in most cases be taken as direct indicators of mercantilist economic policy.

Implicit in Heckscher’s definition of mercantilist policy is the empirical assertion that these “five aspects of mercantilism” were in fact correlated with one another, either because they were effects of the same cause or because some of them were causes of others. Also implicit in Heckscher’s definition, however, is the empirical assertion that the correlation between the five aspects of mercantilist policy is less than perfect and is not always observed. Although the first two components of mercantilism, “unification and power, were well suited to each other . . . it is . . . important to draw attention to the opposite point, that the two were not inseparable. That there were two separate aspects becomes clear in considering laissez-faire, for this policy usually combined a unification which was almost complete in every respect with a remarkable indifference to considerations of power” (Heckscher, 1955:24).

This formulation betrays a notion which Heckscher did not spell out explicitly: the five components of his definition could all be subsumed under the same concept because they tended to go along together empirically, but there were five “separate aspects” because they did not necessarily go along together completely or in all times and places. And if the correlation between these five general rubrics is less than perfect, what are we to think of the enormous number of details which each of these rubrics sums up? If we look at the details under any single rubric, under monetary policy let us say, would we expect to find that a highly mercantilist royal decree is necessarily followed by other decrees which are equally mercantilist? And that as royal decrees become more mercantilist all judicial decisions, opinions of high officials, books published, and instructions sent down to subordinates by administrative superiors will follow along? Of course not. There is more free floating among the details, among the specific indicators of the general concept, than is seen when we deal with conceptual classifications that are established by adding up the general tendency of the details.4

The lesson is clear. If historians want to use direct indicators in order to answer limited factual questions that call for purely descriptive answers, then there is no problem. The ambassador’s cables are direct indicators of what he told the Foreign Office. But when historians use concepts that sum up a large number of details, the answers to their questions can rarely if ever be coterminous with any single document or set of documents. For any single item may or may not be a reliable indicator of a concept, depending upon the way in which such items are inter-correlated with other items that are also subsumed under the same concept. A mercantilist treatise published in France in 1690 is sufficient to indicate that France was a mercantilist nation only if we know that the characteristics of all books on economic policy were highly correlated and that the characteristics of such books were in turn correlated with the characteristics of royal decrees, memoranda by high officials, and all the rest. When information on all items subsumed under the concept is available, or when adequate samples are possible, there is no particular difficulty. When neither is possible, the problem may be handled in one of two ways. First, the indicators available might include such a large portion of all items included under the concept that we need not worry about the missing indicators. Whatever they might look like, if ever located, they could not radically change the original judgment. Second, even though the available indicators be a small portion of all items included under the concept, the historian might know or might be able to infer the ways in which they must have been inter-correlated with the missing items. The difference between these two situations is nicely illustrated in one of the most imaginative examples of the use of direct indicators that can be found in historical literature, V. H. Galbraith’s guide to the Public Record Office in London (Galbraith, 1934).

Galbraith’s task is to find order in collections that fell into disorder during centuries of neglect. In order to do so, he argues, one must regard documents as secretions of the organizations that issued them. Hence, he must reconstruct the issuing organizations and the relationships between them. His reconstruction of the various organs of government in medieval England illustrates the second situation described above. With hardly a glance at the substantive content of any document, he uses differences in seals, parchment, filing practices, systems of dating, language, and handwriting as indicators of the existence of separate organs of government and of their emergence at various points in time. Now, when one speaks of separate organs of government, and of the boundaries and differences between them, one refers primarily to patterns of interaction between officials or clerks and to the actions they perform. Characteristics of pieces of paper or of the way in which they are filed are only a small part of what one means. But the indicators that Galbraith uses must have been so highly correlated with the social systems within which officials and scribes did their jobs that we do not worry if they are only a small part of everything that is meant by “distinct organ of government” or “separate organization.”

Galbraith’s reconstruction of the changing channels of communication between the various organs of government in medieval England illustrates the first situation described above. Among fourteenth- and fifteenth-century documents, for example, he finds warrants under the Royal Signet, which were filed among the records of the Privy Seal office, and warrants under the Privy Seal, which were filed among the Chancery records. These documents indicate a flow of communication from the King through the office of the Privy Seal to the Chancery, a more complex system than is seen in earlier records. The documents indicate a similar flow of communication from the Council and other departments through the office of the Privy Seal and then on to the Chancery. The office of the Privy Seal was a great clearing house, the center of a system that placed the Great Seal, in the custody of the Chancery, at the disposal of all departments. But certain notations on Chancery documents, such as per ipsum regem or per consilium, indicate that these normal channels were sometimes circumvented. The King or other officials sometimes communicated directly with the Chancery. Communication between departments is indicated in other ways as well. The Exchequer’s Originalia Rolls are extracts of Chancery Rolls dealing with fines payable, and duplicate copies of the Chancery’s Inquisitions post mortem are found in the Exchequer archives.

“Channels of communication” includes oral communications which were never recorded and perhaps the less important, illicit, or informal communications which were not preserved. Some traces of oral communications are found in the documents, and written communications not destined for preservation are less of a problem when scarce parchment rather than plentiful paper was used. For these reasons, we need not worry too much about the missing indicators. Those which are available to Galbraith cover such a large portion of the phenomena in question that additional evidence could not change his judgments by very much. To speak of cooperation between departments is quite another matter (Galbraith, 1934:24). The sending of documents from one office to another is only a small part of what is meant by “cooperation,” and all the other things meant by the term would not necessarily go along with the transmission of documents. The indicators used are not sufficient for this purpose because we do not know, without looking at other evidence, how they might have been correlated with other indicators of cooperation. When Galbraith reconstructs the channels of communication, however, he is on sure ground.

The use of documents as indicators must not be confused with the use of facts or events as indicators. The events that took place on May 30, 1765, in the House of Burgesses, along with numerous other events throughout the colonies, are indicators of, let us say, the degree of tension between the colonies and England. But in order to know what took place on that day in the House of Burgesses, historians must first confront documents and then make inferences about events that are external to the documents. Only then can the events be used as indicators. And, of course, historians must go through similar steps in order to use events as correlates of other events. The use of external facts as correlates or as indicators, in other words, the making of inferences from established facts to other facts, is among the procedures that historians share with social scientists generally. In contrast, although the four types of inference from documents to events that have been identified here are sometimes found in the works of other social scientists, these four procedures are among the distinctive features of the historian’s craft.

In some cases, historians simultaneously use all of these four procedures in order to answer a single question. Testimony, social bookkeeping, correlates, and direct indicators are all used by Homans (1942) in order to establish the geographical location and boundaries between champion land and woodland in medieval England. Homans deals with four characteristics by which woodland and champion land differed from each other: size of fields, open or closed fields, certain agricultural practices, and the distribution of settlements. For the first of these characteristics, he relies on testimony. For the second he relies on testimony and on a correlate, enclosure acts. For both the first and second, he also uses a direct indicator. If you go out and look you “can see that Devonshire, with its small squarish fields and big walls, has not the same landscape as Oxfordshire, though today the fields of both counties are, in the technical sense of the word, enclosed” (Homans, 1942:16). (This direct indicator would be even more telling if the book were written today, for the more sensitive eyes of aerial photography have since been used for this purpose.) For the third characteristic Homans cites documents of the social bookkeeping type, medieval surveys and extents. For the fourth, he again relies on testimony. While some of the evidence, when taken singly, may be called into question, the consistency with which these four types of evidence all point to the same conclusion answers all objections.

Although it is possible to classify neatly the evidence and procedures used by Homans in terms of the four categories presented here, it is not possible to do so in all cases that the reader might think of. For example, those portions of the letters sent to Versailles by the French colonial governors in Quebec that purport to record information about events are a cross between testimony and social bookkeeping. One must read them in both lights. Perhaps such documents are common to similar situations, those in which one person reports to another in an official capacity but, instead of being one link in a complex and formally organized system of communication, is free to decide what he reports and how he reports it. Similarly, the distinction between correlates and indicators may not always be neat, and readers with experience in such matters can undoubtedly remember using procedures that fit into neither category. This is all to the good. For the methodological analysis of research situations which cannot be unambiguously classified in terms of the categories set forth here will not only define the limits within which these categories are useful, but will also force us towards greater clarity and precision. The further refinement of the categories presented here, and the identification of still further types of inference from documents to events, will provide historians and all social scientists who use documents with a greater measure of self-conscious control over their materials and techniques.

The greatest historians may not need self-conscious control. Haskins did not need the general category “correlates.” Kosminsky did not need the category “social bookkeeping.” But few are blessed with their historical intuition. And among those who are not, only some can apply general methodological categories to the details of their specific problems. That is something that the categories themselves cannot do for us. In short, while many may need methodological investigations such as this, perhaps only a few can make use of them. But if historians are willing to grant that there is any problem of methodology at all, then they must also welcome all steps toward methodological refinement.

NOTES

1. The phrase is from Gray, W. (1959). Historian’s handbook: A key to the study and writing of history (p. 36). Boston: Houghton Mifflin.

2. The consistent application of this rule to early Spanish accounts of the Incas has led one author to a reconstruction of Inca society very different from those presented by earlier writers who had accepted the general descriptions given in the Spanish accounts. Cf. Moore, S. F. (1958). Power and property in Inca Peru. New York: Columbia University Press.

3. Decisions about authenticity are, of course, an example of “external criticism” and not of inferences from documents to events. Once the decision about authenticity is made, however, the fact of authenticity or lack of authenticity is a characteristic of the document, which, in this case, is used as a correlate of external events.

4. This discussion has profited from Lazarsfeld (1959). So far as the writer knows, most historians become painfully aware of this fact (that the specific indicators of general concepts are not perfectly correlated with one another) only when they attempt to periodize. By the age of laissez-faire, or some such concept, historians refer to a thousand and one specific items. When they attempt to set the temporal boundaries of an age, however, they confront the fact that the specific indicators of the concept do not all change in equal degree with equal speed.

REFERENCES

Chrimes, S. B. (1952). An introduction to the administrative history of mediaeval England. Oxford: Blackwell.

Galbraith, V. H. (1934). An introduction to the use of the public records. Oxford: Clarendon Press.

Haskins, C. H. (1918). Norman institutions. Cambridge, MA: Harvard University Press.

Heckscher, E. F. (1955). Mercantilism (M. Shapiro, Trans., & E. F. Soderlund, Ed., Vol. 1, rev. ed.). New York: Macmillan.

Homans, G. C. (1942). English villagers of the thirteenth century. Cambridge, MA: Harvard University Press.

Kosminsky, E. A. (1956). Studies in the agrarian history of England in the thirteenth century (R. Kisch, Trans., & R. H. Hilton, Ed.). New York: Kelly & Millman.

Lazarsfeld, P. F. (1959). Latent structure analysis. In S. Koch (Ed.), Psychology: A study of a science (Vol. 3, pp. 476–543). New York: McGraw-Hill.


3.3 POLITBURO IMAGES OF STALIN

NATHAN LEITES, ELSA BERNAUT, AND RAYMOND L. GARTHOFF*


Hypotheses regarding differences (or lack of differences) in policy-orientation or in degrees of influence between the various members of the Soviet Politburo have always been of great interest to students of politics. Thus there have been frequent speculations regarding alleged differences in foreign policy lines and on the problem of succession. The absence of confirming or disconfirming data for any of these hypotheses is striking, and obvious in view of the secrecy that enshrouds the internal operations of the Politburo. Published statements of any kind by members of the Politburo have become infrequent in recent years. Such statements as are available for analysis have usually dealt with different subjects and have been made at different dates, so that they were difficult to compare from the point of view of testing hypotheses regarding differences in policy or influence.

Through Stalin’s seventieth birthday, December 21, 1949, however, a rare opportunity for comparative analysis did occur. Pravda published articles by Politburo members Malenkov, Molotov, Beria, Voroshilov, Mikoyan, Kaganovich, Bulganin, Andreyev, Khrushchev, Kosygin, and Shvernik (in this order), preceded by a joint message to Stalin from the Central Committee of the Party and the Council of Ministers of the USSR. These articles were reprinted in Bolshevik, the Party organ, and the Soviet press in general.1 In addition, the anniversary issue of Pravda (but not Bolshevik) contained two articles on Stalin by persons who are not members of the Politburo, M. Shkiryatov (a Party Secretary) and A. Poskrebyshev (presumably Stalin’s personal secretary), thus treating their statements on a par with those made by the members of the Politburo. This body of materials will be examined as to what it may reveal regarding the distribution of influence and attitudes within the Politburo.

While all the statements mentioned appear at first glance to express the same adulation of Stalin, they do contain nuances in style and emphasis. These nuances could more easily be dismissed as matters of individual rhetoric, of little relevance to political analysis, if the statements had been made by non-Soviet writers. But nuances in the political language used by members of the Politburo when talking about Stalin are of a different nature. Stalinism is not afraid of monotony and does not shun repetitiveness. Lack of complete uniformity of language is therefore possibly of political interest. It is worthwhile to examine the materials intensively in order to determine whether or not the differences in language, however subtle, fall into any patterns, and to explore the meaning of differentiations between groups or individuals in the Politburo. It seemed especially useful to approach the material with a view to investigating the degree of maintenance (or disuse and replacement) of earlier Bolshevik terms and themes.

*From Leites, N., Bernaut, E., & Garthoff, R. L. (1951). Politburo images of Stalin. World Politics 3:317–339.

Two major types of statements about the image of Stalin which can be discerned in the articles are analyzed in this paper. Table 1 gives the total frequencies of statements2 concerning these ideas: first, Stalin in comparison to Lenin; and second, characterizations of Stalin’s dominant role, as “perfect Bolshevik” or “ideal Father.” A third image, “Stalin” as person or symbol, is not presented in this table or discussed in detail because the difference between images is a more qualitative classification derived from analysis of the context within the articles; it is briefly discussed at the close of this article.

The frequencies of statements, when read across, indicate the weight given to the “popular image” of Stalin. The articles were not uniform in length: Malenkov’s article was approximately 3,500 words; those of Shvernik, Andreyev, Kosygin, Khrushchev, and Shkiryatov were each about 2,500 words; the others were each approximately 5,000 words. However, since the relative weight given to characterizations within each article is the subject of our attention here, no “weighing” of frequencies has been made in the table, and absolute figures have been used. . . .


Table 1   References to Stalin in the Birthday Speeches, December 21, 1949

                 Stalin: Lenin’s Pupil or Equal?      Stalin: Perfect Bolshevik or Ideal Father?
Politburo        Bolshevik               Popular      Bolshevik               Popular
Member           Image       Ambiguous   Image        Image       Ambiguous   Image

Molotov          5           1           0            12          3           0
Malenkov         4           0           2            11          3           0
Beria            13          3           1            15          1           2
Shvernik         4           0           5            2           2           2
Voroshilov       0           0           1            1           2           4
Mikoyan          2           2           9            3           0           5
Andreyev         1           0           2            3           0           15
Bulganin         0           1           6            0           0           3
Kosygin          1           0           3            0           0           8
Khrushchëv       0           1           4            0           0           7
Kaganovich       0           3           6            0           3           21
Shkiryatov       0           3           6            0           0           6
Poskrebyshev     2           1           3            0           0           10
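Because the articles differ in length, one way to read the absolute figures in Table 1 is to convert them into within-article shares. The following Python sketch is an illustration only and is not part of the authors' analysis: it takes the counts as printed in Table 1 and computes, for each writer, the fraction of classified statements that fall under the popular image. The names `TABLE_1` and `popular_share`, and the choice to pool the two themes into one total, are assumptions made for this example.

```python
# Counts as printed in Table 1. For each writer:
# ((Bolshevik, ambiguous, popular) for "Lenin's pupil or equal?",
#  (Bolshevik, ambiguous, popular) for "perfect Bolshevik or ideal Father?")
TABLE_1 = {
    "Molotov":      ((5, 1, 0), (12, 3, 0)),
    "Malenkov":     ((4, 0, 2), (11, 3, 0)),
    "Beria":        ((13, 3, 1), (15, 1, 2)),
    "Shvernik":     ((4, 0, 5), (2, 2, 2)),
    "Voroshilov":   ((0, 0, 1), (1, 2, 4)),
    "Mikoyan":      ((2, 2, 9), (3, 0, 5)),
    "Andreyev":     ((1, 0, 2), (3, 0, 15)),
    "Bulganin":     ((0, 1, 6), (0, 0, 3)),
    "Kosygin":      ((1, 0, 3), (0, 0, 8)),
    "Khrushchev":   ((0, 1, 4), (0, 0, 7)),
    "Kaganovich":   ((0, 3, 6), (0, 3, 21)),
    "Shkiryatov":   ((0, 3, 6), (0, 0, 6)),
    "Poskrebyshev": ((2, 1, 3), (0, 0, 10)),
}


def popular_share(writer):
    """Fraction of a writer's classified statements that use the popular image.

    Pooling both themes into a single total is an assumption of this sketch.
    """
    lenin_theme, role_theme = TABLE_1[writer]
    total = sum(lenin_theme) + sum(role_theme)
    popular = lenin_theme[2] + role_theme[2]
    return popular / total


# Order the writers from least to most reliance on the popular image.
ranking = sorted(TABLE_1, key=popular_share)

if __name__ == "__main__":
    for writer in ranking:
        print(f"{writer:<14}{popular_share(writer):.2f}")
```

Read this way, the three lowest shares belong to Molotov, Beria, and Malenkov. Shares of this kind describe emphasis within an article only; they say nothing about how statements were classified in the first place.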


STALIN: LENIN’S PUPIL OR LENIN’S EQUAL?

In current Soviet public discourse, the “great” Lenin is not called “greater” than the “great” Stalin; nor is it affirmed explicitly that Lenin and Stalin are equal in “greatness.” It is, however, possible to adopt formulations that suggest the former or the latter of these emphases.

In the articles on Stalin’s birthday, the differences of stress fall into the pattern of tendencies toward what we have termed the “popular” and the “Bolshevik” images of Stalin; the popular image emphasizes Stalin’s equality (and in some instances even primacy) in relation to Lenin, while the Bolshevik image lays more stress on Stalin as Lenin’s “pupil,” or the “continuer” of his work and ideas.

In the treatment of this point, the Bolshevik image characterizes the articles of the top group of the Politburo: Malenkov, Molotov, and Beria. In a “middle” position, using both images, are the joint article of the Central Committee and the Council of Ministers and Shvernik’s article. Tendencies toward the popular image are expressed by Kosygin and Voroshilov (each of whom makes only two comparisons), Andreyev, and Poskrebyshev. The popular image is most frequently and clearly presented by Mikoyan, Kaganovich, Bulganin, Khrushchev, and Shkiryatov.

Beria uses the Bolshevik image, illustrated in the following examples, most frequently:

From his first steps of revolutionary activity Comrade Stalin stood unwaveringly under Lenin’s banner. He was Lenin’s true and devoted follower. He made his extremely valuable contributions to Leninist development of the Marxist Party’s . . . tenets. . . . Establishing and developing Leninism and relying on Lenin’s instructions (ukazaniya), Comrade Stalin developed the tenets of . . . industrialization. [Digest:12]

There are other instances where Beriastates that “Comrade Stalin developed Lenin’sinstructions” (Digest:13) and “developedLenin’s teaching on the Party” (Digest:12),but this quotation is especially significantsince Stalin is in Soviet writing almost

universally credited with the decision to col-lectivize and industrialize the country at arapid tempo. There are many other referencesto Stalin’s “arming the Party with Leninism,”or “defending” or “advancing” Leninism, butthese are not real comparisons.

There is one statement of equality on a sit-uation (the conduct of the Civil War) concern-ing which Stalin has credited himself with arole possibly higher than Lenin’s, so thatequality in this respect would belong to theBolshevik image.

During the difficult Civil War years Lenin andStalin led the Party, the State, the Red Army andthe country’s entire defense. [Digest:12]

Beria even makes one statement about“the introduction of the Leninist-Stalinistnational policy” (Digest:13) dealing with theone matter attributed to Stalin’s own author-ship prior to the middle twenties. Beria alsomentions Stalin’s investiture by Lenin, atheme that is rarely touched upon:

Lenin proposed that the Central Committeeof the Party elect Comrade Stalin GeneralSecretary of the Central Committee. ComradeStalin has been working in this high post sinceApril 3, 1922. [Digest:12]

As Lenin proposed, in 1923, that the Partyconsider the “removal” of Stalin from this“high post,” Beria’s reference is unusual. . . .

Molotov also expresses the Bolshevik image of Stalin in comparison to Lenin, emphasizing his theoretical continuation rather than personal discipleship, as Beria does. Both mention the fact that after Lenin’s death, Stalin headed the Communist Party. Molotov goes on, however, to state:

Comrade Stalin upheld and developed Lenin’s theory of the possibility of victory of socialism in one country. . . . [Digest:7]

As the . . . representative of creative Marxism, Comrade Stalin has highly developed the Leninist principles of strategy and tactics of our party. . . . [Digest:10]3

Molotov also expresses the Bolshevik image of Stalin as the successor to Lenin in his capacity as “head of the Party” and the preserver of its monolithic character, and says:

As the great continuer of the cause of immortal Lenin, Comrade Stalin stands at the head of all our socialist construction. . . . [Bolshevik:22]

Malenkov also stresses the Bolshevik image (despite two statements of apparent equality concerning their role in the Revolution):

Better than anyone else, Comrade Stalin profoundly understood Lenin’s inspired ideas on a new-type Marxist party. [Digest:3]

A middle position, using both images frequently, is noticeable in the joint C.C.-Council of Ministers message, and in the article by Shvernik, entitled “Comrade Stalin⎯Continuer of the Great Cause of Lenin.” In addition to the title of his article, Shvernik makes three weaker Bolshevik image references to Lenin and Stalin, such as the one cited below.

From the first steps of his revolutionary struggle, Comrade Stalin was pervaded with a boundless faith in Leninist genius, and went on Lenin’s path as the most loyal of his pupils and companions-in-arms. [Bolshevik:1]

On the other hand, he expresses the popular image four times, writing “together with Lenin, Comrade Stalin” (Bolshevik:91, twice), and “Lenin and Stalin” led the working class to victory (Bolshevik:91), and finally, in words borrowed from Mikoyan, he says: “Stalin⎯that is Lenin today” (Bolshevik:95).

Poskrebyshev (Stalin’s secretary, and possibly a future member of the Politburo) also expresses a mixed attitude on this question, with three unequivocal statements of equality, three as “continuer of the cause of Lenin,” and two as teacher-pupil.

Kosygin, Andreyev, and Voroshilov employ the popular image more frequently than the Bolshevik but do not compare Lenin and Stalin often. Thus Kosygin writes: “The ideas of Lenin-Stalin have triumphed. One-third of the population of the globe has entered firmly onto the path indicated by Lenin-Stalin . . .” (Bolshevik:89), and later “path of socialism, indicated by Lenin-Stalin . . .” (Bolshevik:90). Kosygin even omits the name of Lenin in a passage where one might have expected to find it:

With the name of Stalin is indissolubly connected the creation of our Communist Party and of the first Soviet socialist state in the world. . . . [Bolshevik:86]

Just as Andreyev’s article was predominantly devoted to agricultural matters, Voroshilov’s article was concerned with military affairs, more specifically the strategy and conduct of the Great Fatherland War. In addition to two references to “the Party of Lenin and Stalin” he makes only one comparison, expressing equality.

During the years of the heroic struggle and labor [the Revolution], the Soviet people under the leadership of the Party of Bolsheviks, under the guidance of the great leaders Lenin and Stalin, secured a world-historical victory. [Bolshevik:35]

The popular image is clearly dominant, and frequent, in the articles of Mikoyan, Kaganovich, Bulganin, Khrushchev, and Shkiryatov.

Thus, Mikoyan states:

Stalin not only fully mastered the entire scientific heritage of Marx, Engels and Lenin . . . [He “defended” and “brilliantly interpreted” it]; he also enriched Marxism-Leninism with a number of great discoveries, and further developed the Marxist-Leninist theory. In the words of Comrade Stalin Leninism is raised to a new, higher historical plane. . . . The Marxist-Leninist philosophy, which is transforming the world, has reached its apex in the works of Comrade Stalin. [Digest:19]4

Kaganovich is even more devoted to the use of the popular image, representing Stalin as equal to (or in rare instances even superior to) Lenin. There are no clear uses of the Bolshevik image in his article, which abounds in comparisons.

Comrade Stalin did not simply defend and safeguard the Leninist theory of the possibility of the victory of socialism in one country, but on the foundation of rich experience of the struggle, he creatively augmented and enriched the theory. . . . [Bolshevik:59]

In one place Bulganin credits Stalin with the distinction between just and unjust wars (“as Stalin teaches . . .”), without any mention of Lenin, who first made this distinction, and until now has been generally so credited in the Soviet Union (Bolshevik:70).

Khrushchev also uses the popular image, with one possible exception, in all his comparisons of Lenin and Stalin. In addition to five references to “the X of Lenin and Stalin” (X = Party, teaching, idea, cause, and banner), he makes three statements of clear equality and one which may even attribute superiority to Stalin.

Herein lies Comrade Stalin’s tremendous and invaluable service. He is the true friend and comrade-in-arms of the great Lenin. [Digest:30]

. . . Stalin, who together with Lenin created the great Bolshevist Party, our socialist state, enriched Marxist-Leninist theory, and raised it to a new, higher level. [Bolshevik:80]

Shkiryatov also expresses the extreme image most frequently, stating only three times that Stalin is continuing the “cause” or “banner” of Lenin, while using the phrase “the teaching of Lenin and Stalin” four times, and making six comparisons of Lenin and Stalin, in all of which they are clearly represented as equal.

Reviewing the treatment of this theme we see that there emerge rather distinctly a Bolshevik image and a popular image, in the treatment of the relative standing of Lenin and Stalin by Politburo members.

The “Bolshevik image” is most prominent in the articles of the top sector—Beria, Molotov, and Malenkov (in that order). It represents Stalin as the pupil of Lenin, his follower, and his continuer as Lenin’s successor, who continued to implement, defend, and elaborate Leninism. He appears as the most loyal of Lenin’s followers and the one who best understood his ideas. Stalin is not considered as Lenin’s peer (with the single exception of Malenkov’s treatment of the October Revolution).

The “popular image” of Stalin is predominant, in varying degree, in the words of all the others, especially Kaganovich, Khrushchëv, Mikoyan, Bulganin, and Shkiryatov. It represents Stalin as the equal of Lenin, also in situations where this was obviously not the case. In rare instances, Stalin even appears greater than Lenin.

STALIN: THE PERFECT BOLSHEVIK PARTY LEADER, OR THE IDEAL FATHER

The Bolshevik image is employed by Beria, Malenkov, Molotov, and to a lesser degree by Shvernik and Mikoyan. Stalin appears as the great “leader” and “teacher,” but by implication the Party is superior to him. He possesses a very high degree of Bolshevik virtues.

The perfect Bolshevik takes it for granted that his life is dedicated to the advancement of Communism, at whatever deprivations to himself. He regards it as improper to talk about ultimate values and personal sacrifices; attention, he feels, should be concentrated on discerning the correct line and carrying it through. The traits ascribed to Stalin by Beria, for instance, are almost all means to this end and are presented as such. A positive evaluation of a Bolshevik commends him for having made himself an effective tool in correct directions.

The popular image of Stalin, given much more profusely, does not present him as a Party leader impersonally fulfilling the moral obligation to render service to the proletariat by providing a correct policy line. It shows him as a People’s Leader in the Soviet Union and in the rest of the world, bestowing boundless paternal solicitude (zabota) on the “simple people.” The people, overwhelmed by surprise at finding such freely tendered goodness in one of their very own (rodnoi) on high, work harder and better for him in loving gratitude. While the aim of the Party leader is to realize Communism in the future, at the cost of current hardships, the solicitude of the Leader of the People aims at satisfying human needs now. This he does, not only by laying down over-all policy, but also by innumerable concrete actions. In all this, Stalin possesses the virtues of an ideal father (sometimes brother and friend) which his children do not strive to equal. Stalin tends to become the creator of all good things.

The use of the Bolshevik image by the top group in this respect is far from excluding the use of elements of the popular image. Nevertheless, there is a differentiation, which we shall endeavor to show.

1. One of the aspects of the Bolshevik image of Stalin is his endowment with a very high degree of Bolshevik virtues. The implication is that these distinctive virtues should be emulated by less perfect Bolsheviks and that, although the chances of attaining Stalin’s degree of perfection are slight, the model is clear, and there is no predetermined limit to advance.

For example, Beria says:

In Comrade Stalin the Soviet people saw even more clearly and distinctly the features of his great teacher, Lenin. They saw that our army and people were led into battle against a brutalized enemy by a tested leader who, like Lenin, was fearless in battle and merciless toward the enemies of the people; like Lenin, free of any semblance of panic; like Lenin, wise and bold in deciding complicated questions; like Lenin, clear and definite, just and honorable, loving his people as Lenin loved them. [Digest:15]5

Molotov also stresses Stalin’s Bolshevik traits in several passages; one outstanding example follows below.

The works of Stalin are now appearing, containing his works from 1901. It is impossible to overestimate the theoretical and political significance of this publication. Before our eyes, stage by stage, there unfolds the picture of the inspired creative work of the great Stalin, in all its diversity and spiritual wealth. Here, all the diverse practical questions of the work of the Bolshevist party and the international communist movement and, together with this, complex scientific problems of history and philosophy are treated in the light of the ideas of Marxism-Leninism. . . . [Digest:9]

In most cases, popular image characterizations are admixed with Bolshevik statements showing Stalin as “leader” and “teacher.” Of all the statements by the top group in the popular vein, only one (by Molotov) communicates a feeling or judgment by the speaker himself; all the other instances allege judgments or feelings of the people.

Comrade Stalin is rightfully considered a great and loyal friend of the freedom-loving peoples of the countries of people’s democracy. . . . [Digest:3]

In addition to stressing his Bolshevik virtues, the Bolshevik image presents Stalin as leader in three forms: political strategist, teacher, and Party executive. We shall examine these in turn.

2. According to the Bolshevik image, Stalin’s main role is to make a diagnosis and prognosis of the political situation and to derive the correct line from it. In the popular image of Stalin this is stressed much less. This aspect of the Bolshevik image is conveyed particularly by Molotov, as the examples below indicate.

. . . [Stalin’s] ability . . . to show the Party the true way and to lead it to victory. [Digest:11]

In order that the anti-Hitler three-power coalition might be created during the war, it was necessary first to thwart the anti-Soviet plans of the governments of Britain, France . . . Comrade Stalin discerned in time the . . . Anglo-French intrigues . . . enabling us . . . to bring the developments of events to a point at which the governments of Britain and the U.S.A. were faced with the necessity of establishing an Anglo-Soviet-American . . . coalition. . . . [Digest:8]

3. Related to this in the Bolshevik image is Stalin’s function in “teaching” the Party rules of organization, strategy, and tactics. This is another point less stressed in the popular image of Stalin. But it is one of the main emphases of Malenkov (who may expect to take over this function). The following citations from his speech are but a few of many.

Comrade Stalin teaches that the Bolshevist Party is strong because . . . it multiplies its ties with the broad masses of the workers . . . Comrade Stalin teaches that without self-criticism we cannot advance . . . Comrade Stalin teaches that . . . Comrade Stalin educates the cadres of our Party. . . . [Digest:4–5]

Molotov and Beria emphasize Stalin’s character as the “continuer,” “defender,” and “developer” of Leninism more than this teaching role, but they often do refer to Stalin as “leader and teacher.” (This standard phrase is also found in the popular image but less frequently and prominently.)

4. The top-level statements frequently present as the major acting force not Stalin but the Party (or, sometimes, the “Soviet Union” or the “Soviet people”), while other members of the Politburo stress the personal role of Stalin by omitting references to the Party. The Party is even credited with those services most often credited to Stalin by most of the others⎯inspiring, mobilizing, organizing. The term “leadership of the Party” clearly refers to others besides Stalin⎯in fact, to the speaker himself. Thus in the following passage, Malenkov mentions the Party eight times, and Stalin only once.

The friendship among peoples which is firmly established in our country is a great achievement of the leadership of the Bolshevist Party. Only the Bolshevist Party could forge the indissoluble fraternity among the peoples⎯the Bolshevist party which consistently carries forward the ideas of internationalism. . . . [The recent war] was a most serious one for the Bolshevist Party itself. The Party emerged from this test a great victor . . . following the instructions of Comrade Stalin, our Party constantly inspired the people and mobilized their efforts in the struggle against the enemy. The Party’s organizational work united and directed. . . . Again the unsurpassed ability of the Bolshevist Party to mobilize the masses under the most difficult conditions was demonstrated. [Digest:4]
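The tally reported for this passage can be checked against the excerpt as quoted. The sketch below is illustrative only: the helper name `count_term` and the whole-word, case-insensitive matching rule are assumptions made for the example, not the authors' stated coding procedure, and the counts apply to the abridged excerpt with its elisions left in place.

```python
import re

# Malenkov excerpt as quoted above, elisions included [Digest:4].
PASSAGE = (
    "The friendship among peoples which is firmly established in our country "
    "is a great achievement of the leadership of the Bolshevist Party. Only "
    "the Bolshevist Party could forge the indissoluble fraternity among the "
    "peoples⎯the Bolshevist party which consistently carries forward the "
    "ideas of internationalism. . . . [The recent war] was a most serious one "
    "for the Bolshevist Party itself. The Party emerged from this test a great "
    "victor . . . following the instructions of Comrade Stalin, our Party "
    "constantly inspired the people and mobilized their efforts in the "
    "struggle against the enemy. The Party’s organizational work united and "
    "directed. . . . Again the unsurpassed ability of the Bolshevist Party to "
    "mobilize the masses under the most difficult conditions was demonstrated."
)


def count_term(text: str, term: str) -> int:
    """Count whole-word, case-insensitive occurrences of `term` (assumed rule)."""
    pattern = rf"\b{re.escape(term)}\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


party_mentions = count_term(PASSAGE, "Party")
stalin_mentions = count_term(PASSAGE, "Stalin")
print(party_mentions, stalin_mentions)
```

Under this matching rule the excerpt yields eight mentions of the Party and one of Stalin, in line with the figures given in the text. A different unit of counting, such as treating “the Party’s” separately, would need a different rule.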

On the other hand, the image of Stalin as the People’s Leader (the popular image) shows him acting directly, without using the transmission belt of the Party. Occasionally the “top group” and the “middle group” members use this image in topics intended for mass consumption:

. . . Stalin’s voice in defense of peace . . . has penetrated throughout the world. . . . All simple and honest people responding to his appeal group themselves into powerful columns of fighters for peace. [Voroshilov, Digest:19; Bolshevik:44]

The popular image of Stalin, as we have indicated previously, does not stop at the limits which mark the Bolshevik characterization described above. Indeed, it very rarely uses them at all, except for casual and occasional reference to the standard term “leader” and “teacher.” The articles of Kaganovich, Khrushchev, Shkiryatov, Poskrebyshev, Bulganin, Kosygin, and Andreyev, in roughly that descending order, are most expressive of the popular image, in the aspects presently under review. Mikoyan, and to a lesser degree Shvernik, also use it, but there are a number of mixed and even Bolshevik statements in their articles. On the other hand, the seven writers listed above have only four Bolshevik image statements in all their articles. Voroshilov is a special case; in his introduction and conclusion he makes a number of statements in the popular image.

1. In the popular image Stalin is characterized as the “father” of his people, who constantly helps them because of his “paternal solicitude” for them. (This is sometimes weakened to a “friend” relationship, and sometimes intimate relationship terms are not employed.) “The simple people” are grateful, loving, and industrious in return. For them Stalin is rodnoi, meaning “one’s very own,” and connoting familial intimacy.

Each of the members of the “bottom group” uses this description (to varying degrees, of course, as shall become evident). The following examples are by no means exhaustive of the instances used.

Kaganovich depicts Stalin in this manner in the following passages:

Comrade Stalin displays exceptional solicitude regarding miners and the alleviation of their labor. . . . The glorious army of railway workers responds to Comrade Stalin with warm love, devotion, and with a growing and improving transport [system] for his paternal warmth and solicitude. . . . The systematic increase of wages [etc.] . . . all these are the results of the constant solicitude and attention of our very own [rodnoi] Comrade Stalin, whom the people lovingly call father and friend. [Bolshevik:60–61]

Bulganin develops a similar image:

Comrade Stalin always displayed and displays up to the present time a constant paternal solicitude for the bringing up [vyrashchivani; used in the phrase “bringing up one’s children”] of military cadres, educating them in the spirit of supreme fidelity to the Bolshevist Party, in the spirit of self-sacrifice in the service of the people. . . . [Bolshevik:67]

Khrushchev similarly states:

Lenin and Stalin stood at the cradle of each Soviet republic, they guarded it from menacing dangers, paternally [po-otecheski] helped it to grow and become strong. . . . This is why all the peoples of our land, with the uncommon warmth and feeling of filial love, call the great Stalin their very own [rodnoi] father. . . . [Bolshevik:81]

Andreyev, while not stressing this aspect of the extreme image, states:

Attentively, paternally, daily leading and watching over affairs on the collective farms . . . [is] Comrade Stalin. [Digest:29]

The two non-Politburo members, Shkiryatov and Poskrebyshev, both use this aspect of the popular image frequently. Poskrebyshev even titled his article “Beloved Father and Great Teacher.”

Shkiryatov writes:

The peoples of our country grow and become stronger like one family, and glorify Comrade Stalin⎯father and friend of all peoples of the USSR. [Pravda:11]

Stalin, our father and friend, instills in us a love for all that is ours, native—in science, in culture, in production, and educates into the Soviet people a warm devotion to its Motherland. . . . [Pravda:11]

2. As has already become evident, the popular image pictures Stalin as the People’s Leader, as contrasted to the emphasis on the Party and Stalin as Party leader in the moderate view. There are several aspects to being “People’s Leader,” and one which has been suggested in several of the quotations already cited shows Stalin as an opponent of “bureaucracy.” In his concern for the welfare of the simple people, he must overcome the indecency, selfishness, and malice of the bureaus standing between him and the people. Bulganin makes this almost explicit:

Comrade Stalin always paid great attention to the welfare of soldiers and sailors. He was interested in food standards, the quality of uniforms, and the weight of arms carried by soldiers. Comrade Stalin frequently pointed out in his orders that concern for the soldiers’ . . . welfare was the sacred duty of the commanders, that they must see to it most strictly that soldiers received all the food due under established standards, that the troops were given well-prepared warm meals in good time. . . . Due to the constant solicitude of Comrade Stalin for the suppliers of the troops our front fighters were well fed and comfortably and warmly clad. [Digest:28; Bolshevik:71]

Many other examples could be cited to demonstrate this aspect of the popular image.

The popular image of Stalin shows him, by implication, almost as a one-man Party-government-and-army apparatus. The previous quotations have pointed out this characterization of Stalin in situations where the welfare of the people required it. But this does not exhaust the range of his actions, and Kaganovich and Bulganin in particular extend Stalin’s active personal role to rather extreme lengths. According to Kaganovich:

. . . while . . . the countries of Europe, and the U.S.A., first of all, are slipping toward a crisis, here in the Soviet Union the socialist economy improves constantly. . . . We are obliged for this to the superiority of the socialist system of economy, and above all to Comrade Stalin’s great energy, initiative and organizing genius. [Digest:25]

Bulganin concerns himself with Stalin’s role during the war, where Stalin performed an apparently prodigious amount of diverse labors constantly. Already in the Civil War,

Comrade Stalin was the creator of the most important . . . strategic plans and the direct leader of the decisive battle operations. . . . At Tsaritsyn and Perm, at Petrograd and against Denikin, in the West against the Poland of Pans, and in the south against Wrangel⎯everywhere his iron will and military genius secured [obespechivali] the victory of Soviet forces. [Bolshevik:66]

And in the recent war,

All operations of the Great Fatherland War were planned by Comrade Stalin and executed under his guidance. There was not a single operation in the working out of which he did not participate. Before finally approving a plan . . . Comrade Stalin subjected it to thorough analysis and discussion with his closest [an unusual statement] . . . Comrade Stalin personally directed the whole course of every operation. Each day and even several times a day he verified the fulfillment of his orders, gave advice, and corrected the decisions of those in command, if there was need of this. [Bolshevik:66]

This image of Stalin as omnipresent and competent in every matter⎯an image never presented by the Politburo top group⎯is developed to a still further extreme by Poskrebyshev:

Attentively supervising the work of the leading Michurinists [the new geneticists], headed by Comrade Lysenko, Comrade Stalin gave them daily assistance by his advice and instructions. . . . Comrade Stalin must also be noted as a scientific innovator in specialized branches of science. . . . Among the old specialists in agriculture it was considered firmly established that the cultivation of citrus crops could not be extended on a wide scale in the region of the USSR Black Sea coast. . . . [Digest:34]

STALIN: PERSON OR SYMBOL?

In our material, “Stalin” often refers to more than the man, J. V. Stalin. The boundary between references to Stalin the person and, as might be said, Stalin the symbol is blurred, probably on purpose. The top group, however, is more careful than the others to distinguish between these two images, and to lay stress on Stalin the person.

One way of indicating that Stalin is being referred to as a symbol is by speaking of his “name,” or actually declaring his name to be a “symbol.” Thus Beria states in his introductory paragraph:

Since the great Lenin there has been no name in the world so dear to the hearts of millions of working people as the name of the great leader, Comrade Stalin. [Digest:11]

And Molotov tells us that for “the world movement for peace”

. . . the name of Stalin is its great banner. [Digest:9]

Malenkov also states this:

The name of Comrade Stalin has long since become a banner of peace in the minds of the peoples of all countries. [Digest:3]

And Bulganin writes:

The name of Stalin became for the Soviet troops the symbol of the greatness of our nation and its heroism. They went into battle with the slogan: “For Stalin, for the Motherland!” [Bolshevik:71]

Another way of differentiating between Stalin the person and Stalin the symbol is by making explicit the personal character of the reference. In the birthday articles, Molotov, Shvernik, and Bulganin use this mode of expression most frequently. Although many other references which do not specify that Stalin the person is meant probably do mean this, the method remains, when used by the top group, an indication of instances where Stalin’s personal role is held to be highly significant. Malenkov uses a different method of achieving a similar effect. Although he refers to Stalin an average number of times (average number, 59; Malenkov’s total, 60), a disproportionately large number of the references are to the effect that “Stalin teaches that . . .” or “as Stalin said,” etc. Consequently, he says relatively less about other accomplishments of Stalin.

A technique used to transform “Stalin” from the person into the symbol is to employ the adjectival form of the word, “Stalinist.” The Bolshevik image usually reserves the term “Stalinist” to describe the achievements of Stalin’s regime rather than his personal accomplishments. The popular image is, on the whole, lax about this differentiation, and apparently allows personal and impersonal meanings to be given to “Stalinist,” as well as to “Stalin.”

The proportion of uses by the top group (Molotov, Beria, and Malenkov) of “Stalinist” as meaning “Stalin’s” personally is only two out of a total of twenty-seven, in contrast to the very frequent use of the term in this meaning by all the others (excepting only Voroshilov’s account of Stalin’s role in the recent war). Very often statements are made describing “the Stalinist, Soviet path,” “Soviet, Stalinist military science,” and the like, implying clearly that the term in these instances indicates merely “under the present regime” or “in a Bolshevik manner.”
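The proportion reported for the top group can be made explicit with a short calculation. The sketch below is only a worked arithmetic check; the variable and function names are illustrative, and the only data used are the two counts given in the text (no comparable totals are reported for the other writers, so none are assumed).

```python
from fractions import Fraction

# Counts reported in the text for the top group (Molotov, Beria, Malenkov).
personal_uses = 2   # "Stalinist" used to mean "Stalin's" personally
total_uses = 27     # all uses of "Stalinist" by the top group


def proportion(part: int, whole: int) -> Fraction:
    """Return part/whole as an exact fraction; `whole` must be positive."""
    if whole <= 0:
        raise ValueError("whole must be a positive count")
    return Fraction(part, whole)


personal_share = proportion(personal_uses, total_uses)
impersonal_share = 1 - personal_share

print(f"personal: {float(personal_share):.1%}")      # personal: 7.4%
print(f"impersonal: {float(impersonal_share):.1%}")  # impersonal: 92.6%
```

On these figures, roughly one use in fourteen by the top group carries the personal meaning.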

The relatively impersonal meaning of the adjective “Stalinist” is particularly evident in such passages as the following. Molotov, affirming that the Soviet Union has gained in strength over the last quarter of a century, says:

This is a very great service of Comrade Stalin and of Stalinist leadership. [Digest:6]

Presumably, “Stalinist leadership” here refers to Party leaders other than Stalin, and becomes a synonym for “Party.” This is shown when Kaganovich, in a rare formulation, says:

A decisive condition for the victory of socialism was the incessant struggle of Comrade Stalin and of the united collective Stalinist leadership . . . for the realization of the general line of the Party. [Bolshevik:63]

CONCLUSIONS

Two main conclusions emerge from this study of the birthday articles:

1. Despite many individual differences among these articles and despite the variations within each of them, two major images of Stalin may be constructed, toward which each article is oriented to its particular degree. Briefly, these images are Stalin the Party Chief and Stalin the People’s Leader. The Party Chief is a very great man; the People’s Leader stands higher than any man. The Party Chief is characterized by Bolshevik traits; the People’s Leader by constant and boundless solicitude for the welfare of all. We have referred for the sake of brevity to the first as “the Bolshevik image,” and to the second as “the popular image.”

2. Three groups within the Politburo can be distinguished in terms of using these images. Malenkov, Molotov, and Beria, who presumably are the most influential members of the Politburo, stress the Bolshevik image of Stalin more than the other members, although indications of the popular image are not totally absent from their statements. Kaganovich, Bulganin, Khrushchev, Kosygin, and to a lesser degree Mikoyan and Andreyev, occupy positions near the popular image (as do Shkiryatov and Poskrebyshev). Shvernik and the joint Party-government address occupy a middle position. Voroshilov is a special case, presenting the popular image of Stalin in his introduction and peroration, but a very moderate Bolshevik image in terms of specific military operations (in contrast to Bulganin).

These two images of Stalin can now be reviewed with two questions in mind: (1) To whom is either image addressed? Is there a preferred audience for the popular image and another such audience for the Bolshevik image? (2) What political significance can be attached to the finding that the Bolshevik image is stressed by the “top group” in the Politburo, while the popular image is used most freely by the “bottom group”?

Concerning the first question, it should be remembered that all statements analyzed in this paper were published; they were not made in private. As public statements they were not primarily, or at any rate not exclusively, addressed to Stalin. It is reasonable to assume that the “masses” of the Soviet population were meant to be the consumers of the popular image, whereas the Bolshevik image was offered primarily for adoption by Communists, i.e., a small segment of the population. It is characteristic of Bolshevism, though paradoxical to Western thinking, that the symbols of nearness and intimacy (“father,” “solicitude,” etc.) appear most frequently in the popular image of Stalin and are stressed for that audience which is far removed from Stalin. Those closer to Stalin politically are permitted to speak of him in terms of lesser personal intimacy (“leader of the party”). This paradox results partly from the merely instrumental use in Bolshevik language of words indicating personal nearness, and partly from the Bolshevik deprecation of such nearness in political relationships. The ideal Party member does not stress any gratification he may derive from intimacy with others, much as he may use such intimacy for political ends.

For this reason it is difficult to answer thesecond question with certainty. It cannot beruled out that the Politburo⎯or a leadinggroup within it, or Stalin personally⎯decidedto use both images of Stalin in the birthdaystatements and to adopt a certain distributionof roles among its members in presentingthem. (Such a decision may have taken theform of an editorial scrutiny of each state-ment, in the course of which the differentia-tion of language was imposed.)

However, the assumption that there was a decision within the Politburo on the use of different images of Stalin does not preclude certain tentative conclusions about the status of the groups within the Politburo. The emphasis on the Bolshevik image by a few members of the Politburo and on the popular image by others not only reflects the Bolshevik evaluation of the Party as distinguished from, and superior to, the masses at large, but also indicates the relative distance of the speakers from Stalin. In the situation under review, it is a privilege for a member of the Politburo to refrain from using the crudest form of adulation, words signifying personal intimacy and emotions; that is, private, rather than political, words. Given the Bolshevik evaluation of political as against private life, the use of the Bolshevik image indicates higher political status. Hence, a planned distribution of roles in using the two images of Stalin on the occasion of his birthday would still indicate a political stratification of the Politburo, though not necessarily political antagonism within it.

Unless one were to make the somewhat absurd assumption that the roles to be performed on this occasion were distributed by lot, or the improbable assumption that they were assigned for the purpose of concealing the real stratification within the Politburo, those members who stress the Bolshevik image could be assumed to be politically closer to Stalin than those who do not.

The assumption that there had been a decision of some kind on the use of the two images would appear more plausible if either image were used by certain members of the Politburo without the admixture of elements taken from the other. As it is, the difference between the “top group” and the “bottom group” is one of emphasis in imagery. For this reason, we are inclined to regard the differentiation of political language discussed in this article as the result of individual choices rather than of a central decision. However, in this case we may assume that the stress (whether conscious or not) of any given Politburo member on the one or the other image of Stalin was related to his status in the Politburo in the fashion indicated above.

NOTES

1. As far as is feasible, quotations are given from the translation in Volume 1, No. 52, of The Current Digest of the Soviet Press (hereafter cited as Digest). Other passages have been translated from the December 1949, No. 24 Bolshevik. All italics, unless otherwise indicated, are by the authors of this article.

2. A “statement,” for the purposes of this table, means each incidence of an explicit idea, and may vary from a phrase to a paragraph. The examples cited in the text should clarify this point.


3. On December 21, 1929, Molotov wrote more specifically that Stalin had been a “man of practice” (praktik i organizator) up to Lenin’s death, after which he became a “theoretician.” Even in 1949 Molotov has not quite suppressed his tendency to deny that Stalin was manifestly perfect from the start. He begins his speech by saying: “It is now particularly clear how very fortunate it was . . . that after Lenin the Communist Party of the USSR was headed by Comrade Stalin” (Digest:6). In the Bolshevik atmosphere of veiled language, this is bound to be understood, to some extent, as conveying: It was not always clear.

4. Although Molotov and Beria both praise Stalin as the theorist, they do not state explicitly (or clearly implicitly) that Stalin is as great a theorist as Lenin, to say nothing of the statement that “Marxist-Leninist philosophy has reached its apex” in Stalin’s work.

5. “Loving the people” also belongs to the popular image. These occasional popular image terms in a moderate picture may be the effect of reverse seepage of esoteric propaganda into the constantly assaulted esoteric integrity of the top group.


3.4 QUANTITATIVE AND QUALITATIVE APPROACHES TO CONTENT ANALYSIS

ALEXANDER GEORGE*


Researchers have long debated the respective merits and uses of “quantitative” and “qualitative” approaches to content analysis. . . . Most writers on content analysis have made quantification a component of their definition of content analysis. In effect, therefore, they exclude the qualitative approach as being something other than content analysis.

Quantitative content analysis is, in the first instance, a statistical technique for obtaining descriptive data on content variables. Its value in this respect is that it offers the possibility of obtaining more precise, objective, and reliable observations about the frequency with which given content characteristics occur either singly or in conjunction with one another. In other words, the quantitative approach substitutes controlled observation and systematic counting for impressionistic ways of observing frequencies of occurrence.1

The term “qualitative,” on the other hand, has been used to refer to a number of different aspects of research . . . :

1. The preliminary reading of communications materials for purposes of hypothesis formation and the discovery of new relationships

As against

Systematic content analysis for purposes of testing hypotheses.

2. An impressionistic procedure for making observations about content characteristics

As against

A systematic procedure for obtaining precise, objective, and reliable data.

3. Dichotomous attributes (i.e., attributes which can be predicated only as belonging or not belonging to an object)

*Excerpt from George, A. L. (1959). Quantitative and qualitative approaches to content analysis. In I. de Sola Pool (Ed.), Trends in content analysis (pp. 7–32). Urbana: University of Illinois Press.

As against

Attributes which permit exact measurement (i.e., the true quantitative variable) or rank ordering (i.e., the serial).

4. A “flexible” procedure for making content-descriptive observations, or “coding” judgments

As against

A “rigid” procedure for doing the same.

FREQUENCY AND NON-FREQUENCY CONTENT INDICATORS

While these four distinctions are important in themselves, they do not serve to differentiate between the two approaches to the analysis of communication. . . . We therefore introduce a somewhat different distinction, which focuses on the aspects of the communication content from which the analyst draws inferences regarding non-content variables.

1. Quantitative content analysis, as we here define it, is concerned with the frequency of occurrence of given content characteristics; that is, the investigator works with the frequency of occurrence of certain content characteristics.

2. Inferences from content to non-content variables, however, need not always be based on the frequency values of content features. The content term in an inferential hypothesis or statement of relationship may consist of the mere presence or absence of a given content characteristic or a content syndrome within a designated body of communication. It is the latter type of communication analysis, which makes use of “non-frequency” content indicators for purposes of inference, that is regarded here as the non-quantitative or non-statistical variant of content analysis.

The distinction we have introduced concerns the type of content indicator utilized for purposes of inference. Given (different uses of) . . . “quantitative” and “qualitative,” it is desirable to introduce a new set of terms. We employ the term “non-frequency” to describe the type of non-quantitative, non-statistical content analysis which uses the presence or absence of a certain content characteristic or syndrome as a content indicator in an inferential hypothesis. In contrast, a “frequency” content indicator is one in which the number of times one or more content characteristics occur is regarded as relevant for purposes of inference.

The distinction between frequency and non-frequency analysis, it should be noted, is independent of the aforementioned four dimensions to which the terms quantitative and qualitative are sometimes applied. Thus, both in frequency and non-frequency analysis (one can distinguish) between hypothesis-formation and hypothesis-testing phases of research, between impressionistic and systematic types of content description, between flexible and rigid procedures for making content-descriptive judgments.

Nor is the familiar distinction in the theory of measurement between dichotomous, serial, and quantitative attributes equivalent to the distinction advanced here. Thus, . . . frequency as well as non-frequency analysis may be concerned with dichotomous attributes, that is, attributes which can be predicated only as belonging or not belonging to an object. This is the case, for example, in the simple word-counts . . . deciding whether a certain word or symbol (“democracy,” “Germany,” “Stalin”) does or does not appear in each sentence, paragraph or article. . . .

[T]he difference between the two approaches is that frequency analysis, even when it deals with dichotomous attributes, always singles out frequency distributions as a basis for making inferences. In contrast, the non-frequency approach utilizes the mere occurrence or nonoccurrence of attributes . . . for purposes of inference. Thus, for example, [from] a quantitative study which shows a sharp decline in number of references to Stalin in Pravda, the frequency analyst might infer that the successors to Stalin are attempting to downgrade the former dictator or are trying to dissociate themselves from him. On the other hand, the non-frequency analyst might make a similar inference from the fact that in a public speech one of Stalin’s successors pointedly failed to mention him when discussing a particular subject (e.g., credit for the Soviet victory in World War II) where mention of Stalin would formerly have been obligatory. In one case, it is the frequency distribution of attention to “Stalin” over a period of time on which the inference rests. In the other, it is the mere occurrence or nonoccurrence of the word “Stalin” on a particular occasion,* which serves as a basis for the inference. Yet, in both of these examples, the investigator deals with a dichotomous attribute . . . the presence or absence of “Stalin” in a given unit of communication.
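The contrast drawn in the Stalin example can be made concrete with a small sketch. It is an illustration only: the four coded units, their identifiers, and the function names are invented here and are not taken from George's study. Both indicators rest on the same dichotomous attribute (does the symbol appear in a unit or not); they differ in whether the inference is drawn from a distribution of counts over time or from presence or absence in one designated unit.

```python
from collections import Counter

# Hypothetical coded units: (period, unit_id, text). Invented for illustration.
units = [
    ("1952", "a1", "Stalin leads the party. Stalin guides the people."),
    ("1952", "a2", "The plan was fulfilled under Stalin."),
    ("1953", "b1", "The party leads the people."),
    ("1953", "b2", "Victory in the war was won by the party and the people."),
]


def mentions(text, symbol):
    """Dichotomous attribute: does the symbol appear in this unit at all?"""
    return symbol.lower() in text.lower()


def frequency_indicator(units, symbol):
    """Frequency indicator: number of units per period that mention the symbol."""
    counts = Counter()
    for period, _, text in units:
        counts[period] += int(mentions(text, symbol))
    return dict(counts)


def non_frequency_indicator(units, unit_id, symbol):
    """Non-frequency indicator: presence or absence in one designated unit."""
    for _, uid, text in units:
        if uid == unit_id:
            return mentions(text, symbol)
    raise KeyError(unit_id)


# The frequency analyst reads the distribution over periods.
print(frequency_indicator(units, "Stalin"))
# The non-frequency analyst asks only whether one particular speech mentions him.
print(non_frequency_indicator(units, "b2", "Stalin"))
```

Note that unit "a1" mentions the symbol twice but is counted once, which mirrors the point that a feature occurring more than once in a unit does not oblige the investigator to count it.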

Furthermore, the use of frequency and non-frequency methods is not determined by the fact of multiple or single occurrence of the content feature in question within the communication under examination. The fact that a content feature does occur more than once within a communication does not oblige the investigator to count its frequency. The important fact about that content feature for his inference may be merely that it occurs at all within a prescribed communication.

It should be noted, finally, that the non-frequency approach to content analysis is really an older and more conventional way of interpreting communication and drawing inferences from it than is the quantitative approach. The resemblance of the non-frequency approach to traditional methods of textual analysis, moreover, will become obvious when we consider some of the characteristics of this approach.

SOME EXAMPLES OF NON-FREQUENCY CONTENT ANALYSIS

[Two] examples will illustrate the nature of the non-frequency approach and some of the disadvantages of relying exclusively upon frequency or quantitative content analysis. The examples are drawn from wartime propaganda analyses of German communications made by personnel of the Analysis Division, Foreign Broadcast Intelligence Service, and Federal Communications Commission (FCC).

1. [T]he FCC analyst inferred that the Nazi Propaganda Ministry was attempting to discourage the German public, albeit indirectly, from expecting a resurgence of the U-boats. This inference was also based in part upon a non-frequency content indicator. Hans Fritzsche, the leading radio commentator, had asserted the following in discussing a recent success achieved by German U-boats: “. . . we are not so naive as to indulge in speculation about the future on the basis of the fact of this victory . . .” In focusing upon this statement, the FCC analyst was not concerned with the frequency of the theme in Fritzsche’s talk or in other German propaganda accounts of the same U-boat “victory,” or with the question of whether it now appeared more or less frequently than in earlier propaganda on the U-boats. For his purpose, it sufficed that the content theme was present even once in the context of Fritzsche’s remarks about the latest German U-boat victory.

It is interesting to speculate on what would have happened had a frequency (quantitative) approach been employed in this case. In the first place, it is problematic whether a content category could have been set up that would have precisely caught the meaning of this one phrase in Fritzsche’s talk. Secondly, since the phrase (or its equivalent) appeared at best only a very few times in German propaganda at the time, the propaganda analyst, in looking over the quantitative results, might well have dismissed it as a “minor theme” or lumped it together with other items in a “miscellaneous” or “other” category. In other words, if this single phrase from Fritzsche’s talk had been subsumed under a frequency indicator, it might well have lost its inferential significance.
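A minimal sketch of how this loss can happen in a frequency tabulation follows. The theme labels, the counts, and the minimum-count threshold are all invented for illustration; nothing here reproduces the FCC analysts' actual categories. The sketch shows a tabulation that folds rare themes into an “other” category, next to a presence check that retains the single occurrence.

```python
from collections import Counter

# Hypothetical theme codes assigned to statements; invented for illustration.
coded = (
    ["victory"] * 40
    + ["enemy_losses"] * 25
    + ["heroism"] * 14
    + ["caution_about_future"]  # the single sobering statement
)


def tabulate(codes, min_count):
    """Frequency table that folds themes below min_count into 'other'."""
    raw = Counter(codes)
    table = Counter()
    for theme, n in raw.items():
        table[theme if n >= min_count else "other"] += n
    return dict(table)


def present(codes, theme):
    """Non-frequency check: did the theme occur at all?"""
    return theme in codes


table = tabulate(coded, min_count=5)
print(table)                                   # the rare theme is now inside 'other'
print(present(coded, "caution_about_future"))  # the presence check still finds it
```

The design point is that the threshold is a statistical convenience, not a judgment about inferential weight; the presence check makes no such assumption.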

2. The (second) example is quite similar. [Towards the end of World War II,] Mussolini had set up a Republican-Fascist government following his “liberation” by German parachutists (from his imprisonment). German propaganda gave quite a play to these events and celebrated Mussolini’s re-establishment of a pro-Axis Italian government. In looking over this propaganda, the FCC analyst noted that, after a few days, a minor theme of some sobriety was introduced into the otherwise enthusiastic publicity on Mussolini and his new government. Only a few Nazi papers carried the new message and, where it did appear, it was rather well hidden. For example, in the Völkischer Beobachter, September 9, 1943, the following sentence appeared: “. . . the battle is not yet won by the changes proclaimed by Mussolini, and the structural changes undertaken by him must not be regarded as a guarantee of future greatness.”

*In the context of what is expected, a known norm.

The significance of this new theme to the FCC analyst lay in the fact that it appeared at all. In other words, he made use of it as a non-frequency content indicator. Had this new theme been subsumed under a general frequency-type indicator in a quantitative study, it would probably have passed unnoticed. But when singled out as a non-frequency indicator, the theme, although repeated only a few times in the total Nazi propaganda on Mussolini, provided the basis for an important inference. The FCC analyst inferred, from the appearance of the theme, that the Propaganda Ministry had decided to moderate the public’s expectations regarding a resurgence of Italian fascism. Continuing the chain of inference, the FCC analyst then reasoned that such a propaganda goal must have been adopted as the result of a new, more sober estimate by Nazi leaders of the potential of Mussolini’s new government. This inference was subsequently verified [by] material appearing in The Goebbels Diaries (Lochner, 1948).

SOME DIFFICULTIES IN APPLYING QUANTITATIVE CONTENT ANALYSIS FOR THE STUDY OF INSTRUMENTAL ASPECTS OF COMMUNICATION

[I]n such fields as clinical psychiatry and propaganda analysis, content analysis is often used as a diagnostic tool for making causal interpretations about a single goal-oriented communication. In order to identify and explore some of the special problems that arise in this type of content analysis, we shall further examine the case of propaganda analysis. Other communication analyses which operate within the framework of an instrumental model may encounter similar problems. We will discuss the following: (1) the problem of coding irrelevant content, (2) the problem of changes in the speaker’s strategy, (3) the problem of an expanding universe of relevant communication, and (4) the problem of structural characteristics of instrumental communication.*

These problems arise in part from the characteristics of propaganda communication, and in part [from] the investigator’s interest in making specific inferences about some aspect of the communicator’s purposive behavior. In any case, the result is that a considerable portion of the research effort must be given to discovering new hypotheses or refining old ones; systematic quantitative analysis for purposes of testing inferential hypotheses is often difficult, infeasible, or unnecessary; and, finally, non-quantitative (non-frequency) content indicators are often more appropriate and productive than quantitative (frequency) indicators.

The Problem of Coding Irrelevant Content

A variety of specific goals and strategies are usually pursued in propaganda communications. The propaganda analyst, however, may be interested in making inferences only about one or a few matters of policy interest. Accordingly, he must exercise care in considering which passages in the stream of communication are relevant to each of the goals or strategies of the communicator.

The difficulty of arriving at such judgments of relevance, and the considerable sensitivity and discrimination which are required for this purpose, are often reasons for not undertaking elaborate quantitative “fishing expeditions” in any sizable body of propaganda communications. Not all the individual items tabulated under any given content category in such a “fishing expedition” may be relevant to the specific inference which the analyst would like to make about the speaker’s state of mind.

*These issues are just as relevant today in light of the use of content analysis techniques on the Web.

We are referring here obliquely to one of the important requirements of statistical content analysis, namely, that it be “systematic” in the sense that “all of the relevant content . . . be analyzed in terms of all of the relevant categories, for the problem at hand” (Berelson, 1952:17).

But the obverse of this requirement (that none of the irrelevant content be analyzed) is equally important and is a weighty reason for not undertaking elaborate quantitative content analyses of the “fishing expedition” variety. . . . In some cases the inclusion of irrelevant content in the analysis may be no more than a waste of manpower. But in other cases, it may rule out the possibility of making a useful inference or lead to wholly mistaken inferences. The problem may become particularly acute when the investigator, engaged in a “fishing expedition” of this sort, deliberately selects broad content categories in order to ensure large enough frequencies for purposes of subsequent statistical analysis.*

The danger of coding irrelevant content is minimized when research is designed to test clear-cut hypotheses. Hypotheses usually indicate or imply the realm of relevant content or the appropriate sample to be coded.

However, precise hypothesis formation (the assertion of a relationship between a content indicator and one or more communicator variables) is often difficult in propaganda analysis. This difficulty reflects the rudimentary state of the scientific study of communication. The lack of good hypotheses about relationships between content variables and communicator variables makes it difficult for the propaganda analyst to circumscribe the terms and categories for specific investigations.

This difficulty, of course, is by no means confined to propaganda analysis. . . . In a sober assessment of the results of their large-scale study of symbols as indices of political values, attitudes, and ideological dispositions, [Lasswell lamented]:

. . . there is as yet no good theory of symbolic communication by which to predict how given values, attitudes, or ideologies will be expressed in manifest symbols. The extant theories tend to deal with values, attitudes, and ideologies as the ultimate units, not with the symbolic atoms of which they are composed. There is almost no theory of language that predicts the specific words one will emit in the course of expressing the contents of his thoughts. Theories in philosophy or in the sociology of knowledge sometimes enable us to predict ideas that will be expressed by persons with certain other ideas or social characteristics. But little thought has been given to predicting the specific words in which these ideas will be cloaked. The content analyst, therefore, does not know what to expect. (Lasswell, Lerner, & Pool, 1952:49)

In summary, there are relatively few theories or general hypotheses about symbolic behavior available for testing by means of rigorous quantitative content analysis. . . . [S]ome investigators, [therefore] employ quantitative content analysis for purposes of a “fishing expedition”; large quantities of content data are collected without guidance of clear-cut hypotheses in the hope of discovering, at the end of the study, new relationships and new hypotheses. Such studies tend to be time consuming, wasteful, and generally unproductive. Disappointing results with “fishing expeditions” are particularly likely when large quantities of material are processed (aided by) clerical personnel to do the coding. As a result (of being locked into a fixed coding instrument), there is insufficient opportunity to refine categories and it is usually not possible to recode the bulky material as many times as necessary in order to produce content data appropriate for testing interesting hypotheses.


*And thereby failing to record the distinctions that may prove critical in supporting the desired inferences.


The Problem of Changes in the Speaker’s Strategy

Due to the circumstances which have been described, the “qualitative” phase of hypothesis formation may properly receive unusual emphasis in propaganda analysis. [This is justified by] the fact that the propagandist’s strategy on any single subject may change abruptly at any time. In attempting inferences about the speaker’s state of mind, . . . the analyst cannot easily draw up a set of content categories which will be appropriate for all possible shifts in the communication strategy of the speaker. [He] will hesitate to commit himself to systematic quantitative description because he fears that the speaker’s strategy may change while the count is being made. If such a change is unnoticed . . ., the value of the results of the quantitative tabulation . . . may be lost. For then, such content data might well be ambiguous or inappropriate for purposes of inference.

In propaganda analysis, the instrumental use to which communication is put by the speaker is regarded as a highly unstable variable, which intervenes between various other antecedent conditions of communication (e.g., speaker’s attitude and state of mind, the conditions and calculations which have affected choice of action) and the content variable itself. In this respect, propaganda analysis has much in common with the analysis of psychotherapy protocols. Both the propaganda analyst and the psychotherapist are sensitive to the possibility that the communication intention and strategy of the speaker can change frequently during . . . a systematic count of the content features of what he says. . . . [E]xcept when there is reason to believe that the content features selected as indicators are insensitive to variations in the speaker’s strategy, frequency counts may be inappropriate as a means of inferring the speaker’s attitudes, state of mind, and . . . conditions that have influenced his choice of a communication strategy or goal.

In propaganda analysis, typically, the investigator is interested in inferring one or more of the following antecedent conditions of the propagandist’s communication: his propaganda goals and techniques; the estimates, expectations, and policy intentions of the leadership group for whom the propagandist is speaking which have influenced the adoption of a particular propaganda strategy; the situational factors or changes which have influenced the leadership’s estimates, expectations, and policy intentions and/or the propagandist’s choice of communication goals and techniques.

Investigators interested in inferring elite estimates, expectations, policy intentions, and/or situational factors which lay behind the adoption of a particular propaganda goal or strategy may employ one of two rather different methods of inference. They can attempt to find content indicators which directly reflect the component of elite behavior or the situational factor in which they are interested, or they can attempt first to infer the speaker’s propaganda goal and then proceed step by step to account for the selection of that goal in terms of elite estimates, expectations, policy intentions and/or situational factors.

The first of these two methods of inference bypasses consideration of the speaker’s propaganda strategy. The inferences made with this direct method are one-step inferences, as follows:

[Figure 1. One-step inferences of the direct method. Each pairs a content indicator with one of the following: elite policy intentions, elite expectations, elite estimate, situational factor.]

In contrast, the indirect method is comprised of an inferential chain of two or more steps, the first of which is always an inference about the speaker’s goal or strategy. It may be depicted, in somewhat simplified form, as follows:

[Figure 2. Inferential chain of the indirect method, linking the following elements: situational factor, elite estimate, elite expectation, elite policy intention, speaker’s propaganda goal, content indicator.]

The direct method . . . can be successfully employed only if content features can be found which occur regularly and only when a certain type of elite policy intention, expectation, estimate, or situational factor occurs. The types of regularities or generalizations that the direct method requires as a basis for inferences, therefore, are correlations of a non-causal character. It is important to recognize that the content terms in such correlations must be insensitive to possible variations in propaganda strategy. This is necessary because propaganda strategy is an intervening and relatively unstable variable between elite policy behavior and propaganda content. The direct method is on firm ground only when it employs as content indicators features of the communication over which the propagandist does not exercise control or of whose information-giving value regarding elite policy behavior he remains unaware. Such content features are likely to be symptomatic features of a propagandist’s behavior rather than part of his communication intention.2

The indirect method, on the other hand, attempts to utilize for purposes of inference the fact that the behavior of the propagandist in selecting communications goals and strategies constitutes an intervening set of events between elite policy behavior and the dependent variable (content of propaganda). Therefore, the investigator who employs the indirect method attempts to identify content features in the propaganda which are sensitive to and dependent upon the speaker’s strategy.
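The structural difference between the two methods can be sketched as a small data structure. This is only an illustration under stated assumptions: the `Node` type, the `kind` labels, and the `method_of` function are hypothetical names introduced here, and the example chain paraphrases the Mussolini case described earlier rather than reproducing any coding scheme from the source.

```python
from typing import NamedTuple


class Node(NamedTuple):
    kind: str  # e.g. "content_indicator", "propaganda_goal", "elite_estimate"
    text: str


def method_of(chain):
    """Classify a chain of inference as 'direct' or 'indirect'.

    direct:   content indicator -> elite or situational variable (one step)
    indirect: content indicator -> speaker's goal -> ... (two or more steps)
    """
    if len(chain) < 2 or chain[0].kind != "content_indicator":
        raise ValueError("a chain must start at a content indicator and reach a conclusion")
    goes_via_goal = chain[1].kind == "propaganda_goal"
    if len(chain) == 2 and not goes_via_goal:
        return "direct"
    if len(chain) >= 3 and goes_via_goal:
        return "indirect"
    raise ValueError("chain fits neither the direct nor the indirect pattern")


# Paraphrase of the Mussolini example as an indirect chain.
indirect = [
    Node("content_indicator", "sober theme appears in coverage of the new government"),
    Node("propaganda_goal", "moderate public expectations of a fascist resurgence"),
    Node("elite_estimate", "leaders now rate the new government's potential more soberly"),
]
# The same conclusion reached in one step would bypass the speaker's goal.
direct = [indirect[0], indirect[2]]

print(method_of(indirect))
print(method_of(direct))
```

The check on the second node encodes the one constraint the text states explicitly: in the indirect method the first inference is always about the speaker's goal or strategy.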

The distinction between content features that are sensitive and insensitive to variations in the speaker’s strategy is useful not only in propaganda analysis, but whenever content analysis is used on instrumentally manipulated material. In such cases, obtaining good indices of attitude, value, etc., would seem to require either the avoidance of content features which are likely to be sensitive in the first instance to variations in the speaker’s communication strategy, or a sophisticated awareness of that strategy.

The Problem of an Expanding Universe of Relevant Communication

Another characteristic of propaganda which has procedural implications is the fact that the universe of relevant communication may be expanding while [its] analyst is attempting to draw inferences from it. . . . [Under these conditions], the propaganda analyst finds himself trying to keep up with the flow of communication that has some relevance to his problem. . . . [A]s new statements on the topic are made by the source, . . . the set of alternative hypotheses under consideration and [corresponding] content categories [have to be revised]. . . .

These circumstances frequently rule out quantitative content description. A familiar prerequisite of quantitative content analysis is that the investigator knows what he is looking for before beginning to count. The propaganda analyst (who relies on quantitative accounts of communications) . . . cannot be confident that the data provided will still be adequate for purposes of inference when new statements on the topic are received from the source. For the most recent communication may throw new light on the inferential problem and, based on these new insights, the propaganda analyst may have to reread and reinterpret the earlier propaganda communications in their original form. . . .

A similar problem arises, we may note, when information about non-content events bearing upon the inferential problem comes to the attention of the propaganda analyst after he has received and analyzed the relevant propaganda communication. Such non-content events may permit the analyst retrospectively to formulate more discriminating hypotheses about the inferential significance of the propaganda. And, for this purpose, it may be necessary for him to reread and reappraise the propaganda communication in its original form, in the light of the new information available on relevant non-content events.

The Problem of Structural Characteristics of Instrumental Communication

Propaganda analysis procedures are much influenced, finally, by the necessity to take into account the structure of individual propaganda communications. Different structural types of communication are encountered in the flow of propaganda available for analysis. An article by Goebbels appearing in Das Reich, for example, was structurally different from a speech by Hitler; and both, certainly, were structurally different from German radio news broadcasts.

The propaganda intention of an individual communication (and its effect as well) often depends not merely on the explicit content of the individual statements or propositions therein contained but also upon the structural interrelationship of these statements within that communication.

Thus, what may be called the “whole-part” problem in content analysis has several important procedural implications. It may affect [the] choice of counting units and categories as well as the decision on the type of content indicator (frequency or non-frequency) to be employed.

Awareness of the “whole-part” problem often leads the propaganda analyst to be critical of an important implicit assumption of statistical content analyses, namely, that each individual item counted as falling under a designated content category is of equal significance for purposes of inference.3 Similarly, the propaganda analyst is often critical of the assumption that the inferential significance of explicit propositions, themes, or statements is dependent upon the precise frequency of their occurrence. Rather, he may find explicit propositions of significance for purposes of ascertaining the strategy of the propagandist because they occur at all or because they occur in a certain relationship to each other within the communication.

This does not mean that frequency counts are useless for purposes of propaganda analysis. Frequency tabulation of words, clichés, stereotypes, and slogans may provide an indication of propaganda emphasis and techniques as well as of intentions. But such tabulations in themselves give no clue to the meaning of the content in question. They are of value, therefore, only when the investigator has prior or independent knowledge of their meaning, role, and significance in the system of language habits under study.4

. . . The procedure employed in ascertaining the propositional content of a propaganda communication and in weighing the structural interrelationships of parts therein undoubtedly is often less systematic than in rigorous quantitative content analysis in which coding judgments are closely prescribed. But in principle, the reliability of such content observations, too, is subject to investigation.

SOME CHARACTERISTICS AND SPECIAL PROBLEMS OF THE NON-FREQUENCY APPROACH

The preceding discussion has already suggested some of the characteristics of non-frequency analysis of instrumental communications. In this section, we recapitulate these characteristics briefly and single out for more extended comment the special problems to which they give rise.


Selection of Content Categories: The Search for Specific Discriminating Categories

In some quantitative investigations, the technical requirement of relatively large numbers for statistical analysis appears to exercise an important influence on the choice of content categories and on the size of the sample of raw material to be coded. Symbols and themes with low frequency of occurrence may be either ignored or grouped together under broader content categories.

The conscious selection of content categories and sample size with an eye to satisfying technical requirements of statistical analysis may be justified when the research objective is to make general inferences. But such a criterion is inappropriate when, as in propaganda analysis, the object is to make specific inferences about events at particular times and places. In the latter case, valuable opportunities for making inferences are lost. . . .

The investigator who is aware of the value of non-frequency indicators tries, rather, to formulate ever more discriminating content categories. He deliberately attempts to “narrow down” the categories and to make them relatively more specific. The fact that this results in low frequencies, in a single occurrence, or in no occurrence at all of the content feature in question is not of concern to him since he expects to employ a non-frequency indicator for purposes of inference. . . .
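The contrast between the two kinds of indicator can be made concrete with a small sketch. Everything named here is a hypothetical illustration rather than part of the original discussion: the function names, the threshold, and the toy broadcast texts are invented. A frequency indicator fires only when a category is counted often enough, whereas a non-frequency indicator fires on a single occurrence of a narrowly defined content feature.

```python
def frequency_indicator(documents, phrase, min_count):
    """Frequency indicator: True when the phrase is counted at least min_count times."""
    total = sum(doc.lower().count(phrase.lower()) for doc in documents)
    return total >= min_count


def presence_indicator(documents, phrase):
    """Non-frequency indicator: True if the phrase occurs at all."""
    return any(phrase.lower() in doc.lower() for doc in documents)


# Invented toy material; the narrow category occurs exactly once.
broadcasts = [
    "The enemy will be repaid in kind.",
    "Our forces stand ready on every front.",
    "A new weapon of retaliation is being prepared.",
]
narrow_category = "weapon of retaliation"

seen_often = frequency_indicator(broadcasts, narrow_category, min_count=5)
seen_at_all = presence_indicator(broadcasts, narrow_category)
```

With a threshold of five, the frequency indicator stays silent on this material, while the presence indicator registers the single occurrence that the non-frequency analyst would treat as inferentially significant.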

Emphasis on Hypothesis Formation as Against Hypothesis Testing

Perhaps more so than in most frequency analyses, the investigator who employs the non-frequency approach gives unusual attention and effort to the hypothesis-formation phase of research. There are a number of reasons for this emphasis to which we have already alluded: the search for more discriminating categories, the need to exclude irrelevant content, and, of course, the rudimentary state of knowledge and theory about the relationships between content and communicator variables.

Relative Emphasis Upon Validity as Against Reliability of Semantical Content Description

The non-frequency approach places more emphasis upon obtaining valid estimates of the speaker’s intended meaning than do many versions of quantitative content analysis. Because he usually deals with relatively large frequencies, the quantitative investigator can work with somewhat lower validity requirements and can (and must) pay greater attention to reliability considerations. The inclusion of a small number of incorrect determinations of the speaker’s intended meaning under a content category composed of large frequencies will probably not affect the final analysis greatly. In contrast, just because he works with low frequencies or single occurrences, the non-frequency analyst cannot afford to risk making any invalid determinations of the speaker’s intended meaning.

Given the crucial importance in non-frequency analysis of validly estimating the one or few meanings which may be of inferential significance, the investigator concentrates upon making an intensive assessment of contextual factors upon which such meanings are likely to depend. This type of assessment, however, is particularly difficult to objectify, for it requires taking into account the situational and behavioral, as well as the linguistic, contexts of given words. Accordingly, the procedure for inferring the speaker’s precise intended meanings cannot easily be made fully explicit. Investigators who attempt to infer intended meanings must usually settle for relatively flexible and interpretative procedures of coding content.

Each judgment of an intended meaning is a separate inference arrived at by taking into account not only dictionary meanings and rules of the language, but also all relevant aspects of the context: situational and behavioral as well as linguistic. The concern with inferring intended meaning in this manner does not distinguish the non-frequency approach from all frequency analysis. But it does serve to differentiate it radically from one variant of the quantitative approach known as “manifest” content analysis.*

. . . [I]n “manifest” content analysis the investigator estimates the meanings of words by applying a set of external criteria as to the usual, customary, or most frequent meaning of the words in question.5 Such a judgment or estimate of meaning is not [situation] specific. . . . [Use of such criteria] increases the objectivity of the content-descriptive procedure and facilitates achieving reliability of results, but . . . may seriously prejudice the validity of results if the intended meanings . . . differ . . . from the meanings which those words ordinarily have.6

In coding content for its usual, or “manifest,” meaning, the investigator needs to be familiar with the general rules of the language, the customary meanings of words for all users of that language, and (in some varieties of quantitative content analysis) with the usual or most frequent language habits of the communicator.

. . . [T]o make valid inference of intended meaning in each specific instance of communication, the investigator also takes into account the situational and behavioral contexts of that communication. He does so in order to determine which of the possible meanings of the words in question the speaker intends to convey in the instance at hand and the precise shading of his intended meaning. . . .

In taking into account the behavioral context of words, the investigator considers the instrumental aspect of the communication in its broad action setting. In order to interpret the precise meaning intended by the speaker in any individual instance he takes into account the purpose or objective which the specific communication is designed to achieve.

In taking into account the situational context of the communication being analyzed the investigator considers who is speaking, to whom, and under what circumstances. Clues to the speaker’s intended meanings [are obtained] by considering various known characteristics of the speaker, his audience, and the nature of the speaker-audience relationship. The investigator also takes into account the time and place of the communication and related events preceding or accompanying it. He does so in the expectation that the exact intended meanings of the words employed by the speaker are shaped by (and understood by the audience with reference to) certain aspects of the setting and the stream of related events. This is particularly likely when, as in wartime propaganda, the communication being analyzed is highly situation- or event-oriented.

Such analysis of the instrumental aspect of communications in their situational contexts is not confined to non-frequency approaches, for it is by no means the case that quantitative or frequency analysis always limits itself to coding “manifest” content. In fact, the criterion of “manifest” content is not generally accepted as essential to the technique of quantitative content analysis. Cartwright (1953) explicitly rejects it, for example. Even in the area of political communication research, for which Lasswell’s version of content analysis has been primarily developed, many if not most quantitative content analyses do not in practice employ the “manifest” content criterion. Rather, they often attempt to infer intended meanings and employ relatively flexible and interpretative procedures for coding content. If this is a drawback from a scientific standpoint, it is one present in many quantitative studies as well as in non-frequency analyses. The difference in this respect between frequency and non-frequency analyses, therefore, seems to be one of degree, stemming partly from the different nature of the two approaches. The non-frequency approach, by virtue of the limited number of cases with which it deals, requires that all relevant intended meanings always be estimated as validly as possible and, therefore, that full account be taken of situational and behavioral contexts.

*Berelson’s (1952:14–18) definition of content analysis restricts the technique to the analysis of the manifest content of communications, commonly equated with the existence of widespread agreement on what a communication means. Since agreement is also equated with reliability, Berelson concludes that only manifest content can be analyzed reliably. He thus confounds the methodological requirement of reliability with the conceptual distinction between manifest and latent content. Experts may well achieve high reliability in coding latent meanings.

The emphasis on validity in the non-frequency approach is accompanied by less concern with the reliability of the judgments, or inferences, being made of the speaker’s intended meanings. Rarely are systematic procedures employed to ensure or to demonstrate the reliability of non-frequency content descriptions. A possible explanation for this may be suggested. Since the non-frequency analyst works with a relatively small amount of content data, which he collects himself, he tends to be less self-conscious about the reliability problem and less concerned with it than the quantitative analyst is. As a result, he is likely to overlook what the well-trained quantitative investigator knows so well and rightly emphasizes, namely, that when the procedure for obtaining content data on intended meanings is highly interpretative, it is all the more necessary to assess in some fashion the reliability of its results.

CLOSE RELATIONSHIP BETWEEN DESCRIPTIVE AND INFERENTIAL PROCEDURES

Some possible circularities of procedure beleaguer the non-frequency approach to the analysis of communications content. . . . [C]ontent description is intimately intertwined with and overlaps the assessment of inferences from the contents. Inferences as to what the propagandist is trying to say and why he is trying to say it are not neatly discrete.

To illustrate, if one person addresses another as “you old rascal,” the analyst who is seeking to interpret the intent validly will want to know if the addressee is an old man or an infant. If it is a baby, one infers that the intent is affectionate and simultaneously describes the content as endearment. There is a mutually interdependent set of assumptions here. One has not established the intent independently and derived the content interpretation from that, nor has one established the affectionate meaning of the phrase “you old rascal” independently and derived the intent from that. The two propositions are parts of an interdependent set of inferential hypotheses.

The question arises of whether this aspect of the non-frequency approach necessarily entails the danger of analytical bias, or “circularity.” That is, by not distinguishing more sharply (as in quantitative content analysis) between the descriptive and inferential phases of research, does not the investigator risk the possibility that a hypothesis formulated early in the course of his content description will determine what he subsequently “sees” and regards as significant in the communication?

The danger of circularity in this sense is indeed potentially present in the procedure described above and undoubtedly occurs in many low-grade analyses. However, the disciplined analyst guards against it in several ways. He does not read through the communication material just once, but rereads it as many times as necessary to satisfy himself that the inference favored by him is consonant with all of the relevant portions and characteristics of the original communication material. Similarly, he considers not just one inferential hypothesis when reading and rereading the original communication material, but also many alternatives to it.

He systematically weighs the evidence available for and against each of these alternative inferences. Thus, the results of his analysis, if fully explicated, state not merely 1) the favored inference and the content “evidence” for it, but also 2) alternative explanations of that content “evidence,” 3) other content “evidence” which may support alternative inferences, and 4) reasons for considering one inferential hypothesis more plausible than others.

In this fashion, the disciplined analyst controls the dangers of circularity present in the overlapping of descriptive and inferential procedures. To the extent that he operates in the systematic, disciplined fashion we have outlined, the non-frequency analyst follows the accepted scientific procedure of successive approximations.


NOTES

1. For a brief exposition of quantitative content analysis and some of its uses in the study of political communication, see Lasswell, Lerner, and Pool (1952). These authors’ sober assessment of the difficulty of meeting various prerequisites of statistical content analysis is particularly useful.

2. On the difference between interpretation of “intent” and interpretation of “symptoms” in the analysis of communication, see Kecskemeti (1952:61–62).

3. For an explicit statement of this important (but often ignored) assumption of quantitative content analysis, see Berelson (1952:20).

4. This point is explicitly discussed in Goldstein (1942:26–27, 38–40, 150).

5. For a detailed discussion, see Lazarsfeld and Barton (1951:155–192). See also Cartwright (1953).

6. The problem of validity noted here arises only when “manifest” content is used as a rule-of-thumb substitute for intended meaning. It does not arise, of course, when the investigator is interested only in the usual meanings of words, as in linguistic studies or in studies of effect on a mass audience rather than of intent of the communicator. In other words, depending upon the hypotheses and questions which are being investigated, the analyst may be interested either in “manifest” meaning or intended meaning. And, when interested in intended meaning, he may infer it directly in each instance or employ “manifest” meaning as a rough approximation.

REFERENCES

Berelson, B. (1952). Content analysis in communication research. Glencoe, IL: Free Press.

Cartwright, D. P. (1953). Analysis of qualitative material. In L. Festinger & D. Katz (Eds.), Research methods in the behavioral sciences (pp. 421–470). New York: Holt, Rinehart & Winston.

Goldstein, J. (1942). Content analysis: A propaganda and opinion study. Unpublished master’s thesis, New School for Social Research, New York.

Kecskemeti, P. (1952). Meaning, communication and value. Chicago: University of Chicago Press.

Lasswell, H. D., Lerner, D., & Pool, I. de Sola. (1952). The comparative study of symbols. Stanford, CA: Hoover Institute Studies, Ser. C: Symbols.

Lazarsfeld, P. F., & Barton, A. H. (1951). Qualitative measurement in the social sciences: Classification, typologies and indices. In D. Lerner & H. D. Lasswell (Eds.), The policy sciences: Recent developments in scope and method (pp. 155–192). Stanford, CA: Stanford University Press.

Lochner, L. (Ed.). (1948). The Goebbels diaries. Garden City, NY: Doubleday.


3.5 EVALUATIVE ASSERTION ANALYSIS

OLE R. HOLSTI*

INTRODUCTION

Content analysis, which may be performed using many different techniques, depending upon the theoretical interests of the investigator, is used as a tool for research in international conflict on the premise that from . . . the decision-makers’ messages, valid inference may be drawn concerning the attitudes of the speaker or writer. [A] method of content analysis must fulfill a number of requirements.

1. It must provide valid results.

2. It must provide reliable results.

3. It must provide results that are capable of quantification. A continuing study of world tension levels, for example, requires a technique that provides not only a measure of the appearance or non-appearance of certain attitudes, but also of the intensity of those attitudes.

One method meeting these requirements is “evaluative assertion analysis,”1 a form of quantitative content analysis in which messages are translated into simple, three-element assertive format. Numerical values are then assigned to the constituent elements of each assertion, depending upon its direction and intensity.

Evaluative assertion analysis is not merely a technique for scaling previously coded data. Rather, it is an all-inclusive method of content analysis; as such, it prescribes comprehensive rules for each step from the initial preparation of the written text through the final analysis of the processed data.

This technique was designed for the study of evaluative attitudes on a “good-bad” continuum; its senior author has demonstrated elsewhere, through factor analysis of semantic differentials, that the good-bad, active-passive, and strong-weak dimensions dominate human expression (Osgood, Suci, & Tannenbaum, 1957:50–51, 72–73). The mechanics of the method, however, are suitable for analysis of any dimension defined as a continuum between polar opposites. The technique is also readily adaptable for measuring categories that are defined, as in Q-Sort scaling, as a single “more-to-less” continuum. For the study of international conflict, relevant variables, in addition to those mentioned above, include: hostility-friendship, satisfaction-frustration, strength-weakness, specificity-diffuseness, and violence-nonviolence. Any dimensions chosen for analysis must, of course, be explicitly defined.

*From Holsti, O. R. (1963). Evaluative assertion analysis. In R. C. North, O. R. Holsti, M. G. Zaninovich, & D. Zinnes (Eds.), Content analysis: A handbook with applications for the study of international crisis (pp. 91–102). Evanston, IL: Northwestern University Press.

As with all kinds of content analyses, evaluative assertion analysis rests upon certain minimal premises regarding (1) the structure of messages, and (2) the operations that can be undertaken by reasonably skilled coders with an acceptable degree of reliability. These assumptions have, however, been empirically shown to be valid (Osgood, Saporta, & Nunnally, 1956:47–48).

It will suffice [here] . . . to outline very briefly the primary characteristics of this technique. A reader contemplating the use of evaluative assertion analysis should turn to the original source for a comprehensive description. The summary below will serve as an introduction to the method and as the basis for discussing its utility in research on international conflict (Osgood, 1959; Osgood et al., 1956).

CODING AND SCALING

The steps for converting unedited messages into the quantified data against which hypotheses can be tested are as follows:

1. The initial step in evaluative assertion analysis is the identification and isolation of attitude objects in relation to the variables under study (Osgood et al., 1956:49). Attitude objects are symbols whose evaluative meanings vary from person to person; for example, capitalism, foreign aid, United Nations, Khrushchev. Common-meaning terms are those whose evaluative meanings vary minimally; for example, evil, honest, benevolent. In general, terms that are capitalized are attitude objects rather than common-meaning terms.

2. After attitude objects (which might include nations, policies, ideologies, decision-makers, non-national organizations or general symbols) have been identified by the coders, they are masked with meaningless symbols. For example, the text of a Soviet note to the United States Government states that,

In recent days, fascistic elements with the obvious connivance of the United States occupation authorities have carried out in the American sector of West Berlin a series of dangerous provocations against members of the honor guard of the Soviet forces.

After masking of attitude objects with nonsense symbols, the edited text would read as follows:

In recent days, fascistic elements with the obvious connivance of the AX occupation authorities have carried out in the AX sector of BY a series of dangerous provocations against members of the honor guard of the CZ forces.

Note that because in the Soviet note the terms “United States” and “American” are interchangeable, both are masked with the same symbol.
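The masking step can be sketched in a few lines of code. This is only an illustration under stated assumptions: the function name `mask_attitude_objects` and the lookup table are hypothetical, and in the method as described the attitude objects are identified by human coders, not by string matching. The table simply records the coders' decisions, including the decision to give “United States” and “American” the same symbol.

```python
import re


def mask_attitude_objects(text, symbol_for):
    """Replace each listed surface form with its nonsense symbol.

    symbol_for maps a surface form (as identified by the coders) to a symbol.
    Longer forms are tried first so that multi-word names are matched whole.
    """
    forms = sorted(symbol_for, key=len, reverse=True)
    pattern = re.compile(r"\b(?:" + "|".join(re.escape(f) for f in forms) + r")\b")
    return pattern.sub(lambda match: symbol_for[match.group(0)], text)


# Coders' decisions for the Soviet note quoted above (hypothetical table layout).
symbol_for = {
    "United States": "AX",
    "American": "AX",      # interchangeable with "United States" in this note
    "West Berlin": "BY",
    "Soviet": "CZ",
}

note = (
    "In recent days, fascistic elements with the obvious connivance of the "
    "United States occupation authorities have carried out in the American "
    "sector of West Berlin a series of dangerous provocations against "
    "members of the honor guard of the Soviet forces."
)

masked = mask_attitude_objects(note, symbol_for)
```

Applied to the note, the sketch yields the edited text given above, with AX, BY, and CZ standing in for the three attitude objects.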

3. Following these initial operations, the masked message is translated into one of two generic assertion forms:

Form A: Attitude Object1 (AO1) / verbal connector (c) / common-meaning term (cm)

Form B: Attitude Object1 (AO1) / verbal connector (c) / Attitude Object2 (AO2)

Comprehensive guides for translation of the text have been prepared (Osgood et al., 1956:59–89), making possible the revision of the most complex sentences. Thus an editorial statement in Jen-min jih-pao: “The treacherous American aggressors are abetting the corrupt ruling circles of Japan,” would be coded as follows:

1. Americans are treacherous (form A)

2. Americans are aggressors (form A)

3. Americans are abetting Japanese ruling circles (form B)

4. Japanese ruling circles are corrupt (form A)

The complete text is typed on a seven-column data chart (Figure 1). Values are then entered in columns 4 and 6 of the data chart. If the project involves the analysis of more than one dimension, assertions should be kept separate either by adding an additional column on the data chart in which identification of the dimension can be made, or by maintaining a separate data chart for each dimension.

1 | 2 | 3 | 4 | 5 | 6 | 7
Source | AO1 | c | Value of Column 3 | cm or AO2 | Value of Column 5 | Product: Columns 4 × 6

Figure 1
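As a minimal sketch of the two generic forms, the four assertions coded above can be held as three-element records and sorted into form A or form B by checking whether the third element is itself an attitude object. The `Assertion` type, the `form_of` helper, and the explicit set of attitude objects are illustrative assumptions, not part of the published coding guides.

```python
from typing import NamedTuple


class Assertion(NamedTuple):
    ao1: str         # attitude object being evaluated (AO1)
    connector: str   # verbal connector (c)
    complement: str  # common-meaning term (form A) or second attitude object (form B)


def form_of(assertion, attitude_objects):
    """Form B if the third element is an attitude object, otherwise form A."""
    return "B" if assertion.complement in attitude_objects else "A"


# Attitude objects as identified by the coders in step 1.
attitude_objects = {"Americans", "Japanese ruling circles"}

coded = [
    Assertion("Americans", "are", "treacherous"),
    Assertion("Americans", "are", "aggressors"),
    Assertion("Americans", "are abetting", "Japanese ruling circles"),
    Assertion("Japanese ruling circles", "are", "corrupt"),
]

forms = [form_of(a, attitude_objects) for a in coded]
```

Only the third assertion links two attitude objects, so it alone is classified as form B; its value cannot be computed until “Japanese ruling circles” has been evaluated from form A assertions.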

4. The next step is to determine the direction or valence and intensity of the attitudes, as expressed in the verbal connector and the common-meaning term. Each of these is rated for both valence (+ or –) and intensity (1, 2 or 3). The direction of the verbal connector depends upon whether the perceived relationship is associative (+) or dissociative (–). The valence of the common-meaning term is determined by whether the expressed attitude lies on the negative or the positive side (however these are defined by the researcher) of the neutral point on the dimensional scale.2

Intensities for the verbal connectors and common-meaning terms are also assigned according to a comprehensive set of guides. For example, most unqualified verbs or verbal phrases in the present tense are given a value of ± 3; verbs with auxiliaries are rated ± 2; and verbs implying only a hypothetical relationship are assigned a value of ± 1. Similarly, common-meaning terms are rated 3, 2, or 1, corresponding roughly to the categories “extremely,” “moderately,” and “slightly.” The assigned values are then entered in columns 4 and 6 of the data chart.
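The rating rules just quoted lend themselves to a small lookup. This is only a sketch: the category labels (`unqualified`, `auxiliary`, `hypothetical`) and the function names are hypothetical shorthand for the cases named in the text, and the full guides cover many more cases. It assumes that 3 is the strongest rating, as the ± 3 given to unqualified verbs suggests.

```python
# Intensity by type of verbal connector, following the examples in the text.
CONNECTOR_INTENSITY = {"unqualified": 3, "auxiliary": 2, "hypothetical": 1}

# Intensity of common-meaning terms (assumption: 3 = "extremely").
CM_INTENSITY = {"extremely": 3, "moderately": 2, "slightly": 1}


def connector_value(kind, associative):
    """Signed value for column 4: + if associative, - if dissociative."""
    sign = 1 if associative else -1
    return sign * CONNECTOR_INTENSITY[kind]


def cm_value(degree, positive):
    """Signed value for column 6 in a form A assertion."""
    sign = 1 if positive else -1
    return sign * CM_INTENSITY[degree]


are = connector_value("unqualified", associative=True)       # +3
might_oppose = connector_value("hypothetical", associative=False)  # -1
corrupt = cm_value("extremely", positive=False)               # -3
```

The signed integers produced here are what a scaler would enter in columns 4 and 6 of the data chart.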

The values for attitude objects are first determined for all assertions in form A; only after the values for attitude objects in assertions of form A have been calculated can the evaluation for assertions in form B be made. In the previous example, assertions 1, 2, and 4 are of type A, whereas assertion 3 (Americans / are abetting / Japanese ruling circles) is in form B. The numerical value of “Japanese ruling circles” is calculated from every assertion of type A in which it appears. In assertion 4 it was stated that,

Japanese ruling circles / are / corrupt

From this and other assertions of a similar nature (AO1 / c / cm) that might appear in the text, it is possible to calculate the perceived evaluation of “Japanese ruling circles” (in this case a strongly negative one). That value is then inserted into assertion 3; thus because the Americans are closely associated (“are abetting”) with the Japanese ruling circles, the value of “Americans” is a strongly negative one.
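The ordering constraint (form A first, form B second) can be sketched as a two-pass calculation. Several things here are assumptions made for illustration only: the helper name `evaluate_from_form_a`, the choice to insert the normalized form A evaluation (sum of products divided by the sum of absolute connector values) into the form B assertion, and the connector value of +3 for “are abetting,” which the text does not state. The values +3 for “are” and −3 for “corrupt” are the ones shown on the data chart for assertion 4.

```python
def evaluate_from_form_a(rows):
    """Evaluate an attitude object from its form A assertions.

    rows is a list of (connector_value, cm_value) pairs.
    """
    products = sum(c * cm for c, cm in rows)
    modular_sum = sum(abs(c) for c, _ in rows)
    return products / modular_sum


# Pass 1: form A only.  Japanese ruling circles / are (+3) / corrupt (-3)
japanese_ruling_circles = evaluate_from_form_a([(3, -3)])

# Pass 2: insert that value into the form B assertion
# Americans / are abetting / Japanese ruling circles.
ARE_ABETTING = 3  # assumed connector value, not given in the source
americans_from_form_b = ARE_ABETTING * japanese_ruling_circles
```

Because the connector is associative (positive) and the inserted value is negative, the product is negative: association with a negatively evaluated object lowers the evaluation of “Americans.”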

The reader may ask, “What if the text is composed entirely of assertions in form B, making it impossible to determine any values?” This can only occur in messages devoid of any adjectives or adjectival phrases. Thus, it is difficult to imagine an extensive communication in which every assertion is of the AO1 / c / AO2 type.

5. The scaling of an attitude object on any dimension is the sum of its evaluation in assertions of form A and form B. In each case the value is the product of the second (verbal connector) and third element (common-meaning term [form A] or attitude object2 [form B]). For example, assertion 4 above would appear as follows on the data chart (Figure 2):

1 | 2 | 3 | 4 | 5 | 6 | 7
Source | AO1 | c | Value of Column 3 | cm or AO2 | Value of Column 5 | Product: Columns 4 × 6
Jen-min jih-pao | Japanese ruling circles | are | +3 | corrupt | −3 | −9

Figure 2

The reason for multiplying the values in columns 4 and 6 is to assure the proper valence or direction of the final evaluation; thus the double negative (X is not bad) assertion will receive the same value as the double positive (X is good) assertion.
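A short sketch shows why the product, rather than a sum, preserves direction. The first line reproduces the row for assertion 4 on the data chart; the intensities of 3 assumed for “not,” “bad,” and “good” are illustrative, chosen equal so that the two assertions are comparable.

```python
def product(connector_value, complement_value):
    """Column 7 of the data chart: column 4 times column 6."""
    return connector_value * complement_value


# Japanese ruling circles / are (+3) / corrupt (-3)
assertion_4 = product(+3, -3)

# X / is not (-3, dissociative) / bad (-3): a double negative
not_bad = product(-3, -3)

# X / is (+3, associative) / good (+3): a double positive
is_good = product(+3, +3)
```

The double negative and the double positive both come out at +9, whereas adding the two columns would have given −6 and +6.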

The final evaluation of each attitude object is calculated in three steps:

1. All values in column seven for assertions of type A are summed.

2. All values in column seven for assertions of type B are summed.

3. The total of the values derived in steps 1 and 2 is then divided by the modular sum of column three (that is, the sum of the absolute values assigned to the verbal connectors).
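The three steps above can be sketched as a single function. The `Row` type and the sample rows are hypothetical, introduced only to make the arithmetic visible; the “modular sum” is taken here to mean the sum of the absolute connector values entered in column 4.

```python
from dataclasses import dataclass


@dataclass
class Row:
    form: str      # "A" or "B"
    c: int         # value of the verbal connector (column 4)
    value: float   # value of the cm term or of AO2 (column 6)


def final_evaluation(rows):
    """Final evaluation of one attitude object from its data-chart rows."""
    sum_a = sum(r.c * r.value for r in rows if r.form == "A")  # step 1
    sum_b = sum(r.c * r.value for r in rows if r.form == "B")  # step 2
    modular_sum = sum(abs(r.c) for r in rows)
    if modular_sum == 0:
        raise ValueError("no scored assertions for this attitude object")
    return (sum_a + sum_b) / modular_sum                       # step 3


# Invented rows for a single attitude object.
rows = [
    Row("A", 3, -3),    # product -9
    Row("A", 2, -2),    # product -4
    Row("B", 3, -3.0),  # product -9, AO2 already evaluated from form A
]
evaluation = final_evaluation(rows)  # (-13 + -9) / 8
```

Dividing by the modular sum keeps the result on the same −3 to +3 scale as the individual ratings, whatever the number of assertions.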

The final evaluation may be expressed algebraically as,3

$$\text{Evaluation}_{AO_1} = \frac{\sum_{i=1}^{n} c_i\, cm_i + \sum_{i=1}^{n} c_i\, (AO_2)_i}{\sum |c|_{cm} + \sum |c|_{AO_2}}$$

APPLICATIONS OF EVALUATIVE ASSERTION ANALYSIS

The results of the completed analysis may be aggregated in a variety of ways. For some projects, it might be useful to compare single documents, whereas for others the analyst may be interested in compiling totals for all documents within prescribed time periods. In other cases, it may be desirable to combine results in terms of the senders or recipients of the messages.4 Such a decision will, of course, be dictated by the nature of the research problem.

A number of objections may be raised against evaluative assertion analysis. In the first place, the method is admittedly time consuming.5 A second point is that the translation of the text into assertion form leads to some loss in the “flavor” of the original message.

There is some weight in both objections, but the technique has many compensating advantages. By translating all messages into assertion form, much is gained by providing a high degree of uniformity for the judges who must do the scaling. Three major sources of low reliability are (1) the ambiguity of categories, (2) confusion over the perceived roles of various attitude objects within a sentence, and (3) difficulty in assigning numerical values to complex statements. Each of these points will be considered in terms of evaluative assertion analysis.

The first problem is primarily a theoretical one and precedes the coding stage. However, a technique which reduces each sentence to its constituent elements eliminates the possibility of more than one dimension appearing in any one assertion. This may be illustrated by a typical Chinese statement during the U-2 crisis: “The Chinese people firmly support the stand of the Soviet Government in opposing United States imperialism’s war provocation and its sabotage of the Summit Conference.” This sentence consists of a number of attitude objects in a complex relationship. In addition, the sentence contains elements of friendship (firmly support), hostility (oppose, war provocation, sabotage), evaluation (just stand), and policy conditions (firmly support, war provocation, sabotage, opposing). The unedited text clearly poses a problem for the scaler; when coded in assertion form, which separates the various elements, the difficulties are materially reduced.


Chinese people / firmly support / Soviet Government

Soviet Government’s stand / is / just

Soviet Government / opposes / the United States

United States / is / imperialistic

United States / provokes / war

United States / sabotaged / Summit Conference

A second source of difficulty with many techniques, arising usually after a sentence has been masked, is the possibility of confusing the perceived roles of the various attitude objects in any sentence. In the statement cited above, for example, there are three actors (the Chinese people, the Soviet Government, and the United States) and maintaining their perceived relationship is of crucial importance. The translation of statements into assertion form minimizes the possibility of confusion because the position of each element in the assertion is always the same. The data sheets themselves impose a high degree of uniformity, being divided into columns which maintain that order throughout.

As stated elsewhere . . ., the essential theoretical components of any statement are (1) perceiver, (2) perceived, (3) action, and (4) target. Evaluative assertion analysis is readily adaptable to such a conceptualization:

Perceiver = Source

Perceived = Attitude Object1

Action = Verbal Connector

Target = Attitude Object2

In addition, there is a vital fifth element, the incorporated modifiers, which may be connected to the perceiver, perceived, or target. One of the valuable characteristics of evaluative assertion analysis is that it forces a separation, for the purposes of analysis, of “action assertions” from “evaluative assertions.”6 The importance of this point can be illustrated in the statement, “The valiant X has repelled the treacherous forces of Y.” Although it includes only one perceiver (author of the statement), one perceived (X), one action (has repelled), and one target (Y), the statement creates difficulties, both for the coder who must categorize it and for the scaler who must assign it a numerical value, owing to the presence of the affective elements “valiant” and “treacherous,” in addition to the action element of “has repelled.” But when the sentence is translated into

1. X / has repelled / Y (action assertion)

2. X / is / valiant (evaluative assertion)

3. Y / is / treacherous (evaluative assertion),


much of the difficulty, both in categorization and in assigning numerical values, is resolved. Assertion 1 can then be scaled for action dimensions such as activity-passivity, specificity-diffuseness or violence-nonviolence; assertions 2 and 3 can be scaled for affective dimensions such as good-bad or hostility-friendship.

When this fifth element, the incorporated modifier, has been introduced as a separate constituent, the conversion between the theoretical framework developed in this manual and evaluative assertion analysis is complete:

Perceiver = Source

Perceived = Attitude Object1

Action (or attributive verb) = Verbal Connector

Target = Attitude Object2

Incorporated Modifiers = Common-meaning terms
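The correspondence above can be sketched as a record with five fields that unpacks into one action assertion plus one evaluative assertion per incorporated modifier. The `Statement` type and its `to_assertions` method are hypothetical names introduced for illustration; the example is the sentence “The valiant X has repelled the treacherous forces of Y” discussed earlier.

```python
from dataclasses import dataclass, field


@dataclass
class Statement:
    perceiver: str   # Source
    perceived: str   # Attitude Object1
    action: str      # Verbal Connector
    target: str      # Attitude Object2
    # Incorporated modifiers: attitude object -> its common-meaning terms
    modifiers: dict = field(default_factory=dict)

    def to_assertions(self):
        """Separate the action assertion from the evaluative assertions."""
        assertions = [(self.perceived, self.action, self.target, "action")]
        for attitude_object, terms in self.modifiers.items():
            for term in terms:
                assertions.append((attitude_object, "is", term, "evaluative"))
        return assertions


statement = Statement(
    perceiver="author of the statement",
    perceived="X",
    action="has repelled",
    target="Y",
    modifiers={"X": ["valiant"], "Y": ["treacherous"]},
)
assertions = statement.to_assertions()
```

The result is the same three assertions listed in the text, each tagged so that action dimensions and affective dimensions can be scaled separately.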

A third source of low reliability, difficulty over the assignment of numerical values to complex sentences, is reduced to a minimum by allowing the scaler to focus attention first on the verbal connector and then on the common-meaning term, in each case a single word or a short phrase. When all data have been processed, it is possible to do a rapid congruity check on the finished data sheets to detect any errors (Osgood et al., 1956:98–99).

Unlike forced distribution scaling techniques, evaluative assertion analysis is amenable to comparative analysis across as well as within universes of statements. For example, a project may involve scaling all Soviet statements in the month before the U-2 incident and the month after the affair as separate bodies of data, in order to test hypotheses concerning the patterns of variables. If, however, it is also desirable to compare hostility levels between the two months, this cannot be done using any forced distribution scaling technique without further rescaling of at least samples from the combined universes, because the mean hostility level for each month is by definition identical.7 While this additional step is by no means an insurmountable barrier, a technique which defines the value of each value category rather rigorously beforehand bypasses some of the problems of comparative analysis.

ADAPTABILITY TO COMPUTER ANALYSIS

A final point to consider is the adaptability of evaluative assertion analysis to computer analysis. Translation into assertion form appears to be one of the methods most readily adaptable to this type of analysis.* Retrieval of relevant assertions, assignment of values, and the arithmetic computations can easily be performed by computer. Finally, the results can be aggregated in terms of the researcher’s hypotheses.

CONCLUSION

It is almost inevitable that research into international conflict involving any extensive use of content analysis will be group research, utilizing teams of translators, coders, scalers, data recorders, programmers, analysts, and others. Because both coders and scalers are likely to be part-time and short-term employees of the research project, the rules for coding and scaling must be sufficiently comprehensive to avoid ambiguity, yet simple enough to be easily learned. For this reason, a technique of content analysis such as evaluative assertion analysis, by imposing a high degree of uniformity on each of the various steps of data preparation and analysis, can be of great value. Moreover, research personnel can be rapidly trained. The increment of additional time required to use evaluative assertion analysis must be weighed against the degree of reliability and precision that is gained; in the end, however, the selection of a methodological tool must rest upon the nature of the research problem and the information that the researcher seeks to obtain from the communications to be analyzed.

*See Kleinnijenhuis, de Ridder, and Rietberg (reading 7.5, this volume), who use computer aids to kernelize text in the above two and several additional types of assertions.

NOTES

1. The most complete guide to this technique is Osgood et al. (1956). A briefer, and more readily accessible, summary may be found in Osgood (1959:41–54). The present brief description of the various steps in evaluative assertion analysis is derived from these sources.

2. The continuum has a middle point of zero. Any one statement, however, with an evaluative product of zero should not be coded. For example, the assertion “Kennedy is a man,” has a value of zero on a friendship-hostility scale. Thus the statement is not coded.

3. This formula gives a “weighted” evaluation (see Osgood et al., 1956:92). For research of the kind described in this manual, an un-weighted evaluation (in which each assertion is given equal value) may be more desirable. In this case, the following formula may be used:

$$\text{Evaluation}_{AO_1} = \frac{\sum_{i=1}^{n} c_i \, cm_i}{3n} + \frac{\sum_{i=1}^{n} c_i \, (AO_2)_i}{3n}$$

In either case, final evaluations fall within a range of +3 to –3. The rationale for using the un-weighted evaluation formula is discussed in Appendix A of Holsti (1962).

4. Examples of the various uses of evaluative assertion analysis may be found in Holsti (1962a) and (1962b).

5. Coders are able to process completely about one page per hour. A short form of the method is described in Osgood et al. (1956:96–97). Coding speed can be increased by a factor of three without a disastrous loss of inter-coder reliability.

6. Adjectives formed from verbs or implying an object may cause some ambiguity. Consider the assertion, “X is aggressive.” This is both evaluative on a number of scales (hostility, friendship, etc.) and implies action against an unspecified or general target. This point was raised by William Quandt (1962).

7. It should be noted that there are dangers inherent in the assumption that all results are comparable. For example, even a cursory reading of Chinese Communist statements will reveal a level of affect rarely found in the more genteel diplomatic language of the nineteenth century.

REFERENCES

Holsti, O. R. (1962a). The belief system and national images: John Foster Dulles and the Soviet Union. Ph.D. dissertation, Stanford University.

Holsti, O. R. (1962b). The belief system and national images: A case study. Journal of Conflict Resolution 6:245–252.

Osgood, C. E. (1959). The representational model. In I. de Sola Pool (Ed.), Trends in content analysis (pp. 33–88). Urbana: University of Illinois Press.

Osgood, C. E., Saporta, S., & Nunnally, J. C. (1956). Evaluative assertion analysis. Litera 3:47–102.

Osgood, C. E., Suci, G. J., & Tannenbaum, P. H. (1957). The measurement of meaning. Urbana: University of Illinois Press.

Quandt, W. (1962). The application of the General Inquirer to content analysis of diplomatic documents. Stanford Studies in International Conflict and Integration, July. 1.

3.6 AN ECOLOGY OF TEXT

Memes, Competition, and Niche Behavior

MICHAEL L. BEST*


INTRODUCTION

Ideas do not exist in a vacuum. Neither does discourse, the interconnected ideas that make up conversation and texts. In this research, we investigate the pair-wise interaction between populations of ideas within discourse: Are our text populations in competition with each other? Do they mutually benefit each other? Do they prey on one another?

This work attempts to build models of population memetics by bringing together two disciplines: Alife** and text analysis. Through techniques of text analysis, we determine the salient co-occurring word sets, texts, and text clusters, and track their temporal dynamics. We then study the life-like properties of this human-made system by considering its behavior in terms of replicators, organisms, and species.

Richard Dawkins coined the term meme to describe replicating conceptual units (Dawkins, 1976). In studying the population dynamics of ideas we consider the meme to be the largest reliably replicating unit within our text corpus (Pocklington, 1996; Pocklington & Best, 1997). Through text analysis, we identify memes within a corpus and cluster together those texts that make use of a common set of memes. These clusters describe species-like relationships among the texts.

The particular texts we study are posts to the popular USENET News (or NetNews) system. These posts form the basis of a new Alife environment, the corporal ecology (Best, 1996, 1997). In this ecology, texts are the organisms, the digital system defined by NetNews describes an environment, and human authors operating within some culturally defined parameters are the scarce resource.

At the core of our study sits a large text analysis software system based primarily on Latent Semantic Indexing (LSI) (Deerwester, Dumais, Furnas, Landauer, & Harshman, 1990; Dumais, 1992, 1993; Furnas et al., 1988). This system reads each post and computes the frequency with which each word appears. These word counts are then used in computing a vector representation for each text. A principal component analysis is performed on this collection of vectors to discover re-occurring word sets; these are our memes. Each post is then re-represented in terms of these memes. By grouping texts that are close to one another within this meme-space, we cluster semantically similar texts into species-like categories, or quasi-species (Eigen, McCaskill, & Schuster, 1988).

*Best, M. L. (1997). Models for interacting populations of memes: Competition and niche behavior. Journal of Memetics—Evolutionary Models of Information Transmission, 1.

**Short for artificial life, the study of computer simulations of living systems and their evolution.

We proceed to study the interactions between those populations that coincide temporally. For each cluster, we compute a series that represents its volume of post activity over time, for instance, how many texts of a given cluster were posted on a given day. Cross-correlations between each pair of time series are then determined. We find that some pairs have strong negative correlations and argue that these are examples of texts in competition. A number of examples of such competition are explored in depth. We argue that high competition is correlated with those text clusters that exist within a narrow ecological niche; this phenomenon is also observed in natural ecologies (Pianka, 1981).

Note that this is an unusual shift from the typical Alife environment. We are not synthesizing replicators, embodying them into agents, and observing their life-like interactions. Instead, we are studying a pre-existing artifact. Through our analysis, we discover replicators within organisms, and use computational techniques to observe their dynamics.

In this paper we first briefly overview the NetNews environment and describe the LSI-based text analysis system. Next, we describe the mechanism used to determine the temporal dynamics and cross-correlations given a corpus of posts. We then relate the cross-correlations to models of interacting populations. In the next section, we examine in depth a couple of pairs of post clusters with strong interactions. We then describe a theory of niches within the corporal ecology and note that narrow ecological niches are correlated with significant competition. We end with our conclusions.

THE NETNEWS CORPUS

Understanding our corpus requires a basic knowledge of the NetNews system. NetNews is an electronic discussion system developed for and supported on the Internet (Kantor & Lapsley, 1986). Discussion groups have formed along subjects ranging from science to politics to literature to various hobbies. The collections of messages are organized into particular subject groups called newsgroups. The newsgroups themselves are organized in a tree-like hierarchy, which has general top-level categories at the root and moves to more specific topics as you progress towards the leaves. A newsgroup name is defined as the entire path from the top-level category through any subsequent refining categories down to the name of the group itself. Category and group names are delimited by the period symbol. Thus, “soc.religion” is the name of a newsgroup concerned with social issues around the world’s religions and “soc.religion.hindu” is a more specific group devoted to Hinduism.

Texts sent to NetNews, the posts, are composed of a number of fields, only a few of which are relevant here. The user creating the post is responsible for the post body (that is, the actual text of the message) as well as a subject line. The subject line is composed of a few words that describe what the post is about. NetNews software will attach a number of additional fields to posted messages, including a timestamp and the user name of the person who created the post.

Posts can be either an independent message or a follow-up to a previous message. A follow-up, or “in-reply-to” message, will have special threading information in its header linking it to the previous posts to which it is a reply. This header information allows newsreaders to reconstruct the discussion thread.

NetNews today has grown considerably from its beginnings in the late 1970s and 1980s. With over 80,000 posts arriving each day, it provides an excellent dataset for the study of cultural microevolution.


THE TEXT ANALYSIS METHOD

We analyze a corpus of posts to NetNews to distill their salient replicating units, or memes, and to cluster together posts that make common use of those memes. We do this by employing a large system of text analysis software we have built. The techniques employed are based on the vector space model of text retrieval and Latent Semantic Indexing (LSI).

Vector Space Representation

We begin with a corpus composed of the full text of a group of posts. We analyze the corpus and identify a high-dimensional space that describes the conceptual elements within the texts. For each post, we identify a point within this space that captures it semantically. This technique is known as a vector space representation (Frakes & Baeza-Yates, 1992; Salton & Buckley, 1988). Each dimension in this space will represent a term from the corpus, where a term is a word that occurs with some frequency (e.g., in at least three posts) but not too frequently (e.g., the word “not” is dropped from the term list). The goal is to arrive at a set of terms that semantically capture the texts within the corpus.
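The term-selection step described above can be sketched as follows. This is a minimal illustration, not the authors' software: the tokenizer, the stop list, and the function names `tokenize` and `select_terms` are assumptions introduced here; only the "at least three posts" threshold and the dropping of "not" come from the text.

```python
import re
from collections import Counter

# Hypothetical stop list; the chapter names only "not" as an example
# of a word that is too frequent to serve as a term.
STOP_WORDS = {"not", "the", "a", "an", "and", "of", "to", "is", "in"}

def tokenize(text):
    """Lowercase a post and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def select_terms(posts, min_posts=3, stop_words=STOP_WORDS):
    """Return, in sorted order, the words that occur in at least
    `min_posts` posts and are not on the stop list."""
    doc_freq = Counter()
    for post in posts:
        doc_freq.update(set(tokenize(post)))  # count each word once per post
    return sorted(w for w, n in doc_freq.items()
                  if n >= min_posts and w not in stop_words)

# Toy corpus (invented for illustration).
posts = [
    "Pearl Harbor was not in Japan",
    "Japan attacked Pearl Harbor",
    "The harbor at Pearl is not large",
    "Japan is an island nation",
]
terms = select_terms(posts)
print(terms)
```

Each selected term then becomes one dimension of the vector space; a real system would likely use a longer stop list and some form of word normalization.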

Given the conceptual space described by this set of terms, each post can be represented as a point within this space. We score each document according to the frequency with which each term occurs within its text, and assign each term/document pairing this term weight. The weighting we use for each term/document pair is a function of the term frequency (simply the number of times the term occurs in the post) and the inverse document frequency (IDF). Consider a corpus of $m$ posts and a particular term, $j$, within a list of $n$ terms. Then the IDF is given by

$$\mathrm{IDF}_j = \log\left(\frac{m - m_j}{m_j}\right),$$

where $m_j$ is the number of posts across the entire corpus in which term $j$ appears. Thus, if a term occurs in 50% or more of the texts the IDF for that term will vanish to zero. But if, for instance, a term occurs in 10% of the documents the IDF will be nearly log(10). In words, rare terms have a large IDF.

The term weight for a document, $i$, and term, $j$, is then defined by

$$\mathrm{TermWeight}_{ij} = w_{ij} = \log(\mathrm{TermFrequency}_{ij}) \cdot \mathrm{IDF}_j.$$

Each term weight, then, is a function of the inter- and intra-document term frequencies.

Each post, $i$, is now represented by a particular term vector, $r_i = (w_{i1}, w_{i2}, \ldots, w_{in})$. The entire collection of $m$ term vectors, one for each post, defines the term/document matrix, $A$.

This set of steps, culminating in the term/document matrix, forms the basis for much of modern text retrieval or filtering and is at the core of most Web search engines.

Latent Semantic Indexing (LSI)

LSI is a technique used to distill high-order structures from a term/document matrix, consisting of sets of terms that re-occur together through the corpus with appreciable frequency. The re-occurring term sets are discovered through a principal component method called Singular Value Decomposition (SVD). While LSI was primarily developed to improve text retrieval, we are interested in its ability to find replicating term sets, which act as memes. We will first overview the LSI technique and then discuss how it discovers memes.

LSI was originally proposed and has been extensively studied by Susan Dumais of Bell Communications Research and her colleagues (Deerwester et al., 1990; Dumais, 1992, 1993; Furnas et al., 1988). Peter Foltz investigated the use of LSI in clustering NetNews articles for information filtering (Foltz, 1990). Michael Berry and co-authors researched a variety of numerical approaches to efficiently perform SVD on large sparse matrices such as those found in text retrieval (Berry, 1992; Berry & Fierro, 1995; Berry, O’Brien, Do, Krishna, & Varadhan, 1993).

The SVD technique decomposes the term/document matrix into a left and right orthonormal matrix of eigenvectors and a diagonal matrix of eigenvalues.* The decomposition is formalized as $A_k = U \Sigma V^T$.

The term/document matrix, $A$, is approximated by a rank-$k$ decomposition, $A_k$; in fact the SVD technique is known to produce the best rank-$k$ approximation to a low-rank matrix (Berry, 1992).

We are interested in only the right orthonormal matrix of eigenvectors, $V^T$. Each row of this matrix defines a set of terms whose co-occurrences have some statistically salient re-occurrences throughout the corpus. That is, each eigenvector describes a subspace of the term vector space for which the terms are frequently found together. These term-subspaces describe a set of semantically significant associative patterns in the words of the underlying corpus of documents; we can think of each subspace as a conceptual index into the corpus (Furnas et al., 1988).

For instance, an example term-subspace generated by analyzing a collection of military posts found three words as having significant re-occurrences, and therefore replicating together with success: “harbor,” “japan,” and “pearl.” These term-subspaces make up our replicators and are our putative memes. Memes are not single re-occurring words but are made up of sets of re-occurring words.

Our final text analysis step is to “compress” the original term/document matrix by multiplying it with this right orthonormal matrix of eigenvectors (in other words, we perform a projection). This, in effect, produces a term-subspace/document matrix. Each post is represented by a collection of weights, where each weight now describes the degree to which a term-subspace is expressed within its post’s text.
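The weighting and projection steps can be sketched with NumPy as below. This is a hedged illustration under stated assumptions, not the authors' system: the function names and the toy count matrix are invented here; the IDF is clamped at zero so that terms in 50% or more of the posts vanish, as the text describes; and `log(1 + tf)` is used in place of `log(tf)` so that absent terms receive weight zero, which is an assumption of this sketch. In standard SVD terminology the rows of `Vt` are right singular vectors, which the chapter calls eigenvectors.

```python
import numpy as np

def idf(term_counts):
    """IDF_j = log((m - m_j) / m_j), clamped at zero for terms that
    appear in 50% or more of the m posts."""
    m = term_counts.shape[0]
    m_j = np.count_nonzero(term_counts, axis=0)  # posts containing term j
    ratio = (m - m_j) / np.maximum(m_j, 1)       # avoid division by zero
    return np.log(np.maximum(ratio, 1.0))        # ratio <= 1 gives IDF 0

def term_document_matrix(term_counts):
    """A[i, j] = log(1 + tf_ij) * IDF_j, one row per post.
    The 1 + tf form is an assumption of this sketch."""
    return np.log1p(term_counts) * idf(term_counts)

def lsi_project(A, k):
    """Return the top-k rows of V^T (the term-subspaces) and the
    projection of every post onto them (the compressed matrix)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    Vt_k = Vt[:k]
    return Vt_k, A @ Vt_k.T

# Toy counts: 6 posts (rows) by 6 terms (columns), invented for illustration.
counts = np.array([
    [2, 1, 1, 0, 0, 0],
    [1, 1, 2, 0, 0, 0],
    [0, 0, 0, 2, 1, 0],
    [0, 0, 0, 1, 3, 0],
    [0, 0, 0, 0, 0, 2],
    [0, 0, 0, 0, 0, 1],
])
A = term_document_matrix(counts)
term_subspaces, memotypes = lsi_project(A, k=3)
print(memotypes.shape)  # (posts, k)
```

For a corpus of realistic size a sparse, truncated SVD routine would be preferable to the dense decomposition used here.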

MEME AND QUASI-SPECIES

Term-Subspace as Putative Meme

We are looking for replicators within the corpus that are subject to natural selection. Elsewhere we have argued at length as to why the term-subspace captures the requirements of a true meme because its word sets act as a unit of selection within the corpus (Best, 1996, 1997; Pocklington & Best, 1997). The strengths of this term set as a replicating unit of selection are due to it meeting the following conditions:

• it is subject to replication by copying,
• it has strong copying fidelity,
• but not perfect fidelity, it is subject to mutation,
• it has a strong covariance with replicative success (Eigen, 1992; Lewontin, 1970).

We will quickly review each of these points in turn.

SVD techniques exploit structure within the term/document matrix by locating co-occurring sets of terms. Clearly, these term sets are replicating through the corpus, since that is the precise statistical phenomenon the SVD analysis detects. However, it is not obvious that this replication is generally due to copying. Instances of precise copying occur when an in-reply-to thread includes elements of a previous post’s text via the copying mechanism provided by the software system. Other instances of copying occur within a particular context or discussion thread when authors copy by hand words or phrases from previous posts into their new texts. More abstractly, replication occurs because certain memes are traveling outside of the NetNews environment (and thus outside of our means of analysis) and authors again act as copying agents injecting them into the corporal ecology. But, clearly, some re-occurrences are not due to copying but are a chance process where unrelated texts bring together similar words. The likelihood of such chance re-occurrences will be a function of the size and quality of our replicating unit. In summary, term-subspaces are instances of replication often due to copying.

The copying fidelity of a term-subspace is also a direct outcome of the SVD statistical analysis. But importantly, the copying fidelity of re-occurring term sets is not perfect across the entire corpus; the term sets will co-occur with some variation. These mutations are both changes designed by human authors and chance variation due to copying errors. In either case, the mutations are random from the vantage of selection; in other words, human authors are not able to perfectly predict the adaptive significance of their inputted variations. These mutations work “backwards” into the actual term-subspace representation for a post organism. That is, a random mutation at the post level will actually result in a random mutation in the vector subspace representation (the memotype) for the post organism. In this way, the memes as represented in the memotype are subject to mutation.

*An eigenvector of a transformation is a vector whose direction remains unchanged by that transformation; the corresponding eigenvalue is the factor by which that vector is scaled.

Finally, we have elsewhere shown there can be a strong covariance between the replicative success of a cluster or thread of posts and the degree to which they express certain term-subspaces (Pocklington & Best, 1997). In other words, a group of posts can increase its volume of activity over time by increasing the degree to which it expresses certain term sets within its posts’ text. This, then, is a covariance between the fitness of a population of posts and the expression of a particular trait as defined by a term-subspace. The demonstration of this covariance is critical to establishing that a replicator is subject to natural selection.

Quasi-Species

If the term-subspace is a reasonable model for the meme then the term-subspace vector representation of a post is a good model of the post’s memotype. Much as a genotype describes a point within genetic sequence-space for each organism, the memotype describes a point within conceptual sequence-space. By sequence-space, we mean any of the search spaces defined by a replicator undergoing selection. Examples of sequence-spaces include the gene space, protein spaces under molecular evolution, and the meme space defined within a corporal ecology.

The notion of a quasi-species is due primarily to Manfred Eigen (Eigen, 1992; Eigen et al., 1988). He states that the “quasi-species represents a weighted distribution of mutants centered around one or several master sequences. It is the target of selection in a system of replicating individuals that replicate without co-operating with one another (RNA molecules, viruses, bacteria)” (Eigen, 1992). One organism is a mutant of another if it is particularly close to the other in sequence-space.

We wish to group our posts into quasi-species. This requires finding groups of memotypes that are centered together within the conceptual sequence-space. To do so we employ a simple clustering algorithm, the Nearest Neighbor Algorithm (Jain & Dubes, 1988). We first normalize each post memotype to unit length; this amounts to discarding text length information and representing only the relative strength of each meme within a text. The clustering algorithm then considers each post memotype in turn. The current memotype is compared to each memotype that has already been assigned to a cluster. If the closest of such vectors is not farther than a threshold distance, then the current vector is assigned to that cluster. Otherwise, the current vector is assigned to a new cluster. This continues until each and every vector is assigned to a cluster.

This process assigns each post to a quasi-species, defined as those posts which are close to one another in conceptual sequence-space.
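The clustering procedure just described can be sketched as follows. This is a minimal illustration under assumptions: the chapter does not state the distance measure or the threshold value, so Euclidean distance and the threshold of 0.3 used below are hypothetical choices, as are the function name and the toy memotypes.

```python
import numpy as np

def nearest_neighbor_clusters(memotypes, threshold):
    """Assign each memotype, in order, to the cluster of its nearest
    already-assigned neighbor if that neighbor lies within `threshold`;
    otherwise open a new cluster. Returns one integer label per post."""
    X = np.asarray(memotypes, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    X = X / np.where(norms == 0, 1.0, norms)  # unit length; zero rows stay zero
    labels = []
    for i, x in enumerate(X):
        if i > 0:
            # Euclidean distance (an assumption) to every assigned memotype.
            dists = np.linalg.norm(X[:i] - x, axis=1)
            nearest = int(np.argmin(dists))
            if dists[nearest] <= threshold:
                labels.append(labels[nearest])
                continue
        labels.append(max(labels, default=-1) + 1)  # start a new cluster
    return labels

# Toy two-dimensional memotypes (invented for illustration).
memotypes = [[1.0, 0.0], [2.0, 0.1], [0.0, 1.0], [0.1, 3.0], [1.0, 1.0]]
labels = nearest_neighbor_clusters(memotypes, threshold=0.3)
print(labels)
```

Because posts are processed in sequence, the resulting clusters can depend on the order in which the posts are considered; the number of clusters is governed entirely by the threshold.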

The overall aim in grouping organisms is to bring to light certain evolutionarily significant relationships. Clearly, our quasi-species clustering method is ahistorical; that is, it does not directly account for descent when grouping together text organisms. The extent to which such groupings are effective when studying the relatedness of natural organisms is a matter of continued controversy, as can be seen in the debates of the cladists versus evolutionary systematists versus pheneticists. While we are currently agnostic to this controversy, we do agree with an original claim of the pheneticists: the more traits are used when assessing the relatedness of individuals, the more accurate are the groupings (Mettler, Gregg, & Schaffer, 1988).

We are in the happy situation of clustering based on the complete memotype for each of our organisms. The result is that under empirical verification our clusters exhibit extremely strong historical relatedness. We have found that the vast majority of texts clustered together come from the same in-reply-to thread and thus are related by descent (Best, 1997). But our clustering method has the added benefit of grouping related texts even when the in-reply-to mechanism is not used and, alternatively, breaking up texts that are within the same thread but are not semantically related. This is of value since many posters to NetNews use the in-reply-to mechanism to post unrelated texts or, alternatively, post follow-up texts without bothering to use the in-reply-to facility. Thus, we claim that our clustering mechanism, due to its access to hundreds of traits, is actually superior at grouping together both related and descendent texts than would be a simple reliance on the threading mechanism. The clustering method meets our goal of illuminating evolutionarily significant relationships.

Comparison to Natural Ecologies

We are describing phenomena within a corpus of texts in terms of population ecology and population genetics. This is not simply a metaphorical device; we believe that interacting populations of texts and their constituent memes are evolving ecologies quite exactly. However, there are clearly a number of interesting differences between genes and memes (as here operationally defined), natural organisms and texts, ecologies and corpora. Important differences include the driving forces behind mutation within the texts and the role of self-replication and lineage within the corpora. We leave to future work a more complete analysis of these differences.

MODELS FOR INTERACTING POPULATIONS

We now turn to studying the interaction between quasi-species of posts. We have so far only studied the pair-wise interactions between post quasi-species. Similar pair-wise interactions have been widely studied within theoretical ecology. Consider two interacting populations: one population can either have a positive effect (+) on another by increasing the other’s chance for survival and reproduction, a negative effect (–) by decreasing the other population’s survival chances, or a neutral (0) effect. The ecological community has assigned terms to the most prevalent forms of pair-wise interaction, in particular:

• Mutualism (+, +)
• Competition (–, –)
• Neutralism (0, 0)
• Predator/prey (+, –)

(May, 1981; Pielou, 1969).

Our goal is to study the pair-wise interactions of quasi-species within the corporal ecology with the hope of discovering some of these interaction types.

Time Series

To study how the interactions of populations affect growth rates we must define a method to measure a quasi-species’ growth over time. Recall that a quasi-species describes a collection of posts that are close to one another in sequence-space. Each of these posts has associated with it a timestamp identifying when that text was posted to the system; in effect, its birth time and date. (Note that a post organism has something of a zero-length life-span; it comes into existence when posted but has no clear time of death.)

A histogram of the timestamp data is created with a 24-hour bucket size. That is, for each quasi-species we count how many member texts were posted on one day, how many on the next, and so forth through the entire population of texts. The datasets currently used span on the order of two weeks and consist of thousands of posts. So, for each day a quasi-species has a volume of activity, which can range from zero to tens of posts. This rather coarse unit, the day, has been chosen to neutralize the strong daily patterns of post activities (e.g., activity may concentrate in the afternoons and drop off late at night; different time zones will shift this behavior and thus encode geographic biases). Thus, the patterns of rise and fall in the volume of posts within a quasi-species, when measured at the day level, will, hopefully, reflect true changes in interest level and authorship activity rather than other external or systemic factors.
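The 24-hour bucketing can be sketched as below. This is an illustrative assumption about the mechanics, not the authors' code: the function name, the choice of bucket origin, and the sample timestamps are hypothetical. Timestamps are taken to be seconds since Jan. 1, 1970, as on the time axis of Figure 1.

```python
from collections import Counter

SECONDS_PER_DAY = 24 * 60 * 60

def daily_volume(timestamps, start, n_days):
    """Count the posts of one quasi-species in each 24-hour bucket.
    `start` is the timestamp at which day 0 begins; posts falling
    outside the n_days window are ignored."""
    counts = Counter((t - start) // SECONDS_PER_DAY for t in timestamps)
    return [counts.get(day, 0) for day in range(n_days)]

# Hypothetical bucket origin: midnight UTC on January 8, 1997.
start = 852681600
timestamps = [
    start + 10,                       # day 0
    start + SECONDS_PER_DAY - 1,      # day 0, last second
    start + SECONDS_PER_DAY,          # day 1, first second
    start + 3 * SECONDS_PER_DAY + 5,  # day 3
]
series = daily_volume(timestamps, start, n_days=4)
print(series)
```

Running this once per quasi-species yields one time series per cluster, all aligned on the same days.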

The Test Corpus

Figure 1 is a typical graph for the volume of posts within a particular quasi-species over a period of ten days. This cluster was found within a corpus of all posts sent to the soc.women newsgroup between January 8, 1997 (the far left of the graph) and January 28, 1997 (the far right). In the figure, the number of posts in a day is represented by the height of the graph. This particular cluster of texts exhibited an initial set of posts, a few days’ worth of silence, and then a rapid building up of activity that finally declined precipitously at the end of the dataset. The entire corpus used consisted of 1,793 posts over the same ten-day period. The clustering mechanism arrived at 292 quasi-species, the largest of which contained 103 posts.

Time Series Cross-Correlation

To study the relationship between the time series of two populations of posts we use the cross-correlation function. The use of the cross-correlation to study bivariate processes, and time series in particular, is well known (Chatfield, 1989). Each time series is normalized to be of zero mean and unit standard deviation; that is, we subtract off the mean and divide by the standard deviation. In this way, the cross-correlations will not be dominated by the absolute volume of post activity within some cluster and instead will be sensitive to both large and small sized clusters.

We assume the readers are familiar with the regular covariance and correlation functions. Then the cross-correlation for two time series, $X$ and $Y$, is given by

$$\rho_{xy} = \frac{\gamma_{xy}}{\sqrt{\gamma_{xx}\,\gamma_{yy}}}.$$

Here, $\gamma_{xy} = \mathrm{Cov}(X, Y)$, and $\gamma_{xx}$ and $\gamma_{yy}$ are the variances of $X$ and $Y$ respectively. Note this formulation only considers the cross-correlation for a zero time lag. That is, it considers how the two time series are correlated for identically matching points in time. With a nonzero lag the cross-correlation would study cases when the two series might have correlations offset by some fixed amount of time. Since we group our time data into day-long chunks, the zero-lag cross-correlation will be sensitive to covariances that have a time offset as large as 24 hours; this builds into the time series an adequate time lag.

Figure 1 A typical time series of posts to a quasi-species. The vertical axis is the number of posts; the time axis is measured in seconds since Jan. 1, 1970.

When the cross-correlation between two sets of data is significantly different from zero, it suggests the two sets of data have some relationship between them. A positive value means an increase in one series is likely to co-occur with an increase in the other series. A negative value means an increase in one series is likely to co-occur with a decrease in the other series.
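The zero-lag cross-correlation of two daily series can be sketched as follows. This is a minimal illustration, assuming NumPy and the population standard deviation; the function name and the two toy series are invented here. After each series is normalized to zero mean and unit standard deviation, the mean of the element-wise product equals the covariance divided by the square root of the product of the variances.

```python
import numpy as np

def zero_lag_cross_correlation(x, y):
    """Normalize both series to zero mean and unit standard deviation,
    then average their element-wise product (zero time lag only).
    A constant series has zero standard deviation and is not handled."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    zx = (x - x.mean()) / x.std()
    zy = (y - y.mean()) / y.std()
    return float(np.mean(zx * zy))

# Toy daily volumes for two quasi-species (invented for illustration).
a = [0, 2, 5, 9, 4, 1]
b = [8, 7, 3, 1, 4, 6]
rho = zero_lag_cross_correlation(a, b)
print(rho)  # negative here: one series rises when the other falls
```

For many clusters at once, `np.corrcoef` applied to an array with one row per time series returns the full symmetric matrix of pair-wise values, with ones on the diagonal.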

Figure 2 shows the pair-wise cross-correlations for the 125 largest quasi-species clusters within our corpus. The diagonal represents the cross-correlation between a time series and itself which, as expected, is identically one. Note that the matrix is symmetric about the diagonal. The off-diagonal values range from near one to –0.26. The mean cross-correlation is 0.3. This value is quite high, indicating that most of these post clusters are somehow positively related. We suspect this high average cross-correlation is at least partially due to external or systemic effects that were not removed by the day-long bucket size. For instance, our analysis would be sensitive to patterns due to the Monday-Friday work week common in the West. Further, some of this correlation may be due to a high level of mutualistic interactions amongst the posts. Clearly, the ideas conveyed within the soc.women newsgroup often share similar contexts.

In our analysis, this overall high correlation does not particularly matter since we are concerned with the relative cross-correlations, that is, those that are the largest and those that are the smallest.

Figure 2 The Pair-Wise Time Series Cross-Correlation for 125 Largest Quasi-Species Clusters

NEGATIVE CROSS-CORRELATIONS: COMPETITION VERSUS PREDATOR/PREY

We have primarily studied those pairs of quasi-species with relatively strong negative cross-correlations; to wit, those where $\rho_{xy} \le -0.2$. Note that in all such cases (there are 42) p < .001, suggesting that with extremely high probability the correlations are not due to chance. Figure 3 plots two such interactions, both fairly characteristic of this population. [It] demonstrates a clear negative covariance between the volumes of activity of the two post clusters. This negative covariance is both statistically significant and visually compelling. But what do these graphs signify, and can they be interpreted within the rubric of ecological interactions?

At first glance the interactions appear to be of a predator/prey variety; they have a (+, –) relationship to them. However, competition might also produce similar interaction phenomena if the competitors are operating close to some limitation of environmental carrying capacity. In such instances, the relationship between population sizes will be a zero-sum game; when one goes up the other must come down. To be able to classify the interactions of Figure 3 we need to consider the qualitative details of these two interactions through direct study of the texts.

Recall that in the case of a predator/prey relationship, one population enjoys an increased growth rate at the expense of another population (e.g., one population feeds on the other). The presence of a relatively large population of predators will result in a diminished level of success for the prey (they get eaten up). Conversely, the relative absence of prey will result in diminished success for the predator (they have nothing to eat).

Now consider the case of competition. In competition, two interacting populations inhibit each other in some way, reducing each other’s level of success. This often occurs when the two populations rely on the same limited resource. Unlike the predator/prey relationship where the predator requires the prey for success, with competition the two populations would just as soon avoid each other altogether.

This pressure towards avoidance is the source of much ecological diversity since it propels populations to explore new and therefore competition-free niches (Pianka, 1981). An ecological niche, for some particular species, is simply that collection of resources the species relies on. Interspecific niche overlap occurs when two or more species share one, some, or perhaps all of their resources. When those resources are scarce, interspecific competition will result. The width of a niche is simply a qualitative sense of the variety and number of resources a population makes use of.

Competition and Niche Behavior

We have studied posts that make up the four quasi-species shown in Figure 3 in an attempt to qualitatively classify their interactions. The quasi-species (on the left side) of Figure 3 are made up of posts within a single thread. The subject line for these posts reads, “Men’s Reproductive Rights.” In general, these posts are concerned with the responsibilities and rights of men towards their unborn children. The quasi-species displayed with a dashed line in this part of the figure is centered on the use of contraceptives. It consists of a collection of posts wherein the authors debate who is most responsible, the woman or the man, when using contraception. The quasi-species with a solid line deals instead with the use of abortion and whether the father has any intrinsic rights in deciding whether or not to abort an unborn child.

Figure 3 Volume of Activity for Two Quasi-Species. Panel (a): the cross-correlation between these two series is −0.26. Panel (b): the cross-correlation between these two series is −0.23. Both panels plot Number of Posts against Time (× 10³).

[On the right side of] Figure 3, the two quasi-species are also from a single thread. The subject line here reads, “Unequal distribution of wealth?” This particular thread of discussion was rather large. In fact, there were a total of 365 posts to this thread, which our text analysis tools broke up into a number of quasi-species due to significant bifurcations of the topic. In other words, many parallel discussions occurred all within a single in-reply-to thread. The cluster of discussion shown with the solid line centered around a debate as to whether the US military was a “socialist collective.” The quasi-species with the dashed line was a debate on the value of releasing the mentally ill from hospitals. Clearly, these two debates are quite dissimilar even though they span the same set of days and are posts to the same discussion thread.

The quasi-species [on the left in] Figure 3 are different but related discussions. Those [on the right] are different and not clearly related. Still, we believe that both of these sets of interactions demonstrate elements of competition. Within the texts, there is no evidence of predator memes; in fact, the memes seem entirely orthogonal to one another. However, in both examples the memes are competing for the same collection of human authors who must act as their agents if they are to propagate and succeed. This seems even more likely when we consider that all these posts are to the same newsgroup, which due to its narrow subject area supports only a limited supply of human posters. Moreover, each pair of interactions is confined to a single thread of discussion, which again has an even more limited set of potential human authors since users of the NetNews system often zero in on particular threads they find interesting and ignore others. After inspecting most of the interactions that demonstrated strong negative correlations, we observed no examples of predator/prey interactions but many instances that appeared to be examples of competition.

Statistical Artifacts

We computed the cross-correlation between 125 different clusters, arriving at a 125 × 125 matrix of 15,625 correlations. It is possible, therefore, that the cross-correlations with large negative values exist simply by chance; they represent the tail of the distribution of correlations.

However, we believe that our qualitative analysis provides strong evidence that these negative correlations are not artifacts but are indeed due to an interaction phenomenon between the two quasi-species. The two pairs of quasi-species described in detail above demonstrate this point. The likelihood that two quasi-species would be brought together by mere chance and both be from the same thread (out of 324 threads within the corpus) seems vanishingly small.
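
The scale of the multiple-comparison concern can be made concrete with a little arithmetic. The sketch below counts the distinct pairs in a symmetric 125 × 125 matrix and the number of tests that would be expected to reach p < .001 by chance alone. It assumes, unrealistically, that the tests are independent, and it does not settle whether the 42 strong negative cases reported earlier count distinct pairs or matrix entries; the variable names are illustrative.

```python
n_clusters = 125

# Every cell of the matrix, including the diagonal and both mirror-image halves.
matrix_entries = n_clusters * n_clusters

# Distinct off-diagonal pairs: the matrix is symmetric, so each pair appears twice.
distinct_pairs = n_clusters * (n_clusters - 1) // 2

# Expected number of pairs reaching p < .001 if every null hypothesis were true
# and the tests were independent (a simplifying assumption).
alpha = 0.001
expected_by_chance = distinct_pairs * alpha

print(matrix_entries, distinct_pairs, expected_by_chance)
```

Under these assumptions only a handful of pairs would be expected to reach that significance level by chance, which is why the qualitative check on thread membership carries much of the argument.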

Competition

We now will test our theory that these interactions are of a competitive nature. Again, recall that competition is often caused by populations existing within the same (narrow) ecological niche. What makes up an ecological niche for a meme within NetNews? We argue that the newsgroups themselves make up spatially distributed ecological niches. Since there is relatively little interaction between newsgroups (save the phenomenon of cross-posting) we would expect these niches to behave something like island ecologies: they remain relatively isolated from each other. Within a single newsgroup (which is all we have studied so far) niches might be described by threads of discussions. As previously stated, we have found that individual posters to the system tend to become involved in particular in-reply-to threads that interest them. Thus, the memes within a particular thread make use of a set of human resources, which is smaller than the entire set of potential human resources available to the newsgroup. These resources define the niche.

We theorize that cross-correlations that approach –1 in our corpus are examples of competition, and competition will be more likely between populations that are posted to the same threads and thus have overlapping niches. The most direct way to test this theory is to see if negative cross-correlations between two quasi-species correlate with the degree to which they post to the same threads. For each of the 125 × 125 pair-wise interactions we computed the number of threads each of the quasi-species pairs had in common and divided that by the total number of distinct threads posted to by either quasi-species. For example, one quasi-species may contain posts that went to two different in-reply-to threads. Another quasi-species may have posts that span three different threads, one of which is identical to a thread within the first group. So this pair of quasi-species would have posted to a total of four different threads, one of which was shared. Their relative niche overlap would therefore be 0.25.
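
The overlap measure in the worked example above amounts to shared threads divided by distinct threads across the pair. A minimal sketch follows; the function name and thread identifiers are hypothetical.

```python
def niche_overlap(threads_a, threads_b):
    """Shared threads divided by all distinct threads used by either quasi-species."""
    a, b = set(threads_a), set(threads_b)
    all_threads = a | b
    if not all_threads:
        return 0.0  # neither quasi-species posted anywhere
    return len(a & b) / len(all_threads)


# The example from the text: two threads versus three threads, one in common.
first = {"thread-1", "thread-2"}
second = {"thread-2", "thread-3", "thread-4"}
overlap = niche_overlap(first, second)  # 1 shared / 4 distinct = 0.25
```

The measure is symmetric in its two arguments, consistent with the symmetric matrix of pair-wise interactions.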

We calculated the correlation coefficient between the negative cross-correlations of Figure 2 and the percentage of thread overlap between these quasi-species pairs. We found this correlation to be –0.04. While this correlation is statistically significant (p < .001), it is not very pronounced. The negative sign, though, does indicate that as the level of competition increases (a negative cross-correlation) the percent of overlap of their niche also increases (a larger positive shared thread percentage).

This small correlation coefficient may be due to a small signal/noise ratio. Since most pair-wise interactions result in small correlations, the relative number of large negative correlations is quite small. The number of interactions grows with the square of the number of quasi-species. We suspect that a simpler experiment, which grows linearly with the number of quasi-species, will have a better signal/noise ratio.

We have studied the correlations between the absolute number of in-reply-to threads a quasi-species is posted to and the average degree to which the quasi-species finds itself correlated with other clusters. Our hypothesis is that the absolute number of threads a quasi-species is posted to will be related to the average degree of competition the quasi-species experiences in its interactions. Since the variety of resources used by an entity defines its niche, if a quasi-species is posted to a relatively small number of threads then it exists in a narrow ecological niche. Should there subsequently be any interspecific overlap of these narrow niches, scarcity will result in competitive encounters. We computed the correlation coefficient between the total number of threads within a quasi-species and its average cross-correlation value. The correlation coefficient here is 0.25. Thus, as the number of threads within a quasi-species increases (the set of available resources is widened) the average level of competition diminishes (the mean pair-wise cross-correlation also increases). This correlation is statistically significant (p < .001) and rather pronounced.

We further computed the correlation coefficient when the absolute number of threads was normalized by the size of the quasi-species. We might expect that the number of threads employed by a quasi-species would grow with the number of posts within that quasi-species. In other words, as a quasi-species gets larger the number of threads increases too. This might affect the analysis above such that instead of measuring niche width we were simply measuring quasi-species size. Dividing out the size amounts to computing the average number of threads employed by a post for a given quasi-species. When this set of values was correlated with the mean cross-correlation, we arrived at a nearly identical coefficient as above and again clear statistical significance. Thus, quasi-species size is not a major factor in level of competition.
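
The two per-quasi-species quantities used here can be sketched as follows. Excluding the diagonal entry from the mean is an assumption (the text does not say whether the self-correlation was included), and the function names and sample values are hypothetical.

```python
def mean_cross_correlation(matrix, i):
    """Average cross-correlation of quasi-species i with all others.

    The diagonal (self-correlation) entry is left out; this is an assumption.
    """
    others = [value for j, value in enumerate(matrix[i]) if j != i]
    return sum(others) / len(others)


def threads_per_post(post_threads):
    """Niche width normalized by size: distinct threads divided by number of posts.

    `post_threads` holds one thread identifier per post in the quasi-species.
    """
    return len(set(post_threads)) / len(post_threads)


# A hypothetical 3 x 3 symmetric cross-correlation matrix.
rho = [
    [1.0, 0.5, -0.1],
    [0.5, 1.0, 0.3],
    [-0.1, 0.3, 1.0],
]
mean_rho_0 = mean_cross_correlation(rho, 0)           # (0.5 + -0.1) / 2
width_per_post = threads_per_post(["t1", "t1", "t2", "t1"])  # 2 threads / 4 posts
```

Correlating `threads_per_post` values with the mean cross-correlations across all quasi-species would then reproduce the size-normalized comparison described above.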

CONCLUSIONS

We have described a set of text analysis tools, based primarily on Latent Semantic Indexing, which distill replicating memes from a corpus of text. We have trained this analysis system on a corpus of posts to NetNews. This makes up a corporal ecology where the posts are organisms, NetNews is the environment, and human authors are a scarce resource. We argue that this represents an important bridging of text analysis and the Alife research program. Further, it amounts to a novel shift for Alife research: rather than synthesizing life-like agents, we are analyzing a pre-existing environment and discovering life-like behaviors.

In results reported here, we group together posts that make use of similar sets of memes. These groups, clouds within a conceptual sequence-space, describe quasi-species. For each quasi-species, we compute its time-wise volume of activity by histogramming its daily post levels. We then study the pair-wise interaction between quasi-species by computing the cross-correlations between their time series. In our corpus, strong negative cross-correlations signify conditions of competition between the interacting populations where the quasi-species are competing for a limited set of human authors. Furthermore, quasi-species with relatively narrow ecological niches, those that make use of a small number of in-reply-to threads, are more likely to be in competition with other quasi-species. This behavior is analogous to what is found in natural ecologies (Pianka, 1981).

Why do these quasi-species compete? Qualitative analysis of the posts, such as those described in the previous section, shows that many competing quasi-species are posts sent to the same or similar threads. Competition is over the scarce authorship resources within these specific thread niches. Over time a particular thread of discussion may bifurcate into two or more internal themes which then proceed to compete for “air-time” within the thread.

REFERENCES

Berry, M. W. (1992). Large-scale sparse singular value computations. International Journal of Supercomputer Applications 6:13–49.

Berry, M., O’Brien, T., Do, G., Krishna, V., & Varadhan, S. (1993). SVDPACKC (Version 1.0) User’s guide. University of Tennessee Computer Science Department Technical Report, CS-93-194.

Berry, M. W., & Fierro, R. D. (1995). Low-rank orthogonal decompositions for information retrieval applications. University of Tennessee Computer Science Department Technical Report, CS-95-284.

Best, M. L. (1996). An ecology of the Net: Message morphology and evolution in NetNews. MIT Media Laboratory, Machine Understanding Technical Report, 96-001.

Best, M. L. (1997). An ecology of text: Using text retrieval to study Alife on the net. Journal of Artificial Life 3:261–287.

Chatfield, C. (1989). The analysis of time series: An introduction. London: Chapman and Hall.

Dawkins, R. (1976). The selfish gene. New York: Oxford University Press.

Deerwester, S., Dumais, S. T., Furnas, G. W., Landauer, T. K., & Harshman, R. (1990). Indexing by latent semantic analysis. Journal of the American Society for Information Science 41,6:391–407.

Dumais, S. T. (1992). LSI meets TREC: A status report. In D. Harman (Ed.), The First Text Retrieval Conference (TREC-1). NIST Special Publication 500-207.

Dumais, S. T. (1993). Latent semantic indexing (LSI) and TREC-2. In D. Harman (Ed.), The Second Text Retrieval Conference (TREC-2). NIST Special Publication 500-215.

Eigen, M. (1992). Steps towards life: A perspective on evolution. Oxford: Oxford University Press.

Eigen, M. J., McCaskill, J., & Schuster, P. (1988). Molecular quasi-species. Journal of Physical Chemistry 92,24:6881–6891.

Foltz, P. W. (1990). Using Latent Semantic Indexing for information filtering. Proceedings of the 5th Conference on Office Information Systems, ACM SIGOIS Bulletin, 11,2–3:40–47.

Frakes, W. B., & Baeza-Yates, R. (Eds.). (1992). Information retrieval: Data structures and algorithms. Englewood Cliffs, NJ: Prentice Hall.

Furnas, G. W., Deerwester, S., Dumais, S. T., Landauer, T. K., Harshman, R. A., Streeter, L. A., & Lochbaum, K. E. (1988). Information retrieval using a Singular Value Decomposition model of Latent Semantic Structure. Proceedings of the 11th International Conference on Research and Development in Information Retrieval (SIGIR). New York: Association for Computing Machinery.


Jain, A. K., & Dubes, R. C. (1988). Algorithms for clustering data. Englewood Cliffs, NJ: Prentice Hall.

Kantor, B., & Lapsley, P. (1986). Network news transfer protocol: A proposed standard for the stream-based transmission of news. Internet RFC-977.

Lewontin, R. C. (1970). The units of selection. Annual Review of Ecology and Systematics 1:1–18.

May, R. M. (Ed.). (1981). Theoretical ecology: Principles and applications. Oxford: Blackwell Scientific Publications.

Mettler, L. E., Gregg, T. G., & Schaffer, H. E. (1988). Population genetics and evolution (2nd ed.). Englewood Cliffs, NJ: Prentice Hall.

Pianka, E. R. (1981). Competition and niche theory. In R. M. May (Ed.), Theoretical ecology: Principles and applications (pp. 167–196). Oxford: Blackwell Scientific Publications.

Pielou, E. C. (1969). An introduction to mathematical ecology. New York: Wiley-Interscience.

Pocklington, R. (1996). Population genetics and cultural history, Masters Thesis. Burnaby: Simon Fraser University.

Pocklington, R., & Best, M. L. (1997). Cultural evolution and units of selection in replicating text. Journal of Theoretical Biology 188:79–87.

Salton, G., & Buckley, C. (1988). Term-weighting approaches in automatic text retrieval. Information Processing & Management 24,5:513–523.


3.7 IDENTIFYING THE UNKNOWN COMMUNICATOR IN PAINTING, LITERATURE AND MUSIC

WILLIAM J. PAISLEY*


The task of identifying the author of an anonymous work has long challenged students of communication. The problem is usually posed in one of three ways:

(1) A work is attributed to a communicator well known for other works (the generic “communicator” here denotes painter, writer, composer), but it may be an imitation or forgery (e.g., the “Corelli” violin sonatas by Fritz Kreisler)

(2) A truly anonymous work is attributed by default to a well-known communicator whose own works are similar, but it may be the work of a disciple or lesser-known colleague (e.g., the “Letter to the Hebrews,” long attributed to Paul)

(3) A work is attributed variously to each of two or more well-known communicators (e.g., the perennial Shakespeare-Bacon-Marlowe-Oxford wrangle)

Research directed to this problem has two goals. The practical goal is a correct attribution of the anonymous or disputed work. The theoretical goal is a better understanding of one phase of the communication process, the encoding of messages.

When authorship of a work is disputed, we may assume that historical evidence is inadequate. Therefore, clues must be sought in the text itself, usually in the style of the work. “Style,” however, is a concept often embracing the ineffable qualities of a communicator’s output. To focus on objective characteristics of the text, a concept such as “encoding habits” should be substituted for “style.” Then the unique character of a work may be defined in terms of successive decisions made by the communicator as he chooses from his repertory of symbols (notes, words, brush strokes, etc.).

In the last quarter of the 19th century, when the connoisseur of art finally rejected dubious historical evidence and turned his attention to the encoding habits of painters as preserved in their works, he achieved a great refinement in the technique of connoisseurship. Recent successful efforts to identify the authors of anonymous literary works reveal a refinement of method in that field also. Although art connoisseurship has remained a qualitative science while literary detection has become exhaustively quantitative, nevertheless there is surprising consensus in both fields concerning those encoding habits which clearly distinguish a communicator from all other communicators with superficially similar output. This consensus is interesting insofar as it defies common sense and favors minor encoding habits which are inconspicuous in the work and do not carry the burden of meaning. Since the communality of the two streams of research has perhaps never been discussed, the first half of this report will summarize their shared assumptions and procedures. The second half reports the results of an extension of these procedures to the study of musical encoding habits.

*From Paisley, W. J. (1964). Identifying the unknown communicator in painting, literature and music. Journal of Communication 14, 4:219–237.

THE CONNOISSEUR AS CONTENT ANALYST

In 1889 the best-known connoisseur of this century, Bernard Berenson, visited Rome and discovered a new way of looking at paintings. He has left a memoir of the experience:

A generation ago, when a beginner, I enjoyed the privilege of being guided through the Borghese Gallery by a famous connoisseur. Before the Pieta now ascribed to Ortolano I fell into raptures over the pathos of the design. My mentor . . . cut me short with, “Yes, yes, but please observe the little pebbles in the foreground. They are highly characteristic of the artist.” “Observe the little pebbles” has become among my intimates a phrase for all the detailed, at times almost ludicrously minute, comparisons upon which so large a part of activities like mine are spent. (Kiel, 1962:145–146)

Berenson’s mentor was Giovanni Morelli, whose “scientific connoisseurship” provoked international controversy in art circles. Whereas his colleagues were content to accept the testimony of Vasari and other early historians of art, Morelli contended that paintings sufficiently identified themselves, each work signed by its creator in dozens of little details, which no two painters executed alike. Traditional criteria based on the overall style of a work were entirely misleading, he claimed, since the student in any Renaissance studio soon learned the superficial marks of his master’s style.

Morelli was at heart a taxonomist, as Fig. 1 and the following excerpts illustrate:

Look at the Raphaelesque type of ear in the children; see how round and fleshy it is; how it unites naturally with the cheek and does not appear to be merely stuck on, as in the works of so many other masters; observe the hand of the Madonna with the broad metacarpus and somewhat stiff fingers, the nails extending to the tips only. (Morelli, 1900:37)

Among Sandro Botticelli’s characteristic forms I will mention the hand, with bony fingers, not beautiful, but always full of life; the nails, which, as you perceive in the thumb here, are square with black outlines. (Morelli, 1900:35)

Such attention to minor detail brought derision from Morelli’s colleagues. They did not understand his distinction between “appreciating” a work and studying a painter’s encoding habits. Morelli was called “the connoisseur of fingernails.” Yet the success of his method could not be ignored [he exposed scores of mislabeled Renaissance works in Italian and German galleries, 46 in the Dresden Gallery alone (Wind, 1964:29)], and connoisseurship gradually committed its future to content analysis. Bernard Berenson has summarized the assumptions of the new connoisseurship:

Obviously, what distinguishes one artist from another are the characteristics he does not share with others. If, therefore, we isolate the precise characteristics distinguishing each artist, they must furnish a perfect test of the fitness or unfitness of the attribution of a given work to a given master. (Berenson, 1902:123–124)


Figure 1 Hands and Ears Sketched by Giovanni Morelli to Illustrate Idiosyncrasy in the Execution of Minor Details by Renaissance Painters

SOURCE: Morelli (1900:77–78).


THE LITERARY DETECTIVE AS CONTENT ANALYST

The venerable “who was Shakespeare?” controversy provides a mirror in which the changing character of literary sleuthing can be traced. If it happened that connoisseurs of art turned to content analysis when they found historical evidence inadequate, then certainly their colleagues in literature turned to content analysis as the only alternative to a conspiratorial view of history. That is, the modern champions of Bacon, Marlowe, the Oxford group and other contenders for the crown have admitted that history indeed appears to affirm that the playwright and the actor Shakespeare are one man, but history is deceitful and by clever subterfuge the plays of Bacon (or the Oxford group, or the exiled Marlowe) were performed under the name of the dull-witted and mercenary Stratfordian. Courtly discretion (they assert) led Bacon and Oxford to avoid association with the vulgar stage, while Marlowe thought it necessary to pretend that he was dead. When scholars hatch conspiracies, then content analysis of the plays themselves is clearly the only recourse.

Unfortunately, this Elizabethan imbroglio may not be solved, even by content analysis. Of all the contenders, only the least likely, Marlowe, left plays that permit the necessary comparison of encoding habits. Mendenhall (1887) thought it legitimate to use Bacon’s civil and political essays as a corpus for comparison, but no modern scholar would insist that a communicator’s encoding habits must remain constant between works as topically and structurally remote as the essays and the plays. If a striking consistency had been found between essays and plays in Mendenhall’s analysis of word-length frequencies, then Bacon’s claim would have been strengthened. The inconsistencies actually found are evidence of nothing.

If content analysis is not likely soon to disclose who Shakespeare was, nevertheless other applications of the technique in authorship identification have been successful. Moreover, over time the focus of these efforts has shifted steadily from major to minor encoding habits. Three representative efforts will be discussed in chronological order:

(1) Yule’s study of Gerson, à Kempis, Macaulay and others.

George Udny Yule contributed a great deal more to the study of literary vocabulary than can be acknowledged here, but at least his study of the Imitatio Christi must be described. Because the Imitatio had been ascribed both to Gerson and to Thomas à Kempis, Yule first compiled a large sample of undisputed works by each man. Then in the Imitatio, he chose words, chiefly nouns, which seemed to be favored by the unknown author. Finally, he tabulated frequencies of occurrence for these words not only in the Imitatio but also in the two comparison samples. When the Imitatio distributions were arrayed beside those of Gerson and Thomas à Kempis, it could not be doubted that à Kempis was the author.

Yule’s perseverance in this most tedious research was remarkable, but his concentration on major encoding habits may have increased his labor while reducing the sensitivity of his measure. Thus use of the noun “prayer” involves a major encoding habit (in this context major encoding habits are those that carry the burden of meaning), while use of the article “the” involves a minor encoding habit. Unlike the ubiquitous “the,” “prayer” is a relatively rare word; large samples of text must be scanned to obtain minimum stable frequencies. Moreover, because “prayer” is topic-constrained (found in texts with a religious topic), texts on different topics by the same author may yield anomalous frequencies of “prayer.”

In his comparison of three essays (“Milton,” “John Hampden,” “Frederick the Great”), Yule (1944:122) shows, perhaps inadvertently, the effect of topic differences on major encoding habits. The most common nouns in each essay are (in descending order):


Milton       Hampden      Frederick

Man          King         King
Poet         parliament   man
Character    man          army
Poetry       house        war
Mind         time         time
Time         commons      year
Work         people       prince
People       member       part
Liberty      year         power
Power        party        day

Although all three essays are biographies of public figures by the highly idiosyncratic Macaulay, these lists lack authorship communality. Sixteen words are found in one list only; four words in two lists; only two words in all three lists. Of all literary encoding habits, the use of nouns is perhaps most constrained by topic differences.
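
The tally of shared and unshared nouns can be checked mechanically. The sketch below uses the three lists given above (lower-cased) and counts how many lists each word appears in; the variable names are illustrative.

```python
from collections import Counter

# The ten most common nouns per essay, as listed above (lower-cased).
top_nouns = {
    "Milton": ["man", "poet", "character", "poetry", "mind",
               "time", "work", "people", "liberty", "power"],
    "Hampden": ["king", "parliament", "man", "house", "time",
                "commons", "people", "member", "year", "party"],
    "Frederick": ["king", "man", "army", "war", "time",
                  "year", "prince", "part", "power", "day"],
}

# For each word, the number of lists it appears in.
lists_per_word = Counter(
    word for nouns in top_nouns.values() for word in set(nouns)
)

# For each possible count (1, 2, or 3 lists), how many words have that count.
tally = Counter(lists_per_word.values())
in_all_three = sorted(w for w, n in lists_per_word.items() if n == 3)

print(tally[1], tally[2], tally[3], in_all_three)
```

The counts agree with the figures in the text: sixteen words in one list only, four in two lists, and two (“man” and “time”) in all three.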

In the course of this research Yule himself considered, and decided against, the study of minor encoding habits:

Would it be of service to include other words (in addition to nouns, adjectives, and verbs)? If so, should it be all other words or should there be some specified exceptions, such as say the definite article in English, or auxiliary verbs? My impression is that the inclusion of all words without exception would be a mistake; that the inclusion of a and the and is and the like, each with a very large number of occurrences in any author, would merely tend to obscure differences, and it would be best to limit data to what are in some sense “significant words.” (Yule, 1944:280)

Two recent studies indicate that Yule’s bias against “insignificant words” was unjustified.

(2) Ellegard: the “Junius” letters.

Ellegard (1962) incorporated many of Yule’s procedures in his investigation of the authorship of the “Junius” letters (published over that pseudonym between 1769 and 1772 in the London Public Advertiser). Alert, however, to the noun-counting trap that cost Yule so much labor, Ellegard limited his own selection of words to abstract nouns, adjectives, adverbs and prepositional constructions. Without explaining his strategy in these terms, Ellegard nonetheless rejected Yule’s focus on major encoding habits.

There are five steps in Ellegard’s procedure:

(i) The disputed text is scanned for words that are conspicuously frequent. These are “plus words” for that text

(ii) A large body of contemporary writing is scanned for words which are conspicuously absent in the disputed text. These are “minus words” for that text

(iii) When lists of “plus words” and “minus words” have been compiled, all texts are scanned again and exact frequencies of occurrence are recorded. These counts may show that some words are not as “plus” or as “minus” as had been supposed, and the list may have to be revised

(iv) All likely authors of the disputed text are sampled. Their texts are counted for frequencies of “plus” and “minus” words

(v) That author whose word-profile most nearly resembles that of the pseudonymous author is the probable choice, provided that other candidates’ profiles are significantly dissimilar (i.e., their counts regularly fall outside the confidence limits established for the many samples of disputed text)

This painstaking procedure permitted Ellegard to conclude that Sir Philip Francis, historically regarded as the author of the “Junius” letters, is a best choice on statistical grounds also.
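
Steps (i) through (iv) can be sketched in a few lines. This is only an illustration of the idea, not Ellegard’s actual procedure: the ratio threshold, the minimum rate, the function names and the toy texts are all hypothetical, and the confidence limits of step (v) are not modeled.

```python
from collections import Counter


def rates_per_thousand(tokens):
    """Occurrences of each word per 1,000 words of text."""
    counts = Counter(tokens)
    return {word: 1000 * n / len(tokens) for word, n in counts.items()}


def plus_minus_words(disputed, reference, ratio=2.0, min_rate=1.0):
    """Steps (i)-(iii): words over- and under-used in the disputed text.

    `ratio` and `min_rate` are illustrative thresholds, not Ellegard's.
    """
    d, r = rates_per_thousand(disputed), rates_per_thousand(reference)
    plus = {w for w, rate in d.items()
            if rate >= min_rate and rate >= ratio * r.get(w, 0.0)}
    minus = {w for w, rate in r.items()
             if rate >= min_rate and ratio * d.get(w, 0.0) <= rate}
    return plus, minus


def profile(tokens, plus, minus):
    """Step (iv): a candidate's combined plus-word and minus-word rates per 1,000."""
    n_plus = sum(1 for t in tokens if t in plus)
    n_minus = sum(1 for t in tokens if t in minus)
    return 1000 * n_plus / len(tokens), 1000 * n_minus / len(tokens)


# Toy texts, far too short for real use.
disputed = "upon the whole upon the matter upon".split()
reference = "on the whole on the matter while".split()
plus, minus = plus_minus_words(disputed, reference)
candidate_profile = profile("upon my word upon".split(), plus, minus)
```

A candidate whose plus-word rate is high and minus-word rate is low, as here, resembles the disputed text; step (v) would additionally require that rival candidates’ profiles fall outside the confidence limits.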

(3) Mosteller and Wallace: the Federalistpapers.

In 1963, about 80 years after Morelliobserved the stable idiosyncrasy or minorencoding habits in painting, Mosteller andWallace reported the same phenomenon inliterature. Morelli ignored major encodinghabits because he assumed that any compe-tent forger could fool him at that level. Intheir study of twelve disputed Federalistpapers, Mosteller and Wallace chose to ignoremajor encoding habits for two reasons: (i) thetwo authors in question, James Madison andAlexander Hamilton, conformed their writ-ings to the intricate and formal rhetoric of thetime—hence superficial differences cannot be

180 • PART 3

03-Krippendorff-45602.qxd 6/24/2008 4:53 PM Page 180

found; (ii) the disputed Federalist papersdiscuss a wide range of topics—hence author-ship differences and topic differences may beconfounded in the case of semantically signif-icant words.

Mosteller and Wallace define their focus inthese terms:

As we implied in discussing the word war, thewords we want to use are noncontextual ones,words whose rate of use is nearly invariantunder change of topic. For this reason, the littlefiller words, called function words, are espe-cially attractive for discrimination purposes.(1963:280)

Within the class of function worth (as distin-guished from content words, cf. Fries, 1952)there is great variation in frequency of occur-rence. After they had tested each function wordfor low variance within an author’s works andhigh variance between the works of the twoauthors, Mosteller and Wallace divided their listof words into sets according to frequency. Theyfound that the highest-frequency set (consistingof the words to, there, on, of, by, an and also)discriminated as powerfully as the three lower-frequency sets taken together. Remarkably,the preposition upon proved to be a reliablediscriminator by itself, since Hamilton used itfive times more often than Madison.

Calculating odds from these discriminations, Mosteller and Wallace were able to attribute all the disputed papers to Madison. The least distinctive paper gives Madison 80 to 1 odds; the next weakest gives him 800 to 1 odds; thereafter the odds become astronomical.

In summary, Mosteller and Wallace relied on the “insignificant” words which Yule thought too ubiquitous to distinguish between authors. As a result they achieved discrimination (when the two sets of known papers are compared) with median odds of 3 million to 1.

A WORKING DEFINITION OF “MINOR ENCODING HABITS”

The distinction between major and minor encoding habits has been viewed from two perspectives, which may now be combined.

Morelli and Berenson introduced four defining criteria:

(1) The detail to be studied should not be prominent; else imitators will appropriate it

(2) It should be executed mechanically (i.e., with little feedback for self-criticism); else the communicator may consciously vary it for effect

(3) Its use should not be dictated wholly by convention (e.g., the halo in Renaissance paintings)

(4) It should not be so rare that examples cannot be found in each disputed work

To this list, Mosteller and Wallace would add:

(5) The detail should remain constant in frequency whatever the topic of the work; else topic difference will confound authorship differences

But their quantitative procedure requires a restatement of other criteria:

(2a) Use of the detail should exhibit low variance within a communicator’s works

(3a) Use of the detail should exhibit high variance between communicators’ works

(4a) Frequency of occurrence should be high relative to the sampling error
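Criteria (2a) and (3a) can be read as a simple screening rule on a candidate detail. The sketch below is only one way to express that rule; the function `screen_marker`, the use of population variance, and the rates themselves are assumptions for illustration and are not taken from any of the studies reviewed.

```python
from statistics import mean, pvariance

def screen_marker(rates_by_author):
    """Compare variance within authors to variance between author means.

    `rates_by_author` maps an author to a list of rates, one per work.
    Returns (within, between); a promising marker shows a small `within`
    and a large `between`.
    """
    within = mean(pvariance(r) for r in rates_by_author.values())
    between = pvariance([mean(r) for r in rates_by_author.values()])
    return within, between

# Hypothetical rates per 1,000 words, invented for illustration.
marker = {"author_a": [3.1, 2.9, 3.3], "author_b": [0.4, 0.6, 0.5]}
within, between = screen_marker(marker)
```

A detail that passes this screen still has to meet criterion (4a): it must occur often enough that its rate is not dominated by sampling error.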

These criteria have helped to identify unknown communicators in painting and literature. It would be desirable to report their relevance to the study of authorship differences in music, the third leg of the triangle of “artistic” communication, but that research has yet to be done. The findings reported below concern only one of many aspects of musical communication that invite investigation.

IDENTIFYING THE COMMUNICATOR IN MUSIC

In the encoding process called musical composition, certain variables can assume many states and therefore require successive choices on the part of the communicator. Among these are: (1) tempo, (2) dynamics, (3) harmony, (4) instrumentation, (5) pitch. Each variable implies a set of encoding habits, major or minor, and each may prove to discriminate reliably between composers. However, it is within the scope of the present study to consider only the variable of pitch, of note-to-note pitch transitions in the themes of selected composers. This variable has been chosen for two reasons: (1) changes in pitch are easily coded for processing by computer; (2) some research involving tonal transitions has already been reported.

Previous studies of tonal transitions have been guided either by information theory or by the probability model of stochastic processes. The measures employed in the information-theory studies are those of uncertainty and constraint, each composer or sample of works described by two or three parameters only (cf. Youngblood, 1958). The stochastic process studies (cf. Brooks et al., 1957) yield matrices of transitional probabilities over 1, 2, 3 . . . n steps, a rich mine of information about the composer’s encoding habits.

Unfortunately, no study reports testable data on differences between composers. Lacking replication within the works of each composer, variance terms necessary for testing the intercomposer difference cannot be computed. This is not to be construed as a defect in method, of course, since the investigators were not seeking to test such differences. They sought single best estimates for each composer (or for each sample of melodies) either of uncertainty or of the probability of a given transition, and these estimates they obtained.

Therefore, this study focuses on a variable that, although not entirely unresearched, varies in yet-undetermined patterns between and within composers’ works.

Problem

Bernard Berenson said that connoisseurship “proceeds, as scientific research always does, by the isolation of the characteristics of the known and their confrontation with the unknown” (Berenson, 1902:124). This study seeks to isolate, in the themes of five composers (Bach, Haydn, Mozart, Beethoven, Brahms), those minor encoding habits that identify each man. Then, given the characteristics of the known, four “unknown” samples are tested to determine whether they may be accepted or rejected as the work of any of the five composers. Since the four test samples are actually of known authorship (Handel, Mozart, Beethoven, Mendelssohn), the validity of the discrimination technique may be assessed. Yet the rigor of the test is preserved by setting aside the “unknown” samples and leaving them unanalyzed until the discriminating characteristics of the five composers have been established.

Procedure. The collection of hundreds of compositions, the transcription of sections from them and the transposition of these sections to a common tonality for comparison are a task that only a team of well-financed investigators could undertake. Fortunately for the fate of the present modest study, a source was found in which this work has already been done. Barlow and Morgenstern (1948) have indexed about 10,000 themes from the works of dozens of composers after transposing each theme from its original tonality to the keys of C major and C minor. Since each theme is represented in the index as a sequence of letters (e.g., E F B C F# F#, the first theme of Beethoven’s First Symphony), it is a relatively simple task to keypunch and then code the samples of themes for computer processing.

The major composers named above were chosen for this study because Barlow and Morgenstern have indexed particularly large selections of their themes. Altogether, 2,240 themes were sampled systematically from this source. Because the index consists only of those opening notes of each theme that clearly identify it, only the first six notes from each theme (the minimum entry in the index) could be keypunched. This is a waste of data in those frequent instances in which it takes as many as 12 notes to identify a theme, but sampling consistency is the more important consideration. The entire data of this study therefore consist of 13,440 notes (2,240 themes of six notes each), divided as follows: Bach, 1,920; Handel, 960; Haydn, 1,920; Mozart, 2,880; Beethoven, 2,880; Mendelssohn, 960; Brahms, 1,920.

The samples from Handel and Mendelssohn (160 themes each) were set aside as “unknowns.” Random samples of 160 themes from Mozart and Beethoven were also set aside for later testing. The remaining 320 themes from each of the five comparison composers were randomly divided into half-samples of 160 themes in order to permit the estimation of variance within the works of each composer.

A first step in the computer processing (performed on the Stanford University 7090) was the recoding of letters as numbers. For this purpose a twelve-tone scale is a more efficient model than a diatonic scale, and the letters received the following values: C, 1; C# and Db, 2; D, 3; D# and Eb, 4; E, 5; F, 6; F# and Gb, 7; G, 8; G# and Ab, 9; A, 10; A# and Bb, 11; B, 12. In a later analysis the twelve categories were reduced to five: the tonic (1), the third (5), the fifth (8), all other diatonic tones (3, 6, 10, 12) and all chromatic tones (2, 4, 7, 9, 11). This collapsing yields more equal proportions in each category by taking account of the do-mi-sol trimodality of this music.
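The recoding and the later five-way collapsing can be sketched as a lookup table and a small function. This is an illustration, not the original program: the names `PITCH_VALUE` and `collapse` are hypothetical, and the table assumes F# = 7 and G# = 9, which the ascending sequence of values implies. The example theme reads the quoted Beethoven entry as six separate notes.

```python
# Twelve-tone values, C = 1 through B = 12 (F# = 7 and G# = 9 assumed).
PITCH_VALUE = {
    "C": 1, "C#": 2, "Db": 2, "D": 3, "D#": 4, "Eb": 4, "E": 5, "F": 6,
    "F#": 7, "Gb": 7, "G": 8, "G#": 9, "Ab": 9, "A": 10, "A#": 11,
    "Bb": 11, "B": 12,
}

def collapse(value):
    """Reduce a twelve-tone value to one of the five classes."""
    if value == 1:
        return "tonic"
    if value == 5:
        return "third"
    if value == 8:
        return "fifth"
    if value in (3, 6, 10, 12):
        return "other diatonic"
    return "chromatic"

theme = "E F B C F# F#".split()          # six-note index entry
values = [PITCH_VALUE[note] for note in theme]
classes = [collapse(v) for v in values]
```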

Results: First Analysis

It was decided to look first at the simplest transitions and subsequently, if necessary, to study also the more complex ones. Greatest idiosyncrasy is found in greatest complexity, of course, since no two composers have ever assembled even a hundred notes in quite the same sequence, but idiosyncrasy found in complexity is not generalizable to other works by the same man: as other composers do not repeat the sequence, neither does he.

If two-note transitions are classified in terms of the identities of the first and second notes, 144 (12 × 12) categories result. As the number of original categories is reduced by collapsing (say from 12 to 5), then the number of joint classifications diminishes in proportion to the square of the number of categories, to 25. There is great economy in reducing all pitch encoding decisions to 25, but even greater simplicity may be achieved by sacrificing pitch identities and considering only the size of the interval separating the first and second notes. Since two notes can be separated by no more than six semitones (e.g., the distance from C to F# in either direction), the number of categories needed is only seven (from 0 to 6 semitones). Thus, the 800 2-note transitions in each 160-theme sample (five transitions in each six-note theme) may be coded just seven ways.
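The reduction to seven interval sizes follows from measuring distance in either direction around the twelve-tone circle. A minimal sketch, assuming the coding C = 1 through B = 12; the function names `jump` and `jump_counts` are hypothetical.

```python
def jump(a, b):
    """Semitone distance between two twelve-tone values, ignoring
    direction and octave, so the result is always between 0 and 6."""
    d = abs(a - b) % 12
    return min(d, 12 - d)

def jump_counts(values):
    """Tally interval sizes 0..6 over successive note pairs in one theme."""
    counts = [0] * 7
    for a, b in zip(values, values[1:]):
        counts[jump(a, b)] += 1
    return counts

# Values for the notes E F B C F# F# under the coding C = 1 ... B = 12.
theme_values = [5, 6, 12, 1, 7, 7]
counts = jump_counts(theme_values)
```

Each six-note theme contributes five transitions, so 160 themes give the 800 transitions per sample.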

Table 1 shows the tabulation of “jumps” (so called to avoid the term “interval,” which suggests vertical harmony rather than horizontal motion) for the ten samples of the five composers.


Table 1 Frequency With Which the Five Composers “Jump” From 0 to 6 Semi-Tones in Each of Their 160-Theme Samples

Semi-Tones “Jumped”    Bach        Haydn       Mozart      Beethoven   Brahms
in Each Transition     1     2     1     2     1     2     1     2     1     2
0                      60    86    141   150   163   164   147   136   76    100
1                      200   199   175   177   172   168   159   182   193   211
2                      264   269   229   196   203   191   216   222   235   225
3                      90    80    98    106   100   99    96    99    113   112
4                      72    62    75    69    71    76    81    70    73    52
5                      113   103   82    100   86    101   98    89    104   93
6                      2     2     1     3     6     2     4     3     7     8

NOTE: Number of 2-note transitions in each sample is 800.


Since pitch identities have been lost, category 0 represents such transitions as C-C, C#-C#, D-D, etc. Category 2, the modal category in all ten samples, represents such whole-tone (two semitone) transitions as C-D, C#-D#, D-E, D#-F, etc. Three facts are immediately apparent in Table 1: (1) composers agree roughly in the frequency with which they use each “jump,” (2) there is error variance between the two samples of each composer, (3) yet the two samples of each composer tend to vary less around their own mean than around the ten-sample mean. This last fact is a test of the discriminatory power of these simple categories in both the known and the “unknown” samples.

The appropriate statistic is the chi-square goodness-of-fit test (see Note 1). Theoretical frequencies are defined as the means for each composer’s two samples of the six categories (six rather than seven, with 5 and 6 aggregated, because the theoretical frequencies for category 6 by itself would be insufficient for chi-square). Table 2 reports chi-squares obtained for the original samples and also for the “unknown” samples. Values entered on the major diagonal are in effect error terms: chi-squares of the extent to which each composer deviates from his own mean. With 5 degrees of freedom, chi-squares of 11.1, 15.1 and 20.5 have probabilities of .05, .01, and .001, respectively. Thus in the matrix of known samples Bach and Brahms each reject, and are rejected by, all composers except himself. This pattern persists in the group of “unknown” samples, the probability being extremely small that either Bach or Brahms could have written any of the four.
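The computation behind Table 2 can be sketched from the Table 1 counts. The text does not spell out how the two samples of a composer are combined; the sketch below assumes that each sample is tested against the expected frequencies and the two chi-squares are averaged, a reading that reproduces the first two entries of the Bach row (3.2 and 94.6). The function names are hypothetical.

```python
# Table 1 counts for jumps 0..6, two 160-theme samples per composer.
BACH = [[60, 200, 264, 90, 72, 113, 2], [86, 199, 269, 80, 62, 103, 2]]
HAYDN = [[141, 175, 229, 98, 75, 82, 1], [150, 177, 196, 106, 69, 100, 3]]

def merge_last_two(row):
    """Aggregate categories 5 and 6, leaving six categories."""
    return row[:5] + [row[5] + row[6]]

def expected(samples):
    """Mean frequency per category over a composer's two samples."""
    merged = [merge_last_two(s) for s in samples]
    return [sum(col) / len(col) for col in zip(*merged)]

def chi_square(observed, exp):
    """Goodness-of-fit statistic: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, exp))

def mean_chi_square(samples, exp):
    """Average the chi-squares of each sample against `exp` (assumed)."""
    values = [chi_square(merge_last_two(s), exp) for s in samples]
    return sum(values) / len(values)

bach_exp = expected(BACH)
bach_vs_bach = mean_chi_square(BACH, bach_exp)     # rounds to 3.2
haydn_vs_bach = mean_chi_square(HAYDN, bach_exp)   # rounds to 94.6

# With 5 degrees of freedom the .001 criterion is 20.5.
haydn_rejected = haydn_vs_bach > 20.5
```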

Unfortunately, Haydn, Mozart and Beethoven cannot be distinguished on the basis of “jumps.” Discrimination of the three classical composers is not only weak; it is anomalous in that Haydn accepts Beethoven’s known samples with less error than his own. Therefore, the “jumps” analysis separates Bach and Brahms from the group and establishes that neither could have written the “unknown” samples, but the three contemporaries cannot be found to differ in this encoding habit.

Second Analysis

The next simplest classification of two-note transitions has already been described: 25 joint classes based on the tonic, the third, the fifth, all other diatonic tones and all chromatic tones. Accordingly, the ten known samples were processed again on the computer to provide the frequency distributions reported in Table 3. The patterns observed in Table 1 reappear in this table: error variance within a composer’s works, but less variance around each two-sample mean than around the ten-sample mean.
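Tabulating the 25 joint classes amounts to counting ordered pairs of note classes within each theme. The following is a sketch under the coding C = 1 through B = 12; `note_class`, `transition_table`, and the two themes are hypothetical and stand in for the keypunched data.

```python
from collections import Counter

CLASSES = ["tonic", "third", "fifth", "other diatonic", "chromatic"]
n_cells = len(CLASSES) ** 2   # 25 joint classes

def note_class(value):
    """Map a twelve-tone value (C = 1 ... B = 12) to one of five classes."""
    special = {1: "tonic", 5: "third", 8: "fifth"}
    if value in special:
        return special[value]
    return "other diatonic" if value in (3, 6, 10, 12) else "chromatic"

def transition_table(themes):
    """Count from-to class pairs over successive notes within each theme."""
    table = Counter()
    for values in themes:
        labels = [note_class(v) for v in values]
        table.update(zip(labels, labels[1:]))
    return table

# Two hypothetical six-note themes, already recoded as numbers.
themes = [[1, 5, 8, 1, 3, 1], [8, 8, 7, 8, 12, 1]]
table = transition_table(themes)
```

Transitions are counted only within a theme, never across the boundary between one theme and the next.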

Table 2 Goodness-of-Fit Chi-Squares Obtained When the Mean “Jump” Frequencies for Each Composer Are Taken as Expected Frequencies and the 14 Sets of Sample Means Are Taken as Observed Frequencies

Source of Expected    Known Samples (a)                              Unknown Samples (b)
Frequencies           Bach    Haydn   Mozart  Beethoven  Brahms      1       2       3       4
Bach                  3.2     94.6    139.7   83.2       21.8        78.1    164.2   49.9    89.5
Haydn                 61.3    2.8     4.5     2.7        35.5        6.2     8.5     10.2    4.2
Mozart                85.9    6.5     0.6     7.2        53.7        11.4    4.2     21.2    9.4
Beethoven             54.8    3.7     6.3     1.7        35.3        3.3     13.6    6.6     3.6
Brahms                18.4    48.9    79.3    45.2       4.1         48.2    90.0    27.5    51.0

a. Each entry in these cells is derived from the mean for the two samples of each composer.
b. The four “unknown” composers are Handel, Mozart, Beethoven and Mendelssohn.

The tables have other affinities. In Table 1 it was seen that Mozart was the composer most likely to repeat a note (i.e., make a 0-semitone “jump”). In Table 3 it is Mozart who most often uses the repetition transitions (tonic-tonic, third-third, fifth-fifth). In Table 1 Bach and Brahms were highest in 1-semitone “jumps,” which in ten out of twelve cases involve chromatic tones (E-F and B-C being the only 1-semitone intervals in the diatonic scale of C major). Therefore, it is not surprising that Bach and Brahms lead the group in all chromatic transitions in Table 3. Other examples may be found in which the tables mutually support the patterns of encoding habits found in each.


Table 3 Frequency of 25 Types of Two-Note Transitions in Each of the Ten 160-Theme Samples From the Five Composers

                                    Bach      Haydn     Mozart    Beethoven Brahms
From-To                             1    2    1    2    1    2    1    2    1    2
Tonic to tonic                      9    23   46   63   64   74   34   33   19   17
Tonic to third                      23   19   24   28   31   41   22   22   11   12
Tonic to fifth                      47   37   23   31   25   35   22   24   25   19
Tonic to other diatonic             88   101  80   93   69   70   97   81   62   79
Tonic to all chromatic              24   20   11   8    12   14   10   16   32   21
Third to tonic                      9    14   26   21   15   21   25   17   10   17
Third to third                      6    11   20   17   29   18   23   42   18   26
Third to fifth                      15   15   11   28   22   26   17   17   25   19
Third to other diatonic             50   44   80   53   69   63   52   54   63   46
Fifth to tonic                      52   52   36   53   45   40   44   40   40   30
Fifth to third                      14   23   30   31   32   35   27   18   23   21
Fifth to fifth                      31   28   48   55   51   59   58   38   16   36
Fifth to other diatonic             65   59   50   56   50   52   54   57   57   61
Fifth to all chromatic              33   34   9    13   19   18   14   27   41   41
Other diatonic to tonic             52   63   62   69   49   59   63   65   56   48
Other diatonic to third             36   29   54   44   43   44   40   45   43   42
Other diatonic to fifth             51   51   48   32   42   33   49   39   41   35
Other diatonic to other diatonic    37   48   77   64   76   35   86   75   47   67
Other diatonic to all chromatic     56   38   19   9    13   14   17   19   44   36
All chromatic to fifth              25   28   8    12   21   19   6    23   35   41
All chromatic to other diatonic     51   42   22   12   16   17   20   22   53   33
All chromatic to all chromatic      8    15   9    5    3    7    10   9    23   25
All chromatic to tonic and third,
  and third to all chromatic (a)    21   6    7    3    4    6    10   17   16   27

a. Aggregated to provide sufficient expected frequencies for chi-square analysis.


Table 4 reports chi-squares obtained when the goodness-of-fit test is applied to means computed from the data of Table 3.

Since there are 23 means for each composer (3 of the 25 classes having been aggregated to provide sufficient frequencies), the resulting chi-squares must be evaluated with 22 degrees of freedom. The .05, .01 and .001 probability values of chi-square are 33.9, 40.3 and 48.3, respectively.

Unlike Table 2, Table 4 shows sharp discrimination in the matrix of known samples. The chi-squares on the major diagonal, again measures of inter-sample error, give each composer at least 50–50 odds that he actually wrote his own works, while the remaining chi-squares in this matrix give odds exceeding 1,000 to 1 that each composer did not write the known samples of the other four composers. As before, Bach and Brahms are separated from the other composers by extremely high chi-squares, but even the Haydn-Beethoven distinction is sharply drawn. The weakest discrimination, between Haydn and Mozart, easily meets a 1,000–1 criterion of rejection.

With the assurance that this set of encoding habits actually does discriminate, the four “unknown” samples may be tested again. Chi-squares resulting from this phase of the analysis show immediately that none of the five composers could have written samples 1 and 4. Nor, if a .001 criterion is established, could Bach, Haydn or Brahms have written any of the four. Nor could Beethoven have written sample 2, nor Mozart sample 3. But sample 2 could easily represent a chance deviation from Mozart’s mean frequencies, while sample 3 suggests a more outlying but quite possible deviation from Beethoven’s mean frequencies. Foreknowledge of the authorship of samples 2 and 3 may dispose the investigator to attribute them to Mozart and Beethoven, but it seems that the chi-squares also speak for themselves.

Summary of the Analyses

It was decided to begin with the simplest classifications of encoding habits and to seek the least complex behavior that would prove to be reliably idiosyncratic. Whereas it had been expected that analyses would be required of 3-note, 4-note and perhaps even higher-order transitions, a classification of 2-note transitions into only 25 categories satisfied a stringent discrimination criterion and led to the proper disposition of the “unknown” samples.

Table 4 Goodness-of-Fit Chi-Squares Obtained When the Mean Transitional Frequencies for Each Composer Are Taken as Expected Frequencies and the 14 Sets of Sample Means Are Taken as Observed Frequencies

Source of Expected    Known Samples (a)                              Unknown Samples (b)
Frequencies           Bach    Haydn   Mozart  Beethoven  Brahms      1       2       3       4
Bach                  16.2    295.3   376.7   229.4      110.7       157.9   414.6   181.9   148.6
Haydn                 355.5   20.0    51.9    82.5       468.2       116.3   55.5    78.8    236.5
Mozart                324.3   51.2    14.3    100.3      386.0       127.2   24.2    65.9    180.6
Beethoven             201.3   65.6    103.0   17.3       214.1       91.1    114.6   36.5    102.7
Brahms                94.9    278.7   351.1   155.0      17.4        195.4   343.1   122.8   143.0

a. Each entry in these cells is derived from the mean for the two samples of each composer.
b. The four “unknown” composers are Handel, Mozart, Beethoven and Mendelssohn.

DISCUSSION

As a contribution to communication research, this study scarcely ranks with the precedent studies reviewed above. In the first place, it lacks their research problem, an unknown communicator to identify. Secondly, modest amounts of data are involved. Thirdly, computer processing eliminates most of the tallying and testing which attained awesome proportions in Yule’s work.

Yet, acknowledging these differences, this study seems to close the triangle of “artistic” communication by establishing that composers too have their minor encoding habits, analogs of the writer’s prepositions and the painter’s fingernails. Although only the variable of pitch was studied, it seems safe to infer that composers also differ at this microanalytic level in their use of rhythms, harmonies, etc.

Indeed, converging evidence now suggests that all human communicative behavior exhibits two types of idiosyncrasy deserving study. The first is the obvious idiosyncrasy of complex constructions. That is, no two men could independently encode Paradise Lost or even three stanzas from it. Even the single sentence “Him the Almighty Power / Hurled headlong flaming from the eternal sky / With hideous ruin and combustion down / To bottomless perdition, there to dwell / In adamantine chains and penal fire / Who durst defy the Omnipotent to arms” would keep the fabled chimpanzees at their typewriters for centuries. The second type of idiosyncrasy is that of minor encoding habits, which lie at an opposite pole from complex constructions on the continua of deliberation and self-consciousness. For instance, whether he was aware of his proclivity or not, Fra Filippo liked to render an earlobe as circular while Bonifazio preferred an elongated ellipse (see Fig. 1). Mozart liked to repeat a tone in consecutive notes while Bach preferred to move up or down a semitone. If asked, could these communicators state why they had chosen certain patterns and not others?

It is tempting to link such behaviors to unconscious determinants and thus escape responsibility for explaining them. But the behaviors under consideration here (e.g., the use of prepositions) are as devoid of affect as human activity can be, and motivation for unconscious determination is therefore lacking. It would be absurd to argue that Hamilton felt impelled to use upon or that Madison felt impelled to censor his use of it.

A more satisfactory perspective is that of learning theory. Having at some time, somehow, been reinforced for using upon, Hamilton continued to use it more frequently than did his contemporary Madison, who may or may not have been reinforced for avoiding it. We cannot yet infer what the relevant reinforcements might have been, but we may certainly infer that selective reinforcement was involved and that behavior was “shaped” to this end.

Common sense supports the assertion that trivial details are subject to random variation while significant details are frozen in the mold of the communicator’s intention. Evidence now suggests, however, that no detail is so trivial that it does not vary systematically within and between communicators’ works. Many studies, most recently this one, have asked how? Perhaps the next will ask why?

NOTE

1. Chi-square provides an estimate of the probability that a given set of departures from theoretical frequencies could have occurred by chance. Unlike tests based on the standard error of the mean (t test, analysis of variance), chi-square requires no assumptions of normality and variance homogeneity and is therefore applicable in situations (such as this) in which each mean is computed from only two values and in which information about the sampling distribution is unavailable.

REFERENCES

Painting

Berenson, B. (1902). The study and criticism of Italian art. London: G. Bell and Sons.

Kiel, H. (Ed.). (1962). The Bernard Berenson treasury. New York: Simon and Schuster.

Morelli, G. (1900). Italian painters (Vol. 1, C. J. Ffoulkes, Trans.). London: John Murray.

Wind, E. (1964). Critique of connoisseurship. Art News 63:26–29, 52–55.


Literature

Ellegard, A. (1962). A statistical method for determining authorship. Goteborg, Sweden: Elanders Boktryckeri Aktiebolag.

Fries, C. (1952). The structure of English. Ann Arbor: University of Michigan Press.

Mendenhall, T. C. (1887). The characteristic curve of composition. Science 9, 214:237–246.

Mosteller, F., & Wallace, D. L. (1963). Inference in an authorship problem. Journal of the American Statistical Association 58:275–309.

Yule, G. U. (1944). The statistical study of literary vocabulary. Cambridge, England: Cambridge University Press.

Music

Barlow, H., & Morgenstern, S. (1948). A dictionary of musical themes. New York: Crown.

Brooks, F. P., Jr., et al. (1957). An experiment in musical composition. I.R.E. Trans. Elec. Computers, EC-6:175.

Youngblood, J. E. (1958). Style as information. Journal of Music Theory 2:24.


3.8 WHEELS OF TIME AND THE INTERDEPENDENCE OF VALUE CHANGE IN AMERICA

J. ZVI NAMENWIRTH*

. . . It is my contention that the history of value change is neither progressive nor regressive, but basically cyclical. I shall therefore try to demonstrate the plausibility of this assertion; relate cycles of change in a variety of values, thereby delineating the underlying structure of the cyclical findings, or “The Wheel of Time”; attempt to interpret the meaning of the wheel; and then conclude with a speculation about its possible causes.

VALUES: CONCEPT AND ASSESSMENT

The definition and assessment of value change determine to some extent the findings, and an explication is therefore in order. For this exposition, the distinction between goods and values is basic. Goods are the available resources of a society at any one time. These resources are not restricted to material commodities, but also include such things as friendship, recognition, health, or power. Values are goal states, or conceptions about the desirable level of goods. Lasswell’s conceptions have structured this understanding, and he asserts that eight categories will exhaustively classify both goods and values. In his schema, there are four deference values (power, rectitude, respect, and affection) and four welfare values (wealth, well-being, enlightenment, and skill) (Lasswell & Kaplan, 1950:55).

To assess changes in value priorities over time, American Republican and Democratic party platforms from 1844 through 1964 were content analyzed, using procedures described by Stone, Dunphy, Smith, and Ogilvie (1966).1 The use of content analysis is predicated on two assumptions: (1) The differential occurrence of a content category is an indication of the differential concern with the value classified by that category; and (2) the relative value concern which is thus measured is an appropriate measure of the relative priority of that value in the total value schema of each and all documents. The content analysis, then, produces a profile of frequency changes in reference to seventy-three categories.2 What are these categories?

About 95% of the words which occur in party platforms were entered in a dictionary, and these words were defined by one or more of the seventy-three categories. Many of these are really subcategories of the Lasswell Value categories. When possible, a distinction was made between categories (and, therefore, words) which indicate a substantive value concern and categories which indicate a concern with some value transaction whereby the actor (participant) gains (indulgences) or loses (deprivations) in a particular value environment (arenas, countries, etc.). Also, some categories indicate whether the values and the considerations are intrinsic for the participant (scope values) or instrumental (base values). If a subclassification was not feasible, words were classified in a residual category (other). For some words it is unclear what the particular value reference is; they are classified as general value unspecific indicators.

Table 1 Classification of the Value Dictionary

I. Deference Values
   1. Power  2. Rectitude  3. Respect  4. Affection
   Substantive Values: other, authoritative power, cooperation solidarity, conflict, doctrine; ethics, religious; other; other
   Value Transactions: arena, indulgence, deprivation, scope value indicator, general participant, authoritative participant, indulgence, deprivation, scope value indicator, indulgence, deprivation, deprivation, participant

II. Welfare Values
   1. Wealth  2. Well-being  3. Enlightenment  4. Skill
   Substantive Values: other; somatic, psychic, other; aesthetics, other
   Value Transactions: transaction, participants, indulgence, deprivation, indulgence, deprivation, scope indicator, participants

III. General Value Transaction Indicators
   1. transaction indulgence  2. transaction deprivation  3. transaction  4. scope indicators  5. base indicators  6. arena  7. participant  8. nations  9. self  10. audience  11. others  12. selves

IV. Anomie
   1. anomie

V. Sentiments
   1. positive affect  2. negative affect  3. not  4. sure  5. if

VI. Space-Time Dimension
   1. space-time

VII. Residual Categories
   1. n-type word  2. undefinable  3. undefined

*From pages 649–664, 671–677, and 680–683, with the permission of the editors of The Journal of Interdisciplinary History and The MIT Press, Cambridge, Massachusetts. © 1973 by the Massachusetts Institute of Technology and The Journal of Interdisciplinary History, Inc.

Three residual categories deserve further explanation: n-type words are high frequency words with little semantic information, such as articles and conjunctions. The category undefinable contains words that have no value implications whatsoever. The category undefined includes words with ambiguous value implications, which will change from context to context.

Armed with this instrument and using computers, the full text of the platforms was matched with the dictionary (or word classifications) and this matching produced the noted frequency profiles. Even if one were to agree that these frequencies may indicate changing value preferences in party platforms, the reader may well question the relevance of such data. Why bother with party platforms?3
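The matching of text against a dictionary can be sketched as a lookup that tallies categories word by word. The three-entry dictionary, the category assignments, and the label "unmatched" below are invented for illustration; they are not entries from the value dictionary used in the study.

```python
import re
from collections import Counter

# Hypothetical dictionary; a word may be defined by more than one category.
DICTIONARY = {
    "tariff": ["wealth"],
    "justice": ["rectitude"],
    "union": ["power", "affection"],
}

def category_profile(text, dictionary):
    """Count category occurrences; words not in the dictionary are
    tallied under the placeholder label 'unmatched'."""
    profile = Counter()
    for token in re.findall(r"[a-z]+", text.lower()):
        profile.update(dictionary.get(token, ["unmatched"]))
    return profile

profile = category_profile(
    "Justice demands a fair tariff for the Union.", DICTIONARY
)
```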

The choice of party platforms to assess magnitude and direction of changing values in American society seems justified for the following reasons: (1) The two-party system in the United States is competitive in most states of the Union, i.e., the parties compete in the same electoral market for the sympathies of various interests. The planks therefore contain the platform committee’s best guesses about policies and values that will maximize the party’s appeal to the electorate, and, in order to survive, parties must guess their voters’ preferences correctly more often than not. Consequently, the content of party platforms is especially suitable for the study of values of the whole society. (2) Party platforms not only reflect predominant values, but they also create or modify value orientations by their presentation and the ensuing public disputes during election campaigns. (3) Parties and party platforms are features of many other societies so that their examination allows for future cross-national comparisons.

DATA AND CYCLES

Basic data of this investigation are as follows: For each Democratic and Republican platform and for each campaign from 1844 to 1964 (or thirty-one campaigns), there are seventy-three observations, one observation for each category (or variable). Each observation is the frequency of that category in the particular platform. This frequency is then expressed as a percentage of words in that category of all words in the document, since this manipulation controls for the fact that campaign documents are of varying lengths. A plot of the thirty-one observations for the category “wealth-total” (a summary measure of all wealth subcategories) over the years 1844–1964 indicates that the concern with wealth varies a good deal from campaign to campaign.
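The length control described here is a plain percentage. A minimal sketch with invented counts shows why it matters: two platforms of very different length can show the same relative concern.

```python
def percent_concern(category_count, total_words):
    """Category frequency as a percentage of all words in one document."""
    if total_words == 0:
        return 0.0
    return 100.0 * category_count / total_words

# Hypothetical counts for two platforms of very different length.
short_platform = percent_concern(40, 1_000)
long_platform = percent_concern(400, 10_000)
```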

Figure 1 illustrates . . . that in general the concern with wealth is low in the 1840s and 1850s; it increases over the next eighty years, to decrease again after 1932. This long-term cyclical tendency is estimated by a sine curve (the dotted line . . . ). As will be noted, the actual observations do not lie on the dotted line. . . . If we plot the deviations from the dotted line (residuals) over time (see Figure 2, which represents these deviations for the Democratic platform), then we note a secondary cyclical trend which in the nature of the case has a more limited swing (or amplitude) and a shorter time span. This secondary cycle is also described by a sine curve which varies about the first one, and these secondary curves are represented by the drawn line in Figure 1. In conclusion, two cyclical trends seem to describe, if not operate on, changing concern with wealth in American platforms. . . . [S]imilar cycles tend to operate in most other value categories as well. How did I arrive at the latter conclusion?

Interdependence of Value Change in America • 191

03-Krippendorff-45602.qxd 6/24/2008 4:53 PM Page 191

+2%

+1%

−1%

−2%

44 52 60 68 76 84 92 00 08

Year

16 24 32 40 48 56 64

z = 9 sinθ36 r = .65

0

Figure 2 Short-Term Sine Curve Fitting Deviations From Long-Term Cycle Describing Referencesto Wealth in Democratic Party Platforms, 1844–1964

192 • PART 3

[Figure 1: percent concern with wealth (vertical axis, 1% to 6%) plotted against campaign year (horizontal axis, 1844–1964); fitted curve as printed: γ = 3.8 + 1.9 sinθ10 + 9 sinθ, r = .91]

Figure 1 Two Superimposed Sine Curves Fitting Percent Concern With References to Wealth (Wealth Total) in Democratic Party Platforms, 1844–1964


Plots of the data revealed provisional outlines of a curve in each category and therefore the amplitude, wavelength, and year of maximum (or minimum) of each of these curves. These first estimates were subsequently tested and adjusted by an iterative computer program. A particular sine curve is considered an acceptable estimate of the underlying cyclical trend if it correlates with the data at r = .45 or better (Namenwirth & Ploch, 1968; Porter & Johnson, 1961). This conservative decision rule is not wholly arbitrary since it provided a unique solution only in these cases. In a similar manner, if a short-term cycle correlated .40 with the data, I accepted the existence of a secondary curve. Of the forty-two categories, about 80% displayed some type of cycle.
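The acceptance rule can be sketched in a few lines. The original iterative program is not described in enough detail to reproduce, so the sketch below swaps in a different technique: for each candidate wavelength it solves an ordinary least-squares problem (a sine of fixed wavelength is linear in its sine and cosine coefficients) and keeps the best-correlating curve only if r reaches the threshold. The function names, the candidate grid, and the synthetic series are assumptions made for illustration.

```python
import numpy as np

def fit_sine(years, values, wavelength):
    """Least-squares fit of mean + amplitude * sine for one fixed wavelength."""
    years = np.asarray(years, dtype=float)
    values = np.asarray(values, dtype=float)
    omega = 2.0 * np.pi / wavelength
    # Intercept plus sine and cosine terms: equivalent to one sine with a free phase.
    design = np.column_stack([np.ones_like(years),
                              np.sin(omega * years),
                              np.cos(omega * years)])
    coef, *_ = np.linalg.lstsq(design, values, rcond=None)
    fitted = design @ coef
    r = np.corrcoef(fitted, values)[0, 1]
    return fitted, r

def best_cycle(years, values, wavelengths, min_r=0.45):
    """Try each candidate wavelength; accept the best one only if r >= min_r."""
    results = [(fit_sine(years, values, w)[1], w) for w in wavelengths]
    r, wavelength = max(results)
    return (wavelength, r) if r >= min_r else None

# Synthetic series over the thirty-one campaign years (not the study's data).
years = np.arange(1844, 1965, 4)
rng = np.random.default_rng(0)
synthetic = (3.8 + 1.9 * np.sin(2 * np.pi * (years - 1894) / 152)
             + rng.normal(0, 0.3, years.size))
result = best_cycle(years, synthetic, wavelengths=range(104, 233, 16))
```

With only about 120 years of observations, neighboring wavelengths tend to fit almost equally well, which is one reason a threshold on r alone cannot settle which wavelength is the right one.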

To state that sine curves approximate a good part of value change is not just to say that value changes display fluctuations, but that they display fluctuations of a particular kind. First, values fluctuate around an average level of concern, which is constant over time. Second, the magnitude of these fluctuations (or amplitude) is also constant over time. Third, the time span of each wavelength is constant, as well. In the case of the primary curve, the data, therefore, suggest static equilibrium. In the case of the secondary curve, the findings suggest a moving equilibrium, since the curve varies about an average level, which itself is subject to constant change, i.e., the primary curve.

In this manner, one can conceive of all value concerns and their changes as consisting of long-term curves, short-term curves, and de-trended fluctuations. These three component parts have their own causes and dynamics.
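The three-part view of a series can be made concrete with a short two-stage sketch: a long-term sine is fitted first, a short-term sine is fitted to the deviations from it, and whatever is left is the de-trended remainder. Least squares with fixed wavelengths is an assumed stand-in for the original estimation procedure, and the input series is synthetic.

```python
import numpy as np

def sine_component(years, values, wavelength):
    """Least-squares sine of a fixed wavelength (with intercept) fitted to values."""
    omega = 2.0 * np.pi / wavelength
    design = np.column_stack([np.ones(len(years)),
                              np.sin(omega * years),
                              np.cos(omega * years)])
    coef, *_ = np.linalg.lstsq(design, values, rcond=None)
    return design @ coef

def decompose(years, values, long_wavelength, short_wavelength):
    """Split a series into long-term curve, short-term curve, and remainder."""
    years = np.asarray(years, dtype=float)
    values = np.asarray(values, dtype=float)
    long_term = sine_component(years, values, long_wavelength)
    deviations = values - long_term              # residuals from the primary curve
    short_term = sine_component(years, deviations, short_wavelength)
    remainder = deviations - short_term          # de-trended fluctuations
    return long_term, short_term, remainder

# Synthetic example: two superimposed cycles plus noise (not the study's data).
years = np.arange(1844, 1965, 4)
rng = np.random.default_rng(1)
values = (3.8 + 1.9 * np.sin(2 * np.pi * (years - 1894) / 152)
          + 0.9 * np.sin(2 * np.pi * (years - 1920) / 48)
          + rng.normal(0, 0.2, years.size))
long_term, short_term, remainder = decompose(years, values, 152, 48)
```

By construction the three components add back up to the original series, so each one can be examined separately without losing information.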

THE FIT OF LONG-TERM VALUE CYCLES

Table 2 presents all of the content categories which fit a longer-term sine curve and four characteristics of each curve: (a) the party, i.e., Democratic or Republican; (b) wave length (or time span) in number of years; (c) the peak (or year when the curve is at its maximum); (d) r2, a measure of goodness of fit. The table indicates that in the long run, concern with the category others in the Republican platform, for instance, is at its height in the year 1808, and that this will again be the case in the year 2040 (i.e., 1808 + 232). The long-term cycle explains about 64% of the variance in changing references to the category. In other words, the curve fairly well describes the varying usage of this concept over time, since the variance explained would equal 100% if it were to describe the variation perfectly, and zero percent if it were not to describe this variation at all. Even so, there was no party platform in 1808, and the statement about the platform in 2040 is only a projection into the future. The estimation of long-term change in value concerns is therefore often based on extrapolation.
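The arithmetic behind the 1808 and 2040 peaks, and the point about extrapolation, can be checked with a small sketch. The helper name is hypothetical; the peak year, wavelength, and r2 value are the ones quoted above for the category others.

```python
import math

def peak_years(peak, wavelength, start, end):
    """All years in [start, end] at which a cycle with the given peak crests."""
    first = peak + math.ceil((start - peak) / wavelength) * wavelength
    return list(range(first, end + 1, wavelength))

projected = peak_years(1808, 232, 1800, 2100)   # crests between 1800 and 2100
observed = peak_years(1808, 232, 1844, 1964)    # crests inside the observed period
r = math.sqrt(0.64)                             # correlation implied by r2 = .64
```

Here `projected` contains 1808 and 2040, while `observed` is empty: no crest of this curve falls within 1844–1964, which is what it means for the estimate to rest on extrapolation.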

In the Republican platforms, about three-fifths of the content categories, and more than seven-tenths of the categories in the Democratic platforms, were estimated by the long-term sine curve. Approximately one-fourth of the categories did not fit a long-term sine curve in either the Democratic or Republican Party platforms.

Unfortunately, the fifty-five long-term sine curves do not all have the same wavelength, and this complicates their interpretation. Although the modal wavelength is 152 years, the shortest cycle runs 104 years, while the longest lasts 232 years, or more than twice as long. Table 3 presents the specifics.

Table 2 Selected Characteristics of 55 Long-Term Sine Curves

Content Category & Party(a)            Peak   Wave Length (in Years)   r2

 1. Others                             1808   232                      0.64
 2. Undefined                          1816   184                      0.46
 3. Rectitude Scope Indicator          1820   232                      0.37
 4. Rectitude Scope Indicator          1860   168                      0.43
 5. Respect Indulgence                 1864   168                      0.46
 6. Rectitude Total                    1864   152                      0.40
 7. Affection Total                    1864   152                      0.30
 8. Respect Total                      1868   152                      0.49
 9. Rectitude Total                    1868   136                      0.45
10. Affection Total                    1872   120                      0.38
11. Respect Total                      1872   152                      0.21
12. Respect Indulgence                 1880   136                      0.32
13. Power Authoritative Participant    1880   184                      0.28
14. Power Participant                  1884   136                      0.22
15. Power Authoritative                1888   168                      0.35
16. Power Authoritative Participant    1896   152                      0.18
17. Positive Affect                    1924   152                      0.62
18. Wealth Participant                 1928   136                      0.58
19. Skill Total                        1928   184                      0.53
20. Wealth Transaction                 1928   184                      0.34
21. Nations                            1928   184                      0.24
22. Wealth Total                       1932   152                      0.73
23. Wealth Other                       1932   152                      0.73
24. Wealth Other                       1932   152                      0.69
25. Wealth Total                       1932   152                      0.69
26. Skill Other                        1932   184                      0.51
27. Wealth Participant                 1932   136                      0.46
28. Wealth Transaction                 1936   152                      0.32
29. Transaction Indulgence             1940   184                      0.54
30. Skill Total                        1944   152                      0.67
31. Selves                             1944   168                      0.65
32. Selves                             1944   184                      0.30
33. Skill Other                        1948   152                      0.69
34. Well-being Total                   1948   136                      0.34
35. Transaction                        1948   168                      0.26
36. Transaction                        1948   184                      0.25
37. Well-being Total                   1952   136                      0.68
38. Well-being Somatic                 1952   152                      0.32
39. Well-being Somatic                 1956   152                      0.74
40. Base Indicator                     1956   168                      0.70
41. Transaction Indulgence             1956   184                      0.65

a. Italics = Republican Party Platforms; otherwise, Democratic Platforms.

THE STRUCTURAL INTERDEPENDENCE OF LONG-TERM VALUE CHANGE

The internal relationships among the long-term sine curves are presented by a circle. To the right are the sine curves that peak in subsequent years; to the left are the curves that peaked in previous years. The whole circle represents a 152-year sequence of peaking and dropping concerns with a variety of values. For instance, while long-term concern with the category wealth peaks around 1932, long-term concern with the categories affection, respect, and rectitude peaks half a cycle earlier or later (76 years), i.e., where concern with wealth is at its peak, preoccupation with affection and respect is at a low. In addition, an increasing concern with wealth over time leads to a fixed decrease in concern with respect, and vice versa. An understanding of these dynamics requires a description of the sequence of peaking value concerns around the wheel of time.
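The half-cycle relationship can be expressed as positions on a wheel. The sketch below assumes a common 152-year cycle and measures angles from the 1932 wealth peak; both function names and the choice of 1932 as the zero point are illustrative assumptions, not the procedure used to draw the original figure.

```python
def wheel_angle(peak_year, wavelength=152, reference=1932):
    """Angle in degrees of a peak year on the wheel, measured from the reference year."""
    return ((peak_year - reference) % wavelength) * 360.0 / wavelength

def in_opposition(year_a, year_b, wavelength=152):
    """True when two peaks lie exactly half a cycle apart."""
    return (year_a - year_b) % wavelength == wavelength / 2

angle_1856 = wheel_angle(1856)               # 180.0: opposite the 1932 wealth peak
opposed = in_opposition(1932, 1856)          # True: the peaks are 76 years apart
```

Two categories in opposition move in mirror image: when one curve is at its maximum, the other is at its minimum.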

Table 3 Distribution of Long-Term Sine Curves by Wave Length

Wave Length (in years)    N

104                        1
120                        1
136                        7
152                       20
168                        9
184                       11
200                        1
232                        5

[Figure 3: circular diagram with the Democratic and Republican content categories arranged by peak year; marked years 1856, 1894, 1932, and 1970; sectors labeled Adaptive, Instrumental, Integrative, and Expressive]

Figure 3 The Internal Structure of Long-Term Value Changes (Cycle Lengths Set at 152 Years, Variable Origins)

Around 1856, concern with the categories others, affection-total, rectitude-total, and respect-total, and several of their subcategories, is at a maximum. The category others contains all references to the third person plural pronoun (they, their, theirs, themselves). In party platforms, such references often stand for a concern with the “other” party, but more often with people in general, and their wishes and qualities. A frequent usage of this category invokes a distinction between leadership and the masses, the “they” and all of those without a name, and it therefore indicates an elitist orientation to social reality and the political order.

The category affection contains references to love and friendship in general, and in party platforms, such references often indicate an association between devotion to family life and loyal patriotism. For instance:

Resolved that, with our Republican Fathers [affection-participant], we hold it to be a self-evident truth, that all men are endowed with the inalienable right to life, liberty, and the pursuit of happiness. . . . (Porter & Johnson, 1961:27)

In party platforms, the category respect includes the words honor, equality, and inequality. The category rectitude contains recurrent words like ought and must, which suggest a call for natural duty and principles. To illustrate:

We recognize the equality [respect-others] of all men before the law and hold that it is the duty [rectitude-scope-indicator] of the government in its dealings with the people to mete out equal and exact justice [rectitude-ethics] to all, of whatever nativity, race, color or persuasion, religion [rectitude-religious] or politics. (Porter & Johnson, 1961:41)

Slavery was the preponderant issue in these years. Policy preferences on this score divided the parties and changed over time: the Democrats favored slavery and its extension into the territories, the Republicans opposed the latter. However, our findings pertain to similarities, not differences, between the two parties, and the mutual concern with rectitude indicates that whatever the nature of substantive policy differences, there was a great and similar concern with the justification of policy preferences. In addition, at that time the terms of justification were largely rectitudinal.

In the 1890s, concern with rectitude, respect, and affection declined while concern with power-authoritative participant and power-authoritative was at a peak. These categories contain many words, but most frequent are references to the federal government and the constitution. At first sight, it seems as if the political issues remained the same, i.e., the relationship between the states and the federal government. However, the justification of policy preferences changed from ethical to legal grounds, from substantive to formal justice, from traditional to legal. To illustrate:

During all these years the Democratic party has resisted the tendency of selfish interest to the centralization of governmental [power-authoritative] power, and steadfastly maintained the integrity of the dual scheme of government [power-authoritative participant] established by the founders of this republic of republics. Under its guidance and teachings the great principle of local self-government has found its best expression in the maintenance of the right of the states and in its assertion of the necessity of confining the general government [power-authoritative participant] to the exercise of the powers granted by the constitution [power-authoritative] of the United States. (Porter & Johnson, 1961:97)

On further inquest, one notes that the central issue is no longer the relationship between federal and state government, but the role of the federal government in the creation and maintenance of the economic infrastructure of an industrial society. The parties are in conflict about this role in regard to tariffs, transportation, politics (domestic as well as international, i.e., a canal through the isthmus), homesteading, banking and monetary policy, antitrust legislation, immigration, and taxation. Yet, the essential conclusion remains the same. Divergent policy preferences may appear from party to party and from campaign to campaign, but the justification of the divergent preferences is in terms of the very same legalistic constructs.

By 1932, the role of the federal government is much less disputed. The party program does not elaborate on justification, but simply states its preferences in regard to economic policy. The peaking concern with wealth and its subcategories indicates this finding. Thus, one reads in the Democratic platform:

We favor the maintenance of national credit [wealth-other] by a federal budget [wealth-other] annually balanced based on accurate executive estimates within revenues [wealth-other], raised by a system of taxation [wealth-other], levied on the principle of ability to pay [wealth-transaction]. We advocate a sound currency [wealth-other] to be preserved at all hazards and an international monetary [wealth-other] conference called on the invitation of our government to consider the rehabilitation of silver [wealth-other] and related questions. (Porter & Johnson, 1961:331)

And the Republican platform states:

Generally in economic [wealth-other] matters we pledge the Republican Party: 1. to maintain unimpaired the national credit [wealth-other]. 2. To defend and preserve a sound currency [wealth-other] and an honest dollar [wealth-other]. 3. To stand steadfastly by the principle of a balanced budget [wealth-other]. . . . (Porter & Johnson, 1961:350)

First, there is a near total absence of justification of policy preferences. Second, a preoccupation with the material and technological (skill) well-being of the nation is characteristic of the platforms at this time. This is further confirmed by the frequent references to the category selves. In party platforms, the use of we, us, ourselves, etc., often reveals a denial of status differentiation, either within the party or within the nation as a whole.

Typical for platforms in the 1950s and 1960s is an orientation toward the future rather than the past. This is revealed by maximum preoccupation with the category transaction.

We shall [transaction] insist [transaction] on businesslike and efficient administration of all foreign aid . . . We shall [transaction] erect [transaction] our foreign policy on the basis of friendly firmness. . . . We shall [transaction] pursue [transaction] a consistent foreign policy. . . . We shall [transaction] protect the future. . . . (Porter & Johnson, 1961:453)

Frequent references to base and scope indicators with words such as plan, strategy, future, project, and development, point in the same direction. In this framework, there is also maximum concern with health and well-being in general.

Projecting the findings into the future, the major concerns of the 1970s will be again of a different order. Since the 1930s, society as a whole was the object of concern; in the 1970s, the major preoccupation will be with conflicting groups and individuals. Frequent words in the subcategory power-cooperation are agreement, coalition, compromise, cooperative, organization, solidarity, unity, and, in the subcategory power-conflict, agitation, anarchy, breakdown, disagreement, disunity, fight, hostility, rebellion, resistance, and revolution. Seemingly, problems of the distribution of power and other social resources will be the major value issue at that time.

THE FIT OF THE SHORT-TERM VALUE CYCLES

Table 4 presents all of the categories that fit a short-term sine curve. According to this table, Republican concern with the category well-being-total tended to be at its height in 1908, as it did in 1868 and 1948. The short-term sine curve explains 26% of the residual variance in the category over time. The shorter-term cycle is therefore only a tendency in the data, which often explains but a limited part of the total amount of value change. In addition, the estimation of the cycles is based in part on extrapolation.

As was done with the long-term cycles, the wavelengths of the shorter-term cycles were set at the modal length of forty-eight years and peaks transformed accordingly. In this case, the wavelength of the transformed sine curve is considerably less than the period under observation, and, therefore, each point on the circle represents a set of peaks, which are forty-eight years apart. For instance, the top of the circle (in Figure 4) represents the years 1884, 1932, and 1980.
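Why one point on the short wheel stands for several years can be shown with modular arithmetic. This is a minimal sketch: the function name is hypothetical, the top of the wheel is anchored at 1932 as stated above, and the list of years is the set of year labels recoverable from Figure 4.

```python
from collections import defaultdict

def wheel_points(years, wavelength=48, top=1932):
    """Group years by their offset (in years) from the top of the wheel."""
    points = defaultdict(list)
    for year in years:
        points[(year - top) % wavelength].append(year)
    return dict(points)

marked = [1812, 1848, 1860, 1872, 1884, 1896, 1908,
          1920, 1932, 1944, 1956, 1968, 1980]
points = wheel_points(marked)
top_of_circle = points[0]        # years sharing the top position of the wheel
```

Years that differ by a multiple of forty-eight land on the same point, so the thirteen marked years collapse into four positions a quarter cycle (twelve years) apart.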


Table 4 Selected Characteristics of Short-Term Sine Curves

Content Category and Party(a)          Peak   Wave Length (in years)   r2     % Total Variance(b)

 1. Well-being Total                   1908   40                       0.26   0.17
 2. Arena                              1908   68                       0.22   —
 3. Respect Total                      1908   36                       0.19   0.10
 4. Respect Total                      1910   32                       0.22   0.18
 5. Others                             1912   68                       0.23   0.08
 6. Enlightenment Total                1916   52                       0.21   0.15
 7. Power Indulgence                   1916   44                       0.20   0.11
 8. Scope Indicator                    1916   32                       0.17   0.03
 9. Undefined                          1920   80                       0.30   0.16
10. Power Authoritative Participant    1920   52                       0.23   0.19
11. Undefinable                        1920   32                       0.20   —
12. Power Scope Indicator              1926   32                       0.30   —
13. Power Indulgence                   1926   32                       0.17   0.13
14. Well-being Participant             1928   32                       0.25   0.14
15. Wealth Transaction                 1928   48                       0.17   0.13
16. Power Participant                  1928   48                       0.12   0.09
17. Power Arena                        1932   40                       0.24   —
18. Wealth Other                       1934   48                       0.43   0.13
19. Wealth Total                       1934   48                       0.43   0.13
20. Rectitude Total                    1934   68                       0.35   0.19
21. Wealth Total                       1936   48                       0.26   0.07
22. Wealth Other                       1936   48                       0.24   0.06
23. Wealth Transaction                 1936   48                       0.20   0.15
24. Power Cooperation                  1936   64                       0.18   0.15
25. Power Conflict                     1938   36                       0.25   0.19
26. Power Cooperation                  1940   40                       0.33   0.18
27. Power Other                        1940   64                       0.30   —
28. Wealth Participant                 1940   48                       0.21   0.11
29. Rectitude Scope Indicator          1942   48                       0.24   0.15
30. Undefined                          1944   20                       0.21   —
31. Transaction                        1944   44                       0.17   0.13
32. Selves                             1946   48                       0.27   0.19
33. Affection Total                    1946   52                       0.23   0.16
34. Positive Affect                    1946   44                       0.17   0.06
35. Power Scope Indicator              1948   64                       0.23   —

a. Italics = Republican Party Platforms; otherwise, Democratic Platforms.
b. Where blank, curve fitted to raw data.


AN EXPLANATION OF SHORT-TERM CYCLICAL VALUE CHANGES

The reader will have noted that the early 1890s and the years around 1932 represent periods of sustained depression and business contraction in the American and world economy. In addition, twenty-five to thirty years earlier or later were periods of sustained economic growth (Burns & Mitchell, 1946:429; Fellner, 1956:43–54; Gordon, 1952:235–243). In short, there appears a rather striking fit between the short-term wheel of time and a particular economic cycle. What is the latter cycle?

Economists distinguish between various cyclical fluctuations in business activity: seasonal fluctuations, the business cycle, the long wave, and the secular trend (Fellner, 1956). The long wave, even though disputed by some economists, is said to vary, extending and contracting over a period of fifty to sixty years. How does the latter process relate to value articulations?

During long-wave economic deterioration, the nation turns inward, gradually relinquishing international ventures and then obligations, becoming more and more parochial in its orientations. This parochialism is first conservative, probably stressing discipline, the tightening of belts, the necessity of temporary unemployment as well as charity to overcome the economic decline. Usually, this goes together with growing indifference, if not hostility, toward foreign claims and conditions as the outside world will be seen as competitive, fickle, a cause of troubles, and an object of scapegoating. With the ongoing but diminishing calamity, the mood will change from conservative to progressive. Increasingly, belt tightening and charity will be seen as palliatives. A growing demand will arise for a change in collective arrangements and structural intervention. Whether cause or effect, the ensuing structural change seems to work since prosperity returns. With increasing surplus, attention turns again to the world scene; value articulations become more cosmopolitan, at first in a progressive vein. Progressive intervention works at home; therefore, it needs to be exported in the fulfillment of America’s ethos and liberal designs. At any rate, money is there in growing abundance. However, once the expansion turns its peak and contraction sets in, the cosmopolitan impulse turns from progressive to conservative, from national mission to national interest, from, for instance, Marshall Plan to Green Berets. One may well speculate that with Vietnamization and the Nixon Doctrine, the parochial phase is on the rise again.

[Figure 4: circular diagram with the Democratic and Republican content categories arranged by peak year; marked sets of years 1884/1932/1980, 1848/1896/1944, 1812/1860/1908/1956, and 1872/1920/1968; axis labels Parochial, Cosmopolitan, Conservative, and Progressive]

Figure 4 The Internal Structure of Short-Term Value Changes (Cycle Length Set at 48 Years, Variable Origins)

The relationship between long-wave economic contraction and expansion, and a shift in basic understandings regarding the nature of morality and criteria of worth, is equally systematic. Briefly, the shift from conflict to consensus and vice versa commences at the beginnings of the period of sustained debacle and prosperity, while the shift from particularistic to universalistic and vice versa begins at the onsets of sustained growth and contraction (see Note 4).

CONCLUSIONS

A content analysis of American party platforms produced results that seem to fit a variety of trends in a great many different value categories. In addition to a long-term trend of about 148 years, one can often discern short-term trends of about forty-eight years. The latter represent variations in value concern over and above long-term trend variations. In combination, the short- and long-term cycles describe (or explain) a good part of the variation in value concerns.

The wheels of time summarize the internal structure of both types of value cycles.

Sequentially, the varying value concerns of the long wheel of time are explained in terms of four fundamental functional problems of any society. Accordingly, the solution of one problem always takes precedence over the solution of the next one until all four problems (adaptive, instrumental, integrative, and expressive) have been articulated to the fullest and the progression commences anew.

The sequential articulations of the short wheel of time (parochial, progressive, cosmopolitan, and conservative) are most likely produced by a dynamics that differs from the long-term functional mechanisms, and the “long wave” periodic contraction and expansion of the national economy seems, for the moment, the most plausible explanation.

Long- and short-term dynamics are not equally important in the determination of value change. On the average, long-term cycles describe about three times as much of the variance in value change as do short-term cycles. The larger part of changing value articulations in platforms is therefore attributed to the dynamics of social problem-solving rather than to social structural changes. Yet, the theory is not purely functional, since it is suggested that economic mechanisms are operating beyond and above the functional dynamics.

In the exposition, it is assumed that the time span and magnitude of value change are constant for all times. This seems an unwarranted assumption. Indeed, if the “long wave” explains the shorter-term wheel of time, then the sine curve may well be too constrictive a model of value change. Even though the “long wave” is a recurrent and rhythmically alternating cycle, the magnitude and wavelength of these cycles seem to vary in history. If this is the case for the cause, so it must be for the consequences, and thus the sequence of political philosophies must be of varying duration. One would like to believe that changes in duration and amplitudes are themselves a simple function of time and therefore gradual and continuous, but the world of value transformations may not submit itself so readily to this persistent search for elegance and order.

Quantitative procedures, such as content analysis and curve fitting, suggest to the uninitiated reader an exactness and precision which are far greater than the results of more customary procedures of historical analysis. This practitioner is under no such illusions. I forewarned the reader about the approximate nature of estimates and the speculative character of subsequent interpretations and explanations. Their correctness cannot be established in one experiment, and judgments on that score must await future examinations of different historical sources using different procedures of analysis.

NOTES

1. In 1844, 1848, and 1852, there were no Republican platforms since the Republican Party did not exist prior to 1856. For the first three campaigns, I used the Whig platforms because the Whigs are in many respects the precursors of the Republican Party.

2. Thirty-one categories were eliminated from the analysis because of low frequencies or poor distribution. For a discussion of all of the categories, see Namenwirth and Weber, 1987.

3. The study of party platforms has other uses besides the assessment of value changes. See Benson, L. (1961). The concept of Jacksonian democracy: New York as a test case (Princeton, N.J: Princeton University Press), p. 216, and Klingberg, F. L. (1952). The historical alternations of moods in American foreign policy. World Politics, IV:239–274.

4. For a general discussion of the relationship between economic cycles and political if not philosophical thought, see Pareto, V. (1935). The mind and society. New York: Harcourt & Brace, paragraph 2387.

REFERENCES

Burns, A. F., & Mitchell, W. C. (1946). Measuring business cycles. New York: National Bureau of Economic Research.

Fellner, W. J. (1956). Trends and cycles in economic activity: An introduction to problems of economic growth. New York: Holt.

Gordon, R. A. (1952). Business fluctuations. New York: Harper.

Lasswell, H. D., & Kaplan, A. (1950). Power and society: A framework for political inquiry. New Haven, CT: Yale University Press.

Namenwirth, J. Z., & Ploch, D. R. (1968). Structural and contingent value changes in American political party platforms. Unpublished master’s thesis, Yale University, New Haven, CT.

Namenwirth, J. Z., & Weber, R. P. (1987). Dynamics of culture. Boston: Allen and Unwin.

Porter, K. H., & Johnson, D. B. (1961). National party platforms, 1840–1960. Urbana: University of Illinois Press.

Stone, P. J., Dunphy, D. C., Smith, M. S., & Ogilvie, D. M. (Eds.). (1966). The General Inquirer: A computer approach to content analysis. Cambridge, MA: M.I.T. Press.


3.9 INFERRING THE READABILITY OF TEXT

KLAUS KRIPPENDORFF

We tend to attribute readability to written text. The 2000 Oxford English Dictionary defines it as “the quality of, or capacity for being read with pleasure or interest, considered as measured by certain assessable features as ease of comprehension, attractiveness of subject and style.” This definition backgrounds the obvious, that it requires a reader for texts to be readable, in fact, for texts to be texts. Readability, like literacy, is a cultural phenomenon, and efforts to infer readability from text serve various social institutions, in effect creating functional differentiations of individuals’ ability. Readability research provides content analysts with an interesting case study of how analytical constructs are constructed and applied.

INFERRING READABILITY FROM TEXTS: A BRIEF HISTORY

Educational research pioneered efforts to measure readability in the 1920s. Readability became an issue in deciding on reading material appropriate for schoolchildren on different levels. In the 1930s, readability research expanded to adults, serving the emerging needs of industry, government, and the military to evaluate magazines and books for their publishability, as well as technical communications, training manuals, and forms for their reliable use in processes of an administrative nature. Now, leading word-processing software features readability measures, intended to aid good writing.

The guiding idea of readability research is to find an index of readability that is general, not content specific. This is why traditional content analysts have not participated in its development. However, what students of readability and content analysts have in common is the effort of making reliable and valid inferences from text to a chosen context, as well as the need to connect the two empirical domains by means of what content analysts call analytical constructs. The whole history of readability research is one of gradually refining definitions of readability so that it can be inferred from measurable textual attributes and of improving the underlying analytical constructs.

Early readability studies looked into vocabularies, cataloging words with which students on various grade levels would be familiar (Thorndike, 1921) and could be expected to cause few reading problems. In the 1930s, readability researchers began to employ statistical correlations between numerous measurable attributes of texts and judgments of the difficulty of reading these texts. Regression equations were used to identify predictors of readability from which computationally efficient formulas could be constructed. For schoolchildren, a series of test lessons, developed by McCall and Crabbs (1925), still published, increasingly became the standard criterion against which most readability measures were tested. For adults, researchers used the opinions of library users, the popularity of publications, and multiple-choice comprehension tests. Adults showed more diversity than did children owing to the influence of different interests, backgrounds, levels of education, and occupation, rendering a common formula more difficult.

In 1934, Rudolph Flesch, probably the most cited readability researcher, proposed a simple three-factor formula, which correlated .74 with McCall-Crabbs test scores (Klare, 1963:56ff). His procedure:

Systematically select samples of 100 words throughout the material to be rated

Compute average sentence length in words (xs)

Count the number of affixes (xm)

Count the number of personal references (xh)

Average the results and insert in the formula:

.1338xs + .0645xm – .0659xh – .7502.

The resulting index brought most of the measured texts between 1 (easiest) and 7 (most difficult). Adding the constant 4.2498 instead of .7502 gave the reading grade placement at which 75% comprehension could be observed. His formula acknowledged earlier findings that long sentences are difficult but added the intuition that abstractions, indicated by affixes, add to this difficulty while personal references subtract from it.
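The three-factor formula quoted above translates directly into code. This is a sketch under stated assumptions: the function and parameter names are illustrative, the grade-placement variant reads "adding the constant 4.2498 instead of .7502" as replacing the subtracted .7502 with an added 4.2498, and the sample values are hypothetical rather than measurements of any real text.

```python
def flesch_early_index(avg_sentence_length, affixes, personal_references,
                       constant=-0.7502):
    """Three-factor index as quoted: .1338xs + .0645xm - .0659xh - .7502."""
    return (0.1338 * avg_sentence_length
            + 0.0645 * affixes
            - 0.0659 * personal_references
            + constant)

def flesch_early_grade(avg_sentence_length, affixes, personal_references):
    """Grade-placement variant: same weights, constant 4.2498 in place of -.7502."""
    return flesch_early_index(avg_sentence_length, affixes, personal_references,
                              constant=4.2498)

# Hypothetical averages over 100-word samples, for illustration only.
index = flesch_early_index(avg_sentence_length=20, affixes=35, personal_references=5)
grade = flesch_early_grade(avg_sentence_length=20, affixes=35, personal_references=5)
```

Under this reading the two scales differ only by a constant shift, so they rank texts identically; longer sentences and more affixes raise the score, while personal references lower it.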

An interesting controversy led Flesch to modifications of his formula. According to Klare (1963:58), the statistician S. S. Stevens, known for his distinctions of four levels of measurement, and Geraldine Stone (1947) applied the Flesch formula to psychology textbooks used at Harvard University and found the difficulty of William James’s Psychology to be overestimated and Koffka’s Principles of Gestalt Psychology to be underestimated according to student judgment. Following this controversy, Flesch (1948) separated readability into two kinds: reading ease and human interest. Because of the laborious nature of counting affixes, in his new reading ease formula, the count of affixes was replaced by a count of the number of syllables per 100 words. Flesch’s reading ease (see below) continued to correlate highly with the McCall-Crabbs criterion, .7047, whereas human interest was .4306. Counting syllables has been the most common feature of most readability measures ever since. Flesch did not abandon his earlier insight that reading ease had much to do with using abstractions and developed a measure of the level of abstraction (Flesch, 1950) of words in a text. This could be used as such but also as a readability measure because it correlated .72 with the criterion, a slight increase over the .7047 for the reading ease measure.

The 50 years following Flesch’s and his predecessors’ proposals were filled with readability studies. Some 40 formulas have been proposed, tried out, abandoned, or refined, not all of them tested by empirical evidence (DuBay, 2004). But the analytical constructs evolved very little.

CURRENTLY POPULAR ANALYTICAL CONSTRUCTS

Flesch Reading Ease (Flesch, 1948)

This formula has now withstood the test of time. It calls for selecting any 100-word sample from a text, counting

ASL = the average sentence length = total number of words/total number of sentences

ASW = the average number of syllables per word


and computing

206.835 – 1.015 ASL – 84.6 ASW.

The resulting score ranges from 0 to 100 and suggests that fifth graders can read texts at score 90 to 100, eighth to ninth graders at 60 to 70, and college graduates at 0 to 30. According to Wikipedia, Reader’s Digest magazine scores about 65, Time magazine about 52, and Harvard Law Review below 30. Microsoft Word computes this score after using its spell checker. This chapter scores 29.9.
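The following sketch shows one way the reading ease score might be computed from raw text. Only the formula itself comes from the text above. The sentence splitter, the word pattern, and especially the vowel-run syllable counter are crude stand-ins introduced here as assumptions; they are not what Flesch or Microsoft Word use, and careful work would count syllables by hand or with a pronunciation dictionary.

```python
import re


def count_syllables(word):
    """Crude estimate (an assumption): one syllable per run of vowels, minimum one."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_reading_ease(text):
    """Compute 206.835 - 1.015 * ASL - 84.6 * ASW for a text sample."""
    # Naive splitting: sentences end at ., ! or ?; words are runs of letters.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    asl = len(words) / len(sentences)                           # average sentence length
    asw = sum(count_syllables(w) for w in words) / len(words)   # syllables per word
    return 206.835 - 1.015 * asl - 84.6 * asw


if __name__ == "__main__":
    sample = "The cat sat. The dog ran."
    print(round(flesch_reading_ease(sample), 2))
```

Because the score depends heavily on how syllables and sentence boundaries are counted, different tools can report noticeably different values for the same text.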

Flesch-Kincaid Grade Level

This formula translates the components of Flesch’s reading ease into a score that reflects grade levels of education, making it easier for teachers, parents, and librarians to recommend appropriate reading materials, including books, to students. It too calls for counting:

ASL = the average sentence length = total number of words/total number of sentences

ASW = the average number of syllables per word

but computing this index:

0.39 ASL + 11.8 ASW – 15.59.

Its score corresponds to the grade level at which an average student would be able to read the measured text. Microsoft Word produces this automatically as well. This chapter scores 12.6, which would suggest a level below college.
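Given the three totals behind ASL and ASW, the grade level is a one-line computation. The sketch below takes the totals directly, since how words, sentences, and syllables are counted is left open by the formula; the function name and the sample totals are hypothetical.

```python
def flesch_kincaid_grade(total_words, total_sentences, total_syllables):
    """Compute 0.39 * ASL + 11.8 * ASW - 15.59 from raw totals."""
    if total_words <= 0 or total_sentences <= 0:
        raise ValueError("word and sentence totals must be positive")
    asl = total_words / total_sentences    # average sentence length
    asw = total_syllables / total_words    # average syllables per word
    return 0.39 * asl + 11.8 * asw - 15.59


if __name__ == "__main__":
    # Hypothetical sample: 100 words in 5 sentences with 150 syllables.
    print(round(flesch_kincaid_grade(100, 5, 150), 2))
```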

Passive Sentence Readability

Microsoft Word also computes the proportion of passive to all sentences of a document. It ranges from 0 (supposedly easiest) to 1 (supposedly most difficult) and is based on the contention that passive sentences weaken the direction of the verb and can confuse the meaning of a sentence, even when grammatically correct. A reference for this index could not be found, nor evidence of its validity. Since it has become available, it is being discussed. Writers insist that one cannot do entirely without passive constructions, suggesting that 25% would still be readable. In this chapter, 9% of sentences are passive.

Dale and Chall Formula (Dale & Chall, 1948)

These researchers sought to improve on Flesch’s ASW, replacing it by a count of difficult words, defined as not occurring on a carefully researched list of 3,000 easy words. Forty years later, Dale and O’Rourke (1981) revised this list. They recommend taking several 100-word samples from different parts of a text, for books every 10th page, counting

PDW = the percentage of words not on the list of 3,000 easy words = 100 × number of difficult words/total number of words

ASL = the average sentence length = total number of words/total number of sentences

and computing

0.1579 PDW + 0.0496 ASL + 3.6365.

Its score, designed to indicate the grade level, consistently correlated .70 with the McCall-Crabbs criterion.
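A minimal sketch of the Dale and Chall computation follows. It uses the coefficients as commonly published for the 1948 formula, 0.1579 for the percentage of difficult words and 0.0496 for ASL; treat these as assumptions to be checked against Dale and Chall (1948). The six-word `EASY_WORDS` set is a hypothetical stand-in for the real list of 3,000 easy words, and the function name is likewise invented for illustration.

```python
# Hypothetical stand-in for the 3,000-word list of easy words.
EASY_WORDS = {"the", "cat", "saw", "dog", "and", "bird"}


def dale_chall_score(words, n_sentences, easy_words=EASY_WORDS):
    """Compute 0.1579 * PDW + 0.0496 * ASL + 3.6365.

    words: list of word tokens in the sample
    n_sentences: number of sentences in the sample
    PDW is the percentage (0 to 100) of words not on the easy-word list.
    """
    if not words or n_sentences <= 0:
        raise ValueError("need at least one word and one sentence")
    difficult = sum(1 for w in words if w.lower() not in easy_words)
    pdw = 100 * difficult / len(words)   # percentage of difficult words
    asl = len(words) / n_sentences       # average sentence length
    return 0.1579 * pdw + 0.0496 * asl + 3.6365


if __name__ == "__main__":
    sample = "the cat saw the ubiquitous dog and the ephemeral bird".split()
    print(round(dale_chall_score(sample, n_sentences=1), 4))
```

The outcome is only as good as the word list: with the toy list above, almost every real text would count as difficult.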

SMOG (Simple Measure of Gobbledygook) (McLaughlin, 1969)

SMOG scores too indicate reading levels, defined as the level at which readers can understand 90% to 100% of the information in a text. It calls for counting

NS = the number of sentences involved, at least 30: 10 consecutive sentences selected near the beginning of a text, 10 in the middle, and 10 near the end. In long sentences with colons or semicolons followed by a list, count each part of the list, together with the beginning phrase of the sentence, as an individual sentence.

NP = the number of polysyllable words in these sentences (i.e., words with three or more syllables, even if the same word appears more than once). Count words with hyphens as one. Read numbers aloud to determine the number of syllables it takes to verbalize them. Take abbreviations as the whole word they represent.


and computing

$$1.0430\,\sqrt{NP \times \frac{30}{NS}} + 3.1291.$$

Fry Readability Formula (Fry, 1977)

Fry’s approach surely is most user-friendly. It requires randomly selecting three passages of exactly 100 words, beginning with a sentence, counting

X = the number of sentences in the 100 words, estimating the last sentence to the nearest 1/10th

Y = the number of syllables in the 100 words

and finding the grade level in the intersection of the X and Y coordinates in the following graph.

[Figure: Fry graph. Horizontal axis: average number of syllables per 100 words (toward long words); vertical axis: average number of sentences per 100 words (toward long sentences).]

Figure 1 Fry Graph for Estimating Reading Ages (in Years)

FORCAST Readability Formula (Caylor, Sticht, Fox, & Ford, 1973; see DuBay, 2004, p. 51)

The use of this formula is even simpler than Fry’s. It was developed to evaluate reading requirements in the U.S. Army, applied to military reading matters, especially technical instructions, and tested with members of the Army in various occupational roles. It asks to count

N1SW = the number of one-syllable words in a passage of exactly 150 words

and compute

20 – N1SW/10.

This surprisingly simple formula was found to correlate .98 with Flesch’s formula, .98 with Dale-Chall’s, and .77 with graded military reading matter. It had the advantage of working within a relatively homogeneous adult population of military recruits and service personnel. DuBay (2004:52) reports on a similar research project for the U.S. Navy.
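Two of the formulas in this section, SMOG and FORCAST, need very few counts, which makes them easy to put side by side. The sketch below assumes the counts have already been made by hand according to the sampling rules described for each formula; the function names and the sample counts are hypothetical.

```python
import math


def smog_grade(n_polysyllables, n_sentences):
    """SMOG: 1.0430 * sqrt(NP * 30 / NS) + 3.1291.

    n_polysyllables (NP): words of three or more syllables in the sampled sentences
    n_sentences (NS): number of sampled sentences (the procedure asks for at least 30)
    """
    if n_sentences <= 0:
        raise ValueError("number of sentences must be positive")
    return 1.0430 * math.sqrt(n_polysyllables * 30 / n_sentences) + 3.1291


def forcast_grade(n_one_syllable_words):
    """FORCAST: 20 - N1SW / 10, with N1SW counted in a passage of exactly 150 words."""
    if not 0 <= n_one_syllable_words <= 150:
        raise ValueError("count must lie between 0 and 150")
    return 20 - n_one_syllable_words / 10


if __name__ == "__main__":
    # Hypothetical counts: 16 polysyllable words in 30 sentences,
    # and 100 one-syllable words in a 150-word passage.
    print(round(smog_grade(16, 30), 4), forcast_grade(100))
```

Both functions illustrate how little of a text such formulas actually consult: a single count, rescaled to a grade.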

STRUCTURAL PROBLEMS OF THE PATHS TAKEN

Over 80 years of efforts by a growing community of researchers to improve the validity of the constructs underlying these formulae, all of them multiple regression equations, have reached a ceiling. Correlations with readability criteria seem to stay below .80. One can identify six reasons for this ceiling. They can serve as a warning to content analysts who seek to develop similar computational constructs for making inferences from text:

• Surface measures. Sentence lengths, numbers of syllables in words, short words, difficult words, and so on are easily countable but pertain only to epiphenomena of reading and writing. Long words, for example, are not difficult as such, but because they tend to be used less often, they naturally include more unfamiliar words than short words do. There are multisyllable words most English readers have no problems with, like television, and monosyllable words, like hod,* that may well bring reading to a halt. Counts attend to the rarely noted surface of texts that readers typically penetrate.

• Typography. Webster’s dictionary includes legibility in the definition of readability, originally good handwriting. As graphic artists know too well, the readability of printed matter is influenced by font styles, sizes, colors, and background, as well as by the organization of text (hierarchical organization of headlines, bullets, and highlighting devices) and layouts, including the use of illustrations and graphs, known to add to interest and comprehension.

• Narrative structures. Counts cannot capture the organization of the counted words, propositions, sentences, or paragraphs into larger compositions. Narratives, arguments, syllogisms, coherence, and the development of plots that writers consider crucial for making complex ideas clear escape context-free counting of units of text.

• Readers’ choices. Readers of technical instructions, even of large newspapers, rarely feel bound to work through a text linearly, from its beginning to its end. They typically navigate through textual matter, selecting what supports the construction of their own mental narratives along the way. Hypertext documents support nonlinearities explicitly.

• Discursive competencies. Students, during their formal education, constitute a population that is relatively easy to differentiate into grades, but adults develop unequal competencies and approach texts situationally. Discourse communities distinguish themselves by their members’ interests, use of specialized vocabularies, customary patterns of reasoning, and prior knowledge of relevant subject matter. What is readable in one community may be incomprehensible in another. A single formula cannot do justice to this diversity.

• Cultural dynamics. Familiarity with vocabularies and grammatical constructions changes with reading experiences and over time. When used, what is difficult today is destined to be less so in the future. The difference in vocabulary between young people and older folks is not merely developmental, as supposed by formulae that predict reading grades. It signals a dynamics of culture and language use to which reading and writing contribute, literature and poetry in particular. For one example, in pre-Elizabethan English, the average sentence length was 50 words. In Elizabethan English, it was 45 words. In Victorian English, it was 29 words (Sherman, 1893, cited in DuBay, 2004). Currently, average sentence length is down to 20. As readability shifts, ways to infer it must do as well. Culture-free formulae cannot.

Writers’ and teachers’ associations have considered it a danger and resisted equating the components of readability formulae with guidelines for good writing. Indeed, if writers were rewarded by achieving high readability scores, and only that, they could easily produce meaningless strings of monosyllable words. This possibility suggests that the above formulae address only one epiphenomenon of reading and writing, not the heart of it. Inferred readability is unlike actual readability. Readability has to do with how readers understand texts and whether their writers can put themselves in the place of readers and make sense of their texts together. For writers, readability formulae can at best serve as a warning that something may have to be addressed.

*A tray with a pole handle that is borne over one’s shoulder for carrying loads, typically mortar or bricks. Hods are most likely familiar to masons.

MEASURING THE READABILITY OF TEXTS: THE CLOZE PROCEDURE

The above formulae are analytical constructs, derived from multiple regressions of textual attributes on a criterion variable. They are used predictively, to infer not the readability of texts but various measures of it. Prediction takes place in advance of a text being read by anyone other than its author. This inference implicitly assumes that the analytical construct underlying these formulae models and represents readers in some respect. As shown above, this representation is shallow indeed.

Taylor (1953) developed a test that overcomes most of the above-mentioned shortcomings. It relies on an ability of readers that is most closely aligned with comprehension: their ability to anticipate and make sense of words from their context of use. Osgood (1959:78–88) introduced Taylor’s “Cloze procedure” to the community of content analysts. The procedure is simple enough. From a text:

Replace, say, every fifth word by a blank that readers can fill in. Texts with at least 50 blanks make the procedure quite reliable.

Count the number of correct guesses of the deleted word: correct in form (no synonyms), number, person, tense, voice, and mode. Ignore differences in spelling.

The proportion of correct guesses is the Cloze score.

Cloze scores correlate highly with the multiple-choice answers by readers and can be considered a substitute for subjective judgments of reading difficulty.

Guessing the correct words from their context of use enables readers to employ most of the abilities that the above formulae must ignore: the information provided by typography, grammar, narrative structures, readers’ discursive competencies, author-reader commonalities, and changes in culture. Readers are always from the present and can be chosen from the population of interest.

Scores below .35 indicate frustration, .35 to .50 assisted (instructional) reading, and .50 to .60 unassisted reading (DuBay, 2004:27). The Cloze procedure does not infer readability from textual attributes; it measures the redundancy needed for reading comprehension. It relies on real people rather than regression equations. Inferences are inductive (from a sample to a population of texts + readers), not abductive (from text to the human ability to read it).
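The Cloze procedure and the score bands above can be sketched as follows. The function names and the toy word list are hypothetical. Two simplifications are assumptions of this sketch: guesses are scored by exact match after lowercasing, whereas the procedure ignores spelling differences (which calls for human judgment), and every score of .50 or more is labeled unassisted, although the text names only the band from .50 to .60.

```python
def make_cloze(words, step=5):
    """Blank every `step`-th word; return (test words, answer key by position)."""
    test, key = [], {}
    for i, word in enumerate(words):
        if (i + 1) % step == 0:
            key[i] = word          # remember the deleted word
            test.append("_____")
        else:
            test.append(word)
    return test, key


def cloze_score(key, guesses):
    """Proportion of blanks filled with the deleted word (no synonyms)."""
    if not key:
        raise ValueError("the test contains no blanks")
    correct = sum(
        1 for i, word in key.items()
        if guesses.get(i, "").strip().lower() == word.lower()
    )
    return correct / len(key)


def reading_level(score):
    """Map a Cloze score to the bands reported by DuBay (2004:27)."""
    if score < 0.35:
        return "frustration"
    if score < 0.50:
        return "assisted (instructional)"
    return "unassisted"


if __name__ == "__main__":
    words = "a b c d e f g h i j".split()     # toy text; real tests need 50+ blanks
    test, key = make_cloze(words)
    score = cloze_score(key, {4: "e", 9: "x"})  # one reader's guesses by position
    print(" ".join(test), score, reading_level(score))
```

Unlike the formulas, nothing here is computed from the text alone: the score exists only once actual readers have supplied their guesses.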

LESSONS FOR THE CONTENT ANALYST

For content analysts, the lessons from the history of readability research are as follows:

• It is important to be clear about the research questions to be answered and what is to be inferred, and to develop and empirically validate analytical constructs before applying them inferentially.

• To allow validity to accumulate, it pays to consider the categories and analytical constructs of previous research before inventing new categories whose validity is uncertain.

• Most analytical constructs have structural limitations. They reach ceilings in their ability to answer research questions and in the accuracy of the inferences they enable. Going beyond these ceilings requires structural innovations in the analytical constructs used to bridge the gap between text and what is to be inferred from it.

• Using the intellectual capabilities of readers, coders, and observers may introduce problems of reliability but can greatly enhance validity in the end.


REFERENCES

Caylor, J. S., Sticht, T. G., Fox, L. C., & Ford, J. P. (1973). Methodologies for determining reading requirements of military occupational specialties (Technical Report No. 73–5). Alexandria, VA: Human Resources Research Organization.

Dale, E., & Chall, J. S. (1948). A formula for predicting readability. Educational Research Bulletin 27:1–29, 37–54.

Dale, E., & O’Rourke, J. (1981). The living world vocabulary: A national vocabulary inventory. Chicago: World Book-Childcraft International.

DuBay, W. H. (2004). The principles of readability. Costa Mesa, CA: Impact Information. Retrieved October 25, 2006, from www.impact-information.com/impactinfo/readability02.pdf

Flesch, R. F. (1934). Estimating the comprehension difficulty of magazine articles. Journal of General Psychology 28:63–80.

Flesch, R. F. (1948). A new readability yardstick. Journal of Applied Psychology 32:221–233.

Flesch, R. F. (1950). Measuring the level of abstraction. Journal of Applied Psychology 34:384–390.

Fry, E. B. (1977). Fry’s readability graph: Clarification, validity, and extension to level 17. Journal of Reading 24:242–252.

Klare, G. R. (1963). The measurement of readability. Ames: Iowa State University Press.

McCall, W. A., & Crabbs, L. M. (1925). Standard test lessons in reading: Teacher’s manual for all books. New York: Bureau of Publications, Teachers College, Columbia University.

McLaughlin, G. (1969). SMOG grading: A new readability formula. Journal of Reading 12:639–646.

Osgood, C. E. (1959). The representational model and relevant research methods. In I. de Sola Pool (Ed.), Trends in content analysis (pp. 33–88). Urbana: University of Illinois Press.

Sherman, A. L. (1893). Analytics of literature: A manual for the objective study of English prose and poetry. Boston: Ginn & Co.

Stevens, S. S., & Stone, G. (1947). Psychological writing, easy at hand. American Psychologist 2:230–235.

Taylor, W. L. (1953). Cloze procedure: A new tool for measuring readability. Journalism Quarterly 30:415–433.

Thorndike, E. L. (1921). The teacher’s word book. New York: Bureau of Publications, Teachers College, Columbia University.
