Journal of Memory and Language 46, 57–84 (2002)
doi:10.1006/jmla.2001.2797

How Incremental Is Language Production? Evidence from the Production of Utterances Requiring the Computation of Arithmetic Sums

Fernanda Ferreira and Benjamin Swets

Michigan State University

The incremental approach to language production assumes that the production system interleaves planning and articulation processes. Two experiments examined this assumption. In the first, participants stated the sums of two two-digit numbers in one of three different kinds of utterances: the sum by itself, the sum followed by the sequence “is the answer,” or the frame “The answer is” followed by the sum. Problem difficulty was manipulated as well, so that in some conditions, speakers could (in principle) state the tens component of the sum while planning the ones. Latencies to begin to speak were the same for all three utterance types and were affected by the difficulty of the problem as a whole. Utterance durations were unaffected by problem difficulty. In the second experiment, participants were induced to speak incrementally through the use of a deadline procedure. Both latencies and utterance durations were influenced by the difficulty of the problem. This latter finding supports a basic premise of the incremental approach: Speakers sometimes speak and plan simultaneously. Nevertheless, the production system appears not to be architecturally incremental; instead, the extent to which people speak incrementally is under strategic control. © 2001 Elsevier Science

Key Words: language production; arithmetic; incrementality; syntax; phonology.

guistics began in the 1960s, researchers have argued for a hierarchical model of language production (Fromkin, 1971; Garrett, 1975, 197…, 1980) in which processing proceeds from semantic intention to articulation. According to a recent version of this model (Bock & Levelt, 1994), activity begins in the Message Component when a speaker forms an idea of what he or she wishes to say. The Grammatical Component operates next, and is itself subdivided into two stages of processing. The “Functional Level” selects the meaning and syntactic features

This study was supported in part by grant SBR-93192



We also thank Gary Shroyer, Richard Falk, Mel Johnson, Robin Revette, Leah English, David Carnahan, Patrick Williams, and Janis Stacey for their assistance. Finally, we are grateful to Karl Bailey, Herb Clark, David Davidson, Vic Ferreira, Zenzi Griffin, John Henderson, and Mike Tanenhaus for helpful discussions about the study, and to the Stanford Language Users Group (SLUGs) for their comments on the work. Address correspondence and reprint requests to Fernanda Ferreira, Department of Psychology, Psychology Research Building, Michigan State University, East Lansing, MI 48824-1117. E-mail: ferreira@msu.edu

and assigns them a grammatical role such as subject or object. Then, positional processing takes place: The word-forms corresponding to lemmas are retrieved, and the serial order of individual phrases is put together. The next component engages in phonological processing (see F. Ferreira (1993) for evidence that the computation of the intonational and metrical features of an utterance precedes retrieval of word-forms), and from there the system sends orders to the articulatory organs to produce speech sounds appropriately.

Any adequate model of language production must incorporate mechanisms to explain three important features of normal speech: First, utterances generally conform to the speaker’s communicative goals. Second, speakers choose from among various ways they might express an idea. For instance, a speaker may say Ten plus seven make seventeen, or Seventeen is the sum of ten plus seven, and so on. The final choice reflects both the communicative needs of the speaker and the processing states of the various components of the language production system. Finally, although pauses, repetitions, and repairs certainly do occur in normal speech (Clark & Wasow, 1998), speakers are usually reasonably







fluent (Levelt, 1989). At the extreme, no competent speaker needs to pause before retrieving each successive word of his utterance.

In many recent models of how people talk, these features of speech are explained with the single assumption that language production is incremental (V. Ferreira, 1996; Levelt, 1989; Roelofs, 1998; Wheeldon & Lahiri, 1997). Once a piece of information at a level of processing becomes available, it triggers activity at the next level down in the production system (Levelt, 1989, p. 23: “Each processing component will be triggered into activity by a minimal amount of its characteristic input”), and this sequence may continue all the way to articulation. This “vertical” aspect of the architecture is coordinated with what we might term its “horizontal” activities: At the same time that a piece of information works its way from idea to articulation, other pieces are constructed and make their way through the system as well. Thus, incremental models assume that various levels can operate in parallel. At the same time, this vertical parallelism implies a certain amount of horizontal seriality. That is, because the system is incremental, information at a particular level of processing is not necessarily handled in parallel. For example, a word at the end of an utterance would not be syntactically available in parallel with one at the beginning, because the earlier word was shunted off for phonological processing after it was syntactically encoded.

To illustrate how an incremental system operates, consider a speaker who wishes to express the fact that 17 is the sum of 10 and 7. The processing events might take place as follows. First, assume that the concept TEN becomes activated first. That activation will trigger retrieval of the corresponding lemma, which will cause the lemma to take its place at the left edge of a syntactic frame. (Simultaneously, other concepts become activated, but their presence is not relevant at the moment.) These events will cause the word-form for TEN to be accessed, so the word ten may be articulated. A consequence of this architecture, then, is that the speaker could say ten before knowing exactly how the utterance will end. As Wheeldon and Lahiri (1997) state, “... fragment of an utterance should not be dependent on information available in later fragments of the utterance. For example, constructing the initial prosodic units of a sentence should not be dependent on how the sentence will end” (p. 361). Having now said ten, the speaker is committed to some sort of a global form in which the addends occur earlier in the utterance and the sum occurs later. Therefore, a likely utterance given these events is something like Ten and seven make seventeen.

The incrementality hypothesis is appealing because it accounts for all three features of human speech mentioned above. A concept corresponding to what a speaker intends to say becomes activated based on information contained in the message-level representation. The early activation of the concept leads to early placement in the utterance. Incrementality, then, explains at least in part how speakers choose from among different sentence forms for expressing their ideas. Syntactic variations are taken advantage of by the production system to accommodate the states of activation of information in the production system. And speakers are fluent because they do not need to pause for long periods of time in order to plan their utterances. Instead, in a highly parallel system, a concept may make its way vertically through the architecture while, simultaneously, other concepts are becoming activated and encoded. Therefore, speakers do not need to plan whether to say ten and seven make seventeen or seventeen is the sum of ten and seven. The decision is made by the availability of concepts in long-term memory.

Furthermore, the incrementality hypothesis has received a fair bit of empirical support in recent experimental studies of language production. The consequence that has been most thoroughly examined is the notion that syntactic forms are chosen to reflect states of activation in the production system. Bock (1986) demonstrated that a semantically primed word tends to occur early in a sentence. Thus, if an experimental participant is shown the word worship and then sees a picture of lightning striking a church, she is likely to produce a passive utterance such as the church is being struck by lightning. The idea is that the semantic prime caused





Wheeldon and Lahiri (1997) conducted a more direct test of incrementality. Participants received a noun phrase such as het water (“the water” in Dutch) followed by the question Wat zoek je (“What do you seek?”), and their task was to answer the question in a full sentence using the noun phrase they were given (i.e., ik zoek het water). Participants were encouraged to begin to speak as quickly as possible. Wheeldon and Lahiri recorded utterance initiation times and found that latencies were shorter when the first phonological word of the utterance was less complex. The characteristics of the other phonological words had no effect. They concluded that the production system is radically incremental: What determines when a speaker begins to talk is how long it takes her to get the first phonological word ready. This conclusion is reinforced by results from their other three experiments, which demonstrated that latencies are affected by the complexity of the entire utterance






On the other hand, some findings in the literature are inconsistent with incrementality. For example, Lindsley (1975) asked participants to respond as quickly as possible to a simple picture showing a transitive action (e.g., one person touching another). He found that speakers began to phonologically encode their utterances before they had syntactically encoded the object of a transitive action but not before they knew the verb. In another study, Meyer (1996) used a word distractor paradigm to examine how much information about the words of an utterance is accessed prior to articulation. Speakers (of Dutch) produced simple utterances and were presented with a spoken distractor that was either semantically or phonologically related to either the subject or object noun of the sentence. Meyer found that a semantic distractor for either noun increased initiation times, while a phonological distractor impaired performance only when it was related to the subject noun. Meyer concluded that sentence production requires the retrieval of the semantic/syntactic information associated with most of the utterance, but it only requires the retrieval of phonological information for the first word or phrase. Thus, Meyer’s experiments provide evidence that grammatical encoding requires simultaneous knowledge about both preverbal and postverbal material, a finding that is inconsistent with incrementality.

Further evidence that is inconsistent with incremental production comes from a study by Stallings, MacDonald, and O’Seaghdha (199…), who examined the tendency for sentences with main verbs such as transferred and introduced to occur in either a canonical or a shifted form (e.g., Mary introduced the new neighbor to Bill vs Mary introduced to Bill the new neighbor, respectively). They found that the more often a verb tended to occur in structures separating it from its complement, the more likely partici-



Finally, just how fluent is real language production? Studies of naturally occurring speech have shown that speakers do tend to pause before major syntactic constituents. For example, Henderson, Goldman-Eisler, and Skarbek (1966) found that periods of fluency alternate with phases during which the speaker engages in a great deal of pausing. Also, more demanding speaking tasks (e.g., interpreting a cartoon versus merely describing it) result in less fluent speech (Goldman-Eisler, 1968). Similarly, Ford (1982) examined spontaneous speech for pauses longer than 200 ms and found that they preceded about 20% of deep clauses. Holmes (1988) asked speakers to talk spontaneously on various topics and then asked another group of participants to read the utterances the former group produced. She found that pauses and hesitations occurred before complement and relative clauses in spontaneous but not read speech. These pauses suggest that speakers disrupt fluency in order to plan their speech (see also Ferreira, 1991, 1993), and the fact that they plan in units that are approximately clausal in size (see also Bock & Cutting, 1992) indicates that the system has nonincremental aspects. Finally, Clark and Wasow (1998) have shown that disfluencies in normal speech are quite common and, furthermore, that they are more likely to occur the more complex the upcoming constituent. Thus, although an “ideal delivery” (Clark & Clark, 1977) might be the speaker’s goal, under most normally demanding speaking circumstances, pauses, hesitations, and other disfluencies are fairly frequent.

Where are we so far? On the one hand, experiments in which speakers are asked to begin to speak as soon as they can tend to show that



The participants in Brysbaert et al. (1998) stated the sum of a one-digit number and a two-digit number as quickly as possible. For example, someone might see “21 + 4” and respond “25”. The problems were presented either with the larger number preceding the smaller or in the other order. In addition, Brysbaert et al. tested two groups of participants, native French speakers and native Dutch speakers. The reason for looking at these two groups is that the languages differ in whether the ones or tens are pronounced first for a number such as “twenty-five”. In French, one says the equivalent of “twenty-five”, but in Dutch one says the equivalent of “five and twenty”. Thus, if participants are faster to initiate articulation of the sum when the format of the problem allows them to prepare the first phonological word (“five” or “twenty”) of their utterance more easily, then we will have support for radical incrementality. This was indeed the result Brysbaert et al. reported: For no-carry items such as 21 + 4, French participants showed a 56-ms advantage for the 21 + 4 order, and Dutch participants showed a 51-ms advantage for the 4 + 21 order. Brysbaert et al. interpreted the results as follows: “the Dutch speakers try to get access to the unit of the response first, because they start programming the pronunciation of the answer as soon as the value of the unit is known. In contrast, the French speakers have to capitalize on the value of the ten, which they must know before response execution can be started” (p. 67). This, then, is incrementality of the sort proposed by Wheeldon and Lahiri (1997): Speakers











Overall, research in language production does not provide clear evidence one way or the other for the incrementality hypothesis. What does seem apparent is that under some circumstances speakers are able to begin to speak as soon as they have formulated the smallest bit of linguistic structure (a phonological word), but that in other situations speakers are more cautious and do not speak until a reasonably large chunk of an upcoming utterance has been planned. This observation reveals that the language production system is strategically incremental: An important component of speech planning is to determine whether the situation calls for “blurting” or for more careful planning. This perspective suggests an important empirical question for researchers in language production: Even when speakers find that it is in their interests to articulate as quickly as possible in order to hold the floor, do they still plan more than just the first phonological word of the utterance anyway? A finding that they do would suggest that the language production system mandates some planning regardless of the speaker’s goals and strategies.

To examine these issues, we conducted two experiments that utilized an arithmetic task much like the one used in the Brysbaert et al. (1998) study. In our first experiment, participants responded to addition problems by producing three different types of sentence forms: just the answer itself, the answer at the beginning of a sentence, and the answer at the end of a sentence. This experiment yielded no evidence for incrementality: Participants planned the entire



The goal of this experiment was to examine incrementality in language production by seeing whether we could take the Brysbaert et al. (1998) effect one step further. Participants added together two two-digit numbers and stated the sum in the form of an utterance. We varied the difficulty of computing the tens column and of computing the ones column independently. In addition, participants were told to state their answers in one of three ways: The utterance could consist of just the sum, a sentence in which the sum was the grammatical subject, or a sentence in which the sum was a grammatical object. Recall that what Brysbaert et al. reported was that speakers preferred to have the addends arranged so that the leftmost addend allowed them to compute and articulate just the first phonological word of the sum. An extension of this result would be a finding that, in the sum-alone and sum-first conditions, initiation times are influenced only by the difficulty of calculating the tens column. Moreover, even if this quite radical incrementality is not observed in the experiment, the design allows us to test whether speakers are at least somewhat incremental. If they are, then problem difficulty (the entire problem, not just the tens column) should influence initiation times but only in the conditions in which the sum occurred utterance




Participants. All participants tested for this experiment were undergraduate students at Michigan State University, and all participated in exchange for partial credit in their introductory psychology courses. In the sum-only condition, 40 participants were tested in order to end up with 32 participants whose data could be used. To be included in the analyses, a participant had to produce correct answers to more than 75% of the problems. This criterion eliminated 5 participants; another 3 people were excluded because they did not speak loudly enough to trigger the voice-key. For the sum-frame condition (“SUM is the answer”), 3 people were not included in the analyses because they made too many errors, and 2 because of voice-key problems. For the frame-sum condition (“The answer is SUM”), 4 participants were excluded because of high error rates, and 1 because of voice-key errors. Thus, in total, data from 96 participants were analyzed, 32 in each of the three between-participants conditions.
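As an illustrative sketch only (it is not part of the original materials), the inclusion rule and the participant counts can be written out in Python. The function name, its arguments, and the assumption that accuracy is computed over all 168 problems in a list are hypothetical; the numbers tested in the sum-frame and frame-sum conditions are derived here from the stated exclusions and are not reported in the text.

```python
def include_participant(n_correct, n_problems, triggered_voice_key=True,
                        threshold=0.75):
    """Hypothetical helper: True if accuracy is strictly above the threshold
    and the participant spoke loudly enough to trigger the voice-key."""
    accuracy = n_correct / n_problems
    return triggered_voice_key and accuracy > threshold


# Exclusions reported per condition: (too many errors, voice-key problems)
exclusions = {
    "sum-only": (5, 3),
    "sum-frame": (3, 2),
    "frame-sum": (4, 1),
}
retained_per_condition = 32  # stated in the text

# Derived totals: retained participants plus those excluded in each condition
tested = {
    condition: retained_per_condition + errors + voice_key
    for condition, (errors, voice_key) in exclusions.items()
}
total_analyzed = retained_per_condition * len(exclusions)

print(tested["sum-only"], total_analyzed)
```

The derived count for the sum-only condition agrees with the 40 participants reported as tested, and the total agrees with the 96 participants analyzed.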

Materials. Two different stimulus lists were created, and a given participant saw only one of those lists. Each list contained 168 problems: 56 of those did not require a carrying operation (no-carry problems), 56 did (carry problems), and 56 were filler problems. The no-carry problems occurred in four different conditions: for 14 of them, computation of the tens part of the sum and the ones part of the sum was easy; for another 14, computation of the tens was easy and of the ones was hard; for another 14, computation of the tens was hard and of the ones was easy; and for the final 14, computation of both tens and ones was difficult. A column was considered easy if the sum was smaller than 5 and hard if the sum was greater than 5 (Geary, Bow-Thomas, & Yao, 1992). (Problems for which either column summed to 5 were excluded from the no-carry and carry sets.) An example of each problem type is given in Table 1. The carry problems also participated in four conditions, “easy”, “hard”, “harder”, and “hardest”. This continuum is based on the size of the resulting sum (Ashcraft, 1992). There were three types of filler problems, those that included an addend smaller than 20 but did not require a carrying operation, ones that included an addend less than 20 and that did necessitate carrying, and ones in which an addend was divisible by 10. The no-carry, carry, and filler problems were put into a random order, constrained so that two problems of the same type (same condition or filler type) did not occur in a row. The lists were identical, except that the order of addends was swapped from one list to the other.
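The column-difficulty rule for the no-carry problems can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used to build the lists: the function names are hypothetical, and the code simply applies the "smaller than 5 is easy, greater than 5 is hard, exactly 5 is excluded" rule to each column of a two-digit addition problem.

```python
def column_difficulty(column_sum):
    """Classify one column: below 5 is easy, above 5 is hard, 5 is excluded."""
    if column_sum < 5:
        return "easy"
    if column_sum > 5:
        return "hard"
    return None  # a column summing to exactly 5 was excluded


def classify_no_carry(addend_a, addend_b):
    """Return (tens difficulty, ones difficulty) for a no-carry problem.

    Hypothetical helper: raises ValueError if either column would carry.
    """
    ones = addend_a % 10 + addend_b % 10
    tens = addend_a // 10 + addend_b // 10
    if ones >= 10 or tens >= 10:
        raise ValueError("not a no-carry problem with a two-digit sum")
    return column_difficulty(tens), column_difficulty(ones)


# The four no-carry example problems listed for Experiment 1
for a, b in [(21, 22), (21, 26), (21, 62), (21, 66)]:
    print(a, "+", b, "=", a + b, classify_no_carry(a, b))
```

Applied to the four example problems, the sketch reproduces the four cells of the design: easy/easy, easy tens with hard ones, hard tens with easy ones, and hard/hard.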

Procedure. Participants were tested individually. The session began with the experimenter reading the instructions. Participants were told that they would see an addition problem on a computer monitor, and their task was to state the sum as quickly as possible into the microphone. They were reminded that they should respond as soon as possible, but that they should make sure they had the correct answer. A trial began with a fixation cross located in the center of the screen. One second later, the problem was presented with both addends on the same line separated by a plus sign, with spaces around the plus. The participant responded and then pushed a button to proceed to the next trial. He or she was allowed to correct the response initially given. The experimenter pushed one key on the keyboard if the answer given was correct, and another if it was incorrect. The sessions were tape-recorded to allow those experimenter decisions to be checked off-line, and to allow the durational properties of the utterances to be analyzed at a later point.

Examples of the Problems Used in Experiment 1

No-carry problems                    Carry problems            Filler problems
Condition             Example        Condition   Example       Type                     Example
Easy 10s, easy ones   21 + 22 = 43   Easy        23 + 28 = 51  Addend < 20, no-carry    16 + 11 = 27
Easy 10s, hard ones   21 + 26 = 47   Hard        23 + 38 = 61  Addend < 20, carry       17 + 36 = 53
Hard 10s, easy ones   21 + 62 = 83   Harder      23 + 58 = 81  Addend divisible by 10   12 + 70 = 82
Hard 10s, hard ones   21 + 66 = 87   Hardest     23 + 68 = 91

FIG. 1. Initiation times (in ms) for no-carry problems, Experiment 1.

For the sum-only condition, participants were asked to state the sum by itself. For the sum-frame condition, they were asked to give their answers “in the format ‘X is the answer’, where X is the sum.” They were then given an example (“if given 25 + 25, you would say ‘Fifty is the answer’”). For the frame-sum condition, they were asked to state their answers “in the format ‘The answer is X’, where X is the sum.” They were then given an example (“if given 25 + 25, you would say ‘The answer is fifty’”).

An experimental session lasted about 45 min, and participants were free to take a break between trials whenever they wished. The session began with a practice session consisting of problems (with the same characteristics as the fillers) and ended with a debriefing in which they were told the purposes of the study.

Results and Discussion

Latencies and accuracy data, no-carry problems. A trial was excluded if the response time was less than 300 ms or greater than 20 s (2% of the data were excluded) and if the response was incorrect. Latencies were analyzed as a 3 × 2 × 2 mixed factorial. The first variable is between-participants and represents the three utterance types (sum-only, sum-frame, frame-sum). The other two variables are within-participants and concern the difficulty of the problem: The tens column was either easy or harder to calculate, and similarly for the ones. (The two different lists produced the same results, so all analyses collapse over this variable.) All reported effects are significant at p < .05.

The latencies for the no-carry items are shown in Fig. 1. People took longer to begin saying the answer when it was difficult for them to calculate the tens and when it was difficult to calculate the ones. The effect of the two variables was additive, and it was identical for the three between-participant conditions. The statistical analyses are as follows: Participants took longer to initiate production of the sums when the tens were more difficult (2397 vs 2194 ms), F(1,93) = 24.22, MSe = 164606, and longer when the ones were more difficult (2433 vs 2159 ms), F(1,93) = 101.86, MSe = 70999. The two variables did not interact, F(1,93) = 2.44, MSe = 52436. There was no main effect of utterance type, F < 1, and no significant in-

The latency data place constraints on the extent to which the language production system is incremental, because the difficulty of both the tens and the ones contributed to initiation times. In other words, participants began their utterances only once they knew both parts of the sum, not just the part corresponding to the first produced phonological word. (Recently, Brysbaert and Fias have conducted a study identical to this one but with Dutch-speaking participants from the Catholic University of Leuven. They have reported the same results: both the difficulty of the tens and the difficulty of the ones influence time to initiate production of the sum.) Perhaps more surprisingly, the type of utterance in which the participant had to state the sum did not matter at all. Indeed, not only was the pattern the same across conditions, the absolute reaction times were remarkably similar as well. It does not appear, then, that participants produced their utterances in an incremental fashion. If they had, then latencies in the frame-sum condition would have been the same for the different problem types, because latencies would have been controlled by the time to produce the same phonological word across all conditions (“The answer” or “the answer is”). In addition, the requirement to produce not just the sum but also to remember to place it in a particular sort of frame does not seem to have affected latencies either (and we will see next that it does not affect accuracy).

The accuracy data are given in Table 2. Accuracy was unaffected by the between-participants manipulation: People were 95, 94, and 94%

The accuracy results make clear that the requirement to produce the sum in a carrier frame did not impose the sort of burden on participants that would result in lower accuracy. Furthermore, the need to hold the sum in memory while saying “the answer is” does not seem to have caused any particular burden either, as indicated by the equivalent accuracy in the sum-frame and frame-sum conditions.

Latencies and accuracy data, carry problems. Three percent of trials were eliminated from the analyses because of response times less than 300 ms or greater than 20 s, or because of an incorrect response. Latencies were analyzed as a 3 × 4 mixed factorial: The first variable is between-participants and represents the three utterance-type conditions. The other variable is within-participants and concerns the difficulty of the problem as reflected in the progressively increasing size of the sum (easy, hard, harder, hardest).
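The trial-exclusion rule used for the latency analyses can be expressed as a simple filter. The sketch below is illustrative only: the Trial record and the function name are hypothetical, and it assumes that response times of exactly 300 ms or exactly 20 s would have been retained, since the text excludes only times less than 300 ms or greater than 20 s.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """Hypothetical record for one trial."""
    rt_ms: float    # time from problem onset to speech onset, in ms
    correct: bool   # whether the response was scored as correct


def keep_trial(trial, min_rt_ms=300, max_rt_ms=20_000):
    """Keep a trial only if it is correct and its RT is within bounds."""
    return trial.correct and min_rt_ms <= trial.rt_ms <= max_rt_ms


trials = [
    Trial(rt_ms=250, correct=True),      # too fast: excluded
    Trial(rt_ms=2400, correct=True),     # retained
    Trial(rt_ms=2400, correct=False),    # incorrect: excluded
    Trial(rt_ms=25_000, correct=True),   # too slow: excluded
]
retained = [t for t in trials if keep_trial(t)]
print(len(retained))
```

The four example trials are invented for illustration and do not correspond to any data reported here.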

Percentage Correct for No-Carry Problems, Experiment 1

                                    Utterance type
            Sum-only                Sum-frame               Frame-sum
            Easy ones  Hard ones    Easy ones  Hard ones    Easy ones  Hard ones
Easy tens   94         94           94         94           93         94

FIG. 2. Initiation times (in ms) for carry problems, Experiment 1.

and 81 and 83% accurate in the harder and hardest conditions.

The data from latencies and accuracy for both the no-carry and carry problems provide little support for the notion that the language production system is highly incremental. To explore the incremental hypothesis further, the durations of the utterances were measured and analyzed in order to assess whether participants computed the sum online during utterance production (i.e., as they were saying “the answer is”) as well as prior to its initiation. Disfluencies were analyzed as well.

Utterance durations. Data from the frame-sum condition were analyzed in order to assess whether speakers might have engaged in some planning, or checking of their answers, as they produced the frame component of their utterances. Unfortunately, not all the recordings made of the experimental sessions were of uniform quality, and therefore only 26 of the participants whose data were included in the analyses for the frame-sum condition could be included in these analyses of duration and disfluencies. (Analyses of the latencies and accuracy levels of just these 26 participants determined that this subgroup showed the same pattern as the larger one.) The utterances were digitized at a rate of 20 kHz using Computerized Speech Laboratory (CSL, from Kay Elemetrics), and durations were measured using CSL’s waveform editor. The utterance was divided into parts as follows: the sequence the answer is, the sum, the first digit of the sum, and the second digit of the sum. Any pause time before the first digit was included as part of its duration, and any pause time before the second digit was included as part of the second digit’s duration. One concern is that the digits themselves are different in the different experimental conditions (see Appendix A), and it is possible that some of the differences in sum durations are due to the digits themselves and not the experimental manipulations. We will present the data anyway because we wish to explore any measure that might potentially reveal incrementality, and because the results are not discrepant from those for the answer is. Results are shown in Table 3.

For the no-carry utterances, the duration of the answer is was longer when the tens were easy to compute than when they were more difficult (475 vs 445 ms), F(1,25) = 16.84, MSe = 1431. Also, the duration was longer when the ones were difficult (454 vs 467 ms), F(1,25) =




The duration of the sum was longer when calculation of the tens was difficult vs easy (687 vs 650 ms), F(1,25) = 8.33, MSe = 2858. The duration of the first digit of the sum (the tens) was longer when the tens were difficult (347 vs 3… ms), F(1,25) = 2.87, MSe = 1222, p < .10, and similarly for the duration of the second digit: the second digit was longer when the tens had been difficult (340 vs 315 ms), F(1,25) = 6.84, MSe = 2273. It appears that when the tens were more difficult speakers stretched out the time they spent saying the sum.

For the carry utterances, the durations of the answer is, the first digit, and the second digit were unaffected by problem difficulty (see Table 3), all Fs < 1.

Analyses of disfluencies for frame-sum utterances, both problem types. A laboratory assistant unfamiliar with issues in psycholinguistics and cognitive science listened to the utterances and noted any disfluencies. Disfluencies were categorized into three major types, based on the type of editing term the speaker used (Clark & Wasow, 1998): “uh”, “um”, and other (e.g., words such as “geez” and profanities). There were five potential locations for a disfluency: before the utterance, after “the”, after “answer”, after “is”, and after the first digit of the sum. (No disfluency term occurred inside a word.) Overall, editing terms of any kind were quite rare. Of the 3854 utterances that were digitized for the duration and disfluency analyses, only 69 contained a disfluency (and only 2 of those involved the term “um”), and half of those occurred after the word “is”. A total of 3530 utterances were associated with correct responses; 63 (1.78%) included some disfluency. A total of 324 utterances were associated with incorrect responses; only 6 (1.85%) included a disfluency.
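The disfluency percentages follow directly from the reported counts, and a short arithmetic check makes the relationship explicit. The helper name below is hypothetical, and the overall rate across all digitized utterances is a derived figure that is not reported in the text.

```python
def disfluency_rate(n_disfluent, n_utterances):
    """Percentage of utterances containing a disfluency, to two decimals."""
    return round(100 * n_disfluent / n_utterances, 2)


# (utterances with a disfluency, total utterances), as reported in the text
correct = (63, 3530)
incorrect = (6, 324)

total_disfluent = correct[0] + incorrect[0]
total_utterances = correct[1] + incorrect[1]

print(disfluency_rate(*correct))      # rate for correct responses
print(disfluency_rate(*incorrect))    # rate for incorrect responses
print(total_disfluent, total_utterances)
print(disfluency_rate(total_disfluent, total_utterances))  # derived overall rate
```

The two subtotals add up to the 69 disfluent utterances and 3854 digitized utterances given above, and the rates for correct and incorrect responses are nearly identical.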

In sum, this experiment yielded little evidence for incrementality. A skeptic could argue, however, that the results are due to the nature of the task. The experimental situation calls for the speaker to produce error-free utterances, and the speaker has no need to begin to speak quickly. The problems themselves were quite difficult, particularly the carry problems. Therefore, calculation might have required all the speakers’ attention and not permitted the participants to “dual-task” such that they were speaking and calculating at the same time. The second experiment, then, was changed so as to encourage participants to speak incrementally as much as possible.

Before turning to the second experiment, we will address one final issue concerning these results. Recall that our goal was to try to extend Brysbaert et al.’s (1998) result that seemed to support radical incrementality. If French-speaking participants (for instance) preferred the xx + x order because it allowed them to articulate the tens while calculating the ones column, then perhaps with two two-digit addends our speakers of English would plan only the tens component











Because of the surprisingly strong nonincremental effects obtained in Experiment 1, we made changes to the experimental task in order to maximize the chances that incremental behavior could emerge. First, problems were made easier. Second, by introducing a timing bar to the task, we gave participants an incentive to begin to speak quickly rather than to wait until they felt entirely ready. Finally, the stimulus problems were left on the screen when participants began to speak, unlike in Experiment 1, in which participants’ voices caused the stimuli to disappear. All utterances were produced in the form “The answer is SUM” (see the section on pilot work below).

Participants. Fifty participants were tested, all of whom were undergraduates at Michigan State University receiving partial credit in their introductory psychology courses. To be included in the analysis, a participant had to produce correct answers to more than 75% of both the practice problems and the experimental problems. All participants met these criteria. An additional 4 participants were tested prior to the 50 whose data we are reporting. These 4 individuals were tested in pilot work designed to assess what would happen if speakers could use any utterance form they wished (see “Pilot instructions,” below, for more details).

Materials. Two different stimulus lists were created, and a given participant saw only one of them. Each list consisted of 112 problems, 56 of the no-carry experimental items from Experiment 1 (all of which consisted of two two-digit addends), and 56 new no-carry problems consisting of a one-digit addend and a two-digit addend. Computation of the ones column could be easy or hard, and similarly for the tens column. A column was considered easy if the sum was between 1 and 5, and hard if the sum was between 6 and 9. Examples illustrating the conditions are shown in Table 4.
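The difficulty coding just described can be illustrated with a minimal sketch. The function names are hypothetical and are introduced only for illustration; the two-letter labels (tens first, then ones) follow the EE/EH/HE/HH codes used in the appendix tables, and the sketch assumes no-carry problems only.

```python
def column_difficulty(column_sum: int) -> str:
    """Return 'E' (easy) for a column sum of 1-5 and 'H' (hard) for 6-9."""
    if 1 <= column_sum <= 5:
        return "E"
    if 6 <= column_sum <= 9:
        return "H"
    raise ValueError(f"column sum {column_sum} is outside the 1-9 range")


def classify_problem(first_addend: int, second_addend: int) -> str:
    """Label a no-carry problem as EE, EH, HE, or HH (tens first, then ones)."""
    ones_sum = first_addend % 10 + second_addend % 10
    tens_sum = first_addend // 10 + second_addend // 10
    if ones_sum > 9:
        # Hypothetical guard: carry problems were not used in Experiment 2.
        raise ValueError("carry problems are not covered by this sketch")
    return column_difficulty(tens_sum) + column_difficulty(ones_sum)


print(classify_problem(21, 22))  # easy tens, easy ones
print(classify_problem(5, 24))   # easy tens, hard ones
print(classify_problem(71, 3))   # hard tens, easy ones
```

A one-digit addend contributes nothing to the tens column, so for mixed addend problems the tens difficulty depends only on the two-digit addend.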

Problems were presented in a random order. The lists were identical, except that the order of addends was swapped from one list to the other. In any given list, half of the mixed addend pro

Mixed addends problems                     Same addends problems
Condition              Example             Condition              Example

Easy tens, easy ones   2 + 41 = 43         Easy tens, easy ones   21 + 22 = 43
                       41 + 2 = 43                                22 + 21 = 43

Easy tens, hard ones   5 + 24 = 29         Easy tens, hard ones   21 + 26 = 47
                       24 + 5 = 29                                26 + 21 = 47

Hard tens, easy ones   3 + 71 = 74         Hard tens, easy ones   21 + 62 = 83
                       71 + 3 = 74                                62 + 21 = 83

Pilot instructions. The goal of this experiment was to allow participants to speak as incrementally as possible. As we were developing this experiment, we reasoned that speakers might even choose to use different utterance types to facilitate this goal. For example, if they knew the sum quickly and wanted to get it out of the way, they might choose to say “SUM is the answer”; if they needed extra time, they might prefer the other arrangement, and they might even want to include extra “filler” words to stretch out the amount of planning time they would have before having to articulate the sum (e.g., “I think that the answer to that one is SUM”). If this strategy were observed, we would have more evidence for incrementality; as Levelt argued (1989, p. 245), speakers may choose the form and even the content of their utterance to accommodate the way that an utterance is evolving as it is encoded left to right.

Thus, the first four participants we tested were told to use any utterance form they liked, except that we required them to place the sum into a complete sentence. No examples were provided, to prevent our biasing their responses. After four participants, it became clear that speakers did not wish to take advantage of the latitude we had given them; on all 448 trials (112 trials × 4 participants), the speakers chose the form “The answer is SUM” (the frame-sum condition of Experiment 1). The instructions for the actual experiment, then, which excluded these four participants (because this latter group received different instructions from the subsequent participants), required participants to use the frame-sum format for their responses.

Procedure. Participants were tested individually. The session began with the participant reading a set of written instructions. They were told that they would see arithmetic problems on the computer monitor, and that their task was to give the answer in the format “The answer is SUM”. They were then given an example, e.g., “if you saw 25 + 25, you would say ‘The answer is fifty’”.

Participants were also warned that the problem on the screen would be accompanied by a timing bar, and that they should answer the question before the timing bar counted all the way down and produced a loud “beep” sound. The timing bar was a horizontal rectangle that gradually shrank in size until it disappeared, whereupon a beep sound was emitted by the computer. The duration of the timing bar varied randomly between 2 and 4 s. Participants were told that they should begin to speak quickly to avoid being beeped. They were also assured that they could correct their response if they believed it to be incorrect.

The 2- to 4-s value range for the timing bar that we ultimately used for this experiment was the first range that we attempted, and because it worked so well we saw no need to change it. Our goal at the start was to pick a value for the lower bound of the deadline that was long enough that it could be beaten on almost all trials, because we did not want to eliminate a large proportion of our data based on the participant having been beeped. The lower bound of 2 s was about the average time that participants required in the first experiment to begin to respond to the no-carry problems in the frame-sum condition. The 4-s upper bound was chosen simply to allow there to be trials on which participants would easily beat the deadline. After analyzing the data from the pilot participants it became clear that the deadline values we had chosen worked optimally: Response latencies were reduced by more than two-thirds, and accuracy was not compromised (see Results and Discussion).

Finally, participants were told that the problem would remain on the computer monitor throughout the trial. In Experiment 1, the problems disappeared as soon as participants began to speak, which may have made it difficult for people to time-share the tasks of articulating the frame of the utterance while computing the sum of the problem from memory.


to bring up the addition problem. They were allowed to correct their responses if they wished, but only the first response was considered in categorizing a trial as correct or incorrect. The experimenter then typed in their response and pressed “enter” to proceed to the next trial. The sessions were tape-recorded to allow utterance durations to be analyzed at a later point.

An experimental session lasted about 35 min. The session began with a practice session consisting of 32 problems (with the same characteristics and conditions as the experimental stimuli) and ended with a debriefing in which participants were told the purposes of the study. It is also relevant to evaluating the effectiveness of the timing bar to note that at the beginning of the practice trials, participants sometimes got beeped; that is, they allowed the timing bar to elapse. However, by the end of the practice trials, participants were never beeped, nor did they allow the timing bar to elapse during the experimental trials. Clearly, for whatever reason, participants found the beep aversive and performed on every experimental trial so as to avoid hearing it.

Results and Discussion

We will first describe the latency and accuracy data. We will begin with the problems consisting of two two-digit addends (same addend problems), because these data are the most comparable to the data collected in Experiment 1. Then we will proceed to the problems consisting of one- and two-digit addends (mixed addend problems). Next, we describe utterance durations for both types of problems. The final section su

For latencies, a given trial was excluded if the response time was less than 300 ms or greater than 5 s. Trials associated with incorrect responses were also excluded (trials were considered incorrect if the first response given as part of the utterance was the wrong sum). Latencies were analyzed as a 2 × 2 × 2 factorial. All three variables were within-participants. The first variable was order of addends. The other two variables concerned the difficulty of the problem: The tens column was either easy or harder to calculate, and the same was true for the ones column.
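The exclusion criteria for the latency analysis can be sketched as a simple filter. This is only an illustration under stated assumptions: the `Trial` record and its field names are hypothetical, and only the 300-ms and 5-s cutoffs and the first-response rule come from the description above.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """Hypothetical record of one trial; field names are illustrative."""
    latency_ms: float      # time from problem onset to speech onset
    first_response: int    # first sum the participant stated
    correct_sum: int       # true sum of the two addends


def keep_for_latency_analysis(trial: Trial,
                              min_ms: float = 300.0,
                              max_ms: float = 5000.0) -> bool:
    """Keep a trial only if its latency is within bounds and the first
    response given was the correct sum."""
    within_bounds = min_ms <= trial.latency_ms <= max_ms
    answered_correctly = trial.first_response == trial.correct_sum
    return within_bounds and answered_correctly


trials = [
    Trial(latency_ms=634, first_response=43, correct_sum=43),   # kept
    Trial(latency_ms=250, first_response=43, correct_sum=43),   # too fast
    Trial(latency_ms=5200, first_response=47, correct_sum=47),  # too slow
    Trial(latency_ms=700, first_response=48, correct_sum=47),   # wrong sum
]
kept = [t for t in trials if keep_for_latency_analysis(t)]
print(len(kept))
```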

Same addend problems. No differences based on order of addends were observed, so all analyses collapse over this variable. Latencies are shown in Fig. 3. First, latencies to begin articulation were much faster in Experiment 2 than they were in Experiment 1. Whereas the average latency for frame-sum, no-carry problems in Experiment 1 was 2281 ms (the exact same problems analyzed here), the average latency for the two two-digit addend problems in this experiment was 634 ms. This marked difference suggests that the changes implemented in the task were extremely effective in getting subjects to respond more quickly: latencies were reduced by over 70%. Even more surprisingly, this increase in speed did not compromise accuracy: People were correct on 96% of trials. As a comparison of Table 2 and Table 5 reveals, accuracy was just as high in this experiment as in the comparable conditions of the first.
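The size of the latency reduction follows directly from the two reported means. The short sketch below reproduces that arithmetic; the function name is hypothetical, and the two values are the Experiment 1 and Experiment 2 means given above.

```python
def proportional_reduction(before_ms: float, after_ms: float) -> float:
    """Return the reduction from before_ms to after_ms as a proportion."""
    return (before_ms - after_ms) / before_ms


# Mean latencies reported for the same no-carry, frame-sum problems.
experiment_1_ms = 2281
experiment_2_ms = 634

reduction = proportional_reduction(experiment_1_ms, experiment_2_ms)
print(f"{reduction:.1%}")  # prints 72.2%
```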

Contrary to the incrementality hypothesis, latencies were longer when the ones column was more difficult to calculate, F(1,49) = 11.84, MSe = 6052. For easy ones, latencies were 6


Percentage Correct, All Problems, Experiment 2

Problem type
Same addends problems: Smaller addend first; Larger addend first
Mixed addends problems: Single-digit addend first; Double-digit addend first



The accuracy data are given in Table 5. Overall, accuracy was very high, and performance varied little over conditions (from 93 to 99% correct). Significant effects of the manipulated variables were observed, nonetheless. First, there was a main effect of addend order (97.1








There is a suggestion in the data that people responded faster overall when given the two-digit addend before the one-digit addend (5… vs 600 ms for the opposite order), F(1,49) = 3.56, MSe = 4386, p < .07. Further evidence for this possibility comes both from the accuracy data and from comments made by participants to the experimenter during the experimental session. First, participants were more accurate when the two-digit number was the leftmost addend (98.4 vs 97.3% when the one-digit addend was on the left), F(1,49) = 5.61, MSe = 0.0023 (although a significant interaction between addend order and ones difficulty qualifies this main effect; see below). Second, after the experiment was over, participants commented that when a one-digit number was the first addend, its position to the left of the two-digit number made it difficult for them to conceptualize the one-digit number as belonging to the ones column. They reported that they sometimes incorrectly considered adding that number to the tens place of the second addend.

This result concerning order of the addends replicates the Brysbaert et al. (1998) effect: People prefer the xx + x order over its opposite. They have a tendency with the other order to try to add the single addend to the tens column of the second addend because they are trying to deal with the columns in the order in which they will be spoken. Of course, the fact that this finding emerges in the latency data (at a point well before the speaker actually has to articulate the sum) suggests that this preference is in the planning and not in the articulation. In other words, when speakers plan their utterances, they prefer that material unfolds in the ultimate, to




Overall, the main results for the mixed-addend problem types clearly show that even for these easier problems, participants do not begin to speak until they have taken account of at least some aspects of the sum. Even with a great deal of incentive to speak incrementally, participants showed signs of planning all the way to the final word of the utterance.

The accuracy data are given in Table 5. Participants were more accurate when the ones were easier to calculate (98.7 vs 97.0%), F(1,49) = 9.04, MSe = 0.0033, and there was no effect of tens difficulty. A significant interaction was found between the order of addends and the difficulty of the ones, F(1,49) = 5.57, MSe = 0.0023. Participants were equally accurate for both orders of addends when the ones were easy (98.7%), but if the ones were hard, participants preferred to have the two-digit addend to the left of the one-digit addend. Accuracy was 98.1% for the former order and 95.… for the latter.

To summarize, latency data from both types of problems revealed that, despite all of the changes made to the procedure to induce participants to speak incrementally, people nonetheless persisted in showing nonincremental, planning effects from the difficulty of the problem. First, people took longer to begin to speak for the more difficult two two-digit addend problems than for the mixed addend problems. Second, difficulty of the ones affected initiation times for both types of problems. While these data are difficult to reconcile with a radically incremental view of language production, they may be consistent with a more moderate version. In order to test whether participants planned and calculated during articulation, thereby demonstrating some incremental tendencies, we analyzed utterance durations.

Utterance Durations

Utterances from the first 24 participants were measured using a waveform editor. (We analyzed only 24 of the 50 participants’ data because waveform measurements are extremely time-consuming to obtain, and based on the first experiment we estimated that 24 participants would suffice to provide stable estimates of these duration means.) Each utterance was divided into three parts, the sequence the answer is, the first digit of the sum, and the second digit of the sum. (As with Experiment 1, it is probably prudent to put more weight on the duration data for the frame than for the sum, because the digits of the sum differ across conditions, and therefore intrinsic durations are not controlled.) Incorrect trials and trials on which corrections to previously given incorrect partial and/or full answers were made were excluded from the data. Results are given in Table 6.

Overall, the duration of the answer is was longer for the more difficult two two-digit addend problems than for the mixed addend problems (750 vs 610 ms), F(1,23) = 58.33, MSe = 32334. The same pattern held for the duration of the first digit of the sum (479 vs 417 ms), F(1,23) = 18.18, MSe = 20534, and the duration of the second digit of the sum (435 vs 3… ms), F(1,23) = 21.93, MSe = 7851. These differences are the first bit of evidence that people are planning as they speak, and that calculation and articulation may go on in parallel.

Same addend problems. The duration of the answer is was longer when the tens were diffi-

Mixed addend problems. Participants took longer to say The answer is when the 1-digit ad-





Clearly, the duration data overall reveal that even though speakers planned some aspects of even the very final word of their utterances before they began to speak, they did not prepare well enough to enable them to speak confidently without engaging in more planning during articulation. This pattern is critical, because it makes clear that the production system has both incremental and nonincremental aspects. In addition, even when people perform in a situation that almost forces them to speak as incrementally as possible, they still engage in some planning. Finally, the data make clear that it is possible for people to speak and calculate simultaneously.

Analyses of Disfluencies, Both Problem Types

Disfluencies were categorized into three major types, based on the type of editing te

By comparing the latency and duration results from the two experiments, it is possible to assess in what ways people speak differently when they can plan carefully versus when they must grab the floor quickly. Figure 4 allows these comparisons to be made easily for the different problem types in both experiments. For the Experiment 1 results, all data are from the frame-sum condition only. The top set of four bars depicts the data for the carry problems from the first experiment. Clearly, speakers wait a very long time before beginning to speak when they are confronted with these difficult arithmetic problems. The next set of four bars represents the no-carry problems from the first experiment. As can be seen, latencies to speak are much shorter. And, as described in the Results for Experiment 1, utterance durations do not change in response to problem type, because speakers in the first

The middle set of four bars shows the data for the no-carry problems from the second experiment. It is interesting to compare them with the four bars immediately above. The problems are the same (so the utterances are identical in content); what is different between the two experiments is that in the second, speakers were pressured to begin to speak quickly. Clearly, latencies to speak are much longer in the first experiment, where speakers had the luxury of planning. In addition, utterance durations are longer in the s

durations were longer when the ones were more difficult to calculate, as would be expected.


This section will be organized into three major parts. First, we will summarize the main results and discuss their implications for the incremental hypothesis. Second, we will describe a model of language production (F. Ferreira, 2000) that can account for why radical incrementality of the sort proposed by Wheeldon and Lahiri (1997), for example, was not observed in the experiments described here. Finally, we will briefly consider issues relating especially to the arithmetic task we required our participants to perform, focusing particularly on how our results compare to those obtained by Brysbaert et al. (1998).

Incrementality in Language Production

According to an incremental model of language production, people do not plan their utterances completely before they begin to talk. In-





The results that support these conclusions are the following.

(1) The task used in Experiment 1 provided little incentive for people to speak incrementally. Results showed that participants did not initiate speech until they had planned the sum carefully. The extent of planning was the same regardless of whether the sum occurred at the beginning of the utterance, at the end, or by itself. Utterance durations were not influenced by problem difficulty, suggesting that speakers did not engage in arithmetic calculations while articulating.

(2) The task used in Experiment 2 strongly encouraged people to speak incrementally. We found that utterance durations were influenced by problem difficulty. Therefore, it appears that speakers were calculating as they articulated. At the same time, latencies were also affected by problem difficulty, suggesting that speakers planned at least some aspects of the sum before speaking, even though the sum occurred at the end of the utterance. Planning effects were less extensive than they were in the first experiment, however. In Experiment 2, latencies were influenced by the difficulty of the ones column but not that of the tens; in Experiment 1, the difficulty of both column calculations affected latencies. For both experiments, more difficult problems overall were associated with longer utterance initiation times: In Experiment 1, utterances for no-carry problems were initiated sooner than those for carry problems; in Experiment 2, utterances for mixed addend problems (a one-digit and a two-digit addend) were initiated sooner than those for two two-digit addend problems.

(3) It might be argued that Experiment 2 did encourage people to speak more incrementally than they did in Experiment 1, but that people might be capable of speaking even more incrementally still. This argument cannot be rebutted definitively, but given that latencies in Experiment 2 were reduced by over 70%, and particularly given that they were about 650 ms on average, it is hard to imagine that people could initiate speech any faster. After all, when people are asked to simply name a single word on a computer monitor, their latencies are often not much lower than 600 ms or so (for example, Duffy, Henderson, & Morris (1989) reported latencies of 608 ms on average for participants to name the word that occurred at the end of a semantically neutral sentence, e.g., the woman saw the MOUSTACHE). Thus, the average latency in Experiment 2 is not much more than the time it would have taken participants simply to read and say out loud the phrase the answer. Yet even with such short latencies, the reaction times managed to display influences of problem difficulty. Furthermore, accuracy was not compromised: Participants were as accurate in the second experiment as they were in the first. Neither were speakers any less fluent: Disfluencies occurred on 1.78% of trials in the first experiment, but on less than 1% of trials in the second.

(4) The phrasing the answer is SUM does not appear to be an unnatural way for speakers to state the sum of an arithmetic problem, as indicated by the results of the pilot study conducted in association with Experiment 2. Participants were free to use any utterance form they wished in which to state the sum, as long as they used a full sentence. Participants spontaneously hit on the form the answer is SUM, and they never wavered from it. This result should be viewed as tentative, because the experiment was not designed to test directly whether utterance form and content vary in response to online demands of utterance formulation. The result may be viewed as suggestive, however. It hints at the possibility that the following general story might be right: Speakers might sometimes change the form and content of an utterance online as they become aware that the utterance they are producing is not likely to end happily. For example, a speaker might change from No one has to give money to a group that they don’t like what they’re doing to something like No one has to give money to groups whose actions they don’t approve of in order to avoid a subjacency violation (Chomsky, 1977). But they might be less likely to change the form and content of an utterance in order to give themselves more time merely to formulate some later part of an utterance; instead, the results of Experiment 2 suggest that they would choose to stretch out the utterance to give themselves the needed time.

Clearly, the language production system is not structured such that “processing at all levels occurs in an incremental fashion with a processor being triggered by any piece of characteristic input from the processors that feed into it” (Wheeldon & Lahiri, 1997, p. 361). In this “triggering” view, which is what we have termed “architectural incrementality” (and which permits radical incrementality), a module of the language production system that encounters information stated in its vocabulary is called into action automatically, so it performs its computations obligatorily (Fodor, 1983). Architectural incrementality is ruled out in favor of what might be termed strategic incrementality: A decision that every speaker must make is how to strike the appropriate balance between planning and initiating speech quickly. The finding that speakers are in principle capable of speaking incrementally without any cost in accuracy suggests that, in fact, speakers prefer to plan more carefully than they absolutely need to. Indee













TAG-Based Model of Language Production

The model of language production proposed by F. Ferreira (2000) predicts that speakers will be unable to produce utterances in the radically incremental manner proposed by Levelt (1989) and Wheeldon and Lahiri (1997). This approach assumes that the representational format for syntactic information is a version of a Tree-Adjoining Grammar (TAG; Frank, 1992; Joshi, 1985; Joshi, Levy, & Takahashi, 1975; Kroch & Joshi, 1985). The basic unit of a TAG is the elementary tree, which consists of a lexical head and the arguments the head licenses. For example, access of the word read would result in activation of not just the word but its associated elementary tree as well, as shown below.
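As an informal illustration only (not the formal TAG notation and not part of the model itself), an elementary tree can be thought of as a record that bundles a lexical head with the projections it builds and the argument slots it licenses. The class and field names below are hypothetical; the VP, IP, and CP projections and the subject and object arguments for read follow the description given in this section.

```python
from dataclasses import dataclass, field


@dataclass
class ElementaryTree:
    """Hypothetical, simplified stand-in for a TAG elementary tree."""
    head: str                                            # lexical head
    projections: list[str] = field(default_factory=list) # phrasal nodes built
    arguments: list[str] = field(default_factory=list)   # licensed open slots


def retrieve(word: str, lexicon: dict[str, ElementaryTree]) -> ElementaryTree:
    """Accessing a word activates its whole elementary tree, not just the word."""
    return lexicon[word]


# Illustrative lexicon with a single verbal entry.
lexicon = {
    "read": ElementaryTree(
        head="read",
        projections=["VP", "IP", "CP"],   # verbal head projects clausal structure
        arguments=["subject", "object"],  # two licensed arguments
    ),
}

tree = retrieve("read", lexicon)
print(tree.projections, tree.arguments)
```

The point of the sketch is that retrieving a verb brings a clause-sized structure with open argument slots along with it, which is why planning in such a model tends to extend beyond a single word.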









The verb read is the lexical head and it licenses two arguments: a subject and an object. TAG assumes that a verbal head projects not only its own VP node but all the clausal projections as well (IP and CP). Thus, elementary trees are prototypically clause-like. Indeed, they are often described as corresponding roughly to a simple clause (Kroch, 1987), and as being similar to Chomsky’s (1955) original kernel sentences (Frank, 1992). Now let us examine the amount of syntactic structure that is retrieved when a noun such as answer becomes activated.









point for our purposes here is that neither the nor answer can project any further; in particula



Arithmetic and Language Production

One of the most important inspirations for the experiments reported here was the study by Brysbaert et al. (1998), which seems to support radical incrementality. Brysbaert et al. observed that speakers are faster to produce the answers to arithmetic problems consisting of a two-digit addend and a one-digit addend when the addends are arranged so that the first column calculated was also the first phonological word articulated. Thus, speakers of Dutch prefer the single-digit addend to precede the double-digit addend, because numbers in Dutch are spoken so that the ones come before the tens. Speakers of French prefer the opposite order, because French works just like English: A number such as 44 is spoken so that the tens column precedes the ones column. This result seems consistent with radical incrementality.
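The relation between spoken number order and preferred addend order described above can be summarized in a small sketch. The table and function names are hypothetical and serve only to restate the pattern: the addend that feeds the first-spoken column is preferred in first position.

```python
# Order in which the columns of a two-digit number are spoken,
# as described in the text for each language.
SPOKEN_COLUMN_ORDER = {
    "Dutch": ("ones", "tens"),    # ones spoken before tens
    "French": ("tens", "ones"),   # tens spoken before ones
    "English": ("tens", "ones"),  # same order as French
}


def preferred_addend_order(language: str) -> str:
    """Return the preferred arrangement of a one-digit addend (x) and a
    two-digit addend (xx), given which column is spoken first."""
    first_spoken_column = SPOKEN_COLUMN_ORDER[language][0]
    # The one-digit addend contributes only to the ones column, so it is
    # preferred first exactly when the ones are spoken first.
    return "x + xx" if first_spoken_column == "ones" else "xx + x"


for language in SPOKEN_COLUMN_ORDER:
    print(language, preferred_addend_order(language))
```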

The results of Experiment 2 are entirely consistent with the data (but not quite the interpretation) found in Brysbaert et al. (1998). We also found that, for mixed addend problems, speakers preferred the arrangement in which the two-digit number came first. Of course, we would give our result a different explanation, particularly because the preference for the xx + x order emerged in the latency data, many syllables before speakers had to articulate the sum itself. It might be, then, that the Brysbaert effect concerns planning: Speakers want to formulate and encode their utterances roughly in the order in which the constituents will be articulated. In other words, even if people planned the entire utterance The answer is SUM before beginning to speak (to some extent), there is a question about the order in which that planning took place: Did they plan the sum and then the answer is, or did they plan in the other order? The finding that people preferred to receive the addends in the xx + x order suggests that people might have planned the utterances roughly in the order in which they would be articulated. The Brysbaert et al. finding, then, can be viewed as demonstrating that, in general, speakers are inclined to plan in this manner. Indeed, this is the version of incrementality argued for by V. Ferreira (1996) in his work demonstrating that people speak more efficiently when they have choices regarding syntactic form than when they are constrained to just one syntactic structure. On this view of incrementality, active items grab earlier syntactic positions, and as a result, accessibility of concepts influences syntactic form. Thus, syntactic plans are preferentially built in the order in which words become available. This type of incrementality does not imply that phonological encoding will take place on the smallest unit possible (viz. the phonological word), and it is compatible with our results showing that speakers plan the sum even in utterances in which the sum occurs at the end of the utterance.

Furthermore, it is critical to note that Brysbaert et al. (1998) did not provide any evidence that speakers were not influenced by the difficulty of both the tens and the ones calculations for the sum before beginning to speak. Their study was not designed to address this question, so the difficulty of the tens and the ones calculations were not independently manipulated as they were here. Indeed, we should emphasize that the Brysbaert et al. study was not originally designed or reported as a study of language production; instead, it is an important contribution to the literature on numerical cognition and the linguistic relativity hypothesis (Sapir, 194…; Whorf, 1956). Their finding that speakers preferred the articulation order is consistent with the idea of radical incrementality but, as we argued above, is not mandated by it. The prediction that speakers of English would be influenced by just the difficulty of the tens column when producing utterances requiring the calculation of two-digit sums was an inference we drew from their reported work. As we have seen, it was not supported, probably because eve

the original Brysbaert et al. study, speakers




columns, and for some problems they had to engage in carry operations as well. Perhaps this combination is critical: Many problems required carrying, making it difficult for the participants to be confident that they could deal with just the tens independently of the ones; and, all the problems required participants to add the tens column as well as the ones column. These two characteristics together might have led participants to believe that their most efficient strategy was to deal with the problems as a whole before









The two experiments that we have reported shed light on the important question of whether language production is incremental. The answer that is most consistent with the results reported here and in previous work is that the system is not architecturally incremental. Instead, the extent to which planning occurs is at least partly under speakers’ control, and it depends on the intentions that motivate speech. Moreover, even when speakers have incentives to initiate speech quickly, they still appear to engage in planning that goes beyond the immediate phonological word. Therefore, the system seems to be architecturally constrained to require planning beyond the initial phonological word, particularly for clausal utterances. At the same time, it is important to stress that language production can be incremental: The results of these experiments demonstrate that speakers are capable of planning upcoming portions of an utterance as they are articulating. This finding is especially striking given that the planning they engaged in was arithmetic calculation, because it might have been supposed that addition of two-digit numbers could not be carried out concurrently with utterance articulation. Apparently, the two can go on in parallel, at least for problems that do not require carrying operations. Thus, a fundamental premise of the incremental view is supported: The language production system is capable of interleaving planning processes and articulation.



Problems Used in Experiment 1

No-carry problems        Carry problems           Filler problems
Prob      Sum  Condition Prob      Sum  Condition Prob      Sum
21 + 22   43   EE        23 + 28   51   easy      30 + 11   41
21 + 26   47   EH        23 + 38   61   hard      60 + 15   75
21 + 62   83   HE        23 + 58   81   harder    13 + 79   92
21 + 66   87   HH        23 + 68   91   hardest   70 + 17   87
23 + 21   44   EE        29 + 23   52   easy      75 + 13   88
27 + 21   48   EH        39 + 23   62   hard      12 + 15   27
63 + 21   84   HE        59 + 23   82   harder    66 + 30   96
67 + 21   88   HH        69 + 23   92   hardest   54 + 11   65
21 + 24   45   EE        24 + 27   51   easy      18 + 36   54
21 + 28   49   EH        24 + 37   61   hard      12 + 58   70
21 + 64   85   HE        24 + 57   81   harder    18 + 72   90
21 + 68   89   HH        24 + 67   91   hardest   12 + 61   73
31 + 21   52   EE        28 + 24   52   easy      43 + 13   56
35 + 21   56   EH        38 + 24   62   hard      73 + 12   85
71 + 21   92   HE        58 + 24   82   harder    77 + 10   87
75 + 21   96   HH        68 + 24   92   hardest   77 + 19   96
21 + 32   53   EE        24 + 29   53   easy      19 + 77   96
21 + 36   57   EH        24 + 39   63   hard      17 + 49   66
21 + 72   93   HE        24 + 59   83   harder    10 + 78   88
21 + 76   97   HH        24 + 69   93   hardest   18 + 46   64
33 + 21   54   EE        27 + 25   52   easy      68 + 10   78
37 + 21   58   EH        37 + 25   62   hard      62 + 11   73
73 + 21   94   HE        57 + 25   82   harder    36 + 17   53
77 + 21   98   HH        67 + 25   92   hardest   36 + 18   54
21 + 34   55   EE        25 + 28   53   easy      12 + 51   63
21 + 38   59   EH        25 + 38   63   hard      40 + 43   83
21 + 74   95   HE        25 + 58   83   harder    60 + 12   72
21 + 78   99   HH        25 + 68   93   hardest   30 + 42   72
31 + 22   53   EE        29 + 25   54   easy      31 + 60   91
35 + 22   57   EH        39 + 25   64   hard      76 + 10   86
71 + 22   93   HE        59 + 25   84   harder    77 + 17   94
75 + 22   97   HH        69 + 25   94   hardest   51 + 30   81
22 + 32   54   EE        26 + 27   53   easy      13 + 55   68
22 + 36   58   EH        26 + 37   63   hard      18 + 13   31
22 + 72   94   HE        26 + 57   83   harder    30 + 33   63
22 + 76   98   HH        26 + 67   93   hardest   17 + 73   90
33 + 22   55   EE        28 + 26   54   easy      17 + 14   31
37 + 22   59   EH        38 + 26   64   hard      62 + 14   76
73 + 22   95   HE        58 + 26   84   harder    19 + 16   35
77 + 22   99   HH        68 + 26   94   hardest   16 + 11   27
23 + 22   45   EE        26 + 29   55   easy      18 + 17   35
23 + 26   49   EH        26 + 39   65   hard      50 + 16   66
23 + 62   85   HE        26 + 59   85   harder    11 + 61   72
23 + 66   89   HH        26 + 69   95   hardest   10 + 40   50
31 + 23   54   EE        28 + 27   55   easy      66 + 18   84
35 + 23   58   EH        38 + 27   65   hard      39 + 15   54
71 + 23   94   HE        58 + 27   85   harder    12 + 70   82
75 + 23   98   HH        68 + 27   95   hardest   55 + 14   69
23 + 32   55   EE        27 + 29   56   easy      18 + 54   72
23 + 36   59   EH        27 + 39   66   hard      19 + 69   88
23 + 72   95   HE        27 + 59   86   harder    40 + 43   83
23 + 76   99   HH        27 + 69   96   hardest   12 + 42   54
31 + 24   55   EE        29 + 28   57   easy      14 + 17   31
35 + 24   59   EH        39 + 28   67   hard      45 + 30   75
71 + 24   95   HE        59 + 28   87   harder    71 + 10   81
75 + 24   99   HH        69 + 28   97   hardest   65 + 30   95

Note. EE, easy tens, easy ones; EH, easy tens, hard ones; HE, hard tens, easy ones; HH, hard tens, hard ones.

Problems Used in Experiment 2

Mixed addend problems    Two-digit addend problems
Prob      Sum  Condition Prob      Sum  Condition
2 + 21    23   EE        21 + 22   43   EE
22 + 2    24   EE        23 + 21   44   EE
2 + 31    33   EE        21 + 24   45   EE
41 + 2    43   EE        31 + 21   52   EE
2 + 51    53   EE        21 + 32   53   EE
21 + 3    24   EE        33 + 21   54   EE
3 + 31    34   EE        21 + 34   55   EE
41 + 3    44   EE        31 + 22   53   EE
3 + 22    25   EE        22 + 32   54   EE
51 + 3    54   EE        33 + 22   55   EE
4 + 21    25   EE        23 + 22   45   EE
31 + 4    35   EE        31 + 23   54   EE
4 + 41    45   EE        23 + 32   55   EE
51 + 4    55   EE        31 + 24   55   EE
5 + 24    29   EH        21 + 26   47   EH
33 + 5    38   EH        27 + 21   48   EH
6 + 22    28   EH        21 + 28   49   EH
23 + 6    29   EH        35 + 21   56   EH
6 + 32    38   EH        21 + 36   57   EH
33 + 6    39   EH        37 + 21   58   EH
6 + 42    48   EH        21 + 38   59   EH
43 + 6    49   EH        35 + 22   57   EH
6 + 52    58   EH        22 + 36   58   EH
53 + 6    59   EH        37 + 22   59   EH
7 + 22    29   EH        23 + 26   49   EH
32 + 7    39   EH        35 + 23   58   EH
7 + 42    49   EH        23 + 36   59   EH
52 + 7    59   EH        35 + 24   59   EH
1 + 62    63   HE        21 + 62   83   HE
72 + 1    73   HE        63 + 21   84   HE
2 + 61    63   HE        21 + 64   85   HE
62 + 2    64   HE        71 + 21   92   HE
2 + 71    73   HE        21 + 72   93   HE
72 + 2    74   HE        73 + 21   94   HE
2 + 63    65   HE        21 + 74   95   HE
73 + 2    75   HE        71 + 22   93   HE
3 + 61    64   HE        22 + 72   94   HE
71 + 3    74   HE        73 + 22   95   HE
3 + 62    65   HE        23 + 62   85   HE
72 + 3    75   HE        71 + 23   94   HE
4 + 61    65   HE        23 + 72   95   HE
71 + 4    75   HE        71 + 24   95   HE
4 + 74    78   HH        21 + 66   87   HH
75 + 4    79   HH        67 + 21   88   HH
5 + 62    67   HH        21 + 68   89   HH
63 + 5    68   HH        75 + 21   96   HH
5 + 64    69   HH        21 + 76   97   HH
72 + 5    77   HH        77 + 21   98   HH
5 + 73    78   HH        21 + 78   99   HH
74 + 5    79   HH        75 + 22   97   HH
6 + 62    68   HH        22 + 76   98   HH
63 + 6    69   HH        77 + 22   99   HH
6 + 72    78   HH        23 + 66   89   HH
73 + 6    79   HH        75 + 23   98   HH
7 + 62    69   HH        23 + 76   99   HH
72 + 7    79   HH        75 + 24   99   HH

Note. EE, easy tens, easy ones; EH, easy tens, hard ones; HE, hard tens, easy ones; HH, hard tens, hard ones.







