The effects of addressee attention on prosodic prominence

This article was downloaded by: [University North Carolina - Chapel Hill]
On: 15 May 2013, At: 05:49
Publisher: Routledge
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Language and Cognitive Processes
Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/plcp20

The effects of addressee attention on prosodic prominence
Elise C. Rosa a, Kayla H. Finch a, Molly Bergeson a & Jennifer E. Arnold a
a Department of Psychology, University of North Carolina at Chapel Hill, Davie Hall CB #3270, Chapel Hill, NC, 27516, USA
Published online: 05 Apr 2013.

To cite this article: Elise C. Rosa, Kayla H. Finch, Molly Bergeson & Jennifer E. Arnold (2013): The effects of addressee attention on prosodic prominence, Language and Cognitive Processes, DOI:10.1080/01690965.2013.772213

To link to this article: http://dx.doi.org/10.1080/01690965.2013.772213

Full terms and conditions of use: http://www.tandfonline.com/page/terms-and-conditions

This article may be used for research, teaching, and private study purposes. Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden.

The publisher does not give any warranty express or implied or make any representation that the contents will be complete or accurate or up to date. The accuracy of any instructions, formulae, and drug doses should be independently verified with primary sources. The publisher shall not be liable for any loss, actions, claims, proceedings, demand, or costs or damages whatsoever or howsoever caused arising directly or indirectly in connection with or arising out of the use of this material.



The effects of addressee attention on prosodic prominence

Elise C. Rosa*, Kayla H. Finch, Molly Bergeson and Jennifer E. Arnold

Department of Psychology, University of North Carolina at Chapel Hill, Davie Hall CB #3270, Chapel Hill, NC 27516, USA

(Received 30 March 2012; final version received 24 January 2013)

How do speakers accommodate distracted listeners? Specifically, how does prosody change when speakers know that their addressees are multitasking? Speakers might use more acoustically prominent words for distracted addressees, to ensure that important information is communicated. Alternatively, speakers might disengage from the task and use less prominent pronunciations with distracted addressees. A further question is whether prosodic prominence changes globally or if there are effects specific to the most relevant information. We studied these effects in two instruction-giving experiments. Speakers instructed listeners to move objects to locations on a board. In the distraction condition, addressees were also completing a demanding secondary computer task; in the attentive condition they paid full attention. Results demonstrated that speakers modify their speech for distracted listeners, and in an instruction-giving task they specifically use more acoustically prominent (longer) pronunciations for distracted listeners. This effect was localised to the most task-relevant information: the object to be moved.

Keywords: prosody; attention; audience design

Speakers have numerous choices to make for every message they want to communicate. They can be concise (Crackers please!) or verbose (Can you please hand me that box of crackers?). They can specify objects with detail (That box of saltines next to you) or not (that). They can enunciate words prominently or with a reduced pronunciation. Many of these choices are related to the information being communicated. Already-known or predictable information is generally expressed with fewer words, less detail and reduced pronunciation, whereas new or important information is referred to with more words, more detail and acoustically prominent forms (Arnold, 1998, 2008, 2010; Brown, 1983; Chafe, 1976; Gundel, Hedberg, & Zacharski, 1993; Halliday, 1967; Sityaev, 2000). A much-debated issue is whether these choices are made as a result of the speaker’s knowledge about their addressee’s knowledge or attention, a process known as audience design (Arnold, 2008; Arnold, Kahn, & Pancani, 2012; Galati & Brennan, 2010; Horton & Keysar, 1996).

In this paper we ask whether people speak differently when their addressee is distracted. This provides a window onto questions about whether audience design affects speech, since distracted behaviour reflects the addressee’s attentional state. For example, if your request for crackers is directed at someone engaged in a different task, like driving a car, how will your word choice and pronunciation be affected? There are a lot of dimensions on which you might change your prosody: you might speak the whole sentence more slowly, or loudly, or you might speak only particular words more slowly. We focus here on how speakers modify the acoustic prominence of their words, specifically word duration, but also examine how it co-occurs with other types of linguistic form variation. Duration is especially interesting because it may vary as a function of the speaker’s desire to make certain words prominent (Breen, Fedorenko, Wagner, & Gibson, 2010; Ladd, 1996), but also can provide a cue about the speaker’s fluency (Bell et al., 2003), which in turn can affect comprehension (e.g., Arnold, Hudson Kam, & Tanenhaus, 2007; Arnold, Tanenhaus, Altmann, & Fagnano, 2004).

It is well established that speakers use language differently for different addressees (e.g., Clark, 1996; Clark & Krych, 2004; Galati & Brennan, 2010), and there is good evidence that audience design impacts lexical choices (Brennan & Clark, 1996; Brown-Schmidt & Tanenhaus, 2006; Heller, Gorman, & Tanenhaus, 2012; Horton & Keysar, 1996; Gorman, Gegg-Harrison, Marsh, & Tanenhaus, 2012). Speakers refer to objects in conversation using partner-specific terms they have developed over the course of conversation. They also keep track of the physical presence of objects they are referring to for themselves and their conversational partners.

*Corresponding author. Email: [email protected]

Language and Cognitive Processes, 2013

http://dx.doi.org/10.1080/01690965.2013.772213

© 2013 Taylor & Francis


However, an ongoing debate concerns the effect of audience design on acoustic prominence. Some theories suggest that audience design is the primary determinant of the speaker’s choice to acoustically emphasise some words (Chafe, 1987; Lindblom, 1990). This account is consistent with the idea that new and unpredictable information tends to be accented (e.g., Venditti & Hirschberg, 2003), since this information should be less accessible to listeners, and thus require more explicit input. However, a strong version of this account has found little support in empirical studies where the speaker and addressee’s knowledge are examined separately. For example, Bard and colleagues (Bard, Anderson, Aylett, Doherty-Sneddon, & Newlands, 2000; Bard & Aylett, 2005) found that intelligibility was unaffected by numerous measures of the listener’s knowledge. They proposed the dual process hypothesis (Bard et al., 2000), in which fast automatic processes allow for the speaker’s memory to affect articulation, and slower processes incorporate information about the listener for purposes like choosing pronominal forms. This work showed that acoustic reduction is not primarily driven by audience design, although it did not test the possibility that addressee knowledge has a partial constraint. Kahn and Arnold (2012, under review) similarly found that speakers shortened nouns that they had recently heard, and moreover that the degree of shortening was unaffected by whether the addressee had also heard the word or not.

In contrast, Galati and Brennan (2010) found that words directed at knowledgeable addressees were rated as less intelligible than those directed at naïve addressees, even though they did not differ on duration. Arnold et al. (2012) found that speakers in their experiment did modulate the duration of words in response to addressee behaviour, but specifically on a word associated with utterance planning: the determiner the (Clark & Wasow, 1998). This suggested that effects of audience design may be mediated by production-internal processes of utterance planning.

Whether audience design affects acoustic variation or not, there is abundant evidence that speakers use longer and more acoustically prominent pronunciations for information that is harder to retrieve or plan and shorter pronunciations for easy-to-produce words and referents (Balota & Chumbley, 1985; Bard et al., 2000; Bell, Brenier, Gregory, Girand, & Jurafsky, 2009; Clark & Fox Tree, 2002; Lam & Watson, 2010; Arnold & Watson, 2012, under review; Kahn & Arnold, 2012). For example, when speakers are disfluent, saying um, uh or repeating words, it indicates that they are having speech production difficulty. Words surrounding such disfluent elements also tend to be longer (Bell et al., 2003).

In sum, previous work suggests that speakers accommodate their listeners’ needs in many ways, but effects of audience design on acoustic variation are variable. However, the majority of work on this question has focused on whether speakers adjust their pronunciations in response to their addressee’s knowledge. The current work instead examines the effects of the listener’s attentional state. Do speakers modulate the acoustic properties of their speech in response to visible evidence that their addressee is distracted?

Distraction is a common characteristic of day-to-day life, yet relatively little is known about how speakers adjust their linguistic form when speaking to distracted addressees. In a narrative recall study, Pasupathi, Stallworth, and Murdoch (1998) found that speakers with attentive addressees produced more information than those with distracted addressees. Similarly, Kuhlen and Brennan (2010) found that speakers told narrative jokes with more detail with attentive rather than distracted addressees, but this effect went away when the speaker expected the addressee to be distracted. Thus, these studies found that speakers provided less information to distracted addressees.

By contrast, a study by Arnold et al. (2012) suggests that speakers provide more information for less-attentive addressees. Speakers gave pairs of instructions to addressees to place objects on a board of coloured dots, e.g., The cat goes on red. The teapot goes on yellow. In one condition the addressee simply followed each instruction after hearing it, showing a normal level of attention. In another condition the addressee was especially attentive, anticipating the second object. In this condition the addressee picked up both objects after the first instruction, rather than waiting for the second instruction to pick up the second item. Speakers both used more words and longer pronunciations of the word the with non-anticipating addressees. Unlike the narrative tasks in which distracted addressees elicited less detail, this task required the addressee to follow instructions, so increased verbal specificity may have had a concrete advantage for completing the task. Similar results come from a narrative production study by Rosa and Arnold (2011), except that they found that speakers provided more explicit referring expressions when they themselves were distracted.

The current study used the same instruction-following task as Arnold et al. (2012) to examine the effects of addressee attention on speakers’ prosody. In contrast to Arnold et al. (2012), in which addressees anticipated speakers’ instructions, addressees in the current study were distracted with a secondary task. If speakers use acoustic prominence to ensure effective communication, we would expect to see longer words with distracted addressees. If such an effect is driven by the comprehension needs of the listener, we would expect increased prominence to be localised to the most central information for the task, that is, the word describing the target object. Alternatively, if speakers engage more with attentive addressees, they might decrease acoustic prominence for distracted addressees, as found in narrative recall tasks for lexical detail (Kuhlen & Brennan, 2010; Pasupathi et al., 1998). A third possibility is that addressee distraction may have no effect on speakers’ acoustic choices, as predicted by other studies that have found no effect of audience design on acoustic reduction (Bard & Aylett, 2005; Kahn & Arnold, 2012, under review).

A second question was whether effects of addressee distraction would interact with known informational predictors of acoustic prominence. One well-known determinant of word duration is predictability: Predictable words tend to be acoustically reduced (Bell et al., 2009; Gahl & Garnsey, 2004; Jurafsky, Bell, Gregory, & Raymond, 2001), where predictability can stem from the surrounding words, the prior sentence meaning, or syntactic structure. Likewise, when the discourse context leads to an expectation of a specific referent, words referring to it tend to be reduced (Arnold, 1998, 2001; Lam & Watson, 2010; Watson, Arnold, & Tanenhaus, 2008). If speakers think that distracted addressees cannot follow predictability cues effectively, they may resist the usual tendency to reduce predictable information, and thereby show greater acoustic prominence specifically for predictable words. Alternatively, they may use a simpler strategy of adjusting their speech for distracted addressees overall, regardless of predictability.

We therefore examined the effects of addressee distraction in two experiments. Both used the same instruction-giving task. Each trial involved two objects, and speakers always produced one instruction for each object. In Experiment 1 the target object was the second item in the pair, meaning that after the first instruction was given, the object in the second instruction was fully predictable. In Experiment 2, the same target objects were used as the first instruction, so they were relatively less predictable. Word duration was the main variable of concern in this study, but the effects of predictability and distraction on lexical choices were also examined. Latency to begin speaking was also analyzed, as latency can be seen as a measure of planning, and addressee behaviour may also affect speakers’ planning processes.

One of the advantages of this experimental paradigm is that it involved a concrete task, which provided the speaker with motivation to accommodate the addressee. Another advantage to this task is that it imposed little to no memory burden on the subject, in contrast to other studies that used either maps or narratives that only the speaker had viewed (Bard & Aylett, 2005; Bard et al., 2000). As this added burden of ‘record-keeping’ was reduced in our task, speakers presumably had more resources with which to complete the task, and therefore might be more capable of considering their listeners in planning their utterances. Additionally, partner-specific findings or audience design effects are most likely to occur in an interactive dialogue setting (Brown-Schmidt, 2009).

Experiments 1 and 2

We tested how speakers would modify their speech in reaction to the listener’s state of distraction, in two experiments. The methods and analyses were nearly identical across the two experiments, so they are reported together.

Method

Participants

Twenty undergraduate students from the University of North Carolina participated, 10 in Experiment 1 and 10 in Experiment 2. All participants were native speakers of English, and had normal or corrected-to-normal vision. Participants received course credit for their participation.

Materials and design

Target stimuli consisted of 48 physical objects whose names were matched for number of phonemes, syllables and frequency. The same target stimuli were used in Experiments 1 and 2. Filler stimuli (i.e., those objects used for the other instruction) were all one syllable, and were the same across experiments. Two lists were formed for each experiment, as participants worked with one attentive addressee and one distracted addressee. The lists were paired by length, phonemes and frequency as closely as possible, and each list occurred equally in each condition. Targets were presented to each participant once, either as the second item in a pair (Experiment 1, predictable targets), or as the first item in a pair (Experiment 2, unpredictable targets). Different study participants performed the two experiments. Thus, there were 20 participants in total with 48 trials each. The order in which addressees (distracted, attentive) were encountered was counterbalanced across participants to provide a control of any carry-over effect between first and second blocks.

Equipment

Stimuli were presented on a computer monitor in a slide-show format, using Powerpoint. The objects to be moved were stored in containers, and were put on the table in pairs. Responses were recorded using a headset microphone.

Procedure

Participants worked with two confederates during each of the experiments. For generalisability of the results, we used two confederates, and confederate was entered as a control predictor into the statistical model. Participants were told that the confederates were members of the lab. Participants were instructed to give instructions to the confederates to move objects to the coloured circles on the board on the table in between them.

During half of the trials the participant was paired with a distracted confederate, who was performing a secondary computer task while completing the primary task. The participant was informed that the confederate had to perform this task. Confederates and participants could see each other, and confederates did not speak unless clarification was necessary. The secondary task was a timed state-labelling game that required the confederate to be continuously engaged, except when pausing to carry out the instructions.

During the other half of the trials the participants worked with a confederate who was not performing any secondary task. The order of distraction was counterbalanced between participants. Given the number of trials and the counterbalancing of order, confederates had real communication needs, as it was not possible to remember every item and its location.

The primary task was an instruction-giving task. No practice trials were given. Once the experiment began, participants would see pictures of two objects per trial appear, one at a time, on a computer screen behind the confederate, indicating the intended destination. The objects were on coloured circles on the screen. The two objects to be moved were placed on the table, and the participant instructed the confederate to move the first object to the appropriate coloured circle, as shown on the computer screen. The second object would then appear on the screen, and the participant again instructed the confederate to place the object on the table on the depicted colour. In Experiment 1 the target item was the second object to appear, making it entirely predictable. In Experiment 2 the target object was the first to appear on the screen, making it relatively unpredictable to the confederate, who could not view the screen. Participants issued verbal instructions to confederates to move the objects, for example, ‘Put the fox on the green circle. Now put the cork on the red circle’. As soon as confederates moved the second object of the pair, the computer screen was advanced to the next trial. Participants were not given a verbal example of how to instruct the confederates. The only two guidelines given to the participants for instructing confederates were to not point at the objects or the circles, and to not move the objects themselves.

Analysis

Dependent variables

We examined how the distraction manipulation affected the speakers’ choices in both (1) number of words in the target expression and (2) the acoustic prominence of their pronunciations, as measured by the duration of four key regions: (a) the latency to begin speaking, as indexed by the time between the onset of the target visual stimulus on the computer screen and the onset of the first word in the response (after any filled pauses like uh); (b) the determiner the, when produced; (c) the target noun, e.g., fox; and (d) the colour word, e.g., red. Filled pauses were included in the measure of latency because both were considered to represent planning time. Word duration was measured with Praat.
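
As a rough illustration of the latency measure described above, the Python sketch below takes the time from stimulus onset to the onset of the first word that is not a filled pause, so that any initial filled pauses count towards latency. The `Interval` record, the `FILLED_PAUSES` set and the example times are assumptions made for illustration only; the measurements in the study were made in Praat, and this is not the authors' script.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Hypothetical set of filled-pause labels; the text only gives "uh" as an example.
FILLED_PAUSES = {"uh", "um"}


@dataclass(frozen=True)
class Interval:
    """One labelled word interval, times in seconds (assumed annotation format)."""
    label: str
    start: float
    end: float


def speech_latency(stimulus_onset: float, words: Sequence[Interval]) -> Optional[float]:
    """Time from stimulus onset to the onset of the first word that is not a filled pause."""
    for word in sorted(words, key=lambda w: w.start):
        if word.label.lower() not in FILLED_PAUSES:
            return word.start - stimulus_onset
    return None  # the response held only filled pauses, or nothing at all


# Invented example: "uh ... the fox", with the stimulus appearing at time 0.
example = [
    Interval("uh", 0.80, 1.10),
    Interval("the", 1.25, 1.40),
    Interval("fox", 1.40, 1.85),
]
latency = speech_latency(0.0, example)
```

With these invented times, latency runs up to the onset of "the", so the filled pause "uh" is part of the latency region rather than the start of the response.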

Duration and latency analyses were restricted to definite noun phrases (‘the koala’) or bare noun phrases (‘koala’). Phrases such as ‘koala on green’ therefore were examined for latency, noun and colour word length, but did not contribute to determiner analyses. Trials were excluded if the speaker repaired the target word or showed naming confusion (1%), if the target was referred to with a pronoun or zero (2%), or if they included a multiword utterance to describe the target object (e.g., ‘the stuffed animal’ instead of ‘the giraffe’; 7%). Out of the 960 trials (48 trials per 20 participants), 8.96% of the data in Experiment 1 and 9.58% of the data in Experiment 2 were excluded from the acoustic analyses by these criteria. 50.56% of the excluded tokens were from the attentive condition and 49.44% from the distracted. Therefore, the excluded trials occurred equally in distracted and attentive conditions, across participants in both experiments. There were 437 tokens in the analysis for Experiment 1, and 434 for Experiment 2. Latency analyses additionally excluded outliers that were more than 2.5 standard deviations above the mean (33 cases of the 871 included in the acoustic analyses, or 3.79%).
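
The trial counts and percentages reported above can be cross-checked with a few lines of arithmetic. The Python sketch below recomputes them from the reported token counts and adds a simple one-sided outlier filter. The excluded-trial counts (43 and 46) are derived here from the reported totals rather than stated in the text, and the use of the sample standard deviation in `drop_high_outliers` is an assumption, since the text does not say which estimator was used.

```python
import statistics
from typing import Dict, List, Sequence

TRIALS_PER_PARTICIPANT = 48
PARTICIPANTS_PER_EXPERIMENT = 10
TRIALS_PER_EXPERIMENT = TRIALS_PER_PARTICIPANT * PARTICIPANTS_PER_EXPERIMENT  # 480

# Token counts reported in the text for the acoustic analyses.
INCLUDED: Dict[str, int] = {"Experiment 1": 437, "Experiment 2": 434}


def percent(part: int, whole: int) -> float:
    """Percentage rounded to two decimals, as in the reported figures."""
    return round(100 * part / whole, 2)


# Derived quantities (not stated directly in the text).
excluded = {name: TRIALS_PER_EXPERIMENT - n for name, n in INCLUDED.items()}
exclusion_rates = {name: percent(n, TRIALS_PER_EXPERIMENT) for name, n in excluded.items()}
total_included = sum(INCLUDED.values())
latency_outlier_rate = percent(33, total_included)


def drop_high_outliers(values: Sequence[float], n_sd: float = 2.5) -> List[float]:
    """Drop values more than n_sd standard deviations above the mean (one-sided)."""
    cutoff = statistics.mean(values) + n_sd * statistics.stdev(values)  # sample SD: assumption
    return [v for v in values if v <= cutoff]
```

Under these assumptions the derived exclusion rates agree with the reported 8.96% and 9.58%, and 33 of 871 tokens corresponds to the reported 3.79%.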

Statistical modelling procedure

Data were analysed with multilevel logistic regressions in SAS using the proc mixed command. All models included a random intercept for both subject and either item (for the number of words analysis) or target noun (for the duration analyses). Target noun was used instead of item because subjects used a different label for the target object than the intended one on 14% of the trials, and the noun heavily constrains duration. Similar results obtain if analyses are restricted to the trials where the intended word was used. We also included random slopes for subject and item/noun by condition, where possible, following the procedure below.

For each analysis, we used the following procedure: a control model was constructed first, containing all of the control variables and the random intercepts, but not the critical condition predictor. Control variables that had an absolute t-value of >1.5 were retained in the final model. The final model was then constructed, containing those control variables plus condition. The model was initially fit using a maximal random-effects structure, including random intercepts for subject and item/noun, and random slopes for subject × condition and item/noun × condition. If the model did not converge or was not positive definite, we eliminated the random effects one at a time, in this order: (1) item/noun × attentiveness; (2) subject × attentiveness; (3) item intercept. The variables included in each model are shown in Table 1.
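
The random-effects fallback described above can be sketched as a small loop. The Python below is a schematic only: `try_fit` is a hypothetical callable standing in for whatever routine fits the model (the study used SAS proc mixed), assumed to return `None` when the model fails to converge or is not positive definite, and the term names are illustrative labels rather than SAS syntax.

```python
from typing import Callable, Optional, Tuple, TypeVar

Fit = TypeVar("Fit")

# Maximal structure: both intercepts plus both by-condition slopes (illustrative labels).
MAXIMAL_RANDOM_EFFECTS: Tuple[str, ...] = (
    "subject_intercept",
    "item_intercept",
    "subject_by_condition_slope",
    "item_by_condition_slope",
)

# Order of elimination given in the text.
DROP_ORDER: Tuple[str, ...] = (
    "item_by_condition_slope",
    "subject_by_condition_slope",
    "item_intercept",
)


def fit_with_fallback(
    try_fit: Callable[[Tuple[str, ...]], Optional[Fit]],
) -> Tuple[Tuple[str, ...], Fit]:
    """Fit the maximal model first, then drop one random effect at a time until a fit is usable."""
    random_effects = list(MAXIMAL_RANDOM_EFFECTS)
    for term_to_drop in (None,) + DROP_ORDER:
        if term_to_drop is not None:
            random_effects.remove(term_to_drop)
        fit = try_fit(tuple(random_effects))
        if fit is not None:  # converged and positive definite, by assumption
            return tuple(random_effects), fit
    raise RuntimeError("no random-effects structure produced a usable fit")
```

Passing the fitting routine in as a callable keeps the sketch independent of any particular statistics package; the subject intercept is never dropped, matching the elimination order in the text.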

Independent and control variables

The primary predictor in our model was the current condition (confederate attentive or distracted). Critically, the acoustic analyses examined this predictor against the backdrop of numerous control predictors that are expected to affect word duration. We controlled for speech rate, calculated as the average time per syllable in the response utterance. Other control predictors indexed characteristics of the preceding and following context (whether the participant used a determiner, what the target word was preceded and followed by (pause/disfluency/phrase initial vs. another word), and, for the colour word analysis, whether the colour word was the last word in the sentence). Both lexical and acoustic analyses included control variables about the experimental design (the current itemset, which itemset had come first, which condition had come first, the current confederate, and item order).
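
For concreteness, the speech-rate control (average time per syllable in the response utterance) amounts to a single division. The Python sketch below shows it with an invented utterance duration; the function name and the example numbers are assumptions, and how syllables were actually counted is not specified in the text.

```python
def seconds_per_syllable(utterance_duration_s: float, syllable_count: int) -> float:
    """Average time per syllable; larger values mean slower speech."""
    if syllable_count <= 0:
        raise ValueError("syllable_count must be positive")
    return utterance_duration_s / syllable_count


# "Put the fox on the green circle" has 8 syllables; the 2.4 s duration is invented.
rate = seconds_per_syllable(2.4, 8)
```

Because the measure is time per syllable rather than syllables per second, a faster speaker has a smaller value.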

Results

When the target was predictable (Experiment 1), speakers used more words to describe the target area (including the target noun and any determiners and modifiers) when confederates were distracted (mean = 1.68) than attentive (mean = 1.47; t(477) = -2.39, p < .05). Most of this variation was driven by determiner usage, but some was due to participants’ use of multiword utterances (e.g., The little trumpet). Participants provided such modifiers in 6.25% of the trials with distracted addressees, versus 4.58% of the time with attentive addressees. With predictable targets, there was also a tendency for participants to speak more quickly in the attentive condition, as measured by time per syllable (t(430) = -2.14, p = .03). In Experiment 2, there was no effect of condition on the number of words used to describe unpredictable targets (mean = 1.75), nor overall rate of speech.

The critical analyses concerned word duration (see Table 2), where we found an effect of condition on the target noun in both experiments. Participants with distracted addressees produced target words with longer durations than did speakers with attentive addressees, resulting in a main effect of condition for both predictable targets in Experiment 1 (t(432) = -2.74, p < .01), and unpredictable targets in Experiment 2 (t(423) = -3.21, p < .01). An analysis of both experiments together revealed a robust effect of attentiveness (p < .01) on target noun duration. Even though target nouns were produced with shorter durations in the predictable experiment (p < .01), predictability did not interact with attentiveness (p = .54). There was also a significant effect of condition on latency to begin speaking in Experiment 1: latency was longer with distracted addressees than with attentive addressees (t(410) = -4.60, p < .0001). Condition did not affect latency for Experiment 2. Analyses of the other two regions revealed no significant effect of condition on duration of ‘the’ or the colour word.

Table 1. Control variables and random effects in each model.

Columns, each split into Exp. 1 and Exp. 2: # words target area; Latency duration; ‘the’ duration; Object duration; Colour duration. Values are listed in column order; empty boxes are omitted.

Itemset: –, 3.34, –, –, –, –, –, -2.27, –, –
Itemset order: –, –, –, –, –, –, –, 1.90, –, 3.62
Item order: –, -2.68, -3.34, -3.87, –, 2.70, –, -1.67, –, -1.50
Attentiveness first: –, –, –, –, –, –, –, –, –, –
Target noun syllables: 6.39, 9.37
Rate of speech: 2.82, 1.82, 7.20, 11.31, 13.93, 12.73, 6.98, 4.44
Use of determiner: N.S., –, –, -2.26, –, –
Confederate: N.S., –, –, –, –, N.S., –, N.S., N.S., N.S.
Preceding word: –, -3.59
Following word: 2.82, 2.76
Is colour last word: 4.02, 9.01
Subject intercept: *, *, *, *, *, *, *, *, *, *
Object intercept: *, *, *, *, *, *, *, *, *
Subj. × Att. slope: *, *, *
Obj. × Att. slope: *

Note: For control variables, dashes mean that the variable was not significant in the control model and was therefore not included in the final model. The t-values mark significant effects and the direction of the effect (positive/negative); N.S. means not significant. Empty boxes indicate that the control variables were not included in the control models. Models were run separately for Experiments 1 and 2. Asterisks indicate that the random intercept or slope was included in the model.

A visual examination of the average durations in Table 2 across the two experiments shows that the averages are considerably longer in Experiment 2 than Experiment 1. This was fully expected, given that the time to produce the first instruction may have been influenced by the need to survey both objects for the trial, as well as the relative unpredictability of the target object. This contrast was also orthogonal to the goals of the current study, so we did not submit this comparison to statistical analysis.

Discussion

We found that people speak differently to distracted and attentive addressees, both in terms of how much information they provided overall, and the acoustic prominence of key words in their response. In general, distracted addressees elicited longer words and more detailed utterances. The effect of distraction was robust against variation in the predictability of the target object, and distracted addressees elicited longer target words than attentive addressees in both experiments. A comparison between this study and other studies suggests that distraction has different effects on speakers’ choices in different kinds of tasks. In previous studies that required participants to recall a narrative or tell a joke, speakers provided less information to distracted listeners, as measured by shorter utterances and less-detailed narratives (Kuhlen & Brennan, 2010; Pasupathi et al., 1998). These findings may reflect the social function of narratives and jokes, as a disinterested listener may change the speaker’s task goals. In the current experiment the task goals were clear and consistent, and the speaker’s utterances had the function of instructing the addressee to move the correct object to the right location. This specific set of task goals may have allowed speakers to assume that greater lexical detail and greater acoustic prominence would facilitate successful task completion for a distracted listener.

Importantly, the increased duration in the distracted condition occurred specifically on the target word, and not on all regions in the utterance. In Experiment 1, this effect occurred over and above the tendency for participants to speak faster with attentive addressees. In Experiment 2, there was no general rate change between conditions, yet participants still used longer durations for distracted addressees. This suggests that speakers were emphasising words with high information content for their listeners. The object name was especially critical for the initiation of the action, which began with selecting the object.

Our results clearly indicate that speakers accommodate distracted addressees by varying the acoustic prominence of their words. This finding contrasts with other studies in which duration and intelligibility are frequently unaffected by the addressee’s knowledge (e.g., Bard & Aylett, 2005; Bard et al., 2000). This difference may have resulted from the fact that in our task, speakers also did not have to keep track of what the addressee knew, as we were manipulating the addressees’ obvious attention. Our task also made the communicative goal transparent, so speakers were highly motivated to communicate clearly.

This study did not explicitly test the mechanism underlying the effects of the addressee’s attention, but we can speculatively offer some possibilities. A strong audience design explanation of the increased object-name duration is that speakers were emphasising the object’s name to increase addressee understanding. Under this view, speakers in the distracted condition recognised that their addressees needed extra help. This realisation may have triggered a speaking mode that provided additional information, which presumably would help the distracted addressee complete the task. The fact that our durational effects were strongest on the target noun is consistent with this view, since this is the piece of information most critical for initiating the response. One question is why distraction had no effect on the colour word, which presumably was also an important piece of information for completing the task. We speculate that colour word duration was relatively stable, due to the fact that these words were repeated throughout the experiment and thus relatively facilitated. Additionally, colour words are at the end of the sentence, so speakers presumably had more time to plan.

Table 2. Mean durations and standard deviations (in ms) for each region in each condition. Standard deviations are given in parentheses.

Experiment                     Condition     Latency             ‘the’              Object             Colour
Experiment 1: Predictable      Attentive     666.36 (380.46)     105.46 (77.25)     363.63 (103.62)    268.48 (84.04)
Experiment 1: Predictable      Distracted    851.4 (481.67)      112.03 (98.96)     401.42 (123.83)    275.47 (98.50)
Experiment 2: Unpredictable    Attentive     1490.05 (576.64)    212.63 (218.23)    464.83 (145.50)    334.2 (139.78)
Experiment 2: Unpredictable    Distracted    1598.89 (598.12)    210.31 (185.57)    494.28 (135.82)    332.31 (125.92)

As Galati and Brennan (2010) suggest, this kind of addressee accommodation could be done with a ‘one-bit model’. Speakers can calculate once for each block whether the addressee is distracted, as this information is readily available and continually present, and this one-time ‘either/or’ decision can inform their speech for the entirety of the block. If this kind of calculation underlies our effects, it would predict that speakers can accommodate distraction best when the addressee’s attentional state is fairly constant. Whether speakers can adjust to moment-by-moment changes in the addressee’s apparent attention is a topic for future research.
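The one-bit idea can be made concrete with a toy sketch. The Python below is an illustration only, not a model proposed by Galati and Brennan (2010) or tested in this study; the names `Block`, `plan_block` and `emphasise_target`, and the example object labels, are hypothetical. It shows a single binary judgement about the addressee being made once per block and then reused for every instruction in that block.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Block:
    """One block of trials with a constant addressee state (hypothetical structure)."""
    addressee_distracted: bool  # the single 'bit', observable throughout the block
    targets: List[str]          # object names to be mentioned, one per instruction


def plan_block(block: Block) -> List[Dict[str, object]]:
    """Make one either/or decision per block and apply it to every instruction."""
    # The decision is computed once, before any instruction is planned.
    emphasise = block.addressee_distracted
    # Every instruction in the block inherits the same decision.
    return [{"target": t, "emphasise_target": emphasise} for t in block.targets]


if __name__ == "__main__":
    distracted_block = Block(addressee_distracted=True, targets=["teapot", "airplane"])
    for instruction in plan_block(distracted_block):
        print(instruction)
```

Because the decision is fixed at the start of the block, a sketch of this kind cannot respond to a change in the addressee's attention within a block, which is the limitation raised in the preceding paragraph.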

An alternate possibility is that the effects of distraction in our study are not the result of audience design per se, but rather effects that the addressee’s behaviour has on the speaker’s own cognitive processes. For example, the addressee’s distraction may have led the speaker to be distracted, or at the very least it may have affected the speaker’s ability to plan each utterance. Words tend to be shorter when planning is facilitated (Bell et al., 2009; Christodoulou, 2012; Kahn & Arnold, 2012; see Arnold & Watson, 2012, under review, for a review), which means that audience design effects may be mediated by planning effects, as opposed to an adjustment of speech forms on the basis of a specific representation of the addressee’s needs. This possibility would be consistent with evidence that speakers choose more explicit words when distracted (Rosa & Arnold, 2011).

This planning-based account is consistent with findings from a similar experiment, reported by Arnold et al. (2012). Their experiment used the exact same paradigm and materials, except that the manipulation consisted of the addressee’s behaviour immediately before the second instruction – specifically, whether the addressee anticipated the target object or not. In fact, the ‘waiting’ condition in Arnold et al. (2012) was virtually identical to the ‘attentive’ condition in Experiment 1. However, their findings differ from the ones reported here. That study found that the addressee’s behaviour affected the latency to begin speaking, and the duration of the determiner ‘the’, but not the duration of the target word. Given that the latency and determiner regions are associated with utterance planning, they interpreted that profile of results as evidence that the anticipation behaviour affected planning processes.

By contrast, the current experiment finds effects of distraction on target word duration, and less clear effects on the planning regions. Distraction affected latency to speak in Experiment 1, but not Experiment 2. There was no effect of condition on determiners (see Note 1). The lack of a determiner effect in Experiment 1 is particularly surprising in the context of a condition effect on latency: if both are planning regions, why is determiner duration not affected? One possibility is that the distraction manipulation in the current experiment encouraged a generally slower approach to the task overall, even in the attentive condition. This may have led participants to pre-plan their utterances more than the participants in Arnold et al. (2012). This is supported by the fact that the latency to speak was shorter in the waiting condition in Arnold et al. (2012) (586 ms) than in the attentive condition in Experiment 1 (666.36 ms), while the durations of the determiner and target were longer in the waiting condition (Arnold et al., 2012; determiner: 141 ms; target: 412 ms) than in the attentive condition in Experiment 1 (determiner: 105.46 ms; target: 363.63 ms).
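The cross-study contrast described above is simple arithmetic on the reported means, sketched below in Python. The variable and function names are assumptions of this sketch; the values are the means quoted in the text for the waiting condition of Arnold et al. (2012) and the attentive condition of Experiment 1. These are descriptive differences between separate studies, not tested effects.

```python
# Means (ms) as quoted in the text; all names here are illustrative only.
WAITING_ARNOLD_2012 = {"latency": 586.0, "determiner": 141.0, "target": 412.0}
ATTENTIVE_EXPERIMENT_1 = {"latency": 666.36, "determiner": 105.46, "target": 363.63}


def attentive_minus_waiting():
    """Return Experiment 1 attentive minus Arnold et al. (2012) waiting, per region (ms)."""
    return {
        region: round(ATTENTIVE_EXPERIMENT_1[region] - WAITING_ARNOLD_2012[region], 2)
        for region in WAITING_ARNOLD_2012
    }


if __name__ == "__main__":
    print(attentive_minus_waiting())
```

The latency difference is positive (80.36 ms) while the determiner and target differences are negative (−35.54 ms and −48.37 ms), which is the pattern the text takes as a sign of more pre-planning in the current task.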

Thus, participants in Arnold et al.’s (2012) task seemed to be ‘thinking while speaking’ to a relatively greater degree than in the current task. If the speaker is planning incrementally, small variations in the ease of planning the upcoming target (such as those due to addressee anticipation) can affect the duration of the determiner. If the speaker has pre-planned the target before speaking, as participants seem to have done in the current experiment, word durations can be shorter (Christodoulou, 2012), and any small differences in ease of planning should have been reconciled before utterance initiation.

Nevertheless, we still found longer target pronunciations when the addressee was distracted. If participants were doing relatively more pre-planning in this experiment than in Arnold et al. (2012), it means that the effect of addressee distraction on target word duration is not primarily due to facilitation of the production processes. This suggests that the effect of distraction is likely a ‘true’ effect of audience design. This is consistent with the strength of the manipulation here – distraction was a salient, global manipulation, whereas Arnold et al. (2012) used a transitory manipulation of anticipation. The salience of addressee distraction – and the fact that it could be calculated on a one-bit model – may have facilitated the engagement of audience design processes, in addition to any planning-mediated effects of addressee behaviour.

In sum, our findings contribute to mounting evidence that variation in word duration is affected by the speaker’s perception of the addressee’s behaviour and/or mental state. This effect goes beyond the influence of situational variables like the Lombard effect (Lane & Tranel, 1971). Moreover, we found that distraction has multiple effects, including the lexical specificity of the utterance, the delay to begin speaking and the duration of critical words. These findings contribute to the idea that ‘audience design’ is not a single process, and instead, a single dimension – like word duration – can respond to addressee behaviour in multiple ways.

Acknowledgements

This research was supported by NSF grant BCS-0745627. We gratefully acknowledge the assistance of Giulia Pancani.

Note

1. The only analysis in which distraction affected determiner duration was for Experiment 2, when the analysis was limited to items where the speaker used the intended label for the target object. In this analysis, the effect of condition was marginal (t(248) = −1.92, p = .056).

References

Arnold, J. E. (1998). Reference form and discourse patterns (Dissertation). Stanford University.

Arnold, J. E. (2001). The effects of thematic roles on pronoun use and frequency of reference. Discourse Processes, 31(2), 137–162. doi:10.1207/S15326950DP3102_02

Arnold, J. E. (2008). Reference production: Production-internal and addressee-oriented processes. Language and Cognitive Processes, 23, 495–527. doi:10.1080/01690960801920099

Arnold, J. E. (2010). How speakers refer: The role of accessibility. Language and Linguistics Compass, 4(4), 187–203. doi:10.1111/j.1749-818X.2010.00193.x

Arnold, J. E., Hudson Kam, C., & Tanenhaus, M. K. (2007). If you say thee uh- you’re describing something hard: The on-line attribution of disfluency during reference comprehension. Journal of Experimental Psychology: Learning, Memory, and Cognition, 33, 914–930. doi:10.1037/0278-7393.33.5.914

Arnold, J. E., Kahn, J. M., & Pancani, G. (2012). Audience design affects acoustic reduction via production facilitation. Psychonomic Bulletin & Review, 19, 505–512. doi:10.3758/s13423-012-0233-y

Arnold, J. E., Tanenhaus, M. K., Altmann, R. J., & Fagnano, M. (2004). The old and thee, uh, new. Psychological Science, 15, 578–582. doi:10.1111/j.0956-7976.2004.00723.x

Arnold, J. E., & Watson, D. (2012). Synthesizing meaning and processing approaches to prosody: Performance matters. Manuscript submitted for publication.

Balota, D., & Chumbley, J. (1985). The locus of word-frequency effects in the pronunciation task: Lexical access and/or production? Journal of Memory and Language, 24(1), 89–106. doi:10.1016/0749-596X(85)90017-8

Bard, E. G., Anderson, A. H., Aylett, M., Doherty-Sneddon, G., & Newlands, A. (2000). Controlling the intelligibility of referring expressions in dialogue. Journal of Memory and Language, 42(1), 1–22. doi:10.1006/jmla.1999.2667

Bard, E. G., & Aylett, M. (2005). Referential form, duration, and modelling the listener in spoken dialogue. In J. Trueswell & M. Tanenhaus (Eds.), Approaches to studying world-situated language use: Bridging the language-as-product and language-as-action traditions (pp. 173–191). Cambridge, MA: MIT Press.

Bell, A., Brenier, J. M., Gregory, M., Girand, C., & Jurafsky, D. (2009). Predictability effects on durations of content and function words in conversational English. Journal of Memory and Language, 60(1), 92–111. doi:10.1016/j.jml.2008.06.003

Bell, A., Jurafsky, D., Fosler-Lussier, E., Girand, C., Gregory, M., & Gildea, D. (2003). Effects of disfluencies, predictability, and utterance position on word form variation in English conversation. The Journal of the Acoustical Society of America, 113, 1001–1024. doi:10.1121/1.1534836

Breen, M., Fedorenko, E., Wagner, M., & Gibson, E. (2010). Acoustic correlates of information structure. Language and Cognitive Processes, 25, 1044–1098. doi:10.1080/01690965.2010.504378

Brennan, S. E., & Clark, H. H. (1996). Conceptual pacts and lexical choice in conversation. Journal of Experimental Psychology: Learning, Memory and Cognition, 22, 1482–1493. doi:10.1037/0278-7393.22.6.1482

Brown, G. (1983). Prosodic structure and the given/new distinction. In A. Cutler & D. R. Ladd (Eds.), Prosody: Models and measurements (pp. 67–77). Berlin: Springer.

Brown-Schmidt, S. (2009). The role of executive function in perspective taking during online language comprehension. Psychonomic Bulletin & Review, 16, 893–900. doi:10.3758/PBR.16.5.893

Brown-Schmidt, S., & Tanenhaus, M. K. (2006). Watching the eyes when talking about size: An investigation of message formulation and utterance planning. Journal of Memory and Language, 54, 592–609. doi:10.1016/j.jml.2005.12.008

Chafe, W. (1976). Givenness, contrastiveness, definiteness, subjects, topics and point of view. In C. Li (Ed.), Subject and topic (pp. 25–55). New York: Academic Press.

Chafe, W. (1987). Cognitive constraints on information flow. In R. Tomlin (Ed.), Coherence and grounding in discourse (pp. 21–51). Amsterdam: John Benjamins.

Christodoulou, A. C. (2012). Variation in word duration and planning (Ph.D. dissertation). University of North Carolina at Chapel Hill.

Clark, H. H. (1996). Using language. Cambridge: Cambridge University Press.

Clark, H. H., & Fox Tree, J. E. (2002). Using uh and um in spontaneous speaking. Cognition, 84(1), 73–111. doi:10.1016/S0010-0277(02)00017-3

Clark, H. H., & Krych, M. A. (2004). Speaking while monitoring addressees for understanding. Journal of Memory and Language, 50(1), 62–81. doi:10.1016/j.jml.2003.08.004

Clark, H. H., & Wasow, T. (1998). Repeating words in spontaneous speech. Cognitive Psychology, 37, 201–242. doi:10.1006/cogp.1998.0693

Gahl, S., & Garnsey, S. (2004). Knowledge of grammar, knowledge of usage: Syntactic probabilities affect pronunciation variation. Language, 80, 748–775. doi:10.1353/lan.2004.0185

Galati, A., & Brennan, S. E. (2010). Attenuating information in spoken communication: For the speaker, or for the addressee? Journal of Memory and Language, 62(1), 35–51. doi:10.1016/j.jml.2009.09.002

Gorman, K. S., Gegg-Harrison, W., Marsh, C. R., & Tanenhaus, M. K. (2012). What’s learned together stays together: Speakers’ choice of referring expression reflects shared experience. Journal of Experimental Psychology: Learning, Memory and Cognition. Epub ahead of print.

Gundel, J., Hedberg, N., & Zacharski, R. (1993). Cognitive status and the form of referring expressions in discourse. Language, 69, 274–307. doi:10.2307/416535

Halliday, M. A. K. (1967). Intonation and grammar in British English. The Hague: Mouton.

Heller, D., Gorman, K. S., & Tanenhaus, M. K. (2012). To name or to describe: Shared knowledge affects referential form. Topics in Cognitive Science, 4(2), 290–305.

Horton, W. S., & Keysar, B. (1996). When do speakers take into account common ground? Cognition, 59(1), 91–117. doi:10.1016/0010-0277(96)81418-1

Jurafsky, D., Bell, A., Gregory, M., & Raymond, W. (2001). Probabilistic relations between words: Evidence from reduction in lexical production. In J. Bybee & P. Hopper (Eds.), Frequency and the emergence of linguistic structure (pp. 229–254). Amsterdam: John Benjamins.

Kahn, J., & Arnold, J. E. (2012). A processing-centered look at the contribution of givenness to durational reduction. Journal of Memory and Language, 67(3), 311–325.

Kahn, J., & Arnold, J. E. (2012). Speaker-internal processes drive durational reduction. Manuscript submitted for publication.

Kuhlen, A. K., & Brennan, S. E. (2010). Anticipating distracted addressees: How speakers’ expectations and addressees’ feedback influence storytelling. Discourse Processes, 47, 567–587. doi:10.1080/01638530903441339

Ladd, R. (1996). Intonational phonology. Cambridge: Cambridge University Press.

Lam, T. Q., & Watson, D. G. (2010). Repetition is easy: Why repeated referents have reduced prominence. Memory & Cognition, 38, 1137–1146. doi:10.3758/MC.38.8.1137

Lane, H., & Tranel, B. (1971). The Lombard sign and the role of hearing in speech. Journal of Speech and Hearing Research, 14, 677–709.

Lindblom, B. (1990). Exploring phonetic variation: A sketch of the H and H theory. In W. J. Hardcastle & A. Marchal (Eds.), Speech production and speech modeling (pp. 403–439). Dordrecht: Kluwer.

Pasupathi, M., Stallworth, L. M., & Murdoch, K. (1998). How what we tell becomes what we know: Listener effects on speakers’ long-term memory for events. Discourse Processes, 26(1), 1–25. doi:10.1080/01638539809545035

Rosa, E. C., & Arnold, J. E. (2011). The role of attention in choice of referring expression. In L. Carlson, C. Hoelscher & T. F. Shipley (Eds.), Proceedings of the 33rd annual conference of the cognitive science society. Austin, TX: Cognitive Science Society.

Sityaev, D. (2000). The relationship between accentuation and information status of discourse referents: A corpus-based study. UCL Working Papers in Linguistics (12).

Venditti, J. J., & Hirschberg, J. (2003). Intonation and discourse processing. Proceedings of ICPhS 2003, Barcelona, pp. 107–114.

Watson, D. G., Arnold, J. E., & Tanenhaus, M. K. (2008). Tic tac TOE: Effects of predictability and importance on acoustic prominence in language production. Cognition, 106, 1548–1557. doi:10.1016/j.cognition.2007.06.009
