probabilistic online processing of sentence anomalies

26
Full Terms & Conditions of access and use can be found at https://www.tandfonline.com/action/journalInformation?journalCode=plcp21 Language, Cognition and Neuroscience ISSN: (Print) (Online) Journal homepage: https://www.tandfonline.com/loi/plcp21 Probabilistic online processing of sentence anomalies Laurel Brehm, Carrie N. Jackson & Karen L. Miller To cite this article: Laurel Brehm, Carrie N. Jackson & Karen L. Miller (2021): Probabilistic online processing of sentence anomalies, Language, Cognition and Neuroscience, DOI: 10.1080/23273798.2021.1900579 To link to this article: https://doi.org/10.1080/23273798.2021.1900579 © 2021 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group View supplementary material Published online: 15 Mar 2021. Submit your article to this journal Article views: 96 View related articles View Crossmark data

Upload: others

Post on 06-Apr-2022

2 views

Category:

Documents


0 download

TRANSCRIPT

Full Terms & Conditions of access and use can be found athttps://www.tandfonline.com/action/journalInformation?journalCode=plcp21

Language, Cognition and Neuroscience

ISSN: (Print) (Online) Journal homepage: https://www.tandfonline.com/loi/plcp21

Probabilistic online processing of sentenceanomalies

Laurel Brehm, Carrie N. Jackson & Karen L. Miller

To cite this article: Laurel Brehm, Carrie N. Jackson & Karen L. Miller (2021): Probabilisticonline processing of sentence anomalies, Language, Cognition and Neuroscience, DOI:10.1080/23273798.2021.1900579

To link to this article: https://doi.org/10.1080/23273798.2021.1900579

© 2021 The Author(s). Published by InformaUK Limited, trading as Taylor & FrancisGroup

View supplementary material

Published online: 15 Mar 2021.

Submit your article to this journal

Article views: 96

View related articles

View Crossmark data

REGULAR ARTICLE

Probabilistic online processing of sentence anomaliesLaurel Brehma,b,c, Carrie N. Jacksona,d and Karen L. Millera,e

aCenter for Language Science, Pennsylvania State University, University Park, PA, USA; bMax Planck Institute for Psycholinguistics, Nijmegen,Netherlands; cDepartment of Psychology, Pennsylvania State University, University Park, PA, USA; dDepartment of Germanic and SlavicLanguages and Literatures, Pennsylvania State University, University Park, PA, USA; eDepartment of Spanish, Italian, and Portuguese,Pennsylvania State University, University Park, PA, USA

ABSTRACTListeners can successfully interpret the intended meaning of an utterance even when itcontains errors or other unexpected anomalies. The present work combines an onlinemeasure of attention to sentence referents (visual world eye-tracking) with offlinejudgments of sentence meaning to disclose how the interpretation of anomalous sentencesunfolds over time in order to explore mechanisms of non-literal processing. We use ametalinguistic judgment in Experiment 1 and an elicited imitation task in Experiment 2. Inboth experiments, we focus on one morphosyntactic anomaly (Subject-verb agreement; Thekey to the cabinets literally *were… ) and one semantic anomaly (Without; Lulu went to thegym without her hat ?off) and show that non-literal referents to each are considered uponhearing the anomalous region of the sentence. This shows that listeners understandanomalies by overwriting or adding to an initial interpretation and that this occursincrementally and adaptively as the sentence unfolds.

ARTICLE HISTORYReceived 14 July 2020Accepted 16 February 2021

KEYWORDSSentence processing; visual-world paradigm; subject-verb agreement; implicitnegation; noisy channel;good-enough processing

1. Introduction

Language is full of variability with the potential tohinder communication. Speakers produce utteranceswith varied success, making occasional disfluenciesor speech errors, and what is considered acceptableto one speaker might be ungrammatical or infelicitousto another. Yet, despite these potential obstacles,most messages are successfully conveyed fromspeaker to listener. The present work examines theonline processing and ultimate comprehension ofanomalous ungrammatical sentences, such as The keyto the cabinets literally *were on the table, and seman-tically implausible sentences, such as Lulu visited thegym without her hat ?off late yesterday night. By com-bining an offline measure of sentence interpretationwith an online measure of attention to referents, wedemonstrate the real-time processing mechanismsthat allow the comprehension of these morphosyntac-tic and semantic anomalies. This provides insight intohow attention, meaning and morphosyntax contributeto language comprehension and sheds light on whatallows communication to be so robust despite poten-tial pitfalls.

1.1. Non-literal processing

Sometimes understanding anomalous utterances iseasy, leading to many circumstances where a literalutterance is efficiently interpreted non-literally. Oddphrases like The mother gave the candle the daughter(Gibson et al., 2013) or Lulu went to the gym withouther hat off (Brehm et al., 2018; Frazier & Clifton, 2015)are often reconstrued to add an element (The mothergave the candle to the daughter) or to eliminate anelement (Lulu went to the gym without her hat __). Simi-larly, erroneous sentences like The key to the cabinets*were on the table are also often interpreted as if theyinclude a grammatical, verb-matching subject noun(The keys to the cabinets were on the table; Brehm et al.,2018; Patson & Husband, 2016).

The good enough processing (Christianson et al., 2001,2010; Ferreira, 2003) and noisy channel frameworks(Gibson et al., 2013; Levy et al., 2009) suggest mechan-isms for how non-literal processing could occur.Broadly, both suggest that our expectations aboutlikely mappings between semantics, pragmatics, andmorphosyntactic forms can be used to reconstrue anodd or erroneous utterance. If an utterance has an

© 2021 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis GroupThis is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives License (http://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited, and is not altered, transformed, or builtupon in any way.

CONTACT Laurel Brehm [email protected] Max Planck Institute for Psycholinguistics, P.O. Box 310, Nijmegen 6500 AH, NetherlandsSupplemental data for this article can be accessed at https://doi.org/10.1080/23273798.2021.1900579.

LANGUAGE, COGNITION AND NEUROSCIENCEhttps://doi.org/10.1080/23273798.2021.1900579

unlikely form or unlikely meaning, comprehenders mayinfer that what was literally said was not what wasintended. This is a sensible strategy if we assume thatthe goal of conversation is the communication of mean-ings – not literal forms – between speakers and listeners.

The good enough processing framework was orig-inally developed to describe how readers can interpretgarden-path sentences non-literally: in these sentences,an initial interpretation that turns out not to be licensedby the sentence’s syntax remains available once disam-biguating information has appeared (Christiansonet al., 2001). Good enough processing suggests thatthis happens because readers often interpret sentencesusing fast heuristics, instead of slow algorithmic proces-sing (see Karimi & Ferreira, 2016, for a recent review).Heuristics cause individuals to fail to re-parse a sentencewhen its form conflicts with a more likely meaning or toincompletely incorporate new information as it isreceived. The consequence is that an initial interpret-ation often tends to linger even once new evidencepoints against it.

The heuristics incorporated under the good enoughprocessing framework are based upon experience atmultiple levels of processing. This includes argumentstructure, such that likely agents are interpreted asagents (Ferreira, 2003) and syntactic structure, suchthat individuals fail to re-analyse infrequent, garden-path structures and instead stick with an initial parseof a verb’s structure (e.g. Christianson et al., 2001). Prag-matics, focus, and word order also cause comprehendersto process shallowly, leading to a variety of cognitiveillusions (see e.g. Sanford & Sturt, 2002, for review). Evi-dence for good enough processing often comes fromoffline judgments (e.g. Christianson et al., 2001; Ferreira,2003; Frazier & Clifton, 2011, 2015; Patson & Husband,2016), and reading studies (self-paced reading; e.g.Christianson & Luke, 2011; Swets et al., 2008; eye-track-ing while reading; Slattery et al., 2013), though recentwork has expanded into ERPs (Qian et al., 2018). Theupshot of this is that we have robust evidence that indi-viduals sometimes ignore how sentences are syntacti-cally, pragmatically, or semantically odd, and thisoddness does sometimes lead to difficulty in processing.

The more recent noisy channel theory and relatedrational approaches to sentence processing were orig-inally developed to account for the phenomenon thatsometimes readers fail to notice small errors in theinput (see e.g. Gibson et al., 2013). The theory, formalisedunder a Bayesian equation, says that sentence proces-sing relies probabilistically on an algorithm that com-bines prior expectations with the observed form,which causes processing difficulty when expectationsare not met (connected with the concept of surprisal,

Levy, 2008) or causes small errors to be corrected seam-lessly (Gibson et al., 2013).

The noisy channel theory suggests that expectationsof likely utterances map to what is semantically or prag-matically plausible, such that individuals either fail tonotice plausible errors at all (Gibson et al., 2013), orrevise them during reading by maintaining uncertainty(Levy et al., 2009). The likelihood of correcting an erroror revising an implausible utterance is also affected bythe amount of noise in the system (literal noise orsignals of speaker unreliability, see e.g. Gibson et al.,2017), and the size of the edit that is needed toconvert the literal input to a non-literal interpretation(e.g. Gibson et al., 2013). This is evidenced by answersto comprehension questions (Gibson et al., 2013,2017), and increases in regressions and overall readingtime in studies using eye-tracking while reading (Levyet al., 2009). The upshot of this is that errors can be men-tally converted into correct utterances, especially in situ-ations where errors are expected.

In the current paper, we explore predictions of boththeories with one set of data, exploring how non-literalprocessing of anomalous sentences happens dynami-cally over time. Both frameworks overlap substantiallyin their predictions: listeners will re-interpret semanti-cally and syntactically anomalous sentences as some-thing that is more likely. Under good enoughprocessing, non-literal processing arises because oflearned heuristics about argument structure and wordordering. This might cause an initial interpretation of asentence to linger in the face of new input, even if itconflicts with the original meaning. Under the noisychannel framework, non-literal processing arisesbecause of prior experience with meaning-to-form map-pings. This might lead to a swift repair of a small error,immediately overwriting the original interpretation. Byexploring two different anomaly types in a paradigmthat simultaneously investigates offline and online pro-cessing, we can evaluate whether these mechanismsboth contribute to sentence processing. Finally, both fra-meworks predict that individuals should update theirinterpretation of sentences based upon incoming infor-mation; good-enough processing also predicts that lis-teners may not always succeed at revising an earlierinterpretation, and the noisy channel theory also pre-dicts small errors will be corrected quickly.

1.2. Measuring sentence interpretations onlineusing visual world eye-tracking

Much of the previous work on non-literal processing hasfocused on the separate consequences of offline judg-ments and online difficulty during reading (though,

2 L. BREHM ET AL.

see Qian et al., 2018; Slattery et al., 2013). In the currentpaper, we integrate online and offline measures, demon-strating using how sentence interpretations change con-tinuously over time using the visual world paradigm.This provides concrete evidence about when differentrepresentations are activated during processing, andhow this leads to a particular interpretation of a sen-tence. This evidence then can shed light on howquickly reinterpretation can happen and what cues ituses.

The premise of the visual world eye-tracking para-digm is that an individual’s fixations to objects on a com-puter screen while listening to a sentence serve as aproxy for their attention to referents described in thesentence. Changes to patterns of fixations over timeshow how an utterance interpretation changes inresponse to new linguistic information, providing infor-mation about the time-course of language processing(e.g. Allopenna et al., 1998; Altmann & Kamide, 1999;Tanenhaus et al., 1995, among many, see Huettig et al.,2011 for review).

Previous work has used visual world eye-tracking toshow how sentences with multiple possible meaningsare interpreted. For example, one line of work examinesresponses to syntactic ambiguities such as “Tickle thefrog with the feather”. Tracking fixations to objectsmatching each reading of the word “with” (designatingan instrument: a frog, a feather vs. a modifier: a frogholding a feather) shows which referents are consideredat which points in time (see e.g. Ryskin et al., 2017; Sne-deker & Trueswell, 2004; Tanenhaus et al., 1995). Simi-larly, visual world eye-tracking has been used todemonstrate that comprehension of pronouns (Arnoldet al., 2000; Brown-Schmidt et al., 2005; Fukumuraet al., 2010) and quantifiers (Grodner et al., 2010;Huang & Snedeker, 2009) entails integration of multiplesources of information over time. In both lines ofresearch, the key principle is that the visual world para-digm demonstrates how various types of informationcontained within a sentence are quickly and efficientlyintegrated through time to update an expected sen-tence interpretation. The current work is inspired bythis earlier literature on determining reference. Wetrack changes in participants’ fixations, time-locked tocritical words, to items reflecting literal and non-literalinterpretations of sentences. This extends the visualworld paradigm to demonstrate how, and when, thefinal meaning of an anomalous sentence is obtained.

1.3. Current studies

In two studies, we adapt the visual world paradigm toexplore non-literal processing of semantic and

morphosyntactic sentence anomalies. We do so by sub-setting the online data based upon an offline measure ofultimate interpretation; this allows us to show how pat-terns of visual attention correspond to differentinterpretations of sentences. We measured interpret-ation in two ways. Experiment 1 used an offline metalin-guistic task where participants selected the imagematching the subject of the sentence, such that selec-tions of a non-literal subject index non-literal interpret-ations. Experiment 2 used an offline elicited imitationtask where participants repeated the sentence after afiller task, such that sentences repeated with changesprovide an index of non-literal interpretation. The goalwas to provide converging evidence for the linkbetween non-literal interpretations and online proces-sing in two different ways. Asking participants to selectthe subject provides an explicit measure of how the sen-tence was interpreted and links directly to previous non-literal processing research (e.g. Brehm et al., 2018), butrequires use of explicit grammatical knowledge andmay draw attention to the critical items, affecting howvisual attention is allocated. Elicited imitation tapsimplicit linguistic knowledge (see Bley-Vroman & Chau-dron, 1994; Chrabaszcz & Jiang, 2014; Erlam, 2006), butis quite different from tasks previously used to studynon-literal processing. Equivalent results from bothtasks therefore would provide a robust picture of thelink between online and offline processing.

Across experiments, we examined the processing oftwo kinds of anomalies; the goal here was to examinebroad principles of multiple aspects of non-literal pro-cessing. Experiments 1a and 2a examined subject-verbagreement (SVA), while Experiments 1b and 2b exam-ined implicit negation using without and the prepositionoff or on (Without). These reflect examples of non-literalprocessing based primarily on morphosyntax (Exper-iments 1a and 2a) and based primarily on semantics(Experiments 1b and 2b). These two anomalies alsodiffer in their likelihood of occurrence and consistencyof interpretation, allowing us to further examine howexpectations are affected by prior experience, and howexpectations change given experience within an exper-iment. Agreement errors are common and have a pre-ferred interpretation (the head was correct and theverb incorrect), but blends of implicit negation areuncommon and more varied in interpretation (see e.g.Brehm et al., 2018; Frazier & Clifton, 2015). If thesedifferent types of anomalies disclose similar behaviouralpatterns, this demonstrates general principles ofanomaly processing that cut across both non-literal pro-cessing frameworks.

Within the experiments, we focused on three researchquestions. First, we asked whether the anomalies led to

LANGUAGE, COGNITION AND NEUROSCIENCE 3

the non-literal interpretation of sentence subjects: this ispredicted under both non-literal processing frameworks.Next, we asked if, regardless of interpretation, fixationswere driven towards new referents after hearing ananomaly and away from the first referent that wasspecified in the sentence. This shows the speed withwhich individuals revise the interpretation of sentencemeanings given new input, regardless of the ultimateinterpretation. Lingering initial interpretations are pre-dicted under good-enough processing, and fast correc-tions of small errors are predicted under the noisychannel framework. Finally, we also asked how proces-sing each anomaly changes through the course of theexperiment, as individuals develop expectations aboutthe types of anomalies present in it. Changes to proces-sing over time are consistent with the noisy channel fra-mework because expectations result directly from whatthe comprehender has previously observed; this funda-mentally must update over time. Adaptation wouldalso be consistent with experience-driven changes toheuristics under good enough processing because heur-istics also develop out of past experience. We walkthrough how these research questions translate tospecific predictions for each anomaly type below.

1.3.1. Subject-verb agreementIn Experiments 1a and 2a, we examine a morphosyntac-tic anomaly involving subject-verb agreement (SVA).Production errors occur for SVA, such that singular sub-jects with plural “local” (non-subject) nouns often elicitplural verbs (termed “attraction”, e.g. Bock & Miller,1991). SVA production variability also occurs basedupon dialect (e.g. Tagliamonte & Baayen, 2012; Tortora& den Dikken, 2010) and as a result of the phrase’smeaning (e.g. Brehm & Bock, 2013, 2017). This suggeststhat multiple inferences about morphosyntax could belicensed about why an SVA anomaly has occurred.

This variability has consequences for SVA processing.Attraction makes anomalous plural verbs relatively felici-tous (1a is less bad than 1c; e.g. Pearlmutter et al., 1999;Wagers et al., 2009) 1; this is typically explained using acue-based retrieval framework: the local noun (cabinets)is retrieved instead of the head to control the verb sinceit has a verb-matching number feature (see e.g. Dillonet al., 2013; Wagers et al., 2009). Anomalous pluralverbs (1a/1c) also cause non-literal comprehension ofthe head, as do plural non-subject nouns (1a/1b),meaning that in both cases, the literal head noun (key)is interpreted as plural (keys; see Brehm et al., 2018;Patson & Husband, 2016; Schlueter et al., 2019).

(a) The key to the cabinets literally *were on the table(b) The key to the cabinets literally was on the table

(c) The key to the cabinet literally *were on the table(d) The key to the cabinet literally was on the table

Earlier work leaves open whether attraction and non-literal SVA comprehension have the same origins, whichmake this anomaly an ideal candidate to investigate in aparadigm combining online and offline measures. Oneinterpretation of non-literal SVA comprehension is thata plural number feature on the local noun or verbcauses re-construal of the head number (from key tokeys) as a repair (as suggested by Patson & Husband,2016 and consistent with e.g. Levy, 2008). This meansthat non-literal SVA comprehension and attractioncould arise from the same mechanism: recasting thehead’s number from singular to plural to repair the utter-ance would cause increased plural interpretations andalso lead to attraction. Alternatively, attraction couldcome from mis-retrieving the local noun, not as arepair, following the typical view of cue-based retrievalin agreement processing. Or, attraction could arise as arepair, but non-literal interpretations could arise from afaulty or underspecified memory of the utterance (thegist was “plural”; suggested in Patson & Husband,2016, and consistent with Tanner et al., 2014). Finally, amixture of multiple mechanisms could obtain (see Schlu-eter et al., 2019, for discussion).

In the current work, we examine how attention is allo-cated to sentence referents varying in number (seeFigure 1(a)); we can therefore see when the literal (key)and non-literal (keys) versions of the head noun, andliteral and non-literal versions of the local noun(cabinet(s)) receive attention during listening. Thissheds light on the link between processing difficultyand non-literal comprehension, asking whether non-literal interpretations of SVA anomalies are due to thequick integration of information as it unfolds or to alate revision process.

1.3.2. Without blendsIn Experiments 1b and 2a, we examine a semanticallyimplausible utterance combining two implicitly-nega-tive elements: “without” and the preposition “off”. Useof negation is variable in English such that negativeconcord (didn’t have no = had nothing) and compo-sitional negation (didn’t have no = had something) canboth be used, depending on the speaker’s dialect (seee.g. Blanchette, 2013; Blanchette & Lukyanenko, 2019).Having two similar forms available to convey onemessage results in production blends (e.g. Butterworth,1982; Cutting & Bock, 1997). In the case that speakershave more than one dialect available (e.g. StandardisedEnglish and a non-Standardised dialect), the two plansmight compete to elicit a blend. This suggests that

4 L. BREHM ET AL.

multiple inferences might be drawn about why Withoutanomalies are produced (purposefully, reflectingconcord, versus in error).

Correspondingly, comprehension of Withoutanomalies is variable. Potentially-anomalous utteranceslike 2a have two possible meanings, such that theyelicit literal (“Lulu was wearing a hat”) and non-literalinterpretations (“Lulu was not wearing a hat”) in contrastto utterances like 2b with no possible concord or blend,which are nearly always understood literally (“Lulu wasnot wearing a hat”, Brehm et al., 2018; Frazier &Clifton, 2015).

(a) Lulu visited the gym without her hat ?off late yester-day night

(b) Lulu visited the gym without her hat on late yester-day night

The suggestion is that the infrequent co-occurrenceof “without” and “off” and the plausible site for ablend between two sentence plans (without her hat on;with her hat off) can cause listeners to infer the presenceof a speech error despite the fact that the utterance isactually grammatical. Frazier and Clifton (2015)propose that this inference derives from the comprehen-der “reading through” the original utterance, ignoringredundant information when it can be attributed tonoise in the system. This would suggest that when indi-viduals take a non-literal interpretation of this anomaly,they should continue attending to the original referentat the anomalous off, rather than revising their choice.An alternate possibility might be that the comprehendernotices the change in interpretation at the anomalousoff, and spends time considering the alternate referentbefore ultimately selecting the original referent as themost plausible candidate for the utterance meaning.

In the current studies, we examine how Withoutanomalies are comprehended by examining howimplicitly-negative elements (without, off) affect atten-tion to agents who are wearing an optional accessorylike a hat or shoes (Yes-Accessory, see Figure 1(b))versus minimally-different agents with the accessorybeside them (No-Accessory, see Figure 1(b)). The wordwithout may lead a listener to the “No-Accessory”interpretation, drawing attention to the “No-Accessory”image. To obtain the literal interpretation of (2a), a lis-tener then needs to revise the initial interpretation,which might be effortful or time-consuming. We askhow this revision occurs, examining whether non-literal interpretations of Without anomalies are due tothe online integration of unfolding information or dueto offline processes.

2. Experiment 1

In this experiment, we assessed individuals’ interpret-ations of sentences containing SVA (Experiment 1a)and Without anomalies (Experiment 1b) using a meta-linguistic forced-choice judgment combined withvisual world eye-tracking to measure changes ininterpretation over time, within trials and across theexperiment.

2.1. Method

2.1.1. ParticipantsData were collected from 38 members of the Pennsylva-nia State University community (mainly graduate andundergraduate students); participants were paid $10 orgiven course credit in exchange for their participation.Of these, five individuals were excluded because they

Figure 1. Sample image arrays. A: SVA item (The key to thecabinet(s) literally was/were on the table). Top left = literalsubject key; bottom left = non-literal subject keys; right pictures= local noun and a non-literal version of it (cabinet/cabinets). B:Without item (Lulu visited the gym without her hat on/off late yes-terday night). Top left = Yes-Accessory agent (the literalinterpretation of “without + off”); bottom left = No-Accessoryagent (the literal interpretation of “without + on” and non-literal interpretation of “without + off”). Right pictures = locationof the verb and a semantic competitor for it.

LANGUAGE, COGNITION AND NEUROSCIENCE 5

were not native speakers of English, and one wasexcluded due to poor calibration. All 32 remaining par-ticipants were native speakers of English, ranged inage from 18 to 28 (M = 20) and had normal or cor-rected-to-normal vision and hearing.

2.1.2. Materials and designA list of critical items appears in Appendix A; items wereadapted from previous work (Brehm et al., 2018;Solomon & Pearlmutter, 2004). There were four versionsof each SVA item, varied in the number of the local noun(singular/plural) and verb (singular/plural), as in (1a) –(1d) above. There were two versions of each Withoutitem, varied in preposition (on/off), as in (2a) and (2b)above. Filler items included datives (e.g. The devil gavethe treasure to the tooth fairy), numerals (e.g. Twogiraffes were eating grass in the zoo), and embedded rela-tive clauses (e.g. The button that fell off the boy’s coat wasblue).

Items were recorded by a female speaker of AmericanEnglish from the Washington DC area. All critical itemrecordings were cross-spliced before the anomalousregion of the sentence (the verb in SVA trials and thepreposition in Without trials). The first half of eachitem was taken from a canonical sentence (e.g. a sen-tence like 1b, 1d, 2b) and the second half was takenfrom either another recording of a canonical sentenceor from an anomalous sentence (1a, 1c, 2a). This wasdone to match stimuli across sets of items prior to thesentence anomaly.

Images were simple coloured line drawings. Thesewere taken from stimulus databases (Duñabeitia et al.,2017; Rossion & Pourtois, 2004) or public domainsources (Wikimedia Commons, Pixabay, and Flickr),created using Pixton.com, or drawn by the first author.Images for SVA trials (e.g. as in Figure 1(a)) containedone- and two-token version of both the head and localnouns; in the two-token images, the colours of eachtoken were adjusted to make them different. Imagesfor Without trials (e.g. as in Figure 1(b)) contained anagent (half female, half male) with an accessory (“Yes-Accessory”), an agent without an accessory (“No-Acces-sory”), the theme or argument of the verb, and anitem/location related to the theme/argument of theverb. For the “No-Accessory” images, the accessoryappeared near, but separate from, the agent.

Filler trials used similar image types. Numeral fillerscontained images with one to three tokens of an item,dative fillers contained two people and two semanti-cally-similar objects/animals, and relative clause fillerscontained a pair of images with “Yes-Accessory” or

“No-Accessory” version of the agent and a pair ofobjects with a colour contrast.

The session consisted of 108 trials, divided into fourblocks of 27 trials. This included 48 critical SVA items,24 critical Without items, 20 dative fillers, 8 fillers begin-ning with a numeral, and 8 fillers containing anembedded relative clause. Items were assigned to 16lists with a Latin square design such that the four ver-sions of SVA items and two versions of Without itemswere equally represented and so that the target imagewas equally likely to appear in each screen location.Within a list, items were presented in a fixed pseudo-ran-domised order such that an equal number of items ofeach condition were presented in each block, suchthat no more than two critical items were adjacent,and such that no two items in the same conditionwere adjacent. Two participants viewed each list.

2.1.3. Apparatus and procedureParticipants were instructed to listen to sentences and toclick on the image from a four-image array that bestmatched the subject of the sentence. “Subject” wasdefined for participants at the onset of the experimentwith the statement: “The subject of the sentence is thedo-er of an action or the thing that is being described”.

Each trial began with a fixation target in the middle ofthe screen that the participant clicked on to start thetrial. Then, four images appeared for 1500 ms ofpreview (as in Figure 1), a long duration in contrast toprevious work which was designed to allow participantssufficient time to “spot the difference” between similarimages and to fully encode the entire array. Thepreview phase was followed by a sentence played overspeakers (range of lengths: 2213 ms to 3495 ms). Partici-pants were then given an unlimited amount of time toselect the image corresponding to the sentencesubject. Once the participant clicked an image, anorange box appeared around the selection andremained there for 2000ms. Participants were offeredthe chance to take breaks between blocks and thesession lasted between 30 and 40 min.

The experiment was run using Experiment Builder ona Dell PC with a 21 × 11.5 inch monitor and an EyeLink1000 Plus Desktop in remote mode (sample rate500 Hz) placed 20 inches in front of the computermonitor. Participants were seated about 20 inches infront of the camera. Images appeared in ports thatwere 6.75 inches wide by 5 inches high, spaced 2.75inches apart horizontally and 1.5 inches apart vertically;these ports defined the experimental interest areas. Allsounds were played over speakers at a volume comfor-table to the participant.

6 L. BREHM ET AL.

2.1.4. AnalysisAll analyses were conducted in R (version 3.3.3; R CoreTeam, 2014) using the package lme4 (version 1.1-13;Bates et al., 2014). Analysis of subject selection used logis-tic regression with odds of selecting a non-literal targetobject as the dependentmeasure. Analysis of eye-trackingdata used linear regression with the dependent measureofproportionoffixations perpersonper trial to competitorobjects in 500 ms windows offset from onsets of criticalwords by 200 ms, reflecting the typical time to launch aneye movement (e.g. Cooper, 1974). These windows wereselected because they were of a similar length to thenouns and verbs in the sound stimuli and did notoverlap in time; 500 ms also provided a reasonableamount of data per window. Analyses were performedseparately for each window without correction for mul-tiple comparisons, as is typically done in the field.However, because the data are not completely indepen-dent across time windows due to autocorrelation, wealso note in the tables any effects that are not significantwhen the alpha level was corrected by dividing by thenumber of timewindowswithin an experiment. All predic-torswere contrast codedandcentered;moredetail followswithin eachexperiment. Randominterceptswere includedfor Participants and Items and as many random slopes asjustified by the data were included (e.g. Barr et al., 2013),with slopes being removed due to non-convergence(beginning with the highest-order interactions) or due tocorrelations above 0.9 between them. For eye-trackinganalyses, p-values were obtained by model comparison.

2.2. Experiment 1a: SVA trials

The SVA anomaly reflects a syntactic dependency goneawry. We examine the meaning interpreted from anom-alous, ungrammatical utterances such as The key to thecabinet(s) literally *were on the table in contrast to cano-nical utterances containing the verb was, providinginsight into howmorphosyntax is processed non-literallyand how attraction arises in comprehension. The predic-tion was that if non-literal interpretation and attractionarise from the same mechanisms – a recasting of thehead or head number – local plural nouns and pluralverbs should increase the likelihood of non-literalinterpretation and increase fixations to objects otherthan the literal head noun, but that if the two are funda-mentally separate, patterns should differ betweeninterpretation and eye-tracking measures.

2.2.1. AnalysisAnalysis of subject selections examined the odds ofselecting the plural number competitor to the head

(“Keys”) in the offline interpretation task, with the predic-tors of Local Noun Type (singular, plural, contrasts of .5,-.5), Verb Type (singular, plural, contrasts of .5, -.5), andTrial Number (scaled and centered). Analysis of eye-tracking data examined the proportion of fixations perparticipant per trial to the plural version of the head(e.g. “Keys”) in four 500 ms time windows. Timewindows began 200 ms after onset of the head noun,the local noun, and the verb, and 700 ms after theonset of the verb, with the aim of capturing processingin response to each critical word, and after hearing theverb and integrating it with the rest of the sentence.Analysis of eye-tracking data was performed only onthe trials where the canonical subject was selected, asthis was the overwhelmingly dominant response. Forthe eye tracking analyses, predictors were Local Noun(singular, plural, centered contrasts of .48, -.51), Verb(singular, plural, centered contrasts of .47, -.53) andTrial Number (scaled and centered).2

2.2.2. ResultsSubject selection. As shown in Figure 2, plural local nounsand plural verbs increased the odds of selecting the non-literal head “Keys” as the subject of the sentence. Plurallocal nouns led to selection of the non-literal head 10%of the time, vs. 5% for singular local nouns, and pluralverbs led to selection of the non-literal head 14% ofthe time, vs. 2% for singular verbs. These patternswere confirmed by mixed-effect modelling (see Table 1).

On top of these patterns, we also observed a maineffect of Trial Number and an interaction between TrialNumber and Verb such that the non-literal head “Keys”was selected more often at the beginning than theend of the experiment (11% in Block 1 vs. 6% in Block4)3, and the decrease was clearest in the plural verb con-ditions (dropping from 16% in Block 1 to 12% in Block 4).We did not test whether the interaction between localnoun and verb was qualified by trial number: since thenon-literal head was selected only on one trial in thesingular local noun, singular verb condition, weomitted the three-way interaction between LocalNoun, Verb and Trial Number from the model of thesedata.4

Proportion of fixations. Proportions of fixations split bycondition appear in Figure 3. Throughout the trial, par-ticipants mostly fixated the image of the literal headnoun (e.g. “Key”) and its non-literal number competitor(e.g. “Keys”), with few fixations to the images represent-ing the local noun and its competitor (e.g. “Cabinet”/“Cabinets”).

There were minimal differences between sentenceversions before the anomalous region of the sentence(see Table 2 for mixed-effect models). In the first time

LANGUAGE, COGNITION AND NEUROSCIENCE 7

window, representing processing relative to the headnoun (“key”), there were no significant differences infixations to the non-literal head (“Keys”). In the secondtime window, representing processing relative to thelocal noun (“cabinet(s)”), there was a marginal effect ofTrial Number (p=0.052) and a significant interactionbetween Local Noun and Trial Number such that trialsat the end of the experiment elicited fewer fixations tothe non-literal head (“Keys”) and the reduction waslargest for items with singular local nouns (droppingfrom 32% in Block 1 to 20% in Block 4).

Figure 2. Subject selection for Experiment 1 SVA trials, split by condition.

Table 1. Results from logistic mixed-effect regression of subjectselection (odds of selecting head noun number competitorversus all other items) for Experiment 1 SVA trials.Fixed effects b̂ SE p(z)

Intercept −4.76 0.51 <0.001Local Noun −1.85 0.58 0.01Verb −4.14 0.77 <0.001Trial Number −0.95 0.26 <0.001Local Noun × Verb −1.85 1.14 0.10Local Noun × Trial Number 0.17 0.27 0.53Verb × Trial Number −1.46 0.51 <0.01Random effects Groups SD

Item Intercept 0.49Participant Intercept 1.45

Figure 3. Average proportion of fixations to images in Experiment 1 SVA trials receiving a literal sentence interpretation (KEY selectedas the subject), zeroed to reflect verb onset. Vertical lines represent onsets of head nouns, local nouns, and verbs; panels representsentence conditions. Confidence bands are 95% CIs from a non-parametric bootstrap (1000 iterations) sampled over participants withreplacement at 10 msec intervals.

8 L. BREHM ET AL.

Larger differences between sentence versionsappeared after the anomaly “were” (see Table 2). In thethird time window, representing processing relative tothe verb (“was/were”) there was a significant maineffect of Local Noun such that a plural local nounincreased fixations to the non-literal head uponhearing the verb (21% for plural local nouns, 17% forsingular local nouns) and a main effect of Trial Numbersuch that fixations to the non-literal head uponhearing the verb decreased during the experiment(from 24% in Block 1 to 14% in Block 4). In the fourthtime window, representing processing after hearingand integrating the verb with prior context, there wasa significant main effect of Verb such that havingheard a plural verb increased fixations to the non-literal head (23% for plural verbs, vs. 14% for singularverbs), and a main effect of Trial Number such thatfixations to the non-literal head decreased during theexperiment (from 24% in Block 1 to 15% in Block 4).

2.2.3. DiscussionFor SVA anomalies (The key to the cabinet(s) *were/was… ) containing local plural nouns and plural verbs, thenon-literal sentence subject (“Keys”) was selected moreoften in a metalinguistic judgment task compared tocontrol items. This replicates previous results (e.g.Brehm et al., 2018; Patson & Husband, 2016). Localplural nouns and plural verbs also influenced attentionon trials where the literal sentence subject (“Key”) wasselected. Within these trials, plural features on localnouns (cabinets) and verbs (were) led to an increase infixations to a non-literal version of the head (“Keys”).This shows how conflicting number features lead listen-ers to consider an alternate referent and provides evi-dence that attraction and non-literal processing areboth supported – at least in part – by repair processes.

These eye-tracking results further suggest that as inthe standard cue-based retrieval view of agreementattraction (e.g. Wagers et al., 2009), attraction happenswhen a plural verb cues retrieval of a plural controller.Existing literature interprets this pattern to mean thelocal noun controls agreement. However, we observedthat a plural version of the head was chosen instead torepresent the sentence subject. This difference couldbe due to the task: as an anonymous reviewer notes, par-ticipants tended to fixate both versions of the head nounprior to hearing the verb, which could cause the non-literal head to be encoded, and later, retrieved, followingstandard cue-based retrieval mechanisms. Alternatively,the comprehender might retrieve the head noun lemmaand a plural feature, which are recombined to make anon-literal plural version of the head (see Patson &Husband, 2016 for similar logic).Ta

ble2.

Results

from

linearm

ixed-effectregressionof

prop

ortio

nof

fixatio

nsto

head

noun

numbercom

petitor

binn

edat

500msectim

ewindo

wsforE

xperiment1

SVAtrialseliciting

aliteral-headnu

mberinterpretatio

n.

Fixedeff

ects

Windo

w1:

Headno

unregion

Windo

w2:

Localn

ounregion

Windo

w3:

Verb

region

Windo

w4:

Post-Verbregion

b̂SE

p(χ2)

b̂SE

p(χ2)

b̂SE

p(χ2)

b̂SE

p(χ2)

Intercept

0.25

0.01

<0.00

10.26

0.02

<0.00

10.20

0.02

<0.00

10.20

0.02

<0.00

1LocalN

oun

0.01

0.02

0.44

0.00

0.02

0.87

−0.05

0.02

0.03

0.03

0.02

0.11

Verb

−0.01

0.02

0.52

−0.02

0.02

0.32

−0.00

0.02

0.94

−0.12

0.02

<0.00

1TrialN

umber

−0.02

0.01

0.09

−0.02

0.01

0.05

−0.04

0.01

<0.00

1−0.04

0.01

<0.00

1LocalN

oun×Verb

0.01

0.04

0.78

−0.03

0.04

0.41

0.00

0.03

0.95

−0.01

0.03

0.79

LocalN

oun×Trial

0.02

0.02

0.21

−0.04

0.02

0.02

0.02

0.02

0.15

−0.01

0.02

0.82

Verb

×Trial

0.00

0.02

0.36

−0.03

0.02

0.15

−0.00

0.02

0.78

0.03

0.02

0.07

LocalN

oun×Verb

×Trial

−0.02

0.04

0.61

−0.00

0.04

0.89

−0.04

0.03

0.25

0.06

0.03

0.06

Rand

omeff

ects

Group

sSD

Group

sSD

Group

sSD

Group

sSD

Item

Intercept

0.04

Item

Intercept

0.04

Item

Intercept

0.03

Item

Intercept

0.04

Participant

Intercept

0.04

Participant

Intercept

0.06

Participant

Intercept

0.07

Participant

Intercept

0.08

Verb

0.01

LocNou

n0.04

LocNou

n0.07

LocNou

n0.07

Trial

0.02

Verb

0.05

Residu

al0.30

Residu

al0.30

Residu

al0.33

LocNou

n×Verb

0.08

Residu

al0.33

Effectsareitalicised

whenthey

wou

ldbe

sign

ificant

undera0.05

alph

alevel,bu

tno

tan

alph

alevelcorrected

formultip

lecomparison

s.

LANGUAGE, COGNITION AND NEUROSCIENCE 9

Changes in fixations to the non-literal head occurredin separate time windows for nouns and verbs,suggesting their separate influence on interpretation.These patterns also tended to attenuate with exposure,such that at the end of the experiment, non-literalinterpretations occurred less often and fewer fixationswere made to non-literal referents in time-windowstwo, three, and four. Combined, these data suggestthat, consistent with the noisy channel framework,offline non-literal interpretations can arise due to a fastonline repair process, which is impacted by past andrecent experience.

2.3. Experiment 1b: without trials

This experiment focuses on processing of Withoutanomalies, a semantically-odd sentence that could beproduced purposefully or as an error. We examine howitems such as Lulu visited the gym without her hat ?offlate yesterday night are interpreted in contrast to plaus-ible control items, asking whether Without anomalieslead to increases in non-literal interpretations and dis-closing how interpretations change during processing.The prediction was that if non-literal interpretationsarise due to a faulty revision process, fixations to theoriginal referent should be maintained even afterhearing the anomaly.

2.3.1. AnalysisThe analysis of subject selections examined the odds ofselecting the non-literal target image, which was the“Yes-Accessory” image for without-on items and the“No-Accessory” image for without-off items. Predictorswere Preposition type (on, off, contrasts of .5, -.5) andTrial Number (scaled and centered).

The eye-tracking analysis examined the proportion offixations per participant per trial to a competitor object

(“Yes-Accessory” image, the less likely agent to beselected) in 500 ms time windows beginning 200 msafter the onset of without and the preposition on/off,and 700 ms after the onset of the preposition, reflectingprocessing in response to without, the preposition, andafter integrating the preposition with the rest of the sen-tence. Preposition was combined with subject selectionto yield a three-level variable of Interpretation: “Without-On” items interpreted literally, “Without-Off” items inter-preted non-literally, and “Without-Off” items interpretedliterally.5 These were analysed with centered Helmertcontrasts. The first contrast compared trials withdifferent subject selections, contrasting the No-Acces-sory selections (“Without On”+literal = .08 plus“Without Off”+non-literal = .30) vs. the Yes-Accessoryselections (“Without Off”+literal =−0.62); the secondcontrast compared trials when the No-Accessorypicture was selected (“Without On”+literal =−0.22 vs.“Without Off” + non-literal = 0.78).6 Scaled and centeredTrial Number was also entered as a predictor.

2.3.2. ResultsSubject selection. Items containing the non-anomalouspreposition on led to little variability in responding(see Figure 4), mainly eliciting selection of the literally-interpreted No-Accessory agent as the sentencesubject (95% of trials). The rest of the trials were splitbetween the Yes-Accessory agent (3%) representing anon-literal interpretation, and the theme/location (2%),representing confusion about the subject. Items withthe potentially-anomalous preposition off were inter-preted more variably, with the Yes-Accessory agent(literal interpretation) selected on 72% of trials, the No-Accessory agent (non-literal interpretation) selected on26%, of trials and the theme/location or its foil selectedon 1% of trials. This was reflected in a main effect of

Figure 4. Subject selection for Experiment 1 Without trials, split by condition.

10 L. BREHM ET AL.

Preposition on non-literal responses in a mixed-effectmodel; see Table 3.

As Experiment 1a, non-literal response rates droppedthrough the course of the experiment. In a mixed-effectmodel, (see Table 3), this was reflected in a main effect ofTrial Number, such that 26% of the responses in Block 1were non-literal but 9% of the responses in Block 4 werenon-literal. No other significant main effects or inter-actions were observed.

Proportion of fixations. As shown in Figure 5, partici-pants tended to fixate both potential agents roughlyequally (“Yes-Accessory” and “No-Accessory”) untilhearing the word without; while there was a slightnumeric tendency to prefer the No-Accessory agent,this difference was not statistically reliable. At thispoint, looks to the No-Accessory agent rose. The

preposition off often led to a revision of the originalinterpretation, such that when the sentence was inter-preted literally, looks to the Yes-Accessory agentincreased at late time windows of the sentence.Overall, few looks were directed to the location/theme (depicted in grey) or its competitor (notdepicted for simplicity).

Mixed-effect model results are shown in Table 4. Inthe first analysis window, representing processing inresponse to the word without, there was a main effectof Trial Number such that more looks were directedtoward the “Yes-Accessory” agent late in the experiment(weighted mean Block 1: 27% vs. Block 4: 31%). TrialNumber interacted with the second Interpretation con-trast such that for non-literalWithout-Off-“No-Accessory”interpretations, fixations to the “Yes-Accessory” agentdropped during the experiment (from 32% in Block 1to 25% in Block 4), while for literal Without-On-“No-Accessory” interpretations, fixations to the “Yes-Acces-sory” agent rose during the experiment (from 22% inBlock 1 to 34% in Block 4). This suggests that if partici-pants directed their attention away from the “Yes-Acces-sory” agent before hearing the sentence anomaly (e.g.for Without-Off sentences), they were less likely toselect it as the subject.

In the next window, representing processing inresponse to the preposition, an interaction between

Table 3. Results from logistic mixed-effect regression of subjectselection (odds of selecting non-literal subject versus all otheritems) for Without trials.Fixed effects b̂ SE p(z)

Intercept −6.92 1.65 <0.001Preposition −9.35 3.29 <0.01Trial Number −0.96 0.39 0.01Preposition × Trial Number −1.16 0.78 0.14

Random effects Groups SD

Item Intercept 5.62Preposition 11.19

Participant Intercept 0.00

Figure 5. Average proportion of fixations to images in of Experiment 1 Without trials, zeroed to reflect preposition onset. Vertical linesrepresent onsets of without and prepositions; panels represent sentence conditions crossed with sentence interpretation. The non-literal interpretation of “Without-On” (bottom left panel) was not analysed because of an insufficient number of trials but we present ithere for completeness. Confidence bands are 95% CIs from a non-parametric bootstrap (1000 iterations) sampled over participantswith replacement at 10 msec intervals.

LANGUAGE, COGNITION AND NEUROSCIENCE 11

Trial Number and the Interpretation contrast 1 wasobserved such that more looks were directed towardthe “Yes-Accessory” agent at the end of the experimentespecially if it was selected as the subject (26% in Block1, vs. 45% in Block 4). This suggests that participantslearned to tune their attention during the experimentin response to the preposition.

All differences were largest in the third time window,representing processing after hearing and integratingthe anomaly with prior context. In this window, therewas a main effect of the Interpretation contrast 1, suchthat items for which the “Yes-Accessory” agent wasselected elicited more looks to the “Yes-Accessory”agent (61% of fixations, vs. “No-Accessory” weightedmean of 11%). This interacted with Trial Number suchthat fixations to the “Yes-Accessory” agent increasedduring the experiment when the “Yes-Accessory” agentwas selected as the sentence subject (36% of fixationsin Block 1, vs. 70% in Block 4).

2.3.3. DiscussionForWithout anomalies (without her hat ?off), we observedmore selections of a non-literal sentence subject (“No-Accessory”) in a metalinguistic judgment task comparedto non-anomalous control items. This replicates previouswork (e.g. Brehm et al., 2018; Frazier & Clifton, 2015). Wealso showed that fixations to possible referents varieddepending on whether the sentence contained a poten-tial anomaly and how the sentence was interpreted.When without + off was interpreted literally, there weremore fixations to the “Yes-Accessory” agent after theword off was processed, demonstrating how revisionsin interpretation can be prompted by even potentially-anomalous elements. In contrast, when without + off

was interpreted non-literally, there were few fixations tothe competitor “Yes-Accessory” agent in the sameregion, such that all trials receiving the same “No-Acces-sory” interpretation had statistically equivalent patternsof fixations to the competitor object. This underscoresthe connection between visual attention and inferredmeaning, a premise of the visual-world paradigm, evenin anomalous sentences. It also underscores how, consist-ent with the good-enough processing framework, non-literal interpretations arise from the dynamic revision ormaintenance of how a sentence is interpreted inresponse to new information.

As in Experiment 1a, all patterns tended to changewith exposure: at the end of the experiment, fewernon-literal interpretations occurred and fewer fixationswere made to non-literal referents. This demonstrateshow exposure influences expectations about what a sen-tence is likely to mean and confirms the link betweenanomaly processing and experience with the mappingbetween likely meanings and likely forms.

3. Experiment 2

In this experiment, we assessed individuals’ interpret-ations of sentences containing SVA (Experiment 2a)and Without anomalies (Experiment 2b) using an elicitedimitation task. We again paired this with visual-worldeye-tracking in order to measure interpretationchanges during processing. Elicited imitation providesan implicit measure of sentence meaning withoutrequiring a metalinguistic judgment; the question waswhether the online and offline patterns in Experiment1 replicated with an implicit measure of non-literalinterpretation.

Table 4. Results from linear mixed-effect regression of proportion of fixations to yes-accessory agent binned at 500 msec timewindows for Experiment 1 Without trials split by interpretation.

Fixed effects

Window 1:Without region

Window 2Preposition region

Window 3:Post-Preposition region

b̂ SE p(χ2) b̂ SE p(χ2) b̂ SE p(χ2)

Intercept 0.28 0.02 <0.001 0.25 0.03 <0.001 0.19 0.02 <0.001Interpretation 1: (Yes- vs. No-Accessory) 0.02 0.06 0.78 −0.10 0.07 0.13 −0.61 0.06 <0.001Interpretation 2: (No-Accessory Literal vs.Non-Literal)

−0.06 0.07 0.43 −0.13 0.07 0.08 0.08 0.06 0.19

Trial Number 0.04 0.02 0.02 0.00 0.00 0.77 0.00 0.00 0.34Interpretation 1 (Yes- vs. No-Acc) × Trial 0.08 0.06 0.18 0.00 0.00 0.04 −0.01 0.00 <0.001Interpretation 2 (Yes-Acc Lit vs. Non-Lit) ×Trial

−0.15 0.07 0.03 0.00 0.00 0.40 0.00 0.00 0.28

Random effects Groups SD Groups SD Groups SD

Item Intercept 0.03 Item Intercept 0.06 Item Intercept 0.03Trial 0.02 Interp. 1 0.11 Participant Intercept 0.06

Participant Intercept 0.08 Participant Intercept 0.09 Interp. 1 0.20Interp. 1 0.14 Interp. 1 0.17 Residual 0.31Interp. 2 0.11 Interp. 2 0.10Trial 0.04 Residual 0.33

Residual 0.36

Effects are italicised when they would be significant under a 0.05 alpha level, but not an alpha level corrected for multiple comparisons.

12 L. BREHM ET AL.

3.1. Method

3.1.1. ParticipantsData were collected from 35 members of the Pennsylva-nia State University community (mainly undergraduateand graduate students); participants were paid $10 orgiven course credit in exchange for their participation.Two individuals were excluded due to missing audiodata and one was excluded due to poor eye-trackingcalibration. The 32 remaining participants were allnative speakers of English, ranged in age from 18 to 36(M = 23) and had normal or corrected-to-normal visionand hearing.

3.1.2. Materials and designCritical items were identical to Experiment 1. Toimplement a distractor task for the elicited imitationparadigm, two-thirds of the filler items (24 out of 36,22% of all items) were altered by swapping nounsbetween pairs of fillers from Experiment 1 to makethem refer to highly unlikely events (e.g. Three clockswere eating grass in the zoo).

3.1.3. Apparatus and procedureThe first phase of the trial matched Experiment 1: trialsbegan with a fixation cross, a preview of four images,and a sentence played over the computer speakers.Images remained on screen 4000 ms after the onset ofthe sentence. The next phase was inspired by Chra-baszcz and Jiang (2014). Participants performed a prag-matic distractor task in which they were given unlimitedtime to judge “How likely is this event?” by selectingfrom a three-point scale (Likely, Not Sure, Unlikely)with the mouse. Next, images were re-presented in thesame or scrambled locations in order to elicit the pro-duction of the sentence presented earlier in the trial.For critical trials, all four images appeared in a new pos-ition, counterbalanced such that across participants,each image appeared equally often in each position;the goal of this was to increase the difficulty of therecall task. For filler trials, the new array was identicalor had two to four images in a new position. Thenumber of trials, length of experiment, and apparatusmatched Experiment 1.

3.1.4. AnalysisRepetitions were coded as to whether they reflectedliteral or non-literal interpretations, as outlined separ-ately for each sub-experiment below. Logisticregressions on odds of non-literal interpretation andeye-tracking analyses on fixations to the non-literalhead were performed as described in Experiment 1.

3.2. Experiment 2a: SVA trials

Experiment 1a examined items with a potential syntacticanomaly (SVA items: The key to the cabinet(s) *were/was).This experiment disclosed that a non-literal sentencesubject (“Keys”) was selected more often when the sen-tence contained plural local nouns or plural verbs andthat these plural cues also increased attention to a non-literal version of the head noun during processingwhen the sentence was interpreted literally. The goal ofExperiment 2awas to assesswhether these patterns repli-cated with a more implicit offline measure. If so, pluralfeatures should again lead to more non-literal inflec-tion-changing repetitions and the pattern of fixationsduring processing should broadly mirror Experiment 1a.

3.2.1. AnalysisThe analysis of sentence repetitions examined the logodds of sentences being produced non-literally, asdefined as changes to the number inflections on thetwo preamble nouns or the verb. We compared thesenon-literal responses against identical repetitions andother types of repetition error. Identical repetitionswere all productions in which all content words werereproduced as in the original prompt, including trialswith disfluencies or restarts. The remaining trials wereclassified as other repetitions. These included all trialswhere the inflections matched the original prompt butthe adverb was misplaced or the preposition waschanged. We contrasted the non-literal repetitionswith all other repetition types because only the non-literal repetition of inflections could change whichpicture matched the sentence subject.

The analysis of eye-tracking data examined the pro-portion of fixations directed at the non-literal headimage across the same time windows as described inExperiment 1a. In this analysis, we included all identicaland other repetitions: trials in which all noun and verbinflections were reproduced as in the prompt.

As in Experiment 1a, the analysis of repetitions con-tained the predictors of Local Noun (singular, plural, con-trasts of .5, -.5), Verb (singular, plural, contrasts of .5, -.5),and Trial Number (scaled and centered). Analysis of eye-tracking data focused on the comprehension phase ofthe experiment, with the same windows and analysisprocedure as in Experiment 1a. Predictors were LocalNoun (singular, plural, centered contrasts of .47, -.53),Verb (singular, plural, centered contrasts of .47, -.53)and Trial Number (scaled and centered).

3.2.2. ResultsPragmatic distractor task. Prompts were judged as“likely” the majority of the time (Local plural, plural

LANGUAGE, COGNITION AND NEUROSCIENCE 13

verb: 84%, Local plural, singular verb: 87%, Local singu-lar, plural verb: 78%, Local singular, singular verb:85%), with the remaining trials more often rated as “unli-kely” (Local plural, plural verb: 12%, Local plural, singularverb: 10%, Local singular, plural verb: 15%, Local singu-lar, singular verb: 11%) than “not sure” (Local plural,plural verb: 5%, Local plural, singular verb: 4%, Localsingular, plural verb: 7%, Local singular, singular verb:4%). In a model predicting the odds of “likely” ratingsby Local Noun and Verb, with random intercepts byParticipants and Items and a random slope for Item byVerb, there were reliably more “likely” ratings whenthe prompt contained a plural local noun(b̂ = −0.36, SE = 0.16, z = −2.20, p . 0.05) or asingular verb(b̂ = 0.78, SE = 0.24, z = −8.01, p . 0.01).

Sentence repetition. Prompts were repeated identi-cally on slightly more than half of the trials (55% of alltrials), with high rates of repetitions that changed partof the sentence but preserved all inflections (otherresponses; 29% of all trials) and lower rates of inflec-tion-changing repetitions (16% of all trials). Followingthe tendency for elicited imitations to mirror the stat-istics of the language (see e.g. Erlam, 2006 for review),most of the inflection-changed repetitions were gram-matical (60%). See Table 5 for a fine-grained coding ofrepetitions.

As shown in Table 5, inflections were changed to formtwo main types of repairs. For trials containing anoma-lous plural verbs, the verb was changed from plural tosingular on 15% of trials, while the head was changedfrom singular to plural on only 4% of trials. These twotypes of repairs are likely to reflect different inferencesabout the source of the error: verb inflections are likelyto be mis-produced, but noun inflections are morelikely to be mis-heard (see Brehm et al., 2018 forfurther discussion). However, given the sparse data, we

collapse both together to analyse the overall odds ofinflection changes – our measure of non-literal interpret-ation in this experiment. This also serves as a more directcomparison to Experiment 1a.

Mirroring Experiment 1a, odds of inflection changewere affected by the Local Noun and Verb in the originalprompt and by Trial Number. As shown in Figure 6,plural local nouns and plural verbs led to increasednumbers of repetitions with inflection changes (lightand bright pink bars), with most changes occurringwhen the local noun and verb were both plural (26%of trials). These patterns manifested as significant maineffects of Local Noun and Verb and an interactionbetween them in mixed-effect models (see Table 6).An interaction between Trial Number and Verb wasalso observed such that plural verb-containing trials eli-cited fewer inflection changes late in the experiment(from 31% in Block 1 to 18% in Block 4).

We also qualitatively examined whether the samepatterns obtained in non-literal trials that were pro-duced as grammatical versus ungrammatical utter-ances (see bold versus non-bold items in the “non-literal responses” section in Table 5). The pattern inthe grammatical non-literal responses (bright pinkin Figure 6) was comparable to the omnibus results,such that these increased when the prompt con-tained local plural nouns and plural verbs. In contrast,ungrammatical non-literal responses seemed only tobe affected by local noun number; this suggeststhat plural local nouns tended to elicit attractionerrors in this elicited imitation paradigm (light pinkin Figure 6).

Proportion of fixations. Proportions of fixations ontrials in which all inflections were repeated correctly(identical and other repetitions) split by conditionappear in Figure 7. Compared to Experiment 1a, partici-pants fixated less often on the head noun and its

Table 5. Response types by category, Experiment 2 SVA trials.Response type Loc Plural, Plural Verb Loc Plural, Singular Verb Loc Singular, Plural Verb Loc Singular, Singular Verb

Identical repetition 174 210 216 247No response 1 1 2 1Non-literal responses (Inflections changed)N1 changed 19 4 2 0N2 changed 27 34 14 8V changed 42 21 38 9N1 and N2 changed 9 2 2 0N1 and V changed 1 0 0 0N2 and V changed 1 1 3 1N1, N2, and V changed 1 1 0 0Other changes (Inflections preserved)Adverb & verb re-ordered 83 86 81 100Adverb changed 2 8 3 1Preposition changed 4 3 6 6All other changes 20 13 17 11

Grammatical utterances in bold.

14 L. BREHM ET AL.

competitor and more often on the prompt’s local noun(“Cabinet”/ “Cabinets”), such that all four imagesreceived more comparable numbers of looks.

Mixed-effect models (see Table 7) confirmed thatthere were no reliable differences by condition in pro-portions of fixations to the non-literal head noun com-petitor (“Keys”) during the head noun and local nounregions (time windows 1 and 2). However, mixed-effectmodels showed clear differences in proportions offixations to the non-literal competitor (e.g. “Keys”) afterthe anomalous verb (“were”) appeared, as shown inTable 7.

Figure 6. Sentence repetition for Experiment 2 SVA trials, split by condition and whether the inflection change led to a grammatical orungrammatical utterance.

Table 6. Results from logistic mixed-effect regression ofresponses with number inflection changes, versus identicaland other repetitions, for Experiment 2 SVA trials.Fixed effects b̂ SE p(z)

Intercept −2.36 0.24 <0.001Local Noun −1.12 0.19 <0.001Verb −1.27 0.19 <0.001Trial Number −0.07 0.09 0.47Local Noun × Verb −0.74 0.37 0.05Local Noun × Trial Number −0.12 0.18 0.52Verb × Trial Number 0.77 0.18 <0.001Local Noun × Verb × Trial Number 0.32 0.36 0.37Random effects Groups SD

Item Intercept 0.86Participant Intercept 0.94

Figure 7. Average proportion of fixations to images in the listening phase of SVA trials where all inflections were repeated veridically(identical and other trials), zeroed to reflect verb onset. Vertical lines represent onsets of head nouns, local nouns, and verbs in audiorecording; panels represent sentence conditions. Confidence bands are 95% CIs from a non-parametric bootstrap (1000 iterations)sampled over participants with replacement at 10 msec intervals.

LANGUAGE, COGNITION AND NEUROSCIENCE 15

In the third time window, representing processingrelative to the verb, there was a significant main effectof Local Noun such that items containing a plural localnoun had increased fixations to the non-literal head(21% for plural vs. 16% for singular local nouns). Therewas also a main effect of Trial Number such thatfixations to the non-literal head decreased during theexperiment (from 24% in Block 1 to 17% in Block 4),with no other significant main effects or interactions.

In the fourth time window, representing processingafter hearing the verb and integrating it with the priorcontext, there was a significant interaction betweenVerb and Trial Number, such that plural verbs increasedfixations to the non-literal head, but only in the begin-ning of the experiment (from 30% for plural vs. 23%for singular in Block 1 to 18% for plural vs. 24% for singu-lar in Block 4). There was also a main effect of TrialNumber such that fixations to the non-literal headdecreased throughout the experiment (from 26% inBlock 1 to 21% in Block 4), with no other significantmain effects or interactions.

3.2.3. DiscussionWe showed that SVA items (The key to the cabinet(s)*were/was) containing local plural nouns and pluralverbs elicited more repetitions where changes weremade to inflections than items with local singularnouns and singular verbs; this replicates the offlineresults of Experiment 1a with an implicit task. Non-literal repetitions most often resulted in grammaticalutterances, which supports the notion that the maindriver of non-literal repetitions is gist memory for theprompt’s meaning (as in Erlam, 2006; Potter & Lombardi,1990). Ungrammatical non-literal utterances were pro-duced more often following plural local nouns; thissuggests that in addition to measuring attraction incomprehension, this paradigm also elicits attractionerrors due to language production constraints (e.g.Bock & Miller, 1991; see Tanner et al., 2014 for discussionof differences between attraction in comprehension andproduction).

When we focused on the online processing duringtrials for which all inflections were correctly reproduced,participants looked less often at the literal and non-literal head nouns (“Key/Keys”) and more often at thelocal noun (“Cabinet”/“Cabinets”) than in Experiment1a, despite listening to the same sentence prompts. Assuch, the data from Experiment 2a more clearly alignwith the canonical visual-world pattern of looking tomentioned objects (e.g. Allopenna et al., 1998).However, the most important patterns still replicatedwhat we found in Experiment 1a. Local plural nouns(cabinets) and anomalous verbs (“were”) increasedTa

ble7.Results

from

linearm

ixed-effectregressionofprop

ortio

noffixatio

nsto

head

noun

numbercom

petitor

binn

edat500msectim

ewindo

wsforSVA

trialswhereallinflectio

nswere

repeated

correctly

(identicalandothertrials).

Fixedeff

ects

Windo

w1:

Headno

unregion

Windo

w2:

Localn

ounregion

Windo

w3:

Verb

region

Windo

w4:

Post-Verbregion

b̂SE

p(χ2)

b̂SE

p(χ2)

b̂SE

p(χ2)

b̂SE

p(χ2)

Intercept

0.20

0.01

<0.00

10.17

0.01

<0.00

10.20

0.01

<0.00

10.21

0.02

<0.00

1LocalN

oun

−0.00

0.02

0.85

0.01

0.01

0.64

−0.07

0.02

<0.00

1−0.00

0.02

0.85

Verb

−0.01

0.02

0.68

0.02

0.01

0.13

−0.01

0.02

0.72

−0.03

0.02

0.11

TrialN

umber

−0.01

0.01

0.11

−0.00

0.01

0.51

−0.03

0.01

<0.01

−0.02

0.01

0.02

LocalN

oun×Verb

0.01

0.03

0.97

0.02

0.03

0.54

−0.03

0.03

0.36

−0.01

0.03

0.78

LocalN

oun×Trial

0.03

0.02

0.08

0.02

0.02

0.19

0.02

0.02

0.30

−0.01

0.02

0.59

Verb

×Trial

−0.02

0.02

0.28

−0.03

0.02

0.09

−0.02

0.02

0.30

0.05

0.02

<0.01

LocNou

n×Verb

×Trial

0.04

0.03

0.22

0.04

0.03

0.22

0.05

0.03

0.12

−0.04

0.03

0.29

Rand

omeff

ects

Group

sSD

Group

sSD

Group

sSD

Group

sSD

Item

Intercept

0.03

Item

Intercept

0.04

Item

Intercept

0.04

Item

Intercept

0.06

Participant

Intercept

0.03

Participant

Intercept

0.04

Participant

Intercept

0.06

LocNou

n0.03

Residu

al0.27

Residu

al0.27

LocNou

n0.01

Participant

Intercept

0.06

Residu

al0.29

LocNou

n0.07

Verb

0.05

Trial

0.03

LocNou

n×Verb

0.03

Residu

al0.29

Effectsareitalicised

whenthey

wou

ldbe

sign

ificant

undera0.05

alph

alevel,bu

tno

tan

alph

alevelcorrected

formultip

lecomparison

s.

16 L. BREHM ET AL.

fixations to the non-literal plural head (“Keys”) in thesame time windows as in Experiment 1a. Specifically,local plural nouns increased fixations to the non-literalplural head (“Keys”) at the region following the verbanomaly on all trials, whereas plural verbs affectedfixations to the non-literal plural head at the finalregion of the sentence in the beginning of the exper-iment only. We also observed that fixations to the non-literal head noun tended to attenuate in generalthroughout the experiment, doing so significantly intime windows 3 and 4. As in Experiment 1a, this rep-resents adaptation to syntactic anomalies within thecontext of the experiment. Both patterns underscorefurther that, consistent with the noisy channel frame-work, non-literal processing occurs due to the quick revi-sion of sentence meanings while listening.

3.3. Experiment 2b

Experiment 1b examined items containing a potentialsemantic anomaly (Without-off blends: Lulu visited thegym without her hat ?off late yesterday night). Weshowed that these items elicited more selections of anon-literal sentence subject (“No-Accessory”) in a meta-linguistic judgment task compared to non-anomalouscontrol items, and demonstrated that fixations to a“Yes-Accessory” agent after hearing the prepositiondepended on the interpretation of the utterance. Thegoal of Experiment 2b was to assess whether these pat-terns replicated with an implicit comprehension task.The prediction was that items containing the poten-tially-anomalous preposition off should lead to morenon-literal repetitions where the sentence was repro-duced in an “unblended” fashion by either changingthe without or the preposition, and that patterns offixations to a non-literal agent after hearing the preposi-tion would depend upon how the utterance wasrepeated.

3.3.1. AnalysisThe analysis of sentence repetitions examined the logodds of sentences being produced in a non-literalfashion by making changes to the preposition orwithout (changing without to with, deleting the preposi-tion, or changing the preposition), compared to all otherrepetitions. Identical repetitions were coded as pro-ductions where all content words matched the originalprompt and included trials with disfluencies or restarts,and the remaining trials were classified as other rep-etitions. These were most commonly trials where thepreposition and without matched the original promptbut the name was mis-recalled or the predicate waschanged. As in Experiment 1a, we contrasted non-

literal repetitions with all other trials (identical andother trials) because non-literal repetitions involvingchanges to the without and preposition were the trialsin which the referent of the sentence could be changed.

The eye-tracking analysis was performed on all trialsreceiving a clear literal interpretation of the referent(correct and other repetitions) with the same timewindows as in Experiment 1b. As in Experiment 1b, wechose to only analyse trials receiving a literal interpret-ation of the referent (identical and other trials)because there were relatively few non-literal repetitionsoverall. This analysis included the centered predictorPreposition (on: -.48 vs. off: .52). Scaled and centeredTrial Number was also entered as a predictor.

3.3.2. ResultsPragmatic distractor task. Prompts were judged as“likely” the majority of the time (Without-on: 82%,Without-off: 82%), with the remaining trials more oftenrated as “unlikely” (Without-on: 15%, Without-off: 12%),than “not sure” (Without-on: 2%, Without-off: 6%). In amodel predicting the odds of “likely” ratings by Preposi-tion with random intercepts by Participants and Itemsand a random slope for each by Preposition, ratingswere statistically equivalent for both prepositions(b̂ = −0.49, SE = 0.55, z = −0.90, p = 0.37).

Sentence repetition. As in Experiment 2a, odds ofrepeating the item identically to the prompt were some-what low (64% of all trials), with high rates of other rep-etitions (preserving referent; 30% of all trials). Changesto the without or preposition were infrequent (6% oftrials), and most of these changed the referent (5% ofall trials; see boldface entries in Table 8). The mostcommon type of non-identical repetition was a type ofother response where the time phrase at the end ofthe sentence was mis-repeated (e.g. last Wednesdaynight -> yesterday evening, occurring on 23% of trials;note that these were coded as “literal” repetitions for

Table 8. Response types by category, Experiment 2 Withouttrials.Response type Without-On without-Off

Identical repetition 272 220No response 1 1Non-literal responses (Without and/or preposition changed)Preposition changed 7 24Preposition omitted 4 1Without -> With 1 5Without-> With; Preposition changed 1 1Other changes (Without and preposition maintained)Time phrase changed 63 87Name changed 5 8Verb phrase changed 17 18Object changed 2 2Multiple changes 11 17

Referent different than prompt in bold.

LANGUAGE, COGNITION AND NEUROSCIENCE 17

the analyses). See Table 8 for responses by category andcondition.

As in Experiment 1b, odds of non-literal repetitions ofthe without or preposition (whether preserving or chan-ging referent) were affected by Preposition and TrialNumber (see Table 9). As shown in Figure 8 (brightand dark blue bars), anomalous without + off items ledto significantly more non-literal repetitions thancontrol without + on items. Significantly fewer non-literal repetitions occurred at the end of the experiment(from 9% in Block 1 to 3% in Block 4). Qualitatively, mostof the changes in non-literal repetitions resulted in achange of referent (86%), especially in the anomalouswithout + off trials (see Figure 8; bright blue bars); thiswas not analysed statistically due to low power.

Proportion of fixations. Proportions of fixations to trialssplit by repetition type (non-literal = changes to withoutor preposition; literal = identical and other trials) by con-dition appear in Figure 9. The overall pattern mirroredExperiment 1b: Participants fixated both agents (“Yes-Accessory” and “No-Accessory”) until hearing the wordwithout, when attention was directed to the “No-Acces-sory” agent. Then, about 800 milliseconds after hearingthe preposition off, attention was redirected to the “Yes-Accessory” agent if the utterance was interpreted literally.

Mixed-effect models (see Table 10) focused only onliterally interpreted trials (upper left and bottom rightpanels of Figure 9) due to the overall low rates of non-literal sentence repetitions. These models revealed no

reliable differences in the first two time windows(before the preposition or during processing of the pre-position). In the third time window, representing proces-sing after hearing and integrating the anomalouspreposition with the rest of the sentence, there was amain effect of Preposition such that without-off itemsreceived more fixations to the “Yes-Accessory” agent(28% vs. 18%), and a main effect of Trial Number suchthat more fixations were directed to the “Yes-Accessory”agent throughout the course of the experiment (Block 1:20%, Block 4: 26%).

Though there were insufficient non-literally inter-preted without-off items to analyse (comprising only6% of the data), these showed a similar qualitativepattern as Experiment 1b, with few fixations to the“Yes-Accessory” agent in late time windows and asimilar processing profile as the literal without-on trials(see Figure 9, upper right panel).

3.3.3. DiscussionFor Without anomalies (without her hat ?off) comparedto control items, sentence prompts were more likely tobe repeated with changes to the without or the preposi-tion, leading to changes to the referent of the sentence.This replicates the findings of Experiment 1b in an eli-cited imitation task, providing further evidence for thelink between non-literal interpretations and remem-bered meanings (as in e.g. Erlam, 2006; Potter & Lom-bardi, 1990).

As in Experiment 1b, we observed that fixations tocandidate referents were modulated by the prepositionpresented in the sentence, such that when without + offwas interpreted literally, there were more fixations to the“Yes-Accessory” agent after the preposition off was pre-sented. This confirms the link between online processingand offline measures of interpretation, consistent withboth non-literal processing frameworks. Finally, as inExperiment 1 and Experiment 2a, patterns of

Table 9. Results from logistic mixed-effect regression ofrepetitions containing without/preposition changes, versusidentical and other repetitions, for Experiment 2 Without trials.Fixed effects b̂ SE p(z)

Intercept −3.96 0.47 <0.001Preposition −1.20 0.42 <0.001Trial Number −0.55 0.20 0.01Preposition × Trial Number −0.33 0.40 0.41Random effects Groups SD

Item Intercept 0.76Participant Intercept 1.34

Figure 8. Sentence repetition for Experiment 2 Without trials, split by condition.

18 L. BREHM ET AL.

interpretation and processing changed through theexperiment, suggesting adaptation to anomalies withexposure.

4. General discussion

We investigated the time-course and outcomes ofprocessing sentence anomalies non-literally by com-bining visual world eye-tracking with two differentoffline measures of interpretation. Experiment 1

used a metalinguistic subject selection task, whileExperiment 2 used an elicited imitation task. In bothexperiments, we examined the processing ofsubject-verb agreement anomalies (SVA, The key tothe cabinet(s) *were) and blends of implicit negation(Without, without her hat ?off). These two attestedanomalies primarily draw upon morphosyntactic andsemantic processing, respectively, providing differentways of looking at how online processing leads tooffline interpretations.

Figure 9. Average proportion of fixations to images in the listening phase of Without trials in Experiment 2, zeroed to reflect pre-position onset. Vertical lines represent onsets of without and prepositions; panels represent sentence conditions crossed with sen-tence interpretation. Confidence bands are 95% CIs from a non-parametric bootstrap (1000 iterations) sampled over participantswith replacement at 10 msec intervals. Only literal interpretations (identical and other repetitions) were submitted to analysesdue to the sparse number of non-literal repetitions (involving changes to without and/or preposition).

Table 10. Results from linear mixed-effect regression of proportion of fixations to yes-accessory agent binned at 500 msec timewindows for literally-interpreted (identical and other repetitions) Experiment 2 Without trials.

Fixed effects

Window 1:Without region

Window 2:Preposition region

Window 3:Post-Preposition region

b̂ SE p(χ2) b̂ SE p(χ2) b̂ SE p(χ2)

Intercept 0.19 0.02 <0.001 0.29 0.02 <0.001 0.23 0.02 <0.001Preposition −0.01 0.02 0.76 0.02 0.02 0.45 0.10 0.03 <0.001Trial Number 0.01 0.01 0.56 0.00 0.00 0.23 0.00 0.00 0.03Preposition × Trial Number −0.02 0.02 0.36 0.00 0.00 0.93 −0.00 0.00 0.71Random effects Groups SD Groups SD Groups SD

Item Intercept 0.04 Item Intercept 0.04 Item Intercept 0.00Preposition 0.03 Trial Num 0.05 Participant Intercept 0.06

Participant Intercept 0.06 Participant Intercept 0.05 Preposition 0.10Residual 0.27 Preposition 0.05 Residual 0.32

Trial Num 0.00Residual 0.31

Effects are italicised when they would be significant under a 0.05 alpha level, but not an alpha level corrected for multiple comparisons.

LANGUAGE, COGNITION AND NEUROSCIENCE 19

For both morphosyntactic (SVA, Experiment 1a and2a) and semantic (Without, Experiment 1b and 2b)anomalies, shifts in attention happened shortly afterhearing anomalous or potentially anomalous elements.This shows how quickly changes in interpretation canoccur: consistent with current versions of goodenough processing (Karimi & Ferreira, 2016) and noisy-channel theories (Gibson et al., 2013), revisions tointerpretation happen online. For SVA items, numbercues on nouns and verbs led to the consideration of ahead noun differing in number at and after hearingthe sentence’s verb. For Without items, the semanticinformation in without and on/off led to the consider-ation of a different agent upon hearing each word. Inboth cases, the revision happened as new informationappeared.

This provides evidence that participants interpretanomalous sentences incrementally, using the materialin the sentence piece-by-piece as it is processed toshape their understanding of sentence meaning. Thisshows how processing sentence anomalies occursusing the same mechanisms and cognitive infrastructureas non-anomalous sentences (see e.g. Altmann &Kamide, 1999; Snedeker & Trueswell, 2004; Tanenhauset al., 1995, among many). These results thereforedemonstrate the utility of pairing visual-world eye-track-ing with offline methods. Doing so shows precisely howand when interpretations of sentences change duringprocessing, and discloses that graded changes ininterpretation can be present even when a sentence isultimately interpreted literally.

4.1. Contrasting online and offline measures

A key contribution of the current work relates to theintegration of online and offline measures. By comparingmeasures at different points in processing, we revealthat not all evidence of online difficulty leads to offlineinterpretation changes. This is essential to considerwhen examining online or offline measures alone, asmany studies in the field of non-literal processing havedone. Specifically, we showed that in the anomalousSVA trials (Experiments 1a and 2a), attention wasdrawn to a plausible non-literal referent even whenthe literal referent was eventually selected. This showshow anomalies capture attention in situations evenwhen the ultimate interpretation is canonical and high-lights the need to consider online processing even whenoffline performance seems unperturbed. Similarly, in theanomalous Without trials (Experiments 1b and 2b),attention was sometimes drawn to an alternate, repair-ing referent, and sometimes left on the original referent.This shows how attention to candidate referents can also

be a predictor of the eventual interpretation and high-lights how online performance leads to an offlineinterpretation. Non-literal processing involves bothquick revision of interpretations and sustained mainten-ance of interpretations, and these data disclose that notall online difficulty is obvious when the analysis is limitedto offline interpretations.

Multiple offline measures can be used to index sen-tence interpretations. In order to fully describe the linkbetween online and offline measures of processing, weran two experiments using different measures. In Exper-iment 1, we used an explicit measure of interpretation,while in Experiment 2, we used an implicit measure ofinterpretation; both experiments used the same onlinetask. In both experiments, there was clear evidence forincreased rates of non-literal interpretations andincreased attention to repairing referents in sentencescontaining potential anomalies. There were, however,two important differences. First, individuals were morelikely to take a non-literal sentence interpretationwhen asked directly to evaluate sentence meaning;this was particularly clear when contrasting Experiment2b versus Experiment 1b. This shows a possible role ofexperimental task in how individuals encode or interpretthe meaning of sentences, and it underscores whatchanges in processing when individuals are asked torepeat speech, versus simply listen.

Second, we observed differences in how attentionwas recruited based upon the task (e.g. as in Griffin &Bock, 2000); this was particularly clear when contrastingExperiment 2a and Experiment 1a. Despite identicalstimuli across the two experiments, the implicit SVAtrials in Experiment 2a elicited attention to all fourimages, while in Experiment 1a, individuals fixatedmainly on the item that reflected the sentence subject.This pattern is consistent with individuals focusingmore on the auditory than the visual input because werequired them to repeat the sentence fragment. Theexplicit direction of attention changed the overall rateof fixating any object, though it had little impact on rela-tive rates of fixations to images by condition and overtime. The implication is that when listening to sentenceprompts, individuals explicitly direct their attentionbased upon the experimental task, which can changeoverall patterns of online processing but may notaffect condition-wise differences. One therefore needsto consider which offline task to use in measuring sen-tence interpretation; while the overall pattern is compar-able, nuances in what is elicited for each task might beimportant for the researcher’s question.

A final consequence of our approach is that it allowsus to link non-literal processing with existing sentenceprocessing theories, building a fuller description of

20 L. BREHM ET AL.

language processing at multiple levels of analysis. As dis-cussed in the Introduction, cue-based retrieval is the cur-rently dominant mechanistic description of agreementerror comprehension (e.g. Dillon et al., 2013; Wagerset al., 2009). It states that an erroneous element pro-motes retrieval of an unlicensed but feature-matchingcontroller from memory. As discussed in Experiment1a, cue-based retrieval also fits the observed eye-track-ing data: individuals attend to both versions of thehead noun (“Key”, and “Keys”) before the site of theanomaly, meaning that both are active in memory tobe later retrieved. We suggest this means that noisychannel edits, especially those that correct dependen-cies, could be supported by cue-based retrieval ofelements from memory. This points to the importanceof considering cognitive mechanisms such as memoryand attention in non-literal processing frameworks. Webelieve this is a promising direction for future work.

4.2. Experience-driven changes in interpretation

While both types of anomalies led to clear evidence ofnon-literal processing, there were also some differencesin how this manifested across anomalies. This shedsimportant light on the link between experience andthe processing of anomalous utterances, an underlyingdriver of both non-literal processing theories. In Exper-iment 2a we observed that even anomalous SVA trialstended to be interpreted and repeated with the literalhead number, showing that the verb, not the headnoun, was typically perceived as the erroneouselement in the dependency. This matches statisticslearned from the world: agreement errors are typicallyoperationalised as a mis-inflected verb rather than amis-inflected subject noun (e.g. as in Bock & Miller,1991), and fits with our earlier work that inferences ofmis-inflected verbs increase when speaker-centeredsources of noise increase (Brehm et al., 2018). In contrast,we observed that the Without trials elicited more vari-able interpretations (Experiment 1b) and imitations(Experiment 2b). To our best estimation, theseanomalies, while attested (e.g. as in Frazier & Clifton,2011, 2015), are extremely infrequent. This low fre-quency could lead to more variability in processing: wesuggest that given the low likelihood of utterances ofthis type, individuals had weaker expectations andwere therefore more variable in identifying the locusof the anomaly.

Across items there could also be differences in thelikelihood of expecting an anomaly and the type ofrepair chosen for it because of properties like real-world plausibility, the frequency of nouns in pluralversus singular forms, or the co-occurrence of nouns

with without on/off. To look at this question, we exam-ined item-level differences in non-literal responding forthe subject-verb items, where we had the most item-level power. Items did vary in rates of non-literalresponding but did so both within and between exper-iments. The vase for the flower(s) elicited the fewest non-literal responses in Experiment 1a (0%) and The tag forthe gift(s) elicited the fewest non-literal responses inExperiment 2a (3%). The banana for the monkey(s) eli-cited the most non-literal responses in Experiment 1a(20%) and The author of the novel(s) elicited the mostnon-literal responses in Experiment 2a (63%). The rela-tive inconsistency of non-literal response rates by itemacross the two experiments might result from the factthat the item set, used in earlier work (e.g. Brehmet al., 2018; a subset of the items also repeat fromPatson & Husband, 2016) might not vary sufficientlyalong important dimensions of real-world experience;a more nuanced exploration of these factors would beworthy of future work.

Changes in processing and interpretation over thecourse of the experiments also inform the linkbetween experience and processing anomalous utter-ances. We clearly showed that individuals adaptedacross the course of the experiment. For both types ofanomalies and both types of offline tasks, there werefewer selections of non-literal subjects at the end ofthe experiment than the beginning, fewer non-literalrepetitions, and there was less attention paid to thenon-literal referent, especially at late time windows inthe critical sentences. This suggests that on all measuresfor both types of sentences, participants adapted to rela-tive frequency of the anomalous items presented in theexperiment, and directed their attention correspond-ingly, such that competitors became less distractingwith experience. This replicates prior work on adaptationto novel or infrequent structures from self-paced reading(e.g. Fine et al., 2013; Fraundorf & Jaeger, 2016) andvisual world eye-tracking (Ryskin et al., 2017), andbuilds upon work showing that even anomalous utter-ances lead to structural priming (Ivanova et al., 2017).The implication is that we interpret utterances usingbiases and heuristics about the mapping betweenintended meaning and observed form, and that thesemappings change with experience. While neither thenoisy channel theory nor the good enough processingframework make explicit predictions about adaptationover time, both accounts are fully consistent with experi-ence-based changes in processing. Under the noisychannel theory, experience would change the compre-hender’s expectations of various sentence types, whileunder the good enough processing framework, experi-ence could change the likelihood of deploying heuristics

LANGUAGE, COGNITION AND NEUROSCIENCE 21

in processing or might cause a comprehender todevelop and deploy a new heuristic to easily processsentence meanings.

4.3. Key principles of non-literal processing

The finding that individuals can quickly change theirinterpretation of a sentence upon encountering ananomaly suggests that individuals maintain uncertaintywhile processing and adapt accordingly. Earlier workhas suggested two ways that uncertainty affects proces-sing. Initial interpretations can be overwritten by a repairthat posits an error in the initial input (e.g. the noisychannel theory; Gibson et al., 2013) or can be maintainedlonger than is ideal (e.g. good-enough processing, Chris-tianson et al., 2001); both lead to non-literal processing.We observed both of these patterns in the current exper-iments. Our data suggest that non-literal SVA interpret-ations happen via repair, such that an anomaly forcesa revision of the original sentence subject’s number(from singular to plural), similar to how repair disfluen-cies encourage a re-parse (e.g. Arnold et al., 2003;Lowder & Ferreira, 2016). Importantly, while the repairis more frequent for sentences with ungrammaticalplural verbs, it also occurs for sentences with grammati-cal singular verbs (as first discussed in Patson &Husband, 2016), and regardless of eventual interpret-ation, plural nouns and plural verbs both triggeredlooks to a corrected version of the head. This demon-strates that while ungrammatical utterances are morelikely to trigger a repair to the literal utterance, perfectlygrammatical but unexpected ones do too. In contrast,non-literal Without interpretations happen by maintain-ing an original interpretation (e.g. “No-Accessory”) in theface of potentially-incongruous input, suggesting thelingering influence of an initial syntactic parse (e.g. Chris-tianson et al., 2001; Slattery et al., 2013), while the literalones happen because of a successful reparse of theinitially-posited input that is no longer “good enough”but fully veridical. This is particularly clear in Experiment1b, where differences in early time windows disclosedthat if participants directed their attention away fromthe “Yes-Accessory” agent before hearing the sentenceanomaly, they were less likely to form a non-literal sen-tence interpretation.

These differences underscore that while there aremany cases that are covered well by both the noisychannel framework and good-enough processing,such as garden-path sentences, there are importantdifferences in the primary predictions of each frame-work. We show that core properties of the noisychannel theory (“overwrite and repair”) and thegood-enough processing framework (“persist in the

face of conflict”) provide important and non-identicalways to describe how listeners understand anomaloussentences. We have shown the primary importance ofspeedy repair for the morphosyntactic SVA anomalies,and the primary importance of lingering represen-tations for the semantic Without anomalies; thisimplies that mechanisms central to both theories areimportant to language processing, but does notclearly show whether each is uniquely important ineach case. To refine our knowledge of how each mech-anism impacts comprehension, and in order to dis-tinguish the relative contribution of each mechanismin everyday language use, future work is needed.Doing so could build a fully-integrated non-literal pro-cessing theory that cuts across algorithmic and cogni-tive levels of processing, uniting two frameworkspreviously developed to account for different typesof anomaly processing. We have taken a steptowards unifying these theories in the present work,showing that both describe important principles com-prehenders can use to process non-literally. We hopethat the paradigm outlined in the present work canbe used as a tool to carefully build further theory.For example, one could investigate how anomalies ofdifferent grain sizes (words, affixes, phonemes/letters) at different processing levels (semantic, syntac-tic) with varied real-world frequencies affect whatmechanisms comprehenders use to understand realis-tically anomalous speech; doing so would furtherproperties of interest to both good-enough andnoisy-channel processing.

5. Conclusion

Language processing is a hard task that feels remark-ably easy. By linking the domains of non-literal proces-sing and visual world eye-tracking, we show thatindividuals process language anomalies quickly andefficiently, adapting to new information as it appears.By integrating meaning, form, and expectations, weshow that individuals can repair or overlook anomaliespresent in their linguistic input. This leads to the suc-cessful transmittal of a message even when its formhas gone awry.

Notes

1. Note the adverb in these sentences: in agreement com-prehension studies, adding an adverb before the verbserves as a spillover region to better separate processingof local noun and verb number (e.g. Wagers et al., 2009).

2. Centering contrasts allows the intercept and other maineffects to reflect the average of the two conditions.

22 L. BREHM ET AL.

3. While trial was entered as a continuous predictor, wereport means by block for ease of exposition.

4. If the three-way interaction is included in the model,only Verb approaches significance (p = .051) and allfixed effects are correlated at .9 or above.

5. There were too few Without-On non-literal interpret-ations to analyze fixations.

6. These are weighted unevenly due to the differingnumber of observations in each cell; cells with fewerobservations get more weight to center the factor sothat main effects reflect the average of conditions.

Acknowledgements

Supported by a grant from the National Science Foundation(NSF OISE 0968369; PI: J. F. Kroll; co-PIs: P. E. Dussias andJ. G. van Hell). Thanks to Neil Shook for assistance withstimuli and participant testing, to Quiqi Cong for assistancecreating stimuli, to Alison Kelly and Katherine Muschler forassistance with participant testing, and to Jennifer Rhee andAdrianna Shevlin for assistance with transcription; thanks toRachel Ryskin for feedback on study design and analysis andto Antje Meyer for manuscript comments.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Funding

This work was supported by National Science Foundation[grant number 0968369].

References

Allopenna, P. D., Magnuson, J. S., & Tanenhaus, M. K. (1998).Tracking the time course of spoken word recognitionusing eye movements: Evidence for continuous mappingmodels. Journal of Memory & Language, 38(4), 419–439.https://doi.org/10.1006/jmla.1997.2558

Altmann, G. T., & Kamide, Y. (1999). Incremental interpretationat verbs: Restricting the domain of subsequent reference.Cognition, 73(3), 247–264. https://doi.org/10.1016/S0010-0277(99)00059-1

Arnold, J. E., Eisenband, J. G., Brown-Schmidt, S., & Trueswell, J.C. (2000). The rapid use of gender information: Evidence ofthe time course of pronoun resolution from eyetracking.Cognition, 76(1), B13–B26. https://doi.org/10.1016/S0010-0277(00)00073-1

Arnold, J. E., Fagnano, M., & Tanenhaus, M. K. (2003).Disfluencies signal theee, um, new information. Journal ofPsycholinguistic Research, 32(1), 25–36. https://doi.org/10.1023/A:1021980931292

Barr, D. J., Levy, R., Scheepers, C., & Tily, H. J. (2013). Randomeffects structure for confirmatory hypothesis testing: Keepit maximal. Journal of Memory & Language, 68(3), 255–278.https://doi.org/10.1016/j.jml.2012.11.001

Bates, D., Maechler, M., Bolker, B., Walker, S., Christensen,R. H. B., Singmann, H., & Dai, B. (2014). lme4: Linear mixed-

effects models using Eigen and S4 (Version 1.1-13). https://cran.r-project.org/web/packages/lme4/index.html

Blanchette, F. (2013). Negative concord in English. LinguisticVariation, 13(1), 1–47. https://doi.org/10.1075/lv.13.1.01bla

Blanchette, F., & Lukyanenko, C. (2019). Unacceptable gram-mars? An eye-tracking study of English negative concord.Language and Cognition, 11(1), 1–40. https://doi.org/10.1017/langcog.2019.4

Bley-Vroman, R., & Chaudron, C. (1994). Elicited imitation as ameasure of second-language competence. In E. Tarone, S.Gass, & A. Cohen (Eds.), Research methodology in second-language acquisition (pp. 245–261). Lawrence Erlbaum.

Bock, K., & Miller, C. A. (1991). Broken agreement. CognitivePsychology, 23(1), 45–93. https://doi.org/10.1016/0010-0285(91)90003-7

Brehm, L., & Bock, K. (2013). What counts in grammaticalnumber agreement? Cognition, 128(2), 149–169. https://doi.org/10.1016/j.cognition.2013.03.009

Brehm, L., & Bock, K. (2017). Referential and lexical forces innumber agreement. Language, Cognition & Neuroscience,32(2), 129–146. https://doi.org/10.1080/23273798.2016.1234060

Brehm, L., Jackson, C. N., & Miller, K. L. (2018). Speaker-specificprocessing of anomalous sentences. The Quarterly Journal ofExperimental Psychology, Advance Online Publication.

Brown-Schmidt, S., Byron, D. K., & Tanenhaus, M. K. (2005).Beyond salience: Interpretation of personal and demonstra-tive pronouns. Journal of Memory & Language, 53(2), 292–313. https://doi.org/10.1016/j.jml.2005.03.003

Butterworth, B. (1982). Speech errors: Old data in search of newtheories. In A. Cutler (Ed.), Slips of the tongue and languageproduction (pp. 73–109). Walter de Gruyter.

Chrabaszcz, A, & Jiang, N. (2014). The role of the nativelanguage in the use of the English nongeneric definitearticle by L2 learners: A cross-linguisticcomparison. Second Language Research, 30(3), 351–379.https://doi.org/10.1177/0267658313493432

Christianson, K., Hollingworth, A., Halliwell, J. F., & Ferreira, F.(2001). Thematic roles assigned along the garden pathlinger. Cognitive Psychology, 42(4), 368–407. https://doi.org/10.1006/cogp.2001.0752

Christianson, K., Luke, S. G., & Ferreira, F. (2010). Effects ofplausibility on structural priming. Journal of ExperimentalPsychology: Learning, Memory, & Cognition, 36(2), 538.https://doi.org/10.1037/a0018027

Christianson, K., & Luke, S. (2011). Context strengthens initialmisinterpretations of text. Scientific Studies of Reading, 15(2), 136–166. https://doi.org/10.1080/10888431003636787

Cooper, R. M. (1974). The control of eye fixation by themeaning of spoken language: A new methodology for thereal-time investigation of speech perception, memory, andlanguage processing. Cognitive Psychology, 6(1), 84–107.https://doi.org/10.1016/0010-0285(74)90005-X

Cutting, J. C., & Bock, K. (1997). That’s the way the cookiebounces: Syntactic and semantic components of experimen-tally elicited idiom blends.Memory & Cognition, 25(1), 57–71.https://doi.org/10.3758/BF03197285

Dillon, B., Mishler, A., Sloggett, S., & Phillips, C. (2013).Contrasting intrusion profiles for agreement and anaphora:Experimental and modeling evidence. Journal of Memory &Language, 69(2), 85–103. https://doi.org/10.1016/j.jml.2013.04.003

LANGUAGE, COGNITION AND NEUROSCIENCE 23

Duñabeitia, J. A., Crepaldi, D., Meyer, A. S., New, B., Pliatsikas, C.,Smolka, E., & Brysbaert, M. (2017). Multipic: A standardizedset of 750 drawings with norms for six european languages.The Quarterly Journal of Experimental Psychology, 1–24. https://doi.org/10.1080/17470218.2017.1310261

Erlam, R. (2006). Elicited imitation as a measure of L2 implicitknowledge: An empirical validation study. AppliedLinguistics, 27(3), 464–491. https://doi.org/10.1093/applin/aml001

Ferreira, F. (2003). The misinterpretation of noncanonical sen-tences. Cognitive Psychology, 47(2), 164–203. https://doi.org/10.1016/S0010-0285(03)00005-7

Fine, A. B., Jaeger, T. F., Farmer, T. A., & Qian, T. (2013). Rapidexpectation adaptation during syntactic comprehension.PloS one, 8(10), e77661. https://doi.org/10.1371/journal.pone.0077661

Fraundorf, S. H., & Jaeger, T. F. (2016). Readers generalize adap-tation to newly-encountered dialectal structures to otherunfamiliar structures. Journal of Memory & Language, 91,28–58. https://doi.org/10.1016/j.jml.2016.05.006

Frazier, L., & Clifton, C. (2011). Quantifiers undone: Reversingpredictable speech errors in comprehension. Language, 87(1), 158. https://doi.org/10.1353/lan.2011.0024

Frazier, L., & Clifton, C. (2015). Without his shirt off he saved thechild from almost drowning: Interpreting an uncertain input.Language, Cognition & Neuroscience, 30(6), 635–647. https://doi.org/10.1080/23273798.2014.995109

Fukumura, K., van Gompel, R. P., & Pickering, M. J. (2010). Theuse of visual context during the production of referringexpressions. The Quarterly Journal of ExperimentalPsychology, 63(9), 1700–1715. https://doi.org/10.1080/17470210903490969

Gibson, E., Bergen, L., & Piantadosi, S. T. (2013). Rational inte-gration of noisy evidence and prior semantic expectationsin sentence interpretation. Proceedings of the NationalAcademy of Sciences, 110(20), 8051–8056. https://doi.org/10.1073/pnas.1216438110

Gibson, E., Tan, C., Futrell, R., Mahowald, K., Konieczny, L.,Hemforth, B., & Fedorenko, E. (2017). Don’t underestimatethe benefits of being misunderstood. Psychological Science,28(6), 703–712. https://doi.org/10.1177/0956797617690277

Griffin, Z. M., & Bock, K. (2000). What the eyes say about speak-ing. Psychological Science, 11(4), 274–279. https://doi.org/10.1111/1467-9280.00255

Grodner, D. J., Klein, N. M., Carbary, K. M., & Tanenhaus, M. K.(2010). Some and possibly all, scalar inferences are notdelayed: Evidence for immediate pragmatic enrichment.Cognition, 116(1), 42–55. https://doi.org/10.1016/j.cognition.2010.03.014

Huang, Y. T., & Snedeker, J. (2009). Online interpretation ofscalar quantifiers: Insight into the semantics–pragmaticsinterface. Cognitive Psychology, 58(3), 376–415. https://doi.org/10.1016/j.cogpsych.2008.09.001

Huettig, F., Rommers, J., & Meyer, A. S. (2011). Using the visualworld paradigm to study language processing: A review andcritical evaluation. Acta Psychologica, 137(2), 151–171.https://doi.org/10.1016/j.actpsy.2010.11.003

Ivanova, I., Branigan, H. P., McLean, J. F., Costa, A., & Pickering,M. J. (2017). Do you what I say? People reconstruct thesyntax of anomalous utterances. Language, Cognition andNeuroscience, 32(2), 175–189. https://doi.org/ 10.1080/23273798.2016.1236976

Karimi, H., & Ferreira, F. (2016). Good-enough linguistic represen-tations and online cognitive equilibrium in language proces-sing. The Quarterly Journal of Experimental Psychology, 69(5),1013–1040. https://doi.org/10.1080/17470218.2015.1053951

Levy, R., Bicknell, K., Slattery, T., & Rayner, K. (2009). Eye move-ment evidence that readers maintain and act on uncertaintyabout past linguistic input. Proceedings of the NationalAcademy of Sciences, 106(50), 21086–21090. https://doi.org/10.1073/pnas.0907664106

Levy, R. (2008). A noisy-channel model of rational human sen-tence comprehension under uncertain input. EmpiricalMethods in Natural Language Processing (EMNLP), 234–243.

Lowder, M. W., & Ferreira, F. (2016). Prediction in the processingof repair disfluencies: Evidence from the visual-world para-digm. Journal of Experimental Psychology: Learning,Memory, and Cognition, 42(9), 1400. https://doi.org/10.1037/xlm0000256

Patson, N. D., & Husband, E. M. (2016). Misinterpretations inagreement and agreement attraction. The QuarterlyJournal of Experimental Psychology, 69(5), 950–971. https://doi.org/10.1080/17470218.2014.992445

Pearlmutter, N. J., Garnsey, S. M., & Bock, K. (1999). Agreementprocesses in sentence comprehension. Journal of Memory &Language, 41(3), 427–456. https://doi.org/10.1006/jmla.1999.2653

Potter, M. C., & Lombardi, L. (1990). Regeneration in the short-term recall of sentences. Journal of Memory & Language, 29(6), 633. https://doi.org/10.1016/0749-596X(90)90042-X

Qian, Z., Garnsey, S., & Christianson, K. (2018). A comparison ofonline and offline measures of good-enough processing ingarden-path sentences. Language, Cognition &Neuroscience, 33(2), 227–254. https://doi.org/10.1080/23273798.2017.1379606

R Core Team. (2014). R: A language and environment for statisti-cal computing version 3.1.2. R Foundation for StatisticalComputing. http://www.R-project.org/

Rossion, B., & Pourtois, G. (2004). Revisiting Snodgrass andVanderwart’s object pictorial set: The role of surface detailin basic-level object recognition. Perception, 33(2), 217–236. https://doi.org/10.1068/p5117

Ryskin, R., Qi, Z., Duff, M. C., & Brown-Schmidt, S. (2017). Verbbiases are shaped through lifelong learning. Journal ofExperimental Psychology: Learning, Memory, & Cognition, 43(5), 781–794. https://doi.org/10.1037/xlm0000341

Sanford, A. J. & Sturt, P. (2002). Depth of processing in languagecomprehension: Not noticing the evidence. Trends inCognitive Sciences, 6(9), 382–386. https://doi.org/10.1016/S1364-6613(02)01958-7

Schlueter, Z., Parker, D., & Lau, E. (2019). Error-driven retrieval inagreement attraction rarely leads to misinterpretation.Frontiers in Psychology, 10, 1002. https://doi.org/10.3389/fpsyg.2019.01002

Slattery, T. J., Sturt, P., Christianson, K., Yoshida, M., & Ferreira, F.(2013). Lingering misinterpretations of garden path sen-tences arise from competing syntactic representations.Journal of Memory & Language, 69(2), 104–120. https://doi.org/10.1016/j.jml.2013.04.001

Snedeker, J., & Trueswell, J. C. (2004). The developing con-straints on parsing decisions: The role of lexical-biases andreferential scenes in child and adult sentence processing.Cognitive Psychology, 49(3), 238–299. https://doi.org/10.1016/j.cogpsych.2004.03.001

24 L. BREHM ET AL.

Solomon, E. S., & Pearlmutter, N. J. (2004). Semantic integration andsyntactic planning in languageproduction.Cognitive Psychology,49(1), 1–46. https://doi.org/10.1016/j.cogpsych.2003.10.001

Swets, B., Desmet, T., Clifton, C., & Ferreira, F. (2008).Underspecification of syntactic ambiguities: Evidence fromself-paced reading. Memory & Cognition, 36(1), 201–216.https://doi.org/10.3758/MC.36.1.201

Tagliamonte, S. A., & Baayen, R. H. (2012). Models, forests, andtrees of York English: Was/were variation as a case study forstatistical practice. Language Variation & Change, 24(02),135–178. https://doi.org/10.1017/S0954394512000129

Tanenhaus, M. K., Spivey-Knowlton, M. J., Eberhard, K. M., &Sedivy, J. C. (1995). Integration of visual and linguistic infor-mation in spoken language comprehension. Science, 268(5217), 1632–1634. https://doi.org/10.1126/science.7777863

Tanner, D., Nicol, J., & Brehm, L. (2014). The time-course offeature interference in agreement comprehension:Multiple mechanisms and asymmetrical attraction. Journalof Memory and Language, 76, 195–215. https://doi.org/10.1016/j.jml.2014.07.003

Tortora, C., & den Dikken, M. (2010). Subject agreementvariation: Support for the configurational approach.Lingua. International Review of General Linguistics.Revue internationale De Linguistique Generale, 120(5),1089–1108. https://doi.org/10.1016/j.lingua.2009.04.004

Wagers, M. W., Lau, E. F., & Phillips, C. (2009). Agreement attrac-tion in comprehension: Representations and processes.Journal of Memory & Language, 61(2), 206–237. https://doi.org/10.1016/j.jml.2009.04.002

LANGUAGE, COGNITION AND NEUROSCIENCE 25