
Visual unimodal grouping mediates auditory attentional bias in visuo-spatial working memory

Fabiano Botta ⁎, Juan Lupiáñez, Daniel Sanabria
Department of Experimental Psychology, University of Granada, Spain

⁎ Corresponding author at: Department of Experimental Psychology, University of Granada, Campus de Cartuja s/n, 18011 Granada, Spain. Tel.: +34 958243763. E-mail address: [email protected] (F. Botta).

Article history: Received 12 November 2012; Received in revised form 17 May 2013; Accepted 25 May 2013; Available online xxxx

PsycINFO classification: 2346 Attention

Keywords: Visuo-spatial working memory; Spatial attention; Cross-modal interaction; Grouping

Abstract

Audiovisual links in spatial attention have been reported in many previous studies. However, the effectiveness of auditory spatial cues in biasing the encoding of information into visuo-spatial working memory (VSWM) is still relatively unknown. In this study, we addressed this issue by combining a cuing paradigm with a change detection task in VSWM. Moreover, we manipulated the perceptual organization of the to-be-remembered visual stimuli. We hypothesized that the auditory effect on VSWM would depend on the perceptual association between the auditory cue and the visual probe. Results showed, for the first time, a significant auditory attentional bias in VSWM. However, the effect was observed only when the to-be-remembered visual stimuli were organized in two distinctive visual objects. We propose that these results shed new light on audio-visual crossmodal links in spatial attention, suggesting that, apart from the spatio-temporal contingency, the likelihood of perceptual association between the auditory cue and the visual target can have a large impact on crossmodal attentional biases.

Acta Psychologica 144 (2013) 104–111
http://dx.doi.org/10.1016/j.actpsy.2013.05.010
© 2013 Published by Elsevier B.V.

1. Introduction

Substantial crossmodal links in spatial attention have been described in the last two decades (see Koelewijn, Bronkhorst, & Theeuwes, 2010; Spence & Driver, 2004; Talsma, Senkowski, Soto-Faraco, & Woldorff, 2010, for reviews). For instance, it has been repeatedly shown that the presentation of a peripheral auditory cue can improve the detection and discrimination of a visual target (Driver & Spence, 1998; McDonald, Teder-Saelejaervi, & Hillyard, 2000; Posner, 1980; Spence & Driver, 1997; Ward, McDonald, & Golestani, 1998).

The majority of previous studies assessing crossmodal links in spatial attention have used simple detection or discrimination tasks. However, together with the well-known effect on perceptual processing (Spence & Driver, 2004), spatial attention has been shown to affect the encoding, maintenance, and retrieval of selected information in spatial working memory (Awh & Jonides, 2001; Bundesen, 1990). For instance, Schmidt, Vogel, Woodman, and Luck (2002; cf. Luck & Vogel, 1997), using a paradigm that combined a change detection task in visuo-spatial working memory (VSWM) with a typical cuing paradigm, observed that exogenous visual cues increased the likelihood of visual information presented at the cued (or nearby) locations being transferred into VSWM (see also Botta, Santangelo, Raffone, Lupiáñez, & Olivetti Belardinelli, 2010, for similar results using exogenous and endogenous cues).

In an attempt to extend the finding of crossmodal links in the perceptual domain to the relation between spatial attention and VSWM, Botta et al. (2011) compared the effectiveness of spatially non-predictive unisensory (visual or auditory) and multisensory (audiovisual) cues in capturing participants' spatial attention toward the hemifield at which a group of to-be-remembered visual stimuli was presented (cued trials) or toward the opposite hemifield (uncued trials). In their study, the cue was presented prior to the visual memory array that participants had to maintain in VSWM. After a short interval, one of the elements in the memory array was probed and participants had to decide whether or not that element had changed colour with respect to the previous memory array.

Overall, the critical finding of that study was that multisensory cues elicited a larger attentional effect in VSWM than unimodal visual cues (which also had a significant effect). However, in sharp contrast with the evidence from previous crossmodal attention studies, the authors failed to observe significant auditory (to visual) cueing effects. Botta et al. (2011) proposed that the "anomalous" absence of the auditory effect in their study might have been due to a failed perceptual association between the unique auditory object (the cue) and the multiple visual objects (the memory array). In fact, typical crossmodal audio-visual attentional paradigms imply a one-to-one relation between the cue and the target, while in Botta et al.'s study the presentation of a single auditory cue at a given location was followed by the presentation of 4 or 6 stimuli (4 or 6 coloured squares) distributed across the visual field. The consequence, in the authors' opinion, was that the association between the cue (object) and the single target (object) represented a more obvious and unambiguous process in previous audiovisual attention studies than in their VSWM study. That was because the auditory cue and the stimuli in the cued hemifield in Botta et al.'s paradigm were less likely to be integrated as part of the same object/event.

Note that the above is coherent with the cue-target or event integration theory of spatial cueing effects (which derives from the original object file theory proposed by Kahneman, Treisman, & Gibbs, 1992), according to which spatially and temporally contiguous information is encoded and stored in the same object file (Lupiáñez, 2010; Lupiáñez, Milliken, Solano, Weaver, & Tipper, 2001; Lupiáñez, Ruz, Funes, & Milliken, 2007). Specifically, according to Lupiáñez (2010; see also Lupiáñez, Martín-Arévalo, & Chica, 2013), in a typical exogenous cuing paradigm, the automatic selection of the cue on the basis of its perceptual salience would trigger the creation of an object representation. When an upcoming target appears shortly after at the same location, it would be considered by the perceptual system as an update of the object representation "opened" by the cue. This benefit in the spatial selection of cued targets would lead to the observed benefits in their processing (as indexed by faster RTs and more precise responses), in comparison with uncued targets. Uncued targets would have to be selected in competition with any other stimuli present in the display and with the cue-opened object representation (i.e., a new object representation would have to be opened for uncued targets). In this context, it is important to note that the target would be treated by the system as an update of the cue as long as it could be easily integrated within its object file representation.

Coherently with the above theoretical proposal, it has been widely shown that exogenous attention is preferentially directed to objects (Desimone & Duncan, 1995). Many studies have observed similar object advantages by demonstrating that attention in response to peripheral cues is preferentially spread within the same object (see Egly, Driver, & Rafal, 1994; He & Nakayama, 1995; Lamy & Tsal, 2000), and that the appearance of a target does not create any further spatial code if it is integrated within the object representation opened by the cue (Luo, Lupiáñez, Funes, & Fu, 2010, 2011). Regarding this issue, Woodman, Vecera, and Luck (2003) showed that the attentional bias over VSWM encoding similarly spreads to the perceptual group containing the cued element.

In the present study, we hypothesized that the auditory cueing effect on VSWM would depend on the perceptual object association between the auditory and visual inputs (see Chen, Shi, & Müller, 2010; Kawachi & Gyoba, 2006; Sanabria, Soto-Faraco, Chan, & Spence, 2005; Sanabria, Soto-Faraco, & Spence, 2004). In Botta et al.'s study, given the coarse spatial resolution of the auditory system, the auditory cue could be associated with any of the multiple visual elements presented in the display, with all, or with none of them (see Fig. 1). Therefore, the auditory cue did not produce a differential attentional benefit on the elements at the cued hemifield with respect to those at the uncued hemifield, and consequently no cueing effect was observed. We reasoned that if the perceptual organization of the target display was manipulated so that the elements at the cued hemifield were grouped together and easily segregated from the elements at the uncued hemifield, the processing of the cued visual object would benefit from the prior onset of the auditory cue, thus leading to a measurable and significant cueing effect.

In other words, we speculated that increasing the likelihood of perceptual association between a lateralized sound cue and a lateralized visual perceptual object should also increase the likelihood of crossmodal auditory facilitation of VSWM.

In Experiment 1, we used the common region perceptual grouping principle (see Palmer, 1992) to manipulate the arrangement of the to-be-remembered visual stimuli (see Methods for details) in such a way that they could be perceived either as elements of two distinct objects, or as part of a single object. The rationale was that if the observer perceived two well-defined visual objects, one at each side, then the likelihood of perceptual association with the single auditory cue would be maximized. This would produce significant auditory cuing effects. Instead, if the observer perceived a single object (or a group of independent elements, i.e., in the one-object condition in Experiment 1), the ambiguity of the perceptual association between the auditory input and the visual input would reduce, or even eliminate, the auditory crossmodal cueing effect. Experiments 2 and 3 aimed at replicating the results of Experiment 1, further assessing the effect of the common region and proximity Gestalt grouping principles. Finally, in Experiment 4, we tested whether the significant auditory cuing effect observed in Experiments 1–3 could be interpreted in perceptual terms or, rather, was the mere consequence of participants adopting a particular task set.

Fig. 1. As illustrated in this figure, the dynamics of typical audiovisual crossmodal attention experiments implicitly prompt a one-to-one association between the sound and the target. Contrariwise, this association was not evident in Botta et al.'s task, because the paradigm involved the combination of a single auditory cue followed by a multi-stimuli array, implying a one-to-multiple-locations relation.


2. Experiment 1

In Experiment 1, we tested the perceptual object association hypothesis by manipulating the perceptual organization of the elements in the visual memory array so that they could be perceived as two distinct lateralized objects (two objects condition) or as a unique central perceptual group (one object condition). We used the Gestalt grouping principle of common region originally proposed by Palmer (1992). We expected to observe an auditory attentional bias in the two objects condition but not in the one object condition.

2.1. Methods

2.1.1. Participants
A group of 17 psychology students (two males, mean age 20.3 years, ranging from 18 to 24 years old) from the University of Granada participated in Experiment 1. Three of them were excluded since their accuracy in at least one condition was near chance.¹ All participants in the present and in the following experiments (Experiments 2, 3, and 4) had self-reported normal or corrected-to-normal visual acuity, normal colour discrimination, and no history of neurological disorders. We also assessed (here and in the following experiments) their auditory localization ability by means of pre-test trials in which they had to indicate the location (left or right) of the onset of the auditory stimulus that served as the spatial cue in the experiment proper.² None of the subjects had problems with this task.

¹ We chose a criterion of at least 60% accuracy in the highest memory load condition (SetSize 6).

All participants were naïve as to the purpose of the study and were rewarded with course credits for their participation. Each of the experiments presented here lasted approximately 40 min. All the experiments were conducted in accordance with the ethical guidelines laid down by the Department of Experimental Psychology, University of Granada, and with the ethical standards of the 1964 Declaration of Helsinki.

2.1.2. Stimuli
The stimuli were displayed on a light grey background (x = .312, y = .329, 52.7 cd/m²) on a 17" LCD computer monitor (refresh rate = 60 Hz) located in a dark and quiet room. Participants sat at approximately 70 cm from the monitor. A black fixation point was continuously displayed at the centre of the screen (0.16° × 0.16°). The coloured squares were located within a 4.2° × 3.5° region centred in the middle of each hemifield, at about 7° of visual angle from the central fixation point (centre-to-centre). Memory arrays were composed of 4 or 6 individual coloured squares (each subtending a visual angle of 0.65° × 0.65°), which were randomly located on the screen with the constraint that half of them were presented in the right hemifield and the remaining half in the left hemifield. The colour of each square was randomly selected from a set of six easily discriminable colours: brown (x = .64, y = .329, 4.5 cd/m²), blue (x = .150, y = .060, 7.22 cd/m²), green (x = .30, y = .60, 71.5 cd/m²), red (x = .64, y = .330, 21 cd/m²), yellow (x = .419, y = .505, 92.7 cd/m²), and violet (x = .320, y = .154, 28.4 cd/m²). The auditory cue consisted of an intermittent white noise burst (cf. Butler & Planert, 1976; see also Spence & Driver, 1994). Each cue comprised three successive 20-ms bursts of white noise presented at 75 dB (measured at the participant's head position), each separated by a gap of 20 ms. The auditory cues were presented by means of two loudspeakers, located one at either side of the computer monitor (at an eccentricity of approximately 18°). In the two objects condition, the squares were randomly located inside two black outlined circumferences positioned at either side of the fixation cross (see Fig. 2). Each circumference had a radius of about 4.2° and was centred in the middle of each hemifield, at about 7° of visual angle from the central fixation point (centre-to-centre). In the one object condition, all coloured squares were presented inside a black outlined ellipse centred on the fixation cross (the major and minor axes of the ellipse subtended visual angles of 21° and 14.8°, respectively). Both the circumferences and the ellipse were presented simultaneously with the coloured squares. Participants completed 4 blocks of 52 experimental trials, for a total of 208 trials (24 trials for each combination of Grouping (two objects vs. one object) × Cuing (cued vs. uncued) × SetSize (setsize 4 vs. setsize 6), plus 16 trials without auditory cue²). The Grouping condition was alternated between blocks. The order of presentation of the two grouping conditions was counterbalanced between participants. A short rest was allowed between blocks. Before the start of the experiment, the participants performed 16 practice trials.

² Note that the trials without auditory cue were not analyzed; their only purpose was to avoid sound habituation and thereby to increase the effect of the auditory cue.
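To make the factorial structure concrete, the sketch below (Python; a minimal illustration with hypothetical names, not the software used to run the experiment) enumerates the cued/uncued trials of the 2 × 2 × 2 design and checks that, together with the 16 no-cue trials, they add up to the 208 trials (4 blocks of 52) described above.

```python
from itertools import product

# Factor levels of the Experiment 1 design (names are illustrative only).
GROUPING = ("two_objects", "one_object")
CUING = ("cued", "uncued")
SETSIZE = (4, 6)

REPS_PER_CELL = 24    # trials per Grouping x Cuing x SetSize cell
NO_CUE_TRIALS = 16    # additional trials presented without an auditory cue

# One dict per trial of the factorial part of the design.
trials = [
    {"grouping": g, "cuing": c, "setsize": s}
    for g, c, s in product(GROUPING, CUING, SETSIZE)
    for _ in range(REPS_PER_CELL)
]

total = len(trials) + NO_CUE_TRIALS
assert total == 4 * 52 == 208
print(f"{len(trials)} factorial trials + {NO_CUE_TRIALS} no-cue trials = {total}")
```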

2.2. Results and discussion

A three-way ANOVA with the within participants factors of Grouping (two objects vs. one object), Cuing (cued vs. uncued) and SetSize (setsize 4 vs. setsize 6) was conducted on the accuracy data. Since our main hypothesis was that a significant auditory facilitation effect would be observed only in the two objects condition, we also performed planned comparisons to test the magnitude of the cueing effects elicited by each Grouping condition.

Results indicated a significant interaction between Grouping and Cuing, F(1,13) = 5.2, p = .03, ηp² = 0.28. Planned comparisons showed a significant cuing effect only in the two objects condition of Grouping, p = .001, with greater accuracy on cued (M = 82.9%) than on uncued trials (M = 77.1%). No significant differences between cued and uncued trials were instead observed in the one object condition, p = .75. The analysis also indicated better response accuracy for setsize 4 (85.2%) than for setsize 6 (72.3%), F(1,13) = 146, p < .001, ηp² = 0.91 (see Fig. 3).
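For readers who wish to reproduce this kind of analysis, the sketch below shows one possible implementation in Python (with pandas, SciPy, and statsmodels); the file name and column names are hypothetical, and the original analysis software is not specified in the paper.

```python
import pandas as pd
from scipy import stats
from statsmodels.stats.anova import AnovaRM

# Assumed long-format accuracy table: one row per participant x condition,
# with columns: subject, grouping, cuing, setsize, accuracy (% correct).
acc = pd.read_csv("exp1_accuracy.csv")  # hypothetical file name

# Three-way repeated-measures ANOVA: Grouping x Cuing x SetSize on accuracy.
anova = AnovaRM(acc, depvar="accuracy", subject="subject",
                within=["grouping", "cuing", "setsize"]).fit()
print(anova.anova_table)

# Planned comparisons: cued vs. uncued accuracy within each grouping condition.
for level, sub in acc.groupby("grouping"):
    wide = sub.groupby(["subject", "cuing"])["accuracy"].mean().unstack()
    t, p = stats.ttest_rel(wide["cued"], wide["uncued"])
    effect = (wide["cued"] - wide["uncued"]).mean()
    print(f"{level}: cuing effect = {effect:.1f} points, t = {t:.2f}, p = {p:.3f}")
```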

Crucially, the results of Experiment 1 revealed, for the first time, auditory cueing effects on VSWM. These results suggest that the auditory attentional bias in VSWM information encoding depends on whether the visual display is perceptually segregated into two groups, one on each side, or is rather perceived as a single perceptual group. Data from Experiment 1 clearly indicated auditory facilitation in VSWM only when the elements in the cued visual memory array were grouped into a single object separable from the other visual perceptual group located on the other side. In contrast, when the visual information at the cued side was presumably perceived as part of a central unique object, no auditory cuing effect was observed.

3. Experiment 2

In Experiment 2, we combined the Proximity and Common Region principles. The main aim was, first, to investigate whether cuing effects can also be observed when the elements are grouped by the Proximity principle and, second, to control for potential additive effects between these two grouping principles. Therefore, here we had two grouping conditions: Proximity and Proximity&Common Region. We expected a larger cuing effect in the Proximity&Common Region condition than in the Proximity condition.

3.1. Methods

The experiment was identical to Experiment 1 except for the following. A new group of sixteen students from the University of Granada (6 males, mean age 22.3 years, ranging from 19 to 26 years old) participated. Here we had two grouping conditions: Proximity and Proximity&Common Region. In the Proximity condition, the coloured squares were equidistantly located within an imaginary column both in the right and in the left hemifield. In the Proximity&Common Region condition they were arranged as in the Proximity condition, but they were included in two 2.86° × 6.92° rectangles (one at each side; see Fig. 2). Each participant received 4 blocks of 54 trials, for a total of 24 trials for each combination of Grouping (Proximity vs. Proximity&Common Region) × Setsize (setsize 6 vs. setsize 4) × Cuing (cued vs. uncued), and 16 trials without auditory cue. The experiment was divided in two halves, one for each Grouping condition, counterbalanced between participants, so that they could start either with the Proximity condition (two blocks) or with the Proximity&Common Region condition (two blocks).

3.2. Results and discussion

The 2 × 2 × 2 ANOVA with the within participants factors of Grouping (Proximity vs. Proximity&Common Region), Setsize (setsize 6 vs. setsize 4) and Cuing (cued vs. uncued) showed a significant interaction between Grouping and Cuing, F(1, 15) = 5.1, p = .03, ηp² = 0.25. Planned comparisons showed significantly better performance in cued trials (M = 78.9%) than in uncued trials (M = 73.5%), but exclusively in the Proximity&Common Region condition (p = .002; p = .61 for the Proximity condition; see Fig. 3). The analysis also indicated better response accuracy for setsize 4 (M = 83.8%) than for setsize 6 (M = 71%), F(1, 15) = 71, p < .001, ηp² = 0.82. None of the other terms in the ANOVA reached statistical significance.

Fig. 2. Schematic representation of the experimental procedure used in the present study. At the beginning of each trial a fixation cross was presented for 1000 ms. An auditory spatial cue was then presented for 100 ms from one of the two hemifields. After a 50 ms blank interval from the offset of the spatial auditory cue, the memory array containing the to-be-remembered items was presented for 100 ms. This was followed by a 900 ms blank period and then by the test array, which appeared for 2000 ms.
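For convenience, the trial timeline summarized in the Fig. 2 caption can be written out as a simple event table; the Python sketch below is only an illustration of that sequence (the event names and the timing helper are hypothetical, and the original experiment was not necessarily programmed this way).

```python
# Trial timeline of the cuing / change-detection procedure (durations in ms),
# as described in the Fig. 2 caption; names are illustrative only.
TRIAL_EVENTS = [
    ("fixation_cross", 1000),   # central fixation
    ("auditory_cue", 100),      # lateralized white-noise cue
    ("blank_isi", 50),          # cue-to-memory-array interval
    ("memory_array", 100),      # to-be-remembered coloured squares
    ("retention_blank", 900),   # retention interval
    ("test_array", 2000),       # probe display, response window
]

def print_timeline(events):
    """Print cumulative onsets for each event of a single trial."""
    onset = 0
    for name, duration in events:
        print(f"{onset:5d} ms  {name:15s} ({duration} ms)")
        onset += duration
    print(f"Total trial duration: {onset} ms")

print_timeline(TRIAL_EVENTS)
```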

Fig. 3. Magnitude of the mean cuing effects (i.e., response accuracy in cued trials minus response accuracy in uncued trials) elicited by the different grouping conditions in Experiments 1–4. The error bars represent the standard error of the means. Conditions in which the cuing effect reached significance are highlighted (* p < 0.05).



Overall, the present pattern of results confirmed that of Experiment 1, indicating a significant auditory effect in biasing VSWM encoding when two distinct and separated visual objects were presented. Moreover, the results suggest that the Common Region principle was particularly efficient for visual perceptual grouping, allowing the subsequent audiovisual interaction in spatial attention. This is not surprising, since boundary closure (common region) is widely accepted to be one of the most important features defining what it means to be an object (Marino & Scholl, 2005; Palmer, 1992; Spehar, 2002).

However, one may wonder whether the observed auditory cueing effects in Experiments 1 and 2 were due to the addition of a physical boundary (the circles in Experiment 1 and the rectangles in Experiment 2) in the stimulus array. To address this issue, we conducted Experiment 3, in which we presented the squares of the memory array in such a way that they formed two visual objects without the addition of external physical boundaries.

4. Experiment 3

In Experiment 3, we explored further the nature of the relationship between audiovisual links in attention and visual grouping. We intended to maximize the likelihood that the visual memory array would be perceived as two objects constituted by different parts or features. To this aim, we maximized Proximity and Connectedness (see Fig. 2). The coloured squares were in fact attached to each other so as to create two lateralized objects (two vertical multi-coloured rectangles) rather than two groups of elements (as in Experiments 1–2). Note that in this case, differently from Experiments 1 and 2, no physical boundaries or markers were used to group the visual elements of the memory array.

4.1. Methods

The methods and procedure were the same as in Experiment 2, except for the following. A new group of twenty-two students from the University of Granada (4 males, mean age 24.5 years, ranging from 18 to 45 years old) volunteered to participate. Three of them were excluded since their accuracy in the setsize 6 condition was near chance. Here, the elements were arranged so as to create vertical objects (as illustrated in Fig. 2): the coloured squares were now attached so that they formed two rectangles, each made up of two (setsize 4) or three (setsize 6) elements.

4.2. Results and discussion

We performed a 2 × 2 ANOVA on response accuracy with the within participants factors of Setsize (setsize 6 vs. setsize 4) and Cuing (cued vs. uncued). The analysis revealed a main effect of Cuing, F(1, 18) = 5.5, p = .02, ηp² = 0.23, with better accuracy in cued trials (M = 81.9%) than in uncued trials (M = 78.2%; see Fig. 3). Change-detection accuracy at setsize 4 (M = 83.8%) was higher than at setsize 6 (M = 76.3%), F(1, 18) = 17.8, p < .001, ηp² = 0.49. The interaction between Cuing and SetSize did not reach significance, F < 1.

Overall, this pattern of results lends further support to our initial hypotheses. Auditory cuing effects in VSWM seemed to depend on the likelihood of association between a lateralized auditory cue and a unique lateralized visual object. Moreover, in Experiment 3, the absence of a physical boundary around the elements in the visual memory array excludes the possibility that the effect observed in Experiments 1–2 was merely due to a difference in the physical stimulation between the grouping and no-grouping conditions.

However, it could still be possible that the cuing effect observed in Experiments 1–3 was merely the consequence of a different attentional task set in the present experiments vs. in Botta et al.'s experiments. In effect, it is possible that, here, participants adopted an attentional task set to distribute attention between the two hemifields. This latter possibility would imply that the results of Experiments 1–3 should not be interpreted in purely perceptual terms (as we hypothesized) but rather as an indirect effect of the particular visual stimulus arrangement (e.g., with the stimuli inside circumferences in the two objects condition of Experiment 1) implicitly inducing participants to distribute (endogenously) their attention to the objects presented in the hemifield cued by the auditory stimulus. We tested this alternative hypothesis in Experiment 4.

5. Experiment 4

Experiment 4 was designed to investigate whether the participants' attentional set could have influenced the presence or absence of auditory cueing effects in Experiments 1–3. In one condition, after the presentation of the auditory cue and the memory array, the decision box appeared surrounding all the items in one hemifield (hemifield decision box). In the other condition, the decision box appeared surrounding a single item (single item decision box), as in Experiments 1–3. In the hemifield decision box condition, subjects were asked to discriminate whether any of the items displayed on the side signaled by the decision box had changed its colour or not. In the single item decision box condition, participants were required to report whether the colour of that item was the same as the colour of the square that had appeared at that location in the preceding memory array. If the results of Experiments 1–3 were due to participants spreading attention (endogenously) over all the items within the cued hemifield, we would expect to find an auditory cueing effect only in the hemifield decision box condition. However, if the observed auditory cueing effect was due to visual perceptual grouping affecting audiovisual crossmodal attention, auditory cueing effects should not emerge here, since conditions for visual grouping were absent in both experimental situations at the onset of the visual target display. Therefore, we expected to replicate Botta et al. (2011) with a null auditory cueing effect.

5.1. Methods

The experiment was the same as Experiment 1, except for the following changes. Importantly, the elements in the memory array presented after the onset of the auditory cue appeared without any boundary or conditions for grouping, as in Botta et al. (2011). A new group of 17 volunteers, psychology students from the University of Granada (3 males, mean age 25.4 years, ranging from 20 to 47 years old), participated in Experiment 4. One of them was excluded since his/her accuracy in the setsize 6 condition was near chance. In half of the blocks, the decision box signalled the entire hemifield (hemifield decision box); in the other half, the decision box indicated just one element (single item decision box), exactly as in Experiments 1–3. The order of presentation of the two decision box conditions was counterbalanced between blocks.

5.2. Results and discussion

A 2 × 2 × 2 within participants ANOVA with the factors of Decision Box (hemifield vs. single item), SetSize (4 vs. 6) and Cuing (cued vs. uncued trials) was performed on the accuracy data. The analysis revealed only a significant main effect of SetSize, F(1,15) = 57.7, p < .001, ηp² = 0.79, indicating greater accuracy for setsize 4 (M = 83.3%) than for setsize 6 (M = 72.1%). Coherently with Botta et al. (2011; Experiment 1B), the main effect of Cuing, as well as all interactions involving this factor, were far from statistical significance (all ps > .27),³ confirming the absence of a significant auditory cueing effect (M = 78.9% and M = 77.4% for cued and uncued trials, respectively). It is worth noting that the absence of a significant interaction between Decision Box and Cuing suggested that our previous findings showing no significant auditory cuing effect (Botta et al., 2011) could not be explained in terms of participants' attentional set.

³ To further confirm the lack of significant cuing effects in the absence of conditions for visual perceptual grouping, we performed a meta-analysis considering the accuracy data of the auditory cueing condition (cued vs. uncued) of the present experiment and of Botta et al.'s (2011) experiments (Experiments 1B, 2, 3, and 4). The ANOVA with the within participants factor of Cueing and the between participants factor of Experiment confirmed the absence of a significant auditory cueing effect, F(1, 73) = .05, p = .81.

6. General discussion

Crossmodal auditory attentional effects on visual discrimination have been widely observed in the literature (Spence & Driver, 2004). However, Botta et al. (2011) failed to observe significant auditory spatial attention effects in VSWM. At first sight, one could argue that this unexpected result might have been due to different rules of attentional interaction between auditory cues and visual targets in typical (perceptual) detection or discrimination tasks with respect to VSWM tasks, giving rise to significant facilitation effects in the former case and to null effects in the latter. Here, we investigated an alternative explanation that could account for the lack of auditory cueing effects obtained by Botta et al. (2011). Specifically, we hypothesized that the auditory attentional bias on VSWM would depend on the likelihood of perceptual association between two events: the auditory cue and the visual probe.

On this basis, we manipulated the arrangement of the visual elements of the memory array to increase the likelihood of perceptual association between the auditory cue and the visual input, which is, in our opinion, a plausible basis of exogenous attentional cuing effects. To this aim, we applied Gestalt grouping principles to induce participants to perceive the visual elements of the memory array as organized in two lateralized perceptual objects rather than as a disorganized "cloud" of unrelated elements. We hypothesized that this perceptual organization into two lateralized objects would increase the probability of association of the visual input with the lateralized auditory cue, thus eliciting significant auditory attention biasing effects in VSWM.

Coherently with our hypothesis, in Experiment 1 we observed significant cuing effects only when the visual elements of the memory array were grouped into two separated and lateralized objects. No attentional effect was observed when the visual elements were grouped into a single object. Overall, Experiment 2 confirmed the results of Experiment 1. Moreover, it suggested that, differently from Proximity (which proved to be a rather weak grouping principle here), the Common Region perceptual grouping principle was particularly efficient in creating a robust audio-visual association between the auditory cue and the visual target. Experiment 3 corroborated the basic idea of this study: auditory attentional bias in VSWM depends on the spatio-temporal contingency between auditory and visual information, provided that one object or perceptual group is unambiguously associated with the lateralized auditory cue. Finally, Experiment 4 further supported our main hypothesis, excluding an alternative interpretation of our results in terms of task setting, and replicated the absence of cueing effects when the memory array was not perceptually segregated into two lateralized objects, as in Botta et al. (2011).

Far from being a mere demonstration that it is possible to observe auditory attentional effects also in VSWM tasks, we believe that the present data have important implications for the nature of crossmodal attention and crossmodal interactions in general. We postulate that exogenous crossmodal attentional effects are basically due to an association/integration between two events: the auditory cue and the visual target (Lupiáñez, 2010; see also Kahneman et al., 1992, and Hommel, 2004). Specifically, we hypothesize that exogenous crossmodal attentional effects are due to the integration of the peripheral cue and the cued object (the target in perceptual studies) within the same object file, as a direct consequence of the spatio-temporal contingency between the two events. In other words, as outlined in the Introduction, we consider that the salience of the lateralized sound captures attention, triggering the opening of a novel object representation. To the extent that a subsequently presented stimulus (either of the same or of a different modality) is perceived as an update of the already selected object, its processing would benefit from attentional selection. Whether the newly presented stimulus is considered by the perceptual system as an update of the already selected object representation or as novel information would depend on the perceived spatial, temporal, and perceptual overlap between the stimulus triggering attentional capture (i.e., triggering the opening of the object file) and the target stimulus. Attentional effects of unpredictive auditory cues over visual targets might only be present when the target information is perceptually matched with the visual object or perceptual group. Therefore, the present results suggest that spatio-temporal overlap is necessary but not sufficient for crossmodal links in attention to occur.⁴ Actually, the crucial condition for crossmodal attentional facilitation seems to be a one-to-one association between two spatio-temporally contingent objects or events: an auditory and a visual input.

⁴ Note indeed that in Experiment 4, as well as in all the auditory cuing experiments in Botta et al. (2011), the cue and the probe had the same degree of spatio-temporal contingency as in the other experiments of the present study.

A certain similarity could be drawn between our proposal and the unity assumption, i.e., the decision that the observer's perceptual system makes when establishing whether two unimodal events are separate or components of a single multimodal event (see Cook & Van Valkenburg, 2009; Laurienti, Kraft, Maldjian, Burdette, & Wallace, 2004; Vatakis & Spence, 2007; Vroomen & de Gelder, 2000; see also Bedford, 2004, for a discussion on this issue). However, it should be noted that our proposal concerns the dynamics of exogenous cuing effects, which involve a temporal delay between the two unimodal events (the cue and the probe), while the unity assumption is related to multisensory integration processes, for which the relevance of temporal overlap between the two sensory inputs has been stressed by many researchers (see Calvert, Spence, & Stein, 2004; Koelewijn et al., 2010; Lippert, Logothetis, & Kayser, 2007). In any case, there is considerable debate in the literature regarding whether multisensory integration and crossmodal attention involve different processes subserved by different neural systems (e.g., McDonald, Teder-Salejarvi, & Ward, 2001) or whether they are different outcomes of a similar mechanism and neural substrate (e.g., Macaluso & Driver, 2001; Spence & Driver, 2004). If this second hypothesis were proved to be true, then it could be concluded that exogenous cuing effects in general, or at least crossmodal attention, can be perfectly explained in terms of Gestalt perceptual grouping laws (see Van Leeuwen et al., 2011). However, as far as we know, this is not the way attentional effects have typically been interpreted.

In fact, according to influential models of selective spatial attention (Bundesen, 1990; Posner, 1980), the cue presentation automatically leads to an increase of the attentional weight of all incoming information within the cued region, thus resulting in an improvement in both response latency and accuracy. In line with this theoretical view, in the present study significant auditory effects should have been observed independently of the perceptual organization of the visual information. It is worth noting that, while in most typical audio-visual attention experiments there is no competition between visual objects for attentional selection (i.e., the target is the only visual stimulus), such competition is rather evident in multi-stimuli arrays like that in Botta et al. (2011). Note that with visual cues, even in these multi-stimuli arrays, due to the higher spatial resolution of the visual system, it is the element appearing at the same position as the cue that is selected (Botta et al., 2010). With auditory cues, however, due to the coarser auditory spatial localization, the auditory cue is presumably not able to elicit the spatial selection of, and to increase the attentional weight of, one particular visual element, as indicated by the absence of a significant auditory attentional bias.

Crucially, in the grouping conditions of the present experiments the very same visual stimuli, plausibly perceived as stand-alone objects in the no-grouping condition, constituted features of two unambiguously lateralized objects. Inasmuch as the auditory system is rather precise in right/left spatial discrimination, this visual grouping decreased the ambiguity of the perceptual association between the auditory cue and the visual target, thus allowing attentional selection by the auditory cue. It logically follows that, in the present study, visual perceptual organization seemed to affect auditory attentional effects in visual perception (see Spence, Sanabria, & Soto-Faraco, 2007, for a review on this topic) and, as a consequence, in VSWM. Note that this is in line with previous accounts showing how unisensory perceptual grouping affects audiovisual perceptual interactions (e.g., Cook & Van Valkenburg, 2009; Sanabria et al., 2004, 2005; Vroomen & de Gelder, 2000). Provocatively, it could even be questioned whether it is necessary to invoke attention at all, and whether it would not be better to interpret these phenomena just in terms of the Gestalt organization of our perceptual experience (see Van Leeuwen et al., 2011).

Even without accepting such a hypothesis, the present pattern of data might be better explained in terms of object-based attention theories (Duncan, 1984; Neisser & Becklen, 1975; see also Scholl, 2001, for a review) than in terms of space-based attention (Downing & Pinker, 1985; Posner, Snyder, & Davidson, 1980; see also Cave & Bichot, 1999, for a review). Actually, a further interpretation of the present pattern of data is that auditory attention was somehow spread to the features (the elements) of the cued visual object. The absence of significant auditory cuing effects when the visual elements were grouped into a single whole element (Experiment 1) is not coherent with an interpretation of exogenous auditory attentional orienting exclusively in terms of canonical space-based attention theories. If that were the case, we should have observed significant auditory attentional effects regardless of the perceptual organization of the visual information. Furthermore, it is important to highlight that our grouping manipulation took place at the time the target was presented. Consequently, the attentional effect we observed was determined, and not simply measured, at the time the target was presented. This goes against the idea that exogenous attentional processes take place at the moment the cue is presented, with the role of the target being simply to probe the state or the effect of attention. Therefore, exogenous attention effects should better be considered as an interaction between cues and targets (Folk, Remington, & Johnston, 1992; Lupiáñez et al., 2007).

In sum, the present results showed that auditory spatial attention effects on information encoding into VSWM depend on the likelihood of perceiving the visual elements of the memory array presented at the cued side as forming a unique visual object, independent of the one presented in the other hemifield. On this basis, as a take-home message, we suggest that crossmodal interactions do not depend only on the physical spatio-temporal overlap between sensory inputs, but rather on the likelihood of perceiving them as parts of a single object or event.

Acknowledgements

This research was supported by Spanish grants SEJ2010-6414 from the Junta de Andalucía and PSI2010-19655 from the Plan Nacional I + D + i (Ministerio de Innovación y Ciencia) to D.S., the CSD2008-00048 CONSOLIDER INGENIO (Dirección General de Investigación) to J.L. and D.S., and research projects PSI2011-22416 and eraNET-NEURON BEYONDVIS, EUI2009-04082, to J.L. and F.B.

References

Awh, E., & Jonides, J. (2001). Overlapping mechanisms of attention and spatial working memory. Trends in Cognitive Sciences, 5, 119–126.
Bedford, F. L. (2004). Analysis of a constraint on perception, cognition, and development: One object, one place, one time. Journal of Experimental Psychology: Human Perception and Performance, 30, 907–912.
Botta, F., Santangelo, V., Raffone, A., Lupiáñez, J., & Olivetti Belardinelli, M. (2010). Exogenous and endogenous spatial attention effects on visuo-spatial working memory. Quarterly Journal of Experimental Psychology, 27, 1–13.
Botta, F., Santangelo, V., Raffone, A., Sanabria, D., Lupiáñez, J., & Belardinelli, M. O. (2011). Multisensory integration affects visuo-spatial working memory. Journal of Experimental Psychology: Human Perception and Performance, 37, 1099–1109.
Bundesen, C. (1990). A theory of visual attention. Psychological Review, 97, 523–547.
Butler, R. A., & Planert, N. (1976). The influence of stimulus bandwidth on localization of sound in space. Perception & Psychophysics, 19, 103–108.
Calvert, G. A., Spence, C., & Stein, B. E. (Eds.). (2004). The handbook of multisensory processes. Cambridge, MA: MIT Press.
Cave, K., & Bichot, N. (1999). Visuospatial attention: Beyond a spotlight model. Psychonomic Bulletin & Review, 6, 204–223.
Chen, L., Shi, Z., & Müller, H. J. (2010). Influences of intra- and crossmodal grouping on visual and tactile Ternus apparent motion. Brain Research, 1354, 152–162.
Cook, L. A., & Van Valkenburg, D. L. (2009). Audio-visual organisation and the temporal ventriloquism effect between grouped sequences: Evidence that unimodal grouping precedes cross-modal integration. Perception, 38, 1220–1233.
Desimone, R., & Duncan, J. (1995). Neural mechanisms of selective visual attention. Annual Review of Neuroscience, 18, 193–222.
Downing, C., & Pinker, S. (1985). The spatial structure of visual attention. In M. Posner, & O. S. M. Marin (Eds.), Attention and performance, Vol. XI (pp. 171–187). London: Erlbaum.
Driver, J., & Spence, C. (1998). Crossmodal attention. Current Opinion in Neurobiology, 8, 245–253.
Duncan, J. (1984). Selective attention and the organization of visual information. Journal of Experimental Psychology: General, 113, 501–517.
Egly, R., Driver, J., & Rafal, R. D. (1994). Shifting visual attention between objects and locations: Evidence from normal and parietal lesion subjects. Journal of Experimental Psychology: General, 123, 161–177.
Folk, C. L., Remington, R. W., & Johnston, J. C. (1992). Involuntary covert orienting is contingent on attentional control settings. Journal of Experimental Psychology: Human Perception and Performance, 18, 1030–1044.
He, Z. J., & Nakayama, K. (1995). Visual attention to surfaces in three-dimensional space. Proceedings of the National Academy of Sciences of the United States of America, 92, 11155–11159.
Hommel, B. (2004). Event files: Feature binding in and across perception and action. Trends in Cognitive Sciences, 8, 494–500.
Kahneman, D., Treisman, A., & Gibbs, B. J. (1992). The reviewing of object files: Object-specific integration of information. Cognitive Psychology, 24, 175–219.
Kawachi, Y., & Gyoba, J. (2006). Presentation of a visual nearby moving object alters stream/bounce event perception. Perception, 35, 1289–1294.
Koelewijn, T., Bronkhorst, A., & Theeuwes, J. (2010). Attention and the multiple stages of multisensory integration: A review of audiovisual studies. Acta Psychologica, 134, 372–384.
Lamy, D., & Tsal, Y. (2000). Object features, object locations, and object files: Which does selective attention activate and when? Journal of Experimental Psychology: Human Perception and Performance, 26, 1387–1400.
Laurienti, P. J., Kraft, R. A., Maldjian, J. A., Burdette, J. H., & Wallace, M. T. (2004). Semantic congruence is a critical factor in multisensory behavioral performance. Experimental Brain Research, 158, 405–414.
Lippert, M., Logothetis, N. K., & Kayser, C. (2007). Improvement of visual contrast detection by a simultaneous sound. Brain Research, 1173, 102–109.
Luck, S. J., & Vogel, E. K. (1997). The capacity of visual working memory for features and conjunctions. Nature, 390, 279–281.
Luo, C., Lupiáñez, J., Funes, M. J., & Fu, X. (2010). Modulation of spatial Stroop by object-based attention but not by space-based attention. Quarterly Journal of Experimental Psychology, 63, 516–530.
Luo, C., Lupiáñez, J., Funes, M. J., & Fu, X. (2011). The modulation of spatial congruency by object-based attention: Analysing the "locus" of the modulation. Quarterly Journal of Experimental Psychology, 64, 2455–2469.
Lupiáñez, J. (2010). Inhibition of return. In A. C. Nobre, & J. T. Coull (Eds.), Attention and time (pp. 17–34). Oxford, UK: Oxford University Press.
Lupiáñez, J., Martín-Arévalo, E., & Chica, A. B. (2013). Is Inhibition of Return due to attentional disengagement or to a detection cost? The Detection Cost Theory of IOR. Psicológica, 34, 221–252.
Lupiáñez, J., Milliken, B., Solano, C., Weaver, B., & Tipper, S. P. (2001). On the strategic modulation of the time course of facilitation and inhibition of return. Quarterly Journal of Experimental Psychology A: Human Experimental Psychology, 54, 753–773.
Lupiáñez, J., Ruz, M., Funes, M. J., & Milliken, B. (2007). The manifestation of attentional capture: Facilitation or IOR depending on task demands. Psychological Research, 71, 77–91.
Macaluso, E., & Driver, J. (2001). Spatial attention and crossmodal interactions between vision and touch. Neuropsychologia, 39, 1304–1316.
Marino, A. C., & Scholl, B. J. (2005). The role of closure in defining the "objects" of object-based attention. Perception & Psychophysics, 67, 1140–1149.
McDonald, J. J., Teder-Saelejaervi, W. A., & Hillyard, S. A. (2000). Involuntary orienting to sound improves visual perception. Nature, 407, 906–908.
McDonald, J. J., Teder-Salejarvi, W. A., & Ward, L. M. (2001). Multisensory integration and crossmodal attention effects in the human brain. Science, 292, 791.
Neisser, U., & Becklen, R. (1975). Selective looking: Attending to visually specified events. Cognitive Psychology, 7, 480–494.
Palmer, S. E. (1992). Common region: A new principle of perceptual grouping. Cognitive Psychology, 24, 436–447.
Posner, M. I. (1980). Orienting of attention. Quarterly Journal of Experimental Psychology, 32, 3–25.
Posner, M. I., Snyder, C. R. R., & Davidson, B. J. (1980). Attention and the detection of signals. Journal of Experimental Psychology: General, 109, 160–174.
Sanabria, D., Soto-Faraco, S., Chan, J., & Spence, C. (2005). Intramodal perceptual grouping modulates multisensory integration: Evidence from the crossmodal dynamic capture task. Neuroscience Letters, 377, 59–64.
Sanabria, D., Soto-Faraco, S., & Spence, C. (2004). Exploring the role of visual perceptual grouping on the audiovisual integration of motion. Neuroreport, 15, 2745–2749.
Schmidt, B. K., Vogel, E. K., Woodman, G. F., & Luck, S. J. (2002). Voluntary and automatic attentional control of visual working memory. Perception & Psychophysics, 64, 754–763.
Scholl, B. J. (2001). Objects and attention: The state of the art. Cognition, 80, 1–46.
Spehar, B. (2002). The role of contrast polarity in perceptual closure. Vision Research, 42, 343–350.
Spence, C., & Driver, J. (1994). Covert spatial orienting in audition: Exogenous and endogenous mechanisms. Journal of Experimental Psychology: Human Perception and Performance, 20, 555–574.
Spence, C., & Driver, J. (1997). Audiovisual links in exogenous covert spatial orienting. Perception & Psychophysics, 59, 1–22.
Spence, C., & Driver, J. (Eds.). (2004). Crossmodal space and crossmodal attention. Oxford: Oxford University Press.
Spence, C., Sanabria, D., & Soto-Faraco, S. (2007). Intersensory Gestalten and crossmodal scene perception. In K. Noguchi (Ed.), The psychology of beauty and Kansei: New horizons of Gestalt perception (pp. 519–579). Tokyo: Fuzanbo International.
Talsma, D., Senkowski, D., Soto-Faraco, S., & Woldorff, M. G. (2010). The multifaceted interplay between attention and multisensory integration. Trends in Cognitive Sciences, 14, 400–410.
Van Leeuwen, C., Alexander, D., Nakatani, C., Nikolaev, A. R., Plomp, G., & Raffone, A. (2011). Gestalt has no notion of attention. But does it need one? Humana.Mente Journal of Philosophical Studies, 17, 35–68.
Vatakis, A., & Spence, C. (2007). Crossmodal binding: Evaluating the "unity assumption" using audiovisual speech stimuli. Perception & Psychophysics, 69, 744–756.
Vroomen, J., & de Gelder, B. (2000). Sound enhances visual perception: Cross-modal effects of auditory organization on vision. Journal of Experimental Psychology: Human Perception and Performance, 26, 1583–1590.
Ward, L. M., McDonald, J. J., & Golestani, N. (1998). Cross-modal control of attention shifts. In R. D. Wright (Ed.), Visual attention (pp. 232–268). New York, NY: Oxford University Press.
Woodman, G. F., Vecera, S. P., & Luck, S. J. (2003). Perceptual organization influences visual working memory. Psychonomic Bulletin & Review, 10, 80–87.