www.elsevier.com/locate/cogbrainres
Cognitive Brain Research
Research report
Distinct fMRI responses to laughter, speech, and sounds along the
human peri-sylvian cortex
Martin Meyer a,b,*, Stefan Zysset b, D. Yves von Cramon b, Kai Alter b,c
a Department of Neuropsychology, University of Zurich, Treichlerstrasse 10, CH-8032 Zurich, Switzerland
b Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
c School of Neurology, Neurobiology, and Psychiatry, Newcastle Auditory Group, University of Newcastle, UK
Accepted 7 February 2005
Available online 29 March 2005
Abstract
In this event-related fMRI study, 12 right-handed volunteers heard human laughter, sentential speech, and nonvocal sounds in which
global temporal and harmonic information were varied whilst they performed a simple auditory target detection task. This study aimed to
delineate distinct peri-auditory regions which preferentially respond to laughter, speech, and nonvocal sounds. Results show that all three
types of stimuli evoked blood–oxygen-level-dependent responses along the left and right peri-sylvian cortex. However, we observed
differences in regional strength and lateralization in that (i) hearing human laughter preferentially involves auditory and somatosensory fields
primarily in the right hemisphere, (ii) hearing spoken sentences activates left anterior and posterior lateral temporal regions, (iii) hearing
nonvocal sounds recruits bilateral areas in the medial portion of Heschl’s gyrus and at the medial wall of the posterior Sylvian Fissure
(planum parietale and parietal operculum). Generally, the data imply a differential regional sensitivity of peri-sylvian areas to different
auditory stimuli with the left hemisphere responding more strongly to speech and with the right hemisphere being more amenable to
nonspeech stimuli. Interestingly, passive perception of human laughter activates brain regions which control motor (larynx) functions. This
observation may speak to the issue of a dense intertwining of expressive and receptive mechanisms in the auditory domain. Furthermore, the
present study provides evidence for a functional role of inferior parietal areas in auditory processing. Finally, a post hoc conjunction analysis
meant to reveal the neural substrates of human vocal timbre demonstrates a particular preference of left and right lateral parts of the superior
temporal lobes for stimuli which are made up of human voices relative to nonvocal sounds.
© 2005 Elsevier B.V. All rights reserved.
Theme: Neural basis of behavior
Topic: Cognition
Keywords: Speech; Laughter; Vocal timbre; Functional Magnetic Resonance Imaging (fMRI); Peri-sylvian region; Auditory cortex
1. Introduction
Converging evidence obtained from a plethora of clinical
and neuroimaging observations points to a residence of
speech and auditory bottom-up functions in the superior
temporal lobes of both cerebral hemispheres [5–7,11,12,23,
39,41,45,46,50,52–55,60,68,73,85,86,101,107,112].

0926-6410/$ - see front matter © 2005 Elsevier B.V. All rights reserved.
doi:10.1016/j.cogbrainres.2005.02.008
* Corresponding author. Department of Neuropsychology, University of Zurich, Treichlerstrasse 10, CH-8032 Zurich, Switzerland. Fax: +41 44 634 4342. E-mail address: [email protected] (M. Meyer).
¹ Notably, Heschl's gyrus in its entirety does not exclusively accommodate primary auditory cortex [77]. Cytoarchitectonic and morphometric studies rather demonstrate that the primary auditory cortex (PAC) usually covers the most medially situated two thirds of Heschl's gyrus [67]. However, human auditory areas are not confined to the PAC but are presumed to extend across the surface of the supratemporal plane (STP), including the planum polare (PPol) and the planum temporale (PT), even into the insula (INS) and the parietal operculum (PaOp) [78].

Consistently, these studies demonstrate a sequence of sensory and cognitive processes mediated by brain regions stretching from the primary auditory cortex (PAC)¹ into different
anterior and posterior parts of the peri-sylvian cortex and
superior temporal sulcus (STS), either on the left or right
hemisphere, depending on the acoustic or linguistic com-
plexity of stimuli. Numerous recent neuroimaging studies
compared speech and nonspeech stimuli (e.g., white noise
bursts, sine wave tones, frequency-modulated tones, ampli-
tude-modulated tones) to distinguish between brain regions
that are more sensitive to speech and nonspeech stimuli
[11,73,87]. However, by comparing spoken language with
nonspeech auditory cues, one has to consider a multitude of
aspects in which speech and nonvocal sounds may differ, for
example, vocal timbre. Therefore, the present study uses
human laughter as an experimental condition. In terms of
phonetics, human laughter is akin to speech as it also
consists of syllable-like units but it provides less F0-
variation, monotonous articulatory movements, and a differ-
ent rhythmical structure. Perception of both speech and
laughter therefore requires rapid temporal processing but the
latter to a lesser degree. This makes laughter an interesting
acoustic stimulus to examine, since the perception of
laughter may activate the same brain regions as speech to
a certain extent but can also be considered a holistic
percept which may convey relevant paralinguistic and
affective information in the context of communication. In
this context, the brain’s right hemisphere enters the
limelight, as recent neuroimaging research has brought
up some nonspecific evidence which associates the right
hemisphere with paralinguistic processes during speech
perception [53,62]. The specific role the right hemisphere
may play during processing vocal sounds has been
vaguely indicated [6,7,56]. In particular, the studies
conducted by Belin et al. uncovered an enhanced
sensitivity to paralinguistic vocal information of the right
superior temporal sulcus and gyrus while the contralateral
structures appear to be more amenable to linguistic
information during spoken language comprehension
[7,26]. These findings are not in agreement with recent
studies which propose a functional segregation of the
temporal lobes, with activity extending from posterior to anterior sites as a function of increasing acoustic complexity. More specifically, these studies place posteriomedial portions of the superior temporal lobe as preferentially supporting the analysis of the complex acoustic
ferentially supporting the analysis of the complex acoustic
structure of sounds. Secondly, they identify anteriolateral
regions such as lateral STG and planum polare (PPol) to
be preferentially activated by linguistically relevant
information available in speech sounds [10,45,82,
84,85,87,88]. Here, we present data obtained from an
event-related fMRI study which sought to reconcile these
competing claims. Akin to recent studies [7,87], we
examined brain responses to connected speech and non-
vocal sounds to distinguish linguistic and auditory
processing. However, this study is novel in that we
contrasted speech and nonvocal sounds to human laughter
in order to learn more about the functional segregation of
the human cortical auditory system.
1.1. What makes laughter so special?
Little is known about brain mechanisms of laughter.
Generally, laughter can be considered a vocal expressive
communicative signal that existed before man developed speech [79]. A recent review article suggested that expressive laughter can be associated with an 'emotionally driven' system involving the amygdala, thalamic/hypothalamic, and subthalamic areas and with a 'voluntary' system originating in the premotor/frontal opercular area and leading through the motor cortex [105]. Clinical studies have observed patients suffering from neurological disorders which caused 'pathological laughter', that is, inappropriate, mood-independent, and uncontrollable laughter. However, it seems that pathological laughter originates from numerous pathological factors during brain disease rather than from circumscribed lesions to particular brain sites [72]. Apparently, expressive laughter may be a
phenomenon so complex that it cannot be assigned to
any single brain area but depends on the temporal
coordination of several areas constituting a functional
network. This view receives partial support from recent brain
imaging data which report a network incorporating left
insula, bilateral amygdala and bilateral auditory cortices
subserving the perception of human laughter [80]. Notably,
activity in the right auditory cortices was stronger than in
contralateral areas of the left hemisphere. According to the
authors, this rightward asymmetry of supratemporal activ-
ity may be explained in terms of a right auditory cortex
specialization for processing changes of pitch, timbre, and
tonal patterns. While these authors used silence and human
crying as baseline conditions in their study, we contrast
human laughing with speech and nonvocal sounds to
reveal a potential functional sensitivity of peri-sylvian
regions.
In terms of phonetics, laughter and speech have
essential acoustic underpinnings in common. First, speech
and laughter share the same source of vocalization [75].
Like speech, laughter can also be described as a stream of
acoustic cues which unfold in time. Typically, laugh
bursts form a phrase-like sequence of repetitive laugh
pulses that are initiated by an ingression (see Methods
section). Laughter involves the respiratory system and the
vocal apparatus and is produced by changes in the vocal and
sub-glottal, respiratory tracts as well as in the musculoske-
letal system [19] that result in staccato-like syllabic vocal-
izations. Like speech, a sequence of laughter syllables can
be combined into utterances. These utterances start with a
vocalized inhalation and consist of a series of short laugh
pulses with almost constant time intervals [66,76]. The
reiteration of syllables with their specific segmental
characteristics enables listeners to identify the vocalization as laughter. In
contrast to speech, laughter contains less F0-variance.
Laughter has a rich harmonic structure [74] and spectral
qualities like formant frequencies can be evaluated [3].
Interestingly, both speech and laughter arise from laryngeal
modulations, modified to some degree by supralaryngeal
activity, but in the case of laughter only to a small extent
by articulation. Laughter also contains effects of one
type of co-articulation at syllable-level between the
aspiration onset and the following vowel (/h-a/), but it
is more invariant than speech as vowel quality does not
alter within laughing sequences. Differences between the
two types of vocalization can be observed with regard to
the organization of syllables contained in speech and
laughing utterances. The isochronous reiteration of
syllables in laughter is untypical for normal speech. To
summarize, hearing speech requires the processing of
rapid changes in frequency which reflect co-articulatory
information between single segments, syllables, and word
boundaries, while perceiving laughter involves the
processing of rhythmical aspects and stereotyped changes
in the fundamental frequency as melodic variation is
more prominent in each laugh burst than in each speech
syllable [79]. Notably, slow processing of prosodic and
melodic modulations has been tied to right temporal lobe
functions [56,63,64]. Here, we hypothesize that hearing
human laughter is likely to preferentially drive right
auditory fields as hearing laughter has formerly been
observed to activate the right auditory cortex more
strongly than contralateral areas [80]. Furthermore, when
volunteers hear laughter, we may expect to find activity
in the right anterior STS area as this region has been
associated with nonspeech vocal sounds [6,7,26,99].
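The phonetic contrast drawn here (lower F0 variance and isochronous syllable timing in laughter) lends itself to a quick quantitative check. The following sketch is ours, not part of the original study; it estimates F0 statistics with the praat-parselmouth Python bindings (the same Praat algorithms the authors used for their acoustic analyses), and the file names are hypothetical.

```python
# Minimal sketch (ours, not the authors' analysis) quantifying the F0
# contrast described above; assumes praat-parselmouth is installed and
# "speech.wav" / "laughter.wav" are hypothetical stimulus files.
import parselmouth  # Python bindings to the Praat algorithms

def f0_stats(path):
    """Mean and standard deviation of F0 over voiced frames."""
    pitch = parselmouth.Sound(path).to_pitch()
    f0 = pitch.selected_array['frequency']   # Hz; 0.0 marks unvoiced frames
    voiced = f0[f0 > 0]
    return voiced.mean(), voiced.std()

for label, path in [("speech", "speech.wav"), ("laughter", "laughter.wav")]:
    mean_f0, sd_f0 = f0_stats(path)
    # The claim above predicts a smaller F0 standard deviation for laughter.
    print(f"{label}: mean F0 = {mean_f0:.1f} Hz, SD = {sd_f0:.1f} Hz")
```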
In summary, this study aimed to explore the hemody-
namic responses within the peri-sylvian cortex to three
different classes of auditory stimuli (human laughter,
speech, and nonvocal sounds) while participants per-
formed an auditory target detection task. The event-related design applied in the present study allowed us to investigate transient brain responses to single experimental stimuli without any confound from differential attentional levels, as we used an auditory target detection task which required the participants to respond only to specific target trials, which did not enter the data analysis. In other words, volunteers were not explicitly
required to indicate and classify speech, laughter, and
nonvocal sounds, but to simply listen to the inflowing
stimuli, which was meant to factor out potential task
effects.
In agreement with the evidence provided so far, we
hypothesize that hearing sentences activates anteriolateral
regions of the peri-sylvian cortex with the left hemi-
sphere playing a prominent role as the left STG and the
left IFG have been attributed to auditory sentence
processing [21,22,31,61,63–65]. By regarding laughter
and nonvocal sounds, we expect to find stronger
bilateral or even rightward responses in the human
auditory cortex since several recent studies pointed to a
specialization of a right auditory cortical swathe for
aspects of auditory processing outside the speech
domain [42,89,104,110,114].
2. Materials and methods
2.1. Participants
12 neurologically healthy right-handed volunteers par-
ticipated in this study (mean age 24 years; range 20–32 years; 6 females, 6 males). We obtained written consent from all participants. All volunteers were native speakers of German and reported no hearing problems.
2.2. Stimuli and design
This study comprises three conditions and one target condition²:

• Speech: short sentences containing a subject noun phrase and an intransitive verb, such as 'Peter sleeps';
• Sounds: nonvocal sounds;
• Laughter: 10 female and 10 male laughing phrases;
• Target condition: reiterated sequence of non-sense syllables ('lalala').

² Example sound files are available at http://www.psychologie.unizh.ch/neuropsy/home_mmeyer/BRES-D-04-06566.
2.3. Acoustic analyses
One representative sound file of each condition was
subjected to acoustic analyses to demonstrate differential
temporal and harmonic characteristics. The Praat software³ was used to perform the acoustic analyses of the auditory events.

³ http://www.praat.org.
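As an illustration of what such an analysis involves, the sketch below reproduces the two displays used here, a time wave form (cf. Fig. 1) and a wide band spectrogram with a 5-ms analysis window over 0–3 kHz (cf. Fig. 2), using scipy rather than Praat; the file name is hypothetical and a mono 16-bit/16-kHz file is assumed.

```python
# A minimal sketch, not the authors' script: wave form and wide band
# spectrogram of one stimulus, assuming a mono 16-kHz WAV named "stimulus.wav".
import numpy as np
import matplotlib.pyplot as plt
from scipy.io import wavfile
from scipy.signal import spectrogram

rate, signal = wavfile.read("stimulus.wav")        # 16 bit / 16 kHz material
signal = signal / np.max(np.abs(signal))           # amplitude normalization
t = np.arange(len(signal)) / rate

# Wide band spectrogram: 5-ms analysis window, as in Fig. 2.
nperseg = int(0.005 * rate)
f, ts, Sxx = spectrogram(signal, fs=rate, nperseg=nperseg, noverlap=nperseg // 2)

fig, (ax1, ax2) = plt.subplots(2, 1, sharex=True)
ax1.plot(t, signal)                                # time wave form (cf. Fig. 1)
ax1.set_ylabel("amplitude")
keep = f <= 3000                                   # display 0-3 kHz as in Fig. 2
ax2.pcolormesh(ts, f[keep], 10 * np.log10(Sxx[keep] + 1e-12), shading="auto")
ax2.set_xlabel("time (s)"); ax2.set_ylabel("frequency (Hz)")
plt.show()
```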
The wave form in Fig. 1A demonstrates that speech
incorporates a continuous stream of syllables. Like speech,
laughter also shows a syllable-like reiterated sequence of
laugh bursts with an initial aspiration followed by a vowel
/a/ and decreasing amplitude envelope as a function of
time (Fig. 1B). Human vocal sounds such as speech and
laughter contain more variation in the time signal due to
typical speech characteristics such as co-articulation
between single segments, syllables, and word boundaries
and a decreasing amplitude envelope. The sound signal
(Fig. 1C), however, shows less variance in the time and
amplitude structures. On the other hand, both sounds and
laughter show an equivalent distance between successive bursts, representing a temporal structure which differs from that
for spoken languages.
Fig. 2 shows the wide band spectrograms (0–3 kHz) of
speech (Fig. 2A), laughter (Fig. 2B), and sounds (Fig. 2C).
The figure illustrates the resonance characteristics of the
human vocal tract, that is, frequency bands with higher
amplitude in periodic parts of the signals for speech and
laughter. These frequency bands – the formants – are typical
for human vocal production. These formant transitions
reflect articulatory movements in the vocal tract which are
more variable in speech than in laughter. In other words,
Fig. 1. Wave forms of acoustic stimuli. This figure shows time wave forms for (A) one speech event ('Lotte pöbelt.'), (B) one laughing cycle, and (C) one sound event. Distances between auditory events reveal the rhythmical structure, and the normalized amplitude shows the balanced intensity of the auditory stimuli. See text for details.

Fig. 2. Wide band spectrograms of acoustic stimuli. This figure shows the wide band spectrogram (analysis width = 0.005 s) generated from one speech event (A), one laughing cycle (B), and one sound event (C). Frequency changes in both the formant and harmonic structure are typical for vocal events (speech and laughter) and reflect articulatory dynamics in the vocal tract. See text for details.
rapid changes in frequency reflect effects of co-articulation
typical for speech and to a lesser extent for laughter. In
contrast to human vocal sounds like speech and laughter, a
less dynamic frequency spectrum can be observed for
nonvocal sounds. Furthermore, sounds are composed of
single frequency components without the typical speech-like
frequency and time variation.
To summarize the acoustic observations of the material
used in this study: the time wave form (Fig. 1) reflects
temporal characteristics and differences between human
vocal sounds and nonvocal sounds. In speech and
laughter, segments and syllables are connected via co-
articulation. Boundaries of syllables and even words can
be detected. In nonvocal sounds, only the repetitive
character of a re-iterant sound event is observable. The
resonance characteristics, that is, the filtering properties of
the resonance body of the human vocal tract for speech
and laughter (Figs. 2A and B) reflect the presence of (at
least the first three) formants in the vocalic parts of human
sounds like speech and laughter and their dynamic
transitions, that is, rapid changes in frequency due to
articulatory movements of the tongue, the jaw, and other
articulators. Nonvocal sounds comprise simply superim-
posed and connected single frequencies and lack speech-like dynamic frequency variations (Fig. 2C). The source
characteristics also differ between the signals. By compar-
ing human speech, laughter, and nonvocal sounds, it
becomes obvious that the harmonic structure of the latter
is less dynamic than in human vocal sounds. However, due
to a different temporal organization of laughter consisting of
repeated single laugh syllables, the harmonic dynamics in
laughter is reduced to syllabic nuclei, that is, to the vowel
/a/, whereas in speech, it is distributed over all voiced parts
and thus displays more variance. Furthermore, nonvocal
sounds are not produced by periodic vibrations of vocal
cords in the larynx and show higher frequencies for the first
harmonics. Taken together, the acoustic analysis demon-
strates the presence of vocal cues in human laughter and
spoken sentences and the absence of vocal cues in the
sounds we used.
Vocal and nonvocal auditory stimuli did not differ in level, location, or movement. Speech files
were recorded from a trained female speaker. The laughing
samples were provided by a regional Radio and TV station
(Mitteldeutscher Rundfunk). The sounds were taken from a
freely distributed sound data base (Macintosh). All
conditions were equalized in length (1.5–2.3 s) and
normalized for amplitude. The four conditions were
digitized at a 16 bit/16 kHz sampling rate and presented
in a random order. The relation between image acquisition
and trial presentation was systematically varied so that
numerous time points along the response are sampled [16]
yielding an 'oversampling' with a de facto sampling rate
of 2.5 Hz and an average interstimulus interval of 8 s. In
each functional run, each condition was repeated 15 times
and during the first and last 16 s of each run no trials were
presented, giving a total duration of 10 min and 32 s per
run. In total, participants underwent 5 runs, although some participants were presented with only 4 runs for technical reasons.
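The timing figures reported here can be cross-checked with a few lines of arithmetic. The sketch below is our reconstruction, not the authors' code, under one explicit assumption: the stated 10 min 32 s per run only works out if the empty (null) trials that later serve as a baseline (see the ROI analysis) count as a fifth trial type.

```python
# Our reconstruction of the run timing, not the authors' code.
# Assumption: empty (null) trials form a fifth trial type alongside
# speech, sounds, laughter, and the target condition.
TR = 2.0             # image acquisition time (s)
JITTER_STEP = 0.4    # 1 / 2.5 Hz, the "de facto sampling rate" quoted above
N_TRIAL_TYPES = 5    # speech, sounds, laughter, target, empty trials (assumed)
REPETITIONS = 15     # per trial type and run
MEAN_ISI = 8.0       # average interstimulus interval (s)
SILENCE = 16.0       # trial-free period at the start and end of each run (s)

n_trials = N_TRIAL_TYPES * REPETITIONS
run_duration = n_trials * MEAN_ISI + 2 * SILENCE
minutes, seconds = divmod(int(run_duration), 60)
print(f"{n_trials} trials -> {minutes} min {seconds} s per run")  # 10 min 32 s

# Jittering trial onsets against the 2-s TR in 0.4-s steps samples the
# hemodynamic response at five distinct offsets relative to acquisition.
n_offsets = round(TR / JITTER_STEP)
print([round(i * JITTER_STEP, 1) for i in range(n_offsets)])  # 0.0 ... 1.6 s
```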
2.4. Procedure
Participants were comfortably placed supine in the
scanner and heard stimuli in pseudo-randomized order to
enable event-related data analysis. Cushions were used to
reduce head motion. The sounds were presented diotically
through specially constructed, MR compatible headphones
(Commander XG, Resonance Technology, Northridge,
USA). A combination of sound proof headphones and
perforated ear plugs which conducted the sound directly
into the auditory passage was used to attenuate the
scanner noise without reducing the quality of speech
stimulation.
2.5. Task
It has been shown that actively calling a subject’s
attention toward an auditory stimulus yields increased
activation in auditory fields [51,111]. To avoid a confound
between stimulus perception and attentional demands, we
utilized a simple target detection task.
2.6. MR Imaging
MRI data were collected at 3 T using a Bruker 30/100
Medspec system (Bruker Medizintechnik GmbH, Ettlingen,
Germany). The standard bird cage head coil was used.
Before MRI data acquisition, field homogeneity was
adjusted by means of 'global shimming' for each subject.
Then, scout spin echo sagittal scans were collected to define
the anterior and posterior commissures on a midline sagittal
section. For each subject, structural and functional (echo-
planar) images were obtained from 12 axial slices (6 mm
thickness, 2 mm spacing, 64 × 64 matrix with a FOV of 19.2 cm)
parallel to the plane intersecting the anterior and posterior
commissures (AC–PC plane). The six inferior slices were
positioned below the AC–PC plane and the remaining six
slices extended dorsally. The whole range of slices covered
all parts of the peri-sylvian cortex and extended dorsally to
the supplementary motor area (SMA). After defining the
slices’ position, a set of two-dimensional T1 weighted
anatomical images (MDEFT sequence: TE = 20 ms, TR =
3750 ms, in-plane resolution 0.325 mm²) were collected in
plane with the echo-planar images to align the functional
images to the 3D-images [71,94]. A gradient-echo EPI
sequence was used with TE = 30 ms, flip angle = 90°, TR = 2000
ms. In a separate session, high-resolution whole-head 3D
MDEFT brain scans (128 sagittal slices, 1.5 mm thickness,
FOV 25.0 × 25.0 × 19.2 cm, data matrix of 256 × 256
voxels) were acquired additionally for reasons of improved
localization.
2.7. Data analysis
Data were analyzed with LIPSIA software [58]. During
reconstruction of the functional data, the 5 (or, for some participants, 4) runs were concatenated into a single run.
Data preparation proceeded as follows: slice-wise motion
correction (time step 20 as reference), sinc interpolation in
time (to correct for fMRI slice acquisition sequence);
baseline correction (cut-off frequency of 1/80 Hz). To align
the functional data slices onto a 3D stereotactic coordinate
reference system, a rigid linear registration with six degrees
of freedom (3 rotational, 3 translational) was performed.
The rotational and translational parameters were acquired on
the basis of the 2-dimensional MDEFT and EPI-T1 slices to
achieve an optimal match between these slices and the
individual 3D reference data set. The resulting parameters,
scaled to standard size, were then used to transform the
functional slices using trilinear interpolation, so that the
Table 1
Laughter vs. Speech: in this table and in Table 2, results of the direct comparison of laughter vs. speech and sounds are listed

                                         Left hemisphere                       Right hemisphere
Location                     BA          Z score  Cluster size  x    y    z    Z score  Cluster size  x   y    z

Laughter > Speech
PrCG                         6           –        –             –    –    –    3.36     463           28  −16  47
PrCG                         6           –        –             –    –    –    3.63a    –             49  5    23
HG/ROp/subCG                 41          –        –             –    –    –    3.89     2701          47  −13  12
PT                           40          –        –             –    –    –    3.57     734           55  −37  24
STP/HG/subCG                 22          3.60     1218          −47  −25  12   –        –             –   –    –

Speech > Laughter
IFG (triangularis)/post IFS  9/45        −3.48    773           −43  17   27   –        –             –   –    –
FOp (triangularis)           45/44       −3.98    1688          −40  20   6    –        –             –   –    –
STG/STS                      22/42/21    −4.86    11,714        −59  −13  6    −3.73    450           49  −22  3
Post MTG                     39          −3.31a   –             −47  −63  11   –        –             –   –    –

Z scores indicate the mean activation in a particular anatomical structure. Localization of coordinates corresponds to the location of local maximal activation and is based on standard Talairach brain space. Distances are relative to the intercommissural (AC–PC) line in the horizontal (x), anterior–posterior (y), and vertical (z) directions. Functional activation was thresholded at |Z| ≥ 3.09 (uncorrected alpha level 0.001). The table only lists activation clusters exceeding a minimal size of 225 mm³ (5 measured voxels).
a Indicates that the Z score refers to a local maximum, as this cluster is not distinct but overlaps with the cluster listed in the line above.
Stereotactic coordinates are assigned to anatomical locations using the 'Automatic anatomical labeling' tool [92] as implemented in the MRICRO software package (http://www.psychology.nottingham.ac.uk/staff/cr1/mricro.html). Anatomical labels are abbreviated as follows: IFG = inferior frontal gyrus, IFS = inferior frontal sulcus, PrCG = precentral gyrus, subCG = subcentral gyrus, PaOp = parietal operculum, HG = Heschl's gyrus, FOp = frontal operculum, ROp = Rolandic operculum, PT = planum temporale, PPar = planum parietale, STG = superior temporal gyrus, STS = superior temporal sulcus, MTG = middle temporal gyrus, FG = fusiform gyrus, IOG = inferior occipital gyrus.
resulting functional slices were aligned with the stereotactic
coordinate system. This linear normalization process was
improved by a subsequent processing step that performs an
additional non-linear normalization [91].
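As an illustration of the resampling step only (the actual LIPSIA registration machinery is more involved), the sketch below applies a rigid-body transform with six degrees of freedom to a volume using trilinear interpolation; the rotation angles and translations are placeholders.

```python
# A minimal sketch, not the LIPSIA implementation: applying a rigid-body
# transform (3 rotations, 3 translations) to a volume with trilinear
# interpolation, as the normalization step described above requires.
import numpy as np
from scipy.ndimage import affine_transform

def rigid_transform(volume, angles_rad, translation_vox):
    """Rotate about the x, y, z axes (in that order) and translate;
    order=1 selects trilinear interpolation."""
    ax, ay, az = angles_rad
    rx = np.array([[1, 0, 0],
                   [0, np.cos(ax), -np.sin(ax)],
                   [0, np.sin(ax), np.cos(ax)]])
    ry = np.array([[np.cos(ay), 0, np.sin(ay)],
                   [0, 1, 0],
                   [-np.sin(ay), 0, np.cos(ay)]])
    rz = np.array([[np.cos(az), -np.sin(az), 0],
                   [np.sin(az), np.cos(az), 0],
                   [0, 0, 1]])
    rotation = rz @ ry @ rx
    # affine_transform maps output coordinates back to input coordinates,
    # so the inverse rotation is supplied; rotate around the volume centre.
    centre = (np.array(volume.shape) - 1) / 2
    inv = rotation.T
    offset = centre - inv @ (centre + np.asarray(translation_vox))
    return affine_transform(volume, inv, offset=offset, order=1)

volume = np.random.rand(64, 64, 12)               # EPI-sized dummy volume
aligned = rigid_transform(volume, angles_rad=(0.0, 0.0, 0.05),
                          translation_vox=(1.0, -0.5, 0.0))
```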
Statistical evaluation was based on a least-square
estimation using the general linear model for serially auto
correlated observations [2,14,32,109]. First, for each
subject, statistical parametric maps were generated. The
design matrix was generated with the standard hemody-
namic response function, its first and second derivative
and a response delay of 6 s. The model equation,
including the observation data, the design matrix, and
the error term, was convolved with a Gaussian kernel of
dispersion of 4 s FWHM. Thereafter, contrast maps (i.e.,
estimates of the raw-score differences of the beta
coefficients between specified conditions) were generated
Table 2
Laughter vs. Sounds: functional activation indicated separately for direct contrasts between Laughter and Sounds

                                      Left hemisphere                        Right hemisphere
Location         BA        Z score  Cluster size  x    y    z     Z score  Cluster size  x   y    z

Laughter > Sounds
HG/STG           41/42/22  –        –             –    –    –     3.60     884           52  −16  9
Fund STS         6         –        –             –    –    –     3.58     381           40  −37  12
FG               19        –        –             –    –    –     3.62     1491          37  −58  −6

Sounds > Laughter
HG/Ppar/PaOp     22/40/41  −8.34    12,438        −44  −34  24    –        –             –   –    –
STG/Ppar/PaOp    22/40     –        –             –    –    –     −4.42    8173          37  −34  27
Post. MTG        39        −4.82a   –             −44  −60  −14   –        –             –   –    –
IFG (orbitalis)  47/10     −3.46    308           −28  32   −3    –        –             –   –    –

For explanations, see Table 1.
for each subject and the differences between conditions
were calculated using the t statistic. Subsequently, t values
were transformed into Z scores. As the individual func-
tional data sets were all aligned to the same stereotactic
reference space, a group analysis was subsequently
performed. Group activations were calculated by the
Gaussian test at corresponding voxels [14]. To protect
against false positive activations, only regions with a Z score greater than 3.1 (P < 0.001, uncorrected) and with a volume greater than 225 mm³ (5 measured voxels) were
considered [15,27] (Tables 1 and 2).
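For readers unfamiliar with this pipeline, the following schematic sketch (our simplification, not the LIPSIA implementation) walks through the two steps just described: fitting a GLM with an assumed double-gamma hemodynamic response function to one voxel's time course, and applying the Z threshold plus the 5-voxel cluster-extent criterion to a Z map. Derivative regressors, autocorrelation modelling, and spatial smoothing are omitted, and all onsets and data are placeholders.

```python
# Schematic GLM + cluster-extent thresholding (our simplification; the
# double-gamma HRF shape is an assumption standing in for the "standard
# hemodynamic response function" named above).
import numpy as np
from scipy.stats import gamma, norm, t as t_dist
from scipy.ndimage import label

TR, n_scans = 2.0, 316                        # 316 scans x 2 s = 10 min 32 s
times = np.arange(n_scans) * TR
hrf = gamma.pdf(times, 6) - gamma.pdf(times, 16) / 6

def regressor(onsets_s):
    stick = np.zeros(n_scans)
    stick[(np.asarray(onsets_s) / TR).astype(int)] = 1.0
    return np.convolve(stick, hrf)[:n_scans]

# Design matrix: one column per condition plus a constant term.
X = np.column_stack([regressor([16, 48, 80]),     # placeholder laughter onsets
                     regressor([32, 64, 96]),     # placeholder speech onsets
                     np.ones(n_scans)])
y = np.random.randn(n_scans)                      # stand-in voxel time course

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
c = np.array([1.0, -1.0, 0.0])                    # contrast: laughter > speech
resid = y - X @ beta
dof = n_scans - np.linalg.matrix_rank(X)
se = np.sqrt(resid @ resid / dof * c @ np.linalg.pinv(X.T @ X) @ c)
z = norm.isf(t_dist.sf(beta @ c / se, dof))       # t value expressed as Z

# Cluster-extent threshold: |Z| >= 3.09 and at least 5 voxels (225 mm^3).
zmap = np.random.randn(64, 64, 12)                # stand-in group Z map
labels, _ = label(np.abs(zmap) >= 3.09)
sizes = np.bincount(labels.ravel())
surviving = [i for i in range(1, len(sizes)) if sizes[i] >= 5]
mask = np.isin(labels, surviving)                 # voxels in surviving clusters
```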
We also computed an additional post hoc conjunction
analysis to reveal the neural substrates of vocal timbre. For
this purpose, we determined the significant clusters which
demonstrated common activation in both the laughter vs.
sounds and the speech vs. sounds contrast (Table 3).
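The text does not spell out the conjunction algorithm; a common reading is the minimum-statistic approach, sketched below under that assumption (not the authors' exact code): a voxel counts as conjointly active only if it survives the threshold in both contrasts.

```python
# A minimal sketch of the post hoc conjunction described above, using the
# minimum-statistic approach (our assumption about the method).
import numpy as np

def conjunction(zmap_a, zmap_b, z_thresh=3.09):
    """Laughter-vs-sounds AND speech-vs-sounds, voxel-wise."""
    zmin = np.minimum(zmap_a, zmap_b)      # weakest evidence at each voxel
    return zmin * (zmin >= z_thresh)       # zero out voxels failing either map

laughter_vs_sounds = np.random.randn(64, 64, 12)   # placeholder Z maps
speech_vs_sounds = np.random.randn(64, 64, 12)
vocal_timbre = conjunction(laughter_vs_sounds, speech_vs_sounds)
```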
Table 3
Conjunction analysis [Laughter vs. Sounds ∩ Speech vs. Sounds]: clusters indicate regions which conjunctively activate during vocal timbre (speech and laughter vs. sounds)

                        Left hemisphere                       Right hemisphere
Location      BA   Z score  Cluster size  x    y    z    Z score  Cluster size  x   y    z

Laughter vs. Sounds ∩ Speech vs. Sounds
Medial HG     41   –        –             –    –    –    –        459           40  −33  9
Anterior STG  22   –        54            −56  −6   9    –        594           58  −3   6
IOG           19   –        378           −44  −72  −3   –        –             –   –    –
FG            37   –        135           −35  −78  −9   –        999           28  −51  −9

For explanations, see Table 1.
Furthermore, we performed a post hoc 'region of interest' (ROI) analysis which enabled us to compute condition-
specific event-related time courses to uncover differential
involvement of particular cortical sites. We collected BOLD
signals from six bilateral ROIs along the peri-sylvian cortex.
We selected these regions since they have been demon-
strated to mediate cardinal speech and auditory functions,
namely anterior superior temporal gyrus (LH aSTG: −58, −12, 1; RH aSTG: 55, −11, 2) [11,30,45,82,84,85,88], medial part of Heschl's gyrus (LH mdHG: −43, −18, 9; RH mdHG: 40, −12, 5) [42,80], posterior STG (LH pSTG: −48, −40, 15; RH pSTG: 49, −34, 14) [38,42,55,69,108], planum parietale (LH Ppar: −54, −30, 25; RH Ppar: 56, −32, 26) [33,40,49,55,60], parietal operculum (LH PaOp: −40, −28, 23; RH PaOp: 40, −28, 26) [33,40,49,55,73,78,90,108], and subcentral gyrus (LH subCG: −49, −10, 12; RH subCG: 48, −11, 12) [79].⁴ The signal was collected
from the voxel with the highest regional activation in the
average Zmap originating from separate contrasts for each
single condition (laughter, speech, sounds) when compared
to empty trials. The signal was averaged time-locked to the
onset of each stimulus presentation and across all 12
subjects.
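A compact sketch of this procedure (our reconstruction; array shapes, names, and onsets are hypothetical): select the peak-Z voxel inside an ROI, convert its time course to percent signal change, and average epochs time-locked to stimulus onsets.

```python
# Minimal sketch of the ROI time-course analysis described above
# (our reconstruction, not the authors' code).
import numpy as np

def roi_time_course(bold, zmap, roi_mask, onsets_s, tr=2.0, window_s=12.0):
    """bold: (x, y, z, t) 4D data; zmap: (x, y, z); roi_mask: boolean mask."""
    masked = np.where(roi_mask, zmap, -np.inf)
    peak = np.unravel_index(np.argmax(masked), zmap.shape)  # highest-Z voxel
    series = bold[peak].astype(float)
    series = 100.0 * (series - series.mean()) / series.mean()  # percent signal
    n_pts = int(window_s / tr)
    epochs = [series[int(o / tr): int(o / tr) + n_pts]
              for o in onsets_s if int(o / tr) + n_pts <= series.size]
    return np.mean(epochs, axis=0)  # event-related average for this ROI

# Example: averaging responses in a hypothetical right subCG ROI.
bold = np.random.rand(64, 64, 12, 316)
zmap = np.random.rand(64, 64, 12)
roi = np.zeros((64, 64, 12), dtype=bool); roi[40:44, 20:24, 5:7] = True
curve = roi_time_course(bold, zmap, roi, onsets_s=[16, 48, 80])
```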
3. Results
Fig. 3 displays the main fMRI results: the contrasts Laughter vs. Speech and Laughter vs. Sounds, and the conjunction analysis Vocal timbre vs. Sounds⁵.
3.1. Laughter vs. speech
Fig. 3A illustrates that several right peri-sylvian areas
more strongly subserved laughter when compared to speech,
namely the PT, HG/ROp, the subCG, and the ventrolateral
precentral gyrus/sulcus (PrCG). Note that the medial strip of
⁴ An additional figure showing the position of the ROIs projected on brain macroanatomy is available online (http://www.psychologie.unizh.ch/neuropsy/home_mmeyer/BRES-D-04-06566).
⁵ Due to spatial restrictions, we do not comprehensively discuss the Speech vs. Sounds contrast here. Brain scans, tables, and results originating from the contrast Speech vs. Sounds are available at http://www.psychologie.unizh.ch/neuropsy/home_mmeyer/BRES-D-04-06566.
the supratemporal plane (STP) in the left hemisphere also
shows stronger activity for laughter relative to speech.
Hearing normal speech relative to laughter revealed
considerable activity in the left peri-sylvian cortex along
the superior temporal gyrus (lateral convexity) extending
into the STS, STP, and in the pars triangularis of the frontal
operculum (FOp). Furthermore, extra-sylvian activity was
observed in the left posterior MTG. Stronger right hemi-
sphere responses to speech relative to laughter were found
in the middle part of the right STG/STS.
3.2. Laughter vs. sounds
As apparent from Fig. 3B, stronger hemodynamic
responses to laughter as compared to sounds were collected
from right hemisphere sites, namely from the HG/STG,
from the fund of the STS, and from the FG. Hearing sounds
relative to laughter involves medial parts of the left and right
Heschl’s gyrus (HG), the Ppar, the PT, the posterior insula,
and the PaOp bilaterally. Furthermore, we observed
activation in the left posterior MTG and in the left pars
orbitalis of the IFG at the border to the anterior insula/
posterior orbital gyrus⁶.
3.3. Vocal timbre vs. nonvocal sounds
The post hoc conjunction analysis to identify regions
which commonly activate during both speech and laughter
processing revealed bilateral temporal and occipital sites
with the right hemisphere demonstrating stronger involve-
ment. As apparent from Fig. 3C and Table 3, hearing vocal
sounds (speech and laughter) relative to nonvocal sounds
more strongly activates bilateral sections of the anterior
lateral STG. The right medial part of HG lining the posterior
insula also shows stronger engagement. Furthermore, hearing
vocal timbre activated the left inferior occipital gyrus (IOG)
as well as both the left and right fusiform gyrus (FG). Taken
together, based on consideration of clusters’ size, we
observed a considerable functional rightward asymmetry
during perception of vocal timbre.
6 The latter cluster is depicted on a supplementary Figure available at
(dhttp://www.psychologie.unizh.ch/neuropsy/home_mmeyer/BRES-D-04-
06566T).
3.4. ROI analyses
Data obtained from six bilateral ROIs for each of the
three conditions evince different local maxima in distinct
regions of the peri-sylvian cortex (Figs. 4–6).
Speech comprehension evoked the most considerable
responses in the left temporal cortex, in particular in lateral
parts of the anterior and posterior STG when compared to
sounds and laughter. Left hemisphere dominance for speech
was observed for temporal ROIs (aSTG, pSTG). A general
rightward asymmetry of hemodynamic responses to laughter
was found in four out of six ROIs (aSTG, pSTG, Ppar,
subCG). The strongest responses to laughter relative to
speech and sounds were collected from bilateral subCG with
a salient peak in the right subCG. Notably, time courses for
laughter overlapped with time courses for speech in eight
out of twelve ROIs in both the left (mdHG, Ppar, PaOp,
subCG) and the right hemisphere (aSTG, pSTG, Ppar,
PaOp).
For sound processing, we consistently observed stronger
RH activity in all ROIs (except mdHG). Compared to speech
and laughter, hemodynamic responses to sounds were
strongest in five out of six temporo-parietal ROIs (LH +
RH Ppar, LH + RH PaOp, LH subCG), but only in two out of
six supratemporal ROIs (LH + RH mdHG).
Summing up the major results, the data provide clear
evidence that the three different conditions preferentially
activated distinct areas. Hearing speech primarily involved
areas in the lateral convexity of the left superior temporal
region with the anterior and posterior STG obviously
playing a prominent role. Hearing laughter preferentially
recruited right hemisphere areas. Interestingly, areas in the
transition zone between parietal auditory fields and motor
cortex turned out to be considerably activated by laughter.
Hearing sounds most strongly activated bilateral regions
along the medial convexity of the posterior supratemporal
plane (HG, PT, Ppar) up to the parietal operculum with the
right hemisphere consistently producing stronger responses.
4. Discussion
This event-related fMRI study reveals distinct regions in
the human auditory cortex during auditory perception of
speech, vocal non-speech (laughter), and nonvocal sounds
and indicates a functional division of peri-sylvian structures.
Previous brain imaging studies have already discussed a functional segregation of temporal regions during auditory processing but ignored vocal non-speech stimuli [9,11] or confounded vocal and non-vocal sounds,
Fig. 3. (A) Functional inter-subject activation for the contrast Laughter vs. Speech. The brain scans show the direct comparison between laughter (hot colors) and speech (cold colors) on left (up x = −54, down x = −47) and right (up x = 51, down x = 49) parasagittal and horizontal (z = 8) sections (|Z| ≥ 3.09, uncorrected alpha level 0.001). In this figure and in (B) and (C), functional inter-subject activation (N = 12) is superimposed onto a normalized white-matter segmented 3D reference brain. Thus, the brain's white matter is separated from grey matter so that the cortical layers (the outermost 3 to 5 mm) are removed. (B) Functional inter-subject activation for the contrast Laughter vs. Nonvocal sounds. The brain scans show the direct comparison between laughter (hot colors) and sounds (cold colors) on left (up x = −54, down x = −40) and right (up x = 55, down x = 43) parasagittal and horizontal (z = 8) sections (|Z| ≥ 3.09, uncorrected alpha level 0.001). (C) Conjunction analysis [Laughter vs. Sounds ∩ Speech vs. Sounds]. The brain scans show clusters which commonly activated during perception of speech and laughter relative to nonvocal sounds. Overlapping responses to human vocal timbre are depicted on two horizontal (up z = 9, down z = 6) and two right parasagittal (up x = 43, down x = 38) sections.
Fig. 4. Averaged percent signal change collected from six bilateral regions of interest. The signal was collected from the voxel with the highest activation in the average Z map. The signal was averaged time-locked to the onset of each stimulus presentation and across all 12 subjects. aSTG = anterior superior temporal gyrus, mdHG = medial Heschl's gyrus, pSTG = posterior superior temporal gyrus, Ppar = planum parietale, PaOp = parietal operculum, subCG = subcentral gyrus.
which made it difficult to properly distinguish between peri-
sylvian regions subserving speech and nonspeech percep-
tion. In order to address this issue more carefully, we
presented three different types of auditory stimuli, namely
sentential vocal speech, human laughter, and nonvocal
sounds. All three types of auditory percepts evoked brain
responses in the peri-sylvian cortex bilaterally with the level
and spatial distribution of activation varying as a function of
experimental condition.
The implications of our findings for auditory processing
in distinct portions of the peri-sylvian cortex are discussed
in turn.
4.1. Speech
Our data on perception of sentential speech largely confirm published work in the field [29,47,84],
so we refrain from elaborating on what has been
comprehensively described elsewhere. In brief, our con-
trasts clearly show that perception of intelligible sentential
speech particularly involves brain regions along the entire
lateral STG and STS, and in the left lateral FOp. In
general, this pattern of consistent brain activation is in
agreement with a host of neuroimaging studies which
observed bilateral recruitment of superior temporal areas
while volunteers heard intelligible sentential speech
[20,22,31,62,70,82,85].
More precisely, comprehending speech signals necessi-
tates the decoding of information from differing linguistic
domains, for example, meaning of words, thematic, and
structural relations, as well as from simple sound-based
signals comprised of multiple co-occurring frequencies
called formants. Recent brain imaging studies identified
distinct brain regions, especially in left peri-sylvian cortex
subserving particular aspects of linguistic and non-linguistic
aspects of speech comprehension [10,11,28,30,31,45]. In
Fig. 5. Time course of mean event-related hemodynamic responses collected from three bilateral regions of interest in the superior temporal lobe.
other words, speech comprehension can be considered a
hierarchical series of processing stages to translate sounds
into meaning [22]. Both simple sound-based and abstract
higher-level aspects of speech comprehension recruit
cortical areas on the undulating surface of the STP as well
as the insula and the deep frontal operculum [12,36,64] as
well as areas stretching along the lateral convexity of the
STG down to the temporal poles. Our ROI analyses
uncovered two regions in the left superior temporal gyrus
which have formerly been identified as subserving speech
comprehension. According to the present understanding, the
posterior superior temporal gyrus supports the processing
of rapidly changing acoustic cues which correspond to
phonetically relevant acoustic features [55,73,108] while
the anterior part maps acoustic-phonetic cues onto lexical
and grammatical representations [82,88]. In summary, brain
regions mediating higher-level aspects of speech, which is made up of complex sounds as well as meaningful semantic and even syntactic cues, can be localised within left
temporo-lateral and predominantly anterior temporal and
fronto-opercular cortex [29,73,81,83]. Based on the design
of the present study, we are not able to ascribe any
particular linguistic function to distinct subportions of the
peri-sylvian cortex. However, all regions which exhibit
stronger activity for sentences relative to laughter or sounds
have been described elsewhere as playing a relevant role
during auditory sentence comprehension [22,31,61].
4.2. Sounds
Even though the sound condition we presented in the
current study was primarily considered a nonvocal control
Fig. 6. Time course of mean event-related hemodynamic responses collected from three bilateral regions of interest in the inferior parietal lobe.
condition, we discovered interesting and novel findings
which are worth a detailed discussion. Processing nonvocal
sounds involves posterior portions of the right and the left
peri-sylvian region, namely the medial part of the primary
auditory cortex, the adjacent PT, the Ppar, and the PaOp
which all subserve auditory functions [1,38,40,43,49–
55,81,87] and can be considered part of the auditory cortex
[33,78,90]. Interestingly, in the context of the present study,
the anterior parts of the STP did not reveal selective
responses to nonvocal sounds, only the posterior structures
including the PT and adjacent sections extending into the
Ppar and the PaOp responded to the presentation of
nonvocal sounds. The planum temporale has been recently
described as a universal 'computational hub' playing an
integrative role in sound processing [37] and housing
transient representation of temporally ordered sound
sequences without distinguishing between words and other
sounds [36]. Thus, the present data support a view which
suggests the PT as governing intrinsic spectrotemporal
features of sound [103] and which characterizes the PT as
responsive but not selective to speech, as it is assumed to
constitute an integrating mechanism mediating an ongoing
updating between stored and present auditory information
[69]. Alternatively, it should be mentioned that the
posteromedial compartments of human auditory cortex
have been associated with processing the location of sound
sources [60,102,113]. A recent refinement of the model by
Griffiths and Warren more specifically proposes that the
PT accesses learned representations of temporarily stored
sounds in higher cortical areas and gates spatial and
object-related information [38,101]. Here, we only used
monotonous nonvocal sounds which do not vary in
location and which are difficult to spontaneously associate
with any environmental or meaningful origin. However, it
cannot be ruled out that perception of nonvocal sounds
automatically activated a neural ensemble which normally
responds to changes in spatial and object-related cues to end
up with a reasonable interpretation of nonvocal sounds.
Corroborating evidence for our finding comes from a recent
fMRI study which investigated brain regions involved in
recognizing environmental sounds [57]. While the bilateral
posterior MTG appears to preferentially subserve semantically related environmental sounds, Lewis et al. [57]
reported that bilateral regions in the posterior Sylvian
cortex including the inferior parietal cortex responded to
both environmental and unrecognizable backward sounds.
Thus, our findings accord with recent studies which
proposed a processing hierarchy in the human peri-sylvian
regions with anterolateral regions such as lateral STG and
PPol being preferentially activated by complex speech
sounds while the posteriomedial portions of the Sylvian
Fissure are more strongly attributed to the processing of the
auditory structure of complex nonspeech sounds and simple
speech sounds (phonemes) [38,55]. This suggestion over-
laps with the observation that the temporo-parietal cortex,
particularly in the right hemisphere mediates abstract sound
knowledge [25]. Our finding of bilateral activity covering
the posterior peri-sylvian cortex, the tissue deeply buried in
the posterior ascending Sylvian branch up to the parietal
operculum may additionally speak to the existence of
neurons sensitive to auditory stimulation in the inferior
parietal lobe accommodating the Ppar and the PaOp. Little
is known about the specific role of the Ppar and PaOp. Both
Ppar and PaOp have been formerly attributed to auditory
processing but have not yet been assigned to any specific
function [40,49]. At least, the Ppar partly overlaps with the
cytoarchitectonic area Tpt which has been classified as
auditory cortex [33]. Thus, the present data add to the view
that the posterior portions of the peri-sylvian region mediate
processing of fine-grained, spectro-temporal, acoustic cues
and can therefore be considered part of the auditory cortex
in man [18] even though it is not possible to precisely
identify the functional contribution of distinct subregions
(PT, Ppar, PaOp) which brought on a signal increase for
nonvocal sounds.
4.3. Laughter
Hearing laughter when compared to speech or sounds
preferentially engages the right STG including the primary
auditory cortex. This finding is in line with a recent fMRI-
study which also reported a stronger right STG involvement
in processing human laughter [80]. As speech and laughter
do not considerably differ in acoustic complexity, other
issues must account for this finding. Numerous clinical
studies pointed to a right hemisphere preference for
affective moods [13,44]. However, contradictory evidence
comes from a view which claims that the right hemisphere
does not universally process affective moods [96]. It rather
seems that the right hemisphere is more proficient in
processing acoustic cues such as changes in pitch, ampli-
tude, and frequency modulations if they load affective
information. Vice versa, linguistically loaded changes in the
same acoustic parameters may preferentially drive the left
hemisphere. As laughter is an affective utterance, it is
plausible that the right auditory cortex exhibits a stronger
involvement.
One might have expected that laughter would also evoke responses in the 'emotionally driven' system involving the amygdala, thalamic/hypothalamic, and subthalamic areas proposed by Wild et al. [105]. We believe that the
laughter sequences we used in the present study were not
sufficient to induce considerable emotional sensations for
two reasons. First, the laughing samples were recorded
from professional actors. After scanning, volunteers
reported that they did not perceive the laughing samples
as 'contagious laughter' in any case. However, Sander and
Scheich [80] also used laughing recorded from professional
actors as stimuli. Unlike our study, they observed clear
brain responses in the amygdala to laughing and crying.
Notably, Sander and Scheich used continuous streams of
laughter which lasted ten times longer than the laughing
sequences we applied in our study. Therefore, it is plausible
that this difference in duration accounts for the differences
in results.
Additionally, laughter produced activation in the right
subcentral gyrus and precentral gyrus, which have been described as part of the 'voluntary' system partly constituting the neural basis of expressive laughter in the brain. It is
therefore conceivable that hearing laughter also excites
brain regions which belong to the neural circuit subserving
expressive laughter [79]. Analogously, brain regions which
support the production of speech also appear to be involved
in speech processing [24,48,64,106].
4.4. Vocal timbre—the potential role of the (right posterior)
STG
The design of the present study also enabled us to
perform a particular post hoc conjunction analysis to
identify regions which commonly activate during both
perception of speech and laughter, that is, human vocal
timbre which could be considered the most prominent
acoustic feature which distinguishes between vocal and
nonvocal sounds. In the view of the present results, the right
and left anterior lateral STG preferentially responded to
stimuli made up of vocal timbre with the right hemisphere
playing a considerably stronger role. Furthermore, the
medial compartment of HG lining the posterior insular
cortex also activated saliently during vocal timbre percep-
tion. This strong asymmetry conflicts with at least one
recent PET study which concluded that temporal brain
structures we observed to respond more strongly to vocal
timbre (mid STG/HG) are functionally symmetrical [93].
However, this study by Tzourio-Mazoyer et al. did not
explicitly investigate the significance that vocal information may
have for the right hemisphere. Corroborating evidence
which lends credence to our view comes from a multitude
of clinical and brain imaging studies. For decades, the study
of neurologically impaired patients was the only source of
information about the neural processing of voices. Evi-
dently, our findings agree with the former neuropsycho-
logical research as these clinical observations suggest that
lesions of the temporal lobe of either hemisphere may lead
to deficits in the discrimination of unfamiliar voices
[95,97,98]. Apparently, a lesion in the right hemisphere
only leads to a deficit in the recognition of familiar voices
[95]. Recently, the neural processing of voice information
has also been particularly investigated by functional
imaging experiments corroborating and complementing the
clinical data. In a recent functional magnetic resonance
imaging (fMRI) study [99], it was shown that a task targeting the speaker's voice (in comparison to a task focusing on verbal content) leads to a response in the right
anterior superior temporal sulcus of the listener. In another
series of studies [6,7], it was shown that temporal lobe areas
in both hemispheres responded more strongly to human
vocal timbre than to other sounds, such as bells, dog barks, and machine sounds, with the right anterior polymodal
STS being preferentially driven by human vocal information
[4,8,36,56,100]. We also observed activation at the fund of
posterior STS when volunteers heard vocal timbre relative
to sounds. We presume that differences in baselines used in
the different studies may account for the different STS
locations.
Admittedly, the picture on vocal timbre perception
remains fragmentary, but this post hoc conjunction analysis
is at least in harmony with a host of clinical and
neuroimaging studies which describe the superior temporal
lobes as eminently relevant brain structures for the
perception of human vocal information.
When compared to nonvocal sounds, vocal timbre
produced activity in the left and right fusiform gyrus.
Cross-modal mechanisms may account for this intriguing
result. Previous observations in normal-hearing subjects and
in patients with cochlear implants have established that
visual cortex participates in speech processing [35], in
particular when listening to meaningful sentences [34]. The
precise location of activity in visual areas evoked by
auditory stimuli varies across studies. Giraud et al. observed
activity in the vicinity of the fusiform face area while
cochlear-implant patients were listening to speech [34]. The
authors suggest that patients probably translated auditory
speech into facial expressions. Therefore, it is not unlikely
that participants in the present study mentally associated a
laughing or speaking face, while hearing speech and
laughter. More corroborating evidence comes from an fMRI
study which reports fusiform and lingual activity collected
while participants heard sentences [99]. According to the
authors, these results imply that participants translated
auditory information into specific visual representations to
better accomplish the task. Interestingly, the same cross-
modal interaction works vice versa as it has been
demonstrated that seeing speech recruits auditory areas
[17,59]. According to our results, hearing nonvocal sounds
did not evoke internal images and therefore did not produce
activity in areas associated with imagery of faces.
Generally, the data allow us to draw the following picture
sketching a functional division of the peri-sylvian cortex.
Regions in the PAC and the peri-auditory cortex appear to
be proficient at integrating complex spectro-temporal
signals even though our data suggest a functional asymme-
try varying with global and temporal information available
in acoustic stimuli. Once the inflowing acoustic signal has
been identified as spoken language, regions in the antero-
lateral and posteriolateral STG, particularly in the left
hemisphere, initiate fine-grained processing at the phonetic,
semantic, and syntactic level. If auditory input does not
entail any speech, speech-like, or even vocal information,
regions in the posterior Sylvian region and the inferior
parietal lobe become involved in acoustic analysis. Unlike
vocal environmental sounds (dog barks, baby cries), the
nonvocal sounds used in the present design cannot easily be
associated with any meaning. It is therefore plausible to
assume that the enlarged activity for sounds reflects the search
for a sensible interpretation. Recent observations underline
this notion of the posterior peri-sylvian cortex as being
highly sensitive to all categories of meaningful and mean-
ingless complex sounds [87].
5. Conclusion
In conclusion, we have delineated distinct regions in
the human peri-sylvian cortex which appear to preferen-
tially support the processing of human laughter, speech,
and nonvocal sounds. Hearing human laughter preferen-
tially involves auditory and somatosensory fields primarily
in the right hemisphere. Our findings deliver novel insight
in that hearing laughter appears to activate the same
areas which have formerly been attributed to expressive
laughter. Hearing spoken sentences activates left anterior
and posterior lateral temporal and left inferior frontal
regions which is confirmatory with numerous former
findings. Hearing nonvocal sounds most strongly recruits
bilateral areas in the medial portion of Heschl’s gyrus and
at the medial wall of the posterior Sylvian fissure (planum
parietale) even including the inferior parietal lobe (parietal
operculum) and therefore adds functional evidence to the
ongoing debate on the extension of auditory fields in the
human brain. A post hoc conjunction analysis to identify
regions which commonly activate during both speech and
laughter revealed a functional rightward preference of
supratemporal sites for human vocal timbre. In summary,
the data add to recent work which provided evidence for a
hierarchical processing system in the peri-sylvian cortex
with the posteriomedial region mediating acoustic proper-
ties of nonvocal sounds and anteriolateral parts of the
STG and STS supporting the perception of human vocal
timbre.
Acknowledgments
The authors wish to thank Lutz Jancke, Adam McNamara, and two anonymous reviewers for helpful comments
on the manuscript.
During the preparation of the manuscript, Martin Meyer
was supported by Schweizer National Fonds (Swiss National Foundation) SNF 46234101, 'Short-term and long-term plasticity in the auditory system'. This study
was also supported by the project AL 357/2-1 by the
Deutsche Forschungsgemeinschaft (German Research
Foundation, DFG) as well as by the Human Frontier
Science Program (HFSP RGP5300/2002-C102) awarded
to Kai Alter.
References
[1] M. Adriani, A. Bellmann, R. Meuli, E. Fornari, R. Frischknecht, C.
Bindschaedler, F. Rivier, J.-F. Thiran, P. Maeder, S. Clarke,
Unilateral hemispheric lesions disrupt parallel processing within
the contralateral intact hemisphere: an auditory fMRI study, Neuro-
Image 20 (2003) 66–74.
[2] G.K. Aguirre, E. Zarahn, M. D’Esposito, Empirical analysis of
BOLD-fMRI statistics: II. Spatially smoothed data collected under
null-hypothesis and experimental conditions, NeuroImage 5 (1997)
199–212.
[3] J.-A. Bachorowski, M.J. Smoski, M.J. Owren, The acoustic features
of laughter, J. Acoust. Soc. Am. 110 (2001) 1581–1597.
[4] P. Belin, R.J. Zatorre, Adaptation to speaker’s voice in right anterior
temporal lobe, NeuroReport 14 (2003) 2105–2109.
[5] P. Belin, M. Zilbovicius, S. Crozier, L. Thivard, A. Fontaine, M.
Masure, Y. Samson, Lateralization of speech and auditory temporal
processing, J. Cogn. Neurosci. 10 (1998) 536–540.
[6] P. Belin, R.J. Zatorre, P. Lafaille, P. Ahad, B. Pike, Voice-selective
areas in human auditory cortex, Nature 403 (2000) 309–312.
[7] P. Belin, R.J. Zatorre, P. Ahad, Human temporal-lobe response
to vocal sounds, Brain Res. Cogn. Brain Res. 13 (2002) 17–26.
[8] P. Belin, S. Fecteau, C. Bedard, Thinking the voice: neural correlates
of voice perception, TICS 8 (2004) 129–135.
[9] R.R. Benson, D.H. Whalen, M. Richardson, B. Swainson, V.P. Clark,
S. Lai, A.M. Liberman, Parametrically dissociating speech and
nonspeech perception in the brain using fMRI, Brain Lang. 78
(2001) 364–396.
[10] J.R. Binder, J.A. Frost, T.A. Hammeke, R.W. Cox, S.M. Rao, T.
Prieto, Human brain language areas identified by functional
magnetic resonance imaging, J. Neurosci. 17 (1997) 353–362.
[11] J.R. Binder, J.A. Frost, T.A. Hammeke, P.S.F. Bellgowan, J.A.
Springer, J.N. Kaufman, E.T. Possing, Human temporal lobe
activation by speech and nonspeech sounds, Cereb. Cortex 10
(2000) 512–528.
[12] J.R. Binder, E. Liebenthal, E.T. Possing, D.A. Medler, D. Ward,
Neural correlates of sensory and decision processes in auditory
object identification, Nat. Neurosci. 7 (2004) 295–301.
[13] L.X. Blonder, D. Bowers, K.M. Heilman, The role of the right
hemisphere in emotional communication, Brain 114 (1991)
1115–1127.
[14] V. Bosch, Statistical analysis of multi-subject fMRI data: the
assessment of focal activations, J. Magn. Reson. Imaging 11
(2000) 61–64.
[15] T.S. Braver, S.R. Bongiolatti, The role of frontopolar cortex in
subgoal processing during working memory, NeuroImage 15 (2002)
523–536.
[16] M.A. Burock, R.L. Buckner, M.G. Woldorff, B.R. Rosen, A.M.
Dale, Randomized event-related experimental designs allow for
extremely rapid presentation rates using functional MRI, Neuro-
Report 9 (1998) 3735–3739.
[17] G. Calvert, E. Bullmore, M. Brammer, R. Campbell, S. Williams, P.
McGuire, P. Woodruff, S. Iversen, A. David, Activation of auditory
cortex during silent lipreading, Science 276 (1997) 593–596.
[18] G.G. Celesia, Organization of auditory cortical areas in man, Brain
99 (1976) 403–414.
[19] M.J. Citardi, E. Yanagisawa, J. Estill, Videoendoscopic analysis of
laryngeal function during laughter, Ann. Otol. Rhinol. Laryngol. 105
(1996) 545–549.
[20] J.T. Crinion, M.A. Lambon-Ralph, E. Warburton, D. Howard, R.J.S.
Wise, Temporal lobe regions engaged during normal speech
comprehension, Brain 126 (2003) 1193–1201.
[21] M. Dapretto, S.Y. Bookheimer, Form and content: dissociating
syntax and semantics in sentence comprehension, Neuron 24 (1999)
427–432.
[22] M.H. Davis, I.S. Johnsrude, Hierarchical processing in spoken
language comprehension, J. Neurosci. 23 (2003) 3423–3431.
[23] J.T. Devlin, J. Raley, E. Tunbridge, K. Lanary, A. Floyer-Lea, C.
Narain, I. Cohen, T. Behrens, P. Jezzard, P.M. Matthews, D.R.
Moore, Functional asymmetry for auditory processing in human
primary auditory cortex, J. Neurosci. 23 (2003) 11516–11522.
[24] G. Dogil, H. Ackermann, W. Grodd, H. Haider, H. Kamp, J. Mayer,
A. Riecker, D. Wildgruber, The speaking brain: a tutorial introduc-
tion to fMRI experiments in the production of speech, prosody, and
syntax, J. Neurolinguist. 15 (2002) 59–90.
[25] A. Engelien, D. Silbersweig, E. Stern, W. Huber, W. Döring, C.
Frith, R.S.J. Frackowiak, The functional anatomy of recovery from
auditory agnosia. A PET study of sound categorization in a neuro-
logical patient and normal controls, Brain 118 (1995) 1395–1409.
[26] S. Fecteau, J.L. Armony, Y. Joanette, P. Belin, Is voice processing
species-specific in human auditory cortex? An fMRI study, Neuro-
Image 23 (2004) 840–848.
[27] S.D. Forman, J.D. Cohen, M. Fitzgerald, W.F. Eddy, M.A. Mintun,
D.C. Noll, Improved assessment of significant activation in func-
tional magnetic resonance imaging (fMRI): use of a cluster-size
threshold, Magn. Reson. Med. 33 (1995) 636–647.
[28] A.D. Friederici, Towards a neural basis of auditory sentence
processing, TICS 6 (2002) 78–84.
[29] A.D. Friederici, K. Alter, Lateralization of auditory language
functions: a dynamic dual pathway model, Brain Lang. 89 (2004)
267–276.
[30] A.D. Friederici, M. Meyer, D.Y. von Cramon, Auditory language
comprehension: an event-related fMRI study on the processing of
syntactic and lexical information, Brain Lang. 75 (2000) 465–477.
[31] A.D. Friederici, S. Rüschemeyer, A. Hahne, C. Fiebach, The role of
left inferior frontal and superior temporal cortex in sentence
comprehension: localizing syntactic and semantic processes, Cereb.
Cortex 13 (2003) 170–177.
[32] K.J. Friston, Statistical parametric maps in functional imaging: a
general linear approach, Hum. Brain Mapp. 2 (1994) 189–210.
[33] A. Galaburda, F. Sanides, Cytoarchitectonic organization of the
human auditory cortex, J. Comp. Neurol. 190 (1980) 597–610.
[34] A.L. Giraud, E. Truy, The contribution of visual areas to speech
comprehension: a PET study in cochlear implants patients and
normal-hearing subjects, Neuropsychologia 40 (2002) 1562–1569.
[35] A.L. Giraud, C.J. Price, J.M. Graham, E. Truy, R.S.J. Frackowiak,
Cross-modal plasticity underpins language recovery after cochlear
implantation, Neuron 30 (2001) 657–663.
[36] A.L. Giraud, C. Kell, C. Thierfelder, P. Sterzer, M.O. Russ, C.
Preibisch, A. Kleinschmidt, Contributions of sensory input, auditory
search and verbal comprehension to cortical activity during speech
processing, Cereb. Cortex 14 (2004) 247–255.
[37] T.D. Griffiths, J.D. Warren, The planum temporale as a computational
hub, Trends Neurosci. 25 (2002) 348–353.
[38] T. Griffiths, J.D. Warren, What is an auditory object? Nat. Rev.
Neurosci. 5 (2004) 887–892.
[39] T.D. Griffiths, C. Büchel, R.S. Frackowiak, R.D. Patterson, Analysis
of temporal structure in sound by the human brain, Nat. Neurosci. 1
(1998) 422–427.
[40] F.H. Guenther, A. Nieto-Castason, S.S. Ghosh, J.A. Tourville, Rep-
resentation of sound categories in auditory cortical maps, J. Speech
Lang. Hear. Res. 47 (2004) 46–57.
[41] D.A. Hall, I.S. Johnsrude, M.P. Haggard, A.R. Palmer, M.A.
Akeroyd, A.Q. Summerfield, Spectral and temporal processing in
human auditory cortex, Cereb. Cortex 12 (2002) 140–149.
[42] D.A. Hall, H.C. Hart, I.S. Johnsrude, Relationships between human
auditory cortical structure and function, Audiol. Neuro-Otol. 8
(2003) 1–18.
[43] H.C. Hart, A.R. Palmer, D.A. Hall, Amplitude and frequency-
modulated stimuli activate common regions of human auditory
cortex, Cereb. Cortex 13 (2003) 773–781.
[44] K.M. Heilman, D. Bowers, L. Speedie, H.B. Coslett, Comprehen-
sion of affective and nonaffective prosody, Neurology 34 (1984)
917–921.
[45] G. Hickok, D. Poeppel, Towards a functional neuroanatomy of speech
perception, TICS 4 (2000) 131–138.
[46] G. Hickok, B. Buchsbaum, C. Humphries, T. Muftuler, Auditory–
motor interaction revealed by fMRI: speech, music, and working
memory in area Spt, J. Cogn. Neurosci. 15 (2003) 673–682.
[47] P. Indefrey, A. Cutler, Prelexical and lexical processing in listening,
in: M.A. Gazzaniga (Ed.), The New Cognitive Neuroscience, vol. III,
MIT Press, Cambridge, MA, 2004, pp. 759–774.
[48] P. Indefrey, F. Hellwig, H. Herzog, R.J. Seitz, P. Hagoort, Neural
responses to the production and comprehension of syntax in identical
utterances, Brain Lang. 89 (2004) 312–319.
[49] L. Jancke, G. Schlaug, X. Huang, H. Steinmetz, Asymmetry of the
planum parietale, NeuroReport 5 (1994) 1161–1163.
[50] L. Jancke, N.J. Shah, S. Posse, M. Grosse-Ruyken, H. Müller-
Gärtner, Intensity coding of auditory stimuli: an fMRI study,
Neuropsychologia 36 (1998) 875–883.
[51] L. Jancke, S. Mirzazade, N.J. Shah, Attention modulates activity in
the primary and the secondary auditory cortex: a functional magnetic
resonance imaging study in human subjects, Neurosci. Lett. 266
(1999) 125–128.
[52] L. Jancke, T. Buchanan, K. Lutz, S. Mirzazade, N.J.S. Shah, The
time course of BOLD response in the auditory cortex to acoustic
stimuli of different duration, Brain Res. Cogn. Brain Res. 8 (1999)
117–124.
[53] L. Jancke, T. Buchanan, K. Lutz, N.J.S. Shah, Focused and non-
focused attention in verbal and emotional dichotic listening: an fMRI
study, Brain Lang. 78 (2001) 349–363.
[54] L. Jancke, N. Gaab, T. Wüstenberg, H. Scheich, H. Heinze,
Short-term functional plasticity in the human auditory cortex: an fMRI
study, Brain Res. Cogn. Brain Res. 12 (2001) 479–485.
[55] L. Jancke, T. Wüstenberg, H. Scheich, H. Heinze, Phonetic
perception and the temporal cortex, NeuroImage 15 (2002) 733–746.
[56] S. Lattner, M. Meyer, A. Friederici, Voice perception: sex, pitch, and
the right hemisphere, Hum. Brain Mapp. 24 (2005) 11–20.
[57] J.W. Lewis, F.I. Wightman, J.A. Brefczynski, R.E. Phinney, J.R.
Binder, E.A. DeYoe, Human brain regions involved in recognizing
environmental sounds, Cereb. Cortex 14 (2004) 1008–1021.
[58] G. Lohmann, K. Müller, V. Bosch, H. Mentzel, S. Hessler, L. Chen,
S. Zysset, D.Y. von Cramon, Lipsia—A new software system for the
evaluation of functional magnetic resonance images of the human
brain, Comput. Med. Imaging Graph. 25 (2001) 449–457.
[59] M. MacSweeney, E. Amaro, G.A. Calvert, R. Campbell, P.K.
McGuire, S.C. Williams, B. Woll, M.J. Brammer, Silent speech-
reading in the absence of scanner noise: an event related fMRI study,
NeuroReport 11 (2000) 1729–1733.
[60] P.P. Maeder, R.A. Meuli, M. Adriani, A. Bellmann, E. Fornari, J.-P.
Thiran, A. Pittet, S. Clarke, Distinct pathways involved in sound
recognition and localization: a human fMRI study, NeuroImage 14
(2001) 802–816.
[61] B.M. Mazoyer, N. Tzourio, V. Frak, A. Syrota, N. Murayama, O.
Levrier, G. Salamon, S. Dehaene, L. Cohen, J. Mehler, The
cortical representation of speech, J. Cogn. Neurosci. 5 (1993)
467–479.
[62] M. Meyer, A.D. Friederici, D.Y. von Cramon, Neurocognition of
auditory sentence comprehension: event related fMRI reveals
sensitivity to syntactic violations and task demands, Brain Res.
Cogn. Brain Res. 9 (2000) 19–33.
[63] M. Meyer, K. Alter, A.D. Friederici, G. Lohmann, D.Y. von
Cramon, Functional MRI reveals brain regions mediating slow
prosodic modulations in spoken sentences, Hum. Brain Mapp. 17
(2002) 73–88.
[64] M. Meyer, K. Steinhauer, K. Alter, A.D. Friederici, D.Y. von
Cramon, Brain activity varies with modulation of dynamic pitch
variance in sentence melody, Brain Lang. 89 (2004) 277–289.
[65] E.B. Michael, T.A. Keller, P.A. Carpenter, M.A. Just, fMRI
investigation of sentence comprehension by eye and ear: modality
fingerprints on cognitive processes, Hum. Brain Mapp. 13 (2001)
239–252.
[66] P. Moore, H. von Leden, Dynamic variations of the vibratory pattern
in the normal larynx, Folia Phoniatr. 10 (1958) 205–238.
[67] P. Morosan, J. Rademacher, A. Schleicher, K. Amunts, T. Schor-
mann, K. Zilles, Human primary auditory cortex: cytoarchitectonic
subdivisions and mapping into a spatial reference system, Neuro-
Image 13 (2001) 684–701.
[68] R.-A. Müller, R.D. Rothermel, M.E. Behen, O. Muzik, T.J. Mangner,
H.T. Chugani, Receptive and expressive language activations for
sentences: a PET study, NeuroReport 8 (1997) 3767–3770.
[69] H. Mustovic, K. Scheffler, F. Di Salle, F. Esposito, J.G. Neuhoff, J.
Hennig, E. Seifritz, Temporal integration of sequential auditory
events: silent period in sound pattern activates human planum
temporale, NeuroImage 20 (2003) 429–434.
[70] C. Narain, S.K. Scott, R.J.S. Wise, S. Rosen, A. Leff, S.D. Iversen,
P.M. Matthews, Defining a left lateralized response specific to
intelligible speech using fMRI, Cereb. Cortex 13 (2003) 1362–1368.
[71] D.G. Norris, Reduced power multislice MDEFT imaging, J. Magn.
Reson. Imaging 11 (2000) 445–451.
[72] K. Poeck, Pathological laughter and crying, in: J.A.M. Frederiks
(Ed.), Handbook of Clinical Neurology, Elsevier, Amsterdam,
1985, pp. 219–225.
[73] D. Poeppel, A. Guillemin, J. Thompson, J. Fritz, D. Bavelier, A.R.
Braun, Auditory lexical decision, categorical perception, and FM
direction discrimination differentially engage left and right auditory
cortex, Neuropsychologia 42 (2004) 183–200.
[74] R. Provine, Laughter, Am. Sci. 84 (1996) 38–45.
[75] R.R. Provine, Laughter punctuates speech: linguistic, social and
gender contexts of laughter, Ethology 95 (1993) 291–298.
[76] R.R. Provine, Y.L. Yong, Laughter: a stereotyped human vocal-
ization, Ethology 89 (1991) 115–124.
[77] J. Rademacher, P. Morosan, T. Schormann, A. Schleicher, C. Werner,
H. Freund, K. Zilles, Probabilistic mapping and volume measurement
of human primary auditory cortex, NeuroImage 13 (2001) 669–683.
[78] F. Rivier, S. Clarke, Cytochrome oxidase, acetylcholinesterase, and
NADPH Diaphorase staining in human supratemporal and insular
cortex: evidence for multiple auditory areas, NeuroImage 6 (1997)
288–304.
[79] W. Ruch, P. Ekman, The expressive pattern of laughter, in: A.W.
Kaszniak (Ed.), Emotion, Qualia, and Consciousness, World Scien-
tific Publishing, Tokyo, 2001, pp. 426–443.
[80] K. Sander, H. Scheich, Auditory perception of laughing and crying
activates human amygdala regardless of attentional state, Brain Res.
Cogn. Brain Res. 12 (2001) 181–198.
[81] A.P. Saygin, F. Dick, S.W. Wilson, E. Bates, Neural resources for
processing language and environmental sounds, Brain 126 (2003)
928–945.
[82] S.K. Scott, I.S. Johnsrude, The neuroanatomical and functional orga-
nization of speech perception, Trends Neurosci. 26 (2003) 100–107.
[83] S.K. Scott, R.J.S. Wise, PET and fMRI studies of the neural basis of
speech perception, Speech Commun. 41 (2003) 23–34.
[84] S.K. Scott, R.J.S. Wise, The functional neuroanatomy of prelexical
processing in speech perception, Cognition 92 (2004) 13–45.
[85] S.K. Scott, C.C. Blank, S. Rosen, R.J.S. Wise, Identification of a
pathway for intelligible speech in the left temporal lobe, Brain 123
(2000) 2400–2406.
[86] E. Seifritz, F. Di Salle, D. Bilecen, E.W. Radu, K. Scheffler, Auditory
system. Functional magnetic resonance imaging, Neuroimaging Clin.
N. Am. 11 (2001) 275–296.
[87] K. Specht, J. Reul, Functional segregation of the temporal lobes into
highly differentiated subsystems for auditory perception: an auditory
rapid event related fMRI task, NeuroImage 20 (2003) 1944–1954.
[88] L.A. Stowe, M. Haverkort, F. Zwarts, Rethinking the neurological
basis of language, Lingua 115 (2005) 997–1045.
[89] T.M. Talavage, P.J. Ledden, R.R. Benson, B.R. Rosen, J.R. Melcher,
Frequency dependent responses exhibited by multiple regions in
human auditory cortex, Hear. Res. 150 (2000) 225–244.
[90] E. Tardif, S. Clarke, Intrinsic connectivity of human auditory areas: a
tracing study with DiI, Eur. J. Neurosci. 13 (2001) 1045–1050.
[91] J.-P. Thirion, Image matching as a diffusion process: an analogy with
Maxwell’s demons, Med. Image Anal. 2 (1998) 243–260.
[92] N. Tzourio-Mazoyer, B. Landeau, D. Papathanassiou, F. Crivello, O.
Etard, N. Delcroix, B. Mazoyer, M. Joliot, Automated anatomical
labeling of activations in SPM using a macroscopic anatomical
parcellation of the MNI MRI single subject brain, NeuroImage 15
(2002) 273–289.
[93] N. Tzourio-Mazoyer, G. Josse, F. Crivello, B. Mazoyer, Interindi-
vidual variability in the hemispheric organization for speech,
NeuroImage 21 (2004) 422–435.
[94] K. Ugurbil, M. Garwood, J. Ellermann, K. Hendrich, R. Hinke, X. Hu,
S.-G. Kim, R. Menon, H. Merkle, S. Ogawa, S.R. Ogawa, Imaging at
high magnetic fields: initial experiences at 4 T, Magn. Reson. Q. 9 (1993) 259.
[95] D. Van Lancker, J. Kreiman, Voice discrimination and recognition
are separate abilities, Neuropsychologia 25 (1987) 829–834.
[96] D. Van Lancker, J.J. Sidtis, The identification of affective prosodic
stimuli by left and right hemisphere damaged subjects: all errors are
not created equal, J. Speech Hear. Res. 35 (1992) 963–970.
[97] D. Van Lancker, J. Cummings, J. Kreiman, B. Dobkin, Phonagnosia:
a dissociation between familiar and unfamiliar voices, Cortex 24
(1988) 195–209.
[98] D. Van Lancker, J. Kreiman, J. Cummings, Voice perception deficits:
neuroanatomical correlates of phonagnosia, J. Clin. Exp. Neuro-
psychol. 11 (1989) 665–674.
[99] K. von Kriegstein, E. Eger, A. Kleinschmidt, A.L. Giraud,
Modulation of neural responses to speech by directing attention
to voices or verbal content, Brain Res. Cogn. Brain Res. 17 (2003)
48–55.
[100] K. von Kriegstein, A.L. Giraud, Distinct functional substrates along
the right superior temporal sulcus for the processing of voices,
NeuroImage 22 (2004) 948–955.
[101] J. Warren, T. Griffiths, Distinct mechanisms for processing spatial se-
quences and pitch sequences in the human auditory brain, J. Neurosci.
23 (2003) 5799–5804.
[102] J. Warren, B.A. Zielinski, G.G.R. Green, J.P. Rauschecker, T.
Griffiths, Perception of sound source motion by the human brain,
Neuron 34 (2002) 139–148.
[103] J. Warren, S. Uppenkamp, R. Patterson, T. Griffiths, Separating pitch
chroma and pitch height in the human brain, Proc. Natl. Acad. Sci.
U. S. A. 100 (2003) 10038–10043.
[104] C.M. Wessinger, J. VanMeter, B. Tian, J. Pekar, J.P. Rauschecker,
Hierarchical organization of the human auditory cortex revealed by
functional magnetic resonance imaging, J. Cogn. Neurosci. 13
(2001) 1–7.
[105] B. Wild, F.A. Rodden, W. Grodd, W. Ruch, Neural correlates of
laughter and humour, Brain 126 (2003) 2121–2138.
[106] R.J.S. Wise, Language systems in normal and aphasic human
subjects: functional imaging studies and inferences from animal
studies, Br. Med. Bull. 65 (2003) 95–119.
[107] R.J.S. Wise, S.K. Scott, S.C. Blank, C.J. Mummery, K. Murphy,
E.A. Warburton, Separate neural subsystems within ‘Wernicke’s
area’, Brain 124 (2001) 83–95.
[108] T. Zaehle, T. Wqstenberg, M. Meyer, L. Jancke, Evidence for rapid
auditory perception as the foundation of speech processing—a
sparse temporal sampling fMRI study, Eur. J. Neurosci. 20 (2004)
1447–1456.
[109] E. Zarahn, G. Aguirre, M. D’Esposito, Empirical analysis of BOLD
fMRI statistics: I. Spatially smoothed data collected under null
hypothesis and experimental conditions, NeuroImage 5 (1997)
179–197.
[110] R.J. Zatorre, P. Belin, Spectral and temporal processing in human
auditory cortex, Cereb. Cortex 11 (2001) 946–953.
[111] R. Zatorre, J. Binder, Functional and structural imaging of the human
auditory system, in: A. Toga, J. Mazziotta (Eds.), Brain Mapping:
The Systems, Academic Press, San Diego, 2000, pp. 365–402.
[112] R.J. Zatorre, P. Belin, V.B. Penhune, Structure and function of
auditory cortex: music and speech, TICS 6 (2002) 37–46.
[113] R.J. Zatorre, M. Bouffard, P. Ahad, P. Belin, Where is ‘where’ in the
human auditory cortex? Nat. Neurosci. 5 (2002) 905–909.
[114] R.J. Zatorre, M. Bouffard, P. Belin, Sensitivity to auditory object
features in human temporal neocortex, J. Neurosci. 24 (2004)
3637–3642.