The Neural Basis of Speech Perception – a view from functional imaging
Sophie Scott
Institute of Cognitive Neuroscience,
University College London
This approach to speech perception
• Speech is an auditory signal.
• It is possible to address the neural processing of speech within the framework of auditory cortical processing.
• This is not synonymous with the entire language system.
• If one is a skilled speaker of a language, then speech perception is obligatory.
Functional imaging
• Where neural activity occurs, blood is directed to that region.
• Neural activity is measured by tracking these changes in local blood flow.
• This measures mass synaptic activity.
• Temporal resolution is poor.
• The method is essentially a comparison of blood flow changes across conditions, so the baseline comparisons are critical (see the sketch below).
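As a toy illustration of this last point (not any specific analysis pipeline - the condition names, signal values and test below are invented), the subtraction logic can be sketched in a few lines of Python:

```python
import numpy as np
from scipy import stats

# Synthetic per-scan signal values for one region under a "speech"
# condition and a baseline condition (numbers are made up).
rng = np.random.default_rng(0)
speech = rng.normal(loc=1.2, scale=0.3, size=20)    # scans during speech
baseline = rng.normal(loc=1.0, scale=0.3, size=20)  # scans during baseline

# The "activation" is a difference of means, so it is only
# interpretable relative to the chosen baseline condition.
contrast = speech.mean() - baseline.mean()
t, p = stats.ttest_ind(speech, baseline)
print(f"contrast = {contrast:.3f}, t = {t:.2f}, p = {p:.3f}")
```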
Listening
Wise et al., Lancet, 2001
Neuroanatomy of speech
[Figure: regions for speech production and speech perception]
[Figure: schematic of macaque auditory cortex - core (A1, R, RT), belt (CM, ML, AL, RM, RTM, RTL) and parabelt (RP, CP) fields laid out along rostral-caudal and medial-lateral axes, with surrounding regions (Tpt, paAlt, Pro, TS1-TS3, insula, STS). Scott and Johnsrude, 2003, from Romanski et al., 1999]
[Figure: auditory core (A1, R, RT), belt (CL, ML, AL) and parabelt (CBP, RBP; caudal and rostral STG) fields and their projections to prefrontal cortex - dorsal prearcuate cortex (area 8a), dorsal principal sulcus (area 46), inferior convexity (area 12) and orbital polar cortex. From Kaas and Hackett, 1999]
[Figure: schematic comparison of monkey and human auditory cortex (Scott and Johnsrude, 2003). Tonotopy and bandwidth gradients characterise the core; conspecific vocalisations and spatial representations are processed in surrounding fields. Labels include HG, PT, belt (CB), parabelt (PB), association cortex (Assoc), Tpt, STS and STP, with anterior-posterior, medial-lateral and dorsal-ventral axes marked.]
• Sounds with harmonic structure against pure tones: Hall, Johnsrude et al., 2002
• Frequency modulated tones against unmodulated tones: Hall, Johnsrude et al., 2002
• Amplitude modulated noise against unmodulated noise: Giraud et al., 1999
• Spectral change against steady state sounds: Thivard et al., 2000
Hierarchical processing
• Structure in sound is computed beyond primary auditory cortex
• More complex structure (e.g. spectral change) processed further from PAC
• How does this relate to speech processing?
• speech (Sp)
• noise vocoded speech (VCo)
• rotated speech (RSp)
• rotated noise vocoded speech (RVCo)
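Noise vocoding (Shannon et al., 1995) discards the fine spectral detail of speech, keeping only the amplitude envelope in each of a small number of frequency bands; spectral rotation inverts the spectrum, preserving acoustic complexity while destroying intelligibility. A minimal vocoder sketch in Python - the band count, band edges and filter order below are illustrative assumptions, not the study's parameters:

```python
import numpy as np
from scipy.signal import butter, sosfilt, hilbert

def noise_vocode(signal, fs, n_channels=4, lo=70.0, hi=4000.0):
    """Split the signal into n_channels bands, extract each band's
    amplitude envelope, and use it to modulate band-limited noise."""
    edges = np.geomspace(lo, hi, n_channels + 1)  # log-spaced band edges
    rng = np.random.default_rng(0)
    out = np.zeros_like(signal, dtype=float)
    for low, high in zip(edges[:-1], edges[1:]):
        sos = butter(4, [low, high], btype="bandpass", fs=fs, output="sos")
        band = sosfilt(sos, signal)
        envelope = np.abs(hilbert(band))                  # band envelope
        noise = sosfilt(sos, rng.standard_normal(len(signal)))
        out += envelope * noise                           # envelope-modulated noise
    return out / np.max(np.abs(out))                      # normalise peak level
```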
[Figure: left hemisphere responses (Scott, Blank, Rosen and Wise, 2000). The contrast (Sp + VCo + RSp) - RVCo and the intelligibility contrast (Sp + VCo) - (RSp + RVCo) yield anterior peaks at -60 -4 -10 (Z = 6.6), -54 +6 -16 (Z = 4.7), -62 -12 -12 (Z = 5.5) and -64 -38 0 (Z = 5.7); bar plots show effect sizes for Sp, VCo, RSp and RVCo at each peak.]
[Figure: right hemisphere response (Scott, Blank, Rosen and Wise, 2000). The contrast (Sp + RSp) - (VCo + RVCo) peaks at +66 -12 0 (Z = 6.7).]
Intelligibility
Plasticity within this system
Naïve subjects were scanned before they could understand noise vocoded speech, then they were trained, then scanned again.
Activity to noise vocoded speech after the training period, relative to activity to noise vocoded speech before the training period. Narain, Wise, Rosen, Matthews, Scott, under review.
Flexibility in speech perception: learning to understand noise vocoded speech
As well as left lateralised STS, there is involvement of left premotor cortex and the left anterior thalamus (which receive projections from the belt and parabelt).
Spectrograms of the stimuli
[Figure: spectrograms of speech vocoded with 16, 8, 4, 3, 2 and 1 channels, and of rotated speech with 16 and 3 channels (16R, 3R).]
Intelligibility - behavioural data
[Figure: responses correlating with intelligibility (Scott, Rosen, Lang and Wise, 2006). Right hemisphere peak at x=64 y=-4 z=-2 (Z=5.96); left hemisphere peaks at x=-48 y=-16 z=-16 (Z=4.73), x=-64 y=-28 z=8 (Z=4.52) and x=-62 y=-10 z=8 (Z=5.6). Plots show responses across the 1, 2, 3, 4, 8, 16, 3R and 16R conditions.]
[Figure: the auditory cortex schematic and contrasts shown earlier (Scott and Johnsrude, 2003; Hall, Johnsrude et al., 2002; Giraud et al., 1999; Thivard et al., 2000), now overlaid with the peak responses to intelligibility (Scott et al., 2006).]
Speech specific processing
• Does not occur in primary auditory cortex
• Begins early in auditory cortex - in areas that also respond to AM
• Moving anteriorly along the STS, responses become less sensitive to acoustic structure - this resembles the behavioural profile
Speech comprehension - The role of context
• e.g., words recognised more easily in sentences
• “The ship sailed the sea” > “Paul discussed the dive”.
• Can we identify the neural basis of this contextual modulation of speech comprehension?
(Miller et al., 1951; Boothroyd and Nittrouer, 1988; Grant and Seitz, 2000;
Stickney and Assmann, 2001; Davis et al., 2005)
(noise vocoding: Shannon et al., 1995; predictability: Kalikow et al., 1977)
Low predictability ('…Sue was interested in the bruise…'): intelligibility increases log-linearly with the number of channels.
High predictability ('…He caught the fish in his net…'): context exerts its influence at intermediate numbers of channels.
(cf. e.g. Binder et al. 2000; Scott et al., 2000; Davis & Johnsrude 2003; Zekveld et al., 2006)
Bottom-up processes: correlations with number of channels
RFX, p<0.005 uncorrected, k>30. Obleser, Wise, Dresner, & Scott, 2007
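The parametric logic can be sketched as follows: a region's mean signal (the values here are invented for illustration) is correlated with the log of the number of vocoder channels:

```python
import numpy as np
from scipy import stats

channels = np.array([1, 2, 3, 4, 8, 16])
signal = np.array([0.10, 0.35, 0.55, 0.65, 0.90, 1.10])  # made-up regional means

# Correlate signal with log2(channels): a "bottom-up" parametric effect.
r, p = stats.pearsonr(np.log2(channels), signal)
print(f"r = {r:.2f}, p = {p:.4f}")
```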
Left-hemispheric array of brain regions when context affects comprehension
Lateral Prefrontal (BA 8)
Posterior Cingulate (BA 30)
Medial Prefrontal (BA 9)
Angular Gyrus (BA 39)
Ventral IFG (BA 47)
RFX, p<0.005 uncorrected, k>30. Obleser, Wise, Dresner, & Scott, 2007
Findings
• A range of brain areas outwith auditory cortex contribute to ‘top down’ semantic influences on speech perception
• Further studies will be able to dissociate the contributions of different linguistic factors
Words are not the only things we say
Non-speech sounds?
[Figure: section at x=54. Regions in red respond to noises and rotated noises; regions in yellow respond to noises and rotated noises.]
[Figure: as shown earlier - right hemisphere response for (Sp + RSp) - (VCo + RVCo), peak at +66 -12 0 (Z = 6.7), with effect sizes for Sp, VCo, RSp and RVCo.]
What drives lateral asymmetry?
• Previous studies have not generally used 'speech-like' acoustic modulations
• We aimed to manipulate speech stimuli to vary the amplitude and spectral properties of speech independently
• Control for intelligibility
• Do we see additive effects of amplitude and spectral modulations?
• Are these left lateralised?
• Steady spectrum, steady amplitude (Flat)
• Steady spectrum, varying amplitude (AM)
• Varying spectrum, steady amplitude (SpM)
• Varying spectrum, varying amplitude (SpMAM)
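A minimal sketch of this 2x2 design, assuming a simple tone carrier rather than the study's speech-derived stimuli (the modulation rates and frequency ranges are invented):

```python
import numpy as np

fs = 16000
t = np.arange(fs) / fs  # 1 second of samples

env_steady = np.ones_like(t)
env_varying = 0.5 * (1 + np.sin(2 * np.pi * 3 * t))       # 3 Hz amplitude modulation

freq_steady = np.full_like(t, 500.0)                       # fixed 500 Hz
freq_varying = 500.0 + 400.0 * np.sin(2 * np.pi * 2 * t)   # 2 Hz spectral sweep

def make_stimulus(freq, env):
    """Phase-continuous tone with a given instantaneous frequency and envelope."""
    phase = 2 * np.pi * np.cumsum(freq) / fs
    return env * np.sin(phase)

stimuli = {
    "Flat":  make_stimulus(freq_steady, env_steady),
    "AM":    make_stimulus(freq_steady, env_varying),
    "SpM":   make_stimulus(freq_varying, env_steady),
    "SpMAM": make_stimulus(freq_varying, env_varying),
}
```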
Ideal additive effects
[Figure: predicted effect sizes - lowest for flat amplitude and spectrum, similar responses to AM and SpM alone, and significantly greater activation for stimuli with both AM and SpM.]
Additive effects
[Figure: observed effect sizes across the Flat, AM, SpM and SpMAM conditions. PET scanning, 16 runs, N=13, thresholded at p<0.0001, 40 voxels.]
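The additivity prediction can be stated numerically: if amplitude and spectral modulation contribute independently, the SpMAM response should equal AM + SpM - Flat, and any departure is an interaction. A toy check with invented effect sizes:

```python
# Invented effect sizes for the four conditions (illustration only).
flat, am, spm, spmam = 0.2, 0.8, 0.9, 1.5

predicted_spmam = am + spm - flat      # purely additive prediction
interaction = spmam - predicted_spmam  # zero if effects are strictly additive
print(f"predicted = {predicted_spmam:.2f}, interaction = {interaction:.2f}")
```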
But…
• Is there a problem - were these stimuli really processed as speech?
• To address this, 6 of the 13 subjects were pretrained on speech exemplars, and the speech stimuli were included as a 5th condition.
[Figure: the four modulation conditions (A-D) plus speech (E).]
Speech conditions
[Figure: responses across Flat, AM, SpM, SpMAM and speech. N=6, thresholded at p<0.0001, 40 voxels.]
Asymmetries in speech perception
• Exist!
• Are not driven by simple properties of the speech signal
• Right - preferentially processes speech-like sounds - voices?
• Left - processes linguistically relevant information
Posterior auditory areas
• In primates, medial posterior areas show auditory and tactile responses
• What do these areas do in speech processing in humans?
Speaking and mouthing
This region, in the left posterior temporal-parietal junction, responds when subjects repeat a phrase, mouth the phrase silently, or go 'uh uh', relative to mentally rehearsing the phrase.
Wise, Scott, Blank, Murphy, Mummery and Warburton, Brain, 2001
Listening over silence
Amount of DAF (0, 50, 125, 200 ms)
[Figure: DAF peak on the right; responses plotted across 0, 50, 125 and 200 ms delays.]
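Delayed auditory feedback (DAF) simply returns the speaker's voice over headphones after a fixed lag. A minimal sketch of the manipulation (function name and parameters are illustrative, not from the study):

```python
import numpy as np

def delayed_feedback(signal, fs, delay_ms):
    """Return the signal delayed by delay_ms, as it would be
    played back over headphones in a DAF condition."""
    n_delay = int(round(fs * delay_ms / 1000.0))
    return np.concatenate([np.zeros(n_delay), signal])[: len(signal)]

# The conditions used here: 0, 50, 125 and 200 ms of delay.
# feeds = {ms: delayed_feedback(mic, fs, ms) for ms in (0, 50, 125, 200)}
```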
Neural basis of speech perception
• Hierarchical processing of sound in auditory cortex
• The anterior 'what' pathway is important in the perceptual processing of speech
• Activity in this system can be modulated by top down linguistic factors
• There are hemispheric asymmetries in speech perception - the left is driven by phonetic, lexical and linguistic properties; the right is driven by pitch variation, emotion and indexical properties
• There are sensory motor links in posterior auditory areas - part of a 'how' pathway?
[Figure: 'what' and 'where' pathways, and a revised scheme adding a 'how' pathway. Scott, in press; Scott, Current Opinion in Neurobiology, 2005]
Charlotte Jacquemot
Frank Eisner
Disa Sauter
Carolyn McGettigan
Narly Golestani
Jonas Obleser
Sophie Scott
Stuart Rosen
Richard Wise
Charvy Narain
Andrew Faulkner
Hideki Takaso