Harvard-MIT Division of Health Sciences and Technology
HST.722J: Brain Mechanisms for Hearing and Speech
Course Instructor: Bertrand Delgutte
Quantitative approaches to the study of neural coding
©Bertrand Delgutte, 2003-2005
The Problem
• Neural and behavioral data have different forms that are not easily comparable
 – In some cases, data from different species and states (awake vs. anesthetized) are being compared
 – Even when neural and behavioral data have similar forms (e.g. a threshold), they both depend on arbitrary criteria
• Both neural and behavioral responses are probabilistic: they are not identical for different repetitions of the same stimulus. Therefore, they provide a noisy (imprecise) representation of the stimulus.
• Are there quantitative tools to describe the precision of stimulus representation in neural and behavioral data?
Questions that can be addressed by quantitative methods
• Does a particular class of neurons provide sufficient information to account for performance in a particular behavioral task?
• Does a particular neural code (e.g. rate-place vs. temporal) provide sufficient information for the task?
• Which stimulus feature is a particular neuron or neural population most sensitive to? Are there “feature detector” neurons?
• Is there stimulus information in the correlated firings of groups of neurons?
• How efficient is the neural code?
Outline
• Signal detection theory (a.k.a. ideal observer analysis)
 – Single neuron
 – Combined performance for a neural population
• Shannon information theory
Neural variability limits detection performance
• When a sound stimulus is presented repeatedly, the number of spikes recorded from an auditory neuron differs on each trial. The blue and red surfaces are model spike count distributions for two pure tones differing in intensity by 3 dB.
• The overlap in the spike count distributions limits our accuracy in identifying which of the two stimuli was presented based on the responses of this neuron.
• A measure of the separation between the two distributions (and therefore of the neuron’s ability to discriminate) is the discriminability index d’, which is the difference in means of the two distributions divided by their standard deviation.
• The just noticeable difference (JND) or difference limen (DL) is often taken as the intensity increment for which d’ = 1. This criterion corresponds to 76% correct in a two-interval, two-alternative psychophysical experiment.
Delgutte (unpublished)
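As a quick numerical check of the d′ criterion above, here is a minimal Python sketch. The Gaussian spike-count statistics (means and SD) are illustrative assumptions, not the lecture's model data; the key identity is PC = Φ(d′/√2) for a two-interval, two-alternative task.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical spike-count statistics for two tones 3 dB apart
mu0, mu1, sigma = 50.0, 56.0, 6.0        # assumed means and common SD

d_prime = (mu1 - mu0) / sigma            # d' = difference in means / SD
pc = norm.cdf(d_prime / np.sqrt(2))      # percent correct in 2I-2AFC
print(f"d' = {d_prime:.2f}, PC = {pc:.2%}")   # d' = 1 gives PC ~ 76%
```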
Conditional Probability and Bayes’ Rule
Conditional Probability:
$$P(S|R) = \frac{P(S,R)}{P(R)}$$

Statistically Independent Events:
$$P(S,R) = P(S)\,P(R) \;\Rightarrow\; P(S|R) = P(S)$$

Bayes' Rule:
$$P(S|R) = \frac{P(R|S)\,P(S)}{P(R)}$$
Bayesian Optimal Decision and the Likelihood Ratio
The Problem:
• Choose between two alternatives (stimuli) S0 and S1, with prior probabilities P(S0) and P(S1), given the observation (neural response) R, so as to minimize the probability of error.
• The conditional probabilities (stimulus-response relationships) P(R|S0) and P(R|S1) are known.

Bayes' Optimal Decision Rule:
• Given a specific value of R, choose the alternative which maximizes the posterior probability:
$$P(S_1|R) \;\underset{S_0}{\overset{S_1}{\gtrless}}\; P(S_0|R)$$

Equivalent Likelihood Ratio Test (LRT):
$$LR = \frac{P(R|S_1)}{P(R|S_0)} \;\underset{S_0}{\overset{S_1}{\gtrless}}\; \frac{P(S_0)}{P(S_1)}$$
LRT separates priors (under experimenter control) from neuron’s stimulus-response characteristic.
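A minimal sketch of the LRT for Poisson spike counts (the means λ0, λ1 and the priors are made-up values). Because the Poisson log likelihood ratio is monotone increasing in the count R, the test reduces to a simple count threshold, as noted on the next slide:

```python
import numpy as np

lam0, lam1 = 20.0, 30.0        # assumed mean counts under S0 and S1
p0, p1 = 0.5, 0.5              # prior probabilities (experimenter's choice)

def log_lr(R):
    """Poisson log likelihood ratio, log[P(R|S1)/P(R|S0)]."""
    return R * np.log(lam1 / lam0) - (lam1 - lam0)

def decide(R):
    """Bayes-optimal decision: choose S1 when LR exceeds P(S0)/P(S1)."""
    return "S1" if log_lr(R) > np.log(p0 / p1) else "S0"

# Equivalent count threshold gamma: solve log LR(gamma) = log(p0/p1)
gamma = (np.log(p0 / p1) + lam1 - lam0) / np.log(lam1 / lam0)
print(decide(22), decide(28), f"gamma = {gamma:.1f}")   # threshold ~24.7
```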
Properties of the Likelihood Ratio
• Separates computation (left side) from biases and costs (prior probabilities, right)
• If the conditional probabilities P(R|S_i) are either Poisson or Gaussian with equal variances, the LRT reduces to comparing the response R with a threshold γ:
$$R \;\underset{S_0}{\overset{S_1}{\gtrless}}\; \gamma$$
• The LRT works not only for scalar observations (e.g. a spike count from a single neuron), but also for multidimensional observations (e.g. temporal discharge patterns and ensembles of neurons).
• The decision rule is invariant to monotonic transformations (e.g. logarithm).
• For independent observations (e.g. simultaneous observations from multiple neurons, or multiple observation intervals from the same neuron), the log likelihood ratio is additive:
$$\log LR(R_1, R_2, \ldots, R_N) = \sum_{i=1}^{N} \log LR(R_i)$$
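To illustrate the additivity of the log likelihood ratio across independent observations, here is a small simulation (all parameters are illustrative assumptions): the per-neuron log LRs of Poisson counts are summed, and the error rate is compared for one versus eight neurons.

```python
import numpy as np

rng = np.random.default_rng(1)
lam0, lam1, trials = 20.0, 24.0, 20000   # assumed mean counts per neuron
w = np.log(lam1 / lam0)                  # per-spike log-LR weight

def p_error(n_neurons):
    c0 = rng.poisson(lam0, (trials, n_neurons))   # counts under S0
    c1 = rng.poisson(lam1, (trials, n_neurons))   # counts under S1
    # Summed log LR > 0 (equal priors) <=> total count above a threshold
    thr = n_neurons * (lam1 - lam0) / w
    return 0.5 * ((c0.sum(axis=1) > thr).mean()       # false alarms
                  + (c1.sum(axis=1) <= thr).mean())   # misses

print(p_error(1), p_error(8))   # pooling 8 neurons cuts the error rate
```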
Detection and False Alarm Probabilities
• The neural response (in general, the LR) is compared to a criterion γ to make a decision about which stimulus was presented
• Two types of errors: "misses" and "false alarms"
 – P_F is the probability of false alarm
 – P_D is the probability of detection; P_M = 1 − P_D is the probability of a miss
$$P(\text{Error}) = P(S_0)\,P_F + P(S_1)\,(1 - P_D)$$
(Figure: conditional densities P(R|S0) and P(R|S1) with the criterion γ; the areas beyond γ give P_F and P_D. Poisson example.)
Receiver Operating Characteristic (ROC)
(Figure: conditional probabilities P(R|S0) and P(R|S1), Poisson example, and the resulting ROC curves tracing P_D against P_F as the criterion is swept.)
• The area under the ROC curve gives a distribution-free measure of performance
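A short sketch (with assumed Poisson means) that traces the ROC by sweeping a count criterion and integrates the area under it:

```python
import numpy as np
from scipy.stats import poisson

lam0, lam1 = 10.0, 15.0              # assumed mean counts under S0 and S1
gammas = np.arange(0, 61)            # count criteria: respond S1 if R >= gamma
PF = poisson.sf(gammas - 1, lam0)    # P(R >= gamma | S0): false-alarm rate
PD = poisson.sf(gammas - 1, lam1)    # P(R >= gamma | S1): detection rate

# PF decreases as gamma grows, so reverse the arrays for the trapezoid rule
auc = np.trapz(PD[::-1], PF[::-1])
print(f"ROC area = {auc:.3f}")       # distribution-free performance measure
```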
d’ as a measure of performance
• If the conditional probabilities P(R|S0) and P(R|S1) are Gaussian with equal variances, then d’ = ∆r/σ completely determines the performance (the ROC curve). The probability of a correct response in a two-interval task is
$$P_C = \frac{1}{\sqrt{2\pi}} \int_{-d'/\sqrt{2}}^{\infty} e^{-z^2/2}\, dz$$
• Many probability distributions (including the Poisson) approach a Gaussian when the mean response becomes moderately large.
• If so, performance in non-Gaussian cases can be approximated by d’. If, as in the Poisson case, the variances are unequal under the two alternatives, they can be averaged.
• d’ is a poor measure of performance when the number of spikes is very small or when the two stimuli are widely separated.
Visual discrimination of random-dot patterns by monkeys and cortical neurons
• Single-unit recordings from Area MT (middle temporal) of awake macaques
• Single-unit responses and behavioral performance were recorded simultaneously
• Performance of typical neuron matches behavioral performance
Figures removed due to copyright reasons. Please see: Newsome, W. T., K. H. Britten, C. D. Salzman, and J. A. Movshon. "Neuronal mechanisms of motion perception." Cold Spring Harbor Symp Quant Biol 55 (1990): 697-705.
Forward masking in the auditory nerve?
Figures removed due to copyright reasons.
Please see: Relkin, E. M., and C. W. Turner. "A reexamination of forward masking in the auditory nerve." J Acoust Soc Am 84, no. 2 (Aug 1988): 584-91.
Psychophysical masked thresholds grow much faster with masker level than neural thresholds. Maximum masking can reach 30-50 dB.
Performance based on spike count from single AN fiber severely deviates from Weber’s law
(Figure: discharge rate (sp/s) and intensity DL (dB) as functions of intensity (dB SPL, 0-100 dB), for an auditory-nerve fiber and for a hypothetical neuron verifying Weber's law. Poisson statistics assumed.)
The “lower envelope principle”
Somatosensory: detection of sinusoidal vibration (Mountcastle, 1972). Auditory: pure tone detection.
Figures removed due to copyright reasons. Please see: Delgutte, B. "Physiological models for basic auditory percepts." In Auditory Computation, edited by H. Hawkins and T. McMullen. New York, NY: Springer-Verlag, 1996, pp. 157-220.
Psychophysical performance is determined by the most sensitive neurons in the population.
Optimal pooling of information across neurons
• Performance in discriminating two stimuli can be improved by combining information across neurons. Specifically, if the spike counts from N neurons are either Poisson or Gaussian and statistically independent, the optimum combination rule is to form a weighted sum of the spike counts, with weights
$$w_i = \frac{\Delta r_i}{\sigma_i^2}$$
The discriminability index for this optimum combination is given by $d'^2 = \sum_i d_i'^2$.
• The structure of the optimum detector is identical to that of a single-layer perceptron in artificial neural networks. The weights can be interpreted as the strengths of synaptic connections, and the threshold device as the threshold for all-or-none spike discharges in a postsynaptic neuron.
• When the responses of the different neurons are not statistically independent, the benefits of combining information may be better or worse than in the independent case, depending on the nature of the correlations.
Delgutte (unpublished)
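A minimal numerical sketch of the combination rule above (the ∆r and σ values are made up): under the optimal weighting, the per-neuron d′ values add in quadrature.

```python
import numpy as np

dr = np.array([3.0, 2.0, 1.0])       # assumed mean-count differences
sd = np.array([4.0, 2.5, 2.0])       # assumed SDs (equal under both stimuli)

w = dr / sd**2                            # optimal weights w_i = dr_i / var_i
d_single = dr / sd                        # d' of each neuron alone
d_pooled = np.sqrt(np.sum(d_single**2))   # d'^2 = sum of d'_i^2
print(w, d_single, f"pooled d' = {d_pooled:.2f}")
```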
Rate-Place Model for Intensity Discrimination (Siebert)
• Siebert (1968) developed the first model for predicting the performance of a sensory system based on the activity in primary neurons. His model incorporated all the key features of AN activity known at the time, including logarithmic cochlear frequency map, cochlear tuning, saturating rate-level functions and Poisson discharge statistics.
• The model predicts constant performance in pure-tone intensity discrimination over a wide range of intensities (Weber's law) by relying on the unsaturated fibers on the skirts of the cochlear excitation pattern.
• Psychophysical experiments (Viemeister, 1983) have since ruled out this model, because listeners show good performance even when masking noise restricts information to a narrow band around the tone frequency.
Spread of excitation to remote cochlear places makes model predict Weber’s Law…
(Figure: discharge rate (sp/s) vs. characteristic frequency (100 Hz - 10,000 Hz) excitation patterns for tones at 50 dB and 80 dB SPL.)
…but performance severely degrades in band-reject noise.
(Figure: discharge rate (sp/s) vs. characteristic frequency (Hz) excitation pattern in band-reject noise.)
Rate vs. Temporal Codes
• Temporal code can convey more information than rate code
• Distinction between rate and temporal codes not that easy:
• Rate has to change when new stimulus occurs. How fast can it change and still be called “rate code”?
• Poisson process completely specified by instantaneous firing rate. If instantaneous rate tracks stimulus waveform, is it a rate or a temporal code?
From: Rieke, Fred, David Warland, Rob de Ruyter van Steveninck, and William Bialek. Spikes: Exploring the Neural Code. Cambridge, MA: MIT Press/Bradford Books, 1997. © Used with permission.
Rate-Place Code
• Narrow tuning functions
• Neurons are topographically organized to form a map of the stimulus parameter
• Each stimulus activates a restricted population of neurons
• The stimulus is estimated from the location of maximum activity along the map
• Examples:
 – Retinal map of space
 – Tonotopic map
 – ITD map in the barn owl (Jeffress model)?
A distributed code: the population vector (Georgopoulos, 1986)
• Broad tuning functions around a preferred direction
• A topographic map is not necessary
• Each stimulus activates the entire neural population to some degree
• Each neuron's response can be seen as a vector, with angle equal to the neuron's best direction and magnitude equal to its firing rate
• The vector sum across the neural population accurately points to the stimulus direction
• Example: direction of arm movement in motor cortex.
• Could apply to any angular dimension, e.g. sound azimuth.
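A small sketch of the population-vector readout (cosine tuning and uniformly spaced preferred directions; all numbers are illustrative assumptions):

```python
import numpy as np

prefs = np.linspace(0, 2 * np.pi, 16, endpoint=False)   # preferred directions
stim = np.deg2rad(75.0)                                  # true stimulus direction
rates = np.maximum(0.0, 10 + 20 * np.cos(stim - prefs))  # broad cosine tuning

# Each neuron contributes a vector along its preferred direction, with
# magnitude equal to its firing rate; the vector sum points at the stimulus.
vx = np.sum(rates * np.cos(prefs))
vy = np.sum(rates * np.sin(prefs))
print(f"decoded direction = {np.degrees(np.arctan2(vy, vx)):.1f} deg")  # ~75
```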
Entropy of a Random Variable
Definition:
$$H(X) = -\sum_{x \in X} p(x)\,\log_2 p(x)$$
• Entropy is expressed in bits (binary choices)
• For a uniform probability distribution (p(x) = 1/N for all x), the entropy is the logarithm (base 2) of the number N of possible values of X. All other distributions have a lower entropy.
• The entropy represents the number of independent binary choices required to uniquely specify the value of a random variable, i.e. it quantifies the “uncertainty” about that variable. It is also the average length of the binary string of 1’s and 0’s required to encode the variable.
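A two-line check of the definition (the distributions are chosen arbitrarily): the uniform distribution over N values attains log2 N bits, and any other distribution comes out lower.

```python
import numpy as np

def entropy_bits(p):
    """H(X) = -sum p(x) log2 p(x), with 0*log(0) treated as 0."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

print(entropy_bits([0.25, 0.25, 0.25, 0.25]))   # uniform, N=4 -> 2.0 bits
print(entropy_bits([0.7, 0.1, 0.1, 0.1]))       # non-uniform -> ~1.36 bits
```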
Entropy rates in sounds
For a continuous signal source, the entropy rate (in bits/s) is
$$R = \lim_{T \to \infty} \frac{1}{T} H(T)$$
• CD-quality audio: 44,100 Hz × 16 bits × 2 channels = 1.4 Mb/s
• MP3: 32-128 kb/s
• Telephone: 30 kb/s
• Vocoder: 2.4-9.6 kb/s
• Nonsense speech: 8 phonemes/s × log2(32) bits/phoneme = 40 b/s (an overestimate, because phonemes are not equiprobable)
• This lecture: 2 bits/3600 s ≈ 0.0006 b/s
• What about neuron firings?
Entropy rate of spike trains
Spike times (binomial distribution, MacKay & McCulloch, 1952):
$$R_T \approx r \,\log_2\!\left(\frac{e}{r\,\Delta t}\right)$$
where r is the average firing rate and ∆t the temporal resolution. Example: R_T = 950 bits/s (3.2 bits/spike) for r = 300 spikes/s, ∆t = 1 ms.

Spike counts (exponential distribution – optimal):
$$R_R = \left[\log_2(1+\bar{n}) + \bar{n}\,\log_2(1 + 1/\bar{n})\right] / \,T$$
where n̄ = rT is the mean spike count and T is the recording time. Example: R_R = 26 bits/s (0.1 bits/spike) for r = 300 spikes/s, T = 300 ms.
The entropy rate of a spike train gives an upper bound on the information about the stimulus that can be transmitted by the spike train.
Figures removed due to copyright reasons. Please see: Rieke, F., D. A. Bodnar, and W. Bialek. "Naturalistic stimuli increase the rate and efficiency of information transmission by primary auditory afferents." Proc Biol Sci 262, no. 1365 (December 22, 1995): 259-65. (Courtesy of The Royal Society. Used with permission.)
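Plugging the lecture's example numbers into the two formulas above (a direct check, no new assumptions):

```python
import numpy as np

r, dt = 300.0, 1e-3                       # firing rate (sp/s), resolution (s)
R_T = r * np.log2(np.e / (r * dt))        # spike-time entropy rate
print(f"R_T = {R_T:.0f} bits/s = {R_T / r:.1f} bits/spike")   # ~954, ~3.2

T = 0.3                                   # counting window (s)
n = r * T                                 # mean spike count
R_R = (np.log2(1 + n) + n * np.log2(1 + 1 / n)) / T
print(f"R_R = {R_R:.0f} bits/s = {R_R / r:.2f} bits/spike")   # ~26, ~0.09
```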
www.bsscommunitycollege.in www.bssnewgeneration.in www.bsslifeskillscollege.in
24www.onlineeducation.bharatsevaksamaj.net www.bssskillmission.in
WWW.BSSVE.IN
Conditional Entropy and Mutual Information
(Venn diagram: H(S) and H(R) overlap in the mutual information I(S,R); the non-overlapping parts are H(S|R) and H(R|S).)
• The entropy H(S) represents the uncertainty about the stimulus in the absence of any other information
• The conditional entropy H(S|R) represents the remaining stimulus uncertainty after the neural response has been measured
• I(S,R) = H(S)–H(S|R) is the mutual information between S and R; it represents the reduction in uncertainty achieved by measuring R
• If S and R are statistically independent, then I(S,R)=0 and H(S,R)=H(S)+H(R)
• By symmetry, I(S,R) = I(R,S) = H(R)–H(R|S)
Entropy and Information:
• Bayes' theorem: $p(s,r) = p(s|r)\,p(r)$
• Entropy of S: $H(S) = -\sum_i p(s_i)\,\log_2 p(s_i)$
• Joint entropy of R and S: $H(R,S) = -\sum_i \sum_j p(s_i, r_j)\,\log_2 p(s_i, r_j)$
• Conditional entropy of S given R, or stimulus equivocation: $H(S|R) = -\sum_j p(r_j) \sum_i p(s_i|r_j)\,\log_2 p(s_i|r_j)$
• Conditional entropy of R given S, or neuronal noise: $H(R|S) = -\sum_j p(s_j) \sum_i p(r_i|s_j)\,\log_2 p(r_i|s_j)$
• Equivalent forms for the average information:
 – I(R,S) = H(R) − H(R|S)
 – I(R,S) = H(S) − H(S|R)
 – I(R,S) = H(R) + H(S) − H(R,S)
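The identities above are easy to verify numerically on a toy joint distribution p(s, r) (the values are chosen arbitrarily):

```python
import numpy as np

def H(p):
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

p_sr = np.array([[0.30, 0.10, 0.05],     # p(s, r): rows = stimuli,
                 [0.05, 0.10, 0.40]])    # columns = responses
p_s, p_r = p_sr.sum(axis=1), p_sr.sum(axis=0)

I = H(p_r) + H(p_s) - H(p_sr)            # I = H(R) + H(S) - H(R,S)
H_R_given_S = H(p_sr) - H(p_s)           # chain rule: H(R,S) = H(S) + H(R|S)
H_S_given_R = H(p_sr) - H(p_r)
assert np.isclose(I, H(p_r) - H_R_given_S)
assert np.isclose(I, H(p_s) - H_S_given_R)
print(f"I(S,R) = {I:.3f} bits")
```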
Direct method for computing mutual information between stimulus and neural response
• The most obvious method for measuring I(S,R) would be to subtract H(S|R) from H(S). But H(S|R) is hard to measure, because it requires estimating the stimulus from the neural response.
• Trick: by symmetry, I(S,R) is also H(R) − H(R|S). H(R|S) is the entropy of the part of the neural response that is NOT predictable from the stimulus, i.e. the noise in the response.
Method:
• Present stimulus set S with probability P(S). Measure the neural response to many presentations of each stimulus.
• Estimate P(R) for the entire stimulus set and compute H(R) from P(R).
• To estimate the noise in the response, average the responses to all presentations of the same stimulus, and subtract the average from the response on each trial. Compute H(R|S) from the estimated noise distribution.
• Subtract H(R|S) from H(R).
• Assumptions about the noise distribution (e.g. Gaussian) can be made to simplify the estimation of probabilities.
• In practice, this method becomes prohibitive in its data requirements for large dimensionality, e.g. when computing information for a population of neurons.
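A minimal simulation of the direct method for the simplest case (two stimuli with assumed Poisson spike counts): estimate H(R) from the pooled response histogram and H(R|S) from the per-stimulus histograms, then subtract.

```python
import numpy as np

rng = np.random.default_rng(2)

def hist_entropy(x, bins):
    p, _ = np.histogram(x, bins=bins)
    p = p[p > 0] / p.sum()
    return -np.sum(p * np.log2(p))

lams, trials = [5.0, 12.0], 5000          # assumed mean counts; repeats/stimulus
counts = [rng.poisson(lam, trials) for lam in lams]
bins = np.arange(0, 31)

H_R = hist_entropy(np.concatenate(counts), bins)                # total entropy
H_R_given_S = np.mean([hist_entropy(c, bins) for c in counts])  # noise entropy
print(f"I(S,R) ~ {H_R - H_R_given_S:.2f} bits (at most 1 bit for 2 stimuli)")
```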
Classic Example (Shannon): Information Rate of a Gaussian Channel
S(t) → [+ N(t)] → R(t)
$$R_{INFO} = \int_0^W \log_2\!\left(1 + \frac{S(f)}{N(f)}\right) df$$
Both the stimulus S(t) and the additive noise N(t) have Gaussian probability distributions, which are fully characterized by the power spectra S(f) and N(f).
Figures removed due to copyright reasons. Please see: Borst, A., and F. E. Theunissen. "Information theory and neural coding." Figure 2 in Nat Neurosci 2, no. 11 (Nov 1999): 947-57.
The information rate is entirely determined by the available bandwidth W and the signal-to-noise ratio S/N.
Telephone channel: S/N = 30 dB, W = 3000 Hz ⇒ R_INFO ≈ 30 kb/s
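The telephone figure follows directly from the formula with flat spectra:

```python
import numpy as np

snr = 10 ** (30 / 10)                # S/N = 30 dB as a power ratio
W = 3000.0                           # bandwidth (Hz)
print(f"{W * np.log2(1 + snr) / 1e3:.1f} kb/s")   # ~29.9 kb/s
```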
Natural and synthetic songs are coded more efficiently than noise in zebra finch auditory neurons
Neural recordings from 3 locations:
• MLd: auditory midbrain
• Field L: primary cortex
• CM: secondary cortex involved in song processing
3 types of stimuli:
• Song: 20 natural songs from male zebra finches
• Syn-song: mimics the spectral and temporal modulations in song
• ML-noise: noise with a limited range of modulations
Hsu, A., S. M. Woolley, T. E. Fremouw, and F. E. Theunissen. "Modulation power and phase spectrum of natural sounds enhance neural encoding performed by single auditory neurons." Figures 1A, 4, and 5 in J Neurosci 24, no. 41 (Oct 13, 2004): 9201-11. (Copyright 2004 Society for Neuroscience. Used with permission.)
Using the stimulus reconstruction method to estimate stimulus information in spike trains
• Computing I(S,R) directly from the definition often requires too much data.
• Data processing theorem: if Z = f(R), then I(S,Z) ≤ I(S,R). "Data processing cannot increase information."
 – Special case: if Ŝ is an estimate of S based on the neural response R, then I(S,Ŝ) ≤ I(S,R).
Method:
• Use a Gaussian stimulus S(t) and compute a linear estimate Ŝ(t) from the neural response.
• Define the noise as N(t) = S(t) − Ŝ(t), assumed to be Gaussian.
• Compute the power spectra of N(t) and Ŝ(t).
• Use the formula for the information rate of a Gaussian channel to estimate R_INFO.
• The method gives a lower bound on the information rate. The better the stimulus estimate and the Gaussian assumption, the tighter the bound.
From: Rieke, Fred, David Warland, Rob de Ruyter van Steveninck, and William Bialek. Spikes: Exploring the Neural Code. Cambridge, MA: MIT Press/Bradford Books, 1997. © Used with permission.
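A sketch of the final step only (the spectra below are stand-ins; in practice they come from the measured reconstruction error): apply the Gaussian-channel formula to the stimulus and error spectra to get the lower bound.

```python
import numpy as np

f = np.linspace(0.0, 200.0, 201)          # analysis band in Hz (illustrative)
S_f = 1.0 / (1 + (f / 50.0) ** 2)         # assumed stimulus power spectrum
N_f = 0.3 * np.ones_like(f)               # assumed error (noise) spectrum

R_info = np.trapz(np.log2(1 + S_f / N_f), f)   # bits/s, a lower bound on I
print(f"R_INFO >= {R_info:.0f} bits/s")
```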
Linear reconstruction of a stimulus from the spike train (Bialek et al., 1991)
• Given a stimulus waveform s(t) and a spike train r(t), what is the linear filter h(t) which gives the least-squares estimate of s(t) from r(t)?
$$\hat{s}(t) = h(t) * r(t) = \sum_i h(t - t_i), \qquad \text{find } h(t) \text{ to minimize } E = \int_0^T \big(s(t) - \hat{s}(t)\big)^2\, dt$$
• The solution is given (in the frequency domain) by the Wiener filter:
$$H(f) = \frac{S_{rs}(f)}{S_r(f)}$$
 – H(f) is the Fourier transform of h(t), i.e. the filter frequency response; S_r(f) is the power spectrum of the spike train; and S_rs(f) is the cross-spectrum between r(t) and s(t).
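Below is a self-contained sketch of the whole pipeline, with a toy Poisson encoder (every parameter is invented for illustration): estimate S_rs(f) and S_r(f) from averaged periodograms, form H(f) = S_rs(f)/S_r(f), and apply it to reconstruct the stimulus.

```python
import numpy as np

rng = np.random.default_rng(0)
fs, n = 1000.0, 2 ** 14                  # sample rate (Hz), record length
s = np.convolve(rng.standard_normal(n), np.ones(20) / 20, mode="same")

# Toy encoder: Poisson-like spikes whose rate tracks the stimulus
rate = np.clip(50 * (1 + 0.8 * s / s.std()), 0, None)   # sp/s
r = (rng.random(n) < rate / fs).astype(float)           # 0/1 spike train

# Estimate cross- and power spectra by averaging over segments
nseg = 32
nfft = n // nseg
S_rs = np.zeros(nfft, dtype=complex)
S_r = np.zeros(nfft)
for k in range(nseg):
    seg = slice(k * nfft, (k + 1) * nfft)
    R = np.fft.fft(r[seg] - r[seg].mean())
    S = np.fft.fft(s[seg] - s[seg].mean())
    S_rs += np.conj(R) * S / nseg        # cross-spectrum between r and s
    S_r += np.abs(R) ** 2 / nseg         # power spectrum of the spike train

H = S_rs / np.maximum(S_r, 1e-12)        # Wiener filter, frequency domain
s_hat = np.concatenate([
    np.fft.ifft(np.fft.fft(r[k * nfft:(k + 1) * nfft]
                           - r[k * nfft:(k + 1) * nfft].mean()) * H).real
    for k in range(nseg)])
print("corr(s, s_hat) =", round(np.corrcoef(s, s_hat)[0, 1], 2))
```

The per-segment FFT filtering uses circular convolution, which is adequate for a sketch; a careful implementation would window and overlap the segments.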
Example of linear stimulus reconstruction from a spike train
Figures removed due to copyright reasons. Please see: Borst, A., and F. E. Theunissen. "Information theory and neural coding." Figure 4 in Nat Neurosci 2, no. 11 (Nov 1999): 947-57.
• Fly visual system ("H1 neuron"), moving grating stimulus
• The reconstruction resembles a lowpass-filtered version of the stimulus
Bullfrog advertisement call (waveform and power spectrum)
Male bullfrogs produce an advertisement call to convey information about location and breeding readiness to both sexes.
Figure by MIT OCW. From: Rieke, Fred, David Warland, Rob de Ruyter van Steveninck, and William Bialek. Spikes: Exploring the Neural Code. Cambridge, MA: MIT Press/Bradford Books, 1997. © Used with permission.
Information rates in bullfrog primary auditory fibers are higher for call than for noise
• Linear stimulus reconstruction method was used to estimate (a lower bound on) information rate for both broadband Gaussian noise and call-shaped noise.
• Information rates are higher for call-shaped noise (40-250 bits/s) than for broadband noise (10-50 bits/s).
Figures removed due to copyright reasons. Please see: Rieke, F., D. A. Bodnar, and W. Bialek. "Naturalistic stimuli increase the rate and efficiency of information transmission by primary auditory afferents." Proc Biol Sci 262, no. 1365 (December 22, 1995): 259-65. (Courtesy of The Royal Society. Used with permission.)
Coding efficiency in bullfrog primary auditory fibers
$$\text{Coding Efficiency} = \frac{\text{Information Rate}}{\text{Entropy Rate of the Spike Train}}$$
• The coding efficiency is a measure of the fraction of the information contained in the spike train that is used to code the stimulus.
• Coding efficiency ranged from 0.05 to 0.2 for the broadband stimulus, and from 0.35 to 0.9 for the call-shaped noise.
• These high coding efficiencies were achieved despite the poor quality of the stimulus reconstructions.
Figures removed due to copyright reasons. Please see: Rieke, F., D. A. Bodnar, and W. Bialek. "Naturalistic stimuli increase the rate and efficiency of information transmission by primary auditory afferents." Proc Biol Sci 262, no. 1365 (December 22, 1995): 259-65. (Courtesy of The Royal Society. Used with permission.)
Information rates and coding efficiencies in sensory neurons

| Neural system and species | Information rate (bits/spike) | Information rate (bits/s) | Coding efficiency | Best timing precision; variance:mean |
|---|---|---|---|---|
| Constant stimulus | | | | |
| Cat retinal ganglion cells | ~0.04-0.10 | 0.4-0.8 | - | - |
| Primary visual cortex (V1) of rhesus monkeys | - | 0.62 | - | - |
| Middle temporal area (MT) of rhesus monkeys | 0.025 | 0.89 | - | ~1.3 (variance:mean) |
| Inferior temporal area (IT) of rhesus monkeys | ~0.13 | 0.29 | - | - |
| Hippocampus of rhesus monkeys | ~0.18 | 0.9; 0.32 (max = 1.2) | - | - |
| Variable stimulus: reconstruction method | | | | |
| H1 motion-sensitive neuron of a fly | 0.75 | 64 | 30% | ~2 ms |
| Frog auditory afferents | 0.66 | 23 | 11% | - |
| Vibratory receptors of the bullfrog sacculus | 2.6 | 155 | 50-60% | ~0.4 ms |
| Cricket mechanoreceptors | 0.6-3.2 | 75-294 | 50-60% | - |
| Salamander retinal ganglion cells | 1.9 | 3.7 (up to 10 for a population of >10 cells) | 26% (>79% for >10 cells) | 2-4 ms |
| MT of anesthetized rhesus monkeys | ~0.65 | 6.7 (max = 12.3) | - | - |
| MT of alert rhesus monkeys | 0.6 | 5.5 | <30% | - |
| Variable stimulus: direct method | | | | |
| H1 motion-sensitive neuron of a fly | 2.43 | 80 | 50% | 1.5-3 ms; <0.1 |
| Salamander and rabbit retinal ganglion cells | 3.7 | 16.3 | 59% | >0.70 ms; >0.05 |
| MT of alert rhesus monkeys | 1.5 | 12 (max = 29) | up to 45% | <2 ms; ~1.4 |

Table by MIT OCW (see Buracas and Albright, 1999, in the Selected References).
Information Theory: Pros and Cons
Pros
• Does not assume any particular neural code
• Can be used to identify the stimulus features best encoded by neurons, or to compare the effectiveness of different putative neural codes
• One number summarizes how well the stimulus set is coded in the neural response
Cons
• The information estimate depends on the stimulus set, and stimulus probabilities in the environment are hard to specify.
• Does not specify how to read out the code: the code might be unreadable by the rest of the nervous system.
• For all but the simplest examples, estimation of mutual information requires a huge amount of data. Methods that try to circumvent data limitations (e.g. stimulus reconstruction) make additional assumptions (e.g. linearity or Gaussian distributions) that are not always valid.
Selected References
Slide 25: Borst, A., and F. E. Theunissen. "Information theory and neural coding." Nat Neurosci 2, no. 11 (Nov 1999): 947-57.
Slide 35: Buracas, G. T., and T. D. Albright. "Gauging sensory representations in the brain." Trends Neurosci 22, no. 7 (Jul 1999): 303-9.
Harvard-MIT Division of Health Sciences and Technology
HST.722J: Brain Mechanisms for Hearing and Speech
Course Instructor: Joseph S. Perkell
Motor Control of Speech: Control Variables and Mechanisms
HST 722, Brain Mechanisms for Hearing and Speech
Joseph S. Perkell MIT
11/2/05
Outline
• Introduction
• Measuring speech production
• What are the “controlled variables” for segmental (phonemic) speech movements?
• Segmental motor programming goals
• Producing speech sounds in sequences
• Experiments on feedback control
• Summary
Outline
• Introduction
 – Utterance planning
 – General physiological/neurophysiological features
 – The controlled systems
 – Example of movements of vocal-tract articulators
• Measuring speech production
• What are the “controlled variables” for segmental speech movements?
• Segmental motor programming goals
• Producing speech sounds in sequences
• Experiments on feedback control
• Summary
Utterance Planning
• Objective: generate an intelligible message while providing for “economy of effort” – stages:
 – Form the message (e.g. feel hungry; smell pizza; together with a friend).
 – Select and sequence lexical items (words): “Do you want a pizza?”
 – Assign a syntactically-governed prosodic structure.
 – Determine “postural” parameters of overall rate, loudness and degree of reduction (and settings that convey emotional state, etc.)
  • Extreme reduction: “Dja wanna pizza?”
 – Determine temporal patterns. Sound segment durations depend on:
  • Phoneme length
  • Overall rate
  • Intrinsic characteristics of sounds
  • Position and number of syllables in the word
• Result: an ordered sequence of goals for the production mechanism
Serial Ordering
• Evidence reflecting serial ordering in utterance planning: speech errors
 – Examples from Shattuck-Hufnagel (1979):
  • Substitution: “Anymay” (anyway)
  • Exchange: “emeny” (enemy)
  • Shift: “bad highvway dri_ing” (highway driving)
  • Addition: “the plublicity would be” (publicity)
  • Omission: “sonata _umber ten” (number)
  • ?: “dignal sigital processing” (digital signal processing)
 – See Averbeck et al. for neurophysiological evidence concerning serial ordering
General Physiological/Neurophysiological Features
• Muscles are under voluntary control
• Structures contain feedback receptors that supply sensory information to the CNS:
 – Surfaces: touch/pressure
 – Muscles:
  • Length and length changes: spindles
  • Tension: tendon organs
 – Joints (TMJ): joint angle
• Reflex mechanisms:
 – Stretch
 – Laryngeal (coughing)
 – Startle
• Motor programs (low-level, “hard-wired” neural pattern generators)
 – Breathing
 – Swallowing
 – Chewing
 – Sucking
• Low-level circuitry could be employed in speech motor control. The picture is complex, and a comprehensive account hasn't emerged.
(Figure by MIT OCW: midsagittal anatomy of the vocal tract and respiratory system – nasal cavity, lips, teeth, tongue, soft palate, pharynx, epiglottis, hyoid, larynx with vocal cords and the thyroid, cricoid and arytenoid cartilages, Adam's apple, trachea, esophagus, lungs, diaphragm.)
The Controlled Systems
• The respiratory system
 – Most massive, slowly-moving structures
 – Provides the energy for sound production
  • Fluctuations help signal emphasis
  • Relatively constant level of subglottal pressure
 – Different patterns of respiration: breathing, reading aloud, spontaneous speech, counting
 – Different muscles are active at different phases of the respiratory cycle – a complex, low-level motor program
• The larynx
 – Smallest structures, most rapidly contracting muscles
 – Voicing, turned on and off segment-by-segment
 – F0, breathiness – suprasegmental regulation
• The vocal tract
 – Intermediate-sized, slowly moving structures: tongue, lips, velum, mandible
 – Many muscles do not insert on hard structures
 – Can produce sounds at rates up to 15/s
 – To do so, the movements are coarticulated
Figures removed due to copyright reasons. Please see: Conrad, B., and P. Schonle. "Speech and respiration." Arch Psychiatr Nervenkr 226, no. 4 (1979): 251-68.
Focus of lecture is on movements of vocal-tract articulators
(Figure: midsagittal view labeling the velum, lips, tongue blade, tongue body, mandible, hyoid bone and larynx.)
• Consider the movements of each of these structures
• Approximate number of muscle pairs that move the:
 – Tongue: 9
 – Velum: 3
 – Lips: 12
 – Mandible: 7
 – Hyoid bone: 10
 – Larynx: 8
 – Pharynx: 4
• Not including the respiratory system
• Observations:
 – A large number of degrees of freedom
 – A very complicated control problem
Outline
• Introduction
• Measuring speech production
 – Acoustics
 – Articulatory movement
 – Area functions
• What are the “controlled variables” for segmental speech movements?
• Segmental motor programming goals
• Producing speech sounds in sequences
• Experiments on feedback control
• Summary
Measuring Speech Production
• Acoustics – important for perception
 – Spectral, temporal and amplitude measures
 – Vowels, liquids and glides: time-varying patterns of formant frequencies
 – Consonants: noise bursts, silent intervals, aspiration and frication noises, rapid formant transitions
Figure removed due to copyright reasons (“The yacht was a heavy one”). Please see: Stevens, K. Acoustic Phonetics. Cambridge, MA: MIT Press, 1998, p. 248. ISBN: 026219404X.
• Movements
 – From x-ray tracings. From: Perkell, Joseph S. Physiology of Speech Production: Results and Implications of a Quantitative Cineradiographic Study. Research Monograph No. 53. Cambridge, MA: MIT Press, 1969. © Used with permission.
 – With an Electro-Magnetic Midsagittal Articulometer (EMMA) system: points on the tongue, lips, jaw, (velum). Reprinted with permission from: Perkell, J., M. Cohen, M. Svirsky, M. Matthies, I. Garabieta, and M. Jackson. "Electro-magnetic midsagittal articulometer (EMMA) systems for transducing speech articulatory movements." J Acoust Soc Am 92 (1992): 3078-3096. Copyright 1992, Acoustical Society of America.
• Other parameters: air pressures and flows, muscle activity …
EMMA Data Collection
• Transducer coils are placed on the subject's articulators
• The subject reads text from an LCD screen
• Movement and audio signals are digitized and displayed in real time
• Signals are processed and data are extracted and analyzed
Analysis of EMMA Data
• Algorithmic data extraction at the time of the minimum in absolute velocity during the vowel:
 – Vowel formants
 – Articulatory positions (x, y)
(Figure: audio signal and velocity magnitude (cm/s, 800-1400 ms) for the tongue-dorsum (TD) transducer during an utterance of the form “Say hud, hid it”, with extraction points at the velocity minima during the vowels.)
3-Dimensional Area Function Data
(Figure panels: transverse (horizontal), coronal (vertical) and midline sections of sustained vowels, including /i/ and /u/, in 2 speakers.)
• MR images of sustained vowels (Baer et al., JASA 90: 799-828)
 – Area functions are more complicated than they look in 2 dimensions
 – There are lateral asymmetries, but 2-D midsagittal movement data provide useful information
Reprinted with permission from: Baer, T., J. C. Gore, L. C. Gracco, and P. W. Nye. "Analysis of vocal tract shape and dimensions using magnetic resonance imaging: Vowels." J Acoust Soc Am 90 (1991): 799-828. Copyright 1991, Acoustical Society of America.
Outline
• Introduction
• Measuring speech production
• What are the “controlled variables” for segmental (phonemic) speech movements?
 – Possible controlled variables
 – Modeling to make the problem approachable: DIVA
 – A schematic view of speech movements
• Segmental motor programming goals
• Producing speech sounds in sequences
• Experiments on feedback control
• Summary
“What are the controlled variables?”
• The question has theoretical and practical implications
 – What are the fundamental motor programming units and the most appropriate elements for phonological/phonetic theory?
 – What domains should be the main focus of research for diagnosis and treatment of speech disorders?
Figure removed due to copyright reasons (“The yacht was a heavy one”). Please see: Stevens, K. Acoustic Phonetics. Cambridge, MA: MIT Press, 1998, p. 248.
• Objective of the speaker:
 – To produce sound strings with acoustic patterns that result in intelligible patterns of auditory sensations in the listener
• Acoustic/auditory cues depend on the type of sound segment:
 – Vowels and glides: time-varying patterns of formant frequencies
 – Consonants: noise bursts, silent intervals, aspiration and frication noises, rapid formant transitions
Possible Motor Control Variables
(Figure by MIT OCW: tongue, jaw and hyoid bone, with the genioglossus and hyoglossus muscles.)
• Auditory characteristics of speech sounds are determined by:
 1. Levels of muscle tension
 2. Changing muscle lengths and movements of structures
 3. The vocal-tract shape (area function)
 4. Aerodynamic events and aeromechanical interactions
 5. The acoustic properties of the radiated sound
• Hypothetically, motor control variables could consist of feedback about any combination of the above parameters
Modeling to Make the Problem Approachable: DIVA
“Directions Into Velocities of Articulators” (Guenther and colleagues – next lecture)
(Block diagram: a speech sound representation projects to auditory and somatosensory goal regions; auditory and somatosensory errors drive feedback-based commands, which combine with a feedforward command to the speech muscles; feedback is also used to update the feedforward commands.)
• A neuro-computational model of the relations among cortical activity, motor output and sensory consequences
• Phonemic goals: projections (mappings) from premotor to sensory cortex that encode the expected sensory consequences of produced speech sounds
 – Correspond to regions in multidimensional auditory-temporal and somatosensory-temporal spaces
• The roles of the feedforward and feedback subsystems will be discussed later; a toy sketch of the loop follows below.
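The sketch below is a deliberately minimal toy of the control loop in the diagram. The function name, variables and gains are inventions for illustration, not DIVA's published equations: the motor command is a feedforward term corrected by auditory and somatosensory errors.

```python
def motor_command(ff, aud_goal, aud_fb, som_goal, som_fb,
                  g_aud=0.5, g_som=0.5):
    """Toy feedforward + feedback combination (gains are assumptions)."""
    aud_err = aud_goal - aud_fb      # auditory error signal
    som_err = som_goal - som_fb      # somatosensory error signal
    return ff + g_aud * aud_err + g_som * som_err

# With accurate feedforward commands the error terms stay near zero;
# perturb the auditory feedback and the command compensates:
print(motor_command(1.0, 2.0, 2.0, 0.0, 0.0))   # 1.0 (no correction needed)
print(motor_command(1.0, 2.0, 1.6, 0.0, 0.0))   # 1.2 (feedback correction)
```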
A Schematic View of Speech Movements
• Planned and actual acoustic trajectories illustrate:
 – Auditory/acoustic goal regions
 – Economy of effort (Lindblom)
 – Coarticulation
 – Motor equivalence
 – Biomechanical saturation (quantal) effects
• When controlling an articulatory speech synthesizer, DIVA accounts for the first four, plus
 – Aspects of acquisition
 – Responses to perturbations
(Figure by MIT OCW: schematic trajectories for two repetitions in articulator space (lip protrusion vs. tongue-body and tongue-blade height) and in acoustic space (acoustic dimension 1, e.g. F1, vs. time). The panels show acoustic goal regions for /i/, /u/ and /g/ (closure); economy of effort; a biomechanical effect (inertia) causing actual trajectories to lag the planned trajectory; a saturation effect (stiffened tongue blade contacting the sides of the palatal vault); the effect of a biomechanical constraint (closure) on actual trajectories; and a trading relation between lip protrusion and tongue-body height.)
Outline
• Introduction
• Measuring speech production
• What are the “controlled variables” for segmental speech movements?
• Segmental motor programming goals
 – Anatomical and acoustic constraints: quantal effects
 – Individual differences – anatomy
 – Motor equivalence: a strategy to stabilize acoustic goals
 – Clarity vs. economy of effort
 – Relations between production and perception
  • Vowels
  • Sibilants
• Producing speech sounds in sequences
• Experiments on feedback control
• Summary
Anatomical and Acoustic Constraints on Articulatory Goals
• Properties of speakers' production and perception mechanisms help to define goals for speech sounds that are used in speech motor planning
• Some of these properties are characterized by quantal effects (Stevens), which can also be called “saturation effects”
(Figure by MIT OCW: acoustic parameter vs. articulatory parameter, with stable regions I and III separated by a rapid transition in region II.)
• Schematic example: a continuous change in an articulatory parameter produces two regions of acoustic stability, separated by a rapid transition
• Hypothesis: some goals are auditory and can be characterized in terms of acoustic parameters: formant frequencies, relative sound level, etc.
• Languages “prefer” such stable regions
• The use of those regions by individual speakers helps to produce relatively robust acoustic cues with imprecise motor commands
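A toy numerical illustration of the schematic above (the sigmoid is an arbitrary stand-in for the articulatory-to-acoustic mapping): the same articulatory imprecision produces far less acoustic variation on a plateau (regions I and III) than in the transition (region II).

```python
import numpy as np

x = np.linspace(-3, 3, 601)         # articulatory parameter
y = 1 / (1 + np.exp(-4 * x))        # acoustic parameter (assumed toy mapping)
slope = np.gradient(y, x)           # acoustic sensitivity to articulation

# Sensitivity on a plateau (region I/III) vs. in the transition (region II)
print(slope[np.argmin(np.abs(x + 2))], slope.max())   # ~0.001 vs. ~1.0
```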
Goals for the vowel /i/ – A. An acoustic saturation (quantal) effect for constriction location (Stevens, 1989)
(Figure by MIT OCW: formant frequencies (kHz) of a two-tube vocal-tract model (cavity lengths L1, L2; areas A1, A2) vs. length of the back cavity L (cm), 6-12 cm, with the stable range highlighted in green.)
• There is a range of back cavity lengths over which F1-F3 are relatively stable.
• Many repetitions of /i/ in two subjects show a corresponding variation of constriction location.
• However, as reflected in the articulatory data, the formants of /i/ are sensitive to variation in constriction degree.
Reprinted with permission from: Perkell, J. S., and W. L. Nelson. "Variability in production of the vowels /i/ and /a/." J Acoust Soc Am 77 (1985): 1889-1895. Copyright 1985, Acoustical Society of America.
Quantal and non-quantal articulatory-to-acoustic relations for /i/ and /a/
Figures removed. Reprinted with permission from: Perkell, J. S., and W. L. Nelson. "Variability in production of the vowels /i/ and /a/." J Acoust Soc Am 77 (1985): 1889-1895. Copyright 1985, Acoustical Society of America.
A biomechanical saturation effect for constriction degree for /i/
(Figures by MIT OCW: formant frequencies F1-F3 (kHz) vs. percent contraction of the posterior genioglossus (GGp), for a stiff vs. a lax tongue blade; and tongue cross-sections showing the direction of posterior genioglossus contraction.)
• Constriction degree and the resulting formants can be stabilized by:
 – Stiffening the tongue blade (with intrinsic muscles)
 – Pressing the stiffened tongue blade against the sides of the hard palate through contraction of the posterior genioglossus (GGp) muscles
• The constriction area (shaded) varies little, even with variation in GGp contraction (from a 3D tongue model by Fujimura & Kakita)
Reprinted with permission from: Perkell, J. S., and W. L. Nelson. "Variability in production of the vowels /i/ and /a/." J Acoust Soc Am 77 (1985): 1889-1895. Copyright 1985, Acoustical Society of America.
Tongue Contour Differences Among Four Speakers
(Figure by MIT OCW: midsagittal tongue contours for the vowels ee, I, e, ae, er, uh, a, o, oo and u, with reference coordinates C1-C8. Additional figures removed due to copyright reasons.)
• Note the tongue contour differences among the four speakers.
• An “auditory-motor theory of speech production” (Ladefoged et al., 1972)
Effects of Different Palate Shapes on Vowel Articulations
• Palatal shapes differ among individuals
• Palatal depth can influence spatial differences in vowel targets (/i/, /I/)
Figures removed due to copyright reasons (figures by MIT OCW). Please see: Perkell, J. S. "On the nature of distinctive features: Implications of a preliminary vowel production study." In Frontiers of Speech Communication Research, edited by B. Lindblom and S. Öhman. London, UK: Academic Press, 1979, pp. 372-373.
Production of /u/
• Contractions of the styloglossus and posterior genioglossus
• Note: the place of constriction, and the variation in constriction location
Figure by MIT OCW: a schematic illustration of articulatory data for multiple repetitions of the vowels /a/, /i/ and /u/ (points 1-7 along the tongue surface, dorsal to ventral), showing elliptical distributions of the positions of points on the tongue surface, with the long axes of the ellipses oriented parallel to the vocal-tract midline.
Stabilizing the Sound Output for the Vowel /u/: Motor Equivalence
• Hypothesis: a negative correlation between tongue-body raising and lip protrusion in multiple repetitions of the vowel
(Figures by MIT OCW: tongue y position (cm) vs. upper-lip x position (cm) for many repetitions, showing the trading relation; a midsagittal outline with the palate and transducers UL, LL, LI, TB; and the schematic acoustic goal region for /u/ (acoustic dimension 1, e.g. F1) with planned and actual trajectories for two repetitions.)
• The hypothesis is supported in a number of subjects
• The goal for the articulatory movements for /u/ is in an acoustic/auditory frame of reference, not a spatial one
• Strategy: stay just within the acoustic goal region
Palatal Depth and Motor Equivalence
(Figure by MIT OCW: transducer positions (cm) for Subject 2 and Subject 6, with palate outlines and transducers UL, LL, LI and TB; the two subjects differ in palatal depth.)
• Palatal depth can also influence variability in movement toward vowel targets
Motor Equivalence for /r/
• Speakers use similar articulatory trading relations when producing /r/ in different phonetic contexts (Guenther, Espy-Wilson, Boyce, Matthies, Perkell, and Zandipour, 1999, JASA)
• The acoustic effect of the longer front cavity of the blue outlines is compensated by the effect of the longer and narrower constriction of the red outlines (e.g., Stevens, 1998).
• F3 variability is greatly decreased by these articulatory trading relations.
• Conclusion: the movement goal for /r/ is a low value of F3 – an auditory/acoustic goal
(Figure: midsagittal tongue outlines for subjects S1-S7 producing /warav/ or /wabrav/ vs. /wagrav/.)
Reprinted with permission from: Guenther, F., C. Espy-Wilson, S. Boyce, M. Matthies, M. Zandipour, and J. Perkell. "Articulatory tradeoffs reduce acoustic variability during American English /r/ production." J Acoust Soc Am 105 (1999): 2854-2865. Copyright 1999, Acoustical Society of America.
Clarity vs. Economy of Effort: another principle (continuous, as opposed to quantal) that influences vowel categories (Lindblom, 1971)
Figures removed due to copyright reasons.
• Used an articulatory synthesizer and heuristics to estimate the locations of vowels in F1-F2 space, based on
 – A compromise between “perceptual differentiation” and “articulatory ease”, and
 – The number of vowels in the language
• Approximated vowel distributions for languages containing up to about 7 vowels
• Later discussed in terms of a tradeoff between clarity and economy of effort, i.e., a relation between production and perception
Relations between Production and Perception
• Close linkage between production and perception:
 – Speech acquisition, with and without hearing
 – Speech of cochlear implant users
 – Second-language learning (e.g., Bradlow et al.)
 – Focused studies of production and perception (e.g., Newman)
 – Mirror neurons – a more general action-perception link (e.g., Fadiga et al.)
• Hypothesis:
 – Speakers who discriminate well between vowel sounds with subtle acoustic differences will produce more clear-cut sound contrasts
 – Speakers who are less able to discriminate the same sound stimuli will produce less clear-cut contrasts
Production Experiment
• Data collection
 – Subjects: 19 young-adult speakers of American English
 – For each subject:
  • Recorded articulatory movements and the acoustic signal
  • Subject pronounced “Say ___ hid it.”; ___ = cod, cud, who'd or hood
  • Clear, Normal and Fast conditions
• Analysis
 – Calculated a contrast distance for each vowel pair:
  • Articulatory (TB) contrast distance: distance in mm between the centroids of the cod and cud TB distributions
  • Acoustic contrast distance: distance in Hz between the centroids of the (F1, F2) distributions for cod and cud
(Figure: transducer positions (mm) – UL, LL, LI, TT, TB, TD – relative to the palate outline, with the cud and cod TB distributions marked.)
Reprinted with permission from: Perkell, J. S., F. H. Guenther, H. Lane, M. L. Matthies, E. Stockmann, M. Tiede, and M. Zandipour. "The distinctness of speakers' productions of vowel contrasts is related to their discrimination of the contrasts." J Acoust Soc Am 116 (2004): 2338-44. Copyright 2004, Acoustical Society of America.
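A sketch of the two contrast-distance measures described above. The coordinates below are fabricated placeholders, not the study's data; the computation is just the Euclidean distance between category centroids in articulatory (x, y) space and in (F1, F2) space.

```python
import numpy as np

# Hypothetical tongue-body (x, y) positions in mm for two vowels
tb_cod = np.array([[-1.0, 2.0], [-1.2, 2.1], [-0.9, 1.9]])
tb_cud = np.array([[3.0, 4.0], [3.2, 4.2], [2.8, 3.9]])
artic_dist = np.linalg.norm(tb_cod.mean(axis=0) - tb_cud.mean(axis=0))  # mm

# Hypothetical (F1, F2) values in Hz for the same tokens
f_cod = np.array([[730.0, 1090.0], [750.0, 1110.0]])
f_cud = np.array([[640.0, 1190.0], [620.0, 1210.0]])
acoust_dist = np.linalg.norm(f_cod.mean(axis=0) - f_cud.mean(axis=0))   # Hz

print(f"articulatory contrast = {artic_dist:.1f} mm, "
      f"acoustic contrast = {acoust_dist:.0f} Hz")
```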
Perception Experiment
• Methods
 – Synthesized natural-sounding stimuli in 7-step continua for cod-cud and who'd-hood (male voice)
 – Each subject: labeling and discrimination (ABX) tasks
• Results: ABX scores (2-step)
 – Ceiling effects: some subjects at 100% probably had better discrimination than could be measured
 – For further analysis, subjects were divided into two groups:
  • HI discriminators – at 100% (above the median)
  • LO discriminators – at the median and below
(Figure: histograms of the number of subjects vs. 2-step ABX percent correct (70-100%) for the cod-cud and who'd-hood continua.)
Reprinted with permission from: Perkell, J. S., F. H. Guenther, H. Lane, M. L. Matthies, E. Stockmann, M. Tiede, and M. Zandipour. "The distinctness of speakers' productions of vowel contrasts is related to their discrimination of the contrasts." J Acoust Soc Am 116 (2004): 2338-44. Copyright 2004, Acoustical Society of America.
Results & Conclusions
• HI-discrimination subjects produced greater contrast distances than LO-discrimination subjects (measured in articulation or acoustics)
• The more accurately a speaker discriminates a vowel contrast, the more distinctly the speaker produces the contrast
• * The difference between the HI and LO groups is significant at p < .001
(Figure: articulatory contrast distance (mm, 0-10) and acoustic contrast distance (Hz, 200-400) for the who'd-hood and cod-cud contrasts, for the LO (L) and HI (H) groups, in the Fast (F), Normal (N) and Clear (C) speaking conditions; asterisks mark significant HI-LO differences.)
Reprinted with permission from: Perkell, J. S., F. H. Guenther, H. Lane, M. L. Matthies, E. Stockmann, M. Tiede, and M. Zandipour. "The distinctness of speakers' productions of vowel contrasts is related to their discrimination of the contrasts." J Acoust Soc Am 116 (2004): 2338-44. Copyright 2004, Acoustical Society of America.
A Possible Explanation
• It is advantageous to be as intelligible as possible
• Children will acquire goal regions that are as distinct as possible
 – Speakers who can perceive fine acoustic details learn auditory goal regions that are smaller and spaced further apart than those of speakers with less acute perception, because
 – The speakers with more acute perception are more likely to reject poorly produced tokens when learning the goal regions
Consonants: a saturation effect for /s/ may help define the /s/-/ʃ/ contrast
• Production of /ʃ/ (as in “shed”)
 – Relatively long, narrow groove between the tongue blade and palate
 – Sublingual space
• Production of /s/ (as in “said”)
 – Short narrow groove
 – No sublingual space
• Saturation effect for /s/
 – As the tongue moves forward from /ʃ/, the sublingual cavity volume decreases
 – When the tongue contacts the lower alveolar ridge, the sublingual cavity is eliminated and the resonant frequency of the anterior cavity increases abruptly
 – After contact, muscle activity can increase further; the output is unchanged
Figures removed due to copyright reasons. Please see: Perkell, J. S., M. L. Matthies, M. Tiede, H. Lane, M. Zandipour, N. Marrone, E. Stockmann, and F. H. Guenther. "The distinctness of speakers' /s/-/ʃ/ contrast is related to their auditory discrimination and use of an articulatory saturation effect." J Speech, Language and Hearing Res 47 (2004): 1259-69.
36112/05
www.bsscommunitycollege.in www.bssnewgeneration.in www.bsslifeskillscollege.in
73www.onlineeducation.bharatsevaksamaj.net www.bssskillmission.in
WWW.BSSVE.IN
Relations Between Production and Perception of Sibilants
• Hypothesis: the sibilants /s/ and /ʃ/ have two kinds of sensory goals:
  – Auditory: a particular distribution of energy in the noise spectrum
  – Somatosensory: e.g., patterns of contact of the tongue blade with the palate and teeth
• Speakers will vary in their ability to discriminate /s/ from /ʃ/
• Speakers use contact of the tongue tip with the lower alveolar ridge for /s/ to help differentiate /s/ from /ʃ/
  – This will also vary across speakers
• Across speakers, both factors (the ability to discriminate auditorily between the two sounds, and the use of contact, a possible somatosensory goal) will predict the strength of the produced contrast
Figures removed due to copyright reasons. Please see figures in: Perkell, J. S., et al. (2004), J Speech Lang Hear Res 47: 1259-69.
Methods (with the same 19 subjects as the vowel study)
• Production experiment – each subject:
  – Recorded: the acoustic signal, and contact of the underside of the tongue tip with the lower alveolar ridge (with a custom-made sensor)
  – Subject pronounced "Say ___ hid it.", where ___ = sod, shod, said, or shed
  – Clear, Normal, and Fast conditions
• Analysis – calculated:
  – Proportion of time contact was made during the sibilant interval
  – Spectral median for /s/ and /ʃ/
  – Acoustic contrast distance: the difference in spectral median between /s/ and /ʃ/
• Perception experiment – each subject:
  – Labeled and discriminated (ABX) synthesized stimuli from a seven-step said-shed continuum
Figures removed due to copyright reasons. Please see figures in: Perkell, J. S., et al. (2004), J Speech Lang Hear Res 47: 1259-69.
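To make the spectral-median measure concrete, here is a minimal Python sketch; the sampling rate, analysis window, and band limits are illustrative assumptions, not values from the paper. It computes the frequency that splits a fricative frame's spectral energy in half, and the acoustic contrast distance as the difference between the /s/ and /ʃ/ medians.

```python
# Minimal sketch: spectral median of a fricative frame, and the acoustic
# contrast distance between /s/ and /sh/ tokens. Parameter values are
# illustrative assumptions, not taken from Perkell et al. (2004).
import numpy as np

def spectral_median(frame, fs, fmin=500.0, fmax=10000.0):
    """Return the frequency (Hz) below which half the spectral energy lies."""
    spectrum = np.abs(np.fft.rfft(frame * np.hanning(len(frame)))) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / fs)
    band = (freqs >= fmin) & (freqs <= fmax)   # restrict to the fricative band
    cum_energy = np.cumsum(spectrum[band])
    idx = np.searchsorted(cum_energy, cum_energy[-1] / 2.0)
    return float(freqs[band][idx])

fs = 22050
s_frame = np.random.randn(1024)    # stand-in for a real /s/ noise frame
sh_frame = np.random.randn(1024)   # stand-in for a real /sh/ noise frame
contrast_hz = spectral_median(s_frame, fs) - spectral_median(sh_frame, fs)
```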
Results: use of tongue-to-lower-ridge contact and discrimination
Figures removed due to copyright reasons. Please see figures in: Perkell, J. S., et al. (2004), J Speech Lang Hear Res 47: 1259-69.
• Discrimination
  – Nine subjects had percent correct = 100; categorized as HI discriminators
  – Ten subjects had percent correct < 100; categorized as LO discriminators
• Use of contact
  – 12 subjects (left of the vertical line in the figure) are classified as Strong (S) for use of a contact difference between /s/ and /ʃ/
  – The remaining subjects are classified as Weak (W) for use of a contact difference
• Produced contrast distance is related to
  – Ability to discriminate the contrast
  – Use of a contact difference
  (* difference is significant, p < .01)
• Interactions
  – Speakers with both good discrimination and use of a contact difference: best contrasts
  – Speakers with one factor or the other: intermediate contrasts
  – Speakers with neither factor: poorest contrasts
Figures removed due to copyright reasons. Please see figures in: Perkell, J. S., et al. (2004), J Speech Lang Hear Res 47: 1259-69.
Outline (break time?)
• Introduction
• Measuring speech production
• What are the "controlled variables" for segmental speech movements?
• Segmental motor programming goals
• Producing speech sounds in sequences
  – An example utterance
  – Movements show context dependence
    • Velar movements
    • Lip rounding for /u/
  – Effects of speaking rate
  – Persistence of inaudible gestures at word boundaries
• Experiments on feedback control
• Summary
Producing Sounds in Sequences
• An example utterance (see the table below)
• * indicates articulations that aren't strongly constrained by communicative needs
• Articulations anticipate upcoming requirements: anticipatory coarticulation
• Coarticulation:
  – asynchronous movements of structures of differing sizes and movement time constants
  – a complicated motor coordination task
Table by MIT OCW: approximate time course of articulator actions during an example utterance containing the consonants /k/, /m/, /b/, and /r/.
• Tongue body: rise to contact the roof of the mouth to achieve /k/ closure & silence; release contact to generate a noise burst; move down to the vowel position; begin movement toward the /r/ position; maintain position.
• Tongue blade: maintain contact with the floor of the mouth to stay out of the way; begin retroflexion or bunching in anticipation of /r/; maintain the retroflexed or bunched configuration.
• Lips: begin spreading for the vowel; maintain position for the vowel, then begin moving toward closure for /m/; achieve & maintain closure; maintain closure through /b/; release rapidly & round somewhat for /r/.
• Mandible: move upward to support tongue movement; move downward to support tongue movement; move upward to support lower-lip movement; move downward slightly to aid lip release.
• Soft palate: maintain closure to contain the pressure buildup for /k/; begin downward movement to open the velopharyngeal port for /m/; begin the closing movement toward the onset of /b/; reach closure at the right instant to begin the pressure buildup; move upward during /r/.
• Vocal-tract walls: stiffen to contain the air pressure buildup; relax, perhaps expand actively, to allow continuation of voicing for /b/.
• Vocal folds: abduct maximally for /k/, with peak opening occurring at release; adduct to the position for voicing; maintain position.
• Tension on vocal folds: begin to raise tension to signal stress on the following vowel; achieve maximum tension for the F0 peak that signals stress; lower tension to lower F0; maintain tension.
• Respiratory system: increase subglottal air pressure to obtain a burst release for the /k/; maintain subglottal air pressure for increased sound level to signal stress; return to the previous value of subglottal pressure; maintain pressure.
What happens when sounds are produced in sequences?
• When individuals speak to one another, additional forces are at play
  – Articulatory movements from one sound to another are influenced by dynamical factors: canonical targets are very rarely reached
  – The speaker knows that the listener can fill in a great deal of missing information, so "reduction" takes place (see Introduction)
  – Speaking style (casual, clear, rapid, etc.) can vary
• The amount of variation can depend on the situation and the interlocutor (a familiar speaker of the same language?)
Movements Show Context Dependence
• Coarticulation
  – At any moment in time, the current state of the vocal tract reflects the influence of preceding sounds (perseveratory coarticulation) and upcoming sounds (anticipatory coarticulation)
  – Such coarticulation is a property of any kind of skilled movement (e.g., tennis, piano playing, etc.)
  – It makes it possible to produce sounds in rapid succession (up to about 15/sec), with smooth, economical movements of slowly-moving structures
• During the /æ/ in "camping" (Kent)
(Figure by MIT OCW: superimposed vocal-tract tracings.)
Effects of Coarticulation and Speaking Rate on Velar Movements
• The velum has to be raised to contain the air pressure increase of obstruent consonants
  – Its height during the /t/ is context (vowel) dependent: coarticulation
  – In the context of a nasal consonant, vowels in American English can be nasalized due to coarticulation
  – This is possible because vowel nasalization isn't contrastive in American English
• The velum (like most other vocal-tract structures) is slowly moving
  – At higher speaking rates, its movements become attenuated
Figure removed due to copyright reasons. From: Perkell, Joseph S. Physiology of Speech Production: Results and Implications of a Quantitative Cineradiographic Study. Research Monograph No. 53. Cambridge, MA: MIT Press, 1969. Used with permission.
Coarticulation of Lip Rounding for /u/
Figure by MIT OCW. From: Perkell, Joseph S. Physiology of Speech Production: Results and Implications of a Quantitative Cineradiographic Study. Research Monograph No. 53. Cambridge, MA: MIT Press, 1969. Used with permission.
• Lip rounding in production of the vowel /u/ in /hə'tu/
  – The first three sounds in the utterance are neutral with respect to lip rounding
  – The lips are fully protruded before the utterance begins
  – Coarticulation takes place whenever it doesn't interfere with transmission of the message
  – It crosses syllable and word boundaries
  – Movements of different structures are asynchronous
Coarticulation and Acoustic Effects (Gay, 1973, J. Phonetics 2: 255-266)
Figures removed due to copyright reasons.
• Cineradiographic measurements: anticipatory and perseveratory coarticulation
  – Tongue movement from /i/ to /a/ can start during the intervening consonant because it can't be heard and it isn't constrained physically
  – The consonants have effects only on the pellet position for the /a/ (not /i/ or /u/)
  – The pellet is at an acoustically critical constriction in the vocal tract for /i/ and /u/, but not for /a/
• Note the vertical variation for /a/ (possible because constriction location is not critical; cf. the quantal nature of speech)
Effects of Speaking Rate
Perkell, J., and M. Zandipour. "Economy of effort in different speaking conditions. II. Kinematic performance spaces for cyclical and speech movements." J Acoust Soc Am 112 (2002): 1642-51.
• Cyclical movements:
  – Higher rates show decreased movement durations and distances, and increased speed (a measure of effort)
• Speech vs. cyclical movements:
  – Compared to cyclical movements, speech movements generally are faster, larger, and shorter, perhaps because they have well-defined phonetic targets
• Vowels produced in fast vs. clear speech:
  – Larger dispersions, and goal-region edges that are closer together; less distinct from one another
(Figure: ellipses indicating the range of formant frequencies (+/- 1 s.d.) used by a speaker to produce five vowels during fast speech (light gray) and clear speech (dark gray) in a variety of phonetic contexts.)
Reprinted with permission. Copyright 2002, Acoustical Society of America.
Persistence of Inaudible Gestures at Word Boundaries
U. Tokyo X-ray µ-beam (Fujimura et al., 1973)
Figure removed due to copyright reasons.
• List production: "perfect, memory"
• Phrasal production: "perfec(t) memory"
• Phrasal: the /m/ closure overlaps the /t/ release, making it inaudible; the /t/ gesture is present nevertheless (cf. Browman & Goldstein; Saltzman & Munhall)
• Findings replicated and expanded with 21 speakers
• Explanation (DIVA): frequently used phonemes, syllables, and words become encoded as feedforward command sequences
Outline
• Introduction
• Measuring speech production
• What are the "controlled variables" for segmental speech movements?
• Segmental motor programming goals
• Producing speech sounds in sequences
• Experiments on feedback control
  – DIVA: feedback & feedforward control
  – Long-term effects: hearing loss and restoration
  – An example of abrupt hearing and then motor loss
  – Responses to perturbations, auditory and articulatory
    • "Steady state" perturbations
    • Gradually increasing perturbations
    • Abrupt, unanticipated perturbations
  – Feedback vs. feedforward mechanisms in error correction
• Summary
Feedback and Feedforward Control in DIVA
(Diagram: a speech sound's representation drives (1) a feedforward command to the speech muscles and (2) feedback-based commands; auditory feedback is compared with the auditory goal region to form (3) an auditory error, somatosensory feedback is compared with the somatosensory goal region to form (4) a somatosensory error, and these errors update the feedforward command.)
• With acquisition, control becomes predominantly feedforward
• Feedback control uses error detection and correction to teach, refine, and update feedforward control mechanisms
• Experiments can shed light on
  – sensory goals
  – error correction
  – mappings between motor/sensory and acoustic/auditory parameters
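As a rough illustration of the control scheme sketched above, the following Python snippet combines a learned feedforward command with auditory and somatosensory feedback-based corrections, and gradually folds those corrections back into the feedforward command. This is a minimal sketch, not the DIVA model's actual equations; all gains and dimensions are assumptions.

```python
# Minimal sketch of mixed feedforward/feedback control in the spirit of the
# diagram above. Not DIVA's actual equations; gains are illustrative.
import numpy as np

class MixedController:
    def __init__(self, dim=2, alpha=0.1, g_aud=0.5, g_som=0.5):
        self.ff = np.zeros(dim)        # learned feedforward command
        self.alpha = alpha             # learning rate for feedforward updates
        self.g_aud, self.g_som = g_aud, g_som

    def step(self, aud_goal, aud_fb, som_goal, som_fb):
        # Feedback-based command from auditory and somatosensory errors
        fb_cmd = (self.g_aud * (np.asarray(aud_goal) - aud_fb)
                  + self.g_som * (np.asarray(som_goal) - som_fb))
        # Corrections gradually transfer into feedforward control
        self.ff += self.alpha * fb_cmd
        return self.ff + fb_cmd        # total motor command
```

With repeated productions the feedback term shrinks as the feedforward command absorbs the correction, matching the slide's point that control becomes predominantly feedforward with acquisition.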
Learning and Maintaining Phonemic Goals: Use of Auditory Feedback
• Audition is crucial for normal speech acquisition
• Postlingual deafness: intelligible speech, but with some abnormalities
• Regaining some hearing with a cochlear implant (CI): users usually show parallel improvements in perception, production, and intelligibility
• Acoustic measures of the contrast between /l/ and /r/, 6 months after receiving a CI
(Figures by MIT OCW: scatter plots of F3-F2 (Hz) vs. F3 transition (Hz) for /l/ and /r/ tokens, pre-CI and post-CI; the /l/ and /r/ clusters separate more clearly post-implant.)
• The phonemic contrast is enhanced from pre- to post-implant; this is typical for CI users, many of whom have somewhat diminished contrasts pre-implant
Long-Term Stability of Auditory-Phonemic Goals for Vowels
• Typical pre- and post-implant formant patterns are generally congruent with normative data
• Subject FA: some irregularity of F2 pre-implant (18 years after the onset of profound hearing loss)
  – One year post-implant: F2 values are more like the normative ones
• Phonemic identity doesn't change; degree of contrast can
• Goals and feedforward commands for vowels generally are stable
  – If they degrade from hearing loss, they can be recalibrated with hearing from a CI
(Figure: formant data by vowel from two cochlear implant users, FA and FB.)
Perkell, J., H. Lane, M. Svirsky, and J. Webster. "Speech of cochlear implant patients: A longitudinal study of vowel production." J Acoust Soc Am 91 (1992): 2961-2979. Reprinted with permission. Copyright 1992, Acoustical Society of America.
Long-Term Stability of Phonemic Goals for Sibilants in CI Users
• Subjects 1 and 3: good distinctions between /s/ and /ʃ/ pre-implant
  – Typical, even decades following the onset of hearing loss
• Subject 2: reversed values and distorted productions pre-implant
  – After about 6 months of implant use, sibilant productions improved
• These precisely differentiated articulations are usually maintained for years without hearing
  – Possibly because of the use of somatosensory goals, e.g., the pattern of contact between tongue, teeth, and palate
(Figure: spectral median (acoustic COG) for /ʃ/ and /s/.)
Matthies, M. L., M. A. Svirsky, H. Lane, and J. S. Perkell. "A preliminary study of the effects of cochlear implants on the production of sibilants." J Acoust Soc Am 96 (1994): 1367-1373. Reprinted with permission. Copyright 1994, Acoustical Society of America.
Responses to Abrupt Changes in Hearing and Motor Innervation
An NF2 patient with sudden hearing loss, followed by some motor loss
• Two surgical interventions
  – OHL: onset of a significant hearing loss (especially spectral) from removal of an acoustic neuroma
  – Hypoglossal nerve transposition surgery → some tongue weakness
• /s/-/ʃ/ contrast: good until the second surgery, when the contrast collapsed
• Hypothesis: feedforward mappings were invalidated by the transposition surgery
  – Without spectral auditory feedback, compensatory adaptation (relearning) was impossible, as might be possible with hearing
  – The somatosensory goal deteriorated without auditory reinforcement
(Figure by MIT OCW: spectral median (Hz, ~3200-7200) for /s/ and /ʃ/ vs. weeks (-20 to 120) in the NF2 patient, with OHL and the hypoglossal transposition surgery marked.)
Vowel Contrasts and Hearing Status
• Compare English-speaking with Spanish-speaking CI users, with the CI processor OFF and ON
• Previous findings: contrasts increase with hearing, decrease without
• Hypothesis: because of the more crowded vowel space in English, turning the CI processor OFF and ON will produce more consistent decreases and increases in vowel contrasts in English than in Spanish
(Figure by MIT OCW: F1 vs. F2 vowel spaces for an English-speaking CI user (M3) and a Spanish-speaking CI user (M5) with the processor ON and OFF, with P&B English reference values.)
• Average Vowel Spacing (AVS): a measure of overall vowel contrast
• Change of AVS from processor ON to processor OFF (for 24 hours)
  – AVS decreases for the English speaker, increases for the Spanish speaker
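A simple way to picture the AVS measure is as a mean pairwise distance between vowel centers in formant space. The Python sketch below is illustrative only: the specific vowels, formant values, and Euclidean metric are assumptions, not the published definition.

```python
# Sketch of Average Vowel Spacing (AVS) as the mean pairwise distance
# between vowel means in (F1, F2) space. All values below are made up.
import numpy as np
from itertools import combinations

def avs(vowel_means):
    """vowel_means: dict mapping vowel label -> (F1, F2) mean in Hz."""
    pts = list(vowel_means.values())
    dists = [np.hypot(a[0] - b[0], a[1] - b[1])
             for a, b in combinations(pts, 2)]
    return float(np.mean(dists))

on_means  = {"i": (300, 2300), "a": (750, 1200), "u": (320, 900)}
off_means = {"i": (340, 2150), "a": (720, 1250), "u": (350, 1000)}
delta_avs = avs(on_means) - avs(off_means)   # predicted > 0 for English CI users
```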
AVS – by subject
• Prediction: AVS increases with the CI processor on (hearing)
• Changes follow the predicted pattern more consistently for English than for Spanish speakers
(Figure by MIT OCW: change in Average Vowel Spacing (Hz, -350 to +350) by subject for English speakers (F1-F5, M1-M4, M7) and Spanish speakers (M5, M6, M8, M9), for OFF-to-ON and ON-to-OFF switches, with bars marked Predicted or Not Predicted.)
Modeling Contrast Changes: Clarity vs. Economy of Effort
• DIVA contains a parameter that changes the sizes of all goal regions simultaneously, to control speaking rate and clarity (e.g., AVS)
• Shrinking goal-region size, like what English speakers do with hearing, produces increased clarity (contrast distance) and decreased dispersion
  – Without hearing, economy of effort dominates
• With fewer vowels in Spanish, clarity demands aren't as stringent
  – Acceptable contrasts may be produced regardless of hearing status, without changing goal-region size
(Diagram: an acoustic/auditory parameter vs. time for a C1-V1-C2-V2 sequence; smaller goal regions yield increased contrast distance.)
Effect of Varying S/N in Auditory Feedback
(Figures by MIT OCW: standardized AVS (A), duration (D), and SPL (S) vs. standardized, binned NSR, averaged across 6 normal-hearing subjects and across CI subjects at 1 month and at 1 year post-implant.)
• Normal-hearing and CI subjects heard their own vowel productions mixed with increasing amounts of noise
• In general, AVS increased, then decreased with increasing N/S
• Possible explanation: with increasing N/S,
  – If auditory feedback is sufficient, clarity is increased
  – As feedback becomes less useful, economy of effort predominates
• Similar result for the /s/-/ʃ/ contrast, but with the peak at a lower NSR
Bite Block Experiments
Figures removed due to copyright reasons.
• Speakers compensate fairly well with the mandible held at unusual degrees of opening
• Compensations may be better for quantal vowels (with better-defined articulatory targets)
• Presumably, the speakers' mappings are not as accurate for the perturbed condition
• Compensation continues to improve, possibly with the help of auditory feedback
Mappings Can Be Temporarily Modified: Auditory Feedback (Houde and Jordan)
• Sensorimotor adaptation
• Methods:
  – Fed-back (whispered) vowel formants were gradually shifted
  – 16 ms delay
  – Subjects were unaware of the shift
• Results:
  – Subjects adapted to the shift by modifying productions in the opposite direction
  – The effect generalized to other consonant environments and to other vowels
  – The effect persisted in the presence of masking noise: "adaptation"
  – Adaptation was exhibited later, simply by putting subjects in the apparatus (no shift)
• Speakers use auditory goals and auditory-motor mappings
Figures removed due to copyright reasons. Please see figures in: Houde, J. F., and M. I. Jordan. "Sensorimotor adaptation of speech I: Compensation and adaptation." J Speech Lang Hear Res 45 (2002): 295-310; and Houde, John Francis. "Sensorimotor adaptation in speech production." Ph.D. thesis, Massachusetts Institute of Technology, Dept. of Brain and Cognitive Sciences, 1997.
Sensorimotor Adaptation
• Thesis project of Virgilio Villacorta; based on the work of Houde & Jordan, but with voiced vowels
• Subjects hear their own vowels with F1 perturbed and are unaware of the perturbation
(Figure: results averaged across 10 subjects.)
• Subjects partially compensate by shifting F1 in the opposite direction; the shift is formant-specific
• Mismatch between expected and produced auditory sensations → error correction
• 20 subjects: varied in the amount of compensation
• Is there a relation between perceptual acuity and amount of compensation?
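One hedged way to quantify "amount of compensation" is as the fraction of the imposed F1 shift that the speaker undoes. The sketch below is illustrative only; the function name, the baseline/hold-phase framing, and the numbers are assumptions, not the thesis's exact Adaptive Response Index.

```python
# Sketch of a compensation measure for an F1 feedback perturbation.
# Illustrative assumption, not the study's published definition.
def adaptive_response(f1_baseline, f1_hold, shift_ratio):
    """
    f1_baseline: mean produced F1 (Hz) before the perturbation
    f1_hold:     mean produced F1 (Hz) late in the perturbation phase
    shift_ratio: multiplicative feedback perturbation (e.g., 1.3 = +30% F1)
    Returns the fraction of the perturbation compensated (1.0 = full).
    """
    imposed_hz = f1_baseline * (shift_ratio - 1.0)   # Hz added to the feedback
    produced_change = f1_hold - f1_baseline          # opposes the shift if negative
    return -produced_change / imposed_hz

# Partial compensation to a +30% shift: speaker lowers F1 from 700 to 665 Hz
print(adaptive_response(f1_baseline=700.0, f1_hold=665.0, shift_ratio=1.3))  # ~0.17
```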
Relation Between Adaptation and Auditory Discrimination
• DIVA and previous studies: production goals for vowels are primarily regions in auditory space
  – Speakers with more acute auditory discrimination have smaller goal regions, spaced further apart
• Hypothesis: more acute perceivers will adapt more to the perturbation
(Figure: hypothetical compensatory responses to an F1 perturbation by high- and low-acuity speakers, shown in F1-F2 space around /ɛ/.)
• Measure of auditory acuity: JNDs on pairs of synthetic vowel stimuli (milestone = center)
• Result: the hypothesis is supported (r² = 0.250, p = 0.040)
(Figure: JND (pert) vs. Adaptive Response Index for the 0.7 and 1.3 training conditions, with first-order partial correlations among jnd, ARI, and F1 separation (jnd,ARI||F1 sep; F1 sep,ARI||jnd; jnd,F1 sep||ARI): r values include 0.661, 0.656, and -0.717; r² values 0.514, 0.436, 0.430; p-scores 0.004, 0.010, 0.021*.)
Modifying a Somatosensory-to-Motor Mapping
A "force field adaptation" experiment (Ostry & colleagues)
Figures by MIT OCW.
• Methods:
  – Velocity-dependent forces, applied (gradually) by a robotic device, act to protrude the jaw; the force is proportional to the instantaneous jaw lowering or raising velocity
  – The jaw motion path over a large number of repetitions (700) is used to assess adaptation, which may be evidence of:
    • Modification of somatosensory-motor mappings
    • Incorporation of information about dynamics in speech movement planning (Ostry's interpretation)
Results
(Figures by MIT OCW: jaw paths (horizontal vs. vertical jaw position, mm) in the null field, at initial exposure, after adaptation, and in the after-effect phase, for the speech condition and a non-speech control.)
Summary and interpretation
• Subjects adapt to a motion-dependent force field applied to the jaw during speech production
• Kinesthetic feedback alone is not sufficient for adaptation; subjects have to be in a "speech mode"
• Control signals (mappings) are updated based on differences between expected and actual feedback
• Information about dynamics is incorporated in speech motor planning
Rapid Drift of the Spectral Median for /ʃ/
• A CI "on-off" experiment
(Figure by MIT OCW: vowel SPL (~75-80 dB), tongue-blade horizontal position, and /ʃ/ spectral median (~3120-3240 Hz) vs. time (s) across alternating processor Off and On periods.)
• Observations
  – Vowel SPL increased rapidly with the CI processor off, decreased with the processor on
  – The spectral median drifted upward toward /s/ during the 1000 seconds with the processor off; surprising, since the goals are usually stable
  – Hearing one aberrant utterance when the processor was turned on, the speaker overcompensated to restore an appropriate /ʃ/
• Extremely narrow dental arches (and a movement-transducer coil on the tongue) may have made it difficult for the speaker to rely on a somatosensory goal
• He may have had to rely predominantly on auditory feedback to maintain feedforward control on an utterance-to-utterance basis
Unanticipated Acoustic Perturbations (Tourville et al., 2005)
• Methods: like sensorimotor adaptation, but with a sudden, unanticipated shift of F1
  – Subjects pronounced /CɛC/ words with auditory feedback through a DSP board
  – In 1 of 4 trials, F1 was shifted up toward /æ/ or down toward /ɪ/
• Results (averaged across 11 subjects)
  – Subjects produced a compensatory modification of F1, in the direction opposite to the shift
  – Delay of about 150 ms, compatible with other results in which F0 was shifted
• The result is compatible with error detection and correction mechanisms in DIVA
(Figure: formant difference (shifted minus unshifted, Hz) vs. time (s) for F1 shifted up and F1 shifted down.)
How long does it take for parameters to change when hearing is turned on or off?
• The subject pronounced a large number of repetitions of four 2-syllable utterances (e.g., "done shed", "don said"; quasi-random order)
• The CI processor state (hearing) was switched between on and off unexpectedly
• Example results for one subject
  – Changes were not evident until the second vowel
  – The change may be more gradual for SPL than for F0
• Results varied among parameters and subjects
  – Perhaps related to subject acuity?
(Figures: F0 (Hz, ~100-200) and SPL (dB, ~75-85) for Vowel 1 and Vowel 2 of "a shed" vs. utterance number relative to the switch (-4 to +5), for on-to-off and off-to-on switches. Apparatus: soundproof room, mixer/amplifier, PC running CI control software, speech processor, and a processor control interface box; the switch control signal came from the printer port.)
Unanticipated Movement Perturbation: Motor Responses
Figures removed due to copyright reasons. Please see: Abbs, J. H., and V. L. Gracco. "Control of complex motor gestures: orofacial muscle responses to load perturbations of lip during speech." Journal of Neurophysiology 51 (1984): 705-723.
• Abbs et al. and others (1980s)
  – In response to a downward perturbation of the lower lip during closure toward a /p/,
  – The upper lip responds with increased downward displacement, accompanied by EMG and velocity increases
  – The response is phoneme-specific
Further Observations and Interpretation (Abbs et al.)
Figures removed due to copyright reasons.
• Coordinated speech gestures are performed by "synergisms"
  – Temporarily recruited combinations of neural and muscular elements that convert a simple input into a relatively complex set of motor commands
• Motor equivalence at the muscle and movement levels
• There are alternative interpretations (Gomi et al.)
Motor and Acoustic Responses to Unanticipated Jaw Perturbations (in collaboration with David Ostry)
• A robot was used to perturb jaw movements
  – Triggered by downward movement
• 50 repetitions per utterance (e.g., "see red"; N = 100 over two 50-repetition trials)
  – 5 perturbed upward (resistive)
  – 5 perturbed downward (assistive)
• Formants begin to recover 60-90 ms after the perturbation; the jaw does not
• Two other subjects were similar
• Evidence of within-movement, closed-loop error correction
(Figure by MIT OCW: F2-F1 (Hz) and vertical jaw position (mm) vs. time (ms, ~700-950) for the assistive/down and resistive/up perturbations, with the ~10 ms perturbation marked.)
Compensatory Responses to Unexpected Palatal Perturbation (Honda, Fujino & Murano)
• A subject pronounced a phrase containing eight successive /ʃa/ syllables
• Movements and the acoustic signal were recorded
• The palatal configuration was perturbed by inflation of a small balloon on 20% of trials (randomly determined)
• Feedback conditions:
  – Feedback not blocked
  – Auditory feedback blocked with masking noise
  – Tactile feedback blocked with topical anesthesia
  – Both types of feedback blocked
• Measures
  – Articulatory compensations
  – Listener judgments of distorted sibilants
Figures removed due to copyright reasons. Please see: Honda, M., and E. Z. Murano. "Effects of tactile and auditory feedback on compensatory articulatory response to an unexpected palatal perturbation." Proceedings of the 6th International Seminar on Speech Production, Sydney, December 7-10, 2003.
Results
Figures removed due to copyright reasons.
• The perturbation caused distortions in /ʃ/ production
• Compensation and feedback:
  – With feedback not blocked, the speaker compensated within about 2 syllables
  – With auditory or tactile feedback blocked, the speaker was much less able to compensate
  – With both forms of feedback blocked, compensation was worst
• The results are compatible with
  – Sensory goals as basic units
  – Use of mismatches between expected and actual sensory consequences to correct feedforward commands

Mean error score for fricative consonant identification (%), by syllable number (1-8):
Normal auditory feedback
  Steady-state deflated:   0   0   0   0   0   0   0   0
  Inflation:              83  14   0   0   0   0   0   0
  Deflation:               8   0   0   0   0   0   0   0
  Steady-state inflated:   0   0   0   0   0   0   0   0
Masked auditory feedback
  Steady-state deflated:   0   0   0   0   0  14  11  11
  Inflation:              72  39  39  44  28  42  33  50
  Deflation:               0   0   0   3   3  11  11  17
  Steady-state inflated:  47  39  44  50  44  44  44  44
Feedforward vs. Feedback Control and Error Correction
• In DIVA, feedback and feedforward control operate simultaneously; feedforward usually predominates
• Feedback control intervenes when there is a large enough mismatch between expected and produced sensory consequences (sensorimotor adaptation results)
• Timing of correction:
  – With a long enough movement, the correction is expressed (closed loop) during the movement (e.g., "see red")
  – Otherwise, the correction is expressed in the feedforward control of following movements (e.g., the /ʃ/ spectrum, vowel SPL, and F0 when the CI is turned on)
  – Correction to an auditory perturbation takes longer than to a somatosensory perturbation (presumably due to different processing times)
• Additional examples of error correction
  – Closed-loop responses to perturbations (see Abbs, others)
  – Feedforward error correction with, e.g., dental appliances
  – Responses to combined perturbations (cf. Honda & Murano)
  – All are compatible with DIVA's use of feedback
Outline
• Introduction
• Measuring speech production
• What are the "controlled variables" for segmental speech movements?
• Segmental motor programming goals
• Producing speech sounds in sequences
• Experiments on feedback control
• Summary
Summary of Main Points
• Highest-level control variables for phonemic movements
  – Auditory-temporal and somatosensory-temporal goal regions
• Goal regions are encoded in the CNS
  – Projections (mappings) from premotor to sensory cortex: expected sensory consequences of producing speech sounds
• Goal regions are defined partly by articulatory and acoustic saturation effects that are properties of vocal-tract anatomy and acoustics
  – Most vowels: goals primarily auditory; saturation effects, acoustic
  – Consonants: both auditory and somatosensory goals; saturation effects, primarily articulatory (e.g., any consonant closure)
• Articulatory-to-acoustic motor equivalence (/u/, /r/)
  – Helps stabilize the output of certain acoustic cues
  – Evidence that goals are auditory
Summary (continued)
• Auditory feedback (CI users)
  – Used to acquire goals and feedforward commands
  – Needed to maintain appropriate motor commands through vocal-tract growth and perturbations
• Goals and feedforward commands are usually stable, even with hearing loss
• Clarity vs. economy of effort
  – The tradeoff is evident when hearing (CI) is turned on or off, and in the presence of noise
• Relations between production and perception
  – Better discriminators produce more distinct sound contrasts
  – Better discriminators may learn smaller, more distinct goal regions
• Feedback and feedforward control
  – Frequently used sounds (syllables, words) are encoded as feedforward commands
  – Responses to perturbations: intra-gesture corrections are closed loop; inter-gesture corrections are via adjustments to subsequent feedforward commands
DIVA (next lecture)
Figures removed due to copyright reasons.
• Components have hypothesized correlates in cortical activation
• Hypotheses can be tested with brain imaging
• Can quantify relations among phonemic specifications, cortical activity, movement, and the speech sound output
Questions?
SOME REFERENCES
• Abbs, J. H., and V. L. Gracco (1984). Control of complex motor gestures: orofacial muscle responses to load perturbations of lip during speech. Journal of Neurophysiology 51, 705-723.
• Bradlow, A. R., Pisoni, D. B., Akahane-Yamada, R., and Tohkura, Y. (1997). Training Japanese listeners to identify English /r/ and /l/: IV. Some effects of perceptual learning on speech production. J. Acoust. Soc. Am. 101, 2299-2310.
• Browman, C. P., and Goldstein, L. (1989). Articulatory gestures as phonological units. Phonology 6, 201-251.
• Fujimura, O., and Kakita, Y. (1979). Remarks on quantitative description of lingual articulation. In B. Lindblom and S. Öhman (eds.), Frontiers of Speech Communication Research. Academic Press.
• Fujimura, O., Kiritani, S., and Ishida, H. (1973). Computer controlled radiography for observation of movements of articulatory and other human organs. Comput. Biol. Med. 3, 371-384.
• Guenther, F. H., Espy-Wilson, C., Boyce, S. E., Matthies, M. L., Zandipour, M., and Perkell, J. S. (1999). Articulatory tradeoffs reduce acoustic variability during American English /r/ production. J. Acoust. Soc. Am. 105, 2854-2865.
• Honda, M., and Murano, E. Z. (2003). Effects of tactile and auditory feedback on compensatory articulatory response to an unexpected palatal perturbation. Proceedings of the 6th International Seminar on Speech Production, Sydney, December 7-10.
• Houde, J. F., and Jordan, M. I. (2002). Sensorimotor adaptation of speech I: Compensation and adaptation. J. Speech, Language, Hearing Research 45, 295-310.
• Lindblom, B. (1990). Explaining phonetic variation: a sketch of the H&H theory. In W. J. Hardcastle and A. Marchal (eds.), Speech Production and Speech Modeling. Dordrecht: Kluwer, 403-439.
• Newman, R. S. (2003). Using links between speech perception and speech production to evaluate different acoustic metrics: A preliminary report. J. Acoust. Soc. Am. 113, 2850-2860.
• Perkell, J. S., and Nelson, W. L. (1985). Variability in production of the vowels /i/ and /a/. J. Acoust. Soc. Am. 77, 1889-1895.
• Saltzman, E. L., and Munhall, K. G. (1989). A dynamical approach to gestural patterning in speech production. Ecological Psychology 1, 333-382.
• Stevens, K. N. (1989). On the quantal nature of speech. J. Phonetics 17, 3-45.
Selected References
Slide 6 (right figure): Denes, Peter B., and Elliot N. Pinson. The Speech Chain: The Physics and Biology of Spoken Language. New York, NY: W. H. Freeman, 1993. ISBN: 0716723441.
Slide 16 (left figure): Grant, J. C. Boileau. Grant's Atlas of Anatomy. 5th ed. Baltimore, MD: Williams & Wilkins, 1962.
Slide 18, Slide 26 (left set), Slide 27, Slide 28, Slide 55, Slide 66: Perkell, J. S., F. H. Guenther, H. Lane, M. L. Matthies, P. Perrier, J. Vick, R. Wilhelms-Tricarico, and M. Zandipour. "A theory of speech motor control and supporting data from speakers with normal hearing and with profound hearing loss." J Phonetics 28 (2000): 233-372.
Slides 20-21: Stevens, K. N. "On the quantal nature of speech." J Phonetics 17 (1989): 3-45.
Slide 23 (left): Fujimura, O., and Y. Kakita. "Remarks on quantitative description of lingual articulation." In Frontiers of Speech Communication Research, edited by B. Lindblom and S. Öhman. Academic Press, 1979.
Slide 23 (mid/lower two), Slide 26 (top right): Perkell, J. S. "Properties of the tongue help to define vowel categories: Hypotheses based on physiologically oriented modeling." Journal of Phonetics 24 (1996): 3-22.
Slide 24: Denes, P. B., and E. N. Pinson. The Speech Chain. New York, NY: W. H. Freeman, 1993, p. 68. ISBN: 0716722569.
Slide 42: Perkell, J. S. "Articulatory Processes." In The Handbook of Phonetic Sciences, edited by W. Hardcastle and J. Laver. Blackwell, 1997, pp. 333-370.
Slide 56, Slide 57: Perkell, J., W. Numa, J. Vick, H. Lane, T. Balkany, and J. Gould. "Language-specific, hearing-related changes in vowel spaces: A preliminary study of English- and Spanish-speaking cochlear implant users." Ear and Hearing 22 (2001): 461-470.
Neuroimaging Correlates of Human Auditory Behavior
HST.722J: Brain Mechanisms for Hearing and Speech
October 27, 2005
Course Instructor: Jennifer R. Melcher
Harvard-MIT Division of Health Sciences and Technology
"The Problem"
Figure by MIT OCW.
(Diagram: the relations among sound perception, neural activity, and noninvasive physiologic measures.)
The figure above illustrates how auditory evoked potentials are measured. The potential shown was evoked by a click stimulus and was recorded in a human subject. The waveform is the averaged response to many click presentations. (AEP record from R. A. Levine.) (Courtesy of Robert Aaron Levine. Used with permission.)
Furst et al. (1985), "Click lateralization is related to the β component of the dichotic brainstem auditory evoked potentials of human subjects"
For binaural clicks with different ITDs and ILDs, they quantified:
• the perception
• the binaural difference potential
Attributes of the binaural difference are correlated with the perception of binaural sound.
Binaural Difference Potential
The binaural difference (BD) is derived from BAEPs evoked by monaural and binaural stimuli (above). BAEP: brainstem auditory evoked potential.
The BD reflects an interaction between converging signals from the two ears at the level of the brainstem.
(Figure by MIT OCW. Fig. 9: Binaural, sum-of-the-monaurals, and binaural difference waveforms for cat and human. The binaural waveforms (solid lines) and the sum of the monaural waveforms (dotted lines) are superimposed; the difference between these two waveforms, the binaural difference (BD) waveform, is plotted below. The recording electrodes were vertex to nape for both species. Stimuli were 10/sec rarefaction clicks at 40 dB SL for the cat and 38 dB HL for the human. From Fullerton et al., 1987.)
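In code, the BD derivation is just a subtraction of averaged waveforms: BD = binaural response minus the sum of the two monaural responses. The sketch below uses synthetic stand-ins for averaged BAEPs; the sampling rate and waveform shapes are assumptions for illustration only.

```python
# Sketch: binaural difference (BD) waveform from averaged evoked potentials.
# The waveforms here are synthetic stand-ins, not real BAEP data.
import numpy as np

fs = 20000                                  # assumed sampling rate (Hz)
t = np.arange(0, 0.012, 1 / fs)             # 12 ms epoch, as in the figure
left_mon = np.sin(2 * np.pi * 500 * t) * np.exp(-t / 0.004)    # fake averaged AEPs
right_mon = np.sin(2 * np.pi * 500 * (t - 0.0004)) * np.exp(-t / 0.004)
binaural = 0.9 * (left_mon + right_mon)     # binaural response < monaural sum

bd = binaural - (left_mon + right_mon)      # nonzero BD -> binaural interaction
```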
Cellular Generators of the Binaural Difference Potential
(Figure by MIT OCW: cellular generators of the binaural difference potential in cat, showing the left and right superior olivary complexes (MSO, LSO), cochlear nucleus subdivisions (DCN, PVCNp, PVCNa, AVCNp, AVCNa; spherical and principal cells), trapezoid body, auditory nerve, and lateral lemniscus leading to NLL/IC. Shading indicates the generators of the first BD peak ('b') and the possible generators of the second peak ('d'); the schematic of the lower auditory system at bottom shows the generators' relationship to other cells. NLL: nuclei of the lateral lemniscus; IC: inferior colliculus. From Melcher, 1996.)
Discussion Questions:
If the generator results are combined with the findings of Furst et al., what can be said about the neural processing underlying sound lateralization and binaural fusion?
We generally think of the MSO as a coincidence detector. Are Furst et al.'s binaural difference data consistent with this idea?
Late Responses: Dependence on Attention and Stimulus Context
• Nd ("processing negativity")
  – Produced when the subject attends to the stimuli
  – Visualized by taking the difference between responses to attended and unattended stimuli
• N2 ("N200", "mismatch negativity")
  – Occurs in response to "rare" stimuli (S2 below) in the oddball paradigm
  – Can occur even when the subject is not attending to the stimuli
  – Dependent on stimulus modality (e.g., auditory vs. visual)
• P3 ("P300")
  – Occurs in response to "rare" stimuli (S2 below) in the oddball paradigm when the subject is attending to the stimuli
  – Independent of stimulus modality
Oddball procedure: two stimuli, S1 & S2, presented in a train with fixed intervals; the probabilities of S1 & S2 are unequal, and S1 & S2 are mixed randomly (e.g., S1 S1 S1 S1 S2 S1 S2 ...). Responses are averaged separately for stimulus type 1 (the frequent stimulus) and stimulus type 2 (the rare, or oddball, stimulus).
(Figure by MIT OCW: idealized auditory event-related potential evoked by transient stimuli, with components P0, N0, Na, Pa, Nb, P1, N1, P2, N2, P3 (P300), and Nd on a log time axis from 10 to 1000 ms; components dependent on stimulus context and subject attention are shown dashed. From Hillyard and Kutas, 1983; also see Hillyard et al., 1973; Donchin et al., 1978. Oddball schematic from Squires & Hecox, 1983, "Electrophysiological Evaluation of Higher Level Auditory Processing," Seminars in Hearing 4(4), p. 422; reprinted by permission; from Hall, 1992.)
Figure by MIT OCW.
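The oddball averaging itself is straightforward to sketch: epochs are averaged separately for standard and deviant trials, and the MMN/P300-bearing difference wave is the deviant average minus the standard average. The data below are synthetic stand-ins; the epoch shapes and the 10% deviant probability are assumptions.

```python
# Sketch of oddball-paradigm averaging and the difference waveform.
# Synthetic single-channel epochs; shapes and probabilities are assumptions.
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_samples = 400, 300
epochs = rng.normal(size=(n_trials, n_samples))   # EEG epochs (trials x samples)
is_deviant = rng.random(n_trials) < 0.10          # ~10% rare (oddball) stimuli

standard_avg = epochs[~is_deviant].mean(axis=0)   # average to frequent stimulus
deviant_avg = epochs[is_deviant].mean(axis=0)     # average to rare stimulus
difference_wave = deviant_avg - standard_avg      # where MMN / P3 would appear
```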
N2 (N200), P3 (P300)
(Figure by MIT OCW. Fig. 1: Mean, for eight subjects, of the non-signal (NS), signal (S), and difference (Δ) waveforms at each electrode site in the auditory condition. Isopotential topographic distributions are expressed as percentages of the maximum response amplitude for the N1 and P2 components of the non-signal response (left) and the negative (N2Δ) and positive (P3Δ) components of the Δ waveform (right). Supraorbital and vertex traces from the 3 runs are superimposed. From Simson et al., 1977. NS: responses to standard stimuli (2000 Hz tone bursts); S: responses to rare stimuli (1000 Hz tone bursts); Δ: response to rare stimuli minus response to standard stimuli.)
(Figure by MIT OCW. Fig. 4: Frontal, vertex, and parietal (across-subjects averaged) difference waveforms obtained by subtracting the ERP to the 1000-Hz standard stimulus from that to the 1044-Hz deviant stimulus at different deviant-stimulus probabilities (2%, 10%, 50%). The continuous line indicates the counting condition and the broken line the ignore condition. The amplitude of the fronto-centrally distributed MMN decreases when the probability is increased from 2% to 10%; when the two stimuli are equiprobable, no MMN is seen. From Sams et al., 1985. MMN: mismatch negativity.)
Kraus et al. (1996), "Auditory neurophysiologic responses and discrimination deficits in children with learning problems"
Stimuli: syllables, varied along two continua
Subjects: children with and without learning problems
Measured: discrimination and mismatch negativity
The children with learning problems showed
• deficits in their ability to discriminate syllables
• abnormally small mismatch negativity
Conclusion: The behavioral deficits in the children with learning problems arose at a processing stage that precedes conscious perception.
PET (radioactive tracer, e.g., radio-labeled H2O):
brain activity increase → metabolic response → blood flow increase → tracer increase
fMRI (blood oxygenation level-dependent, BOLD):
brain activity increase → metabolic response → blood flow increase → blood oxygenation increase → image signal increase
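In fMRI analysis this causal chain is commonly idealized as a linear time-invariant system: the predicted BOLD signal is the neural activity time course convolved with a hemodynamic response function (HRF). The sketch below uses a conventional double-gamma HRF shape; the TR, block timing, and HRF parameters are assumptions, not anything specified on the slide.

```python
# Sketch: predicted BOLD = neural activity convolved with an HRF.
# Double-gamma HRF is a common modeling convention; parameters are assumed.
import numpy as np
from scipy.stats import gamma

tr, n_scans = 2.0, 100                     # repetition time (s), number of scans
t = np.arange(0, 30, tr)                   # HRF support, 0-30 s
hrf = gamma.pdf(t, a=6) - 0.35 * gamma.pdf(t, a=16)   # peak plus undershoot
hrf /= hrf.sum()                           # normalize to unit area

activity = np.zeros(n_scans)
activity[10:15] = activity[50:55] = 1.0    # two blocks of stimulation
bold_pred = np.convolve(activity, hrf)[:n_scans]   # predicted BOLD time course
```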
Computational inflation of the cortical surface. In the inflated format, the cortex of sulci and gyri can be viewed simultaneously. See Fischl et al. (1999), NeuroImage 9: 195-207.
(Figure: (A) folded cortex, lateral view of the human brain showing the Sylvian fissure and superior temporal sulcus; (B) inflated cortex, lateral view; (C) the temporal lobe viewed from above, showing the planum polare, Heschl's gyrus, and planum temporale, with lateral and anterior directions marked. Courtesy of Irina Sigalovsky. Used with permission.)
Scott et al. (2000), "Identification of a pathway for intelligible speech in the left temporal lobe"
Four stimulus conditions that included speech and several forms of degraded speech:
• speech
• rotated speech
• noise-vocoded speech
• rotated, noise-vocoded speech
The stimuli differed in
• intelligibility
• presence of phonetic information
• presence of pitch variations
PET activity was compared between conditions.
Conclusion: Processing unique to intelligible speech is performed anteriorly in the left superior temporal sulcus, while lower-level processing is performed more posteriorly in the left STS and STG.
Discussion Questions:
Scott et al. argue that their choice of stimuli may be better than previous ones for identifying sites of speech-specific processing. Do you agree?
What assumptions have been made about the relationship between brain activity and the functional specificity of a brain region?
Beauchamp et al. (2004) “Unraveling multisensory integration: patchy organization within human STS multisensory cortex” Nat. Neurosci. 7: 1190-1192.
Three stimulus conditions:
- visual (videos of tools or faces)
- auditory (sounds of tools or voices)
- audio-visual (simultaneous images and sounds)
High-resolution fMRI of the superior temporal sulcus, a known region of multimodal convergence
Three types of cortical patches were identified, having:
- auditory > visual response
- visual > auditory response
- auditory = visual
Conclusion: “A model… suggested by our data is that auditory and visual inputs arrive in the STS-MS in separate patches, followed by integration in the intervening cortex.”
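As a concrete (and greatly simplified) illustration of the patch analysis, the sketch below labels voxels by dominant modality given per-voxel auditory and visual response amplitudes. The function name, the `margin` threshold, and the plain amplitude comparison are illustrative assumptions; the study itself used statistical contrasts on high-resolution fMRI data.

```python
import numpy as np

def classify_patches(aud_resp, vis_resp, margin=1.0):
    """Label voxels by dominant modality (toy stand-in for the contrasts
    used in the study). Returns 0 where auditory > visual, 1 where
    visual > auditory, and 2 where the two are comparable (candidate
    integration patches)."""
    aud = np.asarray(aud_resp, dtype=float)
    vis = np.asarray(vis_resp, dtype=float)
    labels = np.full(aud.shape, 2)
    labels[aud - vis > margin] = 0   # auditory-dominant patch
    labels[vis - aud > margin] = 1   # visual-dominant patch
    return labels
```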
Figure 2. Statistical parametric maps for contrasts of interest (group data). a, SPMs are shown as “glass brain” projections in sagittal, coronal, and axial planes. b, SPMs have been rendered on the group mean structural MRI brain image, normalized to the MNI standard stereotactic space (Evans et al., 1993). Tilted axial sections are shown at three levels parallel to the superior temporal plane: 0 mm (center), +2 mm, and -2 mm (insets). The 95% probability boundaries for left and right human PT are outlined (black) (Westbury et al., 1999). Sagittal sections of the left (x = -56 mm) and right (x = +62 mm) cerebral hemispheres are displayed below. All voxels shown are significant at the p < 0.05 level after false discovery rate correction for multiple comparisons; clusters less than eight voxels in size have been excluded. Broadband noise (without pitch) compared with silence activates extensive bilateral superior temporal areas including medial Heschl’s gyrus (HG) (b, center, yellow). In the contrasts between conditions with changing pitch and fixed pitch and between conditions with changing spatial location and fixed location, a masking procedure has been used to identify voxels activated only by pitch change (blue), only by spatial change (red), and by both types of change (magenta). The contrasts of interest activate distinct anatomical regions on the superior temporal plane. Pitch change (but not spatial location change) activates lateral HG, anterior PT, and planum polare (PP) anterior to HG, extending into superior temporal gyrus, whereas spatial change (but not pitch change) produces more restricted bilateral activation involving posterior PT. Within PT (b, axial sections), activation attributable to pitch change occurs anterolaterally whereas activation attributable to spatial change occurs posteromedially. Only a small number of voxels within PT are activated both by pitch change and by spatial change.
Figures 2a, 2b from Warren and Griffiths. “Distinct mechanisms for processing spatial sequences and pitch sequences in the human auditory brain.” J Neurosci 23 (2003): 5799-5804. (Copyright 2003 Society for Neuroscience. Used with permission.)
Zimmer and Macaluso (2005) “High binaural coherence determines successful sound localization and increased activity in posterior auditory areas”
Main Experiment:
- fMRI and behavioral measurements during sound localization
- manipulate sound location using ITD
- manipulate ability to localize by manipulating binaural coherence
- identify brain areas showing a correlation between activation and localization performance
Control Experiments:
- separated activation specifically correlated with localization performance from activation correlated with binaural coherence
Conclusion: Within the superior temporal plane, only planum temporale showed activation specifically correlated with localization performance. It was concluded that binaural coherence cues are used by this region to successfully localize sound.
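To make the stimulus manipulations concrete, here is a minimal NumPy sketch of one standard way to generate binaural noise with a chosen interaural coherence and ITD. The mixing scheme, the wrap-around sample delay, and all names are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def binaural_noise(n, coherence, itd_samples, seed=0):
    """Left/right Gaussian noise with interaural correlation `coherence`
    and an ITD applied as a circular integer-sample delay of one ear."""
    rng = np.random.default_rng(seed)
    common = rng.standard_normal(n)
    w = np.sqrt(coherence)  # mixing weight; interaural correlation = w**2
    left = w * common + np.sqrt(1 - coherence) * rng.standard_normal(n)
    right = w * common + np.sqrt(1 - coherence) * rng.standard_normal(n)
    return left, np.roll(right, itd_samples)

def interaural_coherence(left, right, max_lag=32):
    """Peak of the normalized interaural cross-correlation over +/- max_lag."""
    denom = np.sqrt(np.dot(left, left) * np.dot(right, right))
    return max(np.dot(left, np.roll(right, k)) / denom
               for k in range(-max_lag, max_lag + 1))
```

Lowering `coherence` flattens the cross-correlation peak, which is what degrades ITD-based localization in the experiment.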
Beyond Auditory Cortex: "What" and "Where" Pathways?
Do sound recognition and sound localization involve segregated networks (i.e., "what" and "where" pathways)? This question was addressed by Maeder and coworkers (2001). In an fMRI experiment, subjects were imaged in three conditions: (1) during a localization task, (2) during a recognition task, and (3) at rest (see right).
Ventral cortical areas showed greater activity during the recognition task (green, below), while dorsal areas showed greater activity during localization (red, below).
FIG. 5 Active paradigm: 3-D projections of activation on smoothed normalized brain (group results). Areas more activated in recognition than localization are shown in green; areas more activated in localization than in recognition are shown in red. Adapted from Maeder et al. (2001) NeuroImage 14: 802-816.
FIG. 1 Schematic representation of the experimental paradigm, the blocks, and the temporal structure of the stimuli. L = localization task; R = recognition task; r = rest.
Figures removed due to copyright reasons. Please see: Figures 1 and 5 in Maeder et al. “Distinct pathways involved in sound recognition and localization: a human fMRI study.” NeuroImage 14 (2001): 802-816.
Lewald et al. (2002) “Role of the posterior parietal cortex in spatial hearing” J. Neurosci. 22: RC207.
Subjects performed a sound lateralization task before and after cortical stimulation using transcranial magnetic stimulation (TMS)
TMS: a noninvasive stimulation method that reversibly alters neuronal function
Stimuli: Dichotic tones with various ITDs
Task: indicate perceived location (left or right)
Stimulation site: posterior parietal lobe
Conclusion: TMS produced a shift in sound lateralization, suggesting a role for posterior parietal cortex in spatial hearing.
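For reference, a dichotic pure tone with a given ITD is simple to construct; for a pure tone the delay can be applied exactly as a phase shift. The parameter values below are placeholders, not the stimulus parameters used by Lewald et al.

```python
import numpy as np

def dichotic_tone(freq_hz, itd_us, dur_s=0.3, fs=48000):
    """Two-channel pure tone; one ear is delayed by the ITD (a positive
    ITD here delays the left ear, lateralizing the image to the right;
    the sign convention is arbitrary)."""
    t = np.arange(int(dur_s * fs)) / fs
    delay_s = itd_us * 1e-6
    left = np.sin(2 * np.pi * freq_hz * (t - delay_s))
    right = np.sin(2 * np.pi * freq_hz * t)
    return np.stack([left, right], axis=1)

# e.g., a 500-Hz tone with a 300-microsecond ITD
stim = dichotic_tone(500.0, itd_us=300.0)
```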
“What” and “Where” Pathways of the Visual System
The “what” and “where” pathways in the visual system include areas specialized for processing depth perception (symbolized by a pair of spectacles), form (an angle), color, and direction (the curve-ahead sign). The result is object recognition (the “what” pathway) or object location (the “where” pathway).
Figure removed due to copyright reasons. Please see: Posner, M. I., and M. E. Raichle. Images of the Mind. New York, NY: Scientific American Library, 1994.
REFERENCES
Donchin, E., Ritter, W., and McCallum, W.C. (1978) Cognitive psychophysiology: the endogenous components of the ERP. In "Event-Related Brain Potentials in Man." E. Callaway, P. Tueting, S.H. Koslow, eds. Academic Press, New York.
Fullerton, B.C., Levine, R.A., Hosford-Dunn, H.L. and Kiang, N.Y.S. (1987) Comparison of cat and human brain-stem auditory evoked potentials. Electroenceph. clin. Neurophysiol. 66, 547-570.
Furst, M., Levine, R.A. and McGaffigan, P.M. (1985) Click lateralization is related to the β component of the dichotic brainstem auditory evoked potentials of human subjects. J. Acoust. Soc. Am. 78, 1644-1651.
Goldstein, M.H. Jr. and Kiang, N.Y.S. (1958) Synchrony of neural activity in electric responses evoked by transient acoustic stimuli. J. Acoust. Soc. Am. 30, 107-114.
Hall, James W. III (1992) "Handbook of Auditory Evoked Responses". Allyn and Bacon, Boston.
Hillyard, S.A. and Kutas, M. (1983) Electrophysiology of cognitive processing. Ann. Rev. Psychol. 34: 33-61.
Hillyard, S.A., Hink, R.F., Schwent, V.L. and Picton, T.W. (1973) Electrical signs of selective attention in the human brain. Science, 182, 177-182.
Kraus N, McGee TJ, Carrell TD, Zecker SG, Nicol TG, Koch DB (1996) Auditory neurophysiologic responses and discrimination deficits in children with learning problems. Science 273: 971-973.
Maeder PP, Meuli RA, Adriani M, Bellmann A, Fornari E, Thiran J-P, Pittet A, Clarke S (2001) Distinct pathways involved in sound recognition and localization: a human fMRI study. NeuroImage 14: 802-816.
Melcher, J.R. (1996) Cellular generators of the binaural difference potential. Hear. Res. 95, 144-160.
Melcher, J.R. and Kiang, N.Y.S. (1996) Generators of the brainstem auditory evoked potential in cat III: identified cell populations. Hear. Res. 93, 52-71.
Sams, M., Alho, K., Näätänen, R. (1985) The mismatch negativity and information processing. In "Psychophysiological Approaches to Human Information Processing." F.
Scott SK, Blank SC, Rosen S, Wise RJS (2000) Identification of a pathway for intelligible speech in the left temporal lobe. Brain 123: 2400-2406.
Simson, R., Vaughan, H.G. Jr., and Ritter, W. (1977) The scalp topography of potentials in auditory and visual discrimination tasks. Electroenceph. clin. Neurophysiol. 42: 528-535.
Wang, B. and Kiang, N.Y.S. (1978) Synthesized compound action potentials (CAP's) of the auditory nerve. J. Acoust. Soc. Am. 63: S77.
Wang, B. (1979) The relation between the compound action potential and unit discharges of the auditory nerve. Ph.D. Thesis, Massachusetts Institute of Technology.
fMRI: Mitigation of Scanner Acoustic Noise
Clustered Volume Acquisition: a method for removing the impact of scanner acoustic noise on auditory fMRI activation
Figure removed due to copyright reasons. Please see: Edmister et al. Human Brain Mapping 7 (1999): 89-97.
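The timing logic of clustered volume acquisition is easy to sketch: all slices of a volume are collected in one brief burst at the end of a long TR, leaving a scanner-quiet window for the stimulus. The TR and burst durations below are placeholder values, not the parameters of Edmister et al.

```python
def clustered_schedule(n_trials, tr_s=8.0, acq_s=2.0):
    """Event list for a clustered (sparse) acquisition: a quiet stimulus
    window followed by a burst of slice acquisition in each TR."""
    events = []
    for i in range(n_trials):
        t0 = i * tr_s
        events.append(("stimulus (scanner quiet)", t0, t0 + tr_s - acq_s))
        events.append(("volume acquisition", t0 + tr_s - acq_s, t0 + tr_s))
    return events

for name, start, stop in clustered_schedule(2):
    print(f"{name}: {start:.1f}-{stop:.1f} s")
```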
Selected References
Slide 7: Fullerton, B. C., R. A. Levine, H. L. Hosford-Dunn, and N. Y. Kiang. "Comparison of cat and human brain-stem auditory evoked potentials." Electroencephalogr Clin Neurophysiol 66, no. 6 (Jun 1987): 547-70.
Slide 8: Melcher, J. R. "Cellular generators of the binaural difference potential." Hear Res 95 (1996): 144-160.
Slide 10 (top figure): Hillyard, S. A., and M. Kutas. "Electrophysiology of cognitive processing." Ann Rev Psychol 34 (1983): 33-61.
Slide 10 (bottom figure): Hall, James W. "Handbook of Auditory Evoked Responses." Allyn and Bacon, Boston, I, II, 1992.
Slide 11 (top figure): Simson, R., H. G. Vaughan, Jr., and W. Ritter. "The scalp topography of potentials in auditory and visual discrimination tasks." Electroenceph clin Neurophysiol 42 (1977): 528-535.
Slide 11 (bottom figure): Sams, M., K. Alho, and R. Näätänen. "The mismatch negativity and information processing." Psychophysiological Approaches to Human Information Processing. 1985.
Harvard-MIT Division of Health Sciences and Technology
HST.722J: Brain Mechanisms for Hearing and Speech
Course Instructor: Kenneth E. Hancock
HST.722 Brain Mechanisms of Speech and Hearing, Fall 2005
Dorsal Cochlear Nucleus
September 14, 2005
Ken Hancock
Dorsal Cochlear Nucleus (DCN)
• Overview of the cochlear nucleus and its subdivisions
• Anatomy of the DCN
• Physiology of the DCN
• Functional considerations
The cochlear nucleus
[Figure: the cochlear nucleus; labels: DCN, VCN, AN (auditory nerve).]
AN fibers terminate in a “tonotopic” or “cochleotopic” pattern
[Figure by MIT OCW: labeled auditory-nerve fibers with characteristic frequencies of 0.17, 2.7, 10.2, and 36.0 kHz terminating in an orderly tonotopic sequence across the AVCN, PVCN, and DCN; other labels: ANR, sgb; scale bar, 1 mm.]
Major subdivisions of the cochlear nucleus
[Figure by MIT OCW: major subdivisions (AVCN, PVCN, DCN) shown with the auditory and vestibular nerves and with principal VCN cell types: spherical bushy, globular bushy, multipolar, and octopus cells.]
Summary of pathways originating in the cochlear nucleus
[Figure by MIT OCW: outputs of the cochlear nucleus. Spherical bushy (SBC), globular bushy (GBC), multipolar (M), and octopus (OA) cells of the VCN project toward the superior olive and lateral lemniscus; the DCN projects across the midline toward the inferior colliculus.]
Projections suggest DCN is a different animal than VCN
• (All roads lead to the inferior colliculus)
• VCN projects directly to structures dealing with binaural hearing and olivocochlear feedback
• DCN ???
[Figure by MIT OCW: cochlear nucleus output pathways, repeated from the previous slide.]
Dorsal Cochlear Nucleus
• Overview of the cochlear nucleus and its subdivisions
– DCN projections do not reveal its function
• Anatomy of the DCN
• Physiology of the DCN
• Functional considerations
[Figure by MIT OCW: DCN circuit diagram. Labeled elements include granule, golgi, cartwheel, stellate, vertical, fusiform, and giant cells; auditory-nerve, somatosensory, vestibular, and descending auditory inputs; and projections to the inferior colliculus (I.C.). See Kanold and Young (2001) and Tzounopoulos (2004).]
Dorsal Cochlear Nucleus
• Overview of the cochlear nucleus and its subdivisions
– DCN projections do not reveal its function
• Anatomy of the DCN
– more complex than other CN subdivisions
– nonauditory inputs
– similar organization to cerebellar cortex
• Physiology of the DCN
• Functional considerations
Photograph of Eric Young removed due to copyright reasons. Please see: http://www.bme.jhu.edu/labs/chb/people/index.php?page=ABOUT&user=eyoung
Eric Young
Response Map classification scheme
[Figures by MIT OCW: response maps (discharge rate vs. frequency at sound levels in 10-dB steps) for unit types I through V, ordered from auditory-nerve-like (type I) toward increasing inhibition (types IV and V); excitatory and inhibitory response regions, BF-tone and noise responses, and spontaneous activity indicated.]
DCN: Vertical cells are type II and type III units
Type II: narrow V-shaped region of excitation; no spontaneous activity; tone response >> noise response.
Type III: V-shaped region of excitation with inhibitory sidebands.
Evidence: antidromic stimulation (Young 1980).
[Figures by MIT OCW: type II and type III response maps with rate-level functions for BF tones and broadband noise.]
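A crude sketch of how the features emphasized on these slides separate the unit types; the thresholds and the feature set are invented for illustration and are far simpler than the published classification criteria.

```python
def classify_unit(spont_rate, bf_tone_rates, noise_rates):
    """Toy response-type classifier. `bf_tone_rates` and `noise_rates`
    are rate-level functions ordered from low to high sound level."""
    max_tone = max(bf_tone_rates)
    nonmonotonic = bf_tone_rates[-1] < 0.5 * max_tone  # inhibited at high levels
    weak_noise = max(noise_rates) < 0.3 * max_tone     # tone >> noise
    if spont_rate < 2.5 and weak_noise:
        return "type II"    # no spontaneous activity, weak noise response
    if nonmonotonic and bf_tone_rates[-1] < spont_rate:
        return "type IV"    # BF tone drives rate below spont at high levels
    return "type I/III"     # predominantly excitatory
```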
Antidromic stimulation:
• record from neuron
• shock its axon
[Figures by MIT OCW: antidromic stimulation setup; recording electrodes in the DCN near fusiform, giant, and vertical cells, with stimulating electrodes on the output pathway (DAS) and in the VCN.]
DCN: “Principal” cells are type III and type IV units
• “Island of excitation” and “sea of inhibition”
• BF rate-level curve inhibited at high levels
• Noise rate-level curve ~ monotonic
Evidence: antidromic stimulation (Young 1980); intracellular recording and labeling (Rhode et al. 1983, Ding et al. 1996).
[Figures by MIT OCW: type III and type IV response maps with BF-tone, noise, and spontaneous-rate level functions.]
Neural circuitry underlying DCN physiology: type II units inhibit type IV units
[Figures by MIT OCW: vertical cells (type II) inhibit the fusiform and giant cells that give type IV responses; the type II and type IV responses to BF tones are “reciprocal” (excitation in one mirrors inhibition in the other).]
Classic experiment: type II units inhibit type IV units
• multiunit recording, Voigt & Young (1980, 1990)
[Figures by MIT OCW: simultaneously recorded type II and type IV units with their rate-level functions; the cross-correlogram (type IV firing rate given a type II spike at t = 0) shows an inhibitory trough, the signature of inhibition of the type IV unit by the type II unit.]
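The cross-correlogram itself is straightforward to compute from two spike-time lists; this sketch histograms all pairwise lags within a window (the bin width and window size are arbitrary choices here). Dividing the counts by (number of reference spikes x bin width) converts them to a conditional firing rate.

```python
import numpy as np

def cross_correlogram(spikes_a, spikes_b, bin_s=0.001, max_lag_s=0.01):
    """Histogram of lags (t_b - t_a) for all spike pairs within +/- max_lag_s.
    With a type II train as `spikes_a` and a type IV train as `spikes_b`,
    an inhibitory interaction appears as a trough at small positive lags."""
    spikes_b = np.asarray(spikes_b)
    edges = np.arange(-max_lag_s, max_lag_s + bin_s, bin_s)
    counts = np.zeros(len(edges) - 1)
    for t in spikes_a:
        lags = spikes_b - t
        counts += np.histogram(lags[np.abs(lags) <= max_lag_s], bins=edges)[0]
    centers = edges[:-1] + bin_s / 2
    return centers, counts
```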
DCN physiology so far…
[Schematic: the auditory nerve excites (+) both type II (~ vertical cells) and type IV (~ principal cells) units.]
• type II units inhibit type IV units
• BUT this analysis is based on pure-tone responses
⇒ what happens with more general stimuli???
Inhibition from type II units doesn’t account for everything
• (DCN responses to broadband stimuli cannot be predicted from responses to tones: nonlinear)
• Type II units do not respond to notch noise—whither the inhibition?
• Response map has two inhibitory regions?
[Figure by MIT OCW: type IV unit response map with its upper inhibitory sideband, notch-noise stimulus spectra, and data vs. model rate responses.]
DCN notch noise sensitivity due to wideband inhibition
Broadband noise: the strong AN input dominates ⇒ the type IV response is excitatory.
Notch noise: the type IV unit loses a greater portion of its excitatory input than the broadly tuned wideband inhibitor (WBI) does ⇒ the WBI input dominates ⇒ the type IV response is inhibitory.
(Nelken & Young 1994)
[Figure by MIT OCW: excitatory AN and inhibitory WBI inputs to a type IV unit, with broadband-noise and notch-noise spectra.]
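A toy rate model makes the logic explicit: sum stimulus power through a narrow excitatory kernel and a broad inhibitory kernel and subtract. All bandwidths and weights below are made up for illustration; they are not Nelken and Young's model parameters.

```python
import numpy as np

def type_iv_rate(power, freqs_khz, bf_khz=10.0, an_bw=0.15, wbi_bw=1.5,
                 w_exc=1.0, w_inh=0.8):
    """Rectified difference of narrowband excitation (AN) and wideband
    inhibition (WBI), each a normalized Gaussian average (in log frequency)
    of stimulus power."""
    logf = np.log2(np.asarray(freqs_khz) / bf_khz)
    k_exc = np.exp(-(logf / an_bw) ** 2)
    k_inh = np.exp(-(logf / wbi_bw) ** 2)
    exc = (power * k_exc).sum() / k_exc.sum()
    inh = (power * k_inh).sum() / k_inh.sum()
    return max(0.0, w_exc * exc - w_inh * inh)

freqs = np.linspace(1.0, 40.0, 400)                     # kHz
broadband = np.ones_like(freqs)
notch = np.where(np.abs(freqs - 10.0) < 2.0, 0.0, 1.0)  # notch at BF
print(type_iv_rate(broadband, freqs))  # positive: excitatory response
print(type_iv_rate(notch, freqs))      # 0 after rectification: inhibited
```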
PVCN: is the D-stellate cell the wideband inhibitor?
• broadly-tuned, onset-chopper units are found in the PVCN (Winter & Palmer 1995)
• typically respond better to broadband noise than to tones
• such responses arise from radiate or stellate neurons (Smith & Rhode 1989)
• stellate cells send axons dorsally into the DCN, thus called “D-stellate cells” (Oertel et al. 1990)
• D-stellate cells are inhibitory (Doucet & Ryugo 1997)
[Figures by MIT OCW: onset-chopper tuning curve and rate-level functions (broadband noise vs. BF tone), and the D-stellate projection from PVCN to DCN.]
Summary: Circuitry of DCN deep layer
• Excitatory input: AN fibers?
• Type II inhibitory inputs (strong)
• UIS from ? source (weak)
(Spirou and Young 1991)
[Figures by MIT OCW: type IV response map with excitatory and inhibitory regions; type II and wideband inhibitor (W.B.I.) inputs with their “reciprocal” response properties; BF-tone and noise rate-level functions.]
171www.onlineeducation.bharatsevaksamaj.net www.bssskillmission.in
WWW.BSSVE.IN
Summary of DCN anatomy and physiology
[Summary schematic: nonauditory inputs (somatosensory, auditory, vestibular) reach granule cells, which contact fusiform cells.]
• Vertical cell: type II unit; narrowband inhibition
• D-stellate cell: onset-chopper; wideband inhibition
Dorsal Cochlear Nucleus
• Overview of the cochlear nucleus and its subdivisions
– DCN projections do not reveal its function
• Anatomy of the DCN
– more complex than other CN subdivisions
– nonauditory inputs
– similar organization to cerebellar cortex
• Physiology of the DCN
– diverse response properties
– complex interconnections
– highly nonlinear
• Functional considerations
Filtering by the pinna provides cues to sound source location
[Figures by MIT OCW: human head-related transfer functions (HRTFs): outer-ear gain (dB) vs. frequency (2-20 kHz) at several elevations (-15°, 0°, +15°, 60°, 90°); the “first notch” frequency changes with elevation.]
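Given a measured HRTF, the first-notch frequency can be read off as the lowest-frequency local minimum of the gain within a search band; the band limits below are illustrative assumptions.

```python
import numpy as np

def first_notch_khz(gain_db, freqs_khz, lo_khz=4.0, hi_khz=18.0):
    """Lowest-frequency local minimum of outer-ear gain within [lo, hi]."""
    f = np.asarray(freqs_khz)
    g = np.asarray(gain_db)
    band = (f >= lo_khz) & (f <= hi_khz)
    fb, gb = f[band], g[band]
    is_min = (gb[1:-1] < gb[:-2]) & (gb[1:-1] < gb[2:])  # strict local minima
    idx = np.flatnonzero(is_min)
    return fb[idx[0] + 1] if idx.size else None
```

Tracking this value across elevations recovers the notch-vs-elevation cue shown in the figure.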
Type IV units are sensitive to HRTF first notch
• type IV units are inhibited by notches centered on BF
• null in DCN population response may code for sound source location
[Figures by MIT OCW: physiology (Reiss & Young 2005) and behavior (May 2000); type IV discharge rate vs. stimulus notch frequency at several sound levels shows a null when the notch is centered near BF; HRTF gains shown for El = 30°, Az = ±15°.]
Dorsal Cochlear Nucleus
• Overview of the cochlear nucleus and its subdivisions
– DCN projections do not reveal its function
• Anatomy of the DCN
– more complex than other CN subdivisions
– receives nonauditory inputs
– has similar organization to cerebellar cortex
• Physiology of the DCN
– diverse response properties
– complex interconnections
– highly nonlinear
• Functional considerations
– coding sound source location based on pinna cues
DCN is a “cerebellum-like structure”
Predictive inputs:
- corollary discharge signals
- higher levels of the same modality
- other sensory modalities (e.g., proprioception)
[Figure by MIT OCW: generic cerebellum-like structure with molecular, granule-cell, principal-cell, and sensory-input layers; “plastic” synapses carry the predictive inputs (for the DCN: descending auditory and somatosensory signals), while the sensory input layer receives input from a sensory surface (the cochlea).]
Synaptic plasticity: Long-Term Potentiation (LTP)
• “Classical” LTP demonstration at the hippocampal CA3-CA1 synapse
• LTP evoked by tetanic stimulation (mechanism involves NMDA receptors)
(Tzounopoulos 2004)
Electric fish provide clues to cerebellum-like function
black ghost knifefish (Apteronotus albifrons)
Photograph removed due to copyright reasons. Please see the Nelson Lab home page: http://nelson.beckman.uiuc.edu
electric organ discharge (EOD)
• electrical activity detected by the electric lateral line
• afferent activity transmitted to the electric lateral line lobe (ELL), analogous to the DCN
Electric fields provide information about nearby objects
• BUT the fish generates its own electric fields:
– tail movements
– ventilation
⇒ the cerebellum-like ELL helps solve this problem (Bell 2001)
Figures by MIT OCW.
What do cerebellum-like structures do???
• Subtract the expected input pattern from the actual input pattern to reveal unexpected or novel features of a stimulus.
– DCN: pinna movement is expected to shift the first notch, independent of what the sound source is doing
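One way to see how such a subtraction could be learned is an adaptive filter: weights on the predictive inputs are adjusted so that their weighted sum cancels the predictable part of the sensory input, leaving the residual as the novel signal. The sketch below is a plain LMS filter used purely as an illustration of the idea, not a circuit model of the DCN.

```python
import numpy as np

def predictive_cancellation(sensory, predictors, lr=0.01):
    """LMS sketch of a cerebellum-like negative image: `predictors` is a
    (T, K) array of predictive signals (corollary discharge, proprioception,
    ...); the returned residual is the unexpected part of `sensory`."""
    w = np.zeros(predictors.shape[1])
    novelty = np.empty(len(sensory))
    for t, (x, p) in enumerate(zip(sensory, predictors)):
        novelty[t] = x - p @ w       # actual minus expected input
        w += lr * novelty[t] * p     # adapt toward cancelling the prediction
    return novelty, w
```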
Dorsal Cochlear Nucleus
• Overview of the cochlear nucleus and its subdivisions
– DCN projections do not reveal its function
• Anatomy of the DCN
– more complex than other CN subdivisions
– nonauditory inputs
– similar organization to cerebellar cortex
• Physiology of the DCN
– diverse response properties
– complex interconnections
– highly nonlinear
• Functional considerations: the DCN may…
– code sound source location based on pinna cues
– extract novel components of response
DCN may play a role in tinnitus
• percept of noise, ringing, buzzing, etc.
• affects up to 80% of the population
• 1 in 200 are debilitated
• (not voices in the head)
This makes no sense!
So why DCN? Because tinnitus…
• involves plasticity
• may involve somatosensory effects
(Levine 1999)
Dorsal Cochlear Nucleus
• Overview of the cochlear nucleus and its subdivisions
– DCN projections do not reveal its function
• Anatomy of the DCN
– more complex than other CN subdivisions
– nonauditory inputs
– similar organization to cerebellar cortex
• Physiology of the DCN
– diverse response properties
– complex interconnections
– highly nonlinear
• Functional considerations: the DCN may…
– code sound source location based on pinna cues
– extract novel components of response
– contribute to tinnitus
Selected References
Slide 5: Ryugo DK, May SK (1993) The projections of intracellularly labeled auditory nerve fibers to the dorsal cochlear nucleus of cats. J Comp Neurol 329:20-35.
Slide 6: Ehret G, Romand R, eds (1997) The Central Auditory System. New York: Oxford University Press.
Slide 7: Ehret G, Romand R, eds (1997) The Central Auditory System. New York: Oxford University Press.
Slide 12: Lorente de Nó R (1981) The Primary Acoustic Nuclei. New York: Raven.
Kane ES, Puglisi SG, Gordon BS (1981) Neuronal types in the deep dorsal cochlear nucleus of the cat. I. Giant neurons. J Comp Neurol 198:483-513.
Slide 15: Young ED (1984) Response characteristics of neurons of the cochlear nuclei. In: Hearing Science (Berlin CI, ed), pp 423-460. San Diego: College-Hill. ISBN: 0316091693.
Slides 16, 18, 19: Young ED (1984) Response characteristics of neurons of the cochlear nuclei. In: Hearing Science (Berlin CI, ed), pp 423-460. San Diego: College-Hill.
Young ED, Davis KA (2001) Circuitry and Function of the Dorsal Cochlear Nucleus. In: Integrative Functions in the Mammalian Auditory Pathway (Oertel D, Popper AN, Fay RR, eds). New York: Springer-Verlag.
Selected References, continued
Slide 20: Voigt HF, Young ED (1990) Cross-correlation analysis of inhibitory interactions in dorsal cochlear nucleus. J Neurophysiol 64: 1590-1610.
Young ED, Davis KA (2001) Circuitry and Function of the Dorsal Cochlear Nucleus. In: Integrative Functions in the Mammalian Auditory Pathway (Oertel D, Popper AN, Fay RR, eds). New York: Springer-Verlag.
Slide 22: Spirou GA, Young ED (1991) Organization of dorsal cochlear nucleus type IV unit response maps and their relationship to activation by band-limited noise. J Neurophysiol 66: 1750-1768.
Slide 23: Nelken I, Young ED (1994) Two separate inhibitory mechanisms shape the responses of dorsal cochlear nucleus type IV units to narrowband and wideband stimuli. J Neurophysiol 71: 2446-2462.
Slide 24: Winter IM, Palmer AR (1995) Level dependence of cochlear nucleus onset unit responses and facilitation by second tones or broadband noise. J Neurophysiol 73: 141-159.
Young ED, Davis KA (2001) Circuitry and Function of the Dorsal Cochlear Nucleus. In: Integrative Functions in the Mammalian Auditory Pathway (Oertel D, Popper AN, Fay RR, eds). New York: Springer-Verlag.
Oertel D, Wu SH, Garb MW, Dizack C (1990) Morphology and physiology of cells in slice preparations of the posteroventral cochlear nucleus of mice. J Comp Neurology 295:136-154.
Selected References, continued
Slide 25: Spirou GA, Young ED (1991) Organization of dorsal cochlear nucleus type IV unit response maps and their relationship to activation by band-limited noise. J Neurophysiol 66: 1750-1768.
Nelken I, Young ED (1994) Two separate inhibitory mechanisms shape the responses of dorsal cochlear nucleus type IV units to narrowband and wideband stimuli. J Neurophysiol 71: 2446-2462.
Young ED, Davis KA (2001) Circuitry and Function of the Dorsal Cochlear Nucleus. In: Integrative Functions in the Mammalian Auditory Pathway (Oertel D, Popper AN, Fay RR, eds). New York: Springer-Verlag.
Slide 26: Oertel D, Wu SH, Garb MW, Dizack C (1990) Morphology and physiology of cells in slice preparations of the posteroventral cochlear nucleus of mice. J Comp Neurology 295: 136-154.
Slide 29: Geisler CD (1998) From Sound to Synapse: Physiology of the Mammalian Ear. Oxford University Press.
Slide 30: Young ED, Spirou GA, Rice JJ, Voigt HF (1992) Neural organization and responses to complex stimuli in the dorsal cochlear nucleus. Philos Trans R Soc Lond B Biol Sci 336: 407-413.
Slide 32: Bell CC (2001) Memory-based expectations in electrosensory systems. Curr Opin Neurobiol 11: 481-487.
Slide 35: Zakon HH (2003) Insight into the mechanisms of neuronal processing from electric fish. Curr Opin Neurobiol 13: 744-750.
Harvard-MIT Division of Health Sciences and Technology HST.722J: Brain Mechanisms for Hearing and Speech Course Instructors: Bertrand Delgutte, M. Christian Brown,
Joe C. Adams, Kenneth E. Hancock, Frank H. Guenther, Jennifer R. Melcher, Joseph S. Perkell, David N. Caplan
HST.722 Student Topics Proposal Erik Larsen
Cortical correlates of audio-visual integration
Introduction
Animals have multiple sensory systems with which they can probe the external
environment. Oftentimes, objects or events in the environment produce signals to which more than one sensory system responds, and it is the task of the cerebral cortex to integrate these separate modalities into a unified percept. We take this process for granted, but it is in fact a difficult problem, which becomes obvious when we ask ourselves how we would design a system to do the same. The current state of knowledge described in the literature is quite limited and mostly phenomenological, i.e. describing how auditory and visual inputs can modulate each other and/or activate association areas of the cortex. There does not yet seem to be a mechanistic understanding of how this works.
I have collected a number of papers that look at audio-visual integration at different levels, from single-unit studies to cortical activation patterns to behavioral. The methods used range from intracellular recordings to scalp-recorded ERP (event-related potentials) to fMRI / MEG / PET imaging. For your pleasure and convenience, most of these papers are quite short. I have chosen three of them to discuss in class (Discussion papers), three for background, and quite a few for ‘further reading’ (if you only want to read a few of these, the most interesting for me are marked by ♦).
Behavioral effects of audiovisual integration
Ultimately, multi-modal (or multi-sensory) integration should lead to noticeable
behavioral effects. For humans, the most interesting case is speech perception, which is usually considered an auditory modality but is strongly influenced by visual input. Being able to see the person talking to you increases intelligibility, especially in adverse conditions (high noise, reverberation, competing talkers), as shown by Sumby and Pollack (1954). The visual input gives redundant cues that reinforce the auditory stimulus but also disambiguates speech sounds that differ in place of articulation yet sound similar (such as /ba/ vs. /da/). Sumby and Pollack (1954) showed that especially at low acoustic signal-to-noise ratios, the visual signal can dramatically increase word recognition, in some cases from near zero to 70% or 80%. At higher signal-to-noise ratios, the visual signal still contributes, but the absolute effect is smaller because auditory performance alone is already high.
Besides this synergistic effect of auditory and visual input, there are a few other effects that clearly demonstrate the strong interaction of these two modalities. The first is the so-called ‘ventriloquist’ effect, where a synchronous yet spatially separate audio and visual signal is heard as originating from the visual location. Macaluso et al. (2004) were interested in finding brain areas that mediate this integration of spatially separate yet synchronous information into a single percept, through PET scans of human brains, and identified an area in the right inferior parietal lobule to be activated especially in this
condition. Bushara et al. (2001) used PET scans to identify the right insula as the area most strongly involved in audiovisual synchrony-asynchrony detection.
Another famous audiovisual ‘illusion’ is the ‘McGurk effect’ (McGurk and MacDonald, 1976), where the sound /ba/ is combined with the visual image of a talker articulating /ga/, leading to a robust percept of /da/. The usual explanation for this effect is that the brain tries to find the most likely stimulus given the (in this case conflicting) auditory and visual cues. Note that this effect is extremely robust and not susceptible to extinction, even after hundreds of repetitions. This effect has contributed to the notion that audiovisual integration occurs at a pre-lexical stage, early in the neural processing pathway.
Neural correlates of audiovisual integration for ‘simple stimuli’
Given the strong behavioral effects of audiovisual integration described above, it
would be very interesting to explore the neural basis for these. However, we will first explore some more basic properties of audiovisual integration.
The canonical view of cortical sensory processing is that each sensory modality has a primary unimodal cortex and several higher-order unimodal association cortices, and that finally the various sensory modalities interact in multimodal association cortices. For the auditory case, the primary auditory cortex is located in the superior temporal gyrus or BA41 (see Fig. 1 for an overview of Brodmann areas). This area is surrounded by a belt and parabelt region, which are the auditory association areas. The middle and inferior temporal gyri (BA 20, 21, 37) are multisensory association areas, mainly auditory and visual. In fMRI studies of human brain activation, these multimodal areas activate uniformly in response to multimodal stimuli. Using high-resolution fMRI, however, Beauchamp et al. (2004) have recently shown that these multisensory areas (at least in the superior temporal sulcus) in fact contain a patchwork of auditory, visual, and audiovisual areas, each a few mm in size. It appears that the various unimodal areas send projections to small patches of multisensory cortex, after which the modalities are integrated in the intervening patches.
From human imaging and animal studies it is clear that there are special cortical areas which have multisensory responses. Komura et al. (2005) found that multisensory responses can also be found at lower levels, specifically in the auditory thalamus, the medial geniculate body (MGB). Traditionally, the thalamus is thought of as a relay station between brainstem/spinal cord and cortex, sending signals upward; but it is also known that it receives massive projections from the cortex itself. Komura and coworkers recorded from MGB shell neurons (which receive the cortical projections) in rat during a reward-based auditory spatial discrimination task, which was paired with an irrelevant yet variable light stimulus. Although MGB neurons did not respond to the light stimulus alone, their response was strongly modulated by the visual input: in particular, the response was greater when auditory and visual stimuli were matched and smaller when they were conflicting. The matched and conflicting conditions also led to a decrease and an increase in reaction time, respectively, which again demonstrates the utility of integrating multiple modalities (which would usually be in agreement with one another, leading to a faster reaction). Interestingly, MGB shell neurons were also modulated by the amount of reward, although that is beyond the scope of our discussion.
A conceptual difficulty with studies of multisensory integration is that the paradigm does not always permit one to be sure that integration has in fact occurred. For example, in the mentioned study by Macaluso et al. (2004), where spatially separated audio and video were used, it is in principle possible that there was no integration, even though one would expect it based on prior psychophysical data. Also, conditions that are designed to produce multisensory integration often necessarily use somewhat different stimuli than conditions that are not aimed at producing integration. Bushara et al. (2003) devised an ingenious method to circumvent these problems, and were able to use exactly identical stimuli that sometimes produced integration and sometimes did not. By comparing brain activation (BOLD fMRI) between the two categories, they were able to find a correlation between larger activation in specific areas and the (non-)occurrence of integration. Because of these results, they propose that multisensory association areas work in parallel with primary sensory areas, instead of as a higher-level end-station, which is the usual view. This would agree with Komura et al. (2005), who showed that error rates were sensitive to matched or conflicting simultaneous audiovisual signals, which also had strong neural correlates (enhanced or depressed rate responses).
A completely different aspect of multisensory effects on sensory processing is attentional cueing, i.e. one modality biases the observer such that the other modality will respond preferentially to the cued object. As an example, consider hearing a familiar voice calling your name from a crowd of people; you will reflexively turn towards the sound location and focus your visual attention on the same location, leading you to find your friend more quickly than had he been silent. Such cueing effects are well documented in psychophysical experiments, and it has been proposed that by directing attention towards an object, the neural signal from that object propagates faster through the brain. This is the proposed neural correlate of judging attended objects to appear earlier than unattended objects, even if they appear simultaneously. McDonald et al. (2005) used a sound cue in a visual task where subjects were required to judge which of two lights switched on earlier. They were able to show, by measuring event-related potentials (ERPs) from the scalp, that the attended (cued) light yielded a larger ERP than the unattended light, even if the lights were simultaneous. Somehow this difference in magnitude is translated into a delay in subsequent stages of processing that lead to perception. These findings may again corroborate to some extent the findings of Komura et al. (2005), in that cueing by one modality can influence the response of neurons in the other modality, with subsequent clear-cut behavioral consequences.
Neural correlates of audiovisual integration in speech perception
We have already described how important visual signals are for speech perception, both in the synergistic sense of aiding speech intelligibility (Sumby and Pollack, 1954) as well as in creating illusions such as the McGurk effect (McGurk and MacDonald, 1976). In the previous section we explored how relatively simple audiovisual stimuli that may or may not be fused into single objects can influence neural responses, both for individual neurons as well as for the whole brain. How is this for speech? Does brain activity differ between purely listening to speech as compared to listening and seeing speech (beside predictable activation of purely visual sensory areas)? And what about purely seeing speech (no sound)? Does this activate any of the same cortical areas? The latter question
is not only of academic interest in understanding language processing, but also has a more practical importance in that this is how many deaf people ‘listen’ to others, i.e. by lipreading, also called speechreading. It would have great value to know which areas are the most important for speechreading, and whether differences in speechreading ability (which are large between people) have an identifiable neural basis.
The classic paper in this context is Calvert et al. (1997). They used fMRI to find brain areas of increased activity when either listening to speech or silently lipreading, and the surprising finding was that silent lipreading activates primary and association auditory cortices. This was interpreted to mean that audiovisual integration occurs at a very early level in the neural pathway, even before association areas are activated (although these may cooperate in parallel, instead of hierarchically, cf. Bushara, 2003). Auditory areas were not activated by closed-mouth, non-speech movements of the lower face. A series of subsequent experiments similar to these were conducted, and with improvements in fMRI technology it became controversial whether primary auditory cortex is indeed activated by silent lipreading. Most investigators failed to find group-averaged activation of primary auditory cortex, although auditory association areas (e.g. BA 42, 22) do reliably activate during silent lipreading. Hall et al. (2005) found this kind of result, with the exception that for some proficient lipreaders, superior temporal gyrus did activate. It seems that currently there is no unambiguous answer to the question whether silent lipreading activates primary auditory cortex; there is evidence pro and contra. Interestingly, Hall et al. (2005) describe other cortical areas that vary in activity as a function of speechreading proficiency. For example, high activity in the left inferior frontal gyrus was associated with poor speechreading ability. The explanation is that the greater task difficulty requires more extensive use of cognitive processes, which are located in the frontal lobe. However, the left inferior frontal gyrus is Broca’s area (BA 44, 45), traditionally assumed to support articulatory-based mechanisms of speech production and executive aspects of semantic processing. From these and other recent results, it is becoming increasingly clear that Broca’s area is probably also involved in language comprehension, supporting the formation of syntactic and semantic structure and syntactic working memory.
Paulesu et al. (2003) also found activation of the perisylvian language area and of Broca’s area (BA 44) in lipreading, including non-lexical lipreading (NLLR, formed by playing video backwards). Lexical lipreading (LLR, forward video) differentially activated the more anterior part of Broca’s area (BA 45), and also the left inferior temporal cortex. Therefore, Paulesu and coworkers assumed that these two areas might be particularly important for lexical access in lipreading. As in other recent imaging studies of lipreading, they found activation of auditory association areas (but not primary auditory areas) for both LLR and NLLR.
Conversely, auditory input can also activate visual sensory areas, as studied by Giraud and Truy (2002). In both normal-hearing and cochlear-implant subjects, the fusiform gyrus and early visual cortex (BA 18, 19) were activated by listening to speech (no visual signal). It is assumed that the expectancy of visual correlates of speech is responsible for this effect. In cochlear-implant patients, the visual area activation was much greater than for normal-hearing subjects, showing the greater reliance they presumably place on visual cues. This shows that multimodal integration is flexible and can be strengthened when one modality is degraded.
References
Background
• W.H. Sumby and I. Pollack, “Visual contribution to speech intelligibility in
noise,” J. Acoust. Soc. Am. 26(2): 212-215, 1954. Nice short paper showing the (large) benefit (in terms of words correctly perceived) of seeing the face (mouth) that is talking to you, with an interesting interaction of the size of the vocabulary.
• H. McGurk and J. MacDonald, “Hearing lips and seeing voices,” Nature 264: 746-748, 1976. The famous ‘McGurk effect’ described: an auditory /ba/ paired with a visual /ga/ leads to a perception of /da/.
• G.A. Calvert, E.T. Bullmore, M.J. Brammer, R. Campbell, S.C.R. Williams, P.K. McGuire, P.W.R. Woodruff, S.D. Iversen, A.S. David, “Activation of auditory cortex during silent lipreading,” Science 276: 593-596, 1997. Classic paper on neuroimaging of speechreading; was the first to show auditory cortex activation from speechreading alone. Later studies have usually found that primary auditory cortex is not activated by speechreading alone, throwing some controversy on the issue.
Discussion papers
• K.O. Bushara, T. Hanakawa, I. Immisch, K. Toma, K. Kansaku, and M. Hallett,
“Neural correlates of cross-modal binding,” Nature Neurosci. 6(2): 190-195, 2003. Shows that specific cortical areas are more active (fMRI) when an audiovisual stimulus is perceived as one, compared to when exactly the same audiovisual stimulus is perceived as separate unimodal events.
• G.A. Calvert and R. Campbell, “Reading speech from still and moving faces: The neural substrates of visible speech,” J. Cogn. Neurosci. 15(1): 57-70, 2003. Visible speech aids speech reception both when moving images are presented and when still images are. In this study, fMRI was used to assess whether different cortical areas subserve these two visual categories. They found that similar language-based cortical areas were activated in either case, although stronger activation was seen for the moving images.
• Y. Komura, R. Tamura, T. Uwano, H. Nishijo, and T. Ono, “Auditory thalamus integrates visual inputs into behavioral gains,” Nature Neurosci. 8(9): 1203-1209, 2005. Single-unit responses in rat MGB (belt region) are modulated by visual stimulus and reward, in an audiovisual discrimination task. Shows that neuron response predicts behavioral response in hit/miss/reject/false alarm conditions; also shows neural and behavioral correlates of task difficulty.
Further reading (♦ most interesting)
• T. Raij, K. Uutela, and R. Hari, “Audiovisual integration of letters in the human
brain,” Neuron 28: 617-625, 2000. MEG imaging of the human brain in response to visual, auditory, and audiovisual letters. It was found that in the superior temporal sulcus audiovisual responses are suppressed relative to the unimodal responses, in contrast to single-unit responses, which are usually potentiated for multimodal stimuli.
• K.O. Bushara, J. Grafman, and M. Hallett, “Neural correlates of auditory-visual stimulus onset asynchrony detection,” J. Neurosci. 21(1): 300-304, 2001.
The authors wanted to identify which brain areas are primarily involved in detecting audio-visual asynchrony, and found the right insula to correlate best with the asynchrony detection effort as measured by response latency.
• A.L. Giraud and E. Truy, “The contribution of visual areas to speech comprehension: a PET study in cochlear implant patients and normal-hearing subjects,” Neuropsychologia 40: 1562-1569, 2002. Cochlear-implant patients (but not normal-hearing subjects) produce strong visual cortex activation when hearing speech.
• E. Paulesu, D. Perani, V. Blasi, G. Silani, N.A. Borghese, U. De Giovanni, S. Sensolo and F. Fazio, “A functional-anatomical model for lipreading,” J. Neurophysiol. 90: 2005-2013, 2003. A PET-scanning study of lexical and non-lexical (reversed video) lipreading implicates Broca’s area in speech perception, as well as some other areas. Associative (non-primary) auditory cortex was activated for both stimulus conditions.
• ♦ E. Macaluso, N. George, R. Dolan, C. Spence, and J. Driver, “Spatial and temporal factors during processing of audiovisual speech: A PET study,” NeuroImage 21: 725-732, 2004. Using PET scanning the authors identify cortical regions that are sensitive to the synchrony and/or spatial coincidence of visual and auditory speech.
• ♦ M.S. Beauchamp, B.D. Argall, J. Bodurka, J.H. Duyn, and A. Martin, “Unraveling multisensory integration: patchy organization within human STS multisensory cortex,” Nature Neurosci. 7(11): 1190-1192, 2004. The authors show with high-resolution fMRI that audiovisual association cortex has patches of auditory, visual, and audiovisual cortex, instead of a homogenous audiovisual cortex.
• D.A. Hall, C. Fussell, and A.Q. Summerfield, “Reading fluent speech from talking faces: Typical brain networks and individual differences,” J. Cogn. Neurosci. 17(6): 939-953, 2005. Includes a literature overview of neuroimaging studies of speech/lip-reading. Investigates with fMRI which cortical areas are active in both auditory and visual speech comprehension, and defines some cortical areas whose activity correlates with individual speechreading proficiency.
• ♦ J.J. McDonald and W.A. Teder-Sälejärvi, F. Di Russo, and S.A. Hillyard, “Neural basis of auditory-induced shifts in visual time-order perception,” Nat. Neurosci. 8(9): 1197-1202, 2005. Measuring ERP (event-related potentials) from human scalp the authors show that time-order judgments correlate with the magnitude of ERP, not its timing. In particular, they find no evidence for the prior held belief that attended stimuli propagate faster through the CNS than unattended stimuli. Auditory cues were used to bias attention towards one of two visual cues, which could appear simultaneously or with a time delay.
Figure removed due to copyright considerations.
Figure 1. Outline of Brodmann areas with functional attribution. From http://spot.colorado.edu/~dubin/talks/brodmann/brodmann.html.
Neural Centers and Perceptual Characteristics of Auditory Short-term Memory
Anna A. Dreyer
Studies of auditory perception rarely incorporate findings regarding the storage of short-term auditory memory: investigators rarely incorporate auditory working memory into the neural circuits and anatomical regions they hypothesize underlie perception. This review focuses on the brain regions, neural circuits and perceptual characteristics of human auditory working memory, the storage of auditory percepts necessary to perform short-term, trial-by-trial tasks, in contrast to global, permanent memory formation. The neural areas participating in auditory working memory greatly overlap with those implicated in auditory perception, suggesting the importance of incorporating working memory into our understanding of auditory perception. Investigators interested in elucidating the mechanisms of auditory memory have focused on characterizing its perceptual limitations and neurophysiological responses. Psychophysical studies seek to understand the properties of the representation in humans, single-cell recordings look for responses characteristic of short-term retention, and imaging studies implicate the brain regions necessary to perform various short-term memory tasks. In addition, lesion studies have been important in examining the perceptual and neurophysiological deficits when a key area implicated in short-term auditory memory has been removed. This review will describe how each methodology has shaped the current understanding of the mechanisms underlying auditory memory. Retention of non-verbal, perceptual information needed in pitch and lateralization discrimination is discussed to allow for comparisons between neurophysiological studies in non-human primates and human perceptual and imaging studies.
Psychophysical studies find parametric segregation of short-term memory stores
Psychophysical studies have explored system storage capacities in delayed performance of tasks with respect to variation in sensory parameters, such as duration, pitch, spatial location and memory load of auditory stimuli. In particular, investigators have discovered that auditory memory has separate stores for important auditory parameters. Anourova et al. (1999) conducted experiments testing whether pitch and location short-term memory can be disrupted with pitch- and location-based distracters. Subjects were asked to identify whether two stimuli, separated by a delay, matched in frequency, in cases where either a stimulus of a different pitch or one in a different spatial location served as a distracter during the delay. Anourova et al. found that only distracters of the same kind (pitch or spatial location) degraded performance, concluding that pitch and location are held in different auditory stores. Findings of Clarke et al. (1999) corroborate those findings. Semal and Demany (1991) suggest that pitch and timbre also have distinctly independent representations in auditory working memory. Other studies characterize which pitch- or location-based stimuli are the most distracting. For instance, Deutsch (1972) asked subjects to compare two test tones and played distracter tones between the two test tones. Subjects were told to ignore the distracter tones and indicate whether the two tones were similar or different in pitch. However, as also found by Anourova et al. (1999) and Clarke et al. (1999), performance degraded with distracter tones having similar pitches to the test tones.
Performance worsened dramatically when the distracter tones were 2/3 tone away from the test tone, regardless of the tone frequency, and improved for distracter tones further above or below that point. These findings indicate that distracter tones are
most effective when they are close in frequency to the test tones. This study and another conducted by the author a year later (Deutsch, 1973) also suggest that auditory memory storage is logarithmically organized, since the errors scaled with the tonal distance between the distracter and test tone. As evidence of further parameter segregation in auditory working memory, Clement et al. (1999) suggested that pitch and loudness memory have separate representations and possibly different processing centers. This study measured performance during a frequency and an intensity discrimination task and found that the patterns of performance degradation as a function of delay between the two test stimuli differ substantially, as shown in Figure 1.
Figure 1: Discrimination performance (d’) as a function of separation time between the two stimuli for frequency and intensity discrimination. Results averaged for four subjects in each condition. Results show that working memory degrades at different rates for the two conditions, suggesting different underlying storage mechanisms. Adapted from Clement et al. (1999).
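The logarithmic organization suggested by Deutsch's data above can be made concrete with a minimal sketch (the frequencies below are arbitrary examples, not values from the study): distances between tones are naturally expressed in semitones, i.e. on a log-frequency scale.

```python
import math

def semitone_distance(f1_hz, f2_hz):
    """Distance between two tones in semitones: 12 * log2(f2 / f1)."""
    return 12.0 * math.log2(f2_hz / f1_hz)

# A distracter 2/3 of a whole tone (4/3 semitone) above the test tone is a
# different absolute distance in Hz at 440 Hz than at 880 Hz, but the same
# distance on the logarithmic scale that the memory store appears to use.
for f_test in (440.0, 880.0):
    f_distracter = f_test * 2 ** ((4.0 / 3.0) / 12.0)
    print(f"test {f_test:.0f} Hz, distracter {f_distracter:.1f} Hz, "
          f"distance {semitone_distance(f_test, f_distracter):.2f} semitones")
```

The equal log-distances are what "errors scaled with tonal distance, regardless of tone frequency" implies.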
The Clement et al. (1999) finding is analogous to different working memory domains processing contrast and spatial frequency in the visual system, and provides further evidence of different memory stores for important auditory working memory parameters.

Neurophysiological studies find neural correlates of short-term retention in medial geniculate body and auditory cortex

Relatively few studies have directly recorded neural responses from auditory cortical areas participating in short-term memory tasks. Studies have implicated the auditory cortex in short-term memory processing in cats (Neff et al., 1975), dogs (Chorazyna and Stepien, 1961) and monkeys (Stepien et al., 1960). This relatively primitive stage of cortical processing was thoroughly investigated by Gottlieb et al. (1989), who concluded that cortical neurons retain information about the first tone frequency during interstimulus intervals (ISIs) using a rate code. Gottlieb et al. trained monkeys to discriminate two sounds separated by a delay in a performance and a passive listening condition, and found that the firing rate during the ISI correlates with the first tone in both conditions. In addition, they found that the firing rate of some units depended on whether the frequency of the first tone matched the frequency of the second tone. This study strongly implicates these neurons in the circuitry of short-term working memory (STWM). Participation of auditory cortex in working memory is also supported by a magnetoencephalography study of tone loudness (Lu et al., 1992).
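The core analysis these studies rest on can be sketched simply (a generic illustration with invented spike counts, not the authors' actual data or code): test whether a unit's delay-period firing depends on which sample tone was presented.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical delay-period spike counts (one value per trial) for a single
# unit, following a high-frequency vs. a low-frequency sample tone.
delay_counts_high = rng.poisson(lam=12.0, size=40)  # invented data
delay_counts_low = rng.poisson(lam=8.0, size=40)    # invented data

# If delay activity carries a rate code for the first tone, the two count
# distributions should differ; a rank-sum test is one simple check.
stat, p = stats.mannwhitneyu(delay_counts_high, delay_counts_low)
print(f"Mann-Whitney U = {stat:.1f}, p = {p:.4g}")
# A small p indicates stimulus-dependent delay activity, i.e. a candidate
# neural correlate of short-term retention.
```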
Figure removed due to copyright considerations.
In a similar study, Sakurai (1990) implicated the medial geniculate body (MGB), the thalamic station that relays input to the auditory cortex, in addition to A1 in STWM. In his experiments, rats made discriminatory same/different responses regarding the frequency of two test tones, while Sakurai recorded spiking patterns from the auditory cortex, the medial geniculate body and the inferior colliculus. Although many of the cells in each area showed differences in firing rates during the presentations of the two stimuli, some cells in the auditory cortex and the medial geniculate body showed differential, stimulus-dependent activity during the delay between the two stimuli, indicating involvement in retention. Both this and the Gottlieb et al. (1989) study suggest a neural correlate of retention in the form of a rate code dependent on the first stimulus.
Figure 2: Responses of a medial geniculate body (MGB) unit demonstrating the delay correlate between tone presentation periods and the delay period. Left: diagrams represent peri-stimulus time (PST) histograms during a sample tone (3 s), the interstimulus delay (3 s) and the test tone (1 s). Data are presented for six types of trials with tones identified as H=high, L=low, and responses as G=go and NG=no go. Mismatch conditions are H-L and L-H. Right: bar graphs of median spike frequencies (not Hz) during the sample tone, delay and test tone, and for the first and second halves of the delay. The asterisk shows statistically significant differences (p<.05) between high and low tones during the sample tone, delay and test tone. Differences in the number of spikes also exist in both halves of the delay for different stimuli. Adapted from Sakurai (1990).
Another non-cortical area neurophysiologically implicated in working memory is the hippocampus. Although this area is often associated with long-term, reference memory, Sakurai (1994) finds that some hippocampal neurons are solely involved in either working or reference memory, whereas other neurons are involved in both types of memory. Imaging studies (see below) implicate the prefrontal cortex (PFC) in STWM (Romanski and Goldman-Rakic, 2002). The PFC consists of several cytoarchitectonically defined regions, and there is evidence that these regions are distinct with respect to sensory modality and
Figure removed due to copyright considerations.
stimulus features (Levy and Goldman-Rakic, 2000; Carmichael and Price, 1995) and integrate information across sensory modalities (Fuster et al., 2000; Bodner et al., 1996). Romanski et al. (1999) find distinct regions in the belt and parabelt auditory cortex that have neuronal targets in the prefrontal cortex of the rhesus monkey. Since the belt and parabelt regions have been implicated in STWM (Gottlieb et al., 1989; Sakurai, 1990), these histological studies give further evidence that the prefrontal cortex incorporates the auditory STWM neural substrate downstream. Neurophysiological recordings of PFC activity during auditory STWM tasks would shed light on the neural circuitry involving the MGB, A1 and PFC, but such studies are lacking.

Imaging work implicates auditory cortical, frontal, parietal and cerebellar areas in STWM tasks

Functional imaging studies have shed light on neural correlates of perceptual processing and implicate various cortical areas in auditory working memory. These studies suggest that the auditory cortex, higher cortical areas, and the cerebellum participate in working memory processing, but the degree of activation and the precise neural correlate depend on the type of memory task. Many imaging studies find activation of the prefrontal cortex in STWM tasks (e.g. Romanski and Goldman-Rakic, 2002). Levy and Goldman-Rakic (2000) find “domain-specific” organization of working memory function in the prefrontal cortex, with activation of one or more of several subdivisions of dorsolateral prefrontal cortex depending on the type of information processed. Romanski and Goldman-Rakic (2002) specifically implicate the dorsolateral and ventrolateral prefrontal cortices in retention of auditory stimuli. Celsis et al. (1999) also suggested that frontal and prefrontal regions play a role in auditory working memory in the maintenance of tonal patterns. For pitch comparison memory tasks, Zatorre et al. (1994) found greater activation of the right frontal and temporal lobes, and of the parietal and insular cortex, in high memory load conditions. Temporal lobe activation is expected, given its recognized role in musical processing, as is frontal activation, known for executive functions. Parietal and insular cortical activation probably reflected attention. These findings are in line with those of Griffiths et al. (1999), who also found a more extensive right-lateralized network including the cerebellum, posterior temporal and inferior frontal regions when subjects made “same/different” judgments while comparing the pitch of 6 tones. This finding stands in contrast with a pitch-matching study by Gaab et al. (2003), which found greater pitch working memory activation in the left supramarginal gyrus (SMG) and the left dorsolateral cerebellum, as well as other structures in the temporal cortex.
Figure 3: Brain activation pattern (P<.05, corrected) during the initial imaging time points (0-2 s after the end of the auditory stimulation). The pattern is dominated by strong (left>right) activation of the superior temporal gyrus bilaterally, including Heschl’s gyrus and auditory association cortex anterior, lateral and posterior to Heschl’s gyrus, including the planum temporale and the supramarginal gyrus. Activation also includes bilateral posterior dorsolateral frontal regions, a superior parietal region on the right side, the pre-SMA region (left>right) and lobules V and VI of the left cerebellum.
Celsis et al. (1999) also implicated the left SMG and right frontal and primary and secondary auditory cortices in a task requiring memory judgments between tones of different pitch height or spectral content. Thus, there seems to be some disagreement about the side of the asymmetry of activation patterns in pitch memory tasks. Several studies support non-motor functions for the cerebellum, including auditory verbal memory function (Grasby et al., 1993), tone recognition (Holcomb et al., 1998) and musical duration discrimination (Parsons, 2001). Mathiak et al. (2004) also implicate the cerebellum in non-verbal auditory memory. Although the latter study suggests that the auditory cortex is activated early in the delay between the first and second test tones, and that the SMG and cerebellum are activated later, Gaab et al. (2003) find that the cerebellum activates immediately and remains active throughout the task. However, both studies agree on cerebellar involvement in STWM tasks. For audiospatial memory tasks, Martinkauppi et al. (2000) found bilateral activation in superior, middle and inferior frontal gyri and posterior parietal and middle temporal cortices. These areas were similarly found in the Zatorre et al. (1994) pitch study, but here with bilateral activation. Martinkauppi et al. (2000) also revealed similar activation in audiospatial and visuospatial tasks, indicating that audiospatial and visuospatial performance in STM tasks rely on a common neural pathway.
Figure removed due to copyright considerations.
Lesion studies indicate debilitation of STWM performance with bilateral auditory association cortical lesions

Lesion studies in monkeys implicate superior temporal cortex, a high-level auditory association area, in short-term memory tasks (Colombo et al., 1990; Colombo, 1996). However, Colombo (1996) suggests that only bilateral lesions significantly impair auditory short-term memory performance. This study used a delayed matching-to-sample task with high-frequency and low-frequency tones. Monkeys had to indicate whether two consecutive stimuli (separated by a 2 s pause) were the same or different, and received lesions of association cortex with testing after each unilateral operation. The lesions corresponded to cytoarchitectonic area TA, sparing the primary (area TC) and secondary (area TB) fields of auditory cortex. Visual performance after the first operation was unaffected, but auditory performance was affected in some monkeys after many sessions, and some were unable to regain preoperative levels. With bilateral lesions, monkeys were unable to reach preoperative levels even with extensive testing, and performance deteriorated further as the delay lengthened.
Figure 4: Performance (percent correct) on the pattern-discrimination task preoperatively and after the second (bilateral) cortical lesion. Top and bottom dotted lines represent criterion and chance levels of performance, respectively. Adapted from Colombo et al. (1996).
This study indicates that good performance is possible with one intact superior temporal cortex, corroborating imaging findings of greater unilateral activation (Zatorre et al., 1994; Gaab et al., 2003; Celsis et al., 1999), and clearly implicates this area in STWM processing.

Conclusion: complex interactions of STWM circuitry

Combined psychophysical, neurophysiological, imaging and lesion studies implicate a complex network of neuronal areas involved in STWM processing. Psychophysical work suggests that different memory representations and processing exist for important STWM parameters such as pitch, timbre, spatial location and loudness, with only slight parametric interaction in memory tasks. Neurophysiological studies find neural correlates of rate-based coding and retention of information about prior stimuli in the MGB and A1, but further studies need to map out the connections and coding schemes involving these centers, as well as other centers implicated by imaging work. Functional imaging studies suggest a complex asymmetry in pitch and spatial location memory tasks involving primary and secondary auditory cortices,
Figure removed due to copyright considerations.
frontal and prefrontal areas, parietal areas, as well as the cerebellum. Studies disagree about the temporal patterns of activation and about the right or left asymmetry of activation. Lesion studies in monkeys suggest that only bilateral auditory cortical lesions significantly affect performance, a finding that may corroborate imaging studies reporting asymmetric activation patterns. Because of the wide involvement of many cortical and subcortical areas in STWM processing, as well as disagreement regarding activated areas, further investigations of perceptual auditory phenomena should keep possible involvement of STWM in mind when designing tasks and evaluating results.
Suggested Papers
Background

Pasternak, T., and Greenlee, M.W. (2005). Working memory in primate sensory systems. Nature Reviews Neurosci., 6: 97-107.

Assigned Papers

Clement, S., Demany, L., and Semal, C. (1999). Memory for pitch versus memory for loudness. J. Acoust. Soc. Am., 106(5): 2805-2811.

Gottlieb, Y., Vaadia, E., and Abeles, M. (1989). Single unit activity in the auditory cortex of a monkey performing a short-term memory task. Exp. Brain Res., 74: 139-148.

Gaab, N., Gaser, C., Zaehle, T., Jancke, L., and Schlaug, G. (2003). Functional anatomy of pitch memory—an fMRI study with sparse temporal sampling. NeuroImage, 19: 1417-1426.

Further reading

Colombo, M., Rodman, H.R., and Gross, C.G. (1996). The effects of superior temporal lesions on the processing and retention of auditory information in monkeys (Cebus apella). J. Neurosci., 16: 4501-4517.

Deutsch, D. (1973). Interference in memory between tones adjacent in the musical scale. J. Exp. Psychol., 100: 228-231.

Mathiak, K., Hertrich, I., Grodd, W., and Ackermann, H. (2004). Discrimination of temporal information at the cerebellum: functional magnetic resonance imaging of nonverbal auditory memory. NeuroImage, 21: 154-162.

Sakurai, Y. (1990). Cells in the rat auditory system have sensory-delay correlates during the performance of an auditory working memory task. Behavioral Neurosci., 104(6): 856-868.
Works Cited

Anourova, I., Rama, P., Alho, K., Koivusalo, S., Kahnari, J., and Carlson, S. (1999). Selective interference reveals dissociation between auditory memory for location and pitch. Neuroreport, 10: 3543-3547.

Bodner, M., Kroger, J., and Fuster, J.M. (1996). Auditory memory cells in dorsolateral prefrontal cortex. Neuroreport, 7: 1905-1908.

Carmichael, S.T., and Price, J.L. (1995). Sensory and premotor connections of the orbital and medial prefrontal cortex of macaque monkeys. J. Comp. Neurol., 363: 642-664.

Celsis, P., Boulanouar, K., Doyon, B., Ranjeva, J.P., Berry, I., Nespoulous, J.L., and Chollet, F. (1999). Differential fMRI responses in the left posterior superior temporal gyrus and left supramarginal gyrus to habituation and change detection in syllables and tones. NeuroImage, 9: 135-144.

Chorazyna, H., and Stepien, L. (1961). Impairment of auditory recent memory produced by cortical lesions in dogs. Acta Biol. Exp., 21: 177-187.

Clarke, S., Adriani, M., and Bellmann, A. (1998). Distinct short-term memory systems for sound content and sound localization. Neuroreport, 9(15): 3433-3437.

Clement, S., Demany, L., and Semal, C. (1999). Memory for pitch versus memory for loudness. J. Acoust. Soc. Am., 106(5): 2805-2811.

Colombo, M., D'Amato, M.R., Rodman, H.R., and Gross, C.G. (1990). Auditory association cortex lesions impair auditory short-term memory in monkeys. Science, 247: 336-338.

Colombo, M., Rodman, H.R., and Gross, C.G. (1996). The effects of superior temporal lesions on the processing and retention of auditory information in monkeys (Cebus apella). J. Neurosci., 16: 4501-4517.

Deutsch, D. (1972). Mapping of interactions in the pitch memory store. Science, 175: 1020-1022.

Deutsch, D. (1973). Interference in memory between tones adjacent in the musical scale. J. Exp. Psychol., 100: 228-231.

Fuster, J.M., Bodner, M., and Kroger, J.K. (2000). Cross-modal and cross-temporal association in neurons of frontal cortex. Nature, 405: 347-351.

Gaab, N., Gaser, C., Zaehle, T., Jancke, L., and Schlaug, G. (2003). Functional anatomy of pitch memory—an fMRI study with sparse temporal sampling. NeuroImage, 19: 1417-1426.

Gottlieb, Y., Vaadia, E., and Abeles, M. (1989). Single unit activity in the auditory cortex of a monkey performing a short-term memory task. Exp. Brain Res., 74: 139-148.

Grasby, P.M., Frith, C.D., Friston, K.J., Bench, C., Frackowiak, R.S., and Dolan, R.J. (1993). Functional mapping of brain areas implicated in auditory verbal memory function. Brain, 116(1): 1-20.

Griffiths, T.D., Johnsrude, I., Dean, J.L., and Green, G.G.R. (1999). A common neural substrate for the analysis of pitch and duration pattern in segmented sound? Neuroreport, 10: 3825-3830.

Holcomb, H.H., Medoff, D.R., Caudill, P.J., Zhao, Z., Lahti, A.C., Dannals, R.F., and Tamminga, C.A. (1998). Cerebral blood flow relationships associated with a difficult tone recognition task in trained normal volunteers. Cerebral Cortex, 8: 534-542.

Levy, R., and Goldman-Rakic, P.S. (2000). Segregation of working memory functions within the dorsolateral prefrontal cortex. Exp. Brain Res., 133(1): 23-32.

Lu, Z.L., Williamson, S.J., and Kaufman, L. (1992). Behavioral lifetime of human auditory sensory memory predicted by physiological measures. Science, 258: 1668-1670.

Martinkauppi, S., Rama, P., Aronen, H.J., Korvenoja, A., and Carlson, S. (2000). Working memory of auditory localization. Cerebral Cortex, 10(9): 889-898.

Mathiak, K., Hertrich, I., Grodd, W., and Ackermann, H. (2004). Discrimination of temporal information at the cerebellum: functional magnetic resonance imaging of nonverbal auditory memory. NeuroImage, 21: 154-162.

Neff, W., Diamond, I., and Casseday, J. (1975). Behavioral studies of auditory discrimination: central nervous system. In: Keidel, W.D., and Neff, W.D. (eds), Handbook of Sensory Physiology, Vol. V/2. Springer, Berlin Heidelberg New York, 307-400.

Parsons, L.M. (2001). Exploring the functional neuroanatomy of music performance, perception and comprehension. Ann. NY Acad. Sci., 930: 211-231.

Pasternak, T., and Greenlee, M.W. (2005). Working memory in primate sensory systems. Nature Reviews Neurosci., 6: 97-107.

Romanski, L.M., Bates, J.F., and Goldman-Rakic, P.S. (1999). Auditory belt and parabelt projections to the prefrontal cortex in the rhesus monkey. J. Comp. Neurol., 403: 141-157.
Romanski, L.M., and Goldman-Rakic, P.S. (2002). An auditory domain in primate prefrontal cortex. Nature Neurosci., 5: 15-16.

Sakurai, Y. (1990). Cells in the rat auditory system have sensory-delay correlates during the performance of an auditory working memory task. Behavioral Neurosci., 104(6): 856-868.

Sakurai, Y. (1994). Involvement of auditory cortical and hippocampal neurons in auditory working memory and reference memory in the rat. J. Neurosci., 14(5): 2606-2623.

Semal, C., and Demany, L. (1991). Dissociation of pitch from timbre in auditory short-term memory. J. Acoust. Soc. Am., 89: 2404-2410.

Stepien, L.S., Cordeau, J.P., and Rasmussen, T. (1960). The effect of temporal lobe and hippocampal lesions on auditory and verbal recent memory. Brain, 83: 470-489.

Zatorre, R.J., Evans, A.C., and Meyer, E. (1994). Neural mechanisms underlying melodic perception and memory for pitch. J. Neurosci., 14: 1908-1919.
Harvard-MIT Division of Health Sciences and Technology HST.722J: Brain Mechanisms for Hearing and Speech Course Instructors: Bertrand Delgutte, M. Christian Brown,
Joe C. Adams, Kenneth E. Hancock, Frank H. Guenther, Jennifer R. Melcher, Joseph S. Perkell, David N. Caplan
Student Topic: Brain Attending a Cocktail Party Adrian KC Lee
Brain Attending a Cocktail Party
Introduction
Imagine after our topic presentation, we go to a crowded pub to celebrate. You are trying hard
to ignore those eighties tunes blaring out of the loudspeakers while you are chatting with your
colleagues. Suddenly, a familiar melody from a Mozart piano concerto grabs your attention
– yes, your cell phone alerts you that your partner just called for the third time…
This classic phenomenon, in which our brain analyzes a scene by perceptually organizing
sensory data to form auditory objects (or auditory streams), is often referred to as the “Cocktail
Party Effect” (Cherry, 1953). Many cues have been identified that influence perceptual
organization, but little is known about the actual brain mechanisms underlying this
phenomenon. In this proposed topic, we look at the latest developments in the quest to find the
neural basis for auditory stream segregation.
Background – Auditory grouping mechanisms
The ability to form auditory objects is important in the natural environment where sounds
arriving at our ears are a mixture of spectro-temporal components that may have arisen
from different auditory events. We constantly analyze an auditory scene by trying to group
related components from one source, and segregating out other frequency components that are
not in the attended object (Carlyon, 2004). This ability to bring acoustical events to the
attentional foreground may increase a species' chance of survival
through auditory awareness of the movements of its predators. Humans also rely on auditory
grouping mechanisms for daily communication, especially in the presence of noise and other
competing sources, since we need to group simultaneous components originating from a single
source across frequency, as well as grouping events across time, in order to hear whole words
and messages. In his seminal book, Bregman (1990) provides us with working definitions on the
terminologies used in the world of auditory scene analysis. He described auditory stream
segregation as:
The general process of auditory scene analysis in which links are formed between parts
of the sensory data. These links will affect what is included and excluded from our
perceptual descriptions of distinct auditory events.
HST.722 Brain Mechanisms for Hearing and Speech Page 1 of 6 www.bsscommunitycollege.in www.bssnewgeneration.in www.bsslifeskillscollege.in
206www.onlineeducation.bharatsevaksamaj.net www.bssskillmission.in
WWW.BSSVE.IN
Student Topic: Brain Attending a Cocktail Party Adrian KC Lee
If these links, which are also commonly referred to as streaming cues, connect sensory parts
across time, we refer to such perceptual grouping as sequential. However, if these sensory data
coexist in time, and multiple auditory objects are formed by perceptually partitioning the
spectral content into distinct objects (e.g., harmonics in a vowel), we refer to such grouping
as simultaneous. Apparent spatial location, onset/offset synchrony, frequency proximity, and
fundamental frequency are but some of the common acoustic cues that we employ in auditory
scene analysis.
Experimental paradigm – Buildup of Streaming
How does one systematically investigate the process of stream segregation that occurs in a
complex auditory environment? A popular paradigm is to use an ambiguous auditory figure that
could either be heard as one stream with a galloping rhythm (commonly labeled as “Horse”) or
as two concurrent streams with two different tempi (“Morse”) (See Figure 1). The basic stimulus
consists of a high tone A alternating with a low tone B, in repeated ABA_ sequences. If the
frequency difference (∆f) between the A and B tones is small, then neighboring tones tend to
bind together perceptually, resulting in a “Horse” rhythm. Conversely, if ∆f is large, the A and B
tones no longer bind to each other, resulting in a “Morse” rhythm. The tone repetition rate
also influences the percept: the faster the repetition rate, the more binding there is between the
A and B tones (van Noorden, 1975).
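A minimal sketch of how such an ABA_ sequence can be synthesized (all parameter values below are illustrative, not taken from the studies discussed here); the perceptually relevant knobs are the A-B separation ∆f and the tone repetition rate:

```python
import numpy as np

def aba_sequence(f_a=1000.0, delta_f_semitones=6.0, tone_ms=100.0,
                 n_repeats=10, fs=16000):
    """Synthesize a repeating ABA_ triplet sequence (A = high tone, B = low).

    Small delta_f tends to be heard as one galloping stream ("Horse");
    large delta_f as two isochronous streams ("Morse").
    """
    f_b = f_a / 2 ** (delta_f_semitones / 12.0)     # B sits delta_f below A
    n = int(fs * tone_ms / 1000.0)
    t = np.arange(n) / fs
    ramp = np.minimum(1.0, np.minimum(t, t[::-1]) / 0.005)  # 5 ms on/off ramps
    tone_a = np.sin(2 * np.pi * f_a * t) * ramp
    tone_b = np.sin(2 * np.pi * f_b * t) * ramp
    gap = np.zeros(n)                               # the "_" slot
    return np.tile(np.concatenate([tone_a, tone_b, tone_a, gap]), n_repeats)

# Faster repetition rates (shorter tone_ms) and larger delta_f both push the
# percept toward "Morse"; for example:
morse_like = aba_sequence(delta_f_semitones=9.0, tone_ms=60.0)
```

Listening to the output at small versus large values of delta_f_semitones reproduces the “Horse”/“Morse” ambiguity described above.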
At intermediate values of ∆f and tone repetition rate, the initial galloping “Horse” percept
changes after a few seconds of listening to a “Morse” percept (Anstis and Saida, 1985;
Bregman, 1978; Carlyon et al, 2001). With this systematic change in auditory percept over time,
it is possible to record neural responses at various points during an ongoing sequence of
sounds without any change in the evoking stimulus, and compare the neural responses
associated with dramatically different percepts.
Figure removed due to copyright considerations.
Figure 1 For the correct parameters, these sequences are ambiguous and can be heard with one or two perceptual organizations with different rhythms: (Left): a characteristic galloping rhythm (“Horse”); (Right): 2 isochronous streams, like Morse code (“Morse”). Colored regions correspond to perceptual streams. (Taken from Fig. 2, Cusack, 2005).
Key Areas of Research
Many investigations into the neural correlates of auditory streaming employ the aforementioned
ABA- paradigm, or the variants thereof, but their generic approaches and the specific questions
they address can be divided into several distinct classes. One class of approach is to use
single-unit recordings in animals to infer perceptual effects of stream segregation. Fishman et
al. (2001, 2004) showed that multiunit spiking responses to tone sequences in the primary
auditory cortex of awake monkeys follow the pattern that one might expect on the basis of
published psychophysical data from human subjects. Bee and Klump (2004) performed similar
sequential streaming experiments and recorded neural responses in the auditory forebrain of
awake starlings. They concluded that while there are preattentive auditory processes, such as
frequency selectivity and forward masking, that contribute to the perceptual segregation of
sequential acoustic events having different frequencies into separate auditory streams, there
may be additional processes that are required to account for all known perceptual effects
related to sequential stream segregation. The major confounding factor in interpretations of
these conclusions is that the neural response patterns that are putatively associated with the
one- and two-stream percepts were always induced using physically different stimuli.
Another class of approach combines human neuromagnetic and behavioral measures.
Gutschalk et al (2005) used magnetoencephalography (MEG) to examine the neural bases of
auditory stream formation. They concluded that there is a tight coupling between auditory
cortical activity and streaming perception, suggesting that an explicit representation of auditory
streams may be maintained within nonprimary auditory areas. They hypothesized that selective
adaptation by feature-specific neurons might be a general neural mechanism subserving
perceptual organization. Cusack (2005) combined functional magnetic resonance imaging
(fMRI) with psychophysical experiments to test for an effect of perceptual organization across
the whole brain.
A third class of approach addresses the specific interaction between auditory streaming and
attention. This remains a controversial topic. Due to time constraints, the scope of the proposed
discussion will not include literature that explicitly addresses this issue, but some key results are
summarized here for completeness. It has been suggested that streaming is a preattentive
process based on recording of event-related potentials (ERP) using a mismatch negativity
paradigm (MMN) (Sussman et al, 1999). It was argued by these authors that since MMN, which
can occur in the absence of attention, is elicited only by the infrequent odd-ball presentations
within streams, the two potential streams of sound were segregated without attention being
focused on the auditory stimuli. However, Carlyon et al (2001) and Cusack et al (2003) argue
that the buildup of streaming is an attentive process, and previous experiments by van Noorden
(1975) suggest that listeners have control over their perception in the ambiguous ∆f region. In
other words, whether or not streaming is an attentive process is not conclusive, but the
techniques used to probe such a high cognitive process are, nonetheless, interesting to follow.
The last active area of research is computational modeling of stream segregation (e.g.,
Beauvois and Meddis, 1991; McCabe and Denham, 1997). An interesting neural network model,
known as ARTSTREAM (Grossberg et al., 2004), uses adaptive resonance theory to propose how
the brain achieves auditory scene analysis. Even though this neural network structure does not
have clear neural correlates, its putative structure may inspire neurophysiologists to find neural
units that have similar behaviors. This is perhaps an interesting paper to discuss if the class
would like to be exposed to neural network modeling.
Recent Developments and Proposed Papers for Discussion
In this month’s publication of Neuron, Micheyl and colleagues (Micheyl et al., 2005) reported their
comparative experiments on auditory stream segregation in humans and awake monkeys. Unlike
in previous studies, they used identical stimuli to conduct psychophysical experiments in
humans and single-unit extracellular recordings in the primary auditory cortex of awake
monkeys. Interestingly, using signal detection theory discussed in previous sessions, the
authors can now compare quantitatively the stochastic neural responses with the probabilistic
perceptual judgments.
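That signal-detection logic can be illustrated with a short sketch (invented spike counts; this is a generic illustration, not Micheyl et al.'s actual analysis): a neurometric d' computed from one unit's spike-count distributions under the two conditions can be compared directly with the psychometric d' obtained from listeners' judgments on the same stimuli.

```python
import numpy as np

def d_prime(counts_a, counts_b):
    """Discriminability index: difference in means divided by the RMS of
    the two standard deviations (one common convention)."""
    sa, sb = np.std(counts_a, ddof=1), np.std(counts_b, ddof=1)
    return (np.mean(counts_b) - np.mean(counts_a)) / np.sqrt(0.5 * (sa**2 + sb**2))

rng = np.random.default_rng(1)
# Hypothetical spike counts of one A1 unit under the two stimulus
# conditions (purely illustrative numbers).
counts_1 = rng.poisson(10.0, size=100)
counts_2 = rng.poisson(14.0, size=100)
print(f"neurometric d' = {d_prime(counts_1, counts_2):.2f}")
```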
Earlier this year, Cusack (2005) used fMRI to test for an effect of perceptual organization across
the whole brain and suggested that the intraparietal sulcus (IPS) might play a role in perceptual
organization. This is also exciting in light of the growing evidence in the literature suggesting
that the IPS is also important for binding in vision, touch and cross-modally.
In a sequel to the Wang et al. (2005) paper, Bartlett and Wang (2005) extended the investigation of
recordings of auditory cortical neurons in awake marmosets to 2-sound sequences. Not only do
their results demonstrate that persistent modulations of the responses of an auditory cortical
neuron to a given stimulus can be induced by preceding stimuli, they also find that decreases or
increases of responses to the succeeding stimuli are dependent on the spectral, temporal, and
intensity properties of the preceding stimulus. Such long-lasting modulation by stimulus context
in the cortex suggests these response properties may be important for auditory streaming and
segregation.
All the proposed papers for discussion highlight some new experimental approaches or latest
findings on the neural correlates of auditory stream segregation. This is a very active area of
research on many fronts, ranging from neurophysiology to psychoacoustics. However, we are
still in the infancy of the quest to understand the brain mechanisms behind auditory
scene analysis. Next time you are in a pub drinking a beer, marvel at how the brain
accomplishes such an amazing feat!
Papers for Discussion
Bartlett, E., Wang, X. (2005). “Long-Lasting Modulation by Stimulus Context in Primate Auditory Cortex,” Journal of Neurophysiology, 94, 83-104.
Cusack, R. (2005). “The Intraparietal Sulcus and Perceptual Organization,” Journal of Cognitive Neuroscience, 17, 641-651.
Micheyl, C., Tian, B., Carlyon, R., Rauschecker, J. (2005). “Perceptual Organization of Tone Sequences in the Auditory Cortex of Awake Macaques,” Neuron, 48, 139-148.
Background
Carlyon, R. (2004). “How the brain separates sounds,” Trends in Cognitive Sciences, 8, 465-471.
Darwin, C.J. (1997). “Auditory grouping,” Trends in Cognitive Sciences, 1, 327-333.
Further Reading
Bee, M., Klump, G. (2004). “Primitive Auditory Stream Segregation: A Neurophysiological Study in the Songbird Forebrain,” Journal of Neurophysiology, 92, 1088-1104.
Fishman, Y., Reser, D., Arezzo, J., Steinschneider, M. (2001). “Neural correlates of auditory stream segregation in primary auditory cortex of the awake monkey,” Hearing Research, 151, 167-187.
Fishman, Y., Arezzo, J., Steinschneider, M. (2004). “Auditory stream segregation in monkey auditory cortex: effects of frequency separation, presentation rate, and tone duration,” Journal of the Acoustical Society of America, 116, 1656-1670.
Grossberg, S., Govindarajan, K., Wyse, L., Cohen, M. (2004). “ARTSTREAM: a neural network model of auditory scene analysis and source segregation,” Neural Networks, 17, 511-536.
Gutschalk, A., Micheyl, C., Melcher, J., Rupp, A., Scherg, M., Oxenham, A. (2005). “Neuromagnetic Correlates of Streaming in Human Auditory Cortex,” Journal of Neuroscience, 25, 5382-5388.
Sussman, E., Bregman, A., Wang, W., Khan, F. (2005). “Attentional modulation of electrophysiological activity in auditory cortex for unattended sounds within multistream auditory environments,” Cognitive, Affective, & Behavioral Neuroscience, 5, 93-110.
Winkler, I., Takegata, R., Sussman, E. (2005). “Event-related brain potentials reveal multiple stages in the perceptual organization of sound,” Cognitive Brain Research, 25, 291-299.
Reference
Beauvois, M., Meddis, R. (1991). “A computer model of auditory stream segregation,” Quarterly Journal of Experimental Psychology, 43, 517-541.
Bregman, A. S. (1978). “Auditory streaming is cumulative,” Journal of Experimental Psychology: Human Perception and Performance, 4, 380-387.
Bregman, A. S. (1990). Auditory Scene Analysis: The Perceptual Organization of Sound, Cambridge, MA: MIT Press.
Carlyon, R. P., Cusack, R., Foxton, J. M., Robertson, I. H., (2001). “Effects of attention and unilateral neglect on auditory stream segregation,” Journal of Experimental Psychology: Human Perception and Performance, 27, 115-127.
Cherry, E. C. (1953). “Some experiments on the recognition of speech, with one and with two ears”, Journal of the Acoustical Society of America 25, 975-979.
Cusack, R., Deeks, J., Aikman, G., Carlyon, R. P. (2004). “Effects of Location, Frequency Region, and Time Course of Selective Attention on Auditory Scene Analysis,” Journal of Experimental Psychology: Human Perception and Performance, 30 (4), 643-656.
McCabe, S., Denham, M. (1997). “A model of auditory streaming,” Journal of the Acoustical Society of America, 101, 1611-1621.
Van Noorden, L. (1975). “Temporal coherence in the perception of tone sequences,” doctoral thesis, Eindhoven University of Technology, The Netherlands.
Wang, X., Lu, T., Snider, R., Lian, L. (2005). “Sustained firing in auditory cortex evoked by preferred stimuli,” Nature, 435, 341-346.
Carrie Niziolek
HST.722 topic proposal
October 26, 2005

What the FOXP2 gene can tell us about the neural control of speech and language

In contrast with all other animals, humans have evolved the ability to acquire spoken language. Humans' capacity for speech must, in some respect, be derived from our genes. What is it in our genes that enables us to acquire speech? What would such a gene do, and what neural structures would depend on its expression? What would happen if it were disrupted? There has been speculation about a genetic substrate for speech ability for a long time, but it wasn't until very recently that scientists were able to track down and identify a candidate gene, one that would be necessary for the acquisition of spoken language.

A speech-impaired family known as the KE family afforded geneticists a promising opportunity to look for the "speech gene." Half the members of the KE family have severe impairments in speech and language, implying a hereditary mutation that is passed on in a simple manner. Those with the disorder exhibit deficits in speech motor control: they have trouble with fine movements in the lower half of the face, disrupting their ability to speak. However, their disorder is not entirely articulatory: those with the mutation also have defects unrelated to verbalization, for example, a deficit in making a written list of words starting with a particular letter [1].

In 2001, after years of research with the KE family and CS, an unrelated patient with a similar speech abnormality, geneticists Lai and Fisher alighted on the FOXP2 gene on chromosome 7q31, a region known to be related to brain function. (The name FOXP2 comes from forkhead box, for the forkhead box protein family involved in transcription.) FOXP2 is expressed in many regions in the developing
brain, including the cortical plate, basal ganglia, thalamus, inferior olive, and cerebellum. A single base-pair mutation in FOXP2 caused the speech defects in both CS and the KE family [2].

To determine FOXP2's contributions to speech ability – to establish what it actually does when it is expressed – recent studies have looked for differences in brain structure between normal humans and those with mutated copies of the gene. If we can determine what cortical structures the gene is regulating, and correlate structural irregularity with behavioral impairments, we can learn more about the neural circuits for speech processing and articulation. The papers I have proposed elucidate the role of FOXP2 with data about its expression in speech and motor centers in the brain, as well as the anatomical, functional, and behavioral consequences of its disruption.

PAPER 1: Liégeois et al., 2003 [3], neuroimaging (fMRI). The functional abnormalities in members of the KE family can be revealed in functional imaging studies. Results by Liégeois show a large discrepancy in brain activation between subjects with normal and mutant copies of the FOXP2 allele. Specifically, affected family members showed an atypical distribution of activation during both speech (nonword repetition) and language (word generation) tasks. Particularly striking was an abnormally broad activation throughout cortex and a simultaneous underactivation in Broca's area and other speech-related cortical and subcortical brain regions [4,5]. The study also discusses morphological abnormality in these structures, principally the caudate nucleus and the putamen. Other related imaging studies [1] use voxel-based morphometry to show bilateral reductions in gray matter in the ventral cerebellum. All this is strong evidence that FOXP2 mediates cortical processes related to language.

Figure removed due to copyright considerations.
PAPER 2: Shu et al., 2005 [6], knockout mice & immunohistochemistry. Because FOXP2 is expressed in other animals, albeit in a slightly different form, we can induce mutations in the gene and examine the relations between gene irregularity and the resulting phenotypic differences. Shu et al. engineered two types of knockout mice, a type with two disrupted copies of FOXP2 (homozygous) and a type with one normal and one disrupted copy (heterozygous, the genotype of the affected members of the KE family). Heterozygous knockout mice made many fewer ultrasonic vocalizations in response to separation from the mother than did control mice, whereas homozygous knockout mice made no vocalizations at all. This result implies that FOXP2 plays a role in the development of a neural circuit employed in social communication, not uniquely related to human speech. In addition, heterozygous knockouts exhibited a delay in motor development, while homozygous knockout mice had a severe motor impairment, affirming the gene's involvement in motor and sequencing areas of the brain. Histological analysis of the mice illustrated irregular cerebellar sections in knockout mice, particularly in the Purkinje cells and in the external granular layer. FOXP2 therefore seems to be important for both cerebellar development and mechanisms of social communication.

Figure removed due to copyright considerations.

PAPER 3: Teramitsu et al., 2004 [7], histology & vocal learning in songbirds. FOXP2 is also expressed in songbirds, and current research seems to suggest it plays a role in birds' vocal learning. According to Teramitsu et al., FOXP2 is localized to related subcortical structures in humans and birds (speech areas/song nuclei, cerebellar Purkinje cells). This colocalization suggests associated mechanisms underlying vocal learning in both humans (speech) and birds (song).
FOXP2 and the related gene FOXP1 show parallel expression in the human and songbird brain, predicting a possible role of FOXP1 in speech ability.

This topic is very relevant to studying the brain mechanisms for speech. The FOXP2 gene has the potential to be a powerful tool for exploring the neural circuits that allow us to communicate via spoken language. However, the field would certainly benefit from some spirited conversation about the limitations of these methods, as well as any unfounded conclusions that should be avoided. Furthermore, the function of the gene itself is an interesting topic of study. If FOXP2 is a "language" gene, what is it doing in mice, songbirds, and fungi? If it is merely a gene enabling fine motor movements such as those used in speech, what leads to defects in language development in individuals lacking the normal gene? What is the relationship between the motor mouth control FOXP2 may regulate and the language ability its disruption impairs? Does a hearing- or motor-feedback loop, such as that described in our speech motor control papers, play a part in any of this?

Finally, where did this gene come from? By comparing the gene's expression in other animals, we can gain insight on when and how speech ability evolved in our species (and in no others).

The search for and evaluation of the speech gene, FOXP2, is an exciting and extremely recent issue, with broad-reaching implications for research on the evolution of human speech. Current research combines techniques from genetics, brain imaging, immunohistochemistry, and behavioral science to get at the function of FOXP2 in humans and animals. The FOXP2 gene is critical for the development of the neural substrates of speech and language, and will prove to be important to our understanding of circuits underlying speech and articulation.
1. Belton E et al. Bilateral brain abnormalities associated with dominantly inherited verbal and orofacial dyspraxia. Human Brain Mapping 18(3): 194-200, 2005.
2. ** Lai CS, Fisher SE, Hurst JA, Vargha-Khadem F & Monaco AP. A forkhead-domain gene is mutated in a severe speech and language disorder. Nature 413: 519-523, 2001.
3. * Liégeois F, Baldeweg T, Connelly A, Gadian DG, Mishkin M & Vargha-Khadem F. Language fMRI abnormalities associated with FOXP2 gene mutation. Nature Neuroscience 6: 1230-1237, 2003.
4. ** MacDermot KD et al. Identification of FOXP2 truncation as a novel cause of developmental speech and language deficits. Am. J. Hum. Genet. 76: 1074-1080, 2005.
5. Musso M et al. Broca's area and the language instinct. Nature Neuroscience 6: 774-781, 2003.
6. * Shu W, Cho SY, Buxbaum JD et al. Altered ultrasonic vocalization in mice with a disruption in the Foxp2 gene. Proc Natl Acad Sci USA 102(27): 9643-9648, 2005.
7. * Teramitsu I et al. Parallel FoxP1 and FoxP2 expression in songbird and human brain predicts functional interaction. J Neurosci 24(13): 3152-3163, 2004.

* proposed for discussion articles
** proposed for review/background articles
Harvard-MIT Division of Health Sciences and Technology HST.722J: Brain Mechanisms for Hearing and Speech Course Instructors: Bertrand Delgutte, M. Christian Brown,
Joe C. Adams, Kenneth E. Hancock, Frank H. Guenther, Jennifer R. Melcher, Joseph S. Perkell, David N. Caplan
Absolute Pitch
Jianwen Wendy Gu
October 26, 2005
HST.722 Brain Mechanisms for Hearing and Speech
Introduction
Pitch is a fundamental attribute of sound, which has led to extensive research on pitch processing, categorization, and memory with the goal of elucidating the complex workings of the auditory system. The phenomenon of absolute pitch (AP), the ability to identify or produce a specified pitch without external reference, provides a unique opportunity to study the perception and neural coding of pitch.
Background
Most people have at least some degree of AP, as it is fairly easy to identify a tone as belonging to the range of a piccolo rather than a tuba. Furthermore, even people with no musical training are good at remembering and singing the pitch of a favorite song. True AP possessors, however, can do these tasks with an order of magnitude better accuracy (Figure 1). From cognitive studies, AP seems to require two separable cognitive components: very narrow fixed pitch categories and the association of these categories with verbal labels. [4]
For people with AP ability, pitch categories are as fixed and familiar as color categories are for everyone else. However, this category structure for pitch does not affect their perception of pitch, only their ability to label it. AP possessors and non-possessors have equivalent acuity and perceptual thresholds for pitch differences. [3]
AP is not an all-or-none ability; rather, it exists along a continuum. Some people with AP can label only tones produced by one particular instrument. Others have AP for only a single tone and fail to show the automatic and rapid identification found in true AP possessors. [3]
Figure 1: Distributions of pitch-name responses to randomly presented tones, plotted as distance from correct response.
Key Issues
AP Acquisition
Musical training is essential for AP to manifest itself and, most importantly, this training must happen early in life (Figure 2), which supports the hypothesis of a critical period during which AP can be acquired [3]. The type of training does have some influence, for much musical training involves teaching of relative pitch, but it cannot account for all of the variance in the population. In fact, AP ability often develops without explicit tutoring: mere exposure to tones and their labels can be sufficient. [4]

Figure removed due to copyright considerations.

Figure 2: Data from retrospective reports on age of acquisition of AP, as modeled by a Gamma function.
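The Gamma-function model behind Figure 2 amounts to fitting a Gamma distribution to reported ages of acquisition. The sketch below shows one way to do this with scipy; the ages are invented placeholders, not the retrospective data the figure is based on.

```python
import numpy as np
from scipy import stats

# Hypothetical reported ages (years) of AP acquisition, for illustration only
ages = np.array([3, 4, 4, 5, 5, 5, 6, 6, 7, 7, 8, 9, 10, 12])

# Fit a Gamma distribution with its location fixed at age 0
shape, loc, scale = stats.gamma.fit(ages, floc=0)

# The mode indicates the most likely age of acquisition under the model
print(f"shape = {shape:.2f}, scale = {scale:.2f}, "
      f"modal age = {(shape - 1) * scale:.1f} years")
```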
A fundamental question regarding AP acquisition remains to be answered: which neural events are responsible for this fairly fixed developmental time course? There are many neurodevelopmental processes with critical periods, such as language acquisition and the columnar segregation of neurons in the visual cortex. Understanding the molecular mechanisms that lead to AP acquisition could shed light on neural development in general. [4]
AP and Genetics
Since most people who receive musical training from a young age do not develop AP, early training can be only part of the story. Two recent findings point to a genetic component. First, AP aggregates significantly among siblings, even when environmental factors are controlled. Second, AP may be differentially distributed across human populations. For example, people of Asian descent have a much greater incidence of AP than those of other backgrounds. Moreover, this higher rate is found across several Asian ethnicities (Chinese, Korean, and Japanese) that are culturally distinct from one another, and it is unrelated to speaking a tonal language: neither Korean nor Japanese is tonal, and the higher incidence has also been reported among Asian-Americans who speak only English. [4]
Although these observations support the idea of a genetic component, it is difficult to control all environmental factors. More rigorous methods and larger samples must be utilized to identify genetic influences on AP with certainty. [4]
Neural Correlates of AP
The development of structural and functional imaging techniques has enabled researchers to probe the neural correlates of AP. Anatomically, regions of the right superior temporal cortex and planum temporale (PT) tend to be smaller in AP possessors than in non-possessors. [3] Early studies using event-related potentials (ERPs) indicated that listeners with AP showed an absent or reduced electrical scalp component that is thought to index the updating of working memory. Listeners without AP had a physiological response to a certain pitch change, indicating that some type of online memory system had been refreshed. AP possessors did not show this response, presumably because their memory representations consist of fixed, absolute values for each pitch. Therefore, pitch representation requires encoding into working memory but no updating once it is in memory. [4]
A related phenomenon has been observed using measures of cerebral blood flow (CBF), specifically positron emission tomography (PET) and functional magnetic resonance imaging (fMRI), as shown in Figure 3. An area of the right frontal cortex believed to be important for monitoring pitch information in working memory was more active among musicians lacking AP than in those with AP. AP possessors apparently used their categorical representation of tones in such tasks instead of requiring continuous maintenance of a sensory trace. [4]
Figure removed due to copyright considerations.

Figure 3: Data from a PET study of AP and relative pitch (RP) possessors, who labeled (a) isolated tones and (b) intervals.
Conversely, a different area of the frontal lobe, the posterior dorsolateral cortex, was more active in subjects with AP than in control musicians when they listened to tones. This area is known to be involved in establishing and maintaining conditional associations in memory, making it a logical candidate region for the link between a pitch and its label in AP possessors. Additional support for this idea comes from the finding that the same area is active in subjects both with and without AP when they are asked to label pairs of tones that form musical intervals. [4]
These findings show that the working-memory and associative-learning aspects of AP draw on neural resources that are reasonably well understood. However, they still do not explain the component of AP related to fixed pitch categories. Part of the answer may lie in the function of subcortical nuclei important for periodicity coding and coincidence detection, such as the inferior colliculus, which could provide input based on intrinsic oscillations to the auditory cortex of AP possessors. [4]
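As a signal-level analogy for what a periodicity code could extract, the toy sketch below recovers the fundamental of a synthetic harmonic complex from autocorrelation peaks. It is not a model of inferior colliculus physiology, and every parameter is arbitrary.

```python
import numpy as np

fs = 16000                       # sampling rate (Hz), arbitrary
t = np.arange(0, 0.1, 1 / fs)    # 100 ms of signal
f0 = 220.0                       # fundamental frequency of the complex (Hz)

# Harmonic complex: periodic at 1/f0 even though it contains three partials
x = sum(np.sin(2 * np.pi * k * f0 * t) for k in range(1, 4))

# The autocorrelation peaks at lags equal to multiples of the period
ac = np.correlate(x, x, mode="full")[x.size - 1:]
lags = np.arange(ac.size) / fs

# Search only lags corresponding to plausible pitches (50-1000 Hz)
valid = (lags > 1 / 1000) & (lags < 1 / 50)
best = np.argmax(np.where(valid, ac, -np.inf))
print(f"estimated periodicity: {1 / lags[best]:.1f} Hz (true f0 = {f0} Hz)")
```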
Topic Papers
People with absolute pitch process tones without producing a P300
Klein et al. measured the P300 in seven AP possessors and seven non-possessors in response to auditory and visual stimuli. The P300 is a positive-going ERP component believed to reflect the maintenance or updating of working memory. It is obtained in the “oddball” procedure, in which two discrete stimuli (one frequent and one rare) are presented in a Bernoulli sequence and the subject counts the rare stimulus, which elicits a large P300. The control subjects showed standard ERPs in both the visual and auditory modalities. AP subjects, however, showed a small P300 for the rare auditory stimulus but a normal P300 for the rare visual stimulus. These results suggest that AP possessors have access to permanent representations of the tones, so they do not need to utilize working memory to fetch and compare representations for novel stimuli. [2]
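The logic of the oddball procedure can be made concrete with a small simulation: noisy epochs are generated in a Bernoulli sequence, a P300-like deflection is added only on rare trials, and trial averaging recovers it. All amplitudes, latencies, and probabilities below are invented for illustration and do not model Klein et al.'s data.

```python
import numpy as np

rng = np.random.default_rng(1)
fs = 250                        # sampling rate (Hz), assumed
t = np.arange(0, 0.8, 1 / fs)   # 800 ms epoch

def epoch(is_rare):
    """One simulated ERP epoch: noise, plus a P300-like bump on rare trials."""
    noise = rng.normal(0.0, 2.0, t.size)
    p300 = 5.0 * np.exp(-((t - 0.3) ** 2) / (2 * 0.05 ** 2)) if is_rare else 0.0
    return noise + p300

# Bernoulli ("oddball") sequence: the rare stimulus occurs with probability 0.2
rare = rng.random(400) < 0.2
epochs = np.array([epoch(r) for r in rare])

# Averaging across trials suppresses the noise and reveals the P300
diff = epochs[rare].mean(axis=0) - epochs[~rare].mean(axis=0)
print(f"peak rare-minus-frequent difference: {diff.max():.2f} (arb. units) "
      f"at {t[diff.argmax()] * 1000:.0f} ms")
```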
Functional anatomy of musical processing in listeners with absolute pitch and relative pitch
Zatorre et al. used structural and functional brain imaging techniques to investigate the neural basis of AP. Twenty right-handed musically trained volunteers were included in the study: 10 AP possessors and 10 subjects with good relative pitch (RP) but no AP. PET was used to measure CBF during the presentation of musical tones to the subjects. Similar patterns of increased CBF were seen in auditory cortical areas in both groups. The AP group also demonstrated activation of the left posterior frontal cortex, an area thought to be related to learning conditional associations. This activity was also observed in non-AP subjects when they made relative pitch judgments of intervals. Activity within the right inferior frontal cortex was observed in RP but not in AP subjects during the interval judgment task, suggesting that AP possessors need not access working memory mechanisms in this task. MRI measures of cortical volume indicated a larger left PT in the AP group. [5]
Absolute pitch and planum temporale
In this study by Keenan et al., anatomical MRI was used to measure the PT size of a right-handed group of 27 AP musicians, 27 nonmusicians, and 22 non-AP musicians. The authors hoped to settle the debate over whether the increased leftward asymmetry seen in AP possessors is due to an increase in the left PT, a decrease in the right PT, or a redistribution of resources between the two hemispheres. The AP musicians showed greater leftward asymmetry and a smaller absolute right PT size compared to the other two groups. This increased PT asymmetry was not seen in the non-AP musicians who had begun training early, suggesting that the asymmetry might be genetically based. [1]
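“Leftward asymmetry” in such studies is usually summarized with a laterality index. The sketch below uses one common convention, (L − R) / ((L + R) / 2); both this formula and the surface areas are assumptions for illustration, not necessarily the exact measure or values from Keenan et al.

```python
def pt_asymmetry(left_area, right_area):
    """Laterality index; positive values indicate leftward (L > R) asymmetry.
    One common convention, assumed here for illustration."""
    return (left_area - right_area) / ((left_area + right_area) / 2)

# Invented planum temporale surface areas (mm^2)
print(f"AP-like profile: {pt_asymmetry(1200, 800):+.2f}")   # strongly leftward
print(f"control profile: {pt_asymmetry(1000, 950):+.2f}")   # mildly leftward
```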
Conclusion
In summary, AP is an ability that involves both higher and lower levels of auditory processing. Its manifestation is influenced by the interaction of genes and environment. Thus, it is a neatly packaged phenomenon for studying many aspects of brain function.
References
[1] Julian Paul Keenan, Ven Thangaraj, Andrea R. Halpern, and Gottfried Schlaug. Absolute pitch and planum temporale. NeuroImage, 14:1402–1408, 2001.
[2] Mark Klein, Michael G. H. Coles, and Emanuel Donchin. People with absolute pitch process tones without producing a P300. Science, 223(4642):1306–1309, March 1984.
[3] Daniel J. Levitin and Susan E. Rogers. Absolute pitch: Perception, coding, and controversies. Trends in Cognitive Sciences, 9(1):26–33, January 2005.
[4] Robert J. Zatorre. Absolute pitch: A model for understanding the influence of genes and development on neural and cognitive function. Nature Neuroscience, 6(7):692–695, July 2003.
[5] Robert J. Zatorre, David W. Perry, Christine A. Beckett, Christopher F. Westbury, and Alan C. Evans. Functional anatomy of musical processing in listeners with absolute pitch and relative pitch. Proc. Natl. Acad. Sci. USA, 95:3172–3177, March 1998.