pitch perception - mit opencourseware · pitch of complex tones • harmonic tones produce a pitch...

Harvard-MIT Division of Health Sciences and Technology HST.723: Neural Coding and Perception of Sound Instructor: Andrew J. Oxenham Pitch Perception HST.723. Neural Coding and Perception of Sound © 2005 Andrew J. Oxenham




Pitch Perception of Pure Tones

The pitch of a pure tone is strongly related to the tone’s frequency, although there are small effects of level and masking.
• <1000 Hz: increased level: decreased pitch
• 1000-2000 Hz: little or no change
• >2000 Hz: increased level: increased pitch

Difference Limens for Frequency (DLF)

The auditory system is exquisitely sensitive to changes in frequency (e.g. 2-3 Hz at 1000 Hz = 0.01 dB).

Figure removed due to copyright reasons.

(Moore, 1997)
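The DLF example can be put into numbers with a short calculation. This is only a sketch: the function name is made up for illustration, and the decibel figure assumes that the frequency ratio is expressed as 10·log10(ratio), which is the convention that gives a value near the 0.01 dB quoted on the slide.

```python
import math

def frequency_change_measures(f_ref_hz, delta_hz):
    """Express a frequency change as a percentage, in cents, and as 10*log10 of the ratio."""
    ratio = (f_ref_hz + delta_hz) / f_ref_hz
    return {
        "weber_percent": 100.0 * delta_hz / f_ref_hz,
        "cents": 1200.0 * math.log2(ratio),
        "ratio_db": 10.0 * math.log10(ratio),  # assumed convention, see lead-in
    }

# The slide's example: a 2-3 Hz change at 1000 Hz.
for delta in (2.0, 3.0):
    print(delta, frequency_change_measures(1000.0, delta))
```

On this reading, a 2-3 Hz change at 1000 Hz is a 0.2-0.3% change in frequency, or a few cents (hundredths of a semitone).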


How is frequency coded - Place or timing?

• Place
  • Pros: Could in principle be used at all frequencies.
  • Cons: The peak of the BM (basilar membrane) traveling wave shifts basally with level by ½ octave, but no similar pitch shift is seen. Place coding also fails to account for poorer performance in DLFs at very high frequencies (> 4 kHz), although it does a reasonable job of predicting frequency-modulation difference limens (FMDLs).

Figure removed due to copyright reasons.

Zwicker’s proposal for FM detection. (From Moore, 1997)


Temporal cues

• Timing
  • Pros: Pitch estimate is basically level-invariant; may explain the absence of musical pitch above ca. 4-5 kHz.
  • Cons: Thought to break down totally above about 4 kHz (although some “optimal detector” models predict residual performance up to 8 or 10 kHz); harder to explain diplacusis (differences in pitch perception between the ears).

Figure removed due to copyright reasons.

From Rose et al. (1971)


Musical pitch

Musical pitch is probably at least 2-dimensional:
• Tone height: monotonically related to frequency
• Tone chroma: related to pitch class (note name)

Circularity in pitch judgments: changes in chroma but no change in height. With circular pitch, is a half-octave interval perceived as going up or down? (Deutsch, 1987)

• Musical pitch of pure tones breaks down above about 5 kHz: octave matches become erratic and melodies are no longer recognized. Differences in frequency are still detected – only tone chroma is absent.

• Further evidence for the influence of temporal coding?

(Demo from ASA Auditory Demonstrations CD)

Figure removed due to copyright reasons.


Pitch of complex tones

• Harmonic tones produce a pitch at the fundamental frequency (F0), even if there is no energy at the F0 itself (pitch of the missing fundamental). Evidence against Ohm/Helmholtz place theory.

[Figure: amplitude vs. time and frequency (Hz) panels; frequency axis marked at 200, 400, ..., 1600 Hz; both panels labeled "Pitch = 200 Hz".]
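The missing-fundamental effect can be illustrated numerically. The sketch below builds a complex tone from harmonics 2 to 8 of 200 Hz (so there is no 200-Hz component) and checks two things: the spectrum has essentially no energy at 200 Hz, yet the waveform still repeats every 1/200 s. The sampling rate, the number of harmonics, the equal amplitudes and sine phases, and all names are assumptions chosen for illustration.

```python
import math

F0 = 200.0                 # fundamental frequency in Hz (from the slide)
FS = 16000                 # sampling rate in Hz (assumption)
HARMONICS = range(2, 9)    # 400, 600, ..., 1600 Hz; the 200-Hz component is left out
PERIOD = int(FS / F0)      # 80 samples = 5 ms
N = 10 * PERIOD            # analyse exactly ten periods

# Equal-amplitude, sine-phase harmonics (assumption).
tone = [sum(math.sin(2 * math.pi * h * F0 * n / FS) for h in HARMONICS)
        for n in range(N + PERIOD)]

def magnitude_at(freq_hz):
    """Normalized magnitude of the projection of the tone onto one frequency."""
    re = sum(tone[n] * math.cos(2 * math.pi * freq_hz * n / FS) for n in range(N))
    im = sum(tone[n] * math.sin(2 * math.pi * freq_hz * n / FS) for n in range(N))
    return math.hypot(re, im) / N

# Largest difference between the waveform and a copy shifted by 5 ms or 2.5 ms.
period_error = max(abs(tone[n + PERIOD] - tone[n]) for n in range(N))
half_period_error = max(abs(tone[n + PERIOD // 2] - tone[n]) for n in range(N))

print("energy at 200 Hz:", magnitude_at(200.0))
print("energy at 400 Hz:", magnitude_at(400.0))
print("mismatch after 5 ms:", period_error)
print("mismatch after 2.5 ms:", half_period_error)
```

The waveform repeats after 5 ms but not after 2.5 ms, so its period is that of the absent 200-Hz fundamental rather than that of the lowest component actually present.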


Harmonic complex tones

Many sounds in our world are harmonic complex tones, consisting of many sinusoids all at multiples of the fundamental frequency (F0).

[Figure: four panels. Input Spectrum: level (dB) vs. frequency (Hz), 0-3500 Hz. Auditory Filterbank. Excitation Pattern: excitation (dB) vs. center frequency (Hz), 0-3500 Hz, with resolved and unresolved regions marked. BM Vibration: time (ms) vs. frequency (Hz).]

Cochlear filtering:

Resolved harmonics: Temporal fine structure

Unresolved harmonics: Temporal envelope

(Plack & Oxenham, 2005)
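A rough way to see why low harmonics are resolved and high harmonics are not is to compare the harmonic spacing (always F0) with the auditory filter bandwidth, which grows with center frequency. The sketch below uses the Glasberg and Moore equivalent rectangular bandwidth (ERB) formula, which is not part of the slides, and a deliberately crude rule of thumb (a harmonic counts as resolved when F0 exceeds the ERB at that harmonic's frequency). The F0 of 200 Hz and the function names are assumptions.

```python
def erb_hz(center_freq_hz):
    """Equivalent rectangular bandwidth of the auditory filter (Glasberg & Moore formula)."""
    return 24.7 * (4.37 * center_freq_hz / 1000.0 + 1.0)

def classify_harmonics(f0_hz, n_harmonics):
    """Split harmonic numbers into resolved/unresolved using a crude spacing-vs-ERB rule."""
    resolved, unresolved = [], []
    for n in range(1, n_harmonics + 1):
        # Rule of thumb (assumption): resolved if neighbors are more than one ERB apart.
        if f0_hz > erb_hz(n * f0_hz):
            resolved.append(n)
        else:
            unresolved.append(n)
    return resolved, unresolved

resolved, unresolved = classify_harmonics(200.0, 15)
print("resolved:", resolved)
print("unresolved:", unresolved)
```

The exact boundary depends on the criterion and on F0, so the transition should be read as gradual rather than as a sharp cutoff.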


Two temporal cues in complex sounds

• Temporal fine structure
  – Could be coded either by place or time (or both)
• Temporal envelope
  – Coded by timing information only

[Figure: amplitude vs. time waveforms illustrating fine structure (resolved harmonics) and envelope (unresolved harmonics).]
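The envelope cue can be illustrated with three neighboring high harmonics that are assumed to fall within one auditory filter. Their sum oscillates rapidly (fine structure near the center component), while its envelope repeats at the F0 period. In this sketch the envelope is obtained directly as the magnitude of the complex sum of the components; the harmonic numbers, F0, sampling rate and names are assumptions for illustration.

```python
import cmath
import math

FS = 16000                      # sampling rate in Hz (assumption)
F0 = 200.0                      # fundamental in Hz (assumption)
HARMONICS = (10, 11, 12)        # high harmonics assumed to share one auditory filter

def analytic_sample(n):
    """Complex sum of the components; its real part is the waveform."""
    t = n / FS
    return sum(cmath.exp(2j * math.pi * h * F0 * t) for h in HARMONICS)

period = int(FS / F0)           # 80 samples = 5 ms
waveform = [analytic_sample(n).real for n in range(2 * period + 1)]
envelope = [abs(analytic_sample(n)) for n in range(2 * period + 1)]

# Sample indices where the envelope reaches its maximum (all components in phase).
envelope_peaks = [n for n in range(len(envelope)) if envelope[n] > 2.999]
print(envelope_peaks)
```

The envelope maxima are spaced one F0 period apart, even though the fine structure under the envelope oscillates at roughly 2200 Hz.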


High (unresolved) harmonics produce poor musical pitch

• Highpass filtered above 8th harmonic: Unresolved
• Lowpass filtered below 8th harmonic: Resolved
• No filtering: Resolved & Unresolved

(Courtesy of Bertrand Delgutte.)


Low (resolved) harmonics dominate pitch perception

100-100 100-106 100-112 100-133 100-178

F0 below 800 Hz F0 above 800 Hz

Figure removed due to copyright reasons.

Resynthesized sentences with low- and high-spectral regions on different F0s (Demo by C.J. Darwin)


Mechanisms of Complex Pitch Perception: The Early Years

Temporal Theory (Schouten, 1940): Pitch is extracted from the summed waveform of adjacent components. This requires that some components interact.

Pattern Recognition Theory (e.g. Goldstein, 1973): The frequencies of individual components are determined and the “best-fitting” F0 is selected. This requires that some components remain resolved and that some form of “harmonic template” exists.
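The idea of selecting a "best-fitting" F0 from resolved component frequencies can be sketched as a simple harmonic sieve. This is not Goldstein's optimum processor, only an illustrative stand-in: it returns the highest candidate F0 (on an assumed 1-Hz grid) for which every component lies within a small relative tolerance of an integer multiple. The function name, search range and tolerance are assumptions.

```python
def best_fitting_f0(components_hz, f0_min=50, f0_max=500, tol=0.002):
    """Return the highest candidate F0 (Hz) whose harmonics match all components."""
    for f0 in range(f0_max, f0_min - 1, -1):      # try the highest candidate first
        fits = True
        for f in components_hz:
            harmonic_number = max(1, round(f / f0))
            if abs(f - harmonic_number * f0) > tol * f:
                fits = False
                break
        if fits:
            return float(f0)
    return None                                   # no candidate fits within tolerance

# Neither set of components contains energy at the F0 itself.
print(best_fitting_f0([600.0, 800.0, 1000.0]))
print(best_fitting_f0([1800.0, 2000.0, 2200.0]))
```

Searching from high to low candidates avoids returning subharmonics such as 100 or 50 Hz, which fit the same components equally well. The sieve needs the individual component frequencies as input, which is why this class of theory requires resolved harmonics.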


Pros and Cons of Temporal and Place Models of Pitch

Evidence against a “pure” temporal model
• Pitch sensation is strongest for low-order (resolved) harmonics (Plomp, 1967; Ritsma, 1967).
• Pitch can be elicited by only two components, one in each ear (Houtsma and Goldstein, 1972).
• Pitch can be elicited by consecutively presented harmonics (Grose et al., 2002).

Evidence against a “pure” pattern recognition theory
• Very high, unresolved harmonics can still produce a (weaker) pitch sensation.
• Aperiodic, sinusoidally amplitude-modulated (SAM) white noise can produce a pitch sensation (Burns and Viemeister, 1976; 1981).


Autocorrelation model of pitch perception

• Based on an original proposal by Licklider (1951).

• The stimulus within each frequency channel is correlated (delayed, multiplied and averaged) with itself (through delay lines).

• This produces peaks at time intervals corresponding to multiples of the stimulus period.

• Pooling interval histograms across frequency produces an overall estimate of the “dominant” interval, which generally corresponds to the period of the fundamental frequency (1/F0).

Please see: Meddis, R., and M. Hewitt. “Virtual pitch and phase sensitivity of a computer model of the auditory periphery. I: Pitch identification.” J Acoust Soc Am 89 (1991): 2866-2882.

Figure removed due to copyright considerations.
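The delay, multiply, average and pool steps can be sketched in a few lines. This is a toy version under strong assumptions, not the Meddis and Hewitt model: each "channel" is simply one half-wave rectified harmonic (standing in for cochlear filtering and hair-cell transduction), and the sampling rate, window length, lag range and all names are chosen for illustration.

```python
import math

FS = 24000                                  # sampling rate in Hz (assumption)
CHANNEL_FREQS = [600.0, 800.0, 1000.0]      # harmonics 3-5 of 200 Hz; no energy at F0
N = 480                                     # 20-ms averaging window
MAX_LAG = 192                               # longest delay considered: 8 ms

def channel_output(freq_hz, n_samples):
    """Half-wave rectified sinusoid: crude stand-in for one peripheral channel."""
    return [max(0.0, math.sin(2 * math.pi * freq_hz * n / FS)) for n in range(n_samples)]

def autocorrelation(x, n_window, max_lag):
    """Delay, multiply and average the channel output with itself."""
    return [sum(x[n] * x[n + lag] for n in range(n_window)) / n_window
            for lag in range(max_lag + 1)]

def summary_autocorrelation(freqs):
    """Pool (sum) the per-channel autocorrelation functions across channels."""
    total = [0.0] * (MAX_LAG + 1)
    for f in freqs:
        acf = autocorrelation(channel_output(f, N + MAX_LAG), N, MAX_LAG)
        total = [a + b for a, b in zip(total, acf)]
    return total

sacf = summary_autocorrelation(CHANNEL_FREQS)
min_lag = FS // 1000                        # ignore delays under 1 ms (zero-lag peak)
best_lag = max(range(min_lag, MAX_LAG + 1), key=lambda lag: sacf[lag])
print("dominant interval (ms):", 1000.0 * best_lag / FS)
print("pitch estimate (Hz):", FS / best_lag)
```

Each channel alone peaks at multiples of its own period; only at the 5-ms delay do all three channels peak together, so the pooled function picks out the period of the missing 200-Hz fundamental.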


Autocorrelation model

Pros:
• Model can deal with both resolved and unresolved harmonics.
• Predicts no effect of phase for resolved harmonics, but strong phase effects for unresolved harmonics, in line with data (Meddis & Hewitt, 1991).
• Predicts a dominance region of pitch, roughly in line with early psychophysical data, due to reduction in phase locking with frequency.

Cons:
• Deals too well with unresolved harmonics: predicts no difference based on resolvability, in contrast to psychophysical data (Carlyon and Shackleton, 1994).
• Dominance region based on absolute, not relative, frequency, in contrast to data.

[N.B. The “template” model of Shamma and Klein (2000) involves place and timing coding, but not in the traditional sense.]


“Regular Interval Noise”

[Block diagrams: three networks driven by noise X(t), each built from a delay (d), a gain (g) and an adder (+/-): rippled noise, comb-filtered noise, and iterated rippled noise (IRN).]
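One common way to generate iterated rippled noise is to pass noise repeatedly through a delay-and-add stage. The sketch below assumes that form (each iteration adds a delayed, scaled copy of the current signal to itself); the slide's diagrams are not recoverable in detail, so the network layout, the parameter values and the function names here are all assumptions.

```python
import random

def iterated_rippled_noise(noise, delay_samples, gain, iterations):
    """Apply a delay-and-add stage repeatedly: y <- y + gain * delayed(y)."""
    y = list(noise)
    for _ in range(iterations):
        delayed = [0.0] * delay_samples + y[:-delay_samples]
        y = [a + gain * b for a, b in zip(y, delayed)]
    return y

def normalized_autocorrelation(x, lag):
    """Autocorrelation at one lag, normalized by the signal energy."""
    num = sum(x[n] * x[n + lag] for n in range(len(x) - lag))
    den = sum(v * v for v in x)
    return num / den

FS = 20000                        # sampling rate in Hz (assumption)
DELAY = 100                       # 5-ms delay, i.e. a pitch near 200 Hz
rng = random.Random(0)            # fixed seed so the example is repeatable
noise = [rng.gauss(0.0, 1.0) for _ in range(20000)]
irn = iterated_rippled_noise(noise, DELAY, gain=1.0, iterations=4)

print("noise, lag = delay:", normalized_autocorrelation(noise, DELAY))
print("IRN,   lag = delay:", normalized_autocorrelation(irn, DELAY))
print("IRN,   other lag:  ", normalized_autocorrelation(irn, 37))
```

The stimulus remains noise-like, but its autocorrelation acquires a strong peak at the delay, which is the temporal regularity that "regular interval noise" refers to. More iterations or a gain closer to 1 strengthen that peak.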


Figure removed due to copyright reasons.

Patterson et al. (2002)


Distinguishing time from place

• For pure tones, temporal and place information co-vary, making dissociation difficult.

• Transposed stimuli (van de Par & Kohlrausch, 1997) are an attempt to overcome this.

AIMS:
• Transpose low-frequency temporal fine-structure information into the envelope of a high-frequency carrier.

• Dissociate place and time representations.


What are transposed stimuli?

[Figure: stimuli and their peripheral auditory representations for a sinusoid and a transposed tone. The transposed tone is formed by multiplying (x) a modulator by a carrier. Each panel plots amplitude (-1 to 1) against time (0-15 ms). A spectrum sketch on a frequency axis is labeled fm, fm-fc and fm+fc.]

(van de Par and Kohlrausch, 1997)
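A transposed tone can be sketched as a low-frequency modulator multiplied by a high-frequency carrier. The version below assumes the modulator is a half-wave rectified sinusoid, and it omits the low-pass filtering of the modulator that published stimuli typically apply to limit spectral spread. The 300-Hz and 4-kHz values follow the demo mentioned later in the slides; the sampling rate and names are assumptions.

```python
import math

FS = 48000            # sampling rate in Hz (assumption)
F_MOD = 300.0         # low-frequency tone whose fine structure is transposed
F_CARRIER = 4000.0    # high-frequency carrier

def transposed_tone(n_samples):
    """Half-wave rectified low-frequency sinusoid multiplied by a high-frequency carrier."""
    out = []
    for n in range(n_samples):
        t = n / FS
        modulator = max(0.0, math.sin(2 * math.pi * F_MOD * t))  # rectified (assumption)
        carrier = math.sin(2 * math.pi * F_CARRIER * t)
        out.append(modulator * carrier)
    return out

period = int(FS / F_MOD)                  # 160 samples per modulator cycle
tone = transposed_tone(2 * period)
print("peak amplitude:", max(abs(v) for v in tone))
print("peak in silent half-cycle:", max(abs(v) for v in tone[period // 2:period]))
```

The carrier places the energy near 4 kHz, while the envelope follows the half-wave rectified 300-Hz waveform: bursts of carrier during one half of each modulator cycle and silence during the other half.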


Interaural Time Differences (ITDs)

[Figure: ITD (µs) as a function of frequency (Hz), for pure tones and for 4000-Hz TS.]

Figures from Oxenham, A. J., J. G. W. Bernstein, and H. Penagos. "Correct tonotopic representation is necessary for complex pitch perception," Proc Natl Acad Sci USA 101 (2004): 1421-1425. Copyright (2004) National Academy of Sciences, U.S.A.


Pure-tone frequency difference limens

[Figure: frequency difference (%) as a function of frequency (Hz), for pure tones and for 4000-Hz, 6350-Hz and 10080-Hz TT.]

Figures from Oxenham, A. J., J. G. W. Bernstein, and H. Penagos. "Correct tonotopic representation is necessary for complex pitch perception," Proc Natl Acad Sci USA 101 (2004): 1421-1425. Copyright (2004) National Academy of Sciences, U.S.A.


Transposed tones: Simple pitch

• Unlike ITDs, temporal information for frequency cannot be used optimally by the auditory system.
• Pitch perception seems weaker for all transposed tones.
• Place information may be important.

What about complex pitch?

300-Hz tone, transposed to 4 kHz
300-Hz pure tone


Complex tone pitch perception

[Figure: two spectra on a frequency (Hz) axis. (1) Components at 300, 400 and 500 Hz: pitch = 100 Hz. (2) Components at 4000, 6300 and 10080 Hz: pitch = ?]


Temporal model predictions

Please see: Meddis, R., and L. O'Mard. "A unitary model of pitch perception." J Acoust Soc Am 102 (1997): 1811-1820.

Figure removed due to copyright considerations.


Pitch matches

[Figure: histograms of the number of matches (0-40) as a function of semitones (-10 to 10), for sinusoids and transposed tones, with one row per subject (S7, S8, S9).]

Figures from Oxenham, A. J., J. G. W. Bernstein, and H. Penagos. "Correct tonotopic representation is necessary for complex pitch perception," Proc Natl Acad Sci USA 101 (2004): 1421-1425. Copyright (2004) National Academy of Sciences, U.S.A.


Transposed tones: Conclusions

• Pitch of pure tones is poor and complex pitch is nonexistent.

• Suggests that fine structure must be presented to the correct place in the cochlea – timing is not enough.

• Possible hybrid models include Shamma and Klein’s (2000) harmonic template model.


Musical intervals: Consonance and Dissonance

• In the West, the equal- (or well-) tempered scale has been adopted, with the octave split into twelve equal (semitone) steps on a log scale, i.e., 1 semitone higher is 2^(1/12) times higher in frequency.

• This is a compromise: the intervals in the harmonic series only approximate the notes of the scale.

• Perceived dissonance is in part due to beating effects between neighboring harmonics. Remaining effect of perceived consonance and dissonance may be simply cultural.

[Figure: harmonic series f0, 2f0, 3f0, ..., 8f0 on a log(f) axis, with the octave, fifth, fourth and major third marked between successive harmonics.]
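The size of the compromise can be worked out by comparing each equal-tempered interval with the corresponding ratio from the harmonic series. The sketch below expresses the difference in cents (hundredths of a semitone). The semitone counts per interval (12, 7, 5, 4) are standard music-theory values that are not stated on the slide, and the names are illustrative.

```python
import math

# Frequency ratios between successive harmonics (2:1, 3:2, 4:3, 5:4).
HARMONIC_RATIOS = {"octave": (2, 1), "fifth": (3, 2), "fourth": (4, 3), "major third": (5, 4)}
# Size of each interval in equal-tempered semitones (standard values, assumption).
EQUAL_TEMPERED_SEMITONES = {"octave": 12, "fifth": 7, "fourth": 5, "major third": 4}

def cents(ratio):
    """Interval size in cents: 1200 cents per octave, 100 per equal-tempered semitone."""
    return 1200.0 * math.log2(ratio)

def tempering_errors():
    """Equal-tempered interval minus harmonic-series interval, in cents."""
    errors = {}
    for name, (num, den) in HARMONIC_RATIOS.items():
        tempered_ratio = 2.0 ** (EQUAL_TEMPERED_SEMITONES[name] / 12.0)
        errors[name] = cents(tempered_ratio) - cents(num / den)
    return errors

for name, err in tempering_errors().items():
    print(f"{name}: {err:+.2f} cents")
```

The octave is exact, the fifth and fourth are off by about 2 cents in opposite directions, and the major third is about 14 cents wider than the 5:4 ratio.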


Auditory Grouping and Pitch

Simultaneous, harmonically related tones tend to form a single auditory object, which makes ecological sense.

What happens if one component is slightly out of tune?

Harmonicity can be a strong cue in binding components together, but it can be overridden by competing cues or expectations (Darwin et al., 1994; 1995).

A mistuned harmonic can be “heard out” more easily, but can still contribute to the overall pitch of the complex. This is an example of “duplex perception”.


References

Moore, B. C. J. (1997). An Introduction to the Psychology of Hearing (Academic Press, London).

Rose, J. E., Hind, J. E., Anderson, D. J., and Brugge, J. F. (1971). "Some effects of the stimulus intensity on response of auditory nerve fibers in the squirrel monkey," J. Neurophysiol. 34, 685-699.

Deutsch, D. (1987). "The tritone paradox: effects of spectral variables," Percept. Psychophys. 41, 563-575.

Plack, C. J., and Oxenham, A. J. (2005). "Pitch perception," in Pitch: Neural Coding and Perception, edited by C. J. Plack, A. J. Oxenham, A. N. Popper and R. Fay (Springer, New York).

Schouten, J. F. (1940). "The residue and the mechanism of hearing," Proc. Kon. Akad. Wetenschap. 43, 991-999.

Goldstein, J. L. (1973). "An optimum processor theory for the central formation of the pitch of complex tones," J. Acoust. Soc. Am. 54, 1496-1516.

Ritsma, R. J. (1967). "Frequencies dominant in the perception of the pitch of complex sounds," J. Acoust. Soc. Am. 42, 191-198.

Plomp, R. (1967). "Pitch of complex tones," J. Acoust. Soc. Am. 41, 1526-1533.

Houtsma, A. J. M., and Goldstein, J. L. (1972). "The central origin of the pitch of complex tones: Evidence from musical interval recognition," J. Acoust. Soc. Am. 51, 520-529.

Grose, J. H., Hall, J. W., and Buss, E. (2002). "Virtual pitch integration for asynchronous harmonics," J. Acoust. Soc. Am. 112, 2956-2961.

Burns, E. M., and Viemeister, N. F. (1976). "Nonspectral pitch," J. Acoust. Soc. Am. 60, 863-869.

Burns, E. M., and Viemeister, N. F. (1981). "Played again SAM: Further observations on the pitch of amplitude-modulated noise," J. Acoust. Soc. Am. 70, 1655-1660.

Licklider, J. C. R. (1951). "A duplex theory of pitch perception," Experientia 7, 128-133.

Meddis, R., and Hewitt, M. (1991). "Virtual pitch and phase sensitivity of a computer model of the auditory periphery. I: Pitch identification," J. Acoust. Soc. Am. 89, 2866-2882.

Shamma, S., and Klein, D. (2000). "The case of the missing pitch templates: How harmonic templates emerge in the early auditory system," J. Acoust. Soc. Am. 107, 2631-2644.

Patterson, R. D., Uppenkamp, S., Johnsrude, I. S., and Griffiths, T. D. (2002). "The processing of temporal pitch and melody information in auditory cortex," Neuron 36, 767-776.

van de Par, S., and Kohlrausch, A. (1997). "A new approach to comparing binaural masking level differences at low and high frequencies," J. Acoust. Soc. Am. 101, 1671-1680.

Oxenham, A. J., Bernstein, J. G. W., and Penagos, H. (2004). "Correct tonotopic representation is necessary for complex pitch perception," Proc. Natl. Acad. Sci. USA 101, 1421-1425.

Meddis, R., and O'Mard, L. (1997). "A unitary model of pitch perception," J. Acoust. Soc. Am. 102, 1811-1820.

Darwin, C. J., Ciocca, V., and Sandell, G. J. (1994). "Effects of frequency and amplitude modulation on the pitch of a complex tone with a mistuned harmonic," J. Acoust. Soc. Am. 95, 2631-2636.

Darwin, C. J., Hukin, R. W., and al-Khatib, B. Y. (1995). "Grouping in pitch perception: Evidence for sequential constraints," J. Acoust. Soc. Am. 98, 880-885.