Hearing 1 SGN-14006 / A.K.
Contents: 1. Introduction 2. Ear physiology 3. Loudness 4. Masking 5. Pitch 6. Spatial hearing
Sources:
– Rossing (1990). "The science of sound". Chapters 5–7.
– Karjalainen (1999). "Kommunikaatioakustiikka".
– Pulkki, Karjalainen (2015). "Communication acoustics".
– Moore (1997). "An introduction to the psychology of hearing".
Hearing 2 SGN-14006 / A.K. 1 Introduction
! Auditory system can be divided in two parts – Peripheral auditory system (outer, middle, and inner ear) – Auditory nervous system (in the brain)
! Ear physiology studies the peripheral system ! Psychoacoustics studies the entire sensation:
relationships between sound stimuli and the subjective sensation
Hearing 3 SGN-14006 / A.K. 1.1 Auditory system
! Dynamic range of hearing is wide – the ratio of a barely audible sound pressure to a very loud one is about 1:10^6 (120 dB)
! Frequency range of hearing varies a lot between individuals
– only a few can hear from 20 Hz to 20 kHz
– sensitivity to low sounds (< 100 Hz) is not very good
– sensitivity to high sounds (> 12 kHz) decreases with age
! Selectivity of hearing
– listener can pick an instrument from among an orchestra
– listener can follow a speaker at a cocktail party
– One can sleep in background noise but still wake up to an abnormal sound
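The 120 dB figure for the dynamic range follows directly from the decibel definition for pressure-like quantities. A minimal sketch of that arithmetic (the function name is only illustrative):

```python
import math


def pressure_ratio_to_db(ratio: float) -> float:
    """Convert a sound pressure ratio to decibels.

    Pressure is an amplitude-like quantity, so the factor is 20
    (power-like quantities would use 10).
    """
    return 20.0 * math.log10(ratio)


# A pressure ratio of one to a million corresponds to about 120 dB.
dynamic_range_db = pressure_ratio_to_db(1e6)
print(round(dynamic_range_db, 1))
```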
Hearing 4 SGN-14006 / A.K. 1.2 Psychoacoustics
! Perception involves information processing in the brain – Information about the brain is limited
! Psychoacoustics studies the relationships between sound stimuli and the resulting sensations – Attempt to model the process of perception – For example trying to predict the perceived loudness / pitch /
timbre from the acoustic properties of the sound signal
! In a psychoacoustic listening test – Test subject listens to sounds – Questions are asked, or the subject is asked to describe her sensations
Hearing 5 SGN-14006 / A.K. 2 Ear physiology
! The human ear consists of three main parts: (1) outer ear, (2) middle ear, (3) inner ear
Hearing 6 SGN-14006 / A.K. 2.1 Outer ear
! Outer ear consists of:
– pinna: gathers sound; direction-dependent response
– auditory canal (ear canal): conveys sound to the middle ear, and acts as a resonator at 3-4 kHz, amplifying sound by about 10 dB
! Outer ear is passive and linear, and its behavior can be completely described by laws of acoustic wave propagation.
[Figure: the ear; label: nerve signal to brain] [Chittka05]
Hearing 7 SGN-14006 / A.K. 2.2 Middle ear
! Middle ear contains
– Eardrum, which transforms sound waves into mechanical vibration
– Tiny auditory bones (ossicles): 1 Malleus (resting against the eardrum, see figure), 2 Incus and 3 Stapes
! The ossicles transmit eardrum vibrations (in air) to the oval window of the inner ear (filled with fluid)
! Acoustic reflex:
– when sound pressure level exceeds 50-60 dB, eardrum tension increases and the stapes is pulled away from the oval window
– Protects the inner ear from damage
[Figure: middle ear, with the ossicles labelled 1, 2 and 3]
Hearing 8 SGN-14006 / A.K. 2.3 Inner ear, cochlea
! The inner ear contains the cochlea: a fluid-filled organ where vibrations are converted into nerve impulses to the brain.
! Cochlea = Greek: "snail shell".
! Spiral tube: when stretched out, approximately 35 millimeters long.
! Vibrations on the cochlea's oval window cause hydraulic pressure waves inside the cochlea
! Inside the cochlea there is the basilar membrane
! On the basilar membrane there is the organ of Corti with nerve cells that are sensitive to vibration
! Nerve cells transform movement information into neural impulses in the auditory nerve
Hearing 9 SGN-14006 / A.K. 2.4 Basilar membrane
! Figure: cochlea stretched out for illustration purposes (labels: base, apex)
– Basilar membrane divides the fluid of the cochlea into separate tunnels
– Hydraulic pressure waves travel along the cochlea from the oval window to the round window and move the basilar membrane as they pass
Hearing 10 SGN-14006 / A.K. Basilar membrane
[Figure: travelling waves along the basilar membrane between base and apex; axis: best frequency (Hz)]
! Different frequencies produce highest amplitude at different sites
! Preliminary frequency analysis happens on the basilar membrane
Hearing 11 SGN-14006 / A.K. 2.5 Sensory hair cells
! Distributed along the basilar membrane are sensory hair cells that transform membrane movement into neural impulses
! When a hair cell bends, it generates neural impulses
– Impulse rate depends on vibration amplitude and frequency
! Depending on its position on the basilar membrane, each nerve cell has a characteristic frequency to which it is most responsive (right fig.: "tuning curves" of 6 different cells from different positions)
! Left fig.: Increasing the amplitude of the stimulus (dB) broadens the range of frequencies that the hair cell reacts to
Hearing 12 SGN-14006 / A.K. 3 Loudness
! Loudness describes the subjective level of sound
– Perception of loudness is a relatively complex but consistent phenomenon, and one of the central parts of psychoacoustics
! The loudness of a sound can be compared to a standardized reference tone, for example a 1000 Hz sinusoidal tone
– Loudness level (phon) is defined as the sound pressure level (dB) of a 1000 Hz sinusoid that has the same subjective loudness as the target sound
– For example, if the heard sound is perceived as equally loud as a 40 dB, 1 kHz sinusoid, the loudness level is 40 phons
Hearing 13 SGN-14006 / A.K. 3.1 Equal-loudness curves
[Figure: equal-loudness curves. Axes: frequency (Hz) vs. sound pressure level (dB); each curve is labelled with its loudness level (phons)]
Hearing 14 SGN-14006 / A.K. 3.2 Critical bands
! When comparing two band-limited noises with the same center frequency and increasing the bandwidth of one of them, the perceived loudness increases only after the bandwidth exceeds a critical bandwidth
– Figure: 1 kHz @ 60 dB, critical bandwidth is 160 Hz at 1 kHz
! Ear analyzes sound at critical band resolution. Each critical band contributes to the overall loudness level
! Each center frequency fc has a critical bandwidth Δf (Bark bandwidths)
! Equivalent rectangular bandwidth (ERB) is a different procedure for measuring the bandwidth of auditory filters, expressed as the width of an equivalent rectangular filter.
! Both bandwidths can be used to define frequency scales (Bark and ERB scales)
– Stack bandwidths on top of each other
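The 160 Hz critical bandwidth at 1 kHz can be checked against commonly used closed-form approximations. The sketch below assumes the Zwicker and Terhardt (1980) formula for the critical bandwidth and the Glasberg and Moore (1990) formula for the ERB; neither formula appears on the slides, and the function names are illustrative.

```python
def critical_bandwidth_hz(fc_hz: float) -> float:
    """Critical (Bark) bandwidth, Zwicker & Terhardt (1980) approximation (assumed here)."""
    f_khz = fc_hz / 1000.0
    return 25.0 + 75.0 * (1.0 + 1.4 * f_khz ** 2) ** 0.69


def erb_hz(fc_hz: float) -> float:
    """Equivalent rectangular bandwidth, Glasberg & Moore (1990) approximation (assumed here)."""
    return 24.7 * (4.37 * fc_hz / 1000.0 + 1.0)


for fc in (250.0, 1000.0, 4000.0):
    print(fc, round(critical_bandwidth_hz(fc), 1), round(erb_hz(fc), 1))
```

At 1 kHz the first formula gives a value close to the 160 Hz read from the figure, while the ERB is somewhat narrower. Both bandwidths grow with center frequency.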
Hearing 15 SGN-14006 / A.K. 3.3 Perceptually-motivated frequency scales
[Figure: comparison of scales: position on the basilar membrane (mm), frequency (kHz), frequency (mel), frequency (Bark)]
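The mapping from hertz to the mel and Bark scales is often approximated with closed-form expressions. A minimal sketch, assuming the widely used O'Shaughnessy mel formula and the Zwicker and Terhardt Bark formula (the slides do not say which definitions the figure uses):

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Mel scale, O'Shaughnessy approximation (assumed definition)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def hz_to_bark(f_hz: float) -> float:
    """Bark scale, Zwicker & Terhardt approximation (assumed definition)."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)


for f in (100.0, 1000.0, 4000.0, 10000.0):
    print(f, round(hz_to_mel(f)), round(hz_to_bark(f), 2))
```

Both scales are roughly linear at low frequencies and compress high frequencies, which mirrors the widening of critical bands with frequency.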
Hearing 16 SGN-14006 / A.K. 3.4 Loudness of a complex sound
! A broadband sound is perceived as louder than a narrowband sound of equal SPL, as shown by the critical bandwidth experiment.
! Loudness of a complex sound is calculated using the so-called specific loudness of each critical band as an intermediate unit
! Specific loudness is roughly proportional to the log-power of the signal in the band (weighted according to sensitivity of hearing and spread slightly by convolving over bands)
! Overall loudness is obtained by summing the specific loudness values over the critical bands
Hearing 17 SGN-14006 / A.K. 3.5 Measuring sound level
! SPL does not work well as a perceptual metric of sound loudness.
! Different frequency weighting curves (A, B, C, D) can be applied when measuring sound level, so that the result relates better to the frequency sensitivity of hearing
! A-weighting is the most common and is referred to as dB(A); it roughly resembles the inverse of the hearing threshold curve.
[Figure: frequency weighting curves. Source: Wikipedia]
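As an illustration of a frequency weighting curve, the A-weighting gain can be computed from the analytic expression given in the sound level meter standard IEC 61672. This is a sketch of that standard formula, not something taken from the slides; the function name is illustrative.

```python
import math


def a_weighting_db(f_hz: float) -> float:
    """A-weighting gain in dB at frequency f_hz (analytic form from IEC 61672)."""
    f2 = f_hz ** 2
    ra = (12194.0 ** 2 * f2 ** 2) / (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    # The +2.00 dB offset normalizes the curve to 0 dB at 1 kHz.
    return 20.0 * math.log10(ra) + 2.00


for f in (50.0, 100.0, 1000.0, 4000.0, 10000.0):
    print(f, round(a_weighting_db(f), 1))
```

Low frequencies are attenuated strongly (roughly 19 dB at 100 Hz), which reflects the poor sensitivity of hearing below 100 Hz.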
Hearing 18 SGN-14006 / A.K. 4 Masking
! Masking describes the situation where a weaker but clearly audible signal (maskee, test tone) becomes inaudible in the presence of a louder signal (masker)
! Masking depends on both the spectral structure of the sounds and their variation over time
Hearing 19 SGN-14006 / A.K. 4.1 Masking in frequency domain
! Model of the frequency analysis in the auditory system
– subdivision of the frequency axis into critical bands
– frequency components within the same critical band mask each other easily
– Bark & ERB scales: frequency scales that are derived by mapping frequencies to critical band numbers
! Narrowband noise masks a tone (sinusoid) more easily than a tone masks noise
! Masked threshold refers to the raised threshold of audibility caused by the masker
– sounds with a level below the masked threshold are inaudible
– masked threshold in quiet = threshold of hearing in quiet
Hearing 20 SGN-14006 / A.K. Masking in frequency domain
! Figure: masked thresholds [Herre95]
– masker: narrowband noise around 250 Hz, 1 kHz, 4 kHz
– spreading function: the effect of masking extends to the spectral vicinity of the masker (spreads more towards high frequencies)
! Additivity of masking: the joint masked threshold is approximately (but slightly more than) the sum of the component thresholds
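One simple way to read the additivity statement is as a power sum of the individual masked thresholds. The sketch below assumes that interpretation (thresholds in dB are converted to power, added, and converted back); the slide does not specify the summation rule, and the true joint threshold is said to lie slightly above this value.

```python
import math


def combine_masked_thresholds_db(thresholds_db):
    """Power-sum of masked thresholds given in dB (assumed summation rule)."""
    total_power = sum(10.0 ** (t / 10.0) for t in thresholds_db)
    return 10.0 * math.log10(total_power)


# Two maskers that each raise the threshold to 40 dB at some frequency:
# the power sum is about 3 dB above either one.
joint_db = combine_masked_thresholds_db([40.0, 40.0])
print(round(joint_db, 2))
```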
Hearing 21 SGN-14006 / A.K. 4.2 Masking in time domain
! Forward masking – masking effect extends to times after the masker is switched off
! Backward masking – masking extends to times before the masker is switched on
! Forward/backward masking does not extend far in time, so simultaneous masking is the more important phenomenon
[Figure: time course of masking; labels: forward masking, backward masking]
Hearing 22 SGN-14006 / A.K. 5 Pitch
! Pitch
– Subjective attribute of sounds that enables us to arrange them on a frequency-related scale ranging from low to high
– A sound has a certain pitch if human listeners can consistently match the frequency of a sinusoidal tone to the pitch of the sound
! Fundamental frequency vs. pitch
– Fundamental frequency is a physical attribute
– Pitch is a perceptual attribute
– Both are measured in hertz (Hz)
– In practice, perceived pitch ≈ fundamental frequency
! "Perfect pitch" or "absolute pitch": ability to recognize the pitch of a musical note without any reference
– Only a minority of the population can do that
Hearing 23 SGN-14006 / A.K. 5.1 Harmonic sound
! For a sinusoidal tone – Fundamental frequency = sinusoidal frequency – Pitch ≈ sinusoidal frequency
! Harmonic sound
Trumpet sound: * Fundamental frequency F = 262 Hz * Period 1/F = 3.8 ms
Hearing 24 SGN-14006 / A.K. 5.2 Pitch perception
! Two competing theories have been proposed to explain pitch perception
– Place theory: "Peak activity along the basilar membrane determines pitch" (fails to explain missing fundamental)
– Periodicity theory: "Pitch depends on rate, not place, of response." Neurons fire in sync with signals
! The real mechanism is a combination of the above
– Sound is subdivided into subbands (critical bands)
– Periodicity of the amplitude envelope (see lowest panel) is analyzed within bands
– Results are combined across bands
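The missing-fundamental case can be illustrated with a periodicity analysis. The sketch below is a deliberately simplified stand-in for the mechanism described above: it runs a plain full-band autocorrelation instead of analyzing envelope periodicity within critical bands, and the sample rate, test signal, and function names are all assumptions made for the example.

```python
import math


def harmonic_complex(f0_hz, harmonics, fs_hz, n_samples):
    """Sum of equal-amplitude sinusoids at the given harmonic numbers of f0."""
    return [
        sum(math.sin(2.0 * math.pi * h * f0_hz * n / fs_hz) for h in harmonics)
        for n in range(n_samples)
    ]


def autocorrelation_pitch(signal, fs_hz, fmin_hz=100.0, fmax_hz=500.0):
    """Estimate pitch as fs / lag for the lag with the largest autocorrelation."""
    min_lag = int(fs_hz / fmax_hz)
    max_lag = int(fs_hz / fmin_hz)
    best_lag, best_value = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        value = sum(signal[n] * signal[n + lag] for n in range(len(signal) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return fs_hz / best_lag


FS = 8000.0
# Harmonics 2 to 5 of 200 Hz: there is no energy at 200 Hz itself.
tone = harmonic_complex(200.0, [2, 3, 4, 5], FS, 800)
estimate_hz = autocorrelation_pitch(tone, FS)
print(estimate_hz)  # the strongest periodicity is at the 5 ms period of the absent 200 Hz fundamental
```

A pure place-based peak picker would report one of the components that are physically present (400 to 1000 Hz), whereas the periodicity analysis recovers the common period of the components.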
Hearing 25 SGN-14006 / A.K. 6 Spatial hearing
! The most important auditory cues for localizing sound sources in space are
1. Interaural time difference (ITD)
• Is relatively constant with frequency
2. Interaural intensity difference (IID), also called interaural level difference (ILD)
• Increases with frequency
3. Direction-dependent filtering of the sound spectrum by head and pinnae
! Terms
– Monaural: with one ear
– Binaural: with two ears
– Interaural: between the ears (interaural time difference etc.)
– Lateralization: localizing a source in the horizontal plane
Hearing 26 SGN-14006 / A.K. 6.1 Monaural source localization
! Directional hearing works to some extent even with one ear
! Head and pinna form a direction-dependent filter
– Direction-dependent changes in the spectrum of the sound arriving in the ear can be described with HRTFs
– HRTF = head-related transfer function
! HRTFs are crucial for localizing sources in the median plane (vertical localization)
Hearing 27 SGN-14006 / A.K. Monaural source localization
! HRTFs can be measured by recording
– Sound emitted by a source
– Sound arriving at the auditory canal or eardrum (transfer function of the auditory canal does not vary with direction)
! In practice
– left: microphone in the ear of a test subject, OR
– right: head and torso simulator
Hearing 28 SGN-14006 / A.K. 6.2 Localizing a sinusoid
! Experimenting with sinusoidal tones helps to understand the localization of more complex sounds
! Angle-of-arrival perception for sinusoids below 750 Hz is based mainly on interaural time difference
Hearing 29 SGN-14006 / A.K. Localizing a sinusoid
! Interaural time difference is useful only up to 750 Hz
– Above that, the time difference is ambiguous, since it can span several periods of the tone
– Moving the head (or source movement) helps: can be done up to 1500 Hz
! At higher frequencies (> 750 Hz) the auditory system utilizes interaural intensity difference
– Head causes an acoustic "shadow" (sound level is lower behind the head)
– Works especially at high frequencies, where wavelengths are short compared to head dimensions
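The 750 Hz and 1500 Hz limits can be related to the largest possible interaural time difference. The sketch below assumes Woodworth's spherical-head model, a head radius of 8.75 cm, and a speed of sound of 343 m/s; none of these values come from the slides.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed (air at about 20 degrees C)
HEAD_RADIUS = 0.0875    # m, assumed typical adult head radius


def itd_seconds(azimuth_deg: float) -> float:
    """ITD for a distant source, Woodworth spherical-head model (assumed)."""
    theta = math.radians(azimuth_deg)
    return HEAD_RADIUS / SPEED_OF_SOUND * (theta + math.sin(theta))


# Largest ITD: source directly to one side (90 degrees azimuth).
max_itd = itd_seconds(90.0)

# The interaural phase difference becomes ambiguous once half a period
# of the tone is shorter than the ITD.
ambiguity_hz = 1.0 / (2.0 * max_itd)

# One full period fits inside the ITD at twice that frequency.
full_period_hz = 1.0 / max_itd

print(round(max_itd * 1e3, 2), round(ambiguity_hz), round(full_period_hz))
```

Under these assumptions the maximum ITD is about 0.66 ms, which puts the ambiguity limit near 750 Hz and the full-period limit near 1500 Hz.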
Hearing 30 SGN-14006 / A.K. 6.3 Localizing complex sounds
! Complex sounds refer to sounds that – involve a number of different frequency components and – vary over time
! Localizing sound sources is typically a result of combining all the above-described mechanisms 1. Interaural time difference (most important) 2. Interaural intensity difference 3. HRTFs
! Wideband noise: directional hearing works well
Hearing 31 SGN-14006 / A.K. 6.4 Lateralization in headphone listening
! When listening with headphones, the sounds are often localized inside the head, on the axis between the ears
– Sound does not seem to come from outside the head because the diffraction caused by pinnae and head is missing
– If the sounds are carefully processed with HRTFs, they move outside the head
Hearing 32 SGN-14006 / A.K. 6.5 Precedence effect
! Indoors, the sound that reaches the listener consists of the direct sound, early reflections, and late reverberation.
! The brain suppresses early reflected sounds to aid in source direction perception
– Early reflections are not heard as separate sounds.
– Instead, they increase the perceived loudness of the direct sound
! Precedence effect works when
1. Early reflections arrive less than ~35 ms after the direct sound.
2. The spectrum of a reflection is similar to that of the direct sound
3. Reflections are quieter or at least not significantly louder than the direct sound
• Even a reflection up to 10 dB louder than the direct sound may not be heard as a separate sound
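The ~35 ms limit can be translated into room geometry, because a reflection's delay is its extra path length divided by the speed of sound. A small sketch, assuming 343 m/s for the speed of sound and using made-up path lengths:

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed


def reflection_delay_ms(direct_path_m: float, reflected_path_m: float) -> float:
    """Delay of a reflection relative to the direct sound, in milliseconds."""
    return (reflected_path_m - direct_path_m) / SPEED_OF_SOUND * 1000.0


def within_precedence_window(delay_ms: float, limit_ms: float = 35.0) -> bool:
    """True if the reflection arrives early enough to fuse with the direct sound."""
    return 0.0 <= delay_ms < limit_ms


# Hypothetical example: 5 m direct path, 9 m path via a side wall.
delay = reflection_delay_ms(5.0, 9.0)
print(round(delay, 1), within_precedence_window(delay))

# Extra path length that corresponds to the 35 ms limit (about 12 m).
print(round(0.035 * SPEED_OF_SOUND, 1))
```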