hearing 1 hearing 2 hearing - tunisgn14006/pdf2015/l03-hearing.pdf · hearing 6 2.1 outer ear...

Hearing 1 SGN-14006 / A.K.

Contents: 1.  Introduction 2.  Ear physiology 3.  Loudness 4.  Masking 5.  Pitch 6.  Spatial hearing

Hearing Sources: Rossing. (1990). ”The science of sound”. Chapters 5–7.

Karjalainen. (1999). ”Kommunikaatioakustiikka”. Pulkki, Karjalainen (2015). ”Communication acoustics” Moore. (1997). ”An introduction to the psychology of hearing”.

Hearing 2 SGN-14006 / A.K. 1 Introduction

!  Auditory system can be divided in two parts –  Peripheral auditory system (outer, middle, and inner ear) –  Auditory nervous system (in the brain)

!  Ear physiology studies the peripheral system !  Psychoacoustics studies the entire sensation:

relationships between sound stimuli and the subjective sensation

Hearing 3 SGN-14006 / A.K. 1.1 Auditory system

!  Dynamic range of hearing is wide –  ratio of a very loud to a barely audible sound pressure level is

1:106 (120 dB)

!  Frequency range of hearing varies a lot between individuals –  only few can hear from 20 Hz to 20 kHz –  sensitivity to low sounds (< 100Hz) is not very good –  sensitivity to high sounds (> 12 kHz) decreases along with age

!  Selectivity of hearing –  listener can pick an instrument from among an orchestra –  listener can follow a speaker at a cocktail party –  One can sleep in background noise but still wake up to an

abnormal sound

Hearing 4 SGN-14006 / A.K. 1.2 Psychoacoustics

!  Perception involves information processing in the brain –  Information about the brain is limited

!  Psychoacoustics studies the relationships between sound stimuli and the resulting sensations –  Attempt to model the process of perception –  For example trying to predict the perceived loudness / pitch /

timbre from the acoustic properties of the sound signal

!  In a psychoacoustic listening test –  Test subject listens to sounds –  Questions are made or the subject is asked to describe her

sensasions

Hearing 5 SGN-14006 / A.K. 2 Ear physiology

!  The human ear consists of three main parts: (1) outer ear, (2) middle ear, (3) inner ear

Hearing 6 SGN-14006 / A.K. 2.1 Outer ear

!  Outer ear consists of: –  pinna – gathers sound; direction-dependent response –  auditory canal (ear canal) - conveys sound to middle ear, and acts

as a resonator at 3-4 kHz amplifying sound +10 dB

!  Outer ear is passive and linear, and its behavior can be completely described by laws of acoustic wave propagation.

Nerve signal to brain

[Chittka05]

Hearing 7 SGN-14006 / A.K. 2.2 Middle ear

!  Middle ear contains –  Eardrum that transforms sound waves into mechanic vibration –  Tiny audtory bones (ossicles): 1 Malleus (resting against the

eardrum, see figure), 2 Incus and 3 Staples

!  The ossicles transmit eardrum vibrations (in air) to the oval window of the inner ear (filled with fluid)

!  Acoustic reflex: –  when sound

pressure level exceeds 50-60 dB, eardrum tension increases and staples is removed from oval window

–  Protects the inner ear from damage

1 2

3

Hearing 8 SGN-14006 / A.K. 2.3 Inner ear, cochlea

!  The inner ear contains the cochlea: a fluid-filled organ where vibrations are converted into nerve impulses to the brain.

!  Cochlea = Greek: “snail shell”. !  Spiral tube: When stretched out, approximately 35 millimeters long. !  Vibrations on the cochlea’s oval window cause hydraulic pressure

waves inside the cochlea !  Inside the cochlea there is

the basilar membrane, !  On the basilar membrane

there is the organ of Corti with nerve cells that are sensitive to vibration

!  Nerve cells transform movement information into neural impulses in the auditory nerve

Hearing 9 SGN-14006 / A.K. 2.4 Basilar membrane

!  Figure: cochlea stretched out for illustration purposes –  Basilar membrane divides the fluid of the cochlea into separate

tunnels –  When hydraulic pressure waves travel along the cochlea, they

move the basilar membrane from oval window to round window.

base

apex

Hearing 10 SGN-14006 / A.K. Basilar membrane

Best freq (Hz)

Travelling waves:

!  Different frequencies produce highest amplitude at different sites !  Preliminary frequency analysis happens on the basilar membrane

apex

base


!  Distributed along the basilar membrane are sensory hair cells that transform membrane movement into neural impulses

!  When a hair cell bends, it generates neural impulses –  Impulse rate depends on vibration amplitude and frequency

!  Depending on the position on the basilar membrane, each nerve cell has a characteristic frequency to which it is most responsive to (right fig.: �tuning curves� of 6 different cells from different positions)

!  Left fig: Varying the amplitude of stimulus (dB) broadens the range of frequencies that the hair cell reacts to

2.5 Sensory hair cells Hearing 12

SGN-14006 / A.K. 3 Loudness

!  Loudness describes the subjective level of sound –  Perception of loudness is relatively complex, but –  consistent phenomenon and –  one of the central parts of psychoacoustics

!  The loudness of a sound can be compared to a standardized reference tone, for example 1000 Hz sinusoidal tone –  Loudness level (phon) is defined to be the sound pressure level

(dB) of a 1000 Hz sinusoidal, that has the the same subjective loudness as the target sound

–  For example if the heard sound is perceived as equally loud as 40 dB 1kHz sinusoidal, is the loudness level 40 phons

Hearing 13 SGN-14006 / A.K. 3.1 Equal-loudness curves

Soun

d pr

essu

re le

vel (

dB)

Frequency (Hz)

Loudness level (phons)


!  Comparing to two band-limited noises with same center frequencies and increasing the other’s bandwidth, the perceived loudness increases only after a critical bandwidth –  Figure: 1 kHz @ 60 dB, Critical bandwidth is 160 Hz at 1 kHz

!  Ear analyzes sound at critical band resolution. Each critical band contributes to the overall loudness level

!  Each center frequency fc has a critical bandwidth Δf (Bark bandwidths)

!  Equivalent rectangular bandwidth (ERB) is a different type of procedure to measure a rectangular bandwidth of auditory filters. !  Both bandwidths can be used to define a frequency Bark and ERB scales

–  Stack bandwidths on top of each other

3.2 Critical bands

Hearing 15 SGN-14006 / A.K. 3.3 Perceptually-motivated frequency scales

mm. on basilar membrane frequency / kHz frequency / mel frequency / Bark


!  A broadband sound is perceived louder than a narrowband sound of equal SPL, as shown by the critical bandwidth experiment.

!  Loudness of a complex sound is calculated by using so-called specific loudness of a critical band as intermediate unit

!  Specific loudness is roughly proportional to the log-power of the signal at the band (weighted according to sensitivity of hearing and spread slightly by convolving over bands)

!  Overall loudness is obtained by summing up specific loudness values over critical band

3.4 Loudness of a complex sound

Hearing 17 SGN-14006 / A.K. 3.5 Measuring sound level

!  SPL does not work well as a perceptual metric of sound loudness.

!  Different frequency weighting curves (A,B,C,D) can be applied to measure sound level, which is more related to frequency sensitivity of hearing

!  A-weighting is most common, referred as dB(A), it roughly resembles the inverse of the hearing threshold curve.

wikipedia

Hearing 18 SGN-14006 / A.K. 4 Masking

!  Masking describes the situation where a weaker but clearly audible signal (maskee, test tone) becomes inaudible in the presence of a louder signal (masker)

!  Masking depends on both the spectral structure of the sounds and their variation over time

Hearing 19 SGN-14006 / A.K. 4.1 Masking in frequency domain

!  Model of the frequency analysis in the auditory system –  subdivision of the frequency axis into critical bands –  frequency components within a same critical band mask each

other easily –  Bark & ERB scales: frequency scales that are derived by

mapping frequencies to critical band numbers !  Narrowband noise masks a tone (sinusoidal) easier than

a tone masks noise !  Masked threshold refers to the raised threshold of

audibility caused by the masker –  sounds with a level below the masked threshold are inaudible –  masked threshold in quiet = threshold of hearing in quiet

Hearing 20 SGN-14006 / A.K. Masking in frequency domain

!  Figure: masked thresholds [Herre95] –  masker: narrowband noise around 250 Hz, 1 kHz, 4 kHz –  spreading function: the effect of masking extends to the spectral

vicinity of the masker (spreads more towards high freqencies)

!  Additivity of masking: joint masked thresh is approximately (but slightly more than) sum of the components

Hearing 21 SGN-14006 / A.K. 4.2 Masking in time domain

!  Forward masking –  masking effect extends to times after the masker is switched off

!  Backwards masking –  masking extends to times before the masker is been switched on

!  Forward/backward masking does not extend far in time " simultaneous masking is more important phenomenon

forward masking

backward masking

Hearing 22 SGN-14006 / A.K. 5 Pitch

!  Pitch –  Subjective attribute of sounds that enables us to arrange them on

a frequency-related scale ranging from low to high –  Sound has a certain pitch if human listaners can consistently

match the frequency of a sinusoidal tone to the pitch of the sound

!  Fundamental frequency vs. pitch –  Fundamental frequency is a physical attribute –  Pitch is a perceptual attribute –  Both are measured in Hertz (Hz) –  In practise, perceived pitch ≈ fundamental frequency

!  "Perfect pitch" or "absolute pitch" - ability to recognize the pitch of a musical note without any reference –  Minority of the population can do that

Hearing 23 SGN-14006 / A.K. 5.1 Harmonic sound

!  For a sinusoidal tone –  Fundamental frequency = sinusoidal frequency –  Pitch ≈ sinusoidal frequency

!  Harmonic sound

Trumpet sound: * Fundamental frequency F = 262 Hz * Wavelength 1/F = 3.8 ms

Hearing 24 SGN-14006 / A.K. 5.2 Pitch perception

!  Pitch perception has been tried to explain using two competing theories –  Place theory: “Peak activity along the basilar membrane

determines pitch” (fails to explain missing fundamental) –  Periodicity theory: “Pitch depends on rate, not place, of response.”

Neurons fire in sync with signals !  The real mechanism

is a combination of the above –  Sound is subdivided into

subbands (critical bands) –  Periodicity of the

amplitude envelope (see lowest panel) is analyzed within bands

–  Results are combined across bands

Hearing 25 SGN-14006 / A.K. 6 Spatial hearing

!  The most important auditory cues for localizing a sound sources in space are 1.  Interaural time difference (ITD)

•  Is relatively constant with frequency 2.  Interaural intensity difference (ILD)

•  Increases with frequency 3.  Direction-dependent filtering of the sound spectrum by head and

pinnae

!  Terms –  Monaural : with one ear –  Binaural : with two ears –  Interaural : between the ears (interaural time difference etc) –  Lateralization : localizing a source in horizontal plane

Hearing 26 SGN-14006 / A.K. 6.1 Monaural source localization

!  Diretional hearing works to some extent even with one ear !  Head and pinna form a direction-dependent filter

–  Direction-dependent changes in the spectrum of the sound arriving in the ear can be described with HRTFs

–  HRTF = head-related transfer function !  HRTFs are crucial

for localizing sources in the median plane (vertical localization)

Hearing 27 SGN-14006 / A.K. Monaural source localization

!  HRTFs can be measured by recording –  Sound emitted by a source –  Sounds arriving to the auditory canal or eardrum (transfer function

of the auditory canal does not vary along with direction) !  In practice

–  left: microphone in the ear of a test subject, OR

–  right: head and torso simulator

Hearing 28 SGN-14006 / A.K. 6.2 Localizing a sinusoidal

!  Experimenting with sinusoidal tones helps to understand the localization of more complex sounds

!  Angle-of-arrival perception for sinusoids below 750 Hz is based mainly on interaural time difference

Hearing 29 SGN-14006 / A.K. Localizing a sinusoidal

!  Interaural time difference is useful only up to 750 Hz –  Above that, the time difference is ambiguous, since there are

several wavelengths within the time difference –  Moving the head (or source movement) helps: can be done up to

1500 Hz !  At higher frequencies

(> 750 Hz) the auditory system utilizes interaural intensity difference –  Head causes and acoustic

”shadow” (sound level is lower behind the head)

–  Works especially at high frequencies with

shorter wavelenghts compared to head dimensions

Hearing 30 SGN-14006 / A.K. 6.3 Localizing complex sounds

!  Complex sounds refer to sounds that –  involve a number of different frequency components and –  vary over time

!  Localizing sound sources is typically a result of combining all the above-described mechanisms 1.  Interaural time difference (most important) 2.  Interaural intensity difference 3.  HRTFs

!  Wideband noise: directional hearing works well

Hearing 31 SGN-14006 / A.K. 6.4 Lateralization in headphone listening

!  When listening with headphones, the sounds are often localized inside the head, on the axis between the ears –  Sound does not seem to come from outside the head because the

diffraction caused by pinnae and head is missing –  If the sounds are processed with HRTFs carefully, they move outside the

head

Hearing 32 SGN-14006 / A.K. 6.5 Precedence effect

!  The sound in indoors that reaches the listener contains the direct path, early reflections, and late reverberation.

!  The brain suppresses early reflected sounds to aid in source direction perception –  Early reflections are not heard as separate sounds. –  Instead, they amplify the loudness of direct sound

!  Precedence effect works when 1.  Early reflections arrive less than ~ 35ms after direct sound. 2.  The spectrum of a reflection is similar to that of the direct sound 3.  Reflections are lower or at least not signigicantly louder than direct

sound •  Even a sound with +10 dB SPL than direct sound can not be heard

hearing 1 hearing 2 hearing - tunisgn14006/pdf2015/l03-hearing.pdf · hearing 6 2.1 outer ear...

Documents