obstruent acoustics
![Page 1: Obstruent Acoustics](https://reader035.vdocuments.mx/reader035/viewer/2022062520/5681633e550346895dd3ce48/html5/thumbnails/1.jpg)
Obstruent Acoustics
Bonus Learning Fun!
![Page 2: Obstruent Acoustics](https://reader035.vdocuments.mx/reader035/viewer/2022062520/5681633e550346895dd3ce48/html5/thumbnails/2.jpg)
Motor Theory, in a nutshell
• The big idea:
• We perceive speech as abstract “gestures”, not sounds.
• Evidence:
1. The perceptual interpretation of speech differs radically from the acoustic organization of speech sounds
2. Speech perception is multi-modal
3. Direct (visual, tactile) information about gestures can influence/override indirect (acoustic) speech cues
4. Limited top-down access to the primary, acoustic elements of speech
![Page 3: Obstruent Acoustics](https://reader035.vdocuments.mx/reader035/viewer/2022062520/5681633e550346895dd3ce48/html5/thumbnails/3.jpg)
Moving On…
• One important lesson to take from the motor theory perspective is:
• The dynamics of speech are generally more important to perception than static acoustic cues.
• Note: visual chimerism and March Madness.
![Page 4: Obstruent Acoustics](https://reader035.vdocuments.mx/reader035/viewer/2022062520/5681633e550346895dd3ce48/html5/thumbnails/4.jpg)
Auditory Chimeras
• Speech waveform + music spectrum:
[demos with 1, 2, 4, 8, 16, and 32 frequency bands]
• Music waveform + speech spectrum:
[demos with 1, 2, 4, 8, 16, and 32 frequency bands]
Source: http://research.meei.harvard.edu/chimera/chimera_demos.html
Originals:
![Page 5: Obstruent Acoustics](https://reader035.vdocuments.mx/reader035/viewer/2022062520/5681633e550346895dd3ce48/html5/thumbnails/5.jpg)
Auditory Chimeras
• Speech1 waveform + speech2 spectrum:
[demos with 1, 2, 4, 6, 8, and 16 frequency bands]
• Speech2 waveform + speech1 spectrum:
[demos with 1, 2, 4, 6, 8, and 16 frequency bands]
Originals:
![Page 6: Obstruent Acoustics](https://reader035.vdocuments.mx/reader035/viewer/2022062520/5681633e550346895dd3ce48/html5/thumbnails/6.jpg)
Closure Voicing
• The low frequency information that passes through the stop “filter” appears as a “voicing bar” in a spectrogram.
• This acoustic information provides hardly any cues for place of articulation.
Armenian:
[bag]
![Page 7: Obstruent Acoustics](https://reader035.vdocuments.mx/reader035/viewer/2022062520/5681633e550346895dd3ce48/html5/thumbnails/7.jpg)
Stop Transition Cues (again)
• With the transition between stop closure and vowel, the perceptual task becomes much easier:
• Try the same with Peter’s productions:
• stop closures:
• with transitions:
• The moral of the story (again):
• Dynamic changes provide stronger perceptual cues to place than static acoustic information.
![Page 8: Obstruent Acoustics](https://reader035.vdocuments.mx/reader035/viewer/2022062520/5681633e550346895dd3ce48/html5/thumbnails/8.jpg)
Release Bursts
• Note: along with transitions, stops have another cue for place at their disposal: release bursts
• (nasals do not have these)
• Here’s a waveform of a [p] release burst:
duration 5 msec
• What do you think the [p] burst spectrum will look like?
![Page 9: Obstruent Acoustics](https://reader035.vdocuments.mx/reader035/viewer/2022062520/5681633e550346895dd3ce48/html5/thumbnails/9.jpg)
Burst Spectrum
• [p] bursts tend to have very diffuse spectra, with energy spread across a wide range of frequencies.
• Also: [p] bursts are very weak in intensity.
• The extremely short duration of bursts requires heavy damping in the waveform, and heavy damping means a broader frequency range.
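The link between damping and bandwidth can be checked numerically. The sketch below is only an illustration: the 1000 Hz frequency, the two decay rates, the 8000 Hz sampling rate, and all function names are hypothetical choices, not values from the slides. It counts how many DFT bins lie within 3 dB of the spectral peak for a lightly and a heavily damped sinusoid.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Magnitudes of DFT bins 0 .. N/2, computed with a direct (slow) DFT."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
                for i, x in enumerate(samples)))
        for k in range(n // 2 + 1)
    ]

def damped_sinusoid(freq_hz, decay_per_s, rate_hz=8000, n=256):
    """Sinusoid whose amplitude decays as exp(-decay_per_s * t)."""
    return [
        math.exp(-decay_per_s * i / rate_hz)
        * math.sin(2 * math.pi * freq_hz * i / rate_hz)
        for i in range(n)
    ]

def half_power_bins(samples):
    """Number of DFT bins whose magnitude is within 3 dB of the peak."""
    mags = dft_magnitudes(samples)
    peak = max(mags)
    return sum(1 for m in mags if m >= peak / math.sqrt(2))

# Hypothetical decay rates: 100/s (rings on) vs. 1000/s (dies out quickly).
lightly_damped = half_power_bins(damped_sinusoid(1000, 100))
heavily_damped = half_power_bins(damped_sinusoid(1000, 1000))
print("lightly damped:", lightly_damped, "bins within 3 dB of the peak")
print("heavily damped:", heavily_damped, "bins within 3 dB of the peak")
```

The heavily damped signal spreads its energy over more bins, which is the sense in which a very short burst has a broader spectrum.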
![Page 10: Obstruent Acoustics](https://reader035.vdocuments.mx/reader035/viewer/2022062520/5681633e550346895dd3ce48/html5/thumbnails/10.jpg)
Release Bursts
• In a spectrogram:
• bilabial release bursts have a very diffuse spectrum, weakly spread across all frequencies.
[p] burst
• [p] bursts are relatively close to pure transient sounds.
![Page 11: Obstruent Acoustics](https://reader035.vdocuments.mx/reader035/viewer/2022062520/5681633e550346895dd3ce48/html5/thumbnails/11.jpg)
Transients
• A transient is:
• “a sudden pressure fluctuation that is not sustained or repeated over time.”
• An ideal transient waveform:
![Page 12: Obstruent Acoustics](https://reader035.vdocuments.mx/reader035/viewer/2022062520/5681633e550346895dd3ce48/html5/thumbnails/12.jpg)
A Transient Spectrum
• An ideal transient spectrum is perfectly flat:
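A quick numerical check of the flat spectrum, as a minimal sketch: the ideal transient is modeled here as a single nonzero sample in a 16-sample frame (the frame length and the position of the impulse are arbitrary assumptions), and its DFT magnitudes are computed directly.

```python
import cmath

def dft_magnitudes(samples):
    """Return |X[k]| for k = 0 .. N-1 using a direct (slow) DFT."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
                for i, x in enumerate(samples)))
        for k in range(n)
    ]

# Hypothetical ideal transient: one nonzero sample, silence everywhere else.
impulse = [0.0] * 16
impulse[3] = 1.0

magnitudes = dft_magnitudes(impulse)
print([round(m, 6) for m in magnitudes])
```

Every frequency bin comes out with the same magnitude; moving the impulse within the frame changes only the phases, not the magnitudes.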
![Page 13: Obstruent Acoustics](https://reader035.vdocuments.mx/reader035/viewer/2022062520/5681633e550346895dd3ce48/html5/thumbnails/13.jpg)
Burst Filtering
• The spectra of more posterior release bursts may be filtered by the cavity in front of the burst.
• Ex: [t] bursts tend to lack energy at the lowest end of the frequency scale.
• And higher frequency components are somewhat more intense.
[t] burst
![Page 14: Obstruent Acoustics](https://reader035.vdocuments.mx/reader035/viewer/2022062520/5681633e550346895dd3ce48/html5/thumbnails/14.jpg)
Release Bursts: [k]
• Velar release bursts are relatively intense.
• They also often have a strong concentration of energy in the 1500-2000 Hz range (F2/F3).
• There can often be multiple [k] release bursts.
[k] burst
![Page 15: Obstruent Acoustics](https://reader035.vdocuments.mx/reader035/viewer/2022062520/5681633e550346895dd3ce48/html5/thumbnails/15.jpg)
Another Look
• [k] bursts tend to be intense right where F2 and F3 meet in the velar pinch:
Armenian:
[bag]
![Page 16: Obstruent Acoustics](https://reader035.vdocuments.mx/reader035/viewer/2022062520/5681633e550346895dd3ce48/html5/thumbnails/16.jpg)
Finally, Fricatives
• The last type of sound we need to consider in speech is an aperiodic, continuous noise.
• (Transients are aperiodic but not continuous.)
• Ideally:
• Q: What would the spectrum of this waveform look like?
![Page 17: Obstruent Acoustics](https://reader035.vdocuments.mx/reader035/viewer/2022062520/5681633e550346895dd3ce48/html5/thumbnails/17.jpg)
White Noise Spectrum
• Technical term: White noise
• has an unlimited range of frequency components
• Analogy: white light is what you get when you combine all visible frequencies of the electromagnetic spectrum
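To see the "all frequencies" idea numerically, here is a minimal sketch that averages the power spectrum of many short frames of random noise. Modeling the noise as independent Gaussian samples, and the frame length, frame count, and seed, are all assumptions made for illustration.

```python
import cmath
import random

def power_spectrum(samples):
    """Power in DFT bins 1 .. N/2 - 1 (DC and Nyquist bins are skipped)."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
                for i, x in enumerate(samples))) ** 2 / n
        for k in range(1, n // 2)
    ]

def average_noise_spectrum(n_frames=200, frame_len=64, seed=0):
    """Average the power spectra of n_frames frames of Gaussian noise."""
    rng = random.Random(seed)
    total = [0.0] * (frame_len // 2 - 1)
    for _ in range(n_frames):
        frame = [rng.gauss(0.0, 1.0) for _ in range(frame_len)]
        for k, power in enumerate(power_spectrum(frame)):
            total[k] += power
    return [t / n_frames for t in total]

avg = average_noise_spectrum()
middle = len(avg) // 2
low_band = sum(avg[:middle]) / middle                   # mean power, lower bins
high_band = sum(avg[middle:]) / (len(avg) - middle)     # mean power, upper bins
print("low-band / high-band power ratio:", round(low_band / high_band, 2))
```

A single frame looks ragged, but the average is roughly level from the lowest bins to the highest, so the ratio stays close to 1.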
![Page 18: Obstruent Acoustics](https://reader035.vdocuments.mx/reader035/viewer/2022062520/5681633e550346895dd3ce48/html5/thumbnails/18.jpg)
Turbulence
• We can create aperiodic noise in speech by taking advantage of the phenomenon of turbulence.
• Some handy technical terms:
• laminar flow: a fluid flowing in parallel layers, with no disruption between the layers.
• turbulent flow: a fluid flowing with chaotic property changes, including rapid variation in pressure and velocity in both space and time
• Whether or not airflow is turbulent depends on:
• the volume velocity of the fluid
• the area of the channel through which it flows
![Page 19: Obstruent Acoustics](https://reader035.vdocuments.mx/reader035/viewer/2022062520/5681633e550346895dd3ce48/html5/thumbnails/19.jpg)
Turbulence
• Turbulence is more likely with:
• a higher volume velocity
• less channel area
• All fricatives therefore require:
• a narrow constriction
• high airflow
![Page 20: Obstruent Acoustics](https://reader035.vdocuments.mx/reader035/viewer/2022062520/5681633e550346895dd3ce48/html5/thumbnails/20.jpg)
Fricative Specs
• Fricatives require great articulatory precision.
• Some data for [s] (Subtelny et al., 1972):
• alveolar constriction 1 mm
• incisor constriction 2-3 mm
• Larger constrictions result in -like sounds.
• Generally, fricatives have a cross-sectional area between 6 and 12 mm².
• Cross-sectional areas greater than 20 mm² result in laminar flow.
• Airflow = 330 cm³/sec for voiceless fricatives
• …and 240 cm³/sec for voiced fricatives
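The interaction of airflow and constriction area can be made concrete with a little arithmetic. The sketch below is an illustration only (the function name and the choice of areas to tabulate are assumptions): it divides volume velocity by cross-sectional area to get the mean speed of the air through the constriction.

```python
def mean_flow_velocity_m_s(volume_velocity_cm3_s, area_mm2):
    """Mean air speed through a constriction: volume velocity / area."""
    volume_velocity_m3_s = volume_velocity_cm3_s * 1e-6  # cm^3/s -> m^3/s
    area_m2 = area_mm2 * 1e-6                            # mm^2  -> m^2
    return volume_velocity_m3_s / area_m2

# Airflow values from the slide; the areas span the ranges mentioned there.
for flow in (330.0, 240.0):
    for area in (6.0, 12.0, 20.0):
        speed = mean_flow_velocity_m_s(flow, area)
        print(f"{flow:.0f} cm^3/s through {area:.0f} mm^2: {speed:.1f} m/s")
```

The same airflow moves faster through a smaller opening, which is why a narrow constriction plus high airflow favors turbulence.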
![Page 21: Obstruent Acoustics](https://reader035.vdocuments.mx/reader035/viewer/2022062520/5681633e550346895dd3ce48/html5/thumbnails/21.jpg)
Turbulence Sources
• For fricatives, turbulence is generated by forcing a stream of air at high velocity through either a narrow channel in the vocal tract or against an obstacle in the vocal tract.
• Channel turbulence
• produced when airflow escapes from a narrow channel and hits inert outside air
• Obstacle turbulence
• produced when airflow hits an obstacle in its path
![Page 22: Obstruent Acoustics](https://reader035.vdocuments.mx/reader035/viewer/2022062520/5681633e550346895dd3ce48/html5/thumbnails/22.jpg)
Channel vs. Obstacle
• Almost all fricatives involve an obstacle of some sort.
• General rule of thumb: obstacle turbulence is much noisier than channel turbulence
• [f] vs. [ɸ]
• Also: obstacle turbulence is louder, the more perpendicular the obstacle is to the airflow
• [s] vs. [x]
• [x] is a “wall fricative”
![Page 23: Obstruent Acoustics](https://reader035.vdocuments.mx/reader035/viewer/2022062520/5681633e550346895dd3ce48/html5/thumbnails/23.jpg)
Sibilants
• Alveolar, dental and post-alveolar fricatives form a special class (the sibilants) because their obstacle is the back of the upper teeth.
• This yields high intensity turbulence at high frequencies.
![Page 24: Obstruent Acoustics](https://reader035.vdocuments.mx/reader035/viewer/2022062520/5681633e550346895dd3ce48/html5/thumbnails/24.jpg)
[ʃ] vs. [θ]
“shy” “thigh”
![Page 25: Obstruent Acoustics](https://reader035.vdocuments.mx/reader035/viewer/2022062520/5681633e550346895dd3ce48/html5/thumbnails/25.jpg)
Fricative Noise
• Fricative noise has some inherent spectral shaping
• …like “spectral tilt”
• Note: this is a source characteristic
• This resembles what is known as pink noise:
• Compare with white noise:
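The difference in tilt can be put in numbers. This is a minimal sketch under the textbook idealization that pink noise power falls off as 1/f while white noise power is constant; the function name and the 1000 to 2000 Hz octave are illustrative assumptions.

```python
import math

def level_change_db(f_low_hz, f_high_hz, exponent):
    """Level at f_high relative to f_low, in dB, for power ~ 1 / f**exponent."""
    power_ratio = (f_low_hz / f_high_hz) ** exponent
    return 10.0 * math.log10(power_ratio)

white_tilt = level_change_db(1000.0, 2000.0, 0)  # white noise: power ~ 1
pink_tilt = level_change_db(1000.0, 2000.0, 1)   # pink noise:  power ~ 1/f
print(f"white noise: {white_tilt:.2f} dB per octave")
print(f"pink noise:  {pink_tilt:.2f} dB per octave")
```

Under this idealization pink noise loses about 3 dB for every doubling of frequency, while white noise stays level.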
![Page 26: Obstruent Acoustics](https://reader035.vdocuments.mx/reader035/viewer/2022062520/5681633e550346895dd3ce48/html5/thumbnails/26.jpg)
Fricative Shaping
• The turbulence spectrum may be filtered by the resonating tube in front of the fricative.
• (Due to narrowness of constriction, back cavity resonances don’t really show up.)
• As usual, resonance is determined by length of the tube in front of the constriction.
• The longer the tube, the lower the “cut-off” frequency.
• A basic example:
• [s] vs. [ʃ]
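How tube length sets the resonance can be sketched with the standard quarter-wavelength formula for a tube closed at one end and open at the other, F = c / 4L. Treating the front cavity as such a tube, the speed of sound value, and the two cavity lengths are simplifying assumptions for illustration, not measurements from the slides.

```python
SPEED_OF_SOUND_CM_S = 35000.0  # approximate value for warm, moist air

def front_cavity_resonance_hz(length_cm):
    """Lowest resonance of a tube closed at one end: F = c / (4 * L)."""
    return SPEED_OF_SOUND_CM_S / (4.0 * length_cm)

# Hypothetical front-cavity lengths, chosen only to show the trend.
for label, length_cm in (("shorter front cavity", 1.0),
                         ("longer front cavity", 2.0)):
    resonance = front_cavity_resonance_hz(length_cm)
    print(f"{label} ({length_cm} cm): {resonance:.0f} Hz")
```

Doubling the length of the front cavity halves its lowest resonance, so frication made further back in the mouth has its energy concentrated lower in the spectrum.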
![Page 27: Obstruent Acoustics](https://reader035.vdocuments.mx/reader035/viewer/2022062520/5681633e550346895dd3ce48/html5/thumbnails/27.jpg)
[s] vs. [ʃ]
“sigh” “shy”
[s]
![Page 28: Obstruent Acoustics](https://reader035.vdocuments.mx/reader035/viewer/2022062520/5681633e550346895dd3ce48/html5/thumbnails/28.jpg)
Sampling Rates Revisited
• Remember: Digital representations of speech can only capture frequency components up to half the sampling rate
• the Nyquist frequency
• Speech should be sampled at 44100 Hz or higher
(although there is little frequency information in speech above 10,000 Hz)
• [s] has higher acoustic energy from about 3500 - 10000 Hz
• Note: telephones sample at 8000 Hz
• Compare: 44100 Hz vs. 8000 Hz
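The effect of the sampling rate on [s] can be worked out directly. In this sketch the function names are hypothetical, and the 3500 to 10000 Hz band is the approximate [s] energy range given above; it reports how much of that band lies below the Nyquist frequency at each rate.

```python
def nyquist_frequency(sampling_rate_hz):
    """Highest frequency a digital signal can represent: half the sampling rate."""
    return sampling_rate_hz / 2.0

def captured_fraction(band_low_hz, band_high_hz, sampling_rate_hz):
    """Fraction of a frequency band that lies below the Nyquist frequency."""
    limit = nyquist_frequency(sampling_rate_hz)
    captured = max(0.0, min(band_high_hz, limit) - band_low_hz)
    return captured / (band_high_hz - band_low_hz)

S_BAND_HZ = (3500.0, 10000.0)  # approximate [s] energy range from the slide

for rate in (8000, 44100):
    fraction = captured_fraction(S_BAND_HZ[0], S_BAND_HZ[1], rate)
    print(f"{rate} Hz sampling: Nyquist = {nyquist_frequency(rate):.0f} Hz, "
          f"{fraction:.0%} of the [s] band captured")
```

At the telephone rate only the bottom sliver of the [s] band survives, which is one reason fricatives are hard to tell apart over the phone.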
![Page 29: Obstruent Acoustics](https://reader035.vdocuments.mx/reader035/viewer/2022062520/5681633e550346895dd3ce48/html5/thumbnails/29.jpg)
Further Back
[xoma]
palatal vs. velar
• In more posterior fricatives, turbulence noise is generally shaped like a vowel made at the same place of articulation.
![Page 30: Obstruent Acoustics](https://reader035.vdocuments.mx/reader035/viewer/2022062520/5681633e550346895dd3ce48/html5/thumbnails/30.jpg)
Even Further Back
• Examples from Hebrew:
![Page 31: Obstruent Acoustics](https://reader035.vdocuments.mx/reader035/viewer/2022062520/5681633e550346895dd3ce48/html5/thumbnails/31.jpg)
At the Tail End
• [h] exhibits a lot of coarticulation
• [h] is not really a “fricative”;
• it’s more like a whispered or breathy voiced vowel.
“heed” “had”
![Page 32: Obstruent Acoustics](https://reader035.vdocuments.mx/reader035/viewer/2022062520/5681633e550346895dd3ce48/html5/thumbnails/32.jpg)
Aspirated Fricatives
• Like stops, fricatives can be aspirated.
• [h] follows the supraglottal frication in the vocal tract.
• Examples from Chinese:
[tsa] [tsʰa]
![Page 33: Obstruent Acoustics](https://reader035.vdocuments.mx/reader035/viewer/2022062520/5681633e550346895dd3ce48/html5/thumbnails/33.jpg)
Back at the Ranch
• There is not much of a resonating filter in front of labial fricatives…
• so their spectrum is flat and diffuse
• (like bilabial stop release bursts)
• Note: labio-dentals are more intense than bilabial fricatives
• (channel vs. obstacle turbulence)
![Page 34: Obstruent Acoustics](https://reader035.vdocuments.mx/reader035/viewer/2022062520/5681633e550346895dd3ce48/html5/thumbnails/34.jpg)
Fricative Internal Cues
• The articulatory precision required by fricatives means that they are less affected by context than stops.
• It’s easy for listeners to distinguish between the various fricative places on the basis of the frication noise alone.
• Result of both filter and source differences.
• Examples:
• There is, however, one exception to the rule…
![Page 35: Obstruent Acoustics](https://reader035.vdocuments.mx/reader035/viewer/2022062520/5681633e550346895dd3ce48/html5/thumbnails/35.jpg)
Huh?
• The two most confusable consonants in the English language are [f] and [θ].
• (Interdentals also lack a resonating filter)