

1

The University of South Florida audiovisual phoneme database

v 1.0

Frisch, S.A., Stearns, A.M.,

Hardin, S.A., & Nikjeh, D.A.

University of South Florida

[email protected]

This work supported by NIH-NIDCD R03 06164

2

Phoneme Database Project

• Recorded wordlist demonstrating all English phonemes in initial, medial, and final word position (if possible)

• Audiovisual recordings
– Acoustics
– Face video
– Ultrasound of tongue
– Flexible endoscopy of pharynx, larynx

3

Gratuitous Equipment Picture

4

Purpose

• Potential for multimedia tools in teaching phonetics/speech science

• Students have ready access to multimedia computers

• Freeware for acoustic analysis is available

• Need for multimedia resources appropriate for students’ needs

5

Methods – Recording Parameters

• Ultrasound
– Midsagittal image of tongue posture
– Probe in direct contact with jaw (no compressible, acoustically transparent standoff)
– No head stabilization

• Digital video camera
– Aimed at an angle to the front of the face
– Shows lip and jaw movement

6

Methods – Recording Parameters

• Flexible endoscopy
– Shows laryngeal setting (but cannot see glottal cycle)
– Also shows pharyngeal articulation

• Audio recording captured as part of all video recordings, used to synchronize videos with one another
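
The audio-based synchronization described above can be illustrated with cross-correlation. This is a minimal sketch, not the procedure actually used for the database: it assumes both audio tracks share a sample rate and capture the same utterance, and the function name `estimate_offset_seconds` is hypothetical.

```python
import numpy as np

def estimate_offset_seconds(ref, other, sample_rate):
    """Estimate how many seconds `other` lags behind `ref`.

    A negative result means `other` starts earlier than `ref`.
    Assumption: both tracks were recorded at the same sample rate.
    """
    ref = np.asarray(ref, dtype=float)
    other = np.asarray(other, dtype=float)
    # Remove any DC offset so the correlation peak reflects the waveform shape.
    ref = ref - ref.mean()
    other = other - other.mean()
    # Full cross-correlation; output index i corresponds to lag i - (len(ref) - 1).
    corr = np.correlate(other, ref, mode="full")
    lag_samples = int(np.argmax(corr)) - (len(ref) - 1)
    return lag_samples / sample_rate
```

Once the offset is known, one video can be trimmed or padded by that amount so that both start at the same instant.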

7

Word List

• Each English phoneme in word initial, word medial, and word final position where allowed by English phonotactics

• Common words used wherever possible

• Some additional gaps in database due to recording difficulties

• See handout for complete list
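
One way to picture the word list is as a table of (phoneme, position, word) entries that can be checked for gaps. The sketch below is illustrative only: the three entries use the two example words from this presentation with plain-letter phoneme labels, and the names `POSITIONS`, `coverage_gaps`, and `entries` are hypothetical, not part of the database.

```python
POSITIONS = ("initial", "medial", "final")

def coverage_gaps(entries, phonemes):
    """Return (phoneme, position) pairs that have no word in `entries`."""
    covered = {(phoneme, position) for phoneme, position, _word in entries}
    return [
        (phoneme, position)
        for phoneme in phonemes
        for position in POSITIONS
        if (phoneme, position) not in covered
    ]

# Hypothetical entries, not the actual handout contents.
entries = [
    ("v", "initial", "voice"),
    ("s", "final", "voice"),
    ("k", "medial", "okay"),
]

print(coverage_gaps(entries, ["v", "s", "k"]))
```

A reported gap is not necessarily an omission: some combinations are ruled out by English phonotactics, and others reflect the recording difficulties noted above.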

8

Procedure

• Each word was read clearly in isolation

• Considerable pause between each word, with articulators moved back to “neutral” position

9

Post-Processing

• Video recordings were superimposed to create a single video file showing facial video, endoscopy of larynx, and ultrasound of tongue position

• Noise reduction applied to audio to eliminate machine noise from recording environment

10

Using the Database

• Recordings can be viewed with the freeware WaveSurfer program

• Allows display of common acoustic phonetic analysis windows in conjunction with video image

• Cursor position in acoustic analysis window is tied to the appropriate frame in the video image

• Download from http://www.speech.kth.se/wavesurfer/
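
The link between cursor position and video frame can be thought of as a time-to-index conversion. The sketch below assumes a constant frame rate; the function name `frame_for_time` and the default of 29.97 frames per second are illustrative assumptions, not taken from WaveSurfer or from the database files.

```python
def frame_for_time(t_seconds, fps=29.97, n_frames=None):
    """Map a cursor time in seconds to a zero-based video frame index.

    Assumptions: constant frame rate, and the video starts at time zero.
    """
    if t_seconds < 0:
        raise ValueError("cursor time must be non-negative")
    index = int(t_seconds * fps)  # truncate to the frame being displayed
    if n_frames is not None:
        index = min(index, n_frames - 1)  # clamp to the last frame
    return index
```

For example, with a 30 fps video, a cursor at 0.5 s falls on frame index 15.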

11

Example 1 – okay

[Figure: video frame with labels for tongue blade, front, tongue root, arytenoids, and vocal folds]

12

Example 1 – okay

• Ultrasound shows tongue body raised and tongue root pulled forward to produce high front vowel /e/

• Endoscope window shows the arytenoid cartilages are approximated and the glottis is closed for voicing

• Video clip shows lips pulled apart for unrounded vowel production

• Etc…

13

Example 2 – voice

[Figure: F2 marked for two segments, // and //]

14

Example 2 – voice

• Sample image of diphthong off-glide //

• Cursor positioned on spectrogram at end of diphthong

• Ultrasound shows visible tongue tip and body raising, and advancement of the tongue root

• Face video shows lip spreading and jaw raising

• Endoscope shows approximation of the arytenoids and vocal folds

15

Conclusion

• Ultrasound and other multimedia tools have great potential to enhance teaching and learning in phonetics/speech science

• Copies of the database, version 1.0, are available in a compressed file archive on CD-ROM

• Additional suggestions for improvements or additions to the database are welcome

16

Just for fun

• “Tongue,” the music video, is included as part of the database