BILC SEMINAR 2009
Speech Recognition: Is It for Real?
Tony Mirabito
Defense Language Institute
English Language Center
(DLIELC)
DLIELC
OVERVIEW
● The technology evolution in language teaching
● What speech recognition is
● How speech recognition works
● Shortcomings of speech recognition
● Conclusions
THE TECHNOLOGY EVOLUTION
The Classroom
WHAT SPEECH RECOGNITION IS
● A formal definition:
-- A system of spoken input into a computer in which software can “recognize” the input and transform it into digitized signals; that is, “react” in various ways to the spoken input
● Examples:
-- Speech to text
-- “Telephony”: airlines, transportation, etc.
-- Commercial software for learning a language (e.g. Rosetta Stone)
WHAT SPEECH RECOGNITION IS
● A computer program that takes verbal input and “matches” it against models—acoustic and language models
● A computer program that allows speech to be evaluated as “correct” or “acceptable” or “incorrect” or “unacceptable”
● A computer program that “talks to” authoring software and allows that software to branch in different directions based on the evaluation
WHAT SPEECH RECOGNITION IS
● “Speaker independent” vs. “speaker dependent”
-- Speaker independent: a speech recognition program that recognizes all speakers (used in language learning)
-- Speaker dependent: a speech recognition program that is “trained” to recognize a particular speaker (speech to text)
WHAT SPEECH RECOGNITION IS
● “Discrete speech input” vs. “continuous speech input”
-- Discrete speech input: requires a user to pause between words (e.g. “I + want + to + leave.”)
-- Continuous speech input: blending of sounds between words is allowed (e.g. “next + week” becomes “neksweek”)
● Speech recognition cannot “understand” free speech; it “matches” speech input with stored data and pre-determined parameters
HOW SPEECH RECOGNITION WORKS
● Speech recognition technology is based on Markov chain theory (in practice, hidden Markov models), a mathematical framework for modeling the probabilities of changes from one state to the next
● Most speech recognition engines contain common databases for a particular language:
-- Grammar
-- Lexicon (dictionary)
-- Supra-segmental models (prosody)
-- Acoustic speech samples
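To make the Markov chain idea concrete, here is a minimal sketch of how a chain assigns a probability to a sequence of sound units. The units, the `TRANSITIONS` table, and all of its values are hypothetical and chosen only for illustration; a real engine estimates such probabilities from acoustic speech samples and uses far larger models.

```python
# Hypothetical transition probabilities between sound units.
# The values are illustrative only, not taken from any real recognizer.
TRANSITIONS = {
    "START": {"n": 0.6, "w": 0.4},
    "n": {"e": 0.9, "k": 0.1},
    "e": {"k": 0.7, "s": 0.3},
    "k": {"s": 0.8, "e": 0.2},
    "s": {"END": 1.0},
}


def sequence_probability(units):
    """Multiply the transition probabilities along START -> units -> END."""
    path = ["START", *units, "END"]
    prob = 1.0
    for current, following in zip(path, path[1:]):
        # A transition missing from the table counts as impossible (0.0).
        prob *= TRANSITIONS.get(current, {}).get(following, 0.0)
    return prob


likely = sequence_probability(["n", "e", "k", "s"])
unlikely = sequence_probability(["n", "k", "s"])
print(likely > unlikely)
```

Under these assumed values, the sequence that follows the more probable transitions receives the higher score, which is the sense in which a recognizer "matches" input rather than understands it.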
HOW SPEECH RECOGNITION WORKS
● Speech is input through a microphone, analyzed by databases (“search aligners”), and then scored against a norm
● A developer can decide which scores count as “acceptable” or “unacceptable”; the authoring software can then give the user appropriate feedback (text, audio, video, etc.)
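The scoring and feedback steps can be sketched as follows. This is an illustration under stated assumptions, not the behavior of any particular engine: the 0-to-1 score scale, the default `threshold` of 0.7, the function names, and the feedback messages are all hypothetical.

```python
# Hypothetical feedback messages that authoring software might display.
FEEDBACK = {
    "acceptable": "Good. Moving on to the next item.",
    "unacceptable": "Let's try that again. Listen to the model first.",
}


def evaluate(score, threshold=0.7):
    """Label a recognizer score using a developer-chosen cutoff (assumed 0-1 scale)."""
    return "acceptable" if score >= threshold else "unacceptable"


def feedback_for(score, threshold=0.7):
    """Branch to the feedback that corresponds to the evaluation."""
    return FEEDBACK[evaluate(score, threshold)]


print(feedback_for(0.82))
print(feedback_for(0.50))
```

The cutoff is the developer's decision: raising `threshold` makes the program stricter, and lowering it makes the program more forgiving.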
HOW SPEECH RECOGNITION WORKS
● Communication between the speech engine and the authoring software is essential
[Diagram labels: Speech Input, SR Engine, Databases, Authoring Software, Feedback]
SHORTCOMINGS OF SPEECH RECOGNITION TECHNOLOGY
● Quiet environment needed, plus noise-reduction microphones
● General problems with consonants
● Prosody and fluency problems, which require re-engineering the engine
● Perpetual issues: false positives and false negatives
● Most recognizers are effective only in limited domains
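The false positive and false negative problem can be illustrated with a short sketch. The `SAMPLES` data below is made up for illustration, and it assumes that a human rater has judged each utterance and that the recognizer returns a score between 0 and 1.

```python
# Hypothetical (score, truly_correct) pairs; truly_correct is a human rater's judgment.
SAMPLES = [(0.9, True), (0.8, False), (0.6, True), (0.4, False), (0.75, True)]


def error_counts(samples, threshold):
    """Count utterances wrongly accepted and wrongly rejected at a given cutoff."""
    # False positive: the recognizer accepts speech the rater judged incorrect.
    false_pos = sum(1 for score, ok in samples if score >= threshold and not ok)
    # False negative: the recognizer rejects speech the rater judged correct.
    false_neg = sum(1 for score, ok in samples if score < threshold and ok)
    return false_pos, false_neg


for cutoff in (0.5, 0.7, 0.85):
    print(cutoff, error_counts(SAMPLES, cutoff))
```

With this sample data, a stricter cutoff trades false positives for false negatives, which is one reason neither kind of error can simply be tuned away.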
SHORTCOMINGS OF SPEECH RECOGNITION TECHNOLOGY
● If used as a diagnostic tool to pinpoint pronunciation problems, it must be carefully re-engineered to do so, and it must be accurate
CONCLUSIONS
● Most speech recognizers have a couple of domains where they are effective—the recognizer should be used only in these domains (“low stakes” vs. “high stakes”)
● Speech recognition technology is not a “black box” or a “magic pill”. It’s a tool that has to be used very carefully.
● Research needs to be done in order to use a recognizer effectively
● We must never forget that technology is effective only if it allows people to learn
BILC SEMINAR 2009
QUESTIONS?
COMMENTS?