speech recognition seminar
DESCRIPTION
Presentation for the seminar on the topic "Speech Recognition", in partial fulfillment of the requirements for Third Year Computer Engineering.
TRANSCRIPT
![Page 1: Speech Recognition Seminar](https://reader036.vdocuments.mx/reader036/viewer/2022082320/5456f982b1af9ff5168b4c6b/html5/thumbnails/1.jpg)
SPEECH RECOGNITION
07-Feb-2013
Seminar By: Suraj Vitthal Gaikwad
Guided By: Prof. S. R. Lahane
![Page 2: Speech Recognition Seminar](https://reader036.vdocuments.mx/reader036/viewer/2022082320/5456f982b1af9ff5168b4c6b/html5/thumbnails/2.jpg)
Outline
- Introduction
- Speech Recognition Process
- Types of Speech Recognition Systems
- Algorithms
- Applications
- Advantages & Disadvantages
- Future Scope
- Conclusion
![Page 3: Speech Recognition Seminar](https://reader036.vdocuments.mx/reader036/viewer/2022082320/5456f982b1af9ff5168b4c6b/html5/thumbnails/3.jpg)
Introduction
Speech recognition is the process by which a computer (or any other type of machine) identifies spoken words.
Basically, it means talking to your computer and having it correctly understand what you are saying.
It offers an alternative to traditional methods of interacting with a computer.
![Page 4: Speech Recognition Seminar](https://reader036.vdocuments.mx/reader036/viewer/2022082320/5456f982b1af9ff5168b4c6b/html5/thumbnails/4.jpg)
![Page 5: Speech Recognition Seminar](https://reader036.vdocuments.mx/reader036/viewer/2022082320/5456f982b1af9ff5168b4c6b/html5/thumbnails/5.jpg)
Speech Recognition Process
- Signal Processing: Convert the audio wave into a sequence of feature vectors.
- Speech Recognition: Decode the sequence of feature vectors into a sequence of words.
- Semantic Interpretation: Determine the meaning of the recognized words.
- Dialog Management: Correct the errors and help get the task done.
- Response Generation: Decide what words to use so as to maximize user understanding.
- Speech Synthesis (Text to Speech): Generate synthetic speech from a "marked-up" word string.
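The signal-processing step can be illustrated with a minimal sketch. It assumes a mono signal given as a list of float samples and uses two simple per-frame features (log energy and zero-crossing rate) purely for illustration; practical recognizers typically use spectral features such as MFCCs. The function names, frame length, and hop size are hypothetical choices, not taken from the slides.

```python
import math


def frame_signal(samples, frame_len, hop):
    """Split the signal into overlapping frames of frame_len samples."""
    last_start = len(samples) - frame_len
    return [samples[i:i + frame_len] for i in range(0, last_start + 1, hop)]


def frame_features(frame):
    """Return a tiny feature vector: [log energy, zero-crossing rate]."""
    energy = sum(x * x for x in frame) / len(frame)
    log_energy = math.log(energy + 1e-10)  # small offset avoids log(0)
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return [log_energy, crossings / (len(frame) - 1)]


def extract_features(samples, frame_len=200, hop=80):
    """Convert an audio wave into a sequence of feature vectors."""
    return [frame_features(f) for f in frame_signal(samples, frame_len, hop)]


# Synthetic input (an assumption for the demo): 0.1 s of a 440 Hz tone
# sampled at 8000 Hz, so each 200-sample frame covers 25 ms.
rate = 8000
samples = [math.sin(2 * math.pi * 440 * n / rate) for n in range(800)]
features = extract_features(samples)
print(len(features), "frames, first feature vector:", features[0])
```

Each inner list is one feature vector; the recognizer in the next step would decode the whole sequence into words.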
![Page 6: Speech Recognition Seminar](https://reader036.vdocuments.mx/reader036/viewer/2022082320/5456f982b1af9ff5168b4c6b/html5/thumbnails/6.jpg)
Typical Speech Recognition Process
![Page 7: Speech Recognition Seminar](https://reader036.vdocuments.mx/reader036/viewer/2022082320/5456f982b1af9ff5168b4c6b/html5/thumbnails/7.jpg)
Types of Speech Recognition
- Isolated Words: A single utterance at a time.
- Connected Words: Separate utterances spoken together with a minimal pause between them.
- Continuous Speech: Rehearsed speech or dictation.
- Spontaneous Speech: Natural speech.
![Page 8: Speech Recognition Seminar](https://reader036.vdocuments.mx/reader036/viewer/2022082320/5456f982b1af9ff5168b4c6b/html5/thumbnails/8.jpg)
Algorithms
- Dynamic Time Warping: An algorithm for measuring similarity between two sequences which may vary in time or speed.
- Hidden Markov Models
- Neural Networks
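Dynamic Time Warping can be sketched in a few lines. The version below is a minimal illustration, assuming one-dimensional numeric sequences and absolute difference as the local distance; in a recognizer the elements would be feature vectors with a vector distance. The function name and the sample sequences are hypothetical.

```python
def dtw_distance(a, b):
    """Return the minimal cumulative alignment cost between sequences a and b."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest way to align the first i items of a
    # with the first j items of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats its item
                cost[i][j - 1],      # b advances, a repeats its item
                cost[i - 1][j - 1],  # both advance together
            )
    return cost[n][m]


template = [1, 2, 3, 4, 3, 2, 1]
slow = [1, 1, 2, 2, 3, 3, 4, 4, 3, 3, 2, 2, 1, 1]  # same shape, half speed
other = [4, 3, 2, 1, 2, 3, 4]                      # different shape

print(dtw_distance(template, slow))   # the slowed-down copy aligns with zero cost
print(dtw_distance(template, other))  # a different shape gives a positive cost
```

Because the warping path may repeat items of either sequence, the same word spoken at a different speed can still match its template closely.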
![Page 9: Speech Recognition Seminar](https://reader036.vdocuments.mx/reader036/viewer/2022082320/5456f982b1af9ff5168b4c6b/html5/thumbnails/9.jpg)
Hidden Markov Model
In an HMM, the state is not directly visible, but the output, which depends on the state, is visible.
Each state has a probability distribution over the possible output tokens. Therefore, the sequence of tokens generated by an HMM gives some information about the sequence of states.
- x: states
- y: possible observations
- a: state transition probabilities
- b: output probabilities
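How an observed token sequence gives information about the hidden states can be sketched with the Viterbi algorithm, which finds the most probable state sequence. The two states, the two observation symbols, and all probabilities below are made-up values for illustration only; the start probabilities are an additional assumption not listed in the legend above.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (most probable state path, its probability)."""
    # best[t][s] = (probability of the best path ending in state s at time t,
    #               previous state on that path)
    first = observations[0]
    best = [{s: (start_p[s] * emit_p[s][first], None) for s in states}]
    for obs in observations[1:]:
        prev = best[-1]
        column = {}
        for s in states:
            prob, back = max((prev[r][0] * trans_p[r][s], r) for r in states)
            column[s] = (prob * emit_p[s][obs], back)
        best.append(column)
    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path, best[-1][last][0]


states = ["silence", "speech"]                      # x: hidden states
observations = ["quiet", "loud", "loud"]            # y: observed tokens
start_p = {"silence": 0.6, "speech": 0.4}           # assumed initial distribution
trans_p = {                                         # a: state transition probabilities
    "silence": {"silence": 0.7, "speech": 0.3},
    "speech": {"silence": 0.4, "speech": 0.6},
}
emit_p = {                                          # b: output probabilities
    "silence": {"quiet": 0.9, "loud": 0.1},
    "speech": {"quiet": 0.2, "loud": 0.8},
}

path, prob = viterbi(observations, states, start_p, trans_p, emit_p)
print(path, prob)
```

For this toy model the decoded path is silence, speech, speech: one quiet token followed by two loud ones is best explained by a switch from silence to speech.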
![Page 10: Speech Recognition Seminar](https://reader036.vdocuments.mx/reader036/viewer/2022082320/5456f982b1af9ff5168b4c6b/html5/thumbnails/10.jpg)
HMM Example
![Page 11: Speech Recognition Seminar](https://reader036.vdocuments.mx/reader036/viewer/2022082320/5456f982b1af9ff5168b4c6b/html5/thumbnails/11.jpg)
Neural Network
A neural network consists of an interconnected group of artificial neurons, and it processes information using a connectionist approach to computation.
An NN is typically defined by three types of parameters:
- The interconnection pattern between different layers of neurons.
- The learning process for updating the weights of the interconnections.
- The activation function that converts a neuron's weighted input to its output activation.
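The three ingredients can be seen together in a small sketch. It assumes a fully connected feed-forward pattern, a sigmoid activation, and a single gradient-descent step on squared error as the learning process; these are common textbook choices rather than anything specified in the slides, and all names and weight values are hypothetical.

```python
import math


def sigmoid(z):
    """Activation function: maps a weighted input to an output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs followed by the activation function."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def forward(inputs, layers):
    """Interconnection pattern: each layer feeds every neuron of the next."""
    activation = inputs
    for weight_rows, biases in layers:
        activation = [
            neuron_output(activation, row, b)
            for row, b in zip(weight_rows, biases)
        ]
    return activation


def train_step(inputs, target, weights, bias, rate=0.5):
    """Learning process: one gradient step for a single sigmoid neuron."""
    out = neuron_output(inputs, weights, bias)
    # Gradient of 0.5 * (out - target)^2 with respect to the weighted input.
    delta = (out - target) * out * (1.0 - out)
    new_weights = [w - rate * delta * x for w, x in zip(weights, inputs)]
    return new_weights, bias - rate * delta


# A 2-input, 2-hidden-neuron, 1-output network with arbitrary weights.
layers = [
    ([[0.5, -0.4], [0.3, 0.8]], [0.0, 0.1]),  # hidden layer
    ([[1.0, -1.0]], [0.0]),                   # output layer
]
print(forward([1.0, 0.0], layers))

# One learning step moves the neuron's output toward the target of 1.0.
before = neuron_output([1.0, 1.0], [0.0, 0.0], 0.0)
weights, bias = train_step([1.0, 1.0], 1.0, [0.0, 0.0], 0.0)
after = neuron_output([1.0, 1.0], weights, bias)
print(before, after)
```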
![Page 12: Speech Recognition Seminar](https://reader036.vdocuments.mx/reader036/viewer/2022082320/5456f982b1af9ff5168b4c6b/html5/thumbnails/12.jpg)
Speech Recognition Software
- Open source: Julius
- Macintosh: Dragon Dictate
- Mobile devices/smartphones: Google Now, Siri, Micromax AISHA (Artificial Intelligence Speech Handset Assistant), S Voice, Iris (Intelligent Rival Imitator of Siri)
- Windows: Dragon NaturallySpeaking, Windows Speech Recognition
![Page 13: Speech Recognition Seminar](https://reader036.vdocuments.mx/reader036/viewer/2022082320/5456f982b1af9ff5168b4c6b/html5/thumbnails/13.jpg)
Applications
- Games and Edutainment
- Data Entry
- Document Editing
- Speaker Identification/Verification
- Automation at Call Centers
- Medical/Disabilities
- Fighter Aircraft
![Page 14: Speech Recognition Seminar](https://reader036.vdocuments.mx/reader036/viewer/2022082320/5456f982b1af9ff5168b4c6b/html5/thumbnails/14.jpg)
Advantages
- Increases productivity
- Can help with menial computer tasks
- Can help people with disabilities
- Cost effective
- Diminishes spelling mistakes
![Page 15: Speech Recognition Seminar](https://reader036.vdocuments.mx/reader036/viewer/2022082320/5456f982b1af9ff5168b4c6b/html5/thumbnails/15.jpg)
Disadvantages
- Inaccuracy & Slowness
- Vocal Strain
- Adaptability
- Out-of-Vocabulary (OOV) Words
- Spontaneous Speech, etc.
- Accent, Dialect and Mixed Language
![Page 16: Speech Recognition Seminar](https://reader036.vdocuments.mx/reader036/viewer/2022082320/5456f982b1af9ff5168b4c6b/html5/thumbnails/16.jpg)
Future Scope
- Achieving efficient speaker-independent word recognition.
- Speech recognition systems (SRS) may gain the ability to distinguish nuances of speech and meanings of words.
- Standalone speech recognition systems.
- Wearable speech recognition systems.
- Talking with all devices.
![Page 17: Speech Recognition Seminar](https://reader036.vdocuments.mx/reader036/viewer/2022082320/5456f982b1af9ff5168b4c6b/html5/thumbnails/17.jpg)
Conclusion
Within five years, speech recognition technology will become so pervasive in our daily lives that service environments lacking this technology will be considered inferior.
Speech recognition will revolutionize the way people interact with smart devices and will, ultimately, differentiate the upcoming technologies.
![Page 18: Speech Recognition Seminar](https://reader036.vdocuments.mx/reader036/viewer/2022082320/5456f982b1af9ff5168b4c6b/html5/thumbnails/18.jpg)
![Page 19: Speech Recognition Seminar](https://reader036.vdocuments.mx/reader036/viewer/2022082320/5456f982b1af9ff5168b4c6b/html5/thumbnails/19.jpg)
THANK YOU…!!
ANY QUESTIONS…??