ROBOTIC CONTROL THROUGH SPEECH
INTRODUCTION
• This voice recognition project consists of two major components: a speech recognition module and a motorized robot.
• The programmable module is programmed in Visual DSP++ (the environment used to program applications for the ADSP 2181 architecture).
• The motorized robot has two DC motors, which drive it in the forward and backward directions.
DEPARTMENT OF ECE 2
PROJECT DESCRIPTION
Speaker recognition can be divided into two phases:
1. Training phase
2. Testing phase
Training Phase.
• In the training phase, the frequency components of the given speech signal are extracted.
• Each registered speaker has to provide samples of their speech (the given words), so that the system can build or train a reference model for that speaker.
Testing phase
• In the testing phase, the input speech is matched against the stored reference model(s).
• The recognition decision is made on the basis of Mel Frequency Cepstrum Coefficients (MFCC).
• The recognized command is observed through the operation of the stepper motor and DC motor, and through the control signals sent to the DC motor.
ARCHITECTURE OF ADSP 2181
FEATURES OF ADSP 2181 PROCESSOR
• 25 ns Instruction Cycle Time from 20 MHz Crystal at 5.0 Volts
• Single-Cycle Instruction Execution
• Multifunction Instructions
• Low Power Dissipation in Idle Mode
• 16K Words On-Chip Program Memory RAM
• 16K Words On-Chip Data Memory RAM
• Independent ALU, Multiplier/Accumulator, and Barrel Shifter Units
• 3-Bus Architecture Allows Dual Operand Fetches in every Instruction Cycle
ALU and MAC
• The ALU performs a standard set of arithmetic and logic operations in addition to division primitives.
• The MAC performs single-cycle multiply, multiply/add and multiply/subtract operations.
SHIFTER
• The shifter performs logical and arithmetic shifts, normalization, de-normalization, and derive exponent operations.
• The shifter implements numeric format control including multiword floating-point representations.
SPEECH
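• The input speech is given in the form of spoken numbers such as 1, 2, 3 and so on.
• The frequency range of the human voice is taken to extend up to 4 kHz; hence, by the Nyquist criterion, the sampling frequency is taken as 8 kHz.
• In the code only 2000 samples are considered, because only 0.25 s is taken for one character (8000 samples/s × 0.25 s = 2000 samples).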
REPRESENTATION OF SPEECH SIGNAL
[Figure: speech signal waveform; amplitude (axis from -0.2 to 0.3) versus time in msec (axis from 0 to 40)]
Block Diagram
Input speech via mic → ADSP 2181 → DC motor
Processing stages: CODEC → Framing → Windowing → FFT → Mel frequency wrapping → Mel spectrum → Mel cepstrum
FRAMING
• The speech signal is blocked into frames of N samples (N = 256).
• Adjacent frames are separated by M samples (M = 100), so consecutive frames overlap by N - M = 156 samples.
• Frame 1 = samples 0-255
• Frame 2 = samples 100-355
• 18 such frames are required to cover the 2000 samples of one character.
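The framing step can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the Visual DSP++ code used in the project; the function name `frame_signal` and the all-zero placeholder signal are assumptions made for the example.

```python
def frame_signal(samples, frame_len=256, hop=100):
    """Split samples into overlapping frames of frame_len samples,
    with consecutive frames starting hop samples apart."""
    frames = []
    start = 0
    # Keep only complete frames that fit inside the signal.
    while start + frame_len <= len(samples):
        frames.append(samples[start:start + frame_len])
        start += hop
    return frames


# Placeholder for one character: 0.25 s of speech at 8 kHz = 2000 samples.
signal = [0.0] * 2000
frames = frame_signal(signal)
print(len(frames))      # 18
print(len(frames[0]))   # 256
```

With N = 256 and M = 100, the last complete frame starts at sample 1700, which gives the 18 frames stated above.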
[Figure: framing]
Windowing
• Minimizes the signal discontinuities at the beginning and end of each frame
• Reduces spectral distortion
• The windowed signal is obtained by
y1(n) = x1(n) · w(n), 0 ≤ n ≤ N-1
• where w(n) is the Hamming window, given by
w(n) = 0.54 - 0.46 cos(2πn / (N-1)), 0 ≤ n ≤ N-1
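As a rough sketch of the windowing step, the Hamming window formula can be evaluated directly in Python. The function names `hamming` and `apply_window` are chosen for this example only, and floating-point values are used here rather than the fixed-point format of the DSP.

```python
import math


def hamming(N):
    """Hamming window: w(n) = 0.54 - 0.46*cos(2*pi*n/(N-1)), 0 <= n <= N-1."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]


def apply_window(frame):
    """Multiply a frame sample-by-sample with a Hamming window of the same length."""
    w = hamming(len(frame))
    return [x * wn for x, wn in zip(frame, w)]


w = hamming(256)
windowed = apply_window([1.0] * 256)
print(len(windowed))
```

The window tapers each frame towards 0.08 at both ends, which is what reduces the discontinuity between the end of one frame and the start of the next.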
[Figure: windowing]
Result of Windowing
• This process outputs 256 values.
• These values are given as the input to the FFT.
• Some windowed values for a 1 kHz input are shown:
0x0000 0x0826 0x0BE6 0x08B7 0x000F 0xF6C7 0xF26C 0xF5FC 0xFFE8 0x0AA9 0x0FC7
Fast Fourier Transform
• Converts the time-domain signal into a frequency-domain signal
• The power spectrum is obtained from the real and imaginary parts of the frequency-domain speech signal: P(k) = Re{X(k)}² + Im{X(k)}²
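To illustrate how the power spectrum follows from the real and imaginary parts of the FFT output, here is a small pure-Python sketch. It uses a textbook recursive radix-2 FFT (frame length assumed to be a power of two) and is not the routine that runs on the ADSP 2181; the 1 kHz test tone and all names are assumptions for the example.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def power_spectrum(frame):
    """Power per bin: (real part)^2 + (imaginary part)^2."""
    return [X.real ** 2 + X.imag ** 2 for X in fft(frame)]


fs = 8000   # sampling frequency in Hz
N = 256     # frame length
tone = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(N)]
P = power_spectrum(tone)
peak = max(range(N // 2), key=lambda k: P[k])
print(peak, peak * fs / N)   # 32 1000.0
```

A 1 kHz tone sampled at 8 kHz falls exactly on bin 32 of a 256-point FFT, since each bin is 8000/256 = 31.25 Hz wide.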
Wrapping
• For each frequency, a subjective pitch is computed on the mel scale.
• The mel frequency scale is given by mel(f) = 2595 * log10(1 + f/700), with f in Hz.
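The mel formula above translates directly into code. The sketch below also includes the inverse mapping, which is obtained by solving the same formula for f; the function names are chosen for this example.

```python
import math


def hz_to_mel(f):
    """mel(f) = 2595 * log10(1 + f/700), f in Hz."""
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel, obtained by solving the formula for f."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


for f in (500, 1000, 2000, 4000):
    print(f, round(hz_to_mel(f), 1))
```

The scale is roughly linear at low frequencies and logarithmic at higher ones, with 1000 Hz mapping to approximately 1000 mel.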
Mel Frequency Coefficients
MFCC
• MFCC stands for Mel Frequency Cepstrum Coefficients.
• It consists of a set of coefficients derived from the frequency components of the speech signal.
• It contains:
Mel Spectrum (frequency domain)
Mel Cepstrum (time domain)
SPECTRUM
• The samples are convolved with the mel filter bank to obtain the mel frequency spectrum.
• The mel frequency spectrum is given by
s(n) = y(n) * f(n)
where
s(n) = mel frequency spectrum
y(n) = samples
f(n) = filter coefficients
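One common way to realize this step, sketched below, is to build triangular filters whose centers are equally spaced on the mel scale and to apply each filter as a weighted sum over the power-spectrum bins. This weighted-sum formulation is swapped in here for illustration; the slides describe the step as a convolution and do not give the filter design. The number of filters (20), the triangular shape, and all names are assumptions.

```python
import math


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(num_filters, n_fft, fs):
    """Triangular filters with centers equally spaced on the mel scale."""
    half = n_fft // 2 + 1
    top = hz_to_mel(fs / 2)
    # num_filters + 2 edge frequencies, mapped to FFT bin indices.
    edges = [mel_to_hz(top * i / (num_filters + 1)) for i in range(num_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * f / fs)) for f in edges]
    bank = []
    for k in range(1, num_filters + 1):
        lo, mid, hi = bins[k - 1], bins[k], bins[k + 1]
        filt = [0.0] * half
        for j in range(lo, mid):          # rising slope
            filt[j] = (j - lo) / (mid - lo)
        for j in range(mid, hi):          # falling slope, peak of 1.0 at mid
            filt[j] = (hi - j) / (hi - mid)
        bank.append(filt)
    return bank


def mel_spectrum(power, bank):
    """Weighted sum of the power-spectrum bins under each filter."""
    return [sum(h * p for h, p in zip(filt, power)) for filt in bank]


bank = mel_filterbank(num_filters=20, n_fft=256, fs=8000)
flat_power = [1.0] * (256 // 2 + 1)       # placeholder power spectrum
S = mel_spectrum(flat_power, bank)
print(len(S))   # 20
```

Each output value S[k] summarizes the energy in one mel band, so a 129-bin power spectrum is reduced to 20 numbers per frame in this sketch.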
Inverse Discrete Cosine Transformation
• The mel frequency power spectrum is a frequency-domain function.
• To obtain a time-domain function, the signal undergoes the IDCT.
• The mel frequency spectrum is thereby converted into the mel frequency cepstrum.
CEPSTRUM
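• The mel spectrum coefficients are real numbers and are converted to the time domain using the IDCT.
• The time-domain coefficients are called mel frequency cepstrum coefficients.
• The MFCC is given by
c(n) = Σ (k = 1 to K) log(S_k) · cos(n (k - 0.5) π / K)
where S_k is the k-th mel spectrum coefficient and K is the number of mel spectrum coefficients.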
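The cepstrum formula can be evaluated term by term as in the following sketch. The slides do not state the base of the logarithm, the range of n, or how many coefficients are kept; the natural logarithm, n starting at 0, and 13 coefficients are assumptions made for this example, as is the function name.

```python
import math


def mel_cepstrum(mel_spec, num_coeffs):
    """c(n) = sum over k = 1..K of log(S_k) * cos(n * (k - 0.5) * pi / K)."""
    K = len(mel_spec)
    log_s = [math.log(s) for s in mel_spec]   # natural log is an assumption
    return [
        sum(log_s[k - 1] * math.cos(n * (k - 0.5) * math.pi / K)
            for k in range(1, K + 1))
        for n in range(num_coeffs)
    ]


# Placeholder mel spectrum: 20 equal values, so log(S_k) = 1 for every k.
flat = [math.e] * 20
c = mel_cepstrum(flat, 13)
print(round(c[0], 6))
```

For a perfectly flat mel spectrum, only c(0) is non-zero: the cosine terms for n ≥ 1 cancel over k = 1..K, which is a quick sanity check on an implementation.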
LEAST MEAN SQUARE ALGORITHM (LMS)
• This algorithm is used to find the minimum deviation between sets of values.
• During the testing phase, the input speech is compared with the 4 stored values.
• The stored value with the least deviation is sent as the result.
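The matching step described above can be sketched as a search for the stored reference with the smallest squared deviation from the input features. The slides do not specify the exact deviation measure, so the sum of squared differences is an assumption, and the four reference vectors and labels below are hypothetical values made up for the example.

```python
def squared_deviation(a, b):
    """Sum of squared differences between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def recognize(features, references):
    """Return the label of the stored reference with the least deviation."""
    return min(references,
               key=lambda label: squared_deviation(features, references[label]))


# Hypothetical stored references for four trained commands.
references = {
    "1": [1.0, 0.0, 0.0],
    "2": [2.0, 0.0, -1.0],
    "3": [0.0, 3.0, 1.0],
    "4": [-1.0, 1.0, 2.0],
}

result = recognize([1.9, 0.2, -0.8], references)
print(result)   # 2
```

In the project, the feature vectors would be the MFCCs of the spoken word, and the returned label would select the control signal for the motor.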
INTERFACING PC WITH KIT
PC -- RS-232 serial cable -- DSP processor
DSP TO DC MOTOR
CIRCUIT DIAGRAM
HARDWARE DETAILS
• The latched output from the latch IC is given to the relays via a resistor and a transistor.
• According to the predefined input, the relay coil is energized and the relay is switched to the ON position.
• An SPDT relay is used here.
• Switching the relay ON causes current to flow in the DC motor.
Details of DC motor
• Speed of the motor – 300 rpm
• Current – 750 mA
• Voltage – 7.5 V
Advantages
• It is controlled by speech recognition
• Processing time is short
• Easy to use and efficient
• Useful for physically disabled people
• Low cost
• Maintenance is easy
Limitations
• A frequency mismatch may affect compatibility with the hardware.
• Each user's voice must be trained before it can be tested.
APPLICATIONS
• A device friendly to physically and visually impaired users, since only the user's speech signals are required.
• In acute situations such as system crashes, this method can be used in an emergency.
CONCLUSION and FUTURE MODIFICATIONS
• Speech recognition is still an active research area.
• Speech recognition enables communication between human and machine.
• This project recognizes the given speech signal and displays the recognized word on the PC.
THANK YOU