dr. shyamal kumar das mandal assistant professor
TRANSCRIPT
![Page 1: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/1.jpg)
Dr. Shyamal Kumar Das Mandal, Assistant Professor
[email protected]
Centre for Educational Technology
Indian Institute of Technology, Kharagpur
DIGITAL SPEECH PROCESSING
![Page 2: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/2.jpg)
![Page 3: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/3.jpg)
Speech Processing
• Signal Processing: Fourier transforms, discrete-time filters, AR(MA) models
• Information Theory: entropy, communication theory, rate-distortion theory
• Statistical SP: stochastic models
• Acoustics, Phonetics and Articulatory Phonetics: psychoacoustics, speech production
• Algorithms (Programming)
![Page 4: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/4.jpg)
Speech Processing Model
Information Source
Measurement / Observation
Signal Representation
Signal Transformation
Extraction of Information
Utilization of Information
Human speaker: lots of variability
Acoustic waveform / articulatory positions / neural control signals
Human listeners, machines
Signal Processing
![Page 5: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/5.jpg)
Course coverage
• Digitization and recording of the speech signal; review of digital signal processing concepts
• Human speech production, acoustic phonetics and articulatory phonetics; different categories of speech sounds and the location of sounds in the acoustic waveform and spectrograms
• Uniform tube modeling of speech production; speech perception
• Time-domain methods in speech processing; analysis and synthesis of pole-zero speech models
• Short-time Fourier transform. Analysis: FT view and filtering view. Synthesis: filter bank summation (FBS) method and OLA method
• Feature extraction and extraction of the fundamental frequency
• Speech prosody; speech prosody modeling (Fujisaki model)
• Overview of speech-based application development (TTS, ASR and spoken language acquisition)
![Page 6: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/6.jpg)
Course objective
• Categorize and label the different speech sounds in a given speech signal based on waveform and spectrographic views
• Explain the psychoacoustic properties of speech perception and production
• Design the uniform tube model for speech sound production and implement it using discrete-time modeling
• Extract the fundamental frequency of a speech signal using time-domain and frequency-domain methods
• Extract spectral parameters and time-domain parameters of a speech signal for speech technology applications
• Design a simple TTS and ASR system
• Explain the prosodic structure of spoken language and design F0 contour models based on the Fujisaki model
![Page 7: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/7.jpg)
Review of DSP Concepts
![Page 8: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/8.jpg)
Concept of frequency in continuous-time and discrete-time signals
Continuous-time sinusoidal signal
$$x_a(t) = A\cos(\Omega t + \theta), \quad -\infty < t < \infty, \quad \Omega = 2\pi F$$
Complex exponential form
$$x_a(t) = A e^{j(\Omega t + \theta)}$$
• For every fixed value of the frequency F, x_a(t) is periodic.
• Continuous-time sinusoidal signals with distinct frequencies are themselves distinct.
• Increasing the frequency increases the rate of oscillation of the signal: more periods are included in a given time interval.
![Page 9: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/9.jpg)
Discrete-time sinusoidal signal
$$x(n) = A\cos(\omega n + \theta), \quad -\infty < n < \infty, \quad \omega = 2\pi f$$
• A discrete-time sinusoid is periodic only if its frequency f is a rational number.
• Discrete-time sinusoids whose frequencies are separated by an integer multiple of 2π are identical.
• The highest rate of oscillation in a discrete-time sinusoid is attained when ω = π (or ω = −π).
![Page 10: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/10.jpg)
![Page 11: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/11.jpg)
![Page 12: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/12.jpg)
![Page 13: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/13.jpg)
Classification of Discrete time signal
Energy signal and power signal: the energy E of a signal x[n] is
$$E = \sum_{n=-\infty}^{\infty} |x[n]|^2$$
If E is finite, x[n] is called an energy signal. Many signals that possess infinite energy have a finite average power P. The average power is defined as
$$P = \lim_{N\to\infty} \frac{1}{2N+1} \sum_{n=-N}^{N} |x[n]|^2$$
If P is finite and nonzero, the signal is called a power signal.
Periodic and aperiodic signal: a signal x[n] is periodic with period N if and only if x[n+N] = x[n] for all n. The smallest value of N for which this equation holds is called the fundamental period. If no value of N satisfies the equation, the signal is called aperiodic.
Symmetric and antisymmetric signal: a real-valued signal x[n] is called symmetric (even) if x[−n] = x[n], and antisymmetric (odd) if x[−n] = −x[n].
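As a rough illustration of the classifications above, the following sketch computes energy and a finite-length estimate of average power, and tests symmetry. The function names are hypothetical, and the symmetry checks assume the list holds the samples for n = −L..L in order.

```python
def energy(x):
    """Energy of a finite-length sequence: sum of |x[n]|^2."""
    return sum(abs(v) ** 2 for v in x)

def average_power(x):
    """Finite estimate of P: energy divided by the number of samples (2N+1)."""
    return energy(x) / len(x)

def is_symmetric(x):
    """True if x[-n] == x[n]; x is assumed to hold samples for n = -L..L."""
    return all(a == b for a, b in zip(x, reversed(x)))

def is_antisymmetric(x):
    """True if x[-n] == -x[n]; x is assumed to hold samples for n = -L..L."""
    return all(a == -b for a, b in zip(x, reversed(x)))

pulse = [1.0, 0.5, 0.25]          # finite duration, so finite energy
print(energy(pulse))              # 1.3125
print(is_symmetric([1, 2, 3, 2, 1]))
```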
![Page 14: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/14.jpg)
Discrete time system
A discrete-time system is a device or algorithm that operates on a discrete-time signal, called the input, according to some well-defined rule, to produce another discrete-time signal, called the output of the system.
y[n] = H[x[n]]
Example: y[n] = 0.8y[n-1] + 0.5x[n] + 0.9x[n-1]
[Block diagram: input x[n], output y[n], two unit delays z⁻¹, two adders, and gains 0.5 (on x[n]), 0.9 (on x[n-1]) and 0.8 (on y[n-1])]
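The example system above can be simulated sample by sample. This is a minimal sketch, assuming the system is initially relaxed (y[-1] = 0 and x[-1] = 0); the function name is hypothetical.

```python
def example_system(x):
    """Simulate y[n] = 0.8*y[n-1] + 0.5*x[n] + 0.9*x[n-1] from rest."""
    y = []
    y_prev = 0.0  # y[n-1], zero because the system is assumed relaxed
    x_prev = 0.0  # x[n-1]
    for xn in x:
        yn = 0.8 * y_prev + 0.5 * xn + 0.9 * x_prev
        y.append(yn)
        y_prev, x_prev = yn, xn
    return y

# Response to a unit sample (the first few values of the impulse response)
print(example_system([1.0, 0.0, 0.0, 0.0]))
```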
![Page 15: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/15.jpg)
Classification of discrete system
• Static and dynamic systems
A discrete-time system is called static or memoryless if its output at any instant n depends at most on the input sample at the same time, but not on past or future samples of the input. In any other case the system is said to be dynamic, or to have memory.
• Time-invariant and time-variant systems
A system is called time invariant if its input-output characteristics do not change with time.
A relaxed system H is time invariant (shift invariant) if and only if x[n] → y[n] under H implies that x[n−k] → y[n−k] under H, for every input x[n] and every shift k.
If the response to the shifted input x[n−k] differs from y[n−k], even for one value of k, the system is time variant.
![Page 16: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/16.jpg)
• Linear and non-linear systems: A linear system is one that satisfies the superposition principle. The principle of superposition requires that the response of the system to a weighted sum of signals is equal to the corresponding weighted sum of the responses of the system to each of the individual input signals.
• Causal and non-causal systems: A system is said to be causal if its output at any time n depends only on present and past inputs, not on future inputs:
y[n] = F[x[n], x[n-1], x[n-2], …, x[n-k]], where F is any function.
If a system does not satisfy this condition, it is called non-causal.
• Stable and unstable systems: An arbitrary relaxed system is said to be bounded-input bounded-output (BIBO) stable if and only if every bounded input produces a bounded output. That x[n] and y[n] are bounded means that there exist finite numbers M_x and M_y such that
$$|x[n]| \le M_x < \infty, \qquad |y[n]| \le M_y < \infty \quad \text{for all } n$$
![Page 17: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/17.jpg)
Recursive and Non-recursive discrete system.
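A system whose output y[n] at time n depends on any number of past output values y[n-1], y[n-2], … is called a recursive system.
Example: cumulative average of the signal x[n]
$$y[n] = \frac{1}{n+1}\sum_{k=0}^{n} x[k]$$
$$(n+1)\,y[n] = \sum_{k=0}^{n-1} x[k] + x[n] = n\,y[n-1] + x[n]$$
$$y[n] = \frac{n}{n+1}\,y[n-1] + \frac{1}{n+1}\,x[n]$$
[Block diagram: x[n] is added to n·y[n-1] (obtained through a unit delay z⁻¹ and a multiplier n), and the sum is multiplied by 1/(n+1) to give y[n]]
By contrast, the output of a non-recursive system depends only on present and past inputs, for example
$$y[n] = \sum_{k} h[k]\,x[n-k]$$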
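To make the recursive idea concrete, the sketch below computes the cumulative average two ways: directly from its defining sum, and recursively using y[n] = (n·y[n-1] + x[n])/(n+1). Function names are hypothetical; both should agree.

```python
def cumulative_average_direct(x):
    """y[n] = (1/(n+1)) * sum_{k=0..n} x[k], recomputing the sum each time."""
    return [sum(x[:n + 1]) / (n + 1) for n in range(len(x))]

def cumulative_average_recursive(x):
    """Same output, but each y[n] reuses the previous output y[n-1]."""
    y = []
    y_prev = 0.0
    for n, xn in enumerate(x):
        yn = (n * y_prev + xn) / (n + 1)
        y.append(yn)
        y_prev = yn
    return y

x = [2.0, 4.0, 6.0, 8.0]
print(cumulative_average_direct(x))     # [2.0, 3.0, 4.0, 5.0]
print(cumulative_average_recursive(x))  # [2.0, 3.0, 4.0, 5.0]
```

The recursive form needs one multiply-add per sample and only one stored value, whereas the direct form grows in cost with n.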
![Page 18: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/18.jpg)
Convolution
Convolution is one of the most frequently used operations in DSP, especially in digital filtering applications where two finite and causal sequences x[n] and h[n], of lengths N1 and N2, are convolved:
$$y[n] = h[n] * x[n] = \sum_{k=-\infty}^{\infty} h[k]\,x[n-k] = \sum_{k=0}^{\infty} h[k]\,x[n-k]$$
![Page 19: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/19.jpg)
Operations involved
• Folding• Shifting• Multiplication• Summation
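The four operations can be sketched in a few lines for two finite causal sequences. This is an illustrative implementation with a hypothetical function name; the index n − k performs the fold and shift, and the inner loop performs the multiply and sum.

```python
def convolve(x, h):
    """Linear convolution y[n] = sum_k h[k] * x[n-k] of two finite causal sequences."""
    n1, n2 = len(x), len(h)
    y = [0.0] * (n1 + n2 - 1)       # output length is N1 + N2 - 1
    for n in range(len(y)):
        for k in range(n2):
            if 0 <= n - k < n1:     # x is zero outside 0..N1-1
                y[n] += h[k] * x[n - k]
    return y

print(convolve([1, 2, 3], [1, 1]))  # [1.0, 3.0, 5.0, 3.0]
```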
![Page 20: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/20.jpg)
$$y(t) = \sum_{\tau} x(\tau)\,h(t-\tau)$$
[Figure: x(τ), h(τ) and the output y(t)]
![Page 21: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/21.jpg)
$$y(t) = \sum_{\tau} x(\tau)\,h(t-\tau)$$
[Figure: folding. x(τ), the folded response h(−τ) and the output y(t)]
![Page 22: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/22.jpg)
$$y(t) = \sum_{\tau} x(\tau)\,h(t-\tau)$$
[Figure: shifting. x(τ), h(t−τ) for t = 0, and the output y(t)]
![Page 23: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/23.jpg)
$$y(t) = \sum_{\tau} x(\tau)\,h(t-\tau)$$
[Figure: x(τ), h(t−τ) for t = 1, and the output y(t)]
![Page 24: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/24.jpg)
$$y(t) = \sum_{\tau} x(\tau)\,h(t-\tau)$$
[Figure: x(τ), h(t−τ) for t = 2, and the output y(t)]
![Page 25: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/25.jpg)
$$y(t) = \sum_{\tau} x(\tau)\,h(t-\tau)$$
[Figure: x(τ), h(t−τ) for t = 7, and the output y(t)]
![Page 26: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/26.jpg)
![Page 27: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/27.jpg)
Circular convolution
• The circular convolution of x(n) and h(n) is defined as the convolution of h(n) with a periodic signal x_p(n):
$$y_p[n] = x_p[n] * h[n]$$
where
$$x_p(n) = x(n \bmod N) \quad \text{for all } n$$
$$y_p[n] = \sum_{m=0}^{N-1} h[m]\,x_p[n-m], \qquad n = 0, 1, \ldots, N-1$$
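A minimal sketch of circular convolution follows, assuming both sequences have already been zero-padded to the same length N. The function name is hypothetical; the modulo index plays the role of the periodic extension x_p.

```python
def circular_convolve(x, h):
    """N-point circular convolution: y[n] = sum_m h[m] * x[(n - m) mod N]."""
    N = len(x)
    if len(h) != N:
        raise ValueError("x and h must both have length N (zero-pad first)")
    return [sum(h[m] * x[(n - m) % N] for m in range(N)) for n in range(N)]

print(circular_convolve([1, 2, 3, 4], [1, 1, 0, 0]))  # [5, 3, 5, 7]
```

Note how y[0] = 5 includes the wrapped-around sample x[3]; linear convolution of the same sequences would give y[0] = 1.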
![Page 28: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/28.jpg)
Correlation
Correlation is a mathematical operation that is very similar to convolution. Just as with convolution, correlation uses two signals to produce a third signal. This third signal is called the cross-correlation of the two input signals.
If a signal is correlated with itself, the resulting signal is instead called the autocorrelation.
![Page 29: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/29.jpg)
Correlation
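$$r_{xy}(l) = \sum_{n=-\infty}^{\infty} x(n)\,y(n-l), \qquad l = 0, \pm 1, \pm 2, \ldots$$
or, equivalently,
$$r_{xy}(l) = \sum_{n=-\infty}^{\infty} x(n+l)\,y(n), \qquad l = 0, \pm 1, \pm 2, \ldots$$
where r_xy(l) is the cross-correlation of x(n) and y(n) at lag l.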
![Page 30: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/30.jpg)
Computation of correlation
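For causal finite-length sequences x(n), 0 ≤ n ≤ N−1, and y(n), 0 ≤ n ≤ M−1, with M ≤ N:
$$r_{xy}(l) = \begin{cases} \displaystyle\sum_{n=l}^{M-1+l} x(n)\,y(n-l), & 0 \le l \le N-M \\[2ex] \displaystyle\sum_{n=l}^{N-1} x(n)\,y(n-l), & N-M < l \le N-1 \end{cases}$$
Pseudocode (arrays indexed from 0):
FOR l = 0 TO lmax {
    NL = M - 1 + l
    IF (NL > N - 1) NL = N - 1
    R(l) = 0.0
    FOR k = l TO NL {
        R(l) = R(l) + X(k) * Y(k - l)
    }
}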
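The computation above can be written as a short runnable sketch. It assumes zero-based arrays, causal finite sequences x (length N) and y (length M ≤ N), and non-negative lags only; the function name is hypothetical.

```python
def cross_correlation(x, y, lmax):
    """r_xy(l) = sum_n x[n] * y[n - l] for lags l = 0..lmax (lmax <= N - 1)."""
    N, M = len(x), len(y)
    r = []
    for l in range(lmax + 1):
        upper = min(M - 1 + l, N - 1)   # last n for which both samples exist
        r.append(sum(x[n] * y[n - l] for n in range(l, upper + 1)))
    return r

print(cross_correlation([1, 2, 3, 4], [1, 1], 3))  # [3, 5, 7, 4]
```

Passing the same sequence for both arguments gives the autocorrelation for non-negative lags.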
![Page 31: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/31.jpg)
Convolution vs. correlation
Convolution is the relationship between a system's input signal, output signal, and impulse response.
Correlation is a way to detect a known waveform in a noisy background.
The similar mathematics is only a convenient coincidence.
![Page 32: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/32.jpg)
LTI System
![Page 33: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/33.jpg)
Linear Time-Invariant Systems
• Easiest to understand
• Easiest to manipulate
• Powerful processing capabilities
• Characterized completely by their response to the unit sample, h(n), via the convolution relationship
• Basis for linear filtering
• Used as models for speech production
![Page 34: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/34.jpg)
Equivalent LTI Systems
![Page 35: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/35.jpg)
Find y[n]?
x[n]
y[n]
![Page 36: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/36.jpg)
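Direct Form I
• Difference equation of a recursive LTI system:
$$y(n) = -\sum_{k=1}^{N} a_k\,y(n-k) + \sum_{k=0}^{M} b_k\,x(n-k)$$
• Written as a non-recursive section followed by a recursive section (Direct Form I):
$$v(n) = \sum_{k=0}^{M} b_k\,x(n-k)$$
$$y(n) = -\sum_{k=1}^{N} a_k\,y(n-k) + v(n)$$
• Interchanging the order of the two sections (Direct Form II):
$$w(n) = -\sum_{k=1}^{N} a_k\,w(n-k) + x(n)$$
$$y(n) = \sum_{k=0}^{M} b_k\,w(n-k)$$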
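The sketch below evaluates a recursive LTI system in Direct Form I. It assumes the sign convention y(n) = −Σ a_k y(n−k) + Σ b_k x(n−k), zero initial conditions, and coefficient lists b = [b_0..b_M] and a = [a_1..a_N]; the function name and argument layout are illustrative choices, not taken from the slides.

```python
def direct_form_1(b, a, x):
    """Direct Form I: non-recursive section v(n), then recursive section y(n).

    b = [b_0, ..., b_M], a = [a_1, ..., a_N] (assumed convention:
    y(n) = -sum_k a_k y(n-k) + sum_k b_k x(n-k)), zero initial conditions.
    """
    y = []
    for n in range(len(x)):
        # v(n) = sum_{k=0..M} b_k x(n-k)
        v = sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
        # feedback term sum_{k=1..N} a_k y(n-k)
        fb = sum(a[k - 1] * y[n - k] for k in range(1, len(a) + 1) if n - k >= 0)
        y.append(v - fb)
    return y

# y(n) = 0.8 y(n-1) + 0.5 x(n) + 0.9 x(n-1)  ->  a_1 = -0.8 under this convention
print(direct_form_1([0.5, 0.9], [-0.8], [1.0, 0.0, 0.0, 0.0]))
```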
![Page 37: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/37.jpg)
Direct Form I
![Page 38: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/38.jpg)
![Page 39: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/39.jpg)
Frequency-Domain Representation of Discrete Signals and LTI Systems
Input: complex-valued exponential signal
$$x(n) = e^{j\omega n}$$
LTI system with impulse response h(n); the output y(n) is
$$y(n) = \sum_{k=-\infty}^{\infty} h(k)\,x(n-k)$$
![Page 40: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/40.jpg)
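LTI system output:
$$y(n) = \sum_{k=-\infty}^{\infty} h(k)\,x(n-k) = \sum_{k=-\infty}^{\infty} h(k)\,e^{j\omega(n-k)} = \sum_{k=-\infty}^{\infty} h(k)\,e^{-j\omega k}\,e^{j\omega n} = e^{j\omega n}\sum_{k=-\infty}^{\infty} h(k)\,e^{-j\omega k}$$
$$y(n) = e^{j\omega n}\,H(e^{j\omega})$$
Frequency response:
$$H(e^{j\omega}) = \sum_{k=-\infty}^{\infty} h(k)\,e^{-j\omega k}$$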
![Page 41: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/41.jpg)
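$$H(e^{j\omega}) = |H(e^{j\omega})|\,e^{j\phi(\omega)}$$
$$H(e^{j\omega}) = \operatorname{Re}\,H(e^{j\omega}) + j\,\operatorname{Im}\,H(e^{j\omega})$$
$$H(e^{j\omega}) = \sum_{k=-\infty}^{\infty} h(k)\cos\omega k - j\sum_{k=-\infty}^{\infty} h(k)\sin\omega k$$
$$\operatorname{Re}\,H(e^{j\omega}) = \sum_{k=-\infty}^{\infty} h(k)\cos\omega k$$
$$\operatorname{Im}\,H(e^{j\omega}) = -\sum_{k=-\infty}^{\infty} h(k)\sin\omega k$$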
![Page 42: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/42.jpg)
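Magnitude response:
$$|H(e^{j\omega})| = \sqrt{\left[\operatorname{Re}\,H(e^{j\omega})\right]^2 + \left[\operatorname{Im}\,H(e^{j\omega})\right]^2}$$
Phase response:
$$\phi(\omega) = \arg H(e^{j\omega}) = \arctan\frac{\operatorname{Im}\,H(e^{j\omega})}{\operatorname{Re}\,H(e^{j\omega})}$$
Group delay function:
$$\tau(\omega) = -\frac{d\phi(\omega)}{d\omega}$$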
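For a finite causal impulse response, these quantities can be evaluated numerically. The following is a sketch under that assumption; the function names are hypothetical, and the group delay is approximated by a central difference of the phase, which is only reliable away from phase jumps.

```python
import cmath

def freq_response(h, omega):
    """H(e^{j omega}) = sum_k h[k] * exp(-j*omega*k) for a finite causal h."""
    return sum(hk * cmath.exp(-1j * omega * k) for k, hk in enumerate(h))

def magnitude_and_phase(h, omega):
    """Return (|H|, arg H) at the given frequency in radians per sample."""
    H = freq_response(h, omega)
    return abs(H), cmath.phase(H)

def group_delay(h, omega, d=1e-5):
    """Approximate -d(phi)/d(omega) with a central difference (assumes no phase wrap)."""
    p_hi = cmath.phase(freq_response(h, omega + d))
    p_lo = cmath.phase(freq_response(h, omega - d))
    return -(p_hi - p_lo) / (2 * d)

h = [0.5, 0.5]                       # two-point averager, used only as an example
print(magnitude_and_phase(h, cmath.pi / 2))
print(group_delay(h, cmath.pi / 2))  # close to 0.5 samples
```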
![Page 43: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/43.jpg)
Comments on symmetry properties
For LTI systems with a real-valued impulse response, the magnitude response, the phase response, and the real and imaginary components of H(e^{jω}) possess these symmetry properties:
The real component: an even function of ω, periodic with period 2π
$$\operatorname{Re}\,H(e^{j\omega}) = \operatorname{Re}\,H(e^{-j\omega})$$
The imaginary component: an odd function of ω, periodic with period 2π
$$\operatorname{Im}\,H(e^{j\omega}) = -\operatorname{Im}\,H(e^{-j\omega})$$
![Page 44: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/44.jpg)
The magnitude response: an even function of ω, periodic with period 2π
$$|H(e^{j\omega})| = |H(e^{-j\omega})|$$
The phase response: an odd function of ω, periodic with period 2π
$$\arg H(e^{j\omega}) = -\arg H(e^{-j\omega})$$
Consequence:
If we know |H(e^{jω})| and φ(ω) for 0 ≤ ω ≤ π, we can describe these functions (i.e. also H(e^{jω})) for all values of ω.
![Page 45: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/45.jpg)
Symmetry Properties
[Figure: magnitude response |H(e^{jω})| (even) and phase response φ(ω) (odd), plotted against ω from −4π to 4π]
![Page 46: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/46.jpg)
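Normalized Frequency
It is often desirable to express the frequency response of an LTI system in units of frequency that involve the sampling interval T. In this case, the expressions
$$H(e^{j\omega}) = \sum_{k=-\infty}^{\infty} h(k)\,e^{-j\omega k}$$
$$h(n) = \frac{1}{2\pi}\int_{-\pi}^{\pi} H(e^{j\omega})\,e^{j\omega n}\,d\omega$$
are modified to the form
$$H(e^{j\Omega T}) = \sum_{k=-\infty}^{\infty} h(kT)\,e^{-j\Omega kT}$$
$$h(nT) = \frac{T}{2\pi}\int_{-\pi/T}^{\pi/T} H(e^{j\Omega T})\,e^{j\Omega nT}\,d\Omega$$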
![Page 47: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/47.jpg)
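H(e^{jΩT}) is periodic with period 2π/T = 2πF, where F is the sampling frequency.
Solution: normalized frequency approach, in which F/2 is mapped to π.
Example:
F = 100 kHz, so F/2 = 50 kHz → π
f1 = 20 kHz:
$$\omega_1 = \frac{20\times 10^3}{50\times 10^3}\,\pi = \frac{2\pi}{5} = 0.4\pi$$
f2 = 25 kHz:
$$\omega_2 = \frac{25\times 10^3}{50\times 10^3}\,\pi = \frac{\pi}{2} = 0.5\pi$$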
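The normalized-frequency example can be checked with a one-line conversion. The helper name below is hypothetical; it maps half the sampling frequency to π radians per sample.

```python
import math

def normalized_omega(f_hz, fs_hz):
    """Map an analog frequency to radians/sample, with fs/2 mapped to pi."""
    return math.pi * f_hz / (fs_hz / 2)

fs = 100e3
for f in (20e3, 25e3):
    # Print the result as a multiple of pi
    print(f, normalized_omega(f, fs) / math.pi)
```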
![Page 48: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/48.jpg)
Discrete Time Fourier Transform (DTFT)
![Page 49: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/49.jpg)
Discrete Time Fourier Transform
• Continuous-time Fourier transform, when the signal is sampled:
• Assuming
• Discrete-Time Fourier Transform (DTFT):
$$X(e^{j\omega}) = \sum_{n=-\infty}^{\infty} x[n]\,e^{-j\omega n}$$
The DTFT is periodic in frequency with period 2π.
![Page 50: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/50.jpg)
Example
![Page 51: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/51.jpg)
![Page 52: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/52.jpg)
DISCRETE FOURIER TRANSFORM (DFT)
![Page 53: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/53.jpg)
x[n] is the discrete signal; its FT is Xs(f)
[Figure: x[n] and X(f)]
Frequency-domain sampling of the FT → DFT
![Page 54: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/54.jpg)
DFT:
$$X[k] = \sum_{n=0}^{N-1} x[n]\,W_N^{kn}, \qquad k = 0, 1, 2, \ldots, N-1$$
IDFT:
$$x[n] = \frac{1}{N}\sum_{k=0}^{N-1} X[k]\,W_N^{-kn}, \qquad n = 0, 1, 2, \ldots, N-1$$
where W_N is defined as
$$W_N = e^{-j2\pi/N}$$
The DFT is the set of N samples {X[k]} of the Fourier transform X(ω) for a finite-duration sequence {x[n]} of length L ≤ N. The sampling of X(ω) occurs at the N equally spaced frequencies ω_k = 2πk/N, k = 0, 1, 2, …, N−1.
![Page 55: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/55.jpg)
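$$e^{-j\theta} = \cos\theta - j\sin\theta$$
$$X[k] = \sum_{n=0}^{N-1} x[n]\,W_N^{kn} = \sum_{n=0}^{N-1} x[n]\,e^{-j2\pi kn/N} = \sum_{n=0}^{N-1} x[n]\left[\cos(2\pi kn/N) - j\sin(2\pi kn/N)\right]$$
Frequency of bin k: f(k) = k·fs/N
$$X[k] = X_{real}[k] + j\,X_{imag}[k]$$
Power:
$$|X[k]|^2 = X_{real}^2[k] + X_{imag}^2[k]$$
Phase:
$$X_{\phi}[k] = \tan^{-1}\frac{X_{imag}[k]}{X_{real}[k]}$$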
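A direct (O(N²)) evaluation of the DFT and IDFT definitions is sketched below. It is meant only to mirror the formulas; the function names are hypothetical, and the kernel assumes W_N = e^{−j2π/N}.

```python
import cmath

def dft(x):
    """X[k] = sum_n x[n] * exp(-j*2*pi*k*n/N), k = 0..N-1."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    """x[n] = (1/N) * sum_k X[k] * exp(+j*2*pi*k*n/N), n = 0..N-1."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

def bin_frequency(k, fs, N):
    """Analysis frequency of bin k in Hz: f(k) = k * fs / N."""
    return k * fs / N

X = dft([1.0, 2.0, 3.0, 4.0])
power = [abs(v) ** 2 for v in X]        # X_real^2 + X_imag^2
phase = [cmath.phase(v) for v in X]     # atan2(X_imag, X_real)
print(X[0], bin_frequency(1, 8000.0, 4))
```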
![Page 56: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/56.jpg)
Properties of DFT
Periodicity: if x[n] and X[k] are an N-point DFT pair, then
x[n+N] = x[n] for all n
X[k+N] = X[k] for all k
Linearity: if x1(n) ↔ X1(k) and x2(n) ↔ X2(k) are N-point DFT pairs, then
a1·x1(n) + a2·x2(n) ↔ a1·X1(k) + a2·X2(k)
Symmetry:
![Page 57: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/57.jpg)
DFT Shifting theorem:
$$X_{shifted}[k] = e^{-j2\pi kl/N}\,X[k]$$
where x[n] is (circularly) shifted to the right by l samples.
DFT Leakage:
DFT Resolution, Zero stuffing:
DFT Magnitudes:
When a real input signal contains a sine wave component of peak amplitude A0 with an integral number of cycles over N input samples, the output magnitude of the DFT for that particular sine wave is Mr, where
Mr = A0·N/2
If the DFT input is a complex sinusoid of magnitude A0 with an integral number of cycles over N input samples, the output magnitude of the DFT is Mc, where
Mc = A0·N
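The two magnitude rules can be checked numerically. This sketch assumes N = 8 samples, A0 = 2 and exactly one cycle over the record; the helper `dft_bin` is a hypothetical name for a single-bin DFT.

```python
import cmath
import math

def dft_bin(x, k):
    """Single DFT bin: X[k] = sum_n x[n] * exp(-j*2*pi*k*n/N)."""
    N = len(x)
    return sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))

N, A0 = 8, 2.0
real_sine = [A0 * math.sin(2 * math.pi * n / N) for n in range(N)]
complex_sine = [A0 * cmath.exp(2j * math.pi * n / N) for n in range(N)]

print(abs(dft_bin(real_sine, 1)))     # approximately A0*N/2 = 8
print(abs(dft_bin(complex_sine, 1)))  # approximately A0*N = 16
```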
![Page 58: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/58.jpg)
Fast Fourier Transform
Consider the computation of an N = 2^v point DFT. Split the N-point data sequence into two N/2-point data sequences f1(n) and f2(n), corresponding to the even-numbered and odd-numbered samples of x(n):
f1(n) = x(2n), f2(n) = x(2n+1), n = 0, 1, …, N/2 − 1
Thus f1(n) and f2(n) are obtained by decimating x(n) by a factor of 2, and hence the resulting FFT algorithm is called a decimation-in-time algorithm:
$$X(k) = \sum_{n=0}^{N-1} x(n)\,W_N^{kn} = \sum_{n\ \mathrm{even}} x(n)\,W_N^{kn} + \sum_{n\ \mathrm{odd}} x(n)\,W_N^{kn}, \qquad k = 0, 1, \ldots, N-1$$
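The even/odd split leads to a recursive algorithm. Below is a minimal radix-2 decimation-in-time sketch, assuming the input length is a power of two; it is written for clarity rather than speed, and the function name is hypothetical.

```python
import cmath

def fft_dit(x):
    """Recursive radix-2 decimation-in-time FFT (len(x) must be a power of 2)."""
    N = len(x)
    if N == 1:
        return [complex(x[0])]
    if N % 2:
        raise ValueError("length must be a power of 2")
    F1 = fft_dit(x[0::2])  # DFT of even-numbered samples f1(n) = x(2n)
    F2 = fft_dit(x[1::2])  # DFT of odd-numbered samples  f2(n) = x(2n+1)
    X = [0j] * N
    for k in range(N // 2):
        t = cmath.exp(-2j * cmath.pi * k / N) * F2[k]  # twiddle factor W_N^k
        X[k] = F1[k] + t
        X[k + N // 2] = F1[k] - t
    return X

print(fft_dit([1, 2, 3, 4]))
```

Each stage combines two half-length DFTs with N/2 twiddle multiplications, which is where the N·log2(N) operation count comes from.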
![Page 59: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/59.jpg)
![Page 60: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/60.jpg)
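Decimation in Frequency
$$X(k) = \sum_{n=0}^{N/2-1} x(n)\,W_N^{kn} + \sum_{n=N/2}^{N-1} x(n)\,W_N^{kn}$$
$$X(k) = \sum_{n=0}^{N/2-1} x(n)\,W_N^{kn} + W_N^{kN/2}\sum_{n=0}^{N/2-1} x\!\left(n+\tfrac{N}{2}\right)W_N^{kn}$$
Now
$$W_N^{kN/2} = (-1)^k$$
$$X(k) = \sum_{n=0}^{N/2-1}\left[x(n) + (-1)^k\,x\!\left(n+\tfrac{N}{2}\right)\right]W_N^{kn}$$
For the even-indexed and odd-indexed outputs:
$$X(2k) = \sum_{n=0}^{N/2-1}\left[x(n) + x\!\left(n+\tfrac{N}{2}\right)\right]W_{N/2}^{kn}$$
$$X(2k+1) = \sum_{n=0}^{N/2-1}\left\{\left[x(n) - x\!\left(n+\tfrac{N}{2}\right)\right]W_N^{n}\right\}W_{N/2}^{kn}$$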
![Page 61: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/61.jpg)
![Page 62: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/62.jpg)
![Page 63: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/63.jpg)
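$$W_N = e^{-j2\pi/N} = \cos(2\pi/N) - j\sin(2\pi/N)$$
[Butterfly diagram: inputs x0+jy0 and x1+jy1, twiddle multiplier C+j(−S), a −1 branch, outputs x'0+jy'0 and x'1+jy'1]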
![Page 64: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/64.jpg)
![Page 65: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/65.jpg)
[Butterfly diagram: inputs x0+jy0 and x1+jy1, a −1 branch followed by the twiddle multiplier C+j(−S), outputs x'0+jy'0 and x'1+jy'1]
![Page 66: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/66.jpg)
Frequency spectrum
![Page 67: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/67.jpg)
Spectrogram
![Page 68: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/68.jpg)
Digital Filter
![Page 69: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/69.jpg)
![Page 70: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/70.jpg)
![Page 71: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/71.jpg)
![Page 72: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/72.jpg)
A finite impulse response (FIR) filter is a discrete linear time-invariant system whose output is the weighted sum of a finite number of present and past inputs:
$$y(n) = \sum_{k=0}^{M-1} b_k\,x(n-k)$$
![Page 73: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/73.jpg)
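[Block diagram: tapped delay line. The input x(n) passes through a chain of unit delays z⁻¹; the taps are multiplied by h(0), h(1), h(2), …, h(M−2), h(M−1) and summed to give y(n)]
$$y(n) = \sum_{k=0}^{M-1} h(k)\,x(n-k)$$
$$y(n) = h(0)x(n) + h(1)x(n-1) + \cdots + h(M-2)x(n-M+2) + h(M-1)x(n-M+1)$$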
![Page 74: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/74.jpg)
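Time domain: the filter output is the convolution y(n) = h(k) * x(n) of the input x(n) with the coefficients h(k).
Frequency domain: the filter output is the product Y(m) = H(m)·X(m).
The two descriptions are related by the DFT and the IDFT:
$$y(n) = h(k) * x(n) \quad \underset{\text{IDFT}}{\overset{\text{DFT}}{\rightleftarrows}} \quad H(m)\cdot X(m)$$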
![Page 75: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/75.jpg)
Algebraic determination of the time-domain coefficients of a low-pass filter
1. Develop an expression for the discrete frequency response H(m)
2. Apply that expression to the inverse DFT equation to get the time-domain h(k)
3. Evaluate that h(k) expression as a function of time
Let Hd(ω) be the frequency response of a low-pass filter with unit sample response hd(n):
$$H_d(\omega) = \sum_{n=0}^{\infty} h_d(n)\,e^{-j\omega n}$$
$$h_d(n) = \frac{1}{2\pi}\int_{-\pi}^{\pi} H_d(\omega)\,e^{j\omega n}\,d\omega$$
The unit sample response obtained from the above equation is infinite in duration and must be truncated at some point, say n = M−1, for an FIR filter of length M. This truncation is equivalent to multiplying hd(n) by a window function w(n):
h(n) = hd(n)·w(n)
![Page 76: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/76.jpg)
Window Function for FIR Filter Design
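| Name of window | Window function w(n), 0 ≤ n ≤ M−1 |
| --- | --- |
| Bartlett (triangular) | $1 - \dfrac{2\left|n - \frac{M-1}{2}\right|}{M-1}$ |
| Blackman | $0.42 - 0.5\cos\dfrac{2\pi n}{M-1} + 0.08\cos\dfrac{4\pi n}{M-1}$ |
| Hamming | $0.54 - 0.46\cos\dfrac{2\pi n}{M-1}$ |
| Hanning | $\dfrac{1}{2}\left(1 - \cos\dfrac{2\pi n}{M-1}\right)$ |
| Kaiser | $\dfrac{I_0\!\left[\alpha\sqrt{\left(\frac{M-1}{2}\right)^2 - \left(n - \frac{M-1}{2}\right)^2}\right]}{I_0\!\left[\alpha\left(\frac{M-1}{2}\right)\right]}$ |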
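The fixed windows in the table can be generated directly from their formulas. The sketch below covers the Bartlett, Blackman, Hamming and Hanning windows of length M ≥ 2; the Kaiser window is left out because it needs the modified Bessel function I0. Function names are illustrative.

```python
import math

def bartlett(M):
    """Triangular window: 1 - 2*|n - (M-1)/2| / (M-1)."""
    return [1 - 2 * abs(n - (M - 1) / 2) / (M - 1) for n in range(M)]

def blackman(M):
    return [0.42 - 0.5 * math.cos(2 * math.pi * n / (M - 1))
            + 0.08 * math.cos(4 * math.pi * n / (M - 1)) for n in range(M)]

def hamming(M):
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (M - 1)) for n in range(M)]

def hanning(M):
    return [0.5 * (1 - math.cos(2 * math.pi * n / (M - 1))) for n in range(M)]

print(bartlett(5))   # [0.0, 0.5, 1.0, 0.5, 0.0]
print(hamming(5))
```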
![Page 77: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/77.jpg)
Common Windows (Frequency)
![Page 78: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/78.jpg)
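Example
Let the frequency response of a low-pass filter be
$$H_d(\omega) = \begin{cases} e^{-j\omega(M-1)/2}, & 0 \le |\omega| \le \omega_c \\ 0, & \text{otherwise} \end{cases}$$
A delay of (M−1)/2 units is incorporated into Hd(ω) in anticipation of forcing the filter to be of length M.
$$h_d(n) = \frac{1}{2\pi}\int_{-\omega_c}^{\omega_c} e^{j\omega\left(n - \frac{M-1}{2}\right)}\,d\omega$$
$$h(n) = \frac{\sin\omega_c\!\left(n - \frac{M-1}{2}\right)}{\pi\left(n - \frac{M-1}{2}\right)}, \qquad 0 \le n \le M-1,\ n \ne \frac{M-1}{2}$$
$$h\!\left(\frac{M-1}{2}\right) = \frac{\omega_c}{\pi}$$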
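The example can be turned into a small design routine. This sketch evaluates the truncated ideal response h(n) and, as one possible choice, applies a Hamming window; the function names and the choice of window are assumptions made for illustration.

```python
import math

def ideal_lowpass(M, wc):
    """Truncated ideal low-pass response of length M with cutoff wc (rad/sample)."""
    center = (M - 1) / 2
    h = []
    for n in range(M):
        if n == center:                      # only possible when M is odd
            h.append(wc / math.pi)
        else:
            h.append(math.sin(wc * (n - center)) / (math.pi * (n - center)))
    return h

def hamming_lowpass(M, wc):
    """h(n) = hd(n) * w(n) with a Hamming window (an illustrative choice)."""
    hd = ideal_lowpass(M, wc)
    return [hd[n] * (0.54 - 0.46 * math.cos(2 * math.pi * n / (M - 1)))
            for n in range(M)]

h = hamming_lowpass(21, 0.4 * math.pi)
print(len(h), h[10])   # centre tap equals wc/pi = 0.4 (window value is 1 there)
```

The coefficients are symmetric about n = (M−1)/2, which gives the filter linear phase.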
![Page 79: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/79.jpg)
| Type of window | Approximate transition width of main lobe | Peak sidelobe (dB) |
| --- | --- | --- |
| Rectangular | 4π/M | −13 |
| Bartlett | 8π/M | −27 |
| Hanning | 8π/M | −32 |
| Hamming | 8π/M | −43 |
| Blackman | 12π/M | −58 |
![Page 80: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/80.jpg)
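Band-pass FIR filter design:
$$h_{bp}(k) = h_{lp}(k)\cdot s_{shift}(k)$$
High-pass FIR filter design:
$$h_{hp}(k) = h_{lp}(k)\cdot s_{shift}(k)$$
where h_lp(k) are the low-pass filter coefficients and s_shift(k) is a sinusoidal shifting sequence.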
![Page 81: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/81.jpg)
IIR Design Methods
• Impulse invariant transformation: match the analog impulse response by sampling; the resulting frequency response is an aliased version of the analog frequency response
• Bilinear transformation: use a transformation to map an analog filter to a digital filter by warping the analog frequency scale (0 to infinity) to the digital frequency scale (0 to π); use frequency pre-warping to preserve critical frequencies of the transformation (i.e., filter cutoff frequencies)
![Page 82: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/82.jpg)
ET60007 © CET, IITKGP
Human Speech Production and Source Filter Model
![Page 83: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/83.jpg)
Input Information
• Linguistic: lexical, syntactic, semantic, pragmatic
• Para-linguistic: intentional, attitudinal, stylistic
• Non-linguistic: physical, emotional
Stages: Message Planning → Utterance Planning → Motor Command Generation → Speech Sound Production
Constraints: Rules of Grammar, Rules of Prosody, Physiological Constraints, Physiological Constraints
Output: segmental and suprasegmental features of speech
Information manifestation in the segmental and suprasegmental features of speech
@ Fujisaki
![Page 84: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/84.jpg)
Silence Unvoiced
Voiced
State of the speech production source
![Page 85: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/85.jpg)
Basic Definitions
• Speech is composed of a sequence of sounds
• Sounds (phonemes) serve as a symbolic representation of information to be shared between humans (or humans and machines)
• The arrangement of sounds is governed by the rules of language (constraints on sound sequences, word sequences, etc.)
• Linguistics is the study of the rules of language
• Phonetics is the study of the sounds of speech
![Page 86: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/86.jpg)
Vocal tract: dotted lines in figure; begins at the glottis (the vocal cords) and ends at the lips
• Consists of the pharynx (the connection from the esophagus to the mouth) and the mouth itself (the oral cavity)
• Average male vocal tract length is 17.5 cm
• Cross-sectional area, determined by the positions of the tongue, lips, jaw and velum, varies from zero (complete closure) to 20 sq cm
Nasal tract: begins at the velum and ends at the nostrils
Velum: a trapdoor-like mechanism at the back of the mouth cavity; lowers to couple the nasal tract to the vocal tract to produce nasal sounds like /m/ (mom), /n/ (night), /ng/ (sing)
![Page 87: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/87.jpg)
MRI of the human speech production system (Prof. Shri Narayanan, USC)
![Page 88: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/88.jpg)
Schematic representation of the physiological mechanism of speech production
[Diagram labels: lung volume, trachea tube, larynx tube, vocal cords, pharynx cavity, tongue hump, velum, mouth cavity, nasal cavity, mouth output, nose output]
The lungs and associated muscles act as the source of air for exciting the vocal mechanism.
1. When the vocal cords are tensed, the air flow causes them to vibrate, producing voiced sound.
2. When the vocal cords are relaxed, a sound can be produced in one of two ways. The air flow can pass through a constriction in the vocal tract and thereby become turbulent, producing unvoiced sound. Alternatively, pressure can build up behind a point of total closure within the vocal tract; when the closure is opened, the pressure is abruptly released, causing a brief transient sound.
![Page 89: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/89.jpg)
![Page 90: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/90.jpg)
Bernoulli Oscillation
Tensed Vocal Cords: Ready to Vibrate
Vocal Cords: Open for Breathing
![Page 91: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/91.jpg)
Front
Back
Vocal Folds
Glottal slit
Cricoid Cartilage
Arytenoid Cartilage
Thyroid Cartilage
Breathing
Voiced
Unvoiced
![Page 92: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/92.jpg)
Glottal Flow
![Page 93: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/93.jpg)
The Vocal Tract
• The shape of the vocal tract transforms raw sound from the vocal folds into recognizable sounds.
![Page 94: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/94.jpg)
![Page 95: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/95.jpg)
![Page 96: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/96.jpg)
Women and Men
• The acoustics of male and female vowels differ reliably along two different dimensions:
1. Sound Source
2. Sound Filter
• Source (F0): depends on the length of the vocal folds
Shorter in women → higher average F0
Longer in men → lower average F0
• Filter (formants): depends on the length of the vocal tract
Shorter in women → higher formant frequencies
Longer in men → lower formant frequencies
![Page 97: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/97.jpg)
![Page 98: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/98.jpg)
What is a Formant?
To identify dissimilar sounds, i.e., vowels, the ear is more sensitive to peaks in the signal spectrum. These resonant peaks in the spectrum are called formants.
Formants are the characteristic partials that identify vowels to the listener.
The formant with the lowest frequency is called F1, the second F2, and the third F3. F1 and F2 are enough to disambiguate the vowel.
Spectrographic view of vowel /i/
![Page 99: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/99.jpg)
![Page 100: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/100.jpg)
Articulatory and Acoustic phonetics
![Page 101: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/101.jpg)
Manners and Place of Articulation
Place of articulation: During articulation, the airstream through the vocal tract must be obstructed in some way. The place where the obstruction takes place is called the place of articulation.
Manner of articulation: Manner of articulation is concerned with airflow: the paths it takes and the degree to which it is impeded by vocal tract constrictions.
Consonants are classified by the place of obstruction and the manner of articulation.
Example: /k/ Velar, un-aspirated, unvoiced stop
Vowel sounds are specified in terms of the position of the tongue and the position of the lips.
Example: /i/ High, front, un-rounded
![Page 102: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/102.jpg)
If the glottis is closed (the vocal cords are brought together so that they vibrate), the sound is voiced; if it is open, the sound is unvoiced (voiceless).
Manners of Articulation due to State of the Glottis
![Page 103: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/103.jpg)
Place of articulation
a. Bilabial: Bilabial sounds are produced when the two lips make the constriction.
b. Labiodental: These sounds are produced by contact of the lower lip with the upper teeth.
c. Dental: Dental sounds are produced by a constriction of the tip or blade of the tongue with the upper teeth.
d. Alveolar: A sound made by the tip or blade of the tongue in contact with the alveolar ridge, the bony prominence immediately behind the upper teeth.
e. Post alveolar: A sound articulated by the tip or blade of the tongue with the back area of the alveolar ridge.
f. Retroflex: Retroflex sounds are made when the tip of the tongue is curled back in the direction of the front part of the hard palate, in other words just behind the alveolar ridge. Depending on how far the tongue curls back, a retroflex sound can be apico-postalveolar or apico-palatal.
![Page 104: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/104.jpg)
g. Palatal: A sound produced when the constriction is made by the front part of the tongue with the hard palate.
h. Velar: A sound made by the back of the tongue against the soft palate.
i. Uvular: A sound produced when the back of the tongue touches the uvula.
j. Pharyngeal: A sound produced in the pharynx, the tubular cavity that constitutes the throat above the larynx.
k. Glottal: Sounds made in the larynx by the closure or narrowing of the glottis.
![Page 105: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/105.jpg)
Manner of articulation
a) Plosive, or stop
b) Nasal stop
c) Fricative
d) Affricate
e) Lateral
f) Approximant
g) Trill
h) Flap and Tap
1. Voiced
2. Unvoiced
3. Aspiration
![Page 106: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/106.jpg)
• Vowels: a) Oral vowels b) Nasal vowels
• Diphthongs: A diphthong is a gliding monosyllabic speech sound that starts at or near the articulatory position for one vowel and moves to or toward the position for another.
• Semivowels: Semivowels are vowel-like in nature. They are generally characterized by a gliding transition in the vocal tract area function between adjacent phonemes.
• Consonants: a) Nasal consonants b) Unvoiced fricatives c) Voiced fricatives d) Voiced and unvoiced stops (plosives)
Classification of sound in linguistically distinct speech (phonemes)
![Page 107: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/107.jpg)
![Page 108: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/108.jpg)
![Page 109: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/109.jpg)
![Page 110: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/110.jpg)
The Vowel Space
![Page 111: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/111.jpg)
Different Vowels, Different Formants
• The formant frequencies of a neutral, schwa-like vowel resemble the resonant frequencies of a tube that is open at one end.
• For the average man (ref: Peter Ladefoged):
• F1 = 500 Hz
• F2 = 1500 Hz
• F3 = 2500 Hz
• However, we can change the shape of the vocal tract to get different resonant frequencies.
• Vowels may be defined in terms of their characteristic resonant frequencies (formants).
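The average-male values above are what a simple idealization predicts. The sketch below assumes a uniform tube that is closed at the glottis and open at the lips, uses the 17.5 cm average male vocal tract length quoted earlier, and assumes a speed of sound of 350 m/s. The function name `tube_formants`, the speed-of-sound value, and the 14 cm comparison length are illustrative assumptions, not taken from the slides.

```python
def tube_formants(length_cm, n_formants=3, speed_cm_per_s=35000.0):
    """Resonances of a uniform tube closed at one end and open at the other.

    Quarter-wavelength resonator: F_n = (2n - 1) * c / (4 * L), n = 1, 2, 3, ...
    speed_cm_per_s is an assumed speed of sound (350 m/s).
    """
    return [(2 * n - 1) * speed_cm_per_s / (4.0 * length_cm)
            for n in range(1, n_formants + 1)]


# 17.5 cm tube: matches the 500 / 1500 / 2500 Hz pattern listed above.
print(tube_formants(17.5))  # [500.0, 1500.0, 2500.0]

# A shorter (hypothetical 14 cm) tube raises every resonance.
print(tube_formants(14.0))  # [625.0, 1875.0, 3125.0]
```

Because each resonance is inversely proportional to the tube length, a shorter vocal tract shifts all formants upward, while changing the tube's shape (not modelled here) moves them individually.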
![Page 112: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/112.jpg)
Articulatory description of Vowels
Vowels have traditionally been described according to following pseudo-articulatory parameters:
1. Height (of tongue) (F1)
2. Front/Back (of tongue)(F2)
3. Rounding (of lips)
![Page 113: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/113.jpg)
Lip rounding
![Page 114: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/114.jpg)
Velum closed
Vocal Cord Open
Nasal Passage
Place of Articulation (Velar)
Back of tongue (Articulator)
![Page 115: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/115.jpg)
Vocal Cord Closed
Velum closed
Place of Articulation (Velar)
Back of tongue (Articulator)
![Page 116: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/116.jpg)
Vocal Cord Open
Velum closed
Nasal Passage
Place of Articulation (Post-alveolar)
Tongue tip curled back (Articulator)
![Page 117: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/117.jpg)
Vocal Cord Closed
Velum closed
Nasal Passage
Place of Articulation (Post-alveolar)
Tongue tip curled back (Articulator)
![Page 118: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/118.jpg)
Upper palate
Velum closed
Vocal Cords Open
Tongue Tip (Articulator)
Place of Articulation (Dental)
![Page 119: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/119.jpg)
Upper palate
Velum closed
Vocal Cords Open
Both the Lips (Articulator)
Place of Articulation (Bilabial)
![Page 120: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/120.jpg)
Velum closed
Vocal Cords Open
Place of Constriction (Post alveolar)
![Page 121: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/121.jpg)
Velum closed
Vocal Cords Open
Place of Constriction (Post alveolar)
![Page 122: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/122.jpg)
Place of Constriction (Post alveolar)
Place of release (Alveolar)
Velum closed
Vocal Cords Open
![Page 123: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/123.jpg)
Place of Articulation (Bilabial)
Vocal Cords Closed
Velum Open
![Page 124: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/124.jpg)
![Page 125: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/125.jpg)
English
Bengal
![Page 126: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/126.jpg)
![Page 127: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/127.jpg)
![Page 128: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/128.jpg)
![Page 129: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/129.jpg)
![Page 130: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/130.jpg)
![Page 131: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/131.jpg)
![Page 132: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/132.jpg)
![Page 133: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/133.jpg)
![Page 134: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/134.jpg)
![Page 135: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/135.jpg)
Time Domain Shape
![Page 136: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/136.jpg)
F1 and F2 are primarily determined by the position of the tongue. F1 has a higher frequency when the tongue is lowered, and F2 has a higher frequency when the tongue is moved forward.
Vowels are classified according to the height and position of the tongue inside the mouth.
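The relation between formants and tongue position can be sketched in a few lines. This is only a crude illustration: it assumes the neutral-tube values of 500 Hz (F1) and 1500 Hz (F2) for an average male as dividing lines, which is a simplification chosen for this example. The function name `describe_vowel` and the input formant values are hypothetical; real vowel classification needs speaker-dependent references and more than two height levels.

```python
NEUTRAL_F1_HZ = 500.0   # assumed dividing line: neutral-tube F1, average male
NEUTRAL_F2_HZ = 1500.0  # assumed dividing line: neutral-tube F2, average male


def describe_vowel(f1_hz, f2_hz):
    """Return a rough (height, position) description from F1 and F2.

    Higher F1 means a lower tongue; higher F2 means a more fronted tongue.
    """
    height = "low (open)" if f1_hz > NEUTRAL_F1_HZ else "high (close)"
    position = "front" if f2_hz > NEUTRAL_F2_HZ else "back"
    return height, position


# Hypothetical measurements, for illustration only.
print(describe_vowel(300.0, 2300.0))  # ('high (close)', 'front'), an /i/-like vowel
print(describe_vowel(750.0, 1100.0))  # ('low (open)', 'back')
```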
Position of Bangla Vowels in Cardinal Vowel Diagram
Classification of vowels
Bangla vowels
![Page 137: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/137.jpg)
Vowel-vowel Combination
Semi-vowel or Glide
Hiatus
Diphthong
![Page 138: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/138.jpg)
A. In continuous speech, two vowels can come together in two different situations.
B. They may be in a single word.
C. They may be part of two adjacent words, i.e., one word ends with a vowel and the next word starts with a vowel.
D. If the two vowels are within a single word, they may either be in two distinct syllables or merge into one syllable.
Vowel-Vowel Combination
Examples :
![Page 139: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/139.jpg)
A diphthong is a monosyllabic vowel combination involving a quick but smooth movement from one vowel to another, often interpreted by listeners as a single vowel sound or phoneme.
It is a sequence of two different or same vowels that are part of a single syllable. Usually one of the vowels is stronger than the other.
Examples:
Diphthong
Bangla Word :Bangla Word :
![Page 140: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/140.jpg)
When two vowels come together without any contraction or elision and are pronounced separately, as distinct from diphthongs, they are termed a hiatus.
Hiatus may be of two types:
1) Internal hiatus, which occurs within a word.
Example:
Hiatus
2) External hiatus, which refers to the break between two successive words. In this situation the first word ends with a vowel and the second word starts with a vowel.
Example:
Bangla Word :
Bangla Sentence :
![Page 141: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/141.jpg)
Semi-vowel refers to a sound functioning as a consonant but lacking the PHONETIC characteristics normally associated with consonants.
Its QUALITY is phonetically that of a vowel, though its DURATION is much less than that typical of a vowel.
Examples:
Semi-vowel or Glide
Bangla Word :
Bangla Word :
Bangla Word :
![Page 142: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/142.jpg)
Steady State Duration of /a/
Transition with semi-vowel (j)
Semi-vowel after a vowel
Vowel-semivowel combination (V-j) consists of the transitional duration with the semivowel along with the preceding vowel's steady state duration.
Spectrographic View of Bangla Word with V-j combination
![Page 143: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/143.jpg)
Steady State Duration of /o/
Steady State Duration of /o/
Transition with Semivowel /j/
Vowel-semivowel-vowel combination consists of the transitional duration with the semivowel along with the preceding and succeeding vowels' steady state durations.
Semi-vowel in between two vowels
Spectrographic View of Bangla Word ( ) with V-j-V combination
![Page 144: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/144.jpg)
![Page 145: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/145.jpg)
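Consonants

| S/N | Place of Articulation | Manner of Articulation | Unvoiced, Un-aspirated | Unvoiced, Aspirated | Voiced, Un-aspirated | Voiced, Aspirated |
|---|---|---|---|---|---|---|
| 1 | Velar | Stop | /k/ | /kh/ | /g/ | /gh/ |
| 2 | Post-alveolar (Retroflex) | Stop | /ʈ/ | /ʈʰ/ | /ɖ/ | /ɖʰ/ |
| 3 | Dental | Stop | /t/ | /th/ | /d/ | /dh/ |
| 4 | Bilabial | Stop | /p/ | /ph/ | /b/ | /bh/ |
| 5 | Alveolar - Post alveolar | Affricate | /ʧ/ | /ʧʰ/ | /ʤ/ | /ʤh/ |

| S/N | Place of Articulation | Manner of Articulation | Phoneme(s) |
|---|---|---|---|
| 6 | Alveolar | Fricative | /s/ |
| 7 | Post alveolar | Fricative | /ʃ/ |
| 8 | Glottal | Fricative | /h/ // |
| 9 | Velar | Nasal Murmur | /ŋ/ |
| 10 | Palatal | Nasal Murmur | /ɳ/ |
| 11 | Dental | Nasal Murmur | /n/ |
| 12 | Bilabial | Nasal Murmur | /m/ |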
![Page 146: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/146.jpg)
| S/N | Place of Articulation | Manner of Articulation | Phoneme(s) |
|---|---|---|---|
| 13 | Dental | Lateral | /l/ |
| 14 | Alveolar | Trill | /r/ |
| 15 | Post alveolar | Retroflex Flap | /ɽ/ (un-aspirated), /ɽh/ (aspirated) |
| 16 | Palatal | Approximant | /j/ |
| 17 | Bilabial | Approximant | /w/ |

Vowel

| S/N | Vowel type | Height, rounding | Phoneme |
|---|---|---|---|
| 1 | Back vowel | Close, Rounded | /u/ |
| 2 | Back vowel | Close-mid, Rounded | /o/ |
| 3 | Back vowel | Open, Rounded | /ɔ/ |
| 4 | Front vowel | Open, Unrounded | /a/ |
| 5 | Front vowel | Open-mid, Unrounded | /æ/ |
| 6 | Front vowel | Close-mid, Unrounded | /e/ |
| 7 | Front vowel | Close, Unrounded | /i/ |
![Page 147: Dr. Shyamal Kumar Das Mandal Assistant Professor](https://reader031.vdocuments.mx/reader031/viewer/2022012012/61da404772fcfd578d19c69c/html5/thumbnails/147.jpg)
TUTORIAL
1. Write the place and manner of articulation of the following phonemes: /k/, /g/, /u/, /gh/, /ɽ/, /ʃ/
2. Write out the phonetic transcription for the following words: /she/, /phonetic/, /marks/, /speech/.
How many syllables are present in each of the above words?
3. Draw a schematic representation of the physiological mechanism of the speech production system and explain how a voiced sound is produced.
4. A voice-operated lift is designed using the following words: a. stop, b. up, c. down, d. floor, e. first, f. second, g. third, h. fourth, and i. ground.
Figure 1 shows wideband spectrograms of one version of each of these words. Using your knowledge of acoustic phonetics, determine which wideband spectrogram corresponds to which word.
5. The following waveform is for the utterance /kolkata/, sampled at a rate of FS = 22050 Hz. Segment the waveform into regions of "Voiced Speech (V)" and "Non-Voiced Speech (N)".
6. Which formant frequency is related to tongue height, and which formant is related to tongue position?
7. Why does a child's speech have a higher F0 and higher formant frequencies than an adult's?
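For the voiced / non-voiced segmentation in question 5, one common starting point (not necessarily the method intended in this course) is to label short frames using short-time energy and zero-crossing rate: voiced frames tend to have high energy and few zero crossings. The sketch below is a minimal illustration on a synthetic signal. The function name `segment_voicing`, the 20 ms frame length, and both thresholds are assumptions that would need tuning on real recordings.

```python
import math


def segment_voicing(samples, fs, frame_ms=20.0,
                    energy_thresh=0.01, zcr_thresh=0.25):
    """Label each non-overlapping frame 'V' (voiced) or 'N' (non-voiced).

    A frame is called voiced when its mean squared amplitude exceeds
    energy_thresh AND its zero-crossing rate (fraction of adjacent sample
    pairs that change sign) is below zcr_thresh. Both thresholds are
    illustrative assumptions.
    """
    frame_len = int(fs * frame_ms / 1000.0)
    labels = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        crossings = sum(1 for a, b in zip(frame, frame[1:])
                        if (a >= 0) != (b >= 0))
        zcr = crossings / (frame_len - 1)
        is_voiced = energy > energy_thresh and zcr < zcr_thresh
        labels.append("V" if is_voiced else "N")
    return labels


# Synthetic test signal at the sampling rate used in question 5.
fs = 22050
# 0.1 s of a strong 150 Hz tone stands in for a voiced segment.
voiced = [0.5 * math.sin(2 * math.pi * 150 * n / fs) for n in range(2205)]
# 0.1 s of a weak, rapidly sign-alternating signal stands in for frication.
unvoiced = [0.02 if n % 2 == 0 else -0.02 for n in range(2205)]

print("".join(segment_voicing(voiced + unvoiced, fs)))  # VVVVVNNNNN
```

On real speech the closure intervals of stops such as /k/ and /t/ appear as low-energy frames and are labelled N by the same rule.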