Speech Processing Laboratory, Temple University
May 5, 2004

Structure-Based Speech Classification Using Nonlinear Embedding Techniques
Uchechukwu Ofoegbu
Advisor: Dr. Robert E. Yantorno
Committee: Dr. Saroj K. Biswas, Dr. Henry M. Sendaula

Acknowledgment
Dr. Robert Yantorno
Dr. Saroj Biswas
Dr. Henry Sendaula
Speech Lab Members
Air Force Research Laboratory, Rome, NY

Overview
Voiced and Unvoiced Speech
Usable and Unusable Speech
Nonlinearities in Speech
Non-Linear Embedding
Research Goal
Proposed Research

Voiced and Unvoiced Speech

Voiced/Unvoiced Characteristics
Voiced
Quasi-periodic excitation
Modulation by vocal tract
Production of vowels, voiced fricatives & plosives
Unvoiced
No periodic vibration of vocal cords
Noise-like nature
Production of unvoiced fricatives and plosives

Usable Speech
Portions of co-channel speech still usable for applications such as Speaker ID and Speech Recognition.
Low-energy (unvoiced/silence) segments overlap with high-energy (voiced) segments
Target-to-interferer Ratio (TIR) > 20dB
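
The TIR criterion above can be sketched as a per-frame energy ratio in dB. This is only an illustration: it assumes the target and interfering talkers are available as separate signals (true for labelled test data, not for a real co-channel recording), and the frame length `frame_len` and the helper names are hypothetical, not taken from the slides.

```python
import numpy as np

def frame_tir_db(target, interferer, frame_len=320):
    """Per-frame target-to-interferer ratio in dB, computed as an energy ratio.

    frame_len is a hypothetical choice (e.g. 40 ms at 8 kHz), not from the slides.
    """
    target = np.asarray(target, dtype=float)
    interferer = np.asarray(interferer, dtype=float)
    n_frames = min(len(target), len(interferer)) // frame_len
    t = target[: n_frames * frame_len].reshape(n_frames, frame_len)
    v = interferer[: n_frames * frame_len].reshape(n_frames, frame_len)
    eps = np.finfo(float).tiny  # avoids taking the log of zero on silent frames
    e_t = np.sum(t ** 2, axis=1) + eps
    e_v = np.sum(v ** 2, axis=1) + eps
    return 10.0 * np.log10(e_t / e_v)

def usable_frames(tir_db, threshold_db=20.0):
    """Label a frame usable when its TIR exceeds the 20 dB threshold."""
    return np.asarray(tir_db) > threshold_db

# Toy example: target amplitude 100 against interferer amplitude 1.
tir = frame_tir_db(np.full(640, 100.0), np.ones(640))
mask = usable_frames(tir)
```

Labels produced this way can serve as ground truth when scoring a usable-speech detector that only sees the mixed signal.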

Nonlinearities in Speech
Glottal waveform changes
Shape varies with amplitude
Physical observations
Flow in vocal tract is non-laminar
Coupling between vocal tract and folds
When glottis is open, prominent changes are observed in formant characteristics

Nonlinear Embedding
Nonlinear Systems
Point moving along some trajectory in an abstract state space
Coordinates of the point are independent degrees of freedom of the system
State space could be reconstructed from a scalar signal

Nonlinear Embedding (cont’d)
Takens’ Method of Delays
A state space representation topologically equivalent to the original state space of a system can be reconstructed from a single observable dimension
Vectors in m-dimensional state space are formed from time-delayed values of a signal

Nonlinear Embedding (cont’d)
$x(i) = [\, s(i),\ s(i+d),\ s(i+2d),\ \ldots,\ s(i+(m-1)d) \,]$
m = embedding dimension
d = delay value
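
A minimal sketch of the method of delays, assuming a one-dimensional sampled signal `s`: each row of the result is one delay vector built from m samples spaced d samples apart. The function name `delay_embed` and the default values are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def delay_embed(s, m=3, d=1):
    """Return delay vectors [s(i), s(i+d), ..., s(i+(m-1)d)] as rows of a 2-D array."""
    s = np.asarray(s, dtype=float)
    n = len(s) - (m - 1) * d  # number of complete delay vectors
    if n <= 0:
        raise ValueError("signal too short for this m and d")
    # Column k holds the signal shifted by k*d samples.
    return np.column_stack([s[k * d : k * d + n] for k in range(m)])

# Toy example: a ramp of 10 samples, m = 3, d = 2 gives 6 vectors.
x = delay_embed(np.arange(10), m=3, d=2)
first_vector = x[0]   # [0, 2, 4]
last_vector = x[-1]   # [5, 7, 9]
```

With m = 3 the rows can be plotted directly as a trajectory in three-dimensional space.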

Nonlinear Embedding (cont’d)
Delay value, d:
Dependent on sampling rate and signal properties
Large enough such that nonlinearities are taken into account by the reconstructed trajectory
Small enough to retain reasonable time resolution

Nonlinear Embedding (cont’d)
Dimension, m:
Generation of voiced speech constitutes a low-dimensional system
Generation of unvoiced speech constitutes a relatively high-dimensional system
Using a low dimension (such as m = 3) sufficiently reconstructs voiced but not unvoiced speech

Embedded Voiced and Unvoiced Speech
[Figure: embedded trajectories. Panels: Embedded Voiced Speech; Embedded Unvoiced Speech.]

Embedded Usable and Unusable Speech
[Figure: embedded trajectories. Panels: Embedded Co-channel Speech of 30dB TIR; Embedded Co-channel Speech of 10dB TIR.]

Research Goal
Feature Extraction
Difference-Mean Comparison (DMC) Measure
– Voiced/unvoiced classification
Nodal Density Measure
– Voiced/unvoiced classification
– Usable/unusable classification

Difference-Mean Comparison (DMC) Measure
Voiced/Unvoiced Classification

Introduction
3rd order difference computation along first non-singleton dimension
1st order difference of NxN matrix given by
$$
\begin{bmatrix}
X(2,1)-X(1,1) & X(2,2)-X(1,2) & \cdots & X(2,N)-X(1,N) \\
X(3,1)-X(2,1) & X(3,2)-X(2,2) & \cdots & X(3,N)-X(2,N) \\
\vdots & \vdots & & \vdots \\
X(N,1)-X(N-1,1) & X(N,2)-X(N-1,2) & \cdots & X(N,N)-X(N-1,N)
\end{bmatrix}
$$
Length(3rd order diff. > mean) observed
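
One possible reading of the DMC measure, sketched under stated assumptions: the third-order difference is taken down the rows of a matrix `X` (here assumed to hold one embedded frame, one delay vector per row), and the measure is the count of difference values that exceed the mean of those differences. The slides do not say which mean is used or what `X` contains, so both choices, and the function name `dmc`, are assumptions.

```python
import numpy as np

def dmc(X):
    """Count third-order row differences that exceed their own mean (assumed reading)."""
    X = np.asarray(X, dtype=float)
    D = np.diff(X, n=3, axis=0)  # third-order difference along the first dimension
    return int(np.count_nonzero(D > D.mean()))

# A single impulse: third-order differences are [1, -3, 3, -1], mean 0, so two exceed it.
impulse = np.array([0, 0, 0, 1, 0, 0, 0], dtype=float).reshape(-1, 1)
dmc_impulse = dmc(impulse)

# A straight ramp has zero third-order differences, so none exceed the mean.
ramp = np.arange(10.0).reshape(-1, 1)
dmc_ramp = dmc(ramp)
```

Under this reading a smooth trajectory yields a small, stable count, while a noise-like trajectory spreads the differences widely around their mean.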

Embedded Voiced and Unvoiced Speech
[Figure: embedded trajectories. Panels: Embedded Voiced Speech; Embedded Unvoiced Speech.]

Difference-Mean Comparison Distribution
[Figure: Clean Speech. Voiced and unvoiced distributions. Axes: Difference-Mean Comparison (x), Probability (y).]

Difference-Mean Comparison Distribution
[Figure: Speech + 15dB Pink Noise. Voiced and unvoiced distributions. Axes: Difference-Mean Comparison (x), Probability (y).]

Difference-Mean Comparison Distribution
[Figure: Speech + 15dB White Noise. Voiced and unvoiced distributions. Axes: Difference-Mean Comparison (x), Probability (y).]

DMC-Based Decisions
[Figure: Clean Speech => 1: V; 0: Don't Care; -1: UV. Panels: Amplitude and two Decision tracks versus Sample Number.]

DMC-Based Decisions
[Figure: Speech + 15dB Pink Noise => 1: V; 0: Don't Care; -1: UV. Panels: Amplitude and two Decision tracks versus Sample Number.]

DMC-Based Decisions
[Figure: Speech + 15dB White Noise => 1: V; 0: Don't Care; -1: UV. Panels: Amplitude and two Decision tracks versus Sample Number.]

DMC-Based Decisions
[Figure: Clean Speech => 1: V; 0: Don't Care; -1: UV. Panels: Amplitude and two Decision tracks versus Sample Number.]

DMC-Based Decisions
[Figure: Speech + 15dB Pink Noise => 1: V; 0: Don't Care; -1: UV. Panels: Amplitude and two Decision tracks versus Sample Number.]

DMC-Based Decisions
[Figure: Speech + 15dB White Noise => 1: V; 0: Don't Care; -1: UV. Panels: Amplitude and two Decision tracks versus Sample Number.]

Results
Hits Minus False Alarms for Voiced Speech
[Figure: bar chart, scale 0 to 100, for Clean, 15dB Pink, and 15dB White conditions. Methods compared: FR/RE, E/ZC, DMC.]

Results (cont’d)
Hits Minus False Alarms for Unvoiced Speech
[Figure: bar chart in Percent, scale 0 to 100, for Clean, 15dB Pink, and 15dB White conditions. Methods compared: FR/RE, E/ZC, DMC.]
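
The hits-minus-false-alarms score reported above can be sketched as follows, assuming frame-level labels coded as in the decision plots (1 for voiced, -1 for unvoiced, 0 for don't care). How the original evaluation treated don't-care frames is not stated; this sketch drops frames whose reference label is 0, and the function name is hypothetical.

```python
import numpy as np

def hits_minus_false_alarms(truth, decision, positive=1):
    """Percent hits minus percent false alarms for the class coded as `positive`."""
    truth = np.asarray(truth)
    decision = np.asarray(decision)
    keep = truth != 0  # assumption: skip frames whose reference label is "don't care"
    truth, decision = truth[keep], decision[keep]
    is_pos = truth == positive
    hits = 100.0 * np.mean(decision[is_pos] == positive)           # detected positives
    false_alarms = 100.0 * np.mean(decision[~is_pos] == positive)  # negatives called positive
    return hits - false_alarms

truth = [1, 1, -1, -1, 0]
score_voiced = hits_minus_false_alarms(truth, [1, -1, 1, -1, 1], positive=1)
score_unvoiced = hits_minus_false_alarms(truth, [1, -1, -1, -1, 0], positive=-1)
score_perfect = hits_minus_false_alarms(truth, [1, 1, -1, -1, 0], positive=1)
```

The score needs at least one reference frame of each class; otherwise one of the two rates is undefined.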

Nodal Density Measure
Voiced/Unvoiced Classification
Usable/Unusable Classification

Introduction
Smallest cube which encloses the signal is determined
This cube is divided into N smaller cubes
Edges of the smaller cubes are defined as nodes
Number of nodes spanned by the signal is determined
Ratio of number of nodes spanned to total number of nodes is defined as nodal density
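
A minimal sketch of the nodal density steps listed above, assuming an embedded frame `X` with one point per row. Two details are interpretations rather than statements from the slides: nodes are taken to be the corner points of the small cubes, and a node counts as spanned when it is the nearest node to some trajectory point. The grid resolution `k` (the enclosing cube is split into k small cubes per side) is a hypothetical parameter.

```python
import numpy as np

def nodal_density(X, k=10):
    """Fraction of grid nodes visited by the embedded trajectory (assumed reading)."""
    X = np.asarray(X, dtype=float)
    lo = X.min(axis=0)
    side = (X.max(axis=0) - lo).max()      # edge of a cube that encloses all points
    scale = k / side if side > 0 else 0.0  # degenerate case: all points identical
    idx = np.rint((X - lo) * scale).astype(int)  # nearest node index per point
    visited = np.unique(idx, axis=0)             # distinct nodes touched
    total_nodes = (k + 1) ** X.shape[1]
    return len(visited) / total_nodes

# Two opposite corners of a unit cube on a 1x1x1 grid: 2 of 8 nodes visited.
nd_corners = nodal_density(np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]), k=1)
```

Under this reading a trajectory that keeps retracing the same orbit touches few nodes, while a trajectory that fills the cube touches many.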

Voiced/Unvoiced Classification

Embedded Voiced and Unvoiced Speech Frames with Grids
[Figure: embedded frames with grids. Panels: Voiced; Unvoiced.]

Nodes Spanned by Embedded Voiced and Unvoiced Speech Frames
[Figure: nodes spanned by embedded frames. Panels: Voiced; Unvoiced.]

Nodal-Density Distribution
[Figure: Clean Speech. Voiced and unvoiced distributions. Axes: Nodal-Density (x), Probability (y).]

Nodal-Density Distribution
[Figure: Speech + 15dB Pink Noise. Voiced and unvoiced distributions. Axes: Nodal-Density (x), Probability (y).]

Nodal-Density Distribution
[Figure: Speech + 15dB White Noise. Voiced and unvoiced distributions. Axes: Nodal-Density (x), Probability (y).]

Filtering
Moving Average Filter
Order, M = 10
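
A moving average filter can be sketched as a convolution with a flat window. The slides give only the order M = 10; treating M as the window length (10 equal taps) and centring the window are assumptions, and the slides do not say which sequence is filtered, so `x` here is just a generic one-dimensional sequence.

```python
import numpy as np

def moving_average(x, M=10):
    """Smooth x with an M-tap flat window; output has the same length as x when len(x) >= M."""
    x = np.asarray(x, dtype=float)
    window = np.ones(M) / M  # equal weights that sum to one
    return np.convolve(x, window, mode="same")

# A constant input stays constant away from the edges.
y = moving_average(np.ones(50), M=10)
interior = y[10:40]
```

Samples near the two ends are averaged against implicit zeros, so they are attenuated; trimming the first and last M samples avoids that edge effect.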

Nodal-Density Distributions after Filtering
[Figure: Clean Speech. Two panels, each showing voiced and unvoiced distributions. Axes: Nodal Density (x), Probability (y).]

Nodal-Density Distributions after Filtering
[Figure: Speech + 15dB Pink Noise. Two panels, each showing voiced and unvoiced distributions. Axes: Nodal Density (x), Probability (y).]

Nodal-Density Distributions after Filtering
[Figure: Speech + 15dB White Noise. Two panels, each showing voiced and unvoiced distributions. Axes: Nodal Density (x), Probability (y).]

Results
Hits Minus False Alarms for Voiced Speech
[Figure: bar chart, scale 0 to 70, for Clean, 15dB Pink, and 15dB White conditions. Methods compared: ND, ND_Filt.]

Results (cont’d)
Hits Minus False Alarms for Unvoiced Speech
[Figure: bar chart in Percent, scale 0 to 70, for Clean, 15dB Pink, and 15dB White conditions. Methods compared: ND, ND_Filt.]

Proposed Research
Usable/Unusable Classification

Embedded Usable and Unusable Speech Frames with Grids
[Figure: embedded frames with grids. Panels: Embedded Co-channel Speech of 10dB TIR with Grids; Embedded Co-channel Speech of 30dB TIR with Grids.]

Nodes Spanned by Embedded Usable and Unusable Speech Frames
[Figure: Nodes Spanned by Embedded Co-channel Speech of 30dB TIR.]

Preliminary Results
[Figure: ROC Curve for Usable Speech Detection Using the Nodal Density Measure. Axes: False Alarms (x), Hits (y), both from 0 to 1.]
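
An ROC curve of this kind can be traced by sweeping a decision threshold over a per-frame score and recording the hit rate and false-alarm rate at each setting. The sketch below assumes a generic score where larger values indicate the positive (usable) class; whether usable frames have higher or lower nodal density is not stated in the slides, so the score may need its sign flipped. The function name is hypothetical.

```python
import numpy as np

def roc_points(scores, labels):
    """Return (false_alarm_rates, hit_rates) for thresholds swept from high to low."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    # Start above every score (nothing detected), then step down through each distinct score.
    thresholds = np.concatenate(([np.inf], np.unique(scores)[::-1]))
    hits = np.array([(scores[labels] >= t).mean() for t in thresholds])
    false_alarms = np.array([(scores[~labels] >= t).mean() for t in thresholds])
    return false_alarms, hits

# Toy example with perfectly separable scores.
fa, hit = roc_points([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])
```

Both classes must be present in `labels`, since each rate is a fraction of one class.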

Summary
Speech
Nonlinear Embedding
Difference-Mean Comparison: V/UV Classification
Nodal Density: V/UV Classification, Usable/Unusable Classification

Future Proposed Research
Determine optimum filter for nodal density-based voiced/unvoiced classification
Develop nodal density measure for usable/unusable classification
Investigate the presence of complementary information between the two features (DMC and nodal density) for voiced/unvoiced classification
Perform decision-level fusion of both features

If you understood this presentation … please ask QUESTIONS!!!