wavelet-based speech enhancement mahdi amiri april 2003 sharif university of technology course...
Wavelet-Based Speech Enhancement
Mahdi Amiri
April 2003
Sharif University of Technology
Course Project Presentation 1

Presentation Outline
Motivation and Goals
Wavelet Transform - Overview
Basic Denoising in Wavelet Domain
Literature Survey
Implementation and Results
Conclusions and Future Works

Motivation and Goals
Key Applications
Improving perceptual quality of speech
– Reduce listener’s fatigue
– Hearing aids
Improving performance of
– Speech coders
– Voice recognition systems

Motivation and Goals
Goals of Speech Enhancement (SE) in Wavelet Domain
Variable window size for different frequency components
– Long time intervals give precise low-frequency information
– Short time intervals give precise high-frequency information
Easy to implement
– Fast WT computational complexity: O(n)
– FFT computational complexity: O(n log2 n)
Denoising by simple thresholding
– Real-time implementation

Wavelet Transform - Overview

Wavelet Transform - Overview
History
Expansion of a signal over a set of basis functions $\{\psi_i(t)\}$:
$$f(t) = \sum_i a_i \psi_i(t)$$
Fourier (1807)
Haar (1910)
$$\psi_{mn}(t) = 2^{-m/2}\, \psi(2^{-m} t - n)$$

Wavelet Transform - Overview
What kind of $\psi_i(t)$ could be useful?
– Impulse function (Haar): best time resolution
– Sinusoids (Fourier): best frequency resolution
– We want the best of both resolutions
Heisenberg (1930)
– Uncertainty Principle
• There is a lower bound for $\Delta t \cdot \Delta \omega$
(An intuitive proof in [Mac91])

Wavelet Transform - Overview
Gabor (1945)
– Short Time Fourier Transform (STFT)
• Disadvantage: Fixed window size

Wavelet Transform - Overview
Constructing wavelets
– Daubechies (1988)
• Compactly supported wavelets
Computation of WT coefficients
– Mallat (1989)
• A fast algorithm using filter banks
[Figure: two-channel filter bank. Analysis: $S$ is filtered by $h_a(k)$ and $h_d(k)$, each followed by downsampling by 2, giving $w^{A_1}$ and $w^{D_1}$. Synthesis: upsampling by 2 followed by $g_a(k)$ and $g_d(k)$ reconstructs $S$.]
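
As a rough illustration of the filter-bank idea, the sketch below runs one analysis and one synthesis stage using the Haar filters. The choice of Haar, the function names and the sample signal are assumptions made for this example only; the slides do not fix a particular wavelet here, and the code expects an even-length input.

```python
import math

def haar_analysis(s):
    """One analysis stage: low-pass/high-pass filtering, then downsampling by 2."""
    r = 1 / math.sqrt(2)
    approx = [r * (s[i] + s[i + 1]) for i in range(0, len(s) - 1, 2)]
    detail = [r * (s[i] - s[i + 1]) for i in range(0, len(s) - 1, 2)]
    return approx, detail

def haar_synthesis(approx, detail):
    """One synthesis stage: upsampling by 2 and filtering, merged into one step."""
    r = 1 / math.sqrt(2)
    out = []
    for a, d in zip(approx, detail):
        out.extend([r * (a + d), r * (a - d)])
    return out

# Hypothetical even-length test signal
signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
w_a1, w_d1 = haar_analysis(signal)
reconstructed = haar_synthesis(w_a1, w_d1)
```

Each stage touches every sample once, which is where the O(n) cost of the fast WT comes from when the stage is repeated on the halved approximation.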

Wavelet Transform - Overview
Multiresolution Signal Representation
Wavelet tree decomposition
– Wavelet Transform (WT)
– Undecimated WT (UWT)
The coarse version (Approximation) is more useful than the Detail for
– Browsing image databases on the web
– Signal transmission for communication
– Denoising
We may lose what is in the Detail

Wavelet Transform - Overview
Full tree decomposition
– Wavelet Packet Transform (WPT)
– Undecimated WPT (UWPT)
S = A1 + D1 or S = A1 + AD2 + DD2 or …
Which decomposition path could be the best choice?
The answer leads us to the Best Basis

Wavelet Transform - Overview
Best Basis Selection Criteria
Cut if:
$$J(\text{parent node}) \le J(\text{child 1}) + J(\text{child 2})$$
Entropy
– Coifman, Meyer, Wickerhauser (1992)
Rate-Distortion
– Vetterli (1995)
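
The cut rule can be tried on a toy node. The sketch below assumes an additive Shannon-type cost $J = -\sum c^2 \log c^2$ and a Haar split into two children; both are illustrative choices, and the helper names are hypothetical rather than taken from the cited papers.

```python
import math

def shannon_cost(coeffs):
    """Additive entropy cost J = -sum(c^2 * log(c^2)); zero coefficients add nothing."""
    return -sum(c * c * math.log(c * c) for c in coeffs if c != 0.0)

def haar_split(node):
    """Split an even-length node into its two children (Haar filters assumed)."""
    r = 1 / math.sqrt(2)
    low = [r * (node[i] + node[i + 1]) for i in range(0, len(node) - 1, 2)]
    high = [r * (node[i] - node[i + 1]) for i in range(0, len(node) - 1, 2)]
    return low, high

def should_cut(parent):
    """True when the parent is at least as cheap as its two children together."""
    child_1, child_2 = haar_split(parent)
    return shannon_cost(parent) <= shannon_cost(child_1) + shannon_cost(child_2)

spike_keeps_parent = should_cut([1.0, 0.0, 0.0, 0.0])    # an impulse is cheapest unsplit
constant_is_split = not should_cut([1.0, 1.0, 1.0, 1.0])  # a flat node is cheaper after splitting
```

Applying the same test bottom-up over a full wavelet packet tree yields the set of leaf nodes that forms the best basis.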

Basic Denoising in Wavelet Domain

Basic Denoising in Wavelet Domain
Principle
Only a few coefficients in the lower bands are needed to approximate the main features of the clean signal. Hence, by setting the smaller coefficients to zero, we can nearly optimally eliminate noise while preserving the important information of the clean signal.

Basic Denoising in Wavelet Domain
Notation
Time domain
– Clean signal: $x$
– Noise signal: $v$
– Noisy signal: $y = x + v$
Wavelet domain
– $X = Wx$
– $V = Wv$
– $Y = Wy = X + V$

Basic Denoising in Wavelet Domain
Algorithm
1. Framing input noisy signal
2. Forward WT of a frame
3. Thresholding (detail) wavelet coefficients
4. Inverse WT
5. Keep center part of the frame
6. Repeat for all of the frames
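
The framing logic of steps 1, 5 and 6 can be sketched separately from the transform itself. The example below assumes 50% frame overlap and keeps the centre half of every processed frame; neither figure is stated in the slides. `denoise_frame` is a hypothetical callback standing in for steps 2 to 4 (forward WT, thresholding, inverse WT).

```python
def denoise_by_frames(y, frame_len, denoise_frame):
    """Process overlapping frames and keep only the centre half of each one."""
    hop = frame_len // 2        # assumed 50% overlap between frames
    quarter = frame_len // 4    # samples discarded at each frame edge
    out = list(y)               # samples not covered by any centre part stay unprocessed
    for start in range(0, len(y) - frame_len + 1, hop):
        frame = denoise_frame(y[start:start + frame_len])
        out[start + quarter:start + quarter + hop] = frame[quarter:quarter + hop]
    return out

# Usage with two stand-in "denoisers": identity, and one that zeroes the frame
noisy = [1.0] * 16
unchanged = denoise_by_frames(noisy, 8, lambda frame: frame)
zeroed = denoise_by_frames(noisy, 8, lambda frame: [0.0] * len(frame))
```

Keeping only the centre part avoids the boundary artifacts that the transform introduces near the frame edges.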

Basic Denoising in Wavelet Domain
Threshold Value
VisuShrink [DonJ94b]:
$$T = \hat{\sigma}_V \sqrt{2 \log N}$$
– $T$: threshold
– $\hat{\sigma}_V^2$: estimate of the noise variance
– $N$: frame length
For Gaussian white noise:
$$\hat{\sigma}_V = \frac{\mathrm{MAD}_1}{0.6745} = \frac{\mathrm{median}(|w^{D_1} - \mathrm{median}(w^{D_1})|)}{0.6745}$$
MAD: Median Absolute Deviation
Another definition (wden.m):
$$\mathrm{MAD}_1 = \mathrm{median}(|w^{D_1}|)$$
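
A small sketch of the universal (VisuShrink) threshold with the MAD-based noise estimate follows. It uses the median-centred MAD definition and the natural logarithm; the function names and the sample detail coefficients are made up for illustration.

```python
import math
from statistics import median

def mad_sigma(detail):
    """Noise standard deviation estimate: MAD of the detail coefficients / 0.6745."""
    med = median(detail)
    mad = median([abs(c - med) for c in detail])
    return mad / 0.6745

def visu_threshold(detail, frame_len):
    """Universal threshold T = sigma * sqrt(2 * log(N)), natural log assumed."""
    return mad_sigma(detail) * math.sqrt(2.0 * math.log(frame_len))

# Hypothetical first-level detail coefficients of one frame
w_d1 = [0.5, -1.0, 0.25, 2.0, -0.75, 0.1, -0.3]
sigma_v = mad_sigma(w_d1)
threshold = visu_threshold(w_d1, frame_len=256)
```

The median makes the estimate robust to the few large coefficients that carry speech rather than noise.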

Basic Denoising in Wavelet Domain
Threshold Value
Threshold in the WPT case:
$$T = \hat{\sigma}_V \sqrt{2 \log(N \log_2 N)}$$
For the correlated noise situation, use a level-dependent threshold (SureShrink [DonJ94b]):
$$T_j = \hat{\sigma}_{V_j} \sqrt{2 \log N}, \qquad \hat{\sigma}_{V_j} = \frac{\mathrm{MAD}_j}{0.6745}$$

Basic Denoising in Wavelet Domain
How to Threshold
Hard thresholding:
$$\mathrm{Thr}_H(x, T) = \begin{cases} x, & |x| > T \\ 0, & |x| \le T \end{cases}$$
Soft thresholding:
$$\mathrm{Thr}_S(x, T) = \begin{cases} \mathrm{sgn}(x)(|x| - T), & |x| > T \\ 0, & |x| \le T \end{cases}$$
Comparison
– Hard thresholding: discontinuity at $|x| = T$
– Soft thresholding: alteration of values
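
The two rules differ only in what happens to coefficients whose magnitude exceeds $T$, which a few lines of code make concrete. This is a minimal scalar sketch; the function names are chosen for this example.

```python
import math

def hard_threshold(x, t):
    """Keep x unchanged when |x| > t, otherwise set it to zero."""
    return x if abs(x) > t else 0.0

def soft_threshold(x, t):
    """Shrink the magnitude of x by t when |x| > t, otherwise set it to zero."""
    return math.copysign(abs(x) - t, x) if abs(x) > t else 0.0

coeffs = [1.5, -1.5, 0.5, -0.2]
hard = [hard_threshold(c, 1.0) for c in coeffs]
soft = [soft_threshold(c, 1.0) for c in coeffs]
```

Hard thresholding jumps from 0 to $T$ at the threshold, while soft thresholding is continuous but biases every retained coefficient towards zero by $T$.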

Literature Survey

Literature Survey
[SeoB97], Novelty
Title:
– Speech enhancement with reduction of noise components in the wavelet domain
Novelty:
– Semisoft thresholding [GaoB95]
– Classification of unvoiced region in WD
– Different thresholding for unvoiced region

Literature Survey
[SeoB97], Thresholding
Semisoft thresholding [GaoB95]:
– Less sensitivity to small perturbations in the data
– Smaller bias
$$\mathrm{Thr}_{SS}(x; \lambda_1, \lambda_2) = \begin{cases} 0, & |x| \le \lambda_1 \\ \mathrm{sgn}(x)\,\dfrac{\lambda_2 (|x| - \lambda_1)}{\lambda_2 - \lambda_1}, & \lambda_1 < |x| \le \lambda_2 \\ x, & |x| > \lambda_2 \end{cases}$$
[Figure: input-output curves of hard, soft and semisoft thresholding, with $\lambda_1$ and $\lambda_2$ marked on the axes.]
Threshold selection: like [DonJ94b]
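
Semisoft thresholding sits between the hard and soft rules: it zeroes coefficients below a lower threshold, keeps those above an upper threshold, and interpolates linearly in between. The sketch below assumes that standard two-threshold form; the parameter names `t1` and `t2` are chosen here for illustration.

```python
import math

def semisoft_threshold(x, t1, t2):
    """Zero below t1, identity above t2, linear ramp from 0 to t2 in between."""
    ax = abs(x)
    if ax <= t1:
        return 0.0
    if ax <= t2:
        return math.copysign(t2 * (ax - t1) / (t2 - t1), x)
    return x

values = [semisoft_threshold(c, 1.0, 2.0) for c in (0.5, 1.5, -1.5, 2.0, 3.0)]
```

Because the ramp reaches $t_2$ exactly at $|x| = t_2$, the rule has no jump there, and large coefficients pass through without the bias of soft thresholding.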

Literature Survey
[SeoB97], Unvoiced Regions
Separation of unvoiced region
– Use DWT for finding $w^{A_3}, w^{D_3}, w^{D_2}, w^{D_1}$
– Calculate average energy of each subband: $Ew^{A_3}, Ew^{D_1}, Ew^{D_2}, Ew^{D_3}$
– Current speech segment is unvoiced if:
1. $Ew^{D_1} > Ew^{A_3}$ and $Ew^{D_1} > Ew^{D_2}$ and $Ew^{D_1} > Ew^{D_3}$
2. $\dfrac{Ew^{A_3}}{Ew^{D_1}} < 0.9$
[Figure: frequency axis $f$ divided into the subbands $w^{A_3}$, $w^{D_3}$, $w^{D_2}$, $w^{D_1}$, from lowest to highest frequency.]
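
The unvoiced test compares average subband energies after a three-level DWT. The sketch below assumes the segment is unvoiced when the highest band $w^{D_1}$ has the largest average energy and the ratio of the $w^{A_3}$ energy to the $w^{D_1}$ energy is below 0.9, with both conditions required together. The direction of the comparisons and the "both required" reading are assumptions, and the coefficient values are invented.

```python
def mean_energy(coeffs):
    """Average energy of one subband."""
    return sum(c * c for c in coeffs) / len(coeffs)

def is_unvoiced(w_a3, w_d3, w_d2, w_d1, ratio_limit=0.9):
    """Assumed rule: D1 dominates every other band and E(A3)/E(D1) < ratio_limit."""
    e_a3, e_d3, e_d2, e_d1 = (mean_energy(w) for w in (w_a3, w_d3, w_d2, w_d1))
    d1_dominates = e_d1 > e_a3 and e_d1 > e_d2 and e_d1 > e_d3
    return d1_dominates and e_a3 / e_d1 < ratio_limit

# Hypothetical subbands: energy concentrated at high vs. low frequencies
fricative_like = is_unvoiced([0.1, 0.1], [0.2, 0.2], [0.3, 0.3], [1.0, 1.0])
vowel_like = is_unvoiced([1.0, 1.0], [0.2, 0.2], [0.3, 0.3], [0.5, 0.5])
```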

Literature Survey
[SeoB97], Implementations
If unvoiced, then threshold just the highest frequency band $w^{D_1}$
Implementation results
– Additive white Gaussian noise
– SNR (-10 dB to 10 dB)
– “Should we chase those cowboys?”

| Noisy SNR (dB) | Enhanced SNR (dB) |
|---|---|
| -10 | 0.93 |
| -5 | 3.42 |
| 0 | 7.12 |
| 5 | 11.34 |
| 10 | 13.92 |

Literature Survey
[SooKY97], Novelty
Title:
– Wavelet for speech denoising
Novelty:
– Evaluation of different wavelets and different orders (db1-10, coif1-5, sym2-8, bior1.3-6.8)
– Spectral Subtraction in WD
– Wiener Filtering in WD (uses two methods for estimating the a priori SNR)
• Maximum Likelihood approach
• Decision Directed approach

Literature Survey
[SooKY97], Thresholding 1
Use DWT and find L levels of decomposition:
$$w_y^{A_L}, w_y^{D_L}, w_y^{D_{L-1}}, \ldots, w_y^{D_2}, w_y^{D_1}$$
1. Spectral Subtraction (SS) in WD
if $w_y^{D_i}(k) \ge 0$ then
$$\hat{w}_x^{D_i}(k) = \max\left(0,\; w_y^{D_i}(k) - E(|w_n^{D_i}(k)|)\right)$$
else
$$\hat{w}_x^{D_i}(k) = \min\left(0,\; w_y^{D_i}(k) + E(|w_n^{D_i}(k)|)\right)$$
– $\hat{w}_x^{D_i}(k)$: denoised value
– $E(|w_n^{D_i}(k)|)$: expected value of the noise magnitude; could be estimated from silence frames
Use a similar scheme for $w_y^{A_L}$
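
Magnitude subtraction in the wavelet domain pulls each coefficient towards zero by the expected noise magnitude and clips at zero so the sign never flips. The sketch below assumes a single noise-magnitude value per subband, for instance one measured on silence frames; the function name and the numbers are illustrative.

```python
def subtract_noise_magnitude(w_y, noise_mag):
    """Move every coefficient towards zero by noise_mag without crossing zero."""
    out = []
    for c in w_y:
        if c >= 0:
            out.append(max(0.0, c - noise_mag))
        else:
            out.append(min(0.0, c + noise_mag))
    return out

# Hypothetical noisy detail coefficients and noise magnitude estimate
denoised = subtract_noise_magnitude([2.0, -2.0, 0.3, -0.3], noise_mag=0.5)
```

Written this way, the rule has the same input-output shape as soft thresholding with the noise magnitude as threshold.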

Literature Survey
[SooKY97], Thresholding 2
2. Wiener Filtering in WD
$$\hat{w}_x^{D_i}(k) = \frac{\xi_i^D(k)}{\xi_i^D(k) + 1}\, w_y^{D_i}(k)$$
$\xi_i^D(k)$ is the a priori SNR:
$$\xi_i^D(k) = \frac{E(|w_x^{D_i}(k)|^2)}{E(|w_n^{D_i}(k)|^2)}$$
Estimating $\xi_i^D(k)$
a. Maximum Likelihood
$$\hat{\xi}_i^D(k) = \max\left(0,\; \frac{|w_y^{D_i}(k)|^2}{E(|w_n^{D_i}(k)|^2)} - 1\right)$$
b. Decision Directed
$$\hat{\xi}_i^D(k) = \alpha\, \hat{\xi}_i^D(k-1) + (1 - \alpha) \max\left(0,\; \frac{|w_y^{D_i}(k)|^2}{E(|w_n^{D_i}(k)|^2)} - 1\right)$$
$\alpha \in [0, 1]$, typ. 0.9
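
The Wiener gain with a recursively smoothed a priori SNR can be sketched for one subband as follows. The recursion blends the previous SNR estimate with the instantaneous maximum-likelihood estimate using a weight `alpha`. Starting the recursion from zero and using one noise power value per subband are assumptions of this example, and the names are hypothetical.

```python
def wiener_decision_directed(w_y, noise_power, alpha=0.9):
    """Apply the gain xi / (xi + 1), with xi smoothed recursively along k."""
    xi_prev = 0.0  # assumed initial a priori SNR
    out = []
    for c in w_y:
        ml_estimate = max(0.0, c * c / noise_power - 1.0)
        xi = alpha * xi_prev + (1.0 - alpha) * ml_estimate
        out.append(xi / (xi + 1.0) * c)
        xi_prev = xi
    return out

# alpha = 0 reduces the recursion to the pure maximum-likelihood estimate
ml_only = wiener_decision_directed([2.0, 0.5], noise_power=1.0, alpha=0.0)
smoothed = wiener_decision_directed([2.0, 2.0], noise_power=1.0, alpha=0.9)
```

With `alpha` near 1 the gain changes slowly from coefficient to coefficient, which is the usual motivation for the decision-directed form.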

Literature Survey
[SooKY97], Implementations
Implementation results
– White Gaussian noise
– Both male and female voices
– 10 levels of decomposition

SNR: 5 dB, L: 10

| Wavelet type | Method 1 (dB) | Method 2b (dB) |
|---|---|---|
| bior3.1 | 6.569 | 1.764 |
| bior4.4 | 19.523 | 21.981 |
| sym8 | 19.751 | 22.215 |

Literature Survey
[SooKY97], Conclusions
The methods are not particularly sensitive to the wavelet type, with the exception of bior3.1
Wiener-filtered speech has better SNR values than magnitude subtraction
For Wiener filtering, the decision directed approach gives better SNR values than the maximum likelihood approach

Literature Survey
[KimYK01], Novelty
Title:
– Speech enhancement using adaptive wavelet shrinkage
Novelty:
– Adaptive threshold value
• Threshold value will depend on the variance of the estimated clean signal (BayesShrink)
– Classification of unvoiced region using entropy
• Applies a smaller threshold for the unvoiced region and calls the method “Adaptive BayesShrink”

Literature Survey
[KimYK01], Threshold Value
BayesShrink [ChaYV00a]: the adaptive threshold value that minimizes the Bayesian risk $E(\hat{X} - X)^2$, with $\hat{X} = \mathrm{Thr}(Y, T)$, is
$$T(\sigma_X) = \frac{\sigma_V^2}{\sigma_X}$$
Thus, the estimated threshold value is
$$\hat{T}(\hat{\sigma}_X) = \frac{\hat{\sigma}_V^2}{\hat{\sigma}_X}$$
where
$$\hat{\sigma}_V = \frac{\mathrm{MAD}_1}{0.6745}, \qquad \hat{\sigma}_X = \sqrt{\max(\hat{\sigma}_Y^2 - \hat{\sigma}_V^2,\; 0)}$$
If $\hat{\sigma}_Y^2 \le \hat{\sigma}_V^2$, then $\hat{T}(\hat{\sigma}_X) = \max(|w^{D_1}|)$
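
A minimal sketch of the BayesShrink threshold for one subband follows, assuming the noise standard deviation has already been estimated (for example with the MAD rule) and that the observed variance is the mean of the squared coefficients. The function name and the sample values are illustrative.

```python
import math

def bayes_shrink_threshold(w, sigma_v):
    """T = sigma_v^2 / sigma_x, with sigma_x^2 = max(var_y - sigma_v^2, 0)."""
    var_y = sum(c * c for c in w) / len(w)
    sigma_x = math.sqrt(max(var_y - sigma_v ** 2, 0.0))
    if sigma_x == 0.0:
        # No signal energy detected: use the largest magnitude so everything is removed
        return max(abs(c) for c in w)
    return sigma_v ** 2 / sigma_x

strong_signal = bayes_shrink_threshold([3.0, -3.0, 3.0, -3.0], sigma_v=1.0)
noise_only = bayes_shrink_threshold([0.5, -0.5], sigma_v=1.0)
```

The threshold shrinks as the estimated signal variance grows, so subbands dominated by speech are left almost untouched.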

Literature Survey
[KimYK01], Unvoiced Regions
The current region is classified as unvoiced by a test on the entropy ratio $ent_y / max\_ent$
The constants 0.77 and 0.9 are selected by simulation
There was no comment about the type of entropy; it could be:
$$ent_y = -\sum_k y_k^2 \log(y_k^2)$$
The unvoiced region has smaller energy, so apply a smaller threshold: $\hat{T}_U(\hat{\sigma}_X)$ is obtained from $\hat{T}(\hat{\sigma}_X)$ by a scaling that involves $\hat{\sigma}_V^2$ and $ent_y / max\_ent$

Literature Survey
[KimYK01], Implementations
Implementation results:
– Additive white Gaussian noise
– SNR: 0 dB, 10 dB and 20 dB

| Input SNR | VisuShrink | BayesShrink | Adaptive BayesShrink |
|---|---|---|---|
| 0 dB | 4.8208 dB | 4.4982 dB | 5.5733 dB |
| 10 dB | 11.5650 dB | 12.8456 dB | 14.1543 dB |
| 20 dB | 16.8488 dB | 21.8313 dB | 23.8455 dB |

Literature Survey
[ChaKYK02], Novelty
Title:
– Speech enhancement for non-stationary noise environment by adaptive wavelet packet
Novelty:
– Node dependent thresholding for adaptation in colored or non-stationary noise
– Noise estimation based on spectral entropy, not MAD
– Modified hard thresholding to alleviate time-frequency discontinuities

Literature Survey
[ChaKYK02], Threshold Value
Create WPT and find the best basis tree’s leaf nodes
Node dependent thresholding:
$$T_{j,k} = \hat{\sigma}_{V_{j,k}} \sqrt{2 \log N}$$
Noise estimation could be like:
$$\hat{\sigma}_{V_{j,k}} = \frac{\mathrm{MAD}_{j,k}}{0.6745}$$
or the following proposed method

Literature Survey
[ChaKYK02], Noise Estimation
1. Estimate the spectral pdf of the wavelet packet coefficients through a B-bin histogram:
$$P_b = \frac{\text{number of coefficients } w_{j,k} \text{ in bin } b}{\text{length of node}}$$
2. Calculate the normalized spectral entropy for each node in the adapted wavelet packet tree:
$$\mathrm{Entropy}(n) = -\sum_{b=1}^{B} P_b \log_B(P_b), \qquad n = 1, 2, \ldots, \text{number of best nodes}$$
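
Steps 1 and 2 can be sketched as a histogram followed by an entropy computed with logarithm base $B$, which keeps the result between 0 and 1. The example assumes the histogram is taken over coefficient magnitudes with equal-width bins up to the largest magnitude; that binning detail, like the function name, is an assumption of this sketch. At least two bins are required.

```python
import math

def normalized_spectral_entropy(coeffs, num_bins):
    """Entropy of a num_bins-bin magnitude histogram, using log base num_bins."""
    mags = [abs(c) for c in coeffs]
    top = max(mags)
    counts = [0] * num_bins
    for m in mags:
        # Equal-width bins on [0, top]; the largest magnitude goes in the last bin
        idx = min(int(m / top * num_bins), num_bins - 1) if top > 0 else 0
        counts[idx] += 1
    n = len(mags)
    return 0.0 - sum((c / n) * math.log(c / n, num_bins) for c in counts if c)

concentrated = normalized_spectral_entropy([1.0, 1.0, 1.0, 1.0], num_bins=4)
spread_out = normalized_spectral_entropy([0.1, 0.3, 0.6, 1.0], num_bins=4)
```

A node whose magnitudes fall in a single bin gives an entropy of 0, while one that fills all bins evenly gives 1.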

Literature Survey
[ChaKYK02], Noise Estimation (cont.)
3. Estimate the spectral magnitude intensity by histogram
[Figure: histogram over bins of coefficient magnitudes. The vertical axis is the number of coefficients with magnitude equal to or greater than the bin’s amplitude, up to node_length, with the level $\alpha(n)$ marked.]
4. Define an auxiliary threshold
$$\alpha(n) = \mathrm{Entropy}(n) \times \text{node\_length}$$
5. Estimate the standard deviation of the noise
$$\hat{\sigma}_{j,k} = [\text{number of bins bigger than } \alpha(n)] \times \text{bin\_width}$$
Parameter range: 0.7 ~ 0.9

Literature Survey
[ChaKYK02], Noise Estimation (cont.)
Greater disorder of wavelet coefficients (less voiced, more unvoiced) leads, step by step, to:
– More uniform spectral pdf
– Bigger values for entropy (range 0 to 1)
– Bigger value for alpha
– Smaller number of bins bigger than alpha
– Smaller estimate of the standard deviation of the noise

Literature Survey
[ChaKYK02], Thresholding
Modified hard thresholding:
$$\mathrm{Thr}_{Hm}(x, T) = \begin{cases} x, & |x| > T \\ \dfrac{1}{\mu}\, T \left[(1 + \mu)^{|x|/T} - 1\right] \mathrm{sgn}(x), & |x| \le T \end{cases}$$
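
One way to read the modified hard thresholding rule is as a mu-law-style curve below the threshold: coefficients above $T$ pass unchanged, and smaller ones are strongly attenuated but not zeroed, so the curve meets the identity exactly at $|x| = T$. The sketch below assumes that form, $T\,[(1+\mu)^{|x|/T} - 1]/\mu$ for $|x| \le T$, and a default of $\mu = 255$; both should be checked against [ChaKYK02] before being relied on.

```python
import math

def modified_hard_threshold(x, t, mu=255.0):
    """Identity above t; mu-law-style attenuation (assumed form) at or below t."""
    ax = abs(x)
    if ax > t:
        return x
    return math.copysign(t * ((1.0 + mu) ** (ax / t) - 1.0) / mu, x)

samples = [modified_hard_threshold(c, 1.0) for c in (2.0, 1.0, 0.5, -0.5, 0.0)]
```

With these settings a coefficient at half the threshold is reduced to about 6% of the threshold rather than being cut to zero, which removes the jump of plain hard thresholding.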

Literature Survey
[ChaKYK02], Implementations
Implementation results:
– Pink noise, SNR: -5 dB ~ 15 dB
– Parameter settings: 0.8 and 255

| Noisy speech SNR (dB) | Level Dep. with MAD | Node Dep. with MAD | Node Dep. with Proposed | Spectral Subtraction |
|---|---|---|---|---|
| -5 | -3.7 | 3.53 | 3.31 | 0.10 |
| 0 | 1.11 | 5.43 | 5.91 | 1.77 |
| 5 | 5.79 | 7.44 | 8.30 | 2.35 |
| 10 | 10.15 | 9.49 | 10.47 | 2.83 |
| 15 | 14.15 | 11.39 | 12.15 | 4.08 |

Subjective tests favored the level dependent thresholding, but not every time. Nevertheless, the proposed method has better spectral performance (spectrogram).

Literature Survey
[ChaKYK02], Implementations (cont.)
– SNR (dB) test for various noisy speech: “We like bleu cheese but Victor prefers swiss cheese.” (SNR = 10 dB)

| Noise type | Level Dep. with MAD | Node Dep. with Proposed | Spectral Subtraction |
|---|---|---|---|
| White | 10.29 | 10.35 | 2.39 |
| Pink | 9.47 | 10.49 | 2.42 |
| F16 | 9.71 | 10.35 | 2.18 |
| Car | 9.65 | 13.50 | 1.95 |
| Babble | 9.59 | 10.18 | 2.23 |

Literature Survey
To be continued…
Thank You.

References (1 of 2)
[ChaKYK02] S. Chang, Y. Kwon, S. I. Yang, and I. J. Kim, “Speech enhancement for non-stationary noise environment by adaptive wavelet packet,” Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP-2002, Vol. 1, pp. 561-564, 2002.
[ChaYV00a] S. G. Chang, B. Yu, and M. Vetterli, “Adaptive Wavelet Thresholding for Image Denoising and Compression,” IEEE Transactions on Image Processing, Vol. 9, No. 9, pp. 1532-1546, Sep. 2000.
[DonJ94b] D. L. Donoho and I. M. Johnstone, “Threshold selection for wavelet shrinkage of noisy data,” Proceedings of the 16th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 1994. Engineering Advances: New Opportunities for Biomedical Engineers, Vol. 1, pp. A24-A25, Nov. 1994.
[GaoB95] H. Y. Gao and A. G. Bruce, “WaveShrink with Semisoft Shrinkage,” Research Report No. 39, StatSci Division of MathSoft, Inc., 1995.
[KimYK01] I. J. Kim, S. I. Yang, and Y. Kwon, “Speech enhancement using adaptive wavelet shrinkage,” Proceedings of IEEE International Symposium on Industrial Electronics, ISIE-2001, Vol. 1, pp. 501-504, 2001.

References (2 of 2)
[SeoB97] J. W. Seok and K. S. Bae, “Speech enhancement with reduction of noise components in the wavelet domain,” IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP-97, Vol. 2, pp. 1323-1326, Apr. 1997.
[SooKY97] I. Y. Soon, S. N. Koh, and C. K. Yeo, “Wavelet for speech denoising,” Proceedings of IEEE Region 10 Annual Conference on Speech and Image Technologies for Computing and Telecommunications, TENCON-97, Vol. 2, pp. 479-482, Dec. 1997.
FIND OUT MORE AT...
1. http://ce.sharif.edu/~m_amiri/
2. http://www.aictct.com/dml/