introduction to signal processing

Introduction to Signal Processing and Some Applications in Audio Analysis. Md. Khademul Islam Molla, JSPS Research Fellow, Hirose-Minematsu Laboratory. Email: [email protected] tokyo.ac.jp


Post on 18-Nov-2014


Page 1: Introduction to Signal Processing

Introduction to Signal Processing and Some Applications in Audio Analysis

Md. Khademul Islam Molla

JSPS Research Fellow, Hirose-Minematsu Laboratory

Email: [email protected]

Page 2: Introduction to Signal Processing

Outline of the presentation

• Basics of discrete time signals
• Frequency domain signal analysis
• Basic transformations
• Fourier Transform (FT), short-time FT (STFT)
• Wavelet Transform (WT)
• Empirical mode decomposition (EMD), and Hilbert spectrum (HS)
• Remarkable comparisons among FT, WT, HS
• Some applications in audio processing
• Some open problems to work with

Page 3: Introduction to Signal Processing

Discrete time signal

• It is not possible to process continuous signals directly on a digital computer
• We need to convert the signal into a discrete time signal with a suitable sampling frequency and quantization

The sampling theorem: $F_s \ge 2 f_c$

where $f_c$ is the expected (highest) signal frequency and $F_s$ is the required sampling frequency.

• Quantization is required in sampling
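A minimal sketch of the sampling condition and of uniform quantization, using only the Python standard library. The function names (`min_sampling_rate`, `apparent_frequency`, `quantize`) and the rounding quantizer are illustrative assumptions, not something taken from the slides.

```python
import math

def min_sampling_rate(f_c):
    """Smallest sampling frequency (Hz) that satisfies F_s >= 2 * f_c."""
    return 2.0 * f_c

def apparent_frequency(f, f_s):
    """Frequency (Hz) a real sinusoid of frequency f appears to have after
    sampling at f_s. Frequencies above f_s / 2 fold back (aliasing)."""
    f_mod = f % f_s
    return min(f_mod, f_s - f_mod)

def quantize(x, n_bits, full_scale=1.0):
    """Uniform rounding quantizer (assumed design): 2**n_bits levels
    covering [-full_scale, +full_scale)."""
    step = 2.0 * full_scale / (2 ** n_bits)
    level = math.floor(x / step + 0.5)
    # Clip to the integer codes that n_bits can represent.
    lowest, highest = -(2 ** (n_bits - 1)), 2 ** (n_bits - 1) - 1
    level = max(lowest, min(highest, level))
    return level * step

print(min_sampling_rate(4000))         # rate needed for a 4 kHz band
print(apparent_frequency(3000, 8000))  # below F_s / 2: unchanged
print(apparent_frequency(5000, 8000))  # above F_s / 2: aliased
print(quantize(0.3, 3))                # 3-bit quantization of 0.3
```

With an 8 kHz sampling rate, a 5 kHz tone violates the condition and becomes indistinguishable from a 3 kHz tone, which is the undersampling effect the next slides illustrate.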

Page 4: Introduction to Signal Processing

Discrete time signal

[Figure: signal sampling and signal quantization]

Page 5: Introduction to Signal Processing

Discrete time signal

[Figure: effects of undersampling]

Page 6: Introduction to Signal Processing

Discrete time signal

[Figure: effects of the required sampling frequency]

Page 7: Introduction to Signal Processing

Discrete time signal

• Telephone speech is usually sampled at 8 kHz to capture content up to 4 kHz
• 16 kHz is generally regarded as sufficient for speech recognition and synthesis
• The audio standard is a sample rate of 44.1 kHz (CD) or 48 kHz (Digital Audio Tape) to represent frequencies up to 20 kHz

Page 8: Introduction to Signal Processing

Discrete time signal

[Figure: three cosine plots for x from -10 to 10, illustrating amplitude, phase and frequency]

• Amplitude: f(x) = 5 cos (x)
• Phase: f(x) = 5 cos (x + 3.14)
• Frequency: f(x) = 5 cos (3 x + 3.14)

Page 9: Introduction to Signal Processing

Time-domain signals

• The independent variable is time
• The dependent variable is the amplitude
• Most of the information is hidden in the frequency content

[Figure: four plots of magnitude versus time (0 to 1): 2 Hz, 10 Hz, 20 Hz, and 2 Hz + 10 Hz + 20 Hz]

Page 10: Introduction to Signal Processing

Signal Transformation

• Why: to obtain further information from the signal that is not readily available in the raw signal.
• Raw signal: normally the time-domain signal
• Processed signal: a signal that has been "transformed" by any of the available mathematical transformations
• Fourier transformation: the most popular transformation between the time and frequency domains

Page 11: Introduction to Signal Processing

Frequency domain analysis

• Why frequency information is needed: to see information that is not obvious in the time domain
• Types of frequency transformation: Fourier Transform, Hilbert Transform, Short-time Fourier Transform, the Radon Transform, the Wavelet Transform …

Page 12: Introduction to Signal Processing

Frequency domain analysis

[Figure: basic block diagram of signal transformation. Analysis maps s(t) (time, t) to S(f) = F[s(t)] (frequency, f); synthesis maps S(f) back to s(t). s(t), S(f): transform pair]

General transform as a problem-solving tool

• Powerful and complementary to time-domain analysis methods
• The frequency-domain representation shows the signal energy and phase with respect to frequency
• A fast and efficient way to view the signal's information

Page 13: Introduction to Signal Processing

Frequency domain analysis

Complex numbers, for example: 4.2 + 3.7i, 9.4447 - 6.7i, -5.2 (that is, -5.2 + 0i)

General form: Z = a + bi, with Re(Z) = a and Im(Z) = b

Amplitude: $A = |Z| = \sqrt{a^2 + b^2}$

Phase: $\phi = \angle Z = \tan^{-1}(b/a)$
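The amplitude and phase formulas can be tried on the three example numbers with a short sketch. The helper name `amplitude_and_phase` is hypothetical. It uses `math.atan2(b, a)` instead of a plain arctangent of b/a, an implementation choice that returns the phase in the correct quadrant and also works when a = 0.

```python
import math

def amplitude_and_phase(z):
    """Return (A, phi) for a complex number z = a + bi."""
    a, b = z.real, z.imag
    amplitude = math.sqrt(a * a + b * b)
    phase = math.atan2(b, a)  # quadrant-aware version of atan(b / a)
    return amplitude, phase

for z in (4.2 + 3.7j, 9.4447 - 6.7j, -5.2 + 0j):
    A, phi = amplitude_and_phase(z)
    print(z, round(A, 4), round(phi, 4))
```

For -5.2 + 0i the plain formula tan⁻¹(0 / -5.2) would give 0, whereas the quadrant-aware phase is π, which is why the sketch prefers `atan2`.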

Page 14: Introduction to Signal Processing

Frequency domain analysis

Polar coordinates: Z = a + bi

Amplitude: $A = \sqrt{a^2 + b^2}$

Phase: $\phi = \tan^{-1}(b/a)$

[Figure: Z in the complex plane, with real part a, imaginary part b and amplitude A]

Page 15: Introduction to Signal Processing

Frequency domain analysis

Frequency spectrum
• The frequency components (spectral components) of a signal
• Shows which frequencies exist in the signal

Fourier Transform (FT)
• One way to find the frequency content
• Tells how much of each frequency exists in a signal

[Figure: spectrum of a speech signal]

Page 16: Introduction to Signal Processing

Fourier Transform

• The Fourier transform decomposes a function into a spectrum of its frequency components
• The inverse transform synthesizes a function from its spectrum of frequency components
• The discrete Fourier transform pair is defined as:

[Equations: forward and inverse discrete Fourier transform]

where $X_k$ represents the k-th frequency component and $x_n$ represents the n-th sample in the time domain
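The equations of the transform pair are not legible in this transcript. As a hedged illustration, the sketch below assumes one common convention, $X_k = \sum_{n=0}^{N-1} x_n e^{-j 2\pi k n / N}$ and $x_n = \frac{1}{N} \sum_{k=0}^{N-1} X_k e^{j 2\pi k n / N}$; the slide may place the 1/N factor differently. The direct O(N²) sums are written out for clarity rather than speed, and the names `dft` and `idft` are illustrative.

```python
import cmath
import math

def dft(x):
    """Forward DFT (assumed convention: no scaling on the forward sum)."""
    n_samples = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_samples)
            for n in range(n_samples))
        for k in range(n_samples)
    ]

def idft(X):
    """Inverse DFT (assumed convention: 1/N scaling on the inverse sum)."""
    n_samples = len(X)
    return [
        sum(X[k] * cmath.exp(2j * math.pi * k * n / n_samples)
            for k in range(n_samples)) / n_samples
        for n in range(n_samples)
    ]

# A 4 Hz cosine of amplitude 5, sampled at 64 Hz for one second.
fs = 64
x = [5.0 * math.cos(2 * math.pi * 4 * n / fs) for n in range(fs)]
X = dft(x)
peak_bin = max(range(fs // 2), key=lambda k: abs(X[k]))
print(peak_bin, abs(X[peak_bin]))  # bin index equals frequency in Hz here
x_back = idft(X)                   # synthesis recovers the samples
```

Because the record is exactly one second long, bin k corresponds to k Hz, so the magnitude spectrum peaks at bin 4.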

Page 17: Introduction to Signal Processing

Fourier Transform

[Figure: two time-domain signals (amplitude -5 to 5, samples 0 to 1400) and their Fourier spectra (frequency axis 5, 10, 15 Hz); amplitude only]

Page 18: Introduction to Signal Processing

Fourier Trans. of 1D signal

[Figure: a 1D time-domain signal (amplitude -5 to 5, samples 0 to 1400) and its Fourier transform (frequency axis 5, 10, 15 Hz)]

Page 19: Introduction to Signal Processing

Fourier Spectrum of 1D

Page 20: Introduction to Signal Processing

Fourier Transform

• Fourier analysis uses sinusoids as the basis functions in decomposition
• Fourier transforms give the frequency information, smearing time
• Samples of a function give the temporal information, smearing frequency

Page 21: Introduction to Signal Processing

FS synthesis: square wave reconstruction from spectral terms

$$sw_N(t) = \sum_{k=1}^{N} -b_k \sin(kt), \qquad N = 1, 3, 5, 7, 9, 11$$

[Figure: square signal sw(t) for t from 0 to 10 (amplitude -1.5 to 1.5), shown together with the partial sums $sw_1(t)$, $sw_3(t)$, $sw_5(t)$, $sw_7(t)$, $sw_9(t)$ and $sw_{11}(t)$]

• Convergence may be slow (~1/k); ideally an infinite number of terms is needed.
• Practically, the series is truncated when the remainder is below computer tolerance (error). BUT … Gibbs' phenomenon.
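The slow convergence and the Gibbs overshoot can be observed numerically. The sketch below assumes the textbook series for a ±1 square wave of period 2π, $(4/\pi) \sum_{k\ \mathrm{odd}} \sin(kt)/k$; the coefficients $b_k$ used on the slide are not legible, so this choice and the function names are assumptions.

```python
import math

def square_wave_partial_sum(t, n_max):
    """Sum of the odd harmonics k = 1, 3, ..., n_max of a unit square wave."""
    return (4.0 / math.pi) * sum(
        math.sin(k * t) / k for k in range(1, n_max + 1, 2)
    )

def peak_on_half_period(n_max, n_points=2000):
    """Largest value of the partial sum on a grid over (0, pi)."""
    return max(
        square_wave_partial_sum(math.pi * i / n_points, n_max)
        for i in range(1, n_points)
    )

for n_max in (1, 3, 5, 7, 9, 11, 101):
    mid = square_wave_partial_sum(math.pi / 2, n_max)
    print(n_max, round(mid, 4), round(peak_on_half_period(n_max), 4))
```

The value at the middle of the flat top approaches 1 as terms are added, but the peak next to the jump stays well above 1 however many terms are used. That persistent overshoot is the Gibbs phenomenon.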

Page 22: Introduction to Signal Processing

Stationarity of the signal

Stationary signal
• Signals with frequency content unchanged over the entire time
• All frequency components exist at all times

Non-stationary signal
• Frequency changes in time
• One example: the "chirp signal"

Page 23: Introduction to Signal Processing

Stationarity of the signal

[Figure: stationary signal, 2 Hz + 10 Hz + 20 Hz: magnitude versus time (0 to 1) and magnitude versus frequency (0 to 25 Hz). The components occur at all times]

[Figure: non-stationary signal, 0.0-0.4: 2 Hz + 0.4-0.7: 10 Hz + 0.7-1.0: 20 Hz: magnitude versus time (0 to 1) and magnitude versus frequency (0 to 25 Hz). The components do not appear at all times]

Page 24: Introduction to Signal Processing

Chirp signal

[Figure: two chirp signals, one with frequency rising from 2 Hz to 20 Hz and one with frequency falling from 20 Hz to 2 Hz, each shown as magnitude versus time (0 to 1) and magnitude versus frequency (0 to 25 Hz)]

• Different in the time domain
• Same in the frequency domain

At what time do the frequency components occur? The FT cannot tell!
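A small numerical check of this point, under stated assumptions: the rising chirp is a linear sweep from 2 Hz to 20 Hz, and the falling chirp is modelled as its exact time reversal, so that the two magnitude spectra match to rounding error. The function names and the 128 Hz sampling rate are illustrative choices.

```python
import cmath
import math

def dft_magnitude(x):
    """Magnitude of the DFT of x, computed with the direct O(N^2) sum."""
    n_samples = len(x)
    return [
        abs(sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_samples)
                for n in range(n_samples)))
        for k in range(n_samples)
    ]

def linear_chirp(f_start, f_end, duration, fs):
    """Cosine whose instantaneous frequency moves linearly from f_start to f_end."""
    samples = []
    for i in range(int(duration * fs)):
        t = i / fs
        phase = 2 * math.pi * (f_start * t
                               + 0.5 * (f_end - f_start) * t * t / duration)
        samples.append(math.cos(phase))
    return samples

fs = 128
up = linear_chirp(2, 20, 1.0, fs)   # 2 Hz -> 20 Hz
down = up[::-1]                     # time reversed: 20 Hz -> 2 Hz

mag_up = dft_magnitude(up)
mag_down = dft_magnitude(down)
largest_difference = max(abs(a - b) for a, b in zip(mag_up, mag_down))
print(up != down, largest_difference)
```

The two signals differ sample by sample, yet their magnitude spectra agree, so the magnitude spectrum alone cannot say when each frequency occurred.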

Page 25: Introduction to Signal Processing

Limitations of Fourier Transform

• FT only gives what frequency components exist in the signal
• The time and frequency information cannot be seen at the same time
• A time-frequency representation of the signal is needed
• Most signals are non-stationary

ONE SOLUTION: SHORT-TIME FOURIER TRANSFORM (STFT)

Page 26: Introduction to Signal Processing

Short Time Fourier Transform

• Dennis Gabor (1946) used the STFT to analyze only a small section of the signal at a time, a technique called windowing the signal
• The segment of the signal is assumed stationary
• A 3D transform: a function of time and frequency

$$\mathrm{STFT}_X^{(\omega)}(t', f) = \int_t \left[ x(t)\, \omega^*(t - t') \right] e^{-j 2\pi f t}\, dt$$

where $\omega(t)$ is the window function.
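A discrete sketch of the windowing idea, assuming a Hann window, non-overlapping frames and a direct DFT per frame. None of these choices comes from the slides, and `stft` is an illustrative name. The test signal switches from 20 Hz to 40 Hz halfway through.

```python
import cmath
import math

def stft(x, win_len, hop):
    """Return one spectrum (bins 0 .. win_len // 2) per windowed frame."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (win_len - 1))
              for n in range(win_len)]
    frames = []
    for start in range(0, len(x) - win_len + 1, hop):
        # Multiply the signal by the window placed at this time position.
        segment = [x[start + n] * window[n] for n in range(win_len)]
        spectrum = [
            sum(segment[n] * cmath.exp(-2j * math.pi * k * n / win_len)
                for n in range(win_len))
            for k in range(win_len // 2 + 1)
        ]
        frames.append(spectrum)
    return frames

fs = 200  # Hz, one second of signal
x = [math.sin(2 * math.pi * (20 if n < 100 else 40) * n / fs)
     for n in range(200)]

win_len = 50
frames = stft(x, win_len, hop=50)
dominant_hz = [
    max(range(len(f)), key=lambda k: abs(f[k])) * fs / win_len
    for f in frames
]
print(dominant_hz)  # one dominant frequency per 0.25 s frame
```

Each frame is treated as stationary, so the list of dominant frequencies shows both which frequencies are present and, to within one window length, when they occur.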

Page 27: Introduction to Signal Processing

Short time Fourier Transform

[Figure: speech signal and its STFT, with the FT applied to successive windowed segments]

Page 28: Introduction to Signal Processing

Drawbacks of STFT

• Unchanged window
• Dilemma of resolution
  Narrow window -> poor frequency resolution
  Wide window -> poor time resolution
• Heisenberg uncertainty principle: one cannot know exactly which frequency exists at which time instant, only which frequency bands exist in which time intervals

[Figure: STFT via narrow window and via wide window]

Page 29: Introduction to Signal Processing

Wavelet Transform

To overcome some limitations of Fourier transform

[Figure: discrete wavelet decomposition tree: S → A1 + D1, A1 → A2 + D2, A2 → A3 + D3]

Page 30: Introduction to Signal Processing

Wavelet Overview

Wavelet
• A small wave

Wavelet transforms
• Provide a way of analyzing waveforms, bounded in both frequency and duration
• Allow signals to be stored more efficiently than by the Fourier transform
• Are able to better approximate real-world signals
• Are well suited for approximating data with sharp discontinuities

"The Forest & the Trees"
• Notice gross features with a large "window"
• Notice small features with a small "window"

Page 31: Introduction to Signal Processing

Multi-resolution analysis

Wavelet transform
• An alternative approach to the short time Fourier transform to overcome the resolution problem
• Similar to STFT: the signal is multiplied with a function

Multi-resolution analysis
• Analyze the signal at different frequencies with different resolutions
• Good time resolution and poor frequency resolution at high frequencies
• Good frequency resolution and poor time resolution at low frequencies
• More suitable for short-duration higher-frequency components and longer-duration lower-frequency components

Page 32: Introduction to Signal Processing

Advantages of WT over STFT

• The width of the window is changed as the transform is computed for every spectral component
• Altered resolutions are placed

Page 33: Introduction to Signal Processing

Principles of WT

• Split up the signal into a bunch of signals
• These represent the same signal, but all correspond to different frequency bands
• Only provides what frequency bands exist at what time intervals

Page 34: Introduction to Signal Processing

Principles of WT

Wavelet
• Small wave
• Means the window function is of finite length

Mother wavelet
• A prototype for generating the other window functions
• All the windows used are its dilated or compressed and shifted versions

$$\mathrm{CWT}_x^{\psi}(\tau, s) = \Psi_x^{\psi}(\tau, s) = \frac{1}{\sqrt{|s|}} \int x(t)\, \psi^*\!\left(\frac{t - \tau}{s}\right) dt$$

where $\tau$ is the translation (the location of the window), $s$ is the scale and $\psi$ is the mother wavelet.

Page 35: Introduction to Signal Processing

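Principles of WT

Wavelet bases

Wavelet basis functions (time-domain forms; the slide also tabulates the frequency-domain forms):

Morlet ($\omega_0$ = frequency): $\psi_0(\eta) = \pi^{-1/4}\, e^{i \omega_0 \eta}\, e^{-\eta^2/2}$

Paul ($m$ = order): $\psi_0(\eta) = \dfrac{2^m i^m m!}{\sqrt{\pi (2m)!}}\, (1 - i\eta)^{-(m+1)}$

DOG ($m$ = derivative): $\psi_0(\eta) = \dfrac{(-1)^{m+1}}{\sqrt{\Gamma\!\left(m + \tfrac{1}{2}\right)}}\, \dfrac{d^m}{d\eta^m}\, e^{-\eta^2/2}$

DOG: derivative of a Gaussian; m = 2 is the Marr or Mexican hat wavelet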

Page 36: Introduction to Signal Processing

Scale of wavelet

Scale
• s > 1: dilate the signal
• s < 1: compress the signal

• Low frequency -> high scale -> non-detailed global view of the signal -> spans the entire signal
• High frequency -> low scale -> detailed view that lasts a short time
• Only a limited interval of scales is necessary

Page 37: Introduction to Signal Processing

Computation of WT

Step 1: The wavelet is placed at the beginning of the signal, and s is set to 1 (the most compressed wavelet);
Step 2: The wavelet function at scale "1" is multiplied by the signal and integrated over all times, then multiplied by $1/\sqrt{s}$;
Step 3: Shift the wavelet to t = τ, and get the transform value at t = τ and s = 1;
Step 4: Repeat the procedure until the wavelet reaches the end of the signal;
Step 5: Scale s is increased by a sufficiently small value, and the above procedure is repeated for all s;
Step 6: Each computation for a given s fills a single row of the time-scale plane;
Step 7: The CWT is obtained once all s are calculated.

$$\mathrm{CWT}_x^{\psi}(\tau, s) = \Psi_x^{\psi}(\tau, s) = \frac{1}{\sqrt{|s|}} \int x(t)\, \psi^*\!\left(\frac{t - \tau}{s}\right) dt$$
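The steps above can be sketched as nested loops. This is a minimal illustration, not the presenter's implementation: it assumes an unnormalized Mexican hat as the (real-valued) mother wavelet, approximates the integral by a Riemann sum, and uses a short list of scales rather than a fine scale sweep starting at s = 1. The names `mexican_hat` and `cwt` are hypothetical.

```python
import math

def mexican_hat(u):
    """Unnormalized Mexican hat wavelet (assumed mother wavelet)."""
    return (1.0 - u * u) * math.exp(-0.5 * u * u)

def cwt(x, dt, scales, taus):
    """Rows indexed by scale, columns by translation tau."""
    rows = []
    for s in scales:                          # Steps 5-7: one row per scale
        row = []
        for tau in taus:                      # Steps 3-4: slide the wavelet
            acc = 0.0
            for n, sample in enumerate(x):    # Step 2: multiply and integrate
                acc += sample * mexican_hat((n * dt - tau) / s)
            row.append(acc * dt / math.sqrt(s))   # multiply by 1/sqrt(s)
        rows.append(row)
    return rows

# Test signal: a single wavelet-shaped bump of scale 2 centred at t = 20.
dt = 0.1
x = [mexican_hat((n * dt - 20.0) / 2.0) for n in range(400)]
scales = [0.5, 1.0, 2.0, 4.0]
taus = [float(k) for k in range(41)]

coeffs = cwt(x, dt, scales, taus)
best_value, best_scale, best_tau = max(
    (abs(coeffs[i][j]), scales[i], taus[j])
    for i in range(len(scales))
    for j in range(len(taus))
)
print(best_scale, best_tau, round(best_value, 3))
```

The largest coefficient is found at the scale and translation that match the bump, which is the sense in which the time-scale plane localizes a feature in both position and size.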

Page 38: Introduction to Signal Processing

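Time & Frequency Resolution

[Figure: tiling of the time-frequency plane, with time on the horizontal axis and frequency on the vertical axis. One region is labelled "better time resolution; poor frequency resolution" and the other "better frequency resolution; poor time resolution"]

• Each box represents an equal portion of the time-frequency plane
• Resolution in STFT is selected once for the entire analysis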

Page 39: Introduction to Signal Processing

Comparison of transformations

Page 40: Introduction to Signal Processing

Discretization of WT

• It is necessary to sample the time-frequency (scale) plane.
• At high scale s (lower frequency f), the sampling rate N can be decreased.
• The scale parameter s is normally discretized on a logarithmic grid.
• The most common base for the logarithmic grid is 2.

$$N_2 = \frac{s_1}{s_2} N_1 = \frac{f_2}{f_1} N_1$$

s: 2, 4, 8, …
N: 32, 16, 8, …

Page 41: Introduction to Signal Processing

Effective and Fast DWT

• The discretized WT is not a true discrete transform

Discrete Wavelet Transform (DWT)
• Provides sufficient information both for analysis and synthesis
• Reduces the computation time sufficiently
• Easier to implement
• Analyzes the signal at different frequency bands with different resolutions
• Decomposes the signal into a coarse approximation and detail information

[Figure: decomposition tree: S → A1 + D1, A1 → A2 + D2, A2 → A3 + D3]

Page 42: Introduction to Signal Processing

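Decomposition with DWT

• Halves the time resolution: only half the number of samples result
• Doubles the frequency resolution: the spanned frequency band is halved

[Figure: three-level filter bank (Filter 1, Filter 2, Filter 3) and the decomposition tree S → A1 + D1, A1 → A2 + D2, A2 → A3 + D3]

Input X[n]: 0-1000 Hz, 512 samples
Filter 1: D1: 500-1000 Hz, 256 samples; A1: 256 samples
Filter 2: D2: 250-500 Hz, 128 samples; A2: 128 samples
Filter 3: D3: 125-250 Hz, 64 samples; A3: 0-125 Hz, 64 samples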
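The band edges and sample counts in the diagram follow from repeated halving, which a few lines can reproduce. The function name `dwt_bands` and the tuple layout are illustrative assumptions, and ideal half-band filters are assumed.

```python
def dwt_bands(f_max, n_samples, levels):
    """List (name, low_hz, high_hz, samples) for each detail band and the
    final approximation of a dyadic decomposition."""
    rows = []
    low, high, count = 0.0, float(f_max), n_samples
    for level in range(1, levels + 1):
        mid = (low + high) / 2.0
        count //= 2                      # downsampling halves the samples
        rows.append((f"D{level}", mid, high, count))  # upper half: detail
        high = mid                       # lower half is decomposed further
    rows.append((f"A{levels}", low, high, count))
    return rows

for name, lo_hz, hi_hz, count in dwt_bands(1000, 512, 3):
    print(f"{name}: {lo_hz:g}-{hi_hz:g} Hz, {count} samples")
```

For a 512-sample input spanning 0-1000 Hz and three levels, this reproduces the D1, D2, D3 and A3 bands listed on the slide.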

Page 43: Introduction to Signal Processing

Decomposition of non-stationary signal

Wavelet: db4
Level: 6
Signal: 0.0-0.4: 20 Hz; 0.4-0.7: 10 Hz; 0.7-1.0: 2 Hz

[Figure: six-level decomposition, with components ordered from high frequency (fH) to low frequency (fL)]

Page 44: Introduction to Signal Processing

Decomposition of non-stationary signal

Wavelet: db4
Level: 6
Signal: 0.0-0.4: 2 Hz; 0.4-0.7: 10 Hz; 0.7-1.0: 20 Hz

[Figure: six-level decomposition, with components ordered from high frequency (fH) to low frequency (fL)]

Page 45: Introduction to Signal Processing

Reconstruction from WT

What
• How can those components be assembled back into the original signal without loss of information?
• A process after decomposition or analysis
• Also called synthesis

How
• Reconstruct the signal from the wavelet coefficients
• Where wavelet analysis involves filtering and downsampling, the wavelet reconstruction process consists of upsampling and filtering

Page 46: Introduction to Signal Processing

Reconstruction from WT

• Upsampling: lengthening a signal component by inserting zeros between samples
• MATLAB commands: idwt and waverec
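A one-level analysis and synthesis round trip can be sketched with the Haar wavelet. Haar is an assumption made for brevity (the slides use db4), an even-length input is assumed, and the helper names are hypothetical. Synthesis is written explicitly as zero-insertion followed by filtering.

```python
import math

R = 1.0 / math.sqrt(2.0)  # Haar filter coefficient

def haar_analysis(x):
    """Filter and downsample: one approximation and one detail value per pair."""
    approx = [R * (x[i] + x[i + 1]) for i in range(0, len(x) - 1, 2)]
    detail = [R * (x[i] - x[i + 1]) for i in range(0, len(x) - 1, 2)]
    return approx, detail

def upsample(coeffs):
    """Lengthen a component by inserting a zero after every sample."""
    out = []
    for value in coeffs:
        out.extend([value, 0.0])
    return out

def fir(u, taps):
    """Causal FIR filter: y[n] = sum_k taps[k] * u[n - k]."""
    return [
        sum(taps[k] * u[n - k] for k in range(len(taps)) if n - k >= 0)
        for n in range(len(u))
    ]

def haar_synthesis(approx, detail):
    """Upsample both components, filter them, and add the results."""
    low = fir(upsample(approx), [R, R])
    high = fir(upsample(detail), [R, -R])
    return [a + b for a, b in zip(low, high)]

x = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approx, detail = haar_analysis(x)
rebuilt = haar_synthesis(approx, detail)
print(approx, detail)
print(rebuilt)
```

The rebuilt samples equal the input to rounding error, which illustrates reconstruction without loss of information for this filter pair.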

Page 47: Introduction to Signal Processing

Wavelet Applications

Typical application fields
• Astronomy, acoustics, nuclear engineering, sub-band coding, signal and image processing, neurophysiology, music, magnetic resonance imaging, speech discrimination, optics, fractals, turbulence, earthquake prediction, radar, human vision, and pure mathematics applications

Sample applications
• De-noising signals
• Breakdown detecting
• Detecting self-similarity
• Compressing images
• Identifying pure tone

Page 48: Introduction to Signal Processing

Signal De-noising

• Highest frequencies appear at the start of the original signal
• Approximations appear less and less noisy
• They also lose progressively more high-frequency information
• In A5, about the first 20% of the signal is truncated

Page 49: Introduction to Signal Processing

Breakdown Detection

• The discontinuous signal consists of a slow sine wave abruptly followed by a medium sine wave.
• The 1st and 2nd level details (D1 and D2) show the discontinuity most clearly

Things to be detected
• The site of the change
• The type of change (a rupture of the signal, or an abrupt change in its first or second derivative)
• The amplitude of the change

[Figure: discontinuity points]

Page 50: Introduction to Signal Processing

Detecting Self-similarity

Purpose
• How analysis by wavelets can detect a self-similar, or fractal, signal.
• The signal here is the Koch curve, a synthetic signal that is built recursively

Analysis
• If a signal is similar to itself at different scales, then the "resemblance index" or wavelet coefficients also will be similar at different scales.
• In the coefficients plot, which shows scale on the vertical axis, this self-similarity generates a characteristic pattern.

Page 51: Introduction to Signal Processing

Image Compression

Fingerprints
• The FBI maintains a large database of fingerprints, about 30 million sets of them.
• The cost of storing all this data runs to hundreds of millions of dollars.

Results
• Values under the threshold are forced to zero, achieving about 42% zeros while retaining almost all (99.96%) of the energy of the original image.
• By turning to wavelets, the FBI has achieved a 15:1 compression ratio, better than the more traditional JPEG compression

Page 52: Introduction to Signal Processing

Identifying Pure Tone

Purpose
• Resolving a signal into constituent sinusoids of different frequencies
• The signal is a sum of three pure sine waves

Analysis
• D1 contains signal components whose period is between 1 and 2.
• Zooming in on detail D1 reveals that each "belly" is composed of 10 oscillations.
• D3 and D4 contain the medium sine frequencies.
• There is a breakdown between approximations A3 and A4 -> the medium frequency has been subtracted.
• Approximations A1 to A3 can be used to estimate the medium sine.
• Zooming in on A1 reveals a period of around 20.

Page 53: Introduction to Signal Processing

Empirical Mode Decomposition: Principle

Objective: from one observation of x(t), get an AM-FM type representation:

$$x(t) = \sum_{k=1}^{K} a_k(t)\, \Psi_k(t)$$

with $a_k(\cdot)$ amplitude modulating functions and $\Psi_k(\cdot)$ oscillating functions.

Idea: "signal = fast oscillations superimposed on slow oscillations".

Operating mode ("EMD", Huang et al., '98): (1) identify, locally in time, the fastest oscillation; (2) subtract it from the original signal; (3) iterate upon the residual.
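One pass of the operating mode can be sketched as follows. This is a simplified illustration under stated assumptions, not the published algorithm: the envelopes are built by linear interpolation between local extrema (EMD as described by Huang et al. uses cubic splines), the envelope is held constant beyond the first and last extremum, a single sifting pass is made with no stopping criterion, and the signal is assumed to contain at least one maximum and one minimum. All function names are hypothetical.

```python
import math

def local_extrema(x):
    """Indices of interior local maxima and minima."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] >= x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] <= x[i + 1]:
            minima.append(i)
    return maxima, minima

def envelope(indices, x, n):
    """Piecewise-linear envelope through x at the given indices
    (assumption: constant extension outside the first/last extremum)."""
    out, j = [], 0
    for i in range(n):
        if i <= indices[0]:
            out.append(x[indices[0]])
        elif i >= indices[-1]:
            out.append(x[indices[-1]])
        else:
            while indices[j + 1] < i:
                j += 1
            w = (i - indices[j]) / (indices[j + 1] - indices[j])
            out.append(x[indices[j]] + w * (x[indices[j + 1]] - x[indices[j]]))
    return out

def sift_once(x):
    """Subtract the local mean of the envelopes; return (detail, local_mean)."""
    maxima, minima = local_extrema(x)
    upper = envelope(maxima, x, len(x))
    lower = envelope(minima, x, len(x))
    local_mean = [(u + l) / 2.0 for u, l in zip(upper, lower)]
    detail = [v - m for v, m in zip(x, local_mean)]
    return detail, local_mean

fs = 1000
t = [n / fs for n in range(fs)]
slow = [0.5 * math.sin(2 * math.pi * 1 * ti) for ti in t]
x = [math.sin(2 * math.pi * 20 * ti) + s for ti, s in zip(t, slow)]

detail, local_mean = sift_once(x)
interior_error = max(abs(local_mean[i] - slow[i]) for i in range(100, 900))
print(round(interior_error, 4))
```

Away from the ends, the local mean follows the slow oscillation closely, so the detail is approximately the fast oscillation. In the full method the same procedure is then repeated on the residual.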

Page 54: Introduction to Signal Processing

Empirical Mode Decomposition: Principle

[Figure: a LF sawtooth + a linear FM = the composite signal, each plotted over the interval 0 to 1]

Page 55: Introduction to Signal Processing

Empirical Mode Decomposition: Algorithmic definition

[Figures, pages 55 to 87: successive frames of the sifting process. The frames lead in turn to the first intrinsic mode function (page 69), the second intrinsic mode function (page 77), the third intrinsic mode function (page 84) and the residue (page 86). The final frame (page 87) shows the signal, the 1st, 2nd and 3rd intrinsic mode functions, and the residue]

Page 88: Introduction to Signal Processing

Empirical Mode Decomposition: Intrinsic Mode Functions

• Quasi-monochromatic harmonic oscillations: #{zero crossings} = #{extrema} ± 1, with symmetric envelopes around the y = 0 axis
• IMF ≠ Fourier mode and, in nonlinear situations, IMF = several Fourier modes
• Output of a self-adaptive time-varying filter (≠ standard linear filter)

Example: 2 sinusoidal FM components + Gaussian wave packet
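The counting condition can be checked directly on sampled data. The sketch below tests only the relation between the numbers of zero crossings and extrema. It does not test the envelope symmetry condition, and the function names are illustrative.

```python
import math

def count_zero_crossings(x):
    """Number of sign changes between consecutive samples."""
    return sum(1 for i in range(1, len(x)) if x[i - 1] * x[i] < 0)

def count_extrema(x):
    """Number of interior local maxima and minima."""
    count = 0
    for i in range(1, len(x) - 1):
        is_max = x[i - 1] < x[i] >= x[i + 1]
        is_min = x[i - 1] > x[i] <= x[i + 1]
        if is_max or is_min:
            count += 1
    return count

def satisfies_imf_count_condition(x):
    """True when the two counts differ by at most one."""
    return abs(count_zero_crossings(x) - count_extrema(x)) <= 1

fs = 1000
tone = [math.sin(2 * math.pi * 5 * n / fs + 0.3) for n in range(fs)]
offset_tone = [v + 2.0 for v in tone]   # same oscillation riding on an offset

print(count_zero_crossings(tone), count_extrema(tone),
      satisfies_imf_count_condition(tone))
print(count_zero_crossings(offset_tone), count_extrema(offset_tone),
      satisfies_imf_count_condition(offset_tone))
```

The pure tone satisfies the condition. The offset tone has the same extrema but never crosses zero, so it fails, which is why sifting removes the local mean before a component is accepted as an IMF.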

Page 89: Introduction to Signal Processing

Empirical Mode Decomposition: Instantaneous frequency (IF)

• The analytic version of each IMF $C_i(t)$ is computed using the Hilbert transform as:

$$z_i(t) = C_i(t) + jH[C_i(t)] = a_i(t)\, e^{j\theta_i(t)}$$

• Hence $z_i(t)$ becomes complex, with phase $\theta_i(t)$ and amplitude $a_i(t)$. The IF can then be computed as:

$$\omega_i(t) = \frac{d\theta_i(t)}{dt}$$

• The Hilbert spectrum (HS) is a triplet: $H(\omega, t) = \{t, \omega_i(t), a_i(t)\}$
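A discrete sketch of these two formulas, under stated assumptions: the analytic signal is obtained by zeroing the negative-frequency DFT bins and doubling the positive ones (a common frequency-domain construction of $C + jH[C]$), and the derivative of the phase is approximated by the phase step between consecutive samples, reported in Hz rather than rad/s. The function names are hypothetical.

```python
import cmath
import math

def analytic_signal(x):
    """Return z[n] = x[n] + j * H[x][n], built in the frequency domain."""
    n_samples = len(x)
    spectrum = [
        sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_samples)
            for n in range(n_samples))
        for k in range(n_samples)
    ]
    for k in range(n_samples):
        if k == 0 or (n_samples % 2 == 0 and k == n_samples // 2):
            gain = 1.0      # DC and Nyquist bins are kept once
        elif k < (n_samples + 1) // 2:
            gain = 2.0      # positive frequencies are doubled
        else:
            gain = 0.0      # negative frequencies are removed
        spectrum[k] *= gain
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * n / n_samples)
            for k in range(n_samples)) / n_samples
        for n in range(n_samples)
    ]

def instantaneous_frequency(z, fs):
    """Phase increment between consecutive samples, converted to Hz."""
    freqs = []
    for n in range(1, len(z)):
        dtheta = cmath.phase(z[n] * z[n - 1].conjugate())
        freqs.append(dtheta * fs / (2 * math.pi))
    return freqs

fs = 100
c = [math.cos(2 * math.pi * 5 * n / fs) for n in range(fs)]  # a 5 Hz "IMF"
z = analytic_signal(c)
amplitude = [abs(v) for v in z]
freq_hz = instantaneous_frequency(z, fs)
print(round(min(freq_hz), 6), round(max(freq_hz), 6))
```

For a pure 5 Hz cosine the amplitude is constant at 1 and the instantaneous frequency is 5 Hz at every sample. For a real IMF both quantities vary with time, and plotting them against time gives the Hilbert spectrum.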

Page 90: Introduction to Signal Processing

Empirical Mode Decomposition: Intrinsic Mode Functions

[Figure: the signal, its spectrum, and its time-frequency representation (time versus frequency)]

Page 91: Introduction to Signal Processing

Empirical Mode Decomposition: Intrinsic Mode Functions

[Figure: signal]

Page 92: Introduction to Signal Processing

Empirical Mode Decomposition: Intrinsic Mode Functions

[Figure: time-frequency representation (time versus frequency) of the signal and of its 1st, 2nd and 3rd IMFs]

Page 93: Introduction to Signal Processing

Empirical Mode Decomposition: Intrinsic Mode Functions

• EMD is a fully data-adaptive decomposition for spectral and time-frequency representation of non-linear and non-stationary time series
• It does not employ any basis function for decomposition
• It produces perfect localization of the signal components in the high-resolution time-frequency space of the time series

Page 94: Introduction to Signal Processing

Empirical Mode Decomposition: Comparison between HS and STFT

[Figure: time-frequency representation of two pure tones (100 Hz and 250 Hz) using the Hilbert spectrum (HS) and the STFT]

Page 95: Introduction to Signal Processing

Empirical Mode Decomposition: Comparison between Wavelet and HS

[Figure: wavelet representation and Hilbert spectrum (HS)]

Page 96: Introduction to Signal Processing

Remarks on FT

• The Fourier Transform has a mathematical foundation
• It can be used in robust analysis, having phase information
• The detailed signal information is limited by the basis (sinusoid) function
• STFT analysis includes some additional cross-spectral energy that degrades the performance in some applications

Page 97: Introduction to Signal Processing

Remarks on WT

• WT employs basis functions adapted to its time and frequency scales (dilated and shifted versions of a mother wavelet)
• It can produce more detailed signal information in the T-F representation
• WT also performs well in multi-band decomposition
• The reconstruction error of the multi-band representation is much less than that of the FT
• It cannot preserve the phase information for perfect reconstruction from T-F space

Page 98: Introduction to Signal Processing

Remarks on EMD and HS

• EMD is a fully adaptive multi-band decomposition method
• It produces perfect localization of signal components in T-F space
• HS can represent the instantaneous spectra of the signal
• The signal can be reconstructed with negligible error terms
• It does not have a mathematical foundation yet
• It is difficult to use EMD-based decomposition in robust analysis

Page 99: Introduction to Signal Processing

Application of DSP in audio analysis

• Audio source separation from a mixture using independent subspace analysis (ISA)
• Audio source separation by spatial localization in the underdetermined case
• Robust pitch estimation using EMD

Page 100: Introduction to Signal Processing

Application of DSP in audio analysis

Audio source separation from a mixture using independent subspace analysis (ISA)

Page 101: Introduction to Signal Processing

Source separation by independent subspace analysis (ISA)

[Block diagram: audio mixture s(t) → STFT → PCA → basis vector selection → ICA → basis vector clustering → source spectrograms → ISTFT → individual sources]

Page 102: Introduction to Signal Processing

Short Time Fourier Transform (STFT)

[Figure: T-F representation of the mixture. The mixture audio is analyzed with a 30 ms window and 20 ms overlap, giving the magnitude spectrogram X and the phase information]
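As a rough illustration of this step, the sketch below splits a signal into 30 ms frames with 20 ms overlap and returns the magnitude spectrogram and the phase separately. The 16 kHz sampling rate, the Hann window and the function name `stft_mag_phase` are assumptions made for the example; the slides only give the window length and the overlap.

```python
import numpy as np

def stft_mag_phase(x, fs, win_ms=30.0, overlap_ms=20.0):
    """Return (magnitude, phase), each of shape (n_bins, n_frames)."""
    n_win = int(round(fs * win_ms / 1000.0))              # samples per window
    hop = n_win - int(round(fs * overlap_ms / 1000.0))    # frame shift in samples
    window = np.hanning(n_win)                            # assumed window shape
    n_frames = 1 + (len(x) - n_win) // hop
    frames = np.stack(
        [x[i * hop:i * hop + n_win] * window for i in range(n_frames)], axis=1
    )
    spec = np.fft.rfft(frames, axis=0)                    # one spectrum per frame
    return np.abs(spec), np.angle(spec)

fs = 16000                                                # assumed sampling rate
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 1000 * t)                          # 1 kHz test tone
X, phase = stft_mag_phase(x, fs)
```

The magnitude array plays the role of X in the separation model, while the phase array is kept aside for re-synthesis.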

Page 103: Introduction to Signal Processing

Proposed separation model

Mixture spectrogram: $X = \sum_i x_i$, where the $x_i$ are the source spectrograms

$x_i = B_i A_i$

$B_i$: invariant frequency n-component basis

$A_i$: corresponding amplitude envelope

$A_i = B_i^T X, \quad B_i = X A_i^T$

** The goal is to find independent $B_i$ or $A_i$

$B_i = [b_1^{(i)}, b_2^{(i)}, \ldots, b_n^{(i)}]$

$A_i = [a_1^{(i)}, a_2^{(i)}, \ldots, a_n^{(i)}]^T$

Page 104: Introduction to Signal Processing

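Dimension Reduction

• The number of rows or columns of X exceeds the number of sources
• The dimension of X therefore has to be reduced
• Singular value decomposition (SVD) is used

$X_{n \times k} = U_{n \times n} S_{n \times k} V_{k \times k}^T$

- U and V: orthogonal matrices (column-wise)
- S: diagonal matrix of singular values, $\sigma_1 \ge \sigma_2 \ge \sigma_3 \ge \ldots \ge \sigma_n \ge 0$

p basis vectors (from U or V) are selected by setting $\phi$ = 0.5 to 0.6 in the inequality

$\frac{1}{\sum_{i=1}^{n} \sigma_i} \sum_{i=1}^{p} \sigma_i \ge \phi$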
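The selection rule can be sketched as follows: compute the SVD, accumulate the singular values, and keep the smallest p whose cumulative share reaches the threshold. The function name `select_basis` and the default threshold of 0.55 (inside the 0.5 to 0.6 range quoted on the slide) are assumptions for illustration only.

```python
import numpy as np

def select_basis(X, phi=0.55):
    """Keep the first p left singular vectors whose singular values
    account for at least a fraction phi of the sum of all singular values."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    ratio = np.cumsum(s) / np.sum(s)            # cumulative share of the singular values
    p = int(np.searchsorted(ratio, phi)) + 1    # smallest p with ratio[p - 1] >= phi
    return U[:, :p], s, p

# Toy matrix with known singular values 5, 3, 1, 1
X_toy = np.diag([5.0, 3.0, 1.0, 1.0])
B, s, p = select_basis(X_toy, phi=0.55)
```

For this toy matrix the cumulative shares are 0.5, 0.8, 0.9 and 1.0, so two basis vectors are kept.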

Page 105: Introduction to Signal Processing

Proposed separation model

To derive the basis vectors, singular value decomposition (SVD) is applied as PCA

Some principal components are selected as basis vectors

Independent component analysis (ICA) is applied to make the bases independent

Page 106: Introduction to Signal Processing

Independent basis vectors

[Figure: basis vectors before ICA and after ICA]

** The bases are independent along time frames

Page 107: Introduction to Signal Processing

Producing source subspaces

[Figure: the bases of the speech signal, shown as time bases and frequency bases]

Page 108: Introduction to Signal Processing

Source Subspaces (cont.)

[Block diagram: mixture spectrogram → PCA + basis selection + ICA → basis vectors B and A → KLd-based clustering → source subspaces B1A1 and B2A2 → source spectrograms]

Page 109: Introduction to Signal Processing

Source re-synthesis

[Block diagram: separated subspaces (spectrograms) → append phase information → inverse STFT]

$S_i(n,k) = x_i \cdot e^{j[\phi(n,k)]}$

[Audio examples: mixture of speech and bip-bip sound, separated speech, separated bip-bip sound]
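The re-synthesis step can be sketched as below: a magnitude spectrogram is combined with a phase array and inverted frame by frame with overlap-add. The frame sizes correspond to a 30 ms window with 20 ms overlap at an assumed 16 kHz rate, and the Hann window, the normalisation by the summed window, and the names `stft` and `resynthesize` are choices made for this example rather than details given in the slides.

```python
import numpy as np

N_WIN, HOP = 480, 160          # 30 ms window, 10 ms shift at an assumed 16 kHz rate
WINDOW = np.hanning(N_WIN)     # assumed analysis window

def stft(x):
    n_frames = 1 + (len(x) - N_WIN) // HOP
    frames = np.stack(
        [x[i * HOP:i * HOP + N_WIN] * WINDOW for i in range(n_frames)], axis=1
    )
    return np.fft.rfft(frames, axis=0)

def resynthesize(x_i, phase, length):
    """Attach the phase to a magnitude spectrogram and invert by overlap-add."""
    S_i = x_i * np.exp(1j * phase)                   # S_i = x_i * e^{j phi}
    frames = np.fft.irfft(S_i, n=N_WIN, axis=0)
    out = np.zeros(length)
    norm = np.zeros(length)
    for i in range(frames.shape[1]):
        out[i * HOP:i * HOP + N_WIN] += frames[:, i]
        norm[i * HOP:i * HOP + N_WIN] += WINDOW      # summed analysis window
    valid = norm > 1e-8
    out[valid] /= norm[valid]                        # undo the window weighting
    return out

rng = np.random.default_rng(0)
mixture = rng.standard_normal(16000)
spec = stft(mixture)
rebuilt = resynthesize(np.abs(spec), np.angle(spec), len(mixture))
```

Here the unmodified magnitude is passed back in, so the output should match the input away from the signal edges; in the separation setting the magnitude would be replaced by a separated source spectrogram.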

Page 110: Introduction to Signal Processing

Experimental results

Signals separated with the proposed algorithm

[Audio examples: mixtures and separated signals for speech + bip-bip sound and for male + female speech]

Page 111: Introduction to Signal Processing

Application of DSP in audio analysis

Audio source separation by spatial localization in the underdetermined case

Page 112: Introduction to Signal Processing

Localization based separation

• To avoid dependence on the spectral characteristics and content of the signals during separation
• To increase the number of sources that can be separated
• The spatial location of the sources is considered

Binaural mixtures are used instead of a single mixture

Page 113: Introduction to Signal Processing

Localization based separation (cont.)

• Consider a multi-source audio situation
• Humans can easily localize and separate the sources with the HAS (human auditory system)
• The binaural cues ITD and ILD are mainly used in source localization
• Separation is performed by applying beamforming and a binary mask

Page 114: Introduction to Signal Processing

Source localization cues

• Interaural time difference (ITD) between the signals of two microphones (like the two ears of a human)

• Interaural level difference (ILD)

[Figure: illustration of ITD and ILD]

Page 115: Introduction to Signal Processing

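Source localization

$X_r(\omega)$ and $X_l(\omega)$ are the STFTs of $x_r(t)$ and $x_l(t)$

ITD and ILD are calculated as

$ITD(\omega) = \frac{1}{\omega}\{\phi_l(\omega) - \phi_r(\omega)\}$

$ILD(\omega) = 20\log_{10}\frac{|X_l(\omega)|}{|X_r(\omega)|}$

where $\phi_r(\omega)$ and $\phi_l(\omega)$ are the unwrapped phases of $X_r(\omega)$ and $X_l(\omega)$, respectively, at frequency $\omega$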
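A minimal sketch of the two cues for one STFT frame is given below, assuming the convention of left-minus-right unwrapped phase divided by the angular frequency for ITD, and 20 log10 of the left-to-right magnitude ratio for ILD. The function name `itd_ild`, the synthetic delay of 0.1 ms and the level factor of 2 are illustrative assumptions.

```python
import numpy as np

def itd_ild(X_l, X_r, omega):
    """ITD (seconds) and ILD (dB) per frequency bin for one pair of spectra."""
    phi_l = np.unwrap(np.angle(X_l))     # unwrapped phase, left channel
    phi_r = np.unwrap(np.angle(X_r))     # unwrapped phase, right channel
    itd = (phi_l - phi_r) / omega
    ild = 20.0 * np.log10(np.abs(X_l) / np.abs(X_r))
    return itd, ild

# Synthetic check: the left spectrum carries a 0.1 ms phase slope and twice the level
freqs = np.arange(100.0, 1001.0, 100.0)          # Hz, zero frequency excluded
omega = 2 * np.pi * freqs
tau = 1e-4
X_r = np.ones_like(omega, dtype=complex)
X_l = 2.0 * np.exp(1j * omega * tau)
itd, ild = itd_ild(X_l, X_r, omega)
```

The test frequencies stay low enough that the phase difference never exceeds half a cycle, which is the regime in which ITD is unambiguous.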

Page 116: Introduction to Signal Processing

Source localization (cont.)

ITD becomes ambiguous at higher frequencies (depending on the microphone spacing)

ILD dominates there and resolves the ambiguity

[Figure: ITD at low frequency and at high frequency]

Page 117: Introduction to Signal Processing

Source localization (cont.)

• ITD and ILD are quantized into 50 levels
• Collecting the T-F points that correspond to each quantized ITD/ILD pair produces peaks in the ITD-ILD histogram
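The quantization and counting step amounts to a two-dimensional histogram over the ITD and ILD values of all T-F points. The sketch below uses synthetic cue values for two hypothetical sources; the cluster positions, spreads and the function name `itd_ild_histogram` are made up for the example.

```python
import numpy as np

def itd_ild_histogram(itd, ild, levels=50):
    """Count T-F points in each quantized (ITD, ILD) cell."""
    hist, itd_edges, ild_edges = np.histogram2d(itd.ravel(), ild.ravel(), bins=levels)
    return hist, itd_edges, ild_edges

# Synthetic cues: two sources at different (ITD, ILD) positions
rng = np.random.default_rng(0)
itd = np.concatenate([rng.normal(-2e-4, 1e-5, 2000), rng.normal(2e-4, 1e-5, 2000)])
ild = np.concatenate([rng.normal(-3.0, 0.2, 2000), rng.normal(3.0, 0.2, 2000)])
hist, itd_edges, ild_edges = itd_ild_histogram(itd, ild)
```

Each cluster of cue values shows up as a peak region in `hist`, with empty cells between the two sources.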

Page 118: Introduction to Signal Processing

Separation by beamforming

• ITD is derived for each of the localized sources
• Spatial beamforming is applied
• Linearly constrained minimum variance beamforming (LCMVB) is used
• The gain is selected based on the spatial locations

Page 119: Introduction to Signal Processing

Separation by binary mask with HS

• It is required to avoid the limitations of spatial beamforming
• Separation is performed by binary mask estimation based on ITD/ILD
• The sources are considered disjoint orthogonal in T-F space: not more than one source is active at any T-F point

Page 120: Introduction to Signal Processing

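Computing ITD and ILD

• Each mixture is transformed to the T-F domain using the Hilbert spectra ($H_L$ and $H_R$)
• ITD and ILD are measured as:

$\left(ITD(\omega,t_f),\ ILD(\omega,t_f)\right) = \left(\frac{1}{\omega}\angle\frac{H_L(\omega,t_f)}{H_R(\omega,t_f)},\ \frac{H_R(\omega,t_f)}{H_L(\omega,t_f)}\right)$

$ILD_{dB}(\omega,t_f) = 20\log_{10}\frac{H_R(\omega,t_f)^2}{H_L(\omega,t_f)^2}$

where $t_f$ is the time frame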

Page 121: Introduction to Signal Processing

ITD-ILD Space Localization

• ITD and ILD are quantized into 50 levels
• Collecting the T-F points of the HS that correspond to each quantized ITD/ILD pair produces peaks

Page 122: Introduction to Signal Processing

Source Separation

• Each peak region in the histogram refers to a source of the binaural mixtures
• Construct a binary mask $M_i(\omega,t)$ (nullifying the T-F points of interfering sources)
• The HS of the ith source is separated as

$H_i(\omega,t) = M_i(\omega,t)\,H_L(\omega,t)$

• The time-domain ith source is given as

$s_i(t) = \sum_{\omega} H_i(\omega,t)\cos[\phi(\omega,t)]$
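One possible way to build the binary masks is sketched below: each T-F point is assigned to the nearest histogram peak in the (ITD, ILD) plane, and the mask of a source is 1 where that source wins and 0 elsewhere. The nearest-peak rule, the per-axis normalisation and the names `binary_masks` and `centers` are assumptions for illustration; the slides only state that the mask nullifies the T-F points of interfering sources.

```python
import numpy as np

def binary_masks(itd, ild, centers):
    """Return one 0/1 mask per source from per-point ITD and ILD values.

    centers has shape (n_sources, 2) and holds the (ITD, ILD) peak positions.
    """
    pts = np.stack([itd, ild], axis=-1)                    # (..., 2)
    scale = pts.reshape(-1, 2).std(axis=0) + 1e-12         # put both cues on a comparable scale
    dist = np.linalg.norm((pts[..., None, :] - centers) / scale, axis=-1)
    nearest = np.argmin(dist, axis=-1)                     # index of the closest peak
    return [(nearest == i).astype(float) for i in range(len(centers))]

# Tiny 2x2 T-F grid with two sources
itd = np.array([[-1e-4, 1e-4], [1e-4, -1e-4]])
ild = np.array([[-3.0, 3.0], [3.0, -3.0]])
centers = np.array([[-1e-4, -3.0], [1e-4, 3.0]])
masks = binary_masks(itd, ild, centers)

H_L = np.array([[1.0, 2.0], [3.0, 4.0]])                   # stand-in for the left-channel HS
H_1 = masks[0] * H_L                                       # H_i = M_i * H_L
H_2 = masks[1] * H_L
```

Because every T-F point belongs to exactly one mask, the separated spectra add back up to the masked mixture spectrum.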

Page 123: Introduction to Signal Processing

Source disjoint orthogonality

Disjoint orthogonality (DO) of audio sources assumes that not more than one source is active at any T-F point:

$F_1(\omega,t)\,F_2(\omega,t) = 0;\quad \forall\,\omega,t$

where $F_1$ and $F_2$ are the TFRs of two signals

SIR (signal to interference ratio) is used as the basis to measure DO

Page 124: Introduction to Signal Processing

Source disjoint orthogonality (cont.)

[Figure: three audio sources s1, s2 and s3 recorded by the microphones, and the resulting TFR (frequency versus time) showing s1, s2 and s3]

Page 125: Introduction to Signal Processing

Source disjoint orthogonality (cont.)

The SIR of the jth source is defined as:

$SIR_j = \sum_{\omega}\sum_{t}\frac{X_j(\omega,t)}{Y_j(\omega,t)};\quad Y_j(\omega,t) \neq 0$

$Y_j(\omega,t) = \sum_{i=1,\,i\neq j}^{N} X_i(\omega,t)$

$Y_j$: sum of the interfering sources

Page 126: Introduction to Signal Processing

Source disjoint orthogonality (cont.)

The dimensions of the HS and the STFT of the same signal may be different

DO is therefore defined as a percentage over the entire TFR region

The average DO (ADO) of all sources is

$ADO = \frac{1}{N}\sum_{j=1}^{N} SIR_j$

N: number of sources
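The SIR and ADO definitions can be sketched as below, under the assumption that each source TFR is a non-negative array of the same shape and that T-F points with no interference are skipped. The slides do not spell out how the result is normalised to a percentage, so that step is left out; the function names `sir` and `ado` are hypothetical.

```python
import numpy as np

def sir(tfrs, j):
    """Sum over T-F points of X_j / Y_j, where Y_j is the sum of the other sources."""
    X_j = tfrs[j]
    Y_j = np.sum(np.delete(tfrs, j, axis=0), axis=0)   # interfering sources
    active = Y_j != 0                                  # only points where Y_j is non-zero
    return float(np.sum(X_j[active] / Y_j[active]))

def ado(tfrs):
    """Average of SIR_j over all N sources."""
    return float(np.mean([sir(tfrs, j) for j in range(len(tfrs))]))

# Two toy sources on a 1x2 T-F grid
tfrs = np.array([[[2.0, 0.0]],
                 [[1.0, 4.0]]])
sir_0, sir_1, average = sir(tfrs, 0), sir(tfrs, 1), ado(tfrs)
```

In the toy case the first source has SIR 2.0, the second 0.5 (its second T-F point is skipped because the interference there is zero), and the average is 1.25.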

Page 127: Introduction to Signal Processing

Experimental results

The three mixtures are defined as m1{sp1(-40, 0), sp2(30, 0), ft(0, 0)}, m2{sp1(20, 10), sp2(0, 10), ft(-10, 10)}, m3{sp1(40, 20), sp2(30, 20), ft(-20, 20)}

The separation efficiency is measured as OSSR (original to separated signal ratio), defined as:

$OSSR = \frac{1}{T}\sum_{t=1}^{T}\log_{10}\frac{\sum_{i=1}^{w} s_{original}^{2}(t+i)}{\sum_{i=1}^{w} s_{separated}^{2}(t+i)}$
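A sketch of the OSSR measure is given below, read as the average over time of the base-10 log ratio between the short-window energies of the original and the separated signal. The window length `w`, the skipping of windows with zero energy and the function name `ossr` are assumptions made for the example.

```python
import numpy as np

def ossr(original, separated, w):
    """Mean log10 ratio of windowed energies; 0 means equal energy everywhere."""
    ratios = []
    for t in range(len(original) - w):
        num = np.sum(original[t + 1:t + 1 + w] ** 2)     # energy of the original window
        den = np.sum(separated[t + 1:t + 1 + w] ** 2)    # energy of the separated window
        if num > 0 and den > 0:                          # skip silent windows
            ratios.append(np.log10(num / den))
    return float(np.mean(ratios))

rng = np.random.default_rng(1)
s = rng.standard_normal(1000)
same = ossr(s, s, w=160)                     # identical signals
quieter = ossr(s, s / np.sqrt(10.0), w=160)  # separated signal with one tenth of the energy
```

Values close to zero therefore indicate that the separated signal keeps the energy of the original, which is how the small magnitudes in the results table can be read.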

Page 128: Introduction to Signal Processing

Experimental results (cont.)

The comparative separation efficiency (OSSR) using HS and STFT:

Mixture  TFR   OSSR of sp1  OSSR of sp2  OSSR of ft
m1       HS    -0.0271       0.0213       0.0264
m1       STFT   0.0621      -0.0721      -0.0531
m2       HS     0.0211      -0.0851      -0.0872
m2       STFT   0.0824       0.1202       0.1182
m3       HS     0.0941      -0.0832       0.0225
m3       STFT  -0.1261       0.1092      -0.0821

Page 129: Introduction to Signal Processing

Experimental results

This experiment also compares the DO obtained with HS and with STFT as the TFR

STFT is affected by many factors: the window function and its length, the overlap, and the number of FFT points

HS is independent of such factors

HS is only slightly affected by the number of frequency bins used in the TFR

Page 130: Introduction to Signal Processing

Experimental results (cont.)

[Figure: the ADO of HS and STFT as a function of the number of frequency bins (N=3)]

Page 131: Introduction to Signal Processing

Experimental results (cont.)

Only the ADO of STFT is affected by the amount of window overlap (%)

Page 132: Introduction to Signal Processing

Experimental results (cont.)

STFT includes more cross-spectral energy terms

[Figure: the TFR of two pure tones using HS and STFT]

Page 133: Introduction to Signal Processing

Experimental results (cont.)

• HS always has better DO for audio signals
• DO depends on the resolution of the TFR
• STFT has to satisfy the inequality

$\Delta t\,\Delta\omega \ge \frac{1}{2}$

• The frequency resolution of HS extends up to the Nyquist frequency
• Its time resolution is up to the sampling rate, and hence it offers better resolution

Page 134: Introduction to Signal Processing

Remarks

The separation efficiency is independent of the signal's spectral characteristics

The performance is affected by the angular separation and the disjointness of the sources

HS produces better disjointness in the T-F domain and hence better separation

The binaural mixtures were recorded in the anechoic room of NTT

Page 135: Introduction to Signal Processing

Application of DSP in audio analysis

Robust pitch estimation using EMD

Page 136: Introduction to Signal Processing

Why EMD in pitch estimation?

• Pitch facilitates speech coding, enhancement, recognition, etc.
• The autocorrelation function is the most widely used tool in pitch estimation algorithms
• The autocorrelation (AC) function reflects the periodic property of the speech

Page 137: Introduction to Signal Processing

EMD in pitch estimation (cont.)

Page 138: Introduction to Signal Processing

EMD in pitch estimation (cont.)

• The pitch period is the sample difference between two consecutive peaks of the AC function
• Sometimes the pitch peak is less prominent, especially due to noise
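The peak-picking idea can be sketched as follows: compute the normalized autocorrelation of a frame and take the lag of the largest peak inside the expected pitch range. The 50-500 Hz search range matches the band quoted later in the experiments; the function name `ac_pitch_period` and the synthetic 200 Hz test tone are assumptions for illustration.

```python
import numpy as np

def ac_pitch_period(frame, fs, fmin=50.0, fmax=500.0):
    """Return (lag in samples of the strongest AC peak, normalized AC function)."""
    frame = frame - np.mean(frame)
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]  # lags 0..N-1
    ac = ac / ac[0]                                                # normalize by zero lag
    lo, hi = int(fs / fmax), int(fs / fmin)                        # lag range of plausible pitch
    lag = lo + int(np.argmax(ac[lo:hi + 1]))
    return lag, ac

fs = 20000
n = np.arange(512)                                   # 25.6 ms frame at 20 kHz
frame = np.sin(2 * np.pi * 200.0 * n / fs)           # 200 Hz tone, period 100 samples
lag, ac = ac_pitch_period(frame, fs)
pitch_hz = fs / lag
```

With noise, a peak unrelated to the pitch can become the largest one in the search range, which is the weakness the EMD-based approach is meant to address.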

Page 139: Introduction to Signal Processing

EMD in pitch estimation (cont.)

• EMD decomposes any signal into components ordered from higher to lower frequency
• It produces the local and global oscillations of the signal
• The global oscillation approximately represents the envelope of the signal
• The IMF of the global oscillation is used to estimate the pitch

Page 140: Introduction to Signal Processing

Pitch estimation with EMD

Page 141: Introduction to Signal Processing

Pitch estimation with EMD (cont.)

There exists an IMF in the EMD domain that represents the global oscillation of the AC function

That IMF represents the sinusoid of the pitch period

The pitch is taken as the frequency of that IMF rather than by finding the pitch peak

Page 142: Introduction to Signal Processing

Pitch estimation with EMD (cont.)

In the EMD shown, IMF-5 is the oscillation with the pitch period

Determining the target IMF, the one representing the sinusoid with the pitch period, is a crucial step

The IMFs with oscillations of lower frequency than the pitch can be discarded by energy thresholding

Page 143: Introduction to Signal Processing

Pitch estimation with EMD (cont.)

• A reference pitch is computed by the weighted AC (WAC) method
• This pitch information is used to select the IMF with the pitch period
• The periodicity of the selected IMF is computed as the pitch period

Page 144: Introduction to Signal Processing

Pitch estimation with EMD (cont.)

• The peak at zero lag is selected
• Two cycles are selected on both sides of it
• The average number of samples per cycle is the periodicity
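A minimal sketch of this measurement is shown below, assuming the selected IMF is available as a two-sided sequence with a known zero-lag index. It locates the two nearest peaks on each side of zero lag and averages the spacing between neighbouring peaks. The function name `imf_periodicity` and the cosine used as a stand-in IMF are assumptions.

```python
import numpy as np

def imf_periodicity(imf, zero_lag):
    """Average peak spacing (samples) over two cycles on each side of zero lag."""
    # Local maxima: samples strictly larger than both neighbours
    peaks = np.flatnonzero((imf[1:-1] > imf[:-2]) & (imf[1:-1] > imf[2:])) + 1
    right = peaks[peaks > zero_lag][:2]      # two peaks after zero lag
    left = peaks[peaks < zero_lag][-2:]      # two peaks before zero lag
    chain = np.concatenate([left, [zero_lag], right])
    return float(np.mean(np.diff(chain)))

lags = np.arange(-400, 401)
imf = np.cos(2 * np.pi * lags / 100.0)       # stand-in IMF with a 100-sample period
period = imf_periodicity(imf, zero_lag=400)  # index 400 corresponds to lag 0
```

Dividing the sampling rate by the returned period gives the pitch frequency in Hz.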

Page 145: Introduction to Signal Processing

Proposed Pitch Estimation Algorithm

• Compute the normalized autocorrelation (AC) of the speech frame
• Determine a rough pitch period using the WAC method
• Apply EMD to the AC function
• Select the IMF with the pitch period on the basis of the WAC-based estimate
• The period of the selected IMF is the estimated pitch period
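The IMF selection step could look like the sketch below. It assumes the IMFs of the AC function have already been computed by some EMD routine (not shown), estimates the average period of each IMF from its zero crossings, and picks the one closest to the rough period. The zero-crossing criterion and the names `mean_period` and `select_pitch_imf` are illustrative assumptions; the slides only say that the selection is based on the WAC estimate.

```python
import numpy as np

def mean_period(imf):
    """Average period in samples, estimated from the number of zero crossings."""
    signs = np.signbit(imf)
    crossings = np.count_nonzero(signs[1:] != signs[:-1])
    if crossings == 0:
        return np.inf
    return 2.0 * (len(imf) - 1) / crossings   # two crossings per cycle

def select_pitch_imf(imfs, rough_period):
    """Index of the IMF whose average period is closest to the rough estimate."""
    periods = np.array([mean_period(imf) for imf in imfs])
    return int(np.argmin(np.abs(periods - rough_period))), periods

# Stand-in IMFs: sinusoids with periods of 20, 100 and 400 samples
n = np.arange(1000)
imfs = [np.sin(2 * np.pi * n / p + 0.3) for p in (20.0, 100.0, 400.0)]
index, periods = select_pitch_imf(imfs, rough_period=95.0)
```

An error in the rough period can lead to the wrong IMF being chosen, so the quality of the reference estimate matters for this step.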

Page 146: Introduction to Signal Processing

Experimental results

• The Keele pitch database is used
• 20 kHz sampling rate
• The frame length is 25.6 ms with a 10 ms shift
• Each frame is filtered by a band-pass filter covering the pitch range (50-500 Hz)
• Gross pitch error (GPE) is used to measure the performance
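As a sketch of the evaluation measure, the code below computes GPE as the percentage of frames whose estimated pitch deviates from the reference by more than a relative tolerance. The 20% tolerance is a commonly used convention and is an assumption here, since the slides do not state the threshold; the function name `gross_pitch_error` and the toy values are also hypothetical.

```python
import numpy as np

def gross_pitch_error(f0_est, f0_ref, tol=0.2):
    """Percentage of frames where |estimate - reference| exceeds tol * reference."""
    f0_est = np.asarray(f0_est, dtype=float)
    f0_ref = np.asarray(f0_ref, dtype=float)
    gross = np.abs(f0_est - f0_ref) > tol * f0_ref
    return 100.0 * np.count_nonzero(gross) / len(f0_ref)

f0_ref = [100.0, 200.0, 150.0, 120.0]    # reference pitch per voiced frame (Hz)
f0_est = [101.0, 260.0, 149.0, 60.0]     # estimates; frames 2 and 4 are gross errors
gpe = gross_pitch_error(f0_est, f0_ref)
```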

Page 147: Introduction to Signal Processing

Experimental results (cont.)

• The %GPE of male and female speech at different SNRs is presented here
• The total number of frames is 1823

SNR     30 dB  20 dB  10 dB  0 dB   -5 dB  -15 dB
Female  1.90   2.83   3.93   10.12  21.83  64.24
Male    2.15   3.78   5.22   11.89  23.56  66.76

Page 148: Introduction to Signal Processing

Remarks

• The use of EMD makes the pitch estimation method more robust
• EMD of the AC function can extract the fundamental oscillation of the signal
• The pitch can be easily estimated from the single sinusoid of the fundamental oscillation
• It is not affected by prominent non-pitch peaks

Page 149: Introduction to Signal Processing

Future works

• The open problem is to identify the IMF with the pitch period
• In the present algorithm, an error in the rough pitch estimate from the ACF can propagate to the final estimate
• The performance has not yet been compared with other existing algorithms

Page 150: Introduction to Signal Processing

Open Problem-1

Instantaneous Pitch (IP) estimation using EMD

• Frame-based pitch estimation is already done
• The paper is accepted by EUROSPEECH 2007
• We have used the WAC-based pitch information to compute the exact pitch (IMF) from EMD space
• The problem is to compute IP from EMD space only

[Figure: three methods of pitch estimation]

Page 151: Introduction to Signal Processing

Open Problems-2

Voiced/Unvoiced Detection with EMD

• Useful in speech enhancement and speech/speaker recognition
• A paper with preliminary results is published in ICICT2007
• The problem is to derive a better separation region for V/UV and to conduct experiments with large speech data

[Figure: V/UV differentiation]

Page 152: Introduction to Signal Processing

Open Problems-3

Robust Audio Source Localization

• Localization is done by delay-attenuation computed in the T-F space of binaural mixtures, which is NOT noise robust
• The problem is to derive a mathematical model for robust localization in the underdetermined situation

[Figure: localization of three sources using TD-LD computed in T-F space]

Page 153: Introduction to Signal Processing

Open Problems-4

Speech denoising using image processing

• Noisy speech can be represented as an image with a time-frequency (T-F) representation, e.g. a spectrogram
• Image processing algorithms can be used for denoising
• It seems easy for musical/white noise
• The problem is to deal with other noise, even by using binaural mixtures

[Figure: speech with white noise]

Page 154: Introduction to Signal Processing

Open Problems-5

Auditory segmentation with binaural mixtures

• Auditory segmentation is the first stage of source separation using auditory scene analysis (ASA)
• The problem is to use binaural mixtures for improved auditory segmentation as source separation
• T-F representations other than FT can be employed

[Figure: source separation by ASA]

Page 155: Introduction to Signal Processing

Open Problems-6

Two stage speech enhancement

• Single-stage speech enhancement is not efficient in all noisy situations
• For example, musical noise is introduced by binary masking and some thresholding methods
• Noise may not be separated perfectly by techniques based on ICA or ISA (independent subspace analysis)
• Multi-stage enhancement in a suitable order can improve the performance

[Block diagram: noisy speech → first stage enhancement → second stage enhancement → clean speech]

Page 156: Introduction to Signal Processing

Open Problems-7

Informative features extraction

• To use spectral dynamics in speech/speaker recognition
• Special types of speech features are required
• How to parameterize the speech signal to represent speech dynamics
• WT- and HS-based spectral analysis can be studied

[Figure: mixed signal with its spectrogram]

Page 157: Introduction to Signal Processing

Open Problems-8

Source based audio indexing

• Useful in multimedia applications and moving audio source separation
• Several new methods could be used for indexing: AdaBoost, Tree-ICA, conditional random field

[Figure: separation of moving sources. Audio sources s1, s2 and s3 at different azimuth angles (0 to 180 degrees), with a distance label of 1.5 m]

Page 158: Introduction to Signal Processing

Open Problems-9

Time-series prediction with EMD

• Applies to financial and environmental time series
• Conventional methods use a Kalman filter (for smoothing) and an AR model for prediction
• EMD can be used as a smoothing filter to enhance the prediction accuracy

[Figure: non-stationary time series]

Page 159: Introduction to Signal Processing

Open Problems-10

Heart-rate analysis with ECG data using EMD

• ECG variability analysis in different frequency regions
• Analysis of the instantaneous ECG condition
• Abnormality analysis of heart rate using EMD-based spectral modeling

[Figure: different parts of an ECG signal]

Page 160: Introduction to Signal Processing

The End

Questions/Suggestions Please

Page 161: Introduction to Signal Processing

Source Separation

• Each peak region in the histogram refers to a source of the stereo mixtures
• Construct a binary mask $M_i(n,t)$ (nullifying the TF points of interfering sources)
• The HS of the ith source is separated as

$H_i(n,t) = M_i(n,t)\,H_L(n,t)$

• The time-domain ith source is given as

$s_i(t) = \sum_{n} H_i(n,t)\cos[\phi(n,t)]$
