Karhunen-Loeve (KL) Transform Face Recognition and Eigen-Faces Short-Time Fourier Transform

Digital Image Processing
Lectures 13 & 14

M.R. Azimi, Professor

Department of Electrical and Computer Engineering
Colorado State University

Spring 2013

M.R. Azimi Digital Image Processing


Properties of KL Transform
The KL transform has many desirable properties, which make it optimal for many signal/image processing applications.

i. Decorrelation
The KL coefficients $X(k)$, $k \in [0, N-1]$, are uncorrelated, i.e.

$$E[X(k)X^{*}(l)] = \gamma_k\,\delta(k-l)$$

since

$$E[\mathbf{X}\mathbf{X}^{*t}] = \Psi^{*t}E[\mathbf{x}\mathbf{x}^{*t}]\Psi = \Psi^{*t}R\,\Psi = \Gamma = \mathrm{Diag}(\gamma_0, \cdots, \gamma_{N-1})$$

That is, each eigenvalue $\gamma_i$ is the variance (or energy) of the $i$th element of $\mathbf{X}$ along eigenvector $\boldsymbol{\xi}_i$. Thus, the KL components represent the contributions of the data along the relevant coordinates. Note that $\Psi$ is not a unique matrix w.r.t. this property, and there could be many matrices (unitary or non-unitary) that would decorrelate the data in the transformed domain.
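The decorrelation property can be checked numerically. The following is a minimal sketch, assuming real-valued, zero-mean data generated by mixing white noise with an arbitrary matrix; the names `mixing`, `cov_X` and the sizes `N`, `K` are illustrative and not from the lecture. The expectation is replaced by a sample average.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 4, 5000                       # vector length and number of samples (arbitrary)

# Correlated zero-mean data: white noise passed through a fixed mixing matrix.
mixing = rng.standard_normal((N, N))
x = mixing @ rng.standard_normal((N, K))     # each column is one realization of x

R = (x @ x.T) / K                    # sample estimate of R = E[x x^t]
gamma, Psi = np.linalg.eigh(R)       # eigh returns eigenvalues in ascending order
order = np.argsort(gamma)[::-1]      # reorder so that gamma_0 >= gamma_1 >= ...
gamma, Psi = gamma[order], Psi[:, order]

X = Psi.T @ x                        # KL coefficients (Psi is real here, so Psi^{*t} = Psi^t)
cov_X = (X @ X.T) / K                # sample estimate of E[X X^{*t}]

off_diag = cov_X - np.diag(np.diag(cov_X))
print("largest off-diagonal magnitude:", np.max(np.abs(off_diag)))
print("diagonal equals eigenvalues:", np.allclose(np.diag(cov_X), gamma))
```

Because the same sample matrix `R` is used both to build `Psi` and to measure the coefficient covariance, the off-diagonal terms vanish up to rounding error; with an independent test set they would only be approximately zero.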


ii. Optimality and Data Reduction
Consider the following scenario: a vector $\mathbf{x}$ is transformed to $\mathbf{X}$ using an $N \times N$ unitary matrix $A$. The elements of $\mathbf{Z}$ are chosen to be the first $m$ elements of $\mathbf{X}$ and zero elsewhere, i.e.

$$Z(k) = \begin{cases} X(k), & k \in [0, m-1] \\ 0, & k \ge m \end{cases}$$

That is, $\mathbf{Z}$ lies in an $m$-D subspace ($m \le N$), though both $\mathbf{x}$ and $\mathbf{X}$ are in $N$-D space. Then, $\mathbf{Z}$ is transformed to $\hat{\mathbf{x}}$ using an $N \times N$ unitary matrix $B$. The average MSE between the original signal $x(n)$ and the reconstructed signal $\hat{x}(n)$ is

$$J_m \triangleq \frac{1}{N} E\left[\sum_{n=0}^{N-1} |x(n) - \hat{x}(n)|^2\right]$$


Now, it is desired to find matrices $A$ and $B$ such that $J_m$ is minimized for each and every choice of $m \in [1, N]$.

Theorem: The MSE $J_m$ is minimized for every choice of $m$ when we have $A = \Psi^{*t}$, $B = \Psi$, $AB = I$, where the columns of $\Psi$ are arranged according to the decreasing order of the eigenvalues of $R$.

Proof: See A.K. Jain, “Fundamentals of Digital Image Processing”.

Note that $J_m$ is equal to ($1/N$ times) the total energy in the discarded eigenvalues. To see this, using the unitary property we can rewrite $J_m$ as

$$J_m = \frac{1}{N} E\left[\sum_{k=0}^{N-1} |X(k) - Z(k)|^2\right] = \frac{1}{N} E\left[\sum_{k=m}^{N-1} |X(k)|^2\right] = \frac{1}{N} \sum_{k=m}^{N-1} \gamma_k$$

This result leads to the following procedure for applying the KL transform for data reduction.


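KLT/PCA Procedure for Data Reduction:

1. Form the $R$ matrix and diagonalize it to find $\gamma_0 \ge \gamma_1 \ge \cdots \ge \gamma_{N-1}$.

2. If the first $m$ eigenvalues contain most of the energy, i.e.

$$\eta = \frac{\sum_{i=0}^{m-1} \gamma_i}{\sum_{i=0}^{N-1} \gamma_i} \ge \text{e.g., } 95\%,$$

then form

$$\Psi_{red} = [\boldsymbol{\xi}_0, \boldsymbol{\xi}_1, \cdots, \boldsymbol{\xi}_{m-1}]$$

3. Transform the data to an $m \times 1$ PCA/KLT vector

$$\mathbf{X}_{red} = \Psi_{red}^{*t}\,\mathbf{x}$$

4. To reconstruct,

$$\mathbf{x}_{rec} = \Psi_{red}\,\mathbf{X}_{red}$$

Clearly, if $m = N$ we have perfect reconstruction, i.e. $\mathbf{x}_{rec} = \mathbf{x}$.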
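The four steps above can be sketched in NumPy as follows. This is an illustrative sketch, not the lecture's implementation: the function name `klt_reduce`, the synthetic data, and the 95% threshold default are assumptions, the data are taken to be real and zero-mean, and $R$ is estimated by a sample average.

```python
import numpy as np

def klt_reduce(x, energy=0.95):
    """KLT/PCA data reduction. x: (N, K) array, columns are zero-mean data vectors."""
    N, K = x.shape
    R = (x @ x.T) / K                          # step 1: sample estimate of R
    gamma, Psi = np.linalg.eigh(R)
    order = np.argsort(gamma)[::-1]            # decreasing eigenvalue order
    gamma, Psi = gamma[order], Psi[:, order]
    eta = np.cumsum(gamma) / np.sum(gamma)     # step 2: energy ratio for m = 1..N
    m = min(int(np.searchsorted(eta, energy)) + 1, N)   # smallest m with eta >= energy
    Psi_red = Psi[:, :m]
    X_red = Psi_red.T @ x                      # step 3: m x 1 KLT vector per column
    x_rec = Psi_red @ X_red                    # step 4: reconstruction
    return m, gamma, X_red, x_rec

# Hypothetical test data: most of the variance lies along a few directions.
rng = np.random.default_rng(1)
N, K = 8, 2000
scales = np.array([5.0, 3.0, 1.0, 0.3, 0.2, 0.1, 0.05, 0.02])
basis, _ = np.linalg.qr(rng.standard_normal((N, N)))
x = basis @ (scales[:, None] * rng.standard_normal((N, K)))

m, gamma, X_red, x_rec = klt_reduce(x)
J_m = np.mean(np.sum((x - x_rec) ** 2, axis=0)) / N    # average MSE over the samples
print("m =", m, " J_m =", J_m, " discarded energy / N =", gamma[m:].sum() / N)
```

The last line compares the measured MSE with $\frac{1}{N}\sum_{k \ge m}\gamma_k$; the two agree to rounding error here because the same samples are used to estimate $R$ and to measure the error.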


iii. Distribution of Variance
Among all unitary transforms, the KL transform packs the maximum average energy into $m \le N$ elements of $\mathbf{X}$. That is, if $A$ is any other unitary transform and $\Psi^{*t}$ is the KL transform matrix, then for any $m \in [1, N]$,

$$S_m(\Psi^{*t}) \ge S_m(A)$$

where $S_m(A)$ is the energy function of the unitary transform $A$, i.e.

$$S_m(A) \triangleq \sum_{k=0}^{m-1} \sigma_k^2$$

and $\sigma_k^2 \triangleq E[|X(k)|^2]$ with $\sigma_0^2 \ge \sigma_1^2 \ge \cdots \ge \sigma_{N-1}^2$.

Proof: Note that

$$S_m(A) = \sum_{k=0}^{m-1} (A R A^{*t})_{k,k} = \mathrm{tr}(I_m A R A^{*t}) = \bar{J}_m$$

where $I_m$ is the diagonal matrix with ones in its first $m$ diagonal entries and zeros elsewhere, and $\bar{J}_m$ is the total energy retained in the first $m$ coefficients. We know from property (ii) that $\bar{J}_m$ is maximized (or, equivalently, $J_m$ is minimized) when $A$ is the KLT. Since $\sigma_k^2 = \gamma_k$ when $A = \Psi^{*t}$ (from the decorrelation property), the retained energy of the KLT satisfies

$$\bar{J}_m = \sum_{k=0}^{m-1} \gamma_k \ge \sum_{k=0}^{m-1} \sigma_k^2, \quad m \in [1, N]$$

where the $\sigma_k^2$ on the right-hand side are the coefficient variances under any other unitary transform $A$.
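The energy-packing inequality can be illustrated numerically. The sketch below is an assumption-laden example rather than part of the lecture: it uses a hypothetical first-order Markov covariance $R_{ij} = \rho^{|i-j|}$ with $\rho = 0.9$, and compares the KL transform against a random orthogonal matrix and the identity. The helper name `packed_energy` is illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
N, rho = 8, 0.9

# Hypothetical covariance model (first-order Markov): R[i, j] = rho^|i - j|
idx = np.arange(N)
R = rho ** np.abs(idx[:, None] - idx[None, :])

gamma, Psi = np.linalg.eigh(R)
gamma, Psi = gamma[::-1], Psi[:, ::-1]          # decreasing eigenvalue order

def packed_energy(A, R):
    """Return S_m(A) for m = 1..N, with the variances sorted in decreasing order."""
    sigma2 = np.real(np.diag(A @ R @ A.conj().T))   # sigma_k^2 = (A R A^{*t})_{k,k}
    return np.cumsum(np.sort(sigma2)[::-1])

S_kl = packed_energy(Psi.conj().T, R)                 # KL transform A = Psi^{*t}
Q, _ = np.linalg.qr(rng.standard_normal((N, N)))      # an arbitrary orthogonal transform
S_rand = packed_energy(Q, R)
S_id = packed_energy(np.eye(N), R)                    # no transform at all

for m in range(1, N + 1):
    print(m, S_kl[m - 1], S_rand[m - 1], S_id[m - 1])
```

For every $m$ the KL column is at least as large as the other two, and all three columns meet at $m = N$, where the total energy $\mathrm{tr}(R)$ is preserved by any unitary transform.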

2-D KL Transform of Images
For a zero-mean 2-D random process $x(m, n)$, $m, n \in [0, N-1]$, using the same procedure adopted for the DFT and DCT, the 2-D KL transform of the image matrix $x$ is

$$X = \Psi_1^{*t}\, x\, \Psi_2^{*}$$

where $\Psi_1^{*t}$ and $\Psi_2^{*}$ are 1-D KL matrices applied to the columns and rows of the image, respectively. The inverse KL transform is

$$x = \Psi_1 X \Psi_2^{t}$$
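A minimal sketch of the separable 2-D KL transform for real-valued images follows, so that all conjugates drop out and the forward transform reads `Psi1.T @ x @ Psi2`. The ensemble of synthetic images, the way the column and row covariances are estimated, and the helper name `kl_basis` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
N, K = 6, 500

# Hypothetical ensemble of K zero-mean N x N images with correlated rows and columns.
col_mix = rng.standard_normal((N, N))
row_mix = rng.standard_normal((N, N))
images = col_mix @ rng.standard_normal((K, N, N)) @ row_mix     # shape (K, N, N)

def kl_basis(R):
    """Eigenvectors of a symmetric covariance matrix, largest eigenvalue first."""
    _, Psi = np.linalg.eigh(R)
    return Psi[:, ::-1]

# Sample covariances of the image columns and of the image rows.
R_col = np.mean(images @ images.transpose(0, 2, 1), axis=0) / N
R_row = np.mean(images.transpose(0, 2, 1) @ images, axis=0) / N
Psi1, Psi2 = kl_basis(R_col), kl_basis(R_row)

x = images[0]
X = Psi1.T @ x @ Psi2            # forward 2-D KL transform (real case)
x_back = Psi1 @ X @ Psi2.T       # inverse 2-D KL transform

print("perfect reconstruction:", np.allclose(x_back, x))
print("energy preserved:", np.isclose(np.sum(X ** 2), np.sum(x ** 2)))
```

Both checks follow from the orthogonality of `Psi1` and `Psi2`, independently of how well the covariances were estimated.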


Eigen-images of 2-D KL Transform

The basis images (or eigen-images) of the 2-D KL transform or PCA are $K(k, l) = \boldsymbol{\xi}_{1k}\,\boldsymbol{\xi}_{2l}^{t}$, $k, l \in [0, N-1]$, where $\boldsymbol{\xi}_{1k}$ is the $k$th column of $\Psi_1$ and $\boldsymbol{\xi}_{2l}^{t}$ is the $l$th row of $\Psi_2^{t}$. Image $x$ is decomposed as a linear combination of these eigen-images with the KL coefficients (or PCs) $X(k, l)$, i.e.

$$x = \sum_{k=0}^{N-1} \sum_{l=0}^{N-1} X(k, l)\, K(k, l).$$

However, this requires finding two KL matrices $\Psi_1$ and $\Psi_2$. On the other hand, arranging the image $x$ into a 1-D vector $\mathbf{x}$ leads to a covariance matrix of size $N^2 \times N^2$, which is not practical either. The following algorithm gives an efficient method to find these eigen-images.

Algorithm for Extracting Eigen-images
Let $x_i$, $i \in [1, M]$, be a set of $N \times N$ training images. Each image $x_i$ is converted to a vector $\mathbf{x}_i$ of size $N^2 \times 1$. Then,


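(i) Find the ensemble mean image $\boldsymbol{\mu} = \frac{1}{M} \sum_{i=1}^{M} \mathbf{x}_i$, mean-subtract each image, i.e. $\tilde{\mathbf{x}}_i = \mathbf{x}_i - \boldsymbol{\mu}$, and form the data matrix $\Upsilon = \frac{1}{\sqrt{M}} [\tilde{\mathbf{x}}_1 \cdots \tilde{\mathbf{x}}_M]$.

(ii) Find the ensemble covariance matrix $R = \frac{1}{M} \sum_{i=1}^{M} \tilde{\mathbf{x}}_i \tilde{\mathbf{x}}_i^{t}$ or $R = \Upsilon\Upsilon^{t}$.

(iii) Find the principal eigenvectors of $R$ by solving $R\,\boldsymbol{\xi}_k = \gamma_k \boldsymbol{\xi}_k$. These $\boldsymbol{\xi}_k$'s, rearranged in image matrix form, are the eigen-images. However, since $R$ has rank at most $M - 1$ (i.e. there are only $M \ll N^2$ eigen-images), solving the original $N^2$-D eigenvalue problem is inefficient. Thus, we instead solve the $M$-D eigenvalue problem

$$\Upsilon^{t}\Upsilon\,\boldsymbol{\zeta}_k = \gamma_k \boldsymbol{\zeta}_k$$

Now, pre-multiplying this equation by $\Upsilon$ yields

$$(\Upsilon\Upsilon^{t})\,\Upsilon\boldsymbol{\zeta}_k = \gamma_k \Upsilon\boldsymbol{\zeta}_k$$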


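Alternatively,

$$R\,\Upsilon\boldsymbol{\zeta}_k = \gamma_k \Upsilon\boldsymbol{\zeta}_k$$

implying that the eigen-images can be obtained from

$$\boldsymbol{\xi}_k = \Upsilon\boldsymbol{\zeta}_k = \frac{1}{\sqrt{M}} \sum_{l=1}^{M} \zeta_{l,k}\,\tilde{\mathbf{x}}_l$$

where $\zeta_{l,k}$ is the $l$th element of vector $\boldsymbol{\zeta}_k$. The eigenvalue associated with an eigen-image represents how much the images in the training set vary from the mean image along that eigen-image. We keep the $P \le M$ eigen-images associated with the largest eigenvalues.

(iv) The reconstructed version of the $i$th training image from only $P$ eigen-images is

$$\hat{\mathbf{x}}_i = \sum_{k=1}^{P} X_i(k)\,\boldsymbol{\xi}_k + \boldsymbol{\mu}$$

where $X_i(k) = \boldsymbol{\xi}_k^{*t}\tilde{\mathbf{x}}_i / \gamma_k$ is its $k$th PC. The division by $\gamma_k$ is needed because the $\boldsymbol{\xi}_k$ are orthogonal but not unit-norm: for orthonormal $\boldsymbol{\zeta}_k$,

$$\boldsymbol{\xi}_l^{*t}\boldsymbol{\xi}_k = \boldsymbol{\zeta}_l^{*t}\Upsilon^{t}\Upsilon\boldsymbol{\zeta}_k = \boldsymbol{\zeta}_l^{*t}\gamma_k\boldsymbol{\zeta}_k = \gamma_k\,\delta(k - l).$$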


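Any other image $\mathbf{y}$, after subtracting the mean $\boldsymbol{\mu}$ (i.e. $\tilde{\mathbf{y}} = \mathbf{y} - \boldsymbol{\mu}$), can also be represented fairly accurately by the same eigen-images using

$$\hat{\mathbf{y}} = \sum_{k=1}^{P} Y(k)\,\boldsymbol{\xi}_k + \boldsymbol{\mu}$$

where $Y(k) = \boldsymbol{\xi}_k^{*t}\tilde{\mathbf{y}} / \gamma_k$ is the $k$th PC of $\mathbf{y}$. The PC or KL transform vector $\mathbf{Y} = [Y(1), \cdots, Y(P)]^{t}$ can be used as a feature vector to represent this image.

(v) A simple image recognition scheme (minimum distance classifier) can be implemented using

$$j = \arg\min_{k} \|\mathbf{Y} - \mathbf{X}_k\|^2$$

where $\mathbf{X}_k$ is the PC vector of the $k$th training sample and $j$ is the class of the training sample that has the closest match (in the MSE sense) to the unknown image $\mathbf{y}$.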
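Steps (i) to (v) can be sketched compactly in NumPy. This is an illustrative sketch under stated assumptions, not the code used to produce the lecture's figures: the function names (`fit_eigen_images`, `pc_vector`, `reconstruct`, `classify`) are hypothetical, the training "images" are random arrays rather than faces, and the number of kept eigen-images is assumed to satisfy $P \le M - 1$ so that no retained eigenvalue is zero.

```python
import numpy as np

def fit_eigen_images(train, P):
    """train: (M, N, N) training images; P: number of eigen-images kept (P <= M - 1)."""
    M = train.shape[0]
    vecs = train.reshape(M, -1).T.astype(float)        # N^2 x M, one image per column
    mu = vecs.mean(axis=1, keepdims=True)              # (i) ensemble mean image
    Upsilon = (vecs - mu) / np.sqrt(M)                 # (i) data matrix
    gamma, zeta = np.linalg.eigh(Upsilon.T @ Upsilon)  # (iii) M x M eigenproblem
    keep = np.argsort(gamma)[::-1][:P]                 # P largest eigenvalues
    gamma, zeta = gamma[keep], zeta[:, keep]
    xi = Upsilon @ zeta                                # eigen-images, |xi_k|^2 = gamma_k
    return mu, xi, gamma

def pc_vector(image, mu, xi, gamma):
    """PC vector with entries xi_k^t (image - mu) / gamma_k."""
    centered = image.reshape(-1, 1).astype(float) - mu
    return (xi.T @ centered).ravel() / gamma

def reconstruct(pcs, mu, xi, shape):
    """(iv) sum_k pcs[k] * xi_k + mu, reshaped back into an image."""
    return (xi @ pcs[:, None] + mu).reshape(shape)

def classify(image, train_pcs, mu, xi, gamma):
    """(v) index of the training sample whose PC vector is closest."""
    Y = pc_vector(image, mu, xi, gamma)
    return int(np.argmin(np.sum((train_pcs - Y) ** 2, axis=1)))

rng = np.random.default_rng(4)
M, N = 10, 16
train = rng.random((M, N, N))                          # stand-in for training images

mu, xi, gamma = fit_eigen_images(train, P=M - 1)
train_pcs = np.array([pc_vector(img, mu, xi, gamma) for img in train])

exact = reconstruct(train_pcs[0], mu, xi, (N, N))              # all M - 1 eigen-images
P = 5
approx = reconstruct(train_pcs[0, :P], mu, xi[:, :P], (N, N))  # only P eigen-images
noisy = train[3] + 0.01 * rng.standard_normal((N, N))          # perturbed training image
label = classify(noisy, train_pcs[:, :P], mu, xi[:, :P], gamma[:P])

print("exact with M-1 eigen-images:", np.allclose(exact, train[0]))
print("MSE with P = 5:", np.mean((approx - train[0]) ** 2))
print("nearest training sample for the noisy image:", label)
```

The eigenproblem solved here is $M \times M$ (10 by 10) rather than $N^2 \times N^2$ (256 by 256), which is the point of the algorithm; the saving grows quickly with image size.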


Face Recognition using Eigen-Faces

Turk & Pentland (1991) showed that with only a few eigen-faces (standardized face ingredients) extracted from an ensemble of images, any other face can be fairly accurately represented. This method is used not only in face recognition but also in handwriting analysis, lip reading, voice recognition, sign language/hand gesture interpretation and medical imaging.

Figures below show the original M=20 training faces and the first 16 (out of 20) eigen-faces ordered column-wise.


Figures below show fairly accurate reconstruction of two training images using only P=7 eigen-faces.


Figures below show the reconstructed images of a face wearing two different glasses (transparent and dark). The error images are also provided. The reconstructed image in the second case is not as good. Why? There are more robust methods (e.g., non-linear PCA) for these cases, where the test sample differs from the training samples.


Short-Time Fourier & Wavelet Transforms

Several major shortcomings of the Fourier transform include:

1. It doesn't allow for simultaneous time and frequency domain analysis (e.g., the FT cannot localize a particular note in a given piece of music). Lack of time-frequency localization is an important drawback for detecting and isolating events in both the time and frequency domains.

2. It is not useful for analyzing non-stationary signals.

3. It is not appropriate for representing discontinuities or sharp changes (i.e., it requires a large number of Fourier components to represent discontinuities).

4. It doesn't provide a multi-resolution look at the signals/images.

These deficiencies were first identified by D. Gabor (1946), who introduced time localization using the STFT or Gabor transform.


Short-Time Fourier Transform (STFT) - Fixed Resolution

Time localization in the FT can be achieved by windowing the signal $x(t)$ with a window over which the signal is nearly stationary. The FT of the windowed signal yields the STFT as

$$X_{STFT}(\tau, \omega) = \int_{-\infty}^{\infty} x(t)\, g^{*}(t - \tau)\, e^{-j\omega t}\, dt$$

where $g(t)$ is the window function and $\tau$ is the center of the window. This windowing in the STFT introduces time dependency in the analysis.

There are two ways to interpret the STFT: (1) as the FT (over all frequencies) of the windowed signal around every $\tau$; or (2) if we let $h_\omega(t) = g^{*}(-t)\, e^{j\omega t}$, then the STFT becomes, up to the phase factor $e^{-j\omega\tau}$, a convolution integral representing the output of a bandpass filter whose frequency response is centered around $\omega$. That is, the STFT amounts to filtering the signal "at all times" with a bandpass filter whose impulse response is the window function modulated to that frequency. Thus, the STFT may be viewed as a modulated filter bank.
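The filter-bank interpretation can be made explicit with a short derivation. This is a sketch; the symbols $h_\omega(t)$ for the bandpass impulse response and $H_\omega(\omega')$ for its frequency response are introduced here for illustration.

```latex
\begin{aligned}
X_{STFT}(\tau,\omega)
  &= \int_{-\infty}^{\infty} x(t)\, g^{*}(t-\tau)\, e^{-j\omega t}\, dt \\
  &= e^{-j\omega\tau} \int_{-\infty}^{\infty} x(t)\, g^{*}(t-\tau)\, e^{-j\omega (t-\tau)}\, dt \\
  &= e^{-j\omega\tau} \int_{-\infty}^{\infty} x(t)\, h_\omega(\tau - t)\, dt
   = e^{-j\omega\tau}\, (x * h_\omega)(\tau),
  \qquad h_\omega(t) \triangleq g^{*}(-t)\, e^{j\omega t}, \\
H_\omega(\omega') &= G^{*}(\omega' - \omega).
\end{aligned}
```

Since a typical window $g(t)$ is lowpass, $G^{*}(\omega' - \omega)$ is a passband centered at $\omega' = \omega$; sweeping $\omega$ gives one bandpass filter per analysis frequency, which is the modulated filter bank view.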


Time-Frequency Resolutions
The resolution in the frequency domain corresponds to the bandwidth of the bandpass filter, which is measured by the RMS value

$$\Delta\omega = \frac{\left(\int_{-\infty}^{\infty} \omega^2 |G(\omega)|^2\, d\omega\right)^{1/2}}{\left(\int_{-\infty}^{\infty} |G(\omega)|^2\, d\omega\right)^{1/2}}$$

where $G(\omega)$ is the FT of $g(t)$. The time resolution is given by the spread (window width) in the time domain, i.e.

$$\Delta t = \frac{\left(\int_{-\infty}^{\infty} t^2 |g(t)|^2\, dt\right)^{1/2}}{\left(\int_{-\infty}^{\infty} |g(t)|^2\, dt\right)^{1/2}}$$

Owing to the Heisenberg inequality we have

$$\text{Time-Bandwidth product} = \Delta t\, \Delta\omega \ge \frac{1}{2} \quad \text{(lower bound)}$$

i.e. the resolutions in time and frequency cannot both be made arbitrarily small. Thus, two sinusoids may only be discriminated in the frequency domain if they are more than $\Delta\omega$ apart, and two impulses in the time domain can be separated only if they are more than $\Delta t$ apart.


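The lower bound (equality) is achieved when a Gaussian window (whose FT is also a Gaussian) is used. This gives the “Gabor transform”:

$$g_\alpha(t) = \frac{1}{2\sqrt{\pi\alpha}}\, e^{-\frac{t^2}{4\alpha}}$$

$$X_{GT}(\tau, \omega) = \int_{-\infty}^{\infty} x(t)\, g_\alpha(t - \tau)\, e^{-j\omega t}\, dt$$

Note that because $\int_{-\infty}^{\infty} g_\alpha(t - \tau)\, d\tau = \int_{-\infty}^{\infty} g_\alpha(\tau)\, d\tau = 1$, we get

$$\int_{-\infty}^{\infty} X_{GT}(\tau, \omega)\, d\tau = X(\omega)$$

i.e. the collection of localized Gabor transforms (local spectral information values) gives the global FT of the signal. The width of the Gabor window is $\Delta t = \sqrt{\alpha}$ with $\alpha > 0$. If we define the Gabor basis function as $g_{\tau,\omega}(t) \equiv g_\alpha(t - \tau)\, e^{-j\omega t}$, then

$$X_{GT}(\tau, \omega) = \int_{-\infty}^{\infty} x(t)\, g_{\tau,\omega}(t)\, dt$$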
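The stated properties of the Gabor window (unit area, $\Delta t = \sqrt{\alpha}$, and a time-bandwidth product of $1/2$) can be checked numerically. The sketch below is illustrative: the function names, grid, and values of $\alpha$ are assumptions, integrals are approximated by Riemann sums, and $\Delta\omega$ is evaluated in the time domain through Parseval's relation, $\int \omega^2 |G(\omega)|^2 d\omega / \int |G(\omega)|^2 d\omega = \int |g'(t)|^2 dt / \int |g(t)|^2 dt$, instead of computing $G(\omega)$ explicitly.

```python
import numpy as np

def gabor_window(t, alpha):
    """g_alpha(t) = exp(-t^2 / (4 alpha)) / (2 sqrt(pi alpha))."""
    return np.exp(-t ** 2 / (4.0 * alpha)) / (2.0 * np.sqrt(np.pi * alpha))

def rms_widths(g, t):
    """RMS time width and RMS bandwidth of a window sampled on a uniform grid t."""
    dt = t[1] - t[0]
    energy = np.sum(np.abs(g) ** 2) * dt
    delta_t = np.sqrt(np.sum(t ** 2 * np.abs(g) ** 2) * dt / energy)
    dg = np.gradient(g, dt)                       # numerical derivative g'(t)
    delta_w = np.sqrt(np.sum(np.abs(dg) ** 2) * dt / energy)   # via Parseval
    return delta_t, delta_w

t = np.linspace(-40.0, 40.0, 80001)               # wide, fine grid (arbitrary choice)
for alpha in (0.5, 1.0, 4.0):
    g = gabor_window(t, alpha)
    d_t, d_w = rms_widths(g, t)
    area = np.sum(g) * (t[1] - t[0])
    print(f"alpha={alpha}: area={area:.6f}, delta_t={d_t:.4f}, "
          f"sqrt(alpha)={np.sqrt(alpha):.4f}, product={d_t * d_w:.4f}")
```

Changing $\alpha$ trades time width against bandwidth, but the product stays at the lower bound, which is what distinguishes the Gaussian from other window shapes.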


Remarks

1. The plot of $|X_{STFT}(\tau, \omega)|^2$ in the $(\tau, \omega)$-plane is referred to as the “spectrogram”, which is a very useful tool in signal analysis as it provides a distribution of the signal in the time-frequency plane. Some examples of spectrograms for two different signals (linear FM and a signal with transients) are shown.


2. For a discrete-time signal $x(n)$, the STFT becomes

$$X_{STFT}(m, \Omega) = \sum_{n=-\infty}^{\infty} x(n)\, g(n - m)\, e^{-j\Omega n}$$

where $g(n)$ is the window function and $m$ is the center of the window (or shift) in this case.

3. The biggest drawback of the STFT is its fixed resolution, which implies that arbitrarily good resolution in time and frequency cannot be achieved simultaneously. Choosing a narrow window $\Longrightarrow$ good time resolution but poor frequency resolution, while choosing a wide window $\Longrightarrow$ good frequency resolution but poor time resolution.
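The discrete STFT and the resolution trade-off in remark 3 can be sketched as follows. This is a minimal illustration under several assumptions not taken from the lecture: the helper `stft`, the Hann window (`np.hanning`), the sampling rate `fs`, and the two-tone test signal are all hypothetical choices; frequencies are evaluated only on the DFT grid $\Omega_k = 2\pi k / L$; and each frame starts at sample $m$ (rather than being centered on it) with the DFT taken relative to the frame start, which changes only the phase, not the spectrogram.

```python
import numpy as np

def stft(x, window, hop):
    """Discrete STFT on the DFT grid: one row per window position, one column per bin."""
    L = len(window)
    starts = np.arange(0, len(x) - L + 1, hop)            # window shifts m
    frames = np.stack([x[s:s + L] * window for s in starts])
    return starts, np.fft.rfft(frames, axis=1)

fs = 1000.0                                               # assumed sampling rate in Hz
n = np.arange(1024)
# Non-stationary test signal: 100 Hz in the first half, 300 Hz in the second half.
x = np.where(n < 512,
             np.sin(2 * np.pi * 100.0 * n / fs),
             np.sin(2 * np.pi * 300.0 * n / fs))

results = {}
for L, hop in ((32, 8), (256, 64)):                       # narrow and wide windows
    starts, X = stft(x, np.hanning(L), hop)
    spec = np.abs(X) ** 2                                 # spectrogram |X_STFT|^2
    freqs = np.fft.rfftfreq(L, d=1.0 / fs)
    peak = freqs[np.argmax(spec, axis=1)]                 # dominant frequency per frame
    results[L] = (starts, spec, peak)
    print(f"L={L}: bin spacing={fs / L:.2f} Hz, window length={L / fs * 1e3:.0f} ms, "
          f"first peak={peak[0]:.1f} Hz, last peak={peak[-1]:.1f} Hz")
```

The narrow window localizes the change at sample 512 to within 32 samples but places the tones on a coarse frequency grid, while the wide window resolves the tone frequencies to within a few Hz but smears the transition over 256 samples.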
