
Dimensionality Reduction with PCA

Ke Tran

May 24, 2011


Outline

- Introduction
  - Dimensionality Reduction
- PCA - Principal Components Analysis
  - PCA
- Experiment
  - The Dataset
- Discussion
- Conclusion


Why dimensionality reduction?

- To discover or to reduce the dimensionality of the data set.

- To identify new meaningful underlying variables.

- Curse of dimensionality: some problems become intractable as the number of variables increases.


- The data becomes easier to reason about and to obtain insights from.

- Too much noise in the data.

- To visualize the data.

- More effective data analyses can be built on the reduced-dimensional space: classification, clustering, pattern recognition.


PCA - Basic Idea

- Projection.

- Can be used to determine how many real dimensions there are in the data.

Figure: The data forms a cluster of points in a 3-D space.

Figure: The covariance eigenvectors identified by PCA are shown in red. The plane defined by the two largest eigenvectors is shown in light red.

Figure: If we look at the data in the plane identified by PCA, it is clear that it was mostly 2-D.


Linear Transformation

1. Let X be the original data set, where each column is a single sample; X is an m × n matrix.

2. Let Y be another m × n matrix related to X by a linear transformation P.

3. X is the original recorded data set and Y is a new representation of that data set.

$$PX = \begin{bmatrix} p_1 \\ \vdots \\ p_m \end{bmatrix} \begin{bmatrix} x_1 & \cdots & x_n \end{bmatrix}, \qquad Y = \begin{bmatrix} p_1 \cdot x_1 & \cdots & p_1 \cdot x_n \\ \vdots & \ddots & \vdots \\ p_m \cdot x_1 & \cdots & p_m \cdot x_n \end{bmatrix}$$

1. What is the best way to re-express X?

2. What is a good choice of basis P?
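As a concrete illustration (ours, not from the slides), here is a tiny NumPy example of Y = PX with made-up numbers: each entry Y[i, j] is the dot product of basis row p_i with sample column x_j.

```python
import numpy as np

# Rows of P are the basis vectors p_1, p_2 (here a 45-degree rotation).
P = np.array([[ 0.7071,  0.7071],
              [-0.7071,  0.7071]])
# Columns of X are the samples x_1, x_2, x_3.
X = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

Y = P @ X    # Y[i, j] == p_i . x_j
print(Y)
```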


Variance and the Goal

What is the best way to re-express X?

Figure: The signal and noise variances $\sigma^2_{\text{signal}}$ and $\sigma^2_{\text{noise}}$ are graphically represented by the two lines subtending the cloud of data.

- Signal-to-noise ratio (SNR):
$$\mathrm{SNR} = \frac{\sigma^2_{\text{signal}}}{\sigma^2_{\text{noise}}}$$

- A high SNR indicates a high-precision measurement, while a low SNR indicates very noisy data.


Variance and the Goal

Figure: A spectrum of possible redundancies in data from the two separate measurements r1 and r2. The two measurements on the left are uncorrelated because one cannot predict one from the other. Conversely, the two measurements on the right are highly correlated, indicating highly redundant measurements.


Assumption behind PCA

1. Linearity

2. Large variances have important structure.

3. The principal components are orthogonal: $p_i \cdot p_j = 0$ for $i \neq j$.


PCA algorithm

1. Select a normalized direction in m-dimensional space along which the variance in X is maximized. Save this vector as $p_1$.

2. Find another direction along which variance is maximized; however, because of the orthonormality condition, restrict the search to all directions orthogonal to all previously selected directions. Save this vector as $p_i$.

3. Repeat this procedure until m vectors are selected.

The resulting ordered vectors $p_1, \dots, p_m$ are the principal components.


PCA algorithm - Computational Trick

1. Compute the covariance matrix $C_X \equiv \frac{1}{n} X X^T$ (with X mean-centered, so each variable has zero mean).

2. Select the matrix P so that each row $p_i$ is an eigenvector of $\frac{1}{n} X X^T$.

3. If A is a square matrix, a non-zero vector v is an eigenvector of A if there is a scalar $\lambda$ such that $Av = \lambda v$.

4. Reduction: there are m eigenvectors; we reduce from m dimensions to k dimensions by choosing the k eigenvectors associated with the k largest eigenvalues $\lambda$.
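A minimal NumPy sketch of this eigendecomposition route (our reconstruction, not code from the slides); it follows the slides' convention that samples are the columns of an m × n matrix X, and makes the mean-centering step explicit.

```python
import numpy as np

def pca_eig(X, k):
    """Project m-dimensional samples (columns of X) onto the top-k principal components."""
    Xc = X - X.mean(axis=1, keepdims=True)   # center each variable (row)
    n = Xc.shape[1]
    C = (Xc @ Xc.T) / n                      # covariance matrix C_X = (1/n) X X^T
    eigvals, eigvecs = np.linalg.eigh(C)     # eigh, since C is symmetric
    order = np.argsort(eigvals)[::-1]        # eigenvalues in descending order
    P = eigvecs[:, order[:k]].T              # rows of P are the top-k eigenvectors
    return P @ Xc                            # Y = PX, now k x n

# Example: 3-D points that are essentially 2-D (third coordinate is tiny noise).
rng = np.random.default_rng(0)
X = rng.normal(size=(2, 200))
X = np.vstack([X, 0.01 * rng.normal(size=200)])
Y = pca_eig(X, 2)
print(Y.shape)    # (2, 200)
```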


How to choose k?

1. Proportion of Variance (PoV) explained:
$$\mathrm{PoV}(k) = \frac{\lambda_1 + \lambda_2 + \cdots + \lambda_k}{\lambda_1 + \lambda_2 + \cdots + \lambda_k + \cdots + \lambda_m}$$
where the $\lambda_i$ are sorted in descending order.

2. Typically, stop at PoV ≥ 0.9.

3. A scree graph plots PoV vs. k; stop at the "elbow".
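A short sketch (illustrative, not from the slides) of the PoV rule: accumulate the sorted eigenvalues and pick the smallest k whose cumulative share reaches the threshold.

```python
import numpy as np

def choose_k(eigvals, threshold=0.9):
    lam = np.sort(eigvals)[::-1]          # lambda_i in descending order
    pov = np.cumsum(lam) / lam.sum()      # PoV(k) for k = 1..m
    return int(np.searchsorted(pov, threshold) + 1)

print(choose_k(np.array([5.0, 3.0, 1.0, 0.5, 0.5])))  # -> 3 (PoV reaches 0.9 at k = 3)
```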


Scree graph


Yale Faces B - The Dataset

A standard face-recognition test data set containing:

- 10 subjects

- in 585 different positions and lighting conditions each

→ a database of 5850 images

- representation: Matlab type

- image dimension: 30 × 40

→ image representation: 1200-dimensional vector (see the sketch below)

- split randomly into training and test sets
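For concreteness, a tiny sketch (hypothetical, not the original Matlab code) of how a 30 × 40 image becomes the 1200-dimensional vector mentioned above:

```python
import numpy as np

image = np.zeros((40, 30))        # placeholder pixels for one 30 x 40 face image
vector = image.reshape(-1)        # flatten into a 1200-dimensional vector
assert vector.shape == (1200,)
```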


Yale Faces B - The Dataset

Figure: Yale Faces B: selected sample pictures for one test person


SVM experiment

Procedure:

- Again: random division of Yale Faces B into training and test set:
  - training set: 1800 images of 10 classes
  - test set: 4050 (remaining) images
  → 30% training, 70% testing


Eigenfaces or Eigenvectors

1. Using PCA, reduce to 10 dimensions.

2. Classification with SVM: 97.5% accuracy.

Figure: 2-D visualization of data encoded into Eigenfaces.
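The original experiment was run in Matlab; a hedged scikit-learn reconstruction of the same pipeline (PCA down to 10 dimensions, then a multiclass SVM) might look as follows. The random arrays are placeholders standing in for the real Yale Faces B split described above.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Placeholder data in the shapes described earlier; substitute the real
# 1200-dimensional image vectors and subject labels.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(1800, 1200))
y_train = rng.integers(0, 10, size=1800)
X_test  = rng.normal(size=(4050, 1200))
y_test  = rng.integers(0, 10, size=4050)

clf = make_pipeline(PCA(n_components=10), SVC(kernel="linear"))
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))   # the slides report 97.5% on the real data
```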


When does PCA fail?

- Non-linear data.

- Non-Gaussian distributions.

- Variance due to error: when the directions of largest variance reflect noise rather than structure.


PCA or not?

1. It depends on the problem.

2. It depends on the available computational resources.

3. There are many better methods for dimensionality reduction.

PCA: 97.5% accuracy

Figure: Visualization of the 2-D projection onto Eigenfaces, showing linear separability.

Autoencoder: 99.8% accuracy

Figure: Comparison: visualization of 2-D autoencoded data, showing better linear separability.
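For comparison, one way to sketch the autoencoder side (our improvisation; the slides do not show their architecture) is a single 2-unit bottleneck trained to reproduce its input, here built from scikit-learn's MLPRegressor; the 2-D codes are read off by running the input through the first layer manually.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 50))            # placeholder data (samples x features)

# Train X -> X through a 2-unit hidden bottleneck.
ae = MLPRegressor(hidden_layer_sizes=(2,), activation="tanh",
                  max_iter=2000, random_state=0).fit(X, X)

# 2-D codes: a forward pass through the first (encoder) layer.
Z = np.tanh(X @ ae.coefs_[0] + ae.intercepts_[0])
print(Z.shape)                             # (500, 2)
```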



Questions?

Thanks!
