Principal Component Analysis – the original paper
Adrian Florea, Papers We Love – Bucharest Chapter, Meetup #12, December 16, 2016

TRANSCRIPT

Page 1:

Principal Component Analysis – the original paper

ADRIAN FLOREA

PAPERS WE LOVE – BUCHAREST CHAPTER – MEETUP #12

DECEMBER 16TH, 2016

Page 2:

Karl Pearson (1857-1936)

K. Pearson, "On lines and planes of closest fit to systems of points in space", The London, Edinburgh and Dublin Philosophical Magazine and Journal of Science, Sixth Series, 2, pp. 559-572 (1901)

Page 3:

Dimensionality reduction (Pearson's words, 1901)

Page 4:

Dimensionality reduction (115 years later, 2016)

71,291 points, 200 features, 2 principal components

https://projector.tensorflow.org

Page 5:

Dimensionality reduction

Original data set (n × d):
• n observations
• d possibly correlated features

    → transform →

New data set (n × r), with r << d:
• n observations
• r uncorrelated features
• retain most of the data variability
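This n × d → n × r transform can be sketched in NumPy (hypothetical random data; a random orthonormal basis stands in here for the principal components, which the later slides derive):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, r = 100, 5, 2                  # n observations, d features, r << d

X = rng.normal(size=(n, d))          # original data set, shape (n, d)
X = X - X.mean(axis=0)               # center the features

# Any d x r matrix with orthonormal columns turns X into an n x r data
# set; PCA will choose the r columns that retain the most variability.
U_r, _ = np.linalg.qr(rng.normal(size=(d, r)))   # orthonormal columns
A = X @ U_r                          # new data set, shape (n, r)
```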

Page 6:

Introduction

Data: $x_i \in \mathbb{R}^d$, $i = 1, \dots, n$; data matrix $X = (x_1\; x_2\; \cdots\; x_n)^T$.

In the standard basis, $\mathbb{R}^d = \mathrm{span}(e_1, e_2, \dots, e_d)$ and $x = x_1 e_1 + x_2 e_2 + \cdots + x_d e_d$.

Take any orthonormal basis $\{u_i \in \mathbb{R}^d \mid i = 1, \dots, d\}$, i.e. $u_i^T u_j = \delta_{ij}$ and $\mathbb{R}^d = \mathrm{span}(u_1, u_2, \dots, u_d)$. Then

$$x = a_1 u_1 + a_2 u_2 + \cdots + a_d u_d = Ua, \qquad a = U^T x.$$

The columns of $U$ are an orthonormal basis $\Leftrightarrow$ $U$ is orthogonal: $U^T U = U U^T = I$.
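The change of basis can be checked numerically (illustrative sketch: random data and a random orthonormal $U$ built via QR, both assumptions of this example):

```python
import numpy as np

rng = np.random.default_rng(1)
d = 4
x = rng.normal(size=d)               # a point in the standard basis

# Columns of U form an orthonormal basis of R^d (QR of a random matrix)
U, _ = np.linalg.qr(rng.normal(size=(d, d)))

a = U.T @ x                          # coordinates in the new basis
x_back = U @ a                       # x = a_1 u_1 + ... + a_d u_d = U a

# U orthogonal (U^T U = U U^T = I), so the change of basis is lossless
```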

Page 7:

Goal: find an "optimal" basis

$(x_1, x_2, \dots, x_d)^T$ w.r.t. $\mathrm{span}(e_1, e_2, \dots, e_d)$ $\;=\;$ $(a_1, a_2, \dots, a_d)^T$ w.r.t. $\mathrm{span}(u_1, u_2, \dots, u_d)$

Keep only $r \ll d$ components, chosen so that the truncation preserves most of the "variability":

$$x' = a_1 u_1 + a_2 u_2 + \cdots + a_r u_r = U_r a_r, \qquad a_r = U_r^T x, \qquad x' = U_r U_r^T x = P_r x.$$

Page 8:

r = 1: find a unit vector $u$ ($u^T u = 1$) that maximizes the variance of the points projected on it.

Projection of $x_i$ on $u$: $\|x_i'\| = \|x_i\| \cos(x_i, u) = \dfrac{x_i^T u}{\|u\|} = x_i^T u$, so $x_i' = (x_i^T u)\,u = a_i(u)\,u$.

Assume the data are centered, $\bar{x} = 0$. The variance of the projections is

$$\frac{1}{n}\sum_{i=1}^{n}\left(x_i^T u - 0\right)^2 = \frac{1}{n}\sum_{i=1}^{n}(u^T x_i)(x_i^T u) = u^T\Big(\frac{1}{n}\sum_{i=1}^{n} x_i x_i^T\Big)u = u^T\Big(\frac{1}{n}X^T X\Big)u = u^T \Sigma u,$$

where $\Sigma = \mathrm{cov}(X)$. The problem becomes

$$\max_u\; u^T \Sigma u \quad \text{s.t.} \quad u^T u = 1.$$
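The identity "projected variance $= u^T \Sigma u$" can be verified on synthetic data (an illustrative sketch; the correlated data and the arbitrary unit vector are assumptions of this example):

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 500, 3
X = rng.normal(size=(n, d)) @ rng.normal(size=(d, d))  # correlated features
X = X - X.mean(axis=0)               # assume zero mean, as on the slide

u = rng.normal(size=d)
u = u / np.linalg.norm(u)            # unit vector, u^T u = 1

Sigma = (X.T @ X) / n                # covariance matrix (1/n) X^T X

var_direct = np.mean((X @ u) ** 2)   # variance of the projections x_i^T u
var_quadratic = u @ Sigma @ u        # the quadratic form u^T Sigma u
```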

Page 9:

First principal component

Lagrangian: $L(u, \lambda) = u^T \Sigma u - \lambda(u^T u - 1)$

$$\frac{\partial L}{\partial u} = 2\Sigma u - 2\lambda u = 0 \;\Rightarrow\; \Sigma u = \lambda u,$$

so $\lambda$ is an eigenvalue of the covariance matrix $\Sigma$ with associated eigenvector $u$, and

$$\max_u\; u^T \Sigma u = \max_u\; \lambda\, u^T u = \lambda_{\max} = \lambda_1.$$

The largest eigenvalue $\lambda_1$ is the projected variance on the associated eigenvector $u_1$, which is the direction of most variance, called the first principal component.
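This variational characterization can be sketched with NumPy's symmetric eigensolver (synthetic data; `eigh` and the random probing directions are illustrative choices, not part of the talk):

```python
import numpy as np

rng = np.random.default_rng(3)
n, d = 1000, 4
X = rng.normal(size=(n, d)) @ rng.normal(size=(d, d))  # correlated features
X = X - X.mean(axis=0)                                 # center the data
Sigma = (X.T @ X) / n                                  # covariance matrix

# eigh is for symmetric matrices; eigenvalues come back in ascending order
eigvals, eigvecs = np.linalg.eigh(Sigma)
lam1, u1 = eigvals[-1], eigvecs[:, -1]   # lambda_1 and the first PC

var_u1 = u1 @ Sigma @ u1   # projected variance along u1 equals lambda_1
```

No other unit vector should achieve a larger projected variance than $u_1$, which the assertions below probe with random directions.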

Page 10: (figure)

Page 11:

The equivalent optimization problem solved by Pearson:

$$\min_u\; \mathrm{MSE}(u) \quad \text{s.t.} \quad u^T u = 1$$

Page 12:

An equivalent objective function

$$\mathrm{MSE}(u) = \frac{1}{n}\sum_{i=1}^{n}\|x_i - x_i'\|^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - x_i')^T(x_i - x_i') = \frac{1}{n}\sum_{i=1}^{n}\big(\|x_i\|^2 - 2\,x_i^T x_i' + x_i'^T x_i'\big)$$

But $x_i' = (x_i^T u)\,u$, so

$$\mathrm{MSE}(u) = \frac{1}{n}\sum_{i=1}^{n}\big(\|x_i\|^2 - 2(x_i^T u)(u^T x_i) + (x_i^T u)\,u^T u\,(u^T x_i)\big) = \frac{1}{n}\sum_{i=1}^{n}\big(\|x_i\|^2 - (x_i^T u)^2\big)$$

(using $u^T u = 1$), and therefore

$$\mathrm{MSE}(u) = \underbrace{\frac{1}{n}\sum_{i=1}^{n}\|x_i\|^2}_{\text{const}} - \;u^T\Big(\frac{1}{n}\sum_{i=1}^{n} x_i x_i^T\Big)u = \text{const} - u^T \Sigma u.$$

Maximizing the projected variance is equivalent to minimizing the mean squared error:

$$\min_u \mathrm{MSE}(u) \iff \max_u u^T \Sigma u.$$
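The equivalence $\mathrm{MSE}(u) = \text{const} - u^T \Sigma u$ can be checked numerically (illustrative sketch on synthetic centered data):

```python
import numpy as np

rng = np.random.default_rng(4)
n, d = 400, 3
X = rng.normal(size=(n, d)) @ rng.normal(size=(d, d))  # correlated features
X = X - X.mean(axis=0)               # centered data
Sigma = (X.T @ X) / n                # covariance matrix

u = rng.normal(size=d)
u = u / np.linalg.norm(u)            # any unit vector

# Reconstruction x_i' = (x_i^T u) u and its mean squared error
X_proj = np.outer(X @ u, u)          # each row is (x_i^T u) u
mse = np.mean(np.sum((X - X_proj) ** 2, axis=1))

const = np.mean(np.sum(X ** 2, axis=1))   # (1/n) sum_i ||x_i||^2
```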

Page 13:

r = 2: we have found $u_1$, the direction of most variance (the r = 1 case). Now solve

$$\max_v\; v^T \Sigma v \quad \text{s.t.} \quad v^T v = 1,\; v^T u_1 = 0.$$

Lagrangian: $L(v, \lambda, \beta) = v^T \Sigma v - \lambda(v^T v - 1) - \beta\, v^T u_1$

$$\frac{\partial L}{\partial v} = 2\Sigma v - 2\lambda v - \beta u_1 = 0.$$

Left-multiplying by $u_1^T$: $2\,u_1^T \Sigma v - 2\lambda\, u_1^T v - \beta\, u_1^T u_1 = 2\lambda_1 u_1^T v - 0 - \beta = -\beta = 0$ (since $u_1^T v = 0$ and $\Sigma u_1 = \lambda_1 u_1$). So

$$\Sigma v = \lambda v,$$

i.e. $v$ is an eigenvector of $\Sigma$, and $\max_v v^T \Sigma v = \lambda_2$, the second largest eigenvalue of $\Sigma$. The second principal component is the eigenvector associated with the second largest eigenvalue, $\lambda_2$.

The total variance is invariant to a change of basis:

$$\sum_{i=1}^{d}\sigma_i^2 = \sum_{i=1}^{d}\lambda_i.$$
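The invariance of the total variance amounts to $\mathrm{tr}(\Sigma) = \sum_i \lambda_i$, which can be checked on synthetic data (illustrative sketch):

```python
import numpy as np

rng = np.random.default_rng(5)
n, d = 300, 5
X = rng.normal(size=(n, d)) @ rng.normal(size=(d, d))  # correlated features
X = X - X.mean(axis=0)
Sigma = (X.T @ X) / n                # covariance matrix

eigvals = np.linalg.eigvalsh(Sigma)  # eigenvalues of the symmetric Sigma

total_feature_variance = np.trace(Sigma)   # sum of per-feature variances
total_pc_variance = eigvals.sum()          # sum of eigenvalues
```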

Page 14:

PCA algorithm

1. Normalize the data set to zero mean, unit variance
2. Calculate the covariance matrix $\Sigma$ of the normalized data set
3. Find all eigenvalues of $\Sigma$ and arrange them in descending order: $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_d \ge 0$
4. Choose $r$; the top $r$ eigenvalues and their eigenvectors (principal components) give the reduced basis $U_r = (u_1\; u_2\; \cdots\; u_r)$ and the reduced-dimensionality data set $a_i = U_r^T x_i$, $i = 1, \dots, n$
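The four steps above can be sketched in NumPy (a minimal, illustrative implementation on hypothetical data, not the talk's own code; it uses an eigendecomposition of the covariance matrix, though SVD of the data matrix is a common alternative):

```python
import numpy as np

def pca(X, r):
    """Steps 1-4 of the slide: normalize, covariance, eigendecompose, project."""
    # 1. normalize to zero mean, unit variance
    Xn = (X - X.mean(axis=0)) / X.std(axis=0)
    # 2. covariance matrix of the normalized data set
    Sigma = (Xn.T @ Xn) / Xn.shape[0]
    # 3. eigenvalues in descending order (eigh returns them ascending)
    eigvals, eigvecs = np.linalg.eigh(Sigma)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # 4. reduced basis U_r and reduced data a_i = U_r^T x_i
    U_r = eigvecs[:, :r]
    A = Xn @ U_r
    return A, U_r, eigvals

rng = np.random.default_rng(6)
X = rng.normal(size=(200, 6)) @ rng.normal(size=(6, 6))  # hypothetical data
A, U_r, eigvals = pca(X, r=2)
```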

Page 15:

How to choose r?

• Explained-variance threshold: $r = \min\Big\{r \;\Big|\; \sum_{i=1}^{r}\lambda_i \Big/ \sum_{i=1}^{d}\lambda_i \ge \alpha\Big\}$, with $\alpha \in [0.7, 0.9]$

• Kaiser criterion (Henry Felix Kaiser, 1960): keep the components with $\lambda_i \ge 1$

• Scree plot: look for the "elbow" in the plot of eigenvalues in descending order, e.g. screeplot(modelname) in R
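The first two rules can be sketched as follows (hypothetical eigenvalues; `choose_r` and `kaiser_r` are illustrative names, not standard APIs):

```python
import numpy as np

def choose_r(eigvals, alpha=0.9):
    """Smallest r whose top-r eigenvalues explain a fraction alpha of variance."""
    ratio = np.cumsum(eigvals) / eigvals.sum()
    return int(np.searchsorted(ratio, alpha) + 1)

def kaiser_r(eigvals):
    """Kaiser criterion: count components with eigenvalue >= 1
    (applies to eigenvalues of a correlation matrix)."""
    return int(np.sum(eigvals >= 1.0))

# hypothetical eigenvalues, already in descending order
eigvals = np.array([3.0, 1.5, 0.8, 0.4, 0.2, 0.1])
```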

Page 16:

PCA Pros & Cons

PROS
• dimensionality reduction can help learning
• removes noise
• can deal with large data sets

CONS
• hard to interpret
• sample dependent
• linear (captures only linear structure in the data)

Page 17:

Bibliography • A.A. Farag, S. Elhabian, "A Tutorial on Principal Component Analysis" (2009) - http://dai.fmph.uniba.sk/courses/ml/sl/PCA.pdf

• N. de Freitas, "CPSC 340 - Machine Learning and Data Mining Lectures", University of British Columbia (2012) - http://www.cs.ubc.ca/~nando/340-2012/lectures.php

• I. Georgescu, “Inteligență computațională“, Editura ASE (2015)

• I.T. Jolliffe, "Principal Component Analysis", 2nd ed., Springer (2002)

• J.N. Kutz, "Data-Driven Modeling & Scientific Computation. Methods for Complex Systems & Big Data", Oxford University Press (2013)

• J.N. Kutz, "Singular Value Decomposition" - http://courses.washington.edu/am301/page1/page15/video.html

• K. Pearson, "On lines and planes of closest fit to systems of points in space", Philos. Mag., Sixth Series, 2, pp. 559-572 (1901) - http://stat.smmu.edu.cn/history/pearson1901.pdf

• M.J. Zaki, W. Meira Jr., “Data Mining and Analysis. Fundamental Concepts and Algorithms”, Cambridge University Press (2014)