TRANSCRIPT
Warm-up as you log in
1. https://www.sporcle.com/games/MrChewypoo/minimalist_disney
2. https://www.sporcle.com/games/Stanford0008/minimalist-cartoons-slideshow
3. https://www.sporcle.com/games/MrChewypoo/minimalist
Announcements
Assignments
▪ HW9
▪ Due Wed, 12/9, 11:59 pm
▪ The two slip days are free (last possible submission Fri, 12/11, 11:59 pm)
Final Exam
▪ Mon, 12/14
▪ Stay tuned to Piazza for more details
Wrap-up Clustering (clustering slides)
Introduction to Machine Learning
Dimensionality Reduction and PCA
Instructor: Pat Virtue
Learning Paradigms
Slide credit: CMU MLD, Matt Gormley
Outline
Dimensionality Reduction
▪ High-dimensional data
▪ Learning (low dimensional) representations
Principal Component Analysis (PCA)
▪ Examples: 2D and 3D
▪ PCA algorithm
▪ PCA objective and optimization
▪ PCA, eigenvectors, and eigenvalues
Dimensionality Reduction
For each 𝑥(𝑖) ∈ ℝ𝑀, find a representation 𝑧(𝑖) ∈ ℝ𝐾 where 𝐾 ≪ 𝑀
High-Dimensional Data
Examples of high-dimensional data:
– High-resolution images (millions of pixels)
Dimensionality Reduction
http://timbaumann.info/svd-image-compression-demo/
https://cs.stanford.edu/people/karpathy/convnetjs/demo/autoencoder.html
High-Dimensional Data
Examples of high-dimensional data:
– Brain imaging data (100s of MBs per scan)
Image from https://pixabay.com/en/brain-mrt-magnetic-resonance-imaging-1728449/
Image from (Wehbe et al., 2014)
Learning Representations
PCA, Kernel PCA, and ICA are powerful unsupervised learning techniques for extracting hidden (potentially lower-dimensional) structure from high-dimensional datasets.
Useful for:
• Visualization
• Further processing by machine learning algorithms
• More efficient use of resources (e.g., time, memory, communication)
• Statistical: fewer dimensions → better generalization
• Noise removal (improving data quality)
Slide from Nina Balcan
PRINCIPAL COMPONENT ANALYSIS (PCA)
Principal Component Analysis (PCA)
In the case where data lies on or near a low-dimensional linear subspace, the axes of this subspace are an effective representation of the data.
Identifying these axes is known as Principal Components Analysis; they can be obtained using classic matrix computation tools (eigendecomposition or singular value decomposition).
Slide from Nina Balcan
2D Gaussian dataset
Slide from Barnabas Poczos
1st PCA axis
Slide from Barnabas Poczos
2nd PCA axis
Slide from Barnabas Poczos
PCA Axes
Data for PCA
We assume the data is centered
Q: What if your data is not centered?
A: Subtract off the sample mean.
Slide from Matt Gormley
Sample Covariance Matrix
The sample covariance matrix is given by:
Σ = (1/𝑁) ∑𝑖 (𝒙(𝑖) − 𝝁)(𝒙(𝑖) − 𝝁)𝑇
Since the data matrix is centered (𝝁 = 𝟎), we can rewrite this as:
Σ = (1/𝑁) 𝑿𝑇𝑿
Slide from Matt Gormley
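The centered-data identity above can be checked with a minimal numpy sketch (the data and variable names here are illustrative, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))   # N=100 samples, M=3 features
X = X - X.mean(axis=0)          # center: subtract the sample mean

Sigma = (X.T @ X) / X.shape[0]  # (1/N) X^T X, valid once X is centered

# np.cov with bias=True uses the same 1/N normalization;
# rowvar=False treats rows as samples
assert np.allclose(Sigma, np.cov(X, rowvar=False, bias=True))
print(Sigma.shape)  # (3, 3)
```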
PCA Algorithm
Input: 𝑿, 𝑿𝑡𝑒𝑠𝑡, 𝐾
1. Center data (and scale each axis) based on training data → 𝑿, 𝑿𝑡𝑒𝑠𝑡
2. 𝑽 = eigenvectors(𝑿𝑇𝑿)
3. Keep only the top 𝐾 eigenvectors: 𝑽𝐾
4. 𝐙test = 𝑿𝑡𝑒𝑠𝑡𝑽𝐾
Optionally, use 𝑽𝐾𝑇 to rotate 𝐙test back to the original subspace 𝐗′test and uncenter.
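The steps of the algorithm can be sketched in numpy as follows (function and variable names are mine, and scaling is omitted; a minimal sketch, not the course's reference implementation):

```python
import numpy as np

def pca_fit_transform(X_train, X_test, K):
    """PCA following the slide's steps; names are illustrative."""
    mu = X_train.mean(axis=0)
    Xc, Xc_test = X_train - mu, X_test - mu   # 1. center using training stats
    evals, V = np.linalg.eigh(Xc.T @ Xc)      # 2. eigenvectors of X^T X
    V_K = V[:, np.argsort(evals)[::-1][:K]]   # 3. keep top-K eigenvectors (columns)
    Z_test = Xc_test @ V_K                    # 4. project down to K dimensions
    X_recon = Z_test @ V_K.T + mu             # optional: rotate back and uncenter
    return Z_test, X_recon

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 5))
Z, Xr = pca_fit_transform(X, X, K=2)
print(Z.shape, Xr.shape)  # (50, 2) (50, 5)
```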
GLBC – MSK Image Analysis, April 23, 2010
Growth Plate Imaging: Growth Plate Disruption and Limb Length Discrepancy
Images courtesy H. Potter, H.S.S.
8-year-old boy with previous fracture and 4 cm leg length discrepancy
Growth Plate Imaging: Area Measurement
Flatten Growth Plate to Enable 2D Area Measurement
Piazza Poll 1
What is the projection of point 𝒙 onto vector 𝒗, assuming that ‖𝒗‖2 = 1?
A. 𝒗𝒙
B. 𝒗𝑇𝒙
C. (𝒗𝑇𝒙) 𝒗
D. (𝒗𝑇𝒙)(𝒙𝑇𝒗)
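One way to sanity-check a candidate answer numerically: an orthogonal projection onto 𝒗 must leave a residual perpendicular to 𝒗. A small sketch (the vectors here are arbitrary test data):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=3)
v = rng.normal(size=3)
v = v / np.linalg.norm(v)    # enforce ||v||_2 = 1

proj = (v @ x) * v           # candidate C: (v^T x) v, a vector along v
# the residual of an orthogonal projection must be perpendicular to v
assert np.isclose(v @ (x - proj), 0.0)
```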
Rotation of Data (and back)
1. For any orthogonal matrix 𝑽 ∈ ℝ𝑀×𝑀
2. Rotate to new space: 𝒛(𝑖) = 𝑽𝒙(𝑖) ∀𝑖
3. (Un)rotate back: 𝒙′(𝑖) = 𝑽𝑇𝒛(𝑖)
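Because 𝑽𝑇𝑽 = 𝑰 for an orthogonal matrix, rotating and un-rotating recovers the original point exactly. A quick check (the matrix is generated via QR just to get an arbitrary orthogonal 𝑽):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(4, 4))
V, _ = np.linalg.qr(A)       # QR factorization gives an orthogonal matrix V
x = rng.normal(size=4)

z = V @ x                    # rotate into the new basis
x_back = V.T @ z             # V^T V = I, so we recover x exactly
assert np.allclose(x_back, x)
```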
Sketch of PCA
1. Select “best” 𝑽 ∈ ℝ𝐾×𝑀
2. Project down: 𝒛(𝑖) = 𝑽𝒙(𝑖) ∀𝑖
3. Reconstruct up: 𝒙′(𝑖) = 𝑽𝑇𝒛(𝑖)
Definition of PCA
1. Select 𝑣1 that best explains the data
2. Select the next 𝑣𝑗 that
   i. is orthogonal to 𝑣1, … , 𝑣𝑗−1
   ii. best explains the remaining data
3. Repeat step 2 until the desired amount of data is explained
Select “Best” Vector: Reconstruction Error vs. Variance of Projection
Piazza Poll 2 & Poll 3
Consider the two projections below (Option A and Option B).
Poll 2: Which maximizes the variance?
Poll 3: Which minimizes the reconstruction error?
PCA: Equivalence of Maximizing Variance and Minimizing Reconstruction Error
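The equivalence follows from Pythagoras: for a unit vector 𝒗, each point splits as 𝒙 = (𝒗𝑇𝒙)𝒗 + residual, so projected energy + reconstruction error = total energy, a constant. Maximizing one term therefore minimizes the other. A numerical sketch of this invariant (synthetic 2D data, names are mine):

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(200, 2)) @ np.array([[3.0, 0.0], [1.0, 0.5]])
X = X - X.mean(axis=0)        # centered, anisotropic 2D cloud

def split_energy(v):
    v = v / np.linalg.norm(v)
    z = X @ v                          # scalar projections onto v
    recon = np.outer(z, v)             # reconstructions (v^T x) v
    return (z ** 2).sum(), ((X - recon) ** 2).sum()

for theta in np.linspace(0, np.pi, 5):
    v = np.array([np.cos(theta), np.sin(theta)])
    var_along, recon_err = split_energy(v)
    # projected energy + reconstruction error = total energy, for every v
    assert np.isclose(var_along + recon_err, (X ** 2).sum())
```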
PCA: the First Principal Component
SVD for PCA
SVD matrix factorization: 𝑿 = 𝑼𝑺𝑽𝑇, 𝑿 ∈ ℝ𝑁×𝑀
𝑼: 𝑁 × 𝑁 orthogonal matrix
▪ Columns of 𝑼 are left singular vectors of 𝑿
▪ Columns of 𝑼 are eigenvectors of 𝑿𝑿𝑇
𝑽: 𝑀 × 𝑀 orthogonal matrix
▪ Columns of 𝑽 are right singular vectors of 𝑿
▪ Columns of 𝑽 are eigenvectors of 𝑿𝑇𝑿
𝑺: 𝑁 × 𝑀 diagonal matrix
▪ Diagonal entries are the singular values 𝜎𝑘 of 𝑿
▪ Each 𝜎𝑘2 is an eigenvalue of both 𝑿𝑿𝑇 and 𝑿𝑇𝑿
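The relationship 𝜎𝑘2 = 𝜆𝑘 can be verified directly with numpy (random data; a sketch, not part of the lecture):

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.normal(size=(30, 4))
X = X - X.mean(axis=0)

U, s, Vt = np.linalg.svd(X, full_matrices=False)   # X = U S V^T
evals = np.linalg.eigvalsh(X.T @ X)[::-1]          # eigenvalues of X^T X, descending

# squared singular values of X equal the eigenvalues of X^T X
assert np.allclose(s ** 2, evals)
```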
![Page 44: Warm-up as you log in10601/lectures/10601_Fa20...Wrap-up Clustering Clustering slides Introduction to Machine Learning Dimensionality Reduction and PCA Instructor: Pat Virtue Learning](https://reader033.vdocuments.mx/reader033/viewer/2022060917/60aa48fc2ad4632a18770f23/html5/thumbnails/44.jpg)
PCA AlgorithmInput: 𝑿, 𝑿𝑡𝑒𝑠𝑡, 𝐾
1. Center data (and scale each axis) based on training data → 𝑿,𝑿𝒕𝒆𝒔𝒕
2. 𝑽 = eigenvectors(𝑿𝑇𝑿)
3. Keep only the top 𝐾 eigenvectors: 𝑽𝐾4. 𝐙test = 𝑿𝑡𝑒𝑠𝑡𝑽𝐾
Optionally, use 𝑽𝐾𝑇 to rotate 𝐙test back to original subspace 𝐗′test and
uncenter
![Page 45: Warm-up as you log in10601/lectures/10601_Fa20...Wrap-up Clustering Clustering slides Introduction to Machine Learning Dimensionality Reduction and PCA Instructor: Pat Virtue Learning](https://reader033.vdocuments.mx/reader033/viewer/2022060917/60aa48fc2ad4632a18770f23/html5/thumbnails/45.jpg)
Principal Component Analysis (PCA)
𝑋𝑇𝑋𝑣 = 𝜆𝑣, so 𝑣 (the first PC) is an eigenvector of the sample correlation/covariance matrix 𝑋𝑇𝑋.
Sample variance of the projection: 𝑣𝑇𝑋𝑇𝑋𝑣 = 𝜆𝑣𝑇𝑣 = 𝜆
Thus, the eigenvalue 𝜆 denotes the amount of variability captured along that dimension (aka the amount of energy along that dimension).
Eigenvalues 𝜆1 ≥ 𝜆2 ≥ 𝜆3 ≥ ⋯
• The 1st PC 𝑣1 is the eigenvector of the sample covariance matrix 𝑋𝑇𝑋 associated with the largest eigenvalue
• The 2nd PC 𝑣2 is the eigenvector of the sample covariance matrix 𝑋𝑇𝑋 associated with the second largest eigenvalue
• And so on …
Slide from Nina Balcan
How Many PCs?
• For 𝑀 original dimensions, the sample covariance matrix is 𝑀×𝑀 and has up to 𝑀 eigenvectors, so 𝑀 PCs.
• Where does dimensionality reduction come from? We can ignore the components of lesser significance.
[Bar chart: Variance (%) captured by PC1 through PC10]
• You do lose some information, but if the eigenvalues are small, you don’t lose much:
– 𝑀 dimensions in the original data
– calculate 𝑀 eigenvectors and eigenvalues
– choose only the first 𝐷 eigenvectors, based on their eigenvalues
– the final data set has only 𝐷 dimensions
Variance (%) = ratio of the variance along a given principal component to the total variance of all principal components
© Eric Xing @ CMU, 2006-2011
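A common way to pick 𝐷 in practice is to keep the smallest number of components whose cumulative variance ratio crosses a threshold (95% here; the threshold and the synthetic data are my choices, not the lecture's):

```python
import numpy as np

rng = np.random.default_rng(6)
# synthetic data whose per-axis variance decays rapidly
scales = np.array([5, 3, 2, 1, 0.5, 0.3, 0.2, 0.1, 0.05, 0.01])
X = rng.normal(size=(500, 10)) * scales
X = X - X.mean(axis=0)

evals = np.linalg.eigvalsh(X.T @ X)[::-1]   # variance along each PC, descending
ratio = evals / evals.sum()                 # "Variance (%)" per component
cum = np.cumsum(ratio)
D = int(np.searchsorted(cum, 0.95) + 1)     # smallest D reaching 95% of variance
print(ratio.round(3), D)
```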
PCA EXAMPLES
Projecting MNIST digits
Task Setting:
1. Take 28x28 images of digits and project them down to K components
2. Report percent of variance explained for K components
3. Then project back up to a 28x28 image to visualize how much information was preserved
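This task setting can be sketched as follows. Loading real MNIST is assumed to happen elsewhere, so random arrays stand in for the images here; K=32 and all names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(7)
# stand-in for MNIST: 100 flattened 28x28 "images" (real data loading assumed)
X = rng.normal(size=(100, 784))
mu = X.mean(axis=0)
Xc = X - mu

K = 32
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
V_K = Vt[:K].T                                    # top-K principal directions
Z = Xc @ V_K                                      # 1. project down to K components
explained = (s[:K] ** 2).sum() / (s ** 2).sum()   # 2. fraction of variance explained
X_back = (Z @ V_K.T + mu).reshape(-1, 28, 28)     # 3. project back up to 28x28 images
print(Z.shape, X_back.shape)  # (100, 32) (100, 28, 28)
```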
Projecting MNIST digits
Task Setting:
1. Take 28x28 images of digits and project them down to 2 components
2. Plot the 2-dimensional points
Dimensionality Reduction with Deep Learning
Hinton, Geoffrey E., and Ruslan R. Salakhutdinov. "Reducing the dimensionality of data with neural networks." Science 313.5786 (2006): 504-507.
PCA vs. Neural Network
A Huge Thanks to the Course Team!
Education Associates
Teaching Assistants
Students!!