Introduction to Machine Learning (10701/slides/16_ICA.pdf)
Introduction to Machine Learning Independent Component Analysis
Barnabás Póczos
Independent Component Analysis
[Figure: original signals → model (mixing) → observations (mixtures) → ICA estimated signals]
Independent Component Analysis
We observe
Model
We want
Goal:
Independent Component Analysis
Sources s(t) → Mixing: x(t) = As(t) → Observation x(t) → PCA Estimation: y(t) = Wx(t)
The Cocktail Party Problem: Solving with PCA
Sources s(t) → Mixing: x(t) = As(t) → Observation x(t) → ICA Estimation: y(t) = Wx(t)
The Cocktail Party Problem: Solving with ICA
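The mixing and demixing pipeline on this slide can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the two sources, the mixing matrix `A`, and the sample count `T` are made up, and the demixing matrix is taken to be the exact inverse of `A`, which a real ICA algorithm would have to estimate from `X` alone.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 1000                               # number of time samples (arbitrary choice)

# Two hypothetical independent sources s(t): a sine wave and uniform noise.
t = np.linspace(0.0, 8.0, T)
S = np.vstack([np.sin(2.0 * np.pi * t), rng.uniform(-1.0, 1.0, T)])

A = np.array([[1.0, 0.5],
              [0.3, 1.0]])             # assumed invertible mixing matrix
X = A @ S                              # observations x(t) = A s(t)

W = np.linalg.inv(A)                   # ideal demixing matrix W = A^{-1}
Y = W @ X                              # estimates y(t) = W x(t)

print(np.allclose(Y, S))               # True
```

In practice `A` is unknown, so `W` can only be recovered up to permutation, sign, and scale of the rows.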
• Perform linear transformations
• Matrix factorization
PCA: low rank matrix factorization for compression
X = US (X has N rows, S has M rows, M < N)
Columns of U = PCA vectors
ICA: full rank matrix factorization to remove dependency among the rows
X = AS (X and S both have N rows)
Columns of A = ICA vectors
ICA vs PCA, Similarities
PCA: X = US, U^T U = I
ICA: X = AS, A is invertible
PCA does compression
• M < N
ICA does not do compression
• same # of features (M = N)
PCA just removes correlations, not higher order dependence
ICA removes correlations, and higher order dependence
PCA: some components are more important than others (based on eigenvalues)
ICA: components are equally important
ICA vs PCA, Differences
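The claim that removing correlations does not remove higher order dependence can be checked numerically. The sketch below is an assumed toy setup, not taken from the slides: two independent uniform sources are mixed by a 45 degree rotation, which leaves the mixtures uncorrelated (what PCA would deliver) while their squares stay clearly correlated.

```python
import numpy as np

rng = np.random.default_rng(0)
S = rng.uniform(-1.0, 1.0, size=(2, 20000))    # independent uniform sources
theta = np.pi / 4
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
X = A @ S                                      # mixtures: sources rotated by 45 degrees

corr_linear = np.corrcoef(X)[0, 1]             # second-order dependence, close to 0
corr_squares = np.corrcoef(X ** 2)[0, 1]       # a fourth-order dependence, clearly nonzero

print(round(corr_linear, 3), round(corr_squares, 3))
```

The mixtures are already decorrelated, so a purely second-order method has nothing left to remove, yet the two coordinates are not independent.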
Note:
• PCA vectors are orthogonal
• ICA vectors need not be orthogonal
ICA vs PCA
ICA vs PCA
Gabor wavelets, edge detection, receptive fields of V1 cells..., deep neural networks
ICA basis vectors extracted from natural images
PCA basis vectors extracted from natural images
EEG ~ Neural cocktail party
Severe contamination of EEG activity by
• eye movements
• blinks
• muscle
• heart, ECG artifact
• vessel pulse
• electrode noise
• line noise, alternating current (60 Hz)
ICA can improve signal
• effectively detect, separate and remove activity in EEG records from a wide variety of artifactual sources. (Jung, Makeig, Bell, and Sejnowski)
ICA weights (mixing matrix) help find location of sources
ICA Application, Removing Artifacts from EEG
Fig. from Jung
ICA Application, Removing Artifacts from EEG
Fig. from Jung
Removing Artifacts from EEG
[Figure panels: original, noisy, Wiener filtered, median filtered, ICA denoised]
ICA for Image Denoising
(Hoyer, Hyvarinen)
Method for analysis and synthesis of human motion from motion captured data
Provides perceptually meaningful “style” components
109 markers (327-dimensional data)
Motion capture ⇒ data matrix
Goal: Find motion style components.
ICA ⇒ 6 independent components (emotion, content, …)
ICA for Motion Style Components
(Mori & Hoshino 2002, Shapiro et al 2006, Cao et al 2003)
[Figure panels: walk, sneaky, walk with sneaky, sneaky with walk]
ICA Theory
Definition (Independence)
Statistical (in)dependence
Definition (Mutual Information)
Definition (Shannon entropy)
Definition (KL divergence)
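The formulas for these four definitions did not survive extraction. For reference, the standard textbook forms are given below; the notation (densities p, f, g and a random vector y with components y_1, ..., y_N) is an assumption and may differ from the symbols used on the slide.

```latex
\begin{aligned}
\text{Independence:}\quad & p(y_1,\dots,y_N) = \prod_{i=1}^{N} p(y_i), \\
\text{Shannon (differential) entropy:}\quad & H(y) = -\int p(y)\,\log p(y)\,dy, \\
\text{KL divergence:}\quad & KL(f \,\|\, g) = \int f(x)\,\log\frac{f(x)}{g(x)}\,dx, \\
\text{Mutual information:}\quad & I(y_1,\dots,y_N)
  = KL\Bigl(p(y_1,\dots,y_N) \,\Big\|\, \prod_{i=1}^{N} p(y_i)\Bigr)
  = \sum_{i=1}^{N} H(y_i) - H(y_1,\dots,y_N).
\end{aligned}
```

The mutual information is nonnegative and equals zero exactly when the components are independent, which is why it can serve as a measure of statistical dependence.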
Solving the ICA problem with i.i.d. sources
Solving the ICA problem
Whitening
(We assumed centered data)
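A minimal whitening sketch, assuming the data matrix stores features in rows and samples in columns, and using the symmetric inverse square root of the sample covariance as the whitening matrix. The function name `whiten`, the matrix name `Q`, and the toy data are illustrative choices, not taken from the slides.

```python
import numpy as np

def whiten(X):
    """Center and whiten X (rows = features, columns = samples)."""
    Xc = X - X.mean(axis=1, keepdims=True)                # remove the mean
    cov = Xc @ Xc.T / Xc.shape[1]                         # sample estimate of E[x x^T]
    eigvals, eigvecs = np.linalg.eigh(cov)                # cov = E diag(d) E^T
    Q = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T    # Q = cov^{-1/2}
    return Q @ Xc, Q

rng = np.random.default_rng(0)
S = rng.uniform(-1.0, 1.0, size=(2, 5000))                # toy independent sources
A = np.array([[2.0, 1.0], [1.0, 1.0]])                    # assumed mixing matrix
Z, Q = whiten(A @ S)

print(np.allclose(Z @ Z.T / Z.shape[1], np.eye(2)))       # True
```

This assumes the covariance matrix has full rank; otherwise the inverse square root does not exist and a dimension reduction step would be needed first.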
Whitening (continued)
We have,
[Figure panels: original, mixed, whitened]
Whitening solves half of the ICA problem
Note:
The number of free parameters of an N by N orthogonal matrix is N(N-1)/2, compared with N^2 for a general N by N matrix.
⇒ whitening solves half of the ICA problem
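The reasoning behind this note can be written out as follows. This is a reconstruction under assumptions: Q denotes the whitening matrix, z the whitened data, and the sources are taken to have unit variance, E[ss^T] = I, which is a common normalization and not stated explicitly in the surviving text.

```latex
\begin{aligned}
z &= Qx = QAs =: \tilde{A}s, \qquad E[zz^\top] = I, \\
I &= E[zz^\top] = \tilde{A}\,E[ss^\top]\,\tilde{A}^\top = \tilde{A}\tilde{A}^\top
   \quad\Rightarrow\quad \tilde{A} \text{ is orthogonal}, \\
\#\text{parameters}(A) &= N^2, \qquad
\#\text{parameters}(\tilde{A}) = \frac{N(N-1)}{2} \approx \frac{N^2}{2}.
\end{aligned}
```

After whitening, only an orthogonal matrix remains to be estimated, which has roughly half as many free parameters as the original mixing matrix.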
ICA task: Given x,
• find y (the estimation of s),
• find W (the estimation of A^{-1})
ICA solution: y = Wx
• Remove mean, E[x] = 0
• Whitening, E[xx^T] = I
• Find an orthogonal W optimizing an objective function
  • Sequence of 2-d Jacobi (Givens) rotations
[Figure panels: original, mixed, whitened, rotated (demixed)]
Solving ICA
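The three steps can be sketched end to end for two sources. Everything specific here is an assumption made for illustration: the uniform sources, the mixing matrix `A`, the objective (sum of absolute kurtoses), and the brute-force grid search over the single rotation angle, which stands in for a proper optimizer.

```python
import numpy as np

def abs_kurtosis_sum(Y):
    """Sum over rows of |E[y^4] - 3| for zero-mean, unit-variance rows."""
    return np.abs((Y ** 4).mean(axis=1) - 3.0).sum()

def rotation(theta):
    """2 x 2 rotation matrix, the only orthogonal family needed in 2-d."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

rng = np.random.default_rng(0)
S = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(2, 20000))   # unit-variance sources
A = np.array([[1.0, 0.5], [0.3, 1.0]])                      # assumed mixing matrix
X = A @ S

# Steps 1 and 2: remove the mean and whiten.
X = X - X.mean(axis=1, keepdims=True)
d, E = np.linalg.eigh(X @ X.T / X.shape[1])
Q = E @ np.diag(d ** -0.5) @ E.T
Z = Q @ X

# Step 3: pick the rotation angle that maximizes the non-Gaussianity objective.
angles = np.linspace(0.0, np.pi / 2, 181)
best = max(angles, key=lambda th: abs_kurtosis_sum(rotation(th) @ Z))

W = rotation(best) @ Q      # estimate of A^{-1} up to permutation and sign
P = W @ A                   # near a signed permutation matrix if separation worked
print(np.round(P, 2))
```

Searching only over [0, π/2] is enough in two dimensions, because a further rotation by π/2 merely permutes the outputs and flips a sign.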
[Figure: Jacobi rotation matrix, equal to the identity except for the entries in rows and columns p and q]
Optimization Using Jacobi Rotation Matrices
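A Jacobi (Givens) rotation matrix can be built as below. The function name `givens_rotation` and the sign convention (minus sine in row p, column q) are assumptions; the opposite sign convention is equally common.

```python
import numpy as np

def givens_rotation(n, p, q, theta):
    """n x n Jacobi (Givens) rotation by angle theta in the (p, q) plane."""
    G = np.eye(n)
    c, s = np.cos(theta), np.sin(theta)
    G[p, p], G[p, q] = c, -s
    G[q, p], G[q, q] = s, c
    return G

G = givens_rotation(4, 0, 2, np.pi / 6)
print(np.allclose(G @ G.T, np.eye(4)))   # True
```

Any N by N rotation can be written as a product of such planar rotations, one angle per pair (p, q), so an orthogonal W can be optimized one angle at a time.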
ICA Cost Functions
Lemma
Proof: Homework
Therefore,
⇒ go away from normal distribution
ICA Cost Functions
Therefore,
The covariance is fixed: I. Which distribution has the largest entropy?
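The equations on the two cost-function slides were lost in extraction. A plausible reconstruction from standard identities is given below; it assumes whitened data x, an orthogonal W, and y = Wx, and it may not match the exact statement of the lemma on the slide.

```latex
\begin{aligned}
I(y_1,\dots,y_N) &= \sum_{i=1}^{N} H(y_i) - H(y_1,\dots,y_N), \\
H(Wx) &= H(x) + \log\lvert\det W\rvert = H(x) \quad \text{for orthogonal } W, \\
\arg\min_{W} I(y_1,\dots,y_N) &= \arg\min_{W} \sum_{i=1}^{N} H(y_i), \qquad y = Wx .
\end{aligned}
```

Among all distributions with a given variance, the Gaussian has the largest differential entropy, so minimizing the marginal entropies of unit-variance components pushes each component away from the normal distribution.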
The suitably normalized sum of many independent variables converges to the normal distribution
⇒ For separation go far away from the normal distribution
⇒ Negentropy, |kurtosis| maximization
Figs from Ata Kaban
Central Limit Theorem
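The effect can be seen numerically: mixing more independent sources moves the result closer to Gaussian. The sketch below uses uniform sources and excess kurtosis as the non-Gaussianity measure; both are illustrative assumptions. For a sum of k independent uniform variables the theoretical excess kurtosis is -1.2/k, which tends to 0, the Gaussian value.

```python
import numpy as np

def excess_kurtosis(y):
    """Sample excess kurtosis E[y^4] / E[y^2]^2 - 3 of the centered sample."""
    y = y - y.mean()
    return np.mean(y ** 4) / np.mean(y ** 2) ** 2 - 3.0

rng = np.random.default_rng(0)
U = rng.uniform(-1.0, 1.0, size=(16, 200000))   # 16 independent uniform sources

for k in (1, 2, 4, 16):
    mixture = U[:k].sum(axis=0)                  # sum of k independent sources
    print(k, round(excess_kurtosis(mixture), 2))
```

Since mixtures are closer to Gaussian than the sources, a demixing direction can be searched for by making the output as non-Gaussian as possible.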
Maximum Entropy
Independent Subspace Analysis
Observation
Hidden, independent sources (subspaces)
Goal: Estimate A and S observing samples from X = AS only
Independent Subspace Analysis
In case of perfect separation, WA is a block permutation matrix.
Objective:
[Figure labels: hidden sources, observation, estimation]
ICA Algorithms
Kurtosis = 4th order cumulant
Measures
• the distance from normality
• the degree of peakedness
ICA algorithm based on Kurtosis maximization
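For reference, the fourth-order cumulant of a zero-mean variable and the resulting optimization problem can be written as below. The symbols (κ_4 for kurtosis, z for whitened data, w for a unit-norm demixing direction) are assumptions, since the slide's own formulas were not preserved.

```latex
\begin{aligned}
\kappa_4(y) &= E[y^4] - 3\,\bigl(E[y^2]\bigr)^2 \quad (E[y] = 0), \\
\kappa_4(y) &= 0 \quad \text{if } y \text{ is Gaussian}, \\
\max_{w}\ &\bigl\lvert \kappa_4(w^\top z) \bigr\rvert \quad \text{subject to } \lVert w \rVert = 1 .
\end{aligned}
```

For whitened data and a unit-norm w the projection has unit variance, so the objective reduces to |E[(w^T z)^4] - 3|.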
Probably the most famous ICA algorithm
The Fast ICA algorithm (Hyvarinen)
(λ: Lagrange multiplier)
Solve this equation by Newton–Raphson’s method.
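A minimal one-unit sketch of the fixed-point iteration that this Newton step leads to, assuming whitened input and the nonlinearity g = tanh (one common choice; the slide's choice of g is not recoverable). The function name, the toy data, and the whitening code are illustrative assumptions.

```python
import numpy as np

def fastica_one_unit(Z, n_iter=200, tol=1e-8, seed=0):
    """Estimate one demixing direction w from whitened data Z (features x samples)."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(Z.shape[0])
    w /= np.linalg.norm(w)
    for _ in range(n_iter):
        y = w @ Z                                   # projections w^T z
        g, g_prime = np.tanh(y), 1.0 - np.tanh(y) ** 2
        w_new = (Z * g).mean(axis=1) - g_prime.mean() * w   # fixed-point step
        w_new /= np.linalg.norm(w_new)              # project back onto the unit sphere
        converged = abs(abs(w_new @ w) - 1.0) < tol # direction unchanged up to sign
        w = w_new
        if converged:
            break
    return w

# Toy data: two unit-variance uniform sources, mixed and then whitened.
rng = np.random.default_rng(1)
S = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(2, 20000))
A = np.array([[1.0, 0.5], [0.3, 1.0]])              # assumed mixing matrix
X = A @ S
X = X - X.mean(axis=1, keepdims=True)
d, E = np.linalg.eigh(X @ X.T / X.shape[1])
Q = E @ np.diag(d ** -0.5) @ E.T
Z = Q @ X

w = fastica_one_unit(Z)
gains = w @ Q @ A          # weight of each source in the output y = w^T z
print(np.round(gains, 2))
```

If the iteration has locked onto one source, one entry of `gains` is close to ±1 and the other is close to 0.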
Newton method for finding a root
Example: Finding a Root
http://en.wikipedia.org/wiki/Newton%27s_method
Newton Method for Finding a Root
Linear Approximation (1st order Taylor approx):
Goal:
Therefore,
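Setting the linear approximation f(x) + f'(x)(x_next - x) to zero gives the update x_next = x - f(x)/f'(x). A minimal sketch of the resulting iteration follows; the function name `newton_root`, the tolerance, and the test function x^2 - 2 are illustrative assumptions.

```python
def newton_root(f, f_prime, x0, tol=1e-12, max_iter=50):
    """Find a root of f by repeatedly solving the linearized equation."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / f_prime(x)     # from 0 = f(x) + f'(x) * (x_next - x)
        x -= step
        if abs(step) < tol:
            break
    return x

root = newton_root(lambda x: x * x - 2.0, lambda x: 2.0 * x, x0=1.0)
print(abs(root - 2 ** 0.5) < 1e-12)   # True
```

The sketch does not guard against f'(x) = 0 or against divergence from a poor starting point.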
Illustration of Newton’s method
Goal: finding a root
In the next step we will linearize here in x
Newton Method for Finding a Root
This can be generalized to multivariate functions
Therefore,
[Pseudo inverse if there is no inverse]
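In the multivariate case the derivative is replaced by the Jacobian matrix, and the update becomes x_next = x - J(x)^{-1} F(x), with the pseudo inverse used when J(x) is singular. A minimal sketch, with the function name `newton_system` and the two-equation test system chosen purely for illustration:

```python
import numpy as np

def newton_system(F, J, x0, tol=1e-12, max_iter=50):
    """Newton iteration for F(x) = 0 with Jacobian J, using the pseudo inverse."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        step = np.linalg.pinv(J(x)) @ F(x)   # equals J^{-1} F when J is invertible
        x = x - step
        if np.linalg.norm(step) < tol:
            break
    return x

# Hypothetical test system: x^2 + y^2 = 4 and x = y, with root (sqrt 2, sqrt 2).
F = lambda v: np.array([v[0] ** 2 + v[1] ** 2 - 4.0, v[0] - v[1]])
J = lambda v: np.array([[2.0 * v[0], 2.0 * v[1]], [1.0, -1.0]])
root = newton_system(F, J, x0=[1.0, 2.0])
print(root)
```

When the Jacobian is known to be invertible, `np.linalg.solve(J(x), F(x))` is usually preferable to forming the pseudo inverse.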
Newton method for FastICA
The Fast ICA algorithm (Hyvarinen)
Solve:
The derivative of F :
Note:
The Fast ICA algorithm (Hyvarinen)
Therefore,
The Jacobian matrix becomes diagonal, and can easily be inverted.
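The equations for this derivation were lost in extraction. The block below is a reconstruction following the usual presentation of FastICA, under the assumptions that z is whitened data (E[zz^T] = I), g is the chosen nonlinearity, λ is the Lagrange multiplier of the constraint ||w|| = 1, and the expectation of the product is approximated by the product of expectations. The slide's own notation may differ.

```latex
\begin{aligned}
F(w) &= E\bigl[z\,g(w^\top z)\bigr] - \lambda w = 0, \\
J_F(w) &= E\bigl[z z^\top g'(w^\top z)\bigr] - \lambda I
        \approx E[z z^\top]\,E\bigl[g'(w^\top z)\bigr] - \lambda I
        = \bigl(E[g'(w^\top z)] - \lambda\bigr) I, \\
w^{+} &= w - \frac{E[z\,g(w^\top z)] - \lambda w}{E[g'(w^\top z)] - \lambda}, \\
w^{+} &\propto E\bigl[z\,g(w^\top z)\bigr] - E\bigl[g'(w^\top z)\bigr]\,w,
\qquad w \leftarrow \frac{w^{+}}{\lVert w^{+} \rVert}.
\end{aligned}
```

The last line follows from multiplying the Newton step by the scalar λ - E[g'(w^T z)], which is harmless because w is renormalized after every step, so λ never has to be computed explicitly.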
Thanks for your Attention