independent component analysis applied to biophysical time ... · independent component analysis...
TRANSCRIPT
Independent component analysis applied to biophysical time series
and EEG
© Arnaud Delorme CERCO, CNRS, France & SCCN, UCSD, La Jolla, USA
A
B
ICAY=[A;B] Linear Combination
-0.33
1.57
1
-2
X=YW
A
B
Y=W-1X~~
ICA is a method to recover a version, of the original sources by multiplying the data by a unmixing matrix
⎥⎥⎥
⎦
⎤
⎢⎢⎢
⎣
⎡
−−−−
−− ...
3011
23
1015
45
2120
23
⎥⎥⎥
⎦
⎤
⎢⎢⎢
⎣
⎡
−
−
310421235
⎥⎥⎥
⎦
⎤
⎢⎢⎢
⎣
⎡
∗+∗−∗∗+∗−∗∗+∗−+∗∗−∗+∗−∗+∗−+∗−∗−∗+∗
402515402515...422)2(12412013
)2(23)2(52)2(13053
**
*
Weight matrix W
Data X
ICA activity U
U = WX
DataICA activity
Channel 1
Channel 2
Channel 3
Comp. 1
Comp. 2
Comp. 3
⎥⎥⎥
⎦
⎤
⎢⎢⎢
⎣
⎡
−−−−
−− ...
3011
23
1015
45
2120
23
⎥⎥⎥
⎦
⎤
⎢⎢⎢
⎣
⎡
−
−
310421235
⎥⎥⎥
⎦
⎤
⎢⎢⎢
⎣
⎡
∗+∗−∗∗+∗−∗∗+∗−+∗∗−∗+∗−∗+∗−+∗−∗−∗+∗
402515402515...422)2(12412013
)2(23)2(52)2(13053
**
*
Inverse weight matrix W-1
ICA activity U
Data X
X=W-1U
Data
Chan 1
Chan 2
Chan 3
Comp. 1
Comp. 2
Comp. 3
Historical Remarks
• Herault & Jutten ("Space or time adaptive signal processing by neural network models“, Neural Nets for Computing Meeting, Snowbird, Utah, 1986): Seminal paper, neural network
• Bell & Sejnowski (1995): Information Maximization• Amari et al. (1996): Natural Gradient Learning• Cardoso (1996): JADE
• Applications of ICA to biomedical signals– EEG/ERP analysis (Makeig, Bell, Jung & Sejnowski, 1996).
– fMRI analysis (McKeown et al. 1998)
Courtesy of TP Jung
ICA Theory – Cost Functions
Family of BSS algorithms• Information theory (Infomax)• Bayesian probability theory (Maximum likelihood estimation)• Negentropy maximization• Nonlinear PCA • Statistical signal processing (cumulant maximization, JADE)
A unifying Information-theoretic framework for ICA• Pearlmutter & Parra showed that InfoMax, ML estimation are
equivalent.• Lee et al. (1999) showed negentropy has the equivalent
property to InfoMax.• Girolami & Fyfe showed nonlinear PCA can be viewed from
information-theoretic principle. Courtesy of TP Jung
Central limit theorem
Brain source A
Brain source B
Scalp channels = linear mixture of A and B
(more gaussian)
binn
ing
time
Scalp channel 1
Scalp channel 2
ICA Training Process
• Remove the mean x = x - <x>
• ‘Sphere’ the data by diagonalizing its covariance matrix, x = <xxT>-1/2(x-<x>).
• Update W according to
Central limit theorem
YXS
Maximization of information transfer I(X,Y)
Non-linearity- Pick up higher moment in input distribution- Perform true redundancy reduction in the output- Perform independent source separation
From Bell & Sejnowski, Neural Computation, 1995
Entropy
Mutual Information
H(Y) is the entropy of the outputH(Y|X) is whatever entropy the output has
that did not come from the input
Entropy
Fake dice (make a 6 half of the time): entropy 2.16 (base 2)
Dice: 1/6
1/6
1 2 3 4 5 6
21 16 log 2.586 6
H ⎛ ⎞⎛ ⎞= − =⎜ ⎟⎜ ⎟⎝ ⎠⎝ ⎠
1/10
1 2 3 4 5 6
2 21 1 1 15 log log 2.16
10 10 2 2H ⎛ ⎞⎛ ⎞ ⎛ ⎞= − − =⎜ ⎟ ⎜ ⎟⎜ ⎟
⎝ ⎠ ⎝ ⎠⎝ ⎠
Contingency table for stress and emotionality
33
12
11191
STRE
87
4166342
84
11
1864
3
35129
203
4
1314621
5
81322
6
36Total
105234603
142223EMOT= 1
Total
From http://tecfa.unige.ch/~lemay/thesis/THX-Doctorat/node149.html
60.010.020.0150.010.020.030.0240.010.010.080.070.060.013
0.010.250.240.0420.020.07EMOT= 1
654321STRE
Joint entropy 3.46; exercise: compute mutual information
Contingency frequencies for stress and emotionality
Kurtosis, Super- and Sub-Gaussian
Kurtosis: a measure of how peaked or flat of a probability distribution is.
- 3
Gaussian Dist. Kurtosis = 0
Super-Gaussian: kurtosis > 0
Sub-Gaussian: kurtosis < 0
ICA/EEG Assumptions
• Mixing is linear at electrodes
• Propagation delays are negligible
• Component time courses are independent
• Number of components less than the number of channels.
OK
OK
~
Con
trib
utio
nto
EEG
Number of independent components
ICA limit
Characteristics of Independent Component of the EEG
• Concurrent Activity
• Maximally Temporally Independent
• Overlapping Maps and Spectra
• Dipolar Scalp Maps
• Functionally Independent
• Between-Subject Regularity
Independent component analysisTheoryExamples and localization
ICA reliabilityICA repetitionsDifferent ICA algorithmsData reduction
Outline
0 0 0 0 0 0 ...0 2 5 1 1 1 ...0 0 0 0 0 0 ......
⎡ ⎤⎢ ⎥− − − −⎢ ⎥⎢ ⎥⎢ ⎥⎣ ⎦
5 3 2 ...1 2 4 ...0 1 3 ......
−⎡ ⎤⎢ ⎥⎢ ⎥⎢ ⎥−⎢ ⎥⎣ ⎦
3 5 0 3 1 ( 2) 2 5 ( 2) 3 2 ( 2) ...3 1 0 2 1 4 2 1 ( 2) 2 2 4 ...5 1 5 2 0 4 5 1 5 2 0 4 ...
... ...
∗ + ∗ − ∗ − ∗ + − ∗ + ∗ −⎡ ⎤⎢ ⎥∗ + ∗ − ∗ ∗ + − ∗ + ∗⎢ ⎥⎢ ⎥∗ − ∗ + ∗ ∗ − ∗ + ∗⎢ ⎥⎣ ⎦
**
*
Inverse weight matrix W-1
ICA activity U
Data X
X=W-1U
Data
Chan 1Chan 2Chan 3
Comp. 1Comp. 2Comp. 3
Subject MRI MNI model (Loreta)Subject scanned
electrode positionsSubject components
scalp map
Front
EEG
LAB
EE
GLA
B &
FI
LED
TRIP
Normalization of subject MRI to MNI brain
LORETAEEGLAB
SPM2
Head mesh
BRAI
NVI
SAEE
GLA
BEE
GLA
B
Coregistration
* **
⎥⎥⎥
⎦
⎤
⎢⎢⎢
⎣
⎡...
4,33,3
4,23,2
4,13,1
2,31,3
2,21,2
2,11,1
uuuuuu
uuuuuu
⎥⎥⎥
⎦
⎤
⎢⎢⎢
⎣
⎡
−−−
−−−
−−−
13,3
12,3
11,3
13,2
12,2
11,2
13,1
12,1
11,1
wwwwwwwww
⎥⎥⎥
⎦
⎤
⎢⎢⎢
⎣
⎡
++++++++++++
−−−−−−
−−−−−−
−−−−−−
13,32,3
12,32,2
11,32,1
13,31,3
12,31,2
11,31,1
13,22,3
12,22,2
11,22,1
13,21,3
12,21,2
11,21,1
13,12,3
12,12,2
11,12,1
13,11,3
12,11,2
11,11,1
...wuwuwuwuwuwuwuwuwuwuwuwuwuwuwuwuwuwu
* **
Inverse weight matrix W-1
ICA activity U
Data X (EEG/MEG time series)
X=W-1UChan 1Chan 2Chan 3
Comp. 1Comp. 2Comp. 3
⎥⎥⎥
⎦
⎤
⎢⎢⎢
⎣
⎡...
4,33,3
4,23,2
4,13,1
2,31,3
2,21,2
2,11,1
uuuuuu
uuuuuu
⎥⎥⎥
⎦
⎤
⎢⎢⎢
⎣
⎡
−−−
−−−
−−−
13,3
12,3
11,3
13,2
12,2
11,2
13,1
12,1
11,1
wwwwwwwww
⎥⎥⎥
⎦
⎤
⎢⎢⎢
⎣
⎡
++++++++++++
−−−−−−
−−−−−−
−−−−−−
13,32,3
12,32,2
11,32,1
13,31,3
12,31,2
11,31,1
13,22,3
12,22,2
11,22,1
13,21,3
12,21,2
11,21,1
13,12,3
12,12,2
11,12,1
13,11,3
12,11,2
11,11,1
...wuwuwuwuwuwuwuwuwuwuwuwuwuwuwuwuwuwu
Inverse weight matrix W-1
ICA activity U
Data X (EEG/MEG voxel activities)
Time 1Time 2Time 3
Comp. 1Comp. 2Comp. 3
Temporal ICA
X=W-1U
Spatial ICA