Cédric Richard, Lagrange Lab., University Nice Sophia-Antipolis, France. Web: www.cedric-richard.fr
Geometric methods in NMF for hyperspectral data unmixing
Acknowledgments
The author acknowledges • Nicolas Gillis (University of Mons, BE) • Mathieu Fauvel (ENSAT, F)
for providing access to some pictures.
Hyperspectral data acquisition
1. source (active or passive) 2. electromagnetic radiation 3. interaction 4. sensor
5. transmission 6. processing 7. analysis
Reflectance: fraction of incident radiation that is reflected at an interface.
www.hsaj.org
Nontrivial problems: ❖ spectral mixture analysis ❖ detection/classification ❖ characterization ❖ fusion ❖ etc.
Information processing in remote sensing
ultraspectral (1000’s of bands)
hyperspectral (100’s of bands)
multispectral (10’s of bands)
panchromatic
www.higp.hawaii.edu www.corista.eu
Information processing in remote sensing
agriculture
forestry
urbanism
© Mathieu Fauvel, INPT ENSAT
❖ Some particularities: ❖ mixed pixels: due to insufficient spatial resolution and mixing effects ❖ sub-pixel targets: crucial in many hyperspectral applications
❖ Increasing the spatial resolution is not necessarily a solution: ❖ mixed pixels can still be observed at very high spatial resolutions ❖ intimate mixtures may take place regardless of the spatial resolution
Hyperspectral data unmixing
Macroscopic mixture: 10% grass, 80% soil, 10% tree
Intimate mixture: minerals intimately mixed
❖ The linear mixture model assumes that endmember substances are sitting side-by-side within the FOV.
❖ Nonlinear mixture models assume intimate mixture of endmember components, multiple scattering effects, etc.
Linear vs. nonlinear mixing model
Linear model (Keshava’02): $y = M\alpha + z$ s.t. $\alpha \succeq 0$, $1^\top \alpha = 1$
Nonlinear model: $y = \psi(M, \alpha) + z$
Blind spectral data unmixing
road
grass
Urban hyperspectral image with 162 spectral bands and 307-by-307 pixels
© Nicolas Gillis, University of Mons, Belgium
Blind spectral data unmixing with NMF
❖ Basis elements allow recovery of the endmember spectra: $M \succeq 0$
❖ Abundances of the endmembers in each pixel: $A \succeq 0$, $1^\top A = 1^\top$
© Nicolas Gillis, University of Mons, Belgium
[Figure: $Y \approx M A$, where $Y$ (wavelengths $\times$ pixels) is approximated by the product of $M$ (spectral signatures) and $A$ (abundances)]
Blind spectral data unmixing with NMF
© Nicolas Gillis, University of Mons, Belgium
Decomposition of the urban data set
$Y(\cdot, j) \approx \sum_k M(\cdot, k)\, A(k, j)$
Blind spectral data unmixing with NMF
❖ NMF problem: Given an $m \times n$ matrix $Y$ and a factorization rank $r$, determine:
❖ the $m \times r$ mixing matrix $M$ (endmember spectra)
❖ the $r \times n$ fractional abundance matrix $A$
such that
$$\min_{M, A} \|Y - MA\|_F^2 = \sum_{i,j} (Y - MA)_{ij}^2 \quad \text{s.t.} \quad M \succeq 0, \quad A \succeq 0, \quad 1^\top A = 1^\top$$
Blind source separation problem
Blind spectral data unmixing with NMF
❖ Can we only solve NMF problems?
❖ NMF is an NP-hard problem (Vavasis’09)
❖ NMF is ill-posed (Gillis’09)
❖ Under the pure-pixel assumption, the problem becomes tractable.
❖ Pure-pixel assumption, a.k.a. separability: There exist $M, A \succeq 0$ such that $Y = MA$, where each column of $M$ is a column of $Y$.
Pure-pixel assumption: columns of $M$ are the spectral signatures of the endmembers in the hyperspectral image $Y$.
www.trimble.com
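Geometric interpretation
Under the pure-pixel assumption, the columns of $M$ are the vertices of the convex hull of the columns of $Y$:
$$y_j = \sum_k a_{kj}\, m_k, \qquad a_{kj} \geq 0, \qquad \sum_k a_{kj} = 1 \qquad \forall\, k, j$$
[Figure: data simplex with vertices $m_1$, $m_2$, $m_3$, shown for noise-free data and in the presence of noise]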
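The convex-hull picture can be reproduced on synthetic data. The sketch below is only an illustration: the sizes, the random endmember matrix and the Dirichlet-distributed abundances are assumptions made for the example, not data from the slides. It builds a noise-free mixture in which the first $r$ pixels are pure, then checks the constraints of the model.

```python
import numpy as np

rng = np.random.default_rng(0)

m, r, n = 50, 3, 200                       # bands, endmembers, pixels (illustrative sizes)
M = rng.uniform(0.0, 1.0, size=(m, r))     # hypothetical endmember spectra (columns)
A = rng.dirichlet(np.ones(r), size=n).T    # r x n abundances, each column on the simplex
A[:, :r] = np.eye(r)                       # make the first r pixels pure pixels
Y = M @ A                                  # noise-free linear mixture: y_j = sum_k a_kj m_k

# Every pixel is a convex combination of the endmembers ...
print(A.min() >= 0.0, np.allclose(A.sum(axis=0), 1.0))
# ... and the pure pixels coincide with the vertices m_1, ..., m_r.
print(np.allclose(Y[:, :r], M))
```

Adding a noise term to `Y` moves the pixels off the simplex, which is the situation sketched in the noisy panel of the figure.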
Blind spectral unmixing frameworks
❖ Geometric methods exploit properties of convex hulls to address the linear unmixing problem.
❖ NMF-based methods optimize a regularized regression function.
❖ Statistical modeling methods address the unmixing problem as a statistical inference problem, essentially within a Bayesian framework.
www.newenergyconnections.com
Geometric methods
Endmember extraction: pure-pixel based algorithms
❖ Minimum volume simplex: Find the simplex of minimum volume enclosing all the data (Craig’90)
Determine $M = [m_1 \,\dots\, m_r]$ such that
$$\mathrm{volume}(M) = \frac{1}{(r-1)!} \left| \det \begin{pmatrix} 1^\top \\ M \end{pmatrix} \right|$$
is minimized and the simplex encloses all the data.
Data projection onto an $(r-1)$-dimensional space is required.
[Figure: simplex with vertices $m_1$, $m_2$, $m_3$ enclosing the data]
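The volume formula can be evaluated directly once the vertices live in an $(r-1)$-dimensional space. A minimal sketch, where the function name `simplex_volume` and the test triangle are illustrative choices:

```python
import math
import numpy as np

def simplex_volume(M):
    """Volume of the simplex whose r vertices are the columns of the (r-1) x r matrix M."""
    r = M.shape[1]
    if M.shape[0] != r - 1:
        raise ValueError("project the data onto an (r-1)-dimensional space first")
    augmented = np.vstack([np.ones((1, r)), M])        # the matrix [1^T; M]
    return abs(np.linalg.det(augmented)) / math.factorial(r - 1)

# Unit right triangle in the plane: vertices (0,0), (1,0), (0,1).
triangle = np.array([[0.0, 1.0, 0.0],
                     [0.0, 0.0, 1.0]])
triangle_area = simplex_volume(triangle)
print(triangle_area)
```

For the unit right triangle the determinant of the augmented matrix is 1, so the function returns an area of 1/2.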
Endmember extraction: pure-pixel based algorithms
❖ N-FINDR (Winter’99)
• randomly select $M = [m_{1_0} \,\dots\, m_{r_0}]$ in $Y$
• iteratively increase $\mathrm{volume}(M)$ by substituting $m_i$ by $m_j$ if $\mathrm{volume}(M) < \mathrm{volume}(M \setminus \{m_i\} \cup \{m_j\})$
❖ SGA (Chang et al.’06): Greedy counterpart of N-FINDR
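The substitution rule of N-FINDR can be sketched as a simple local search. This is a simplified illustration rather than Winter's reference implementation: the names `simplex_det` and `nfindr`, the sweep limit, the tolerance and the synthetic data are all assumptions made for the example, and the data are assumed to be already projected onto an $(r-1)$-dimensional space.

```python
import numpy as np

def simplex_det(vertices):
    """det([1^T; vertices]); its absolute value is (r-1)! times the simplex volume."""
    r = vertices.shape[1]
    return np.linalg.det(np.vstack([np.ones((1, r)), vertices]))

def nfindr(Y_reduced, r, max_sweeps=10, seed=0):
    """Return the sorted column indices of r pixels spanning a large-volume simplex."""
    rng = np.random.default_rng(seed)
    n = Y_reduced.shape[1]
    idx = rng.choice(n, size=r, replace=False)          # random initial vertices taken in Y
    best = abs(simplex_det(Y_reduced[:, idx]))
    for _ in range(max_sweeps):
        improved = False
        for i in range(r):                              # vertex m_i to be replaced
            for j in range(n):                          # candidate pixel m_j
                trial = idx.copy()
                trial[i] = j
                vol = abs(simplex_det(Y_reduced[:, trial]))
                if vol > best + 1e-12:                  # keep the swap only if the volume grows
                    idx, best, improved = trial, vol, True
        if not improved:
            break
    return np.sort(idx)

# Synthetic 2-D data inside a triangle; the first three pixels are the pure pixels.
rng = np.random.default_rng(0)
vertices = np.array([[0.0, 1.0, 0.0],
                     [0.0, 0.0, 1.0]])
A = rng.dirichlet(np.ones(3), size=100).T
A[:, :3] = np.eye(3)
Y_reduced = vertices @ A
endmember_idx = nfindr(Y_reduced, r=3)
print(endmember_idx)
```

On noise-free data that contain the pure pixels, the search ends on the pure-pixel indices, since no other triple of pixels spans a larger simplex.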
Endmember extraction: pure-pixel based algorithms
❖ Pixel purity index (PPI) (Boardman’93)
• project spectral vectors onto skewers
• the extreme points of each projection are stored
• points with the highest scores are endmembers
Remarks
• Parameters: number of skewers and cut-off threshold
• No estimation of the number of endmembers
[Figure: data projected onto skewer 1, skewer 2 and skewer 3]
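The PPI scoring step can be sketched in a few lines. This is a minimal illustration under stated assumptions: skewers are drawn as Gaussian random directions, the function name `pixel_purity_index` and the number of skewers are arbitrary, and the cut-off is replaced by simply keeping the top-scoring pixels.

```python
import numpy as np

def pixel_purity_index(Y, n_skewers=500, seed=0):
    """Count how often each pixel (column of Y) is an extreme point of a random projection."""
    rng = np.random.default_rng(seed)
    m, n = Y.shape
    skewers = rng.standard_normal((n_skewers, m))       # one random direction per row
    projections = skewers @ Y                           # n_skewers x n projected values
    scores = np.zeros(n, dtype=int)
    np.add.at(scores, projections.argmax(axis=1), 1)    # pixel at the upper end of each skewer
    np.add.at(scores, projections.argmin(axis=1), 1)    # pixel at the lower end of each skewer
    return scores

# Synthetic noise-free mixture whose first three pixels are pure (illustrative data).
rng = np.random.default_rng(0)
M = rng.uniform(size=(20, 3))
A = rng.dirichlet(np.ones(3), size=150).T
A[:, :3] = np.eye(3)
Y = M @ A
scores = pixel_purity_index(Y)
top_pixels = np.sort(np.argsort(scores)[-3:])
print(top_pixels)
```

On noise-free simplex data only the vertices can be extreme along a projection, so the mixed pixels keep a zero score; with noise, the cut-off threshold mentioned in the remarks becomes necessary.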
Endmember extraction: pure-pixel based algorithms
❖ Orthogonal subspace projection (OSP) (Harsanyi&Chang’94)
For $i = 1 : r$
• find $j^* = \arg\max_j \|y_j\|$
• $M = [M \;\, y_{j^*}]$
• $Y \leftarrow (I - uu^\top)\, Y$ with $u = y_{j^*} / \|y_{j^*}\|_2$
Remarks
• Convergence analysis in (Gillis and Vavasis’14)
• Extremely fast
• No parameter
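The loop above translates almost line by line into NumPy. A minimal sketch, where the function name `successive_projection` and the synthetic data are illustrative assumptions:

```python
import numpy as np

def successive_projection(Y, r):
    """Select r columns of Y by repeated max-norm selection and orthogonal projection."""
    R = np.array(Y, dtype=float)                 # working copy (residual data)
    selected = []
    for _ in range(r):
        j_star = int(np.argmax(np.linalg.norm(R, axis=0)))   # column with the largest norm
        selected.append(j_star)
        u = R[:, j_star] / np.linalg.norm(R[:, j_star])
        R = R - np.outer(u, u @ R)               # Y <- (I - u u^T) Y, without forming I - u u^T
    return selected

# Synthetic noise-free mixture whose first three pixels are pure (illustrative data).
rng = np.random.default_rng(0)
M = rng.uniform(size=(20, 3))
A = rng.dirichlet(np.ones(3), size=150).T
A[:, :3] = np.eye(3)
Y = M @ A
selected = successive_projection(Y, r=3)
print(sorted(selected))
```

Each pass costs one norm computation and one rank-one update over the data, which is why the method is fast and has no parameter other than $r$.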
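Abundance estimation
❖ Noise-free data with N-FINDR, OSP, … (Honeine, Richard’11)
Let $a_{ji}$ be the “abundance” of $m_j$ in $m_i$. By Cramer’s rule:
$$a_{ji} = \frac{\mathrm{volume}(M)}{\mathrm{volume}(M \setminus \{m_i\} \cup \{m_j\})} = \frac{\delta_i}{\delta_j}$$
• no extra calculation
• volume = algebraic volume
[Figure: simplex with vertices $m_1$, $m_2$, $m_3$; labels $m_i$, $m_j$, $\delta_i$, $\delta_j$]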
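The link between Cramer's rule and volume ratios can be checked on a generic point. The sketch below is an illustration on an arbitrary point $y$ rather than the vertex-swap ratio written above: it solves the augmented system $[1^\top; M]\, a = [1; y]$ by ratios of signed (algebraic) determinants. The function names and the test triangle are assumptions made for the example.

```python
import numpy as np

def signed_det(vertices):
    """Signed determinant of [1^T; vertices], proportional to the algebraic volume."""
    r = vertices.shape[1]
    return np.linalg.det(np.vstack([np.ones((1, r)), vertices]))

def abundances_by_volume_ratio(M, y):
    """Barycentric coordinates of y with respect to the columns of the (r-1) x r matrix M."""
    base = signed_det(M)
    a = np.empty(M.shape[1])
    for k in range(M.shape[1]):
        swapped = M.copy()
        swapped[:, k] = y                      # replace vertex m_k by the point y
        a[k] = signed_det(swapped) / base      # Cramer's rule: ratio of signed volumes
    return a

triangle = np.array([[0.0, 1.0, 0.0],
                     [0.0, 0.0, 1.0]])
a_true = np.array([0.2, 0.5, 0.3])
y = triangle @ a_true
a_est = abundances_by_volume_ratio(triangle, y)
print(a_est)
```

Because the volumes are signed, a point outside the simplex yields at least one negative coordinate, which is why the algebraic volume matters here.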
Application
1: alunite, 2: kaolinite, 3: sphene
Cuprite mining district (Nevada) with AVIRIS spectrometer
Application
Data in 2D space given by PCA
Cuprite mining district (Nevada) with AVIRIS spectrometer
[Figure: data in the 2D PCA space with vertices $m_1$, $m_2$, $m_3$ and outliers marked]
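Abundance estimation
❖ Noisy data: a standard QP problem
$$a^* = \arg\min_a \frac{1}{2} \|y - Ma\|^2 \quad \text{subject to } a \succeq 0 \text{ and } 1^\top a = 1$$
[Figure: solutions $a^*_{\mathrm{ls}}$, $a^*_{\mathrm{nneg}}$ and $a^*_{\text{sto-nneg}}$ in the $(a_1, a_2)$ plane; simplex with vertices $m_1$, $m_2$, $m_3$]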
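One way to solve this QP is a projected-gradient iteration, one of the standard solver families for such problems. The sketch below is a minimal illustration, not a tuned solver: the function names, the fixed step size $1/L$ and the iteration count are assumptions, and the simplex projection is the usual sort-based routine.

```python
import numpy as np

def project_onto_simplex(v):
    """Euclidean projection of v onto {a : a >= 0, sum(a) = 1} (sort-based method)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    tau = css[rho] / (rho + 1)
    return np.maximum(v - tau, 0.0)

def constrained_ls_projected_gradient(M, y, n_iter=2000):
    """Minimize 0.5 * ||y - M a||^2 subject to a >= 0 and sum(a) = 1."""
    G, b = M.T @ M, M.T @ y
    step = 1.0 / np.linalg.norm(G, 2)              # 1 / Lipschitz constant of the gradient
    a = np.full(M.shape[1], 1.0 / M.shape[1])      # start at the centre of the simplex
    for _ in range(n_iter):
        a = project_onto_simplex(a - step * (G @ a - b))
    return a

rng = np.random.default_rng(0)
M = rng.uniform(size=(20, 3))                      # hypothetical endmember spectra
a_true = np.array([0.6, 0.3, 0.1])
y = M @ a_true                                     # noise-free pixel, for checking only
a_hat = constrained_ls_projected_gradient(M, y)
print(a_hat)
```

With noisy pixels the same iteration applies unchanged; only the recovered abundances differ from the ground truth.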
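Abundance estimation
❖ Noisy data: a standard QP problem
❖ historical FCLS:
$$a^* = \arg\min_a \frac{1}{2} \|y' - M'a\|^2 \quad \text{subject to } a \succeq 0, \qquad M' = \begin{pmatrix} M \\ 1^\top \end{pmatrix} = \begin{pmatrix} m_1 & \dots & m_r \\ 1 & \dots & 1 \end{pmatrix}, \qquad y' = \begin{pmatrix} y \\ 1 \end{pmatrix}$$
❖ Prefer standard solvers:
• active set methods
• projected-gradient methods
• interior point methods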
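The augmented formulation turns the sum-to-one constraint into one extra row of the regression, so that only non-negativity remains as a hard constraint. A minimal sketch, with two assumptions that are not on the slide: the `weight` applied to the extra row (set to 1 to match $M'$ and $y'$ above) and the plain projected-gradient loop used as the non-negative least-squares solver.

```python
import numpy as np

def fcls_augmented(M, y, weight=1.0, n_iter=5000):
    """Non-negative least squares on the augmented system [M; w 1^T] a ~ [y; w]."""
    r = M.shape[1]
    M_aug = np.vstack([M, weight * np.ones((1, r))])   # M' = [M; 1^T] when weight = 1
    y_aug = np.append(y, weight)                       # y' = [y; 1] when weight = 1
    G, b = M_aug.T @ M_aug, M_aug.T @ y_aug
    step = 1.0 / np.linalg.norm(G, 2)
    a = np.full(r, 1.0 / r)
    for _ in range(n_iter):
        a = np.maximum(a - step * (G @ a - b), 0.0)    # gradient step, then clip to a >= 0
    return a

rng = np.random.default_rng(0)
M = rng.uniform(size=(20, 3))                          # hypothetical endmember spectra
a_true = np.array([0.7, 0.3, 0.0])
y = M @ a_true                                         # noise-free pixel, for checking only
a_fcls = fcls_augmented(M, y)
print(a_fcls, a_fcls.sum())
```

With noisy data the sum-to-one constraint is only satisfied approximately in this formulation, which is one reason to prefer a solver that handles the equality constraint exactly.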
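NMF-based algorithms
❖ Minimum volume constrained NMF
$$(M, A) = \arg\min_{M, A} \frac{1}{2} \|Y - MA\|_F^2 + \lambda_1 \Phi_1(M) + \lambda_2 \Phi_2(A) \quad \text{s.t.} \quad M \succeq 0, \quad A \succeq 0, \quad 1^\top A = 1^\top$$
where $\Phi_1$ is the volume regularization and $\Phi_2$ the abundance regularization.
❖ Literature
• ICE (Berman et al.’04): $\Phi_1(M) \equiv$ quadratic, $\Phi_2(A) = 0$
• MVC-NMF (Miao’07): $\Phi_1(M) = |\det(MM^\top)|$, $\Phi_2(A) = 0$
• SPICE (Zare and Gader’07): $\Phi_1(M) \equiv$ quadratic, $\Phi_2(A) \equiv$ weighted $\ell_1$
• L1/2-NMF (Qian et al.’11): $\Phi_1(M) = 0$, $\Phi_2(A) = \sum_{ij} |a_{ij}|^{1/2}$
• CoNMF (Li et al.’12): $\Phi_1(M) \equiv$ quadratic, $\Phi_2(A) = \|A\|_{2,1}$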
Separable LMM
$M$: noise-free endmembers, $Y$: noisy observations, $Y - E$: noise-free observations
❖ Consider that endmembers are known and noise-free. The LMM is given by $Y = MA + E$
❖ In the presence of noisy endmembers in the scene, the LMM is (only) approximated by: $Y \approx YX + E$
❖ In the presence of noisy endmembers in the scene, the LMM is exactly given by $Y = (Y - E)X + E$
Approximate model
Consider the model: $Y \approx YX + E$, where $Y$ are noisy observations.
[Figure: numerical illustration of $Y \approx YX + E$ with $Y$ of size $L \times N$ and $X$ of size $N \times N$; only the rows of $X$ associated with the noisy endmembers (abundance rows $a_1$, $a_2$, $a_3$) are non-zero, all other rows are zero]
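The row-sparse structure of $X$ can be made concrete on noise-free data. In the sketch below the sizes, the positions of the pure pixels and the random spectra are illustrative assumptions; the point is only that copying the abundances into the rows of $X$ indexed by the pure pixels reproduces the data exactly.

```python
import numpy as np

rng = np.random.default_rng(1)
m, r, n = 30, 3, 12                        # bands, endmembers, pixels (illustrative sizes)
M = rng.uniform(size=(m, r))               # hypothetical endmember spectra
A = rng.dirichlet(np.ones(r), size=n).T    # r x n abundances
pure = [0, 4, 7]                           # hypothetical positions of the pure pixels
A[:, pure] = np.eye(r)
Y0 = M @ A                                 # noise-free observations (Y - E on the slides)

X = np.zeros((n, n))
X[pure, :] = A                             # rows of X at the pure pixels hold the abundances

nonzero_rows = np.flatnonzero(np.abs(X).sum(axis=1) > 0)
print(np.allclose(Y0, Y0 @ X))             # the data are reproduced from their own columns
print(nonzero_rows)                        # only the endmember rows are non-zero
```

The number of non-zero rows of $X$ equals the number of endmembers, which is what motivates a row-sparsity penalty on $X$.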
Group-lasso NMF
$$\min_X \frac{1}{2} \|Y - YX\|_F^2 + \mu \sum_{k=1}^{n} \|X(k, \cdot)\|_2 \quad \text{subject to } X \succeq 0, \quad 1^\top X = 1^\top$$
GLUP: Group-Lasso with Unit sum and Positivity constraints
The GLUP optimization problem ensures that
• $YX$ matches $Y$
• $X$ has only a few non-zero rows
• the positivity and sum-to-one constraints are enforced on $X$
Alternating Direction Method of Multipliers (ADMM)
❖ The ADMM solves problems of the form
$$\min_{X, Z} f(X) + g(Z) \quad \text{subject to } AX + BZ = C$$
❖ The augmented Lagrangian is given by
$$L_\rho(X, Z, \Lambda) = f(X) + g(Z) + \mathrm{trace}\big(\Lambda^\top (AX + BZ - C)\big) + \frac{\rho}{2} \|AX + BZ - C\|_F^2$$
❖ The ADMM consists of iterating the following steps
1. $X_{k+1} = \arg\min_X L_\rho(X, Z_k, \Lambda_k)$
2. $Z_{k+1} = \arg\min_Z L_\rho(X_{k+1}, Z, \Lambda_k)$
3. $\Lambda_{k+1} = \Lambda_k + \rho\,(AX_{k+1} + BZ_{k+1} - C)$
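GLUP with ADMM
❖ In order to apply the ADMM, we consider the canonical form
$$\min_{X, Z} \frac{1}{2} \|Y - YX\|_F^2 + \mu \sum_{k=1}^{n} \|Z(k, \cdot)\|_2 + \mathcal{I}(Z) \quad \text{subject to } \begin{pmatrix} I \\ 1^\top \end{pmatrix} X + \begin{pmatrix} -I \\ 0^\top \end{pmatrix} Z = \begin{pmatrix} 0 \\ 1^\top \end{pmatrix}$$
where $\mathcal{I}(Z)$ is the indicator function enforcing $Z \succeq 0$, the first block row of the constraint is the consensus $X = Z$, and the second block row is the sum-to-one constraint.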
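The three ADMM steps can be written out for this canonical form. The sketch below is an illustration under stated assumptions, not the author's implementation: it uses the scaled form of the multipliers, a fixed penalty `rho`, a fixed number of iterations, and hypothetical names (`glup_admm`, `U`, `u`). The $X$-step is a linear solve, the $Z$-step is a row-wise soft-threshold applied to the positive part, and the multiplier step accumulates the constraint residuals.

```python
import numpy as np

def glup_admm(Y, mu, rho=1.0, n_iter=2000):
    """ADMM sketch for: min 0.5||Y - YX||_F^2 + mu * sum_k ||Z(k,:)||_2, Z >= 0, X = Z, 1^T X = 1^T."""
    n = Y.shape[1]
    G = Y.T @ Y
    ones = np.ones((n, 1))
    # The X-step solves (G + rho * (I + 1 1^T)) X = G + rho * (Z - U) + rho * 1 (1^T - u).
    K = np.linalg.inv(G + rho * (np.eye(n) + ones @ ones.T))
    Z = np.zeros((n, n))
    U = np.zeros((n, n))              # scaled multiplier of the consensus constraint X = Z
    u = np.zeros((1, n))              # scaled multiplier of the sum-to-one constraint
    for _ in range(n_iter):
        X = K @ (G + rho * (Z - U) + rho * ones @ (1.0 - u))
        V = np.maximum(X + U, 0.0)    # positive part handles the indicator of Z >= 0
        norms = np.linalg.norm(V, axis=1, keepdims=True)
        shrink = np.maximum(1.0 - (mu / rho) / np.maximum(norms, 1e-12), 0.0)
        Z = shrink * V                # row-wise (group) soft-thresholding
        U += X - Z                    # multiplier updates: accumulate the residuals
        u += ones.T @ X - 1.0
    return Z

# Synthetic noise-free mixture whose first three pixels are pure (illustrative data).
rng = np.random.default_rng(0)
M = rng.uniform(size=(20, 3))
A = rng.dirichlet(np.ones(3), size=30).T
A[:, :3] = np.eye(3)
Y = M @ A
X_hat = glup_admm(Y, mu=0.1)
row_norms = np.linalg.norm(X_hat, axis=1)
print(np.argsort(row_norms)[::-1][:5])    # pixels with the largest row norms
```

The rows of the returned matrix with the largest norms point to the pixels selected as endmember candidates; `mu` and `rho` would need tuning on real data.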
NGLUP: Reduced noise GLUP
Exact model: $Y = (Y - E)X + E \;\rightarrow\; Y = YX + E(I - X)$, with $Y$ the noisy observations.
❖ The model is heteroscedastic: the noise variance depends on $X$
❖ The Maximum Likelihood estimate with Group-lasso yields:
$$\min_X \frac{m}{2} \log \left| \sigma^2 C(X) \right| + \frac{1}{2} \|Y - YX\|^2_{(\sigma^2 C(X))^{-1}} + \mu \sum_{k=1}^{n} \|X(k, \cdot)\|_2 \quad \text{subject to } X \succeq 0, \quad 1^\top X = 1^\top$$
where $C(X) = (I - X)^\top (I - X)$
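The weighted norm in this criterion can be traced back to the noise term $E(I - X)$ of the exact model. A short derivation, assuming (this is not stated on the slide) that the entries of $E$ are i.i.d. zero-mean Gaussian with variance $\sigma^2$, and writing $e_i^\top$ for the $i$-th row of $E$ and $v_i^\top$ for the $i$-th row of the effective noise:

```latex
\begin{align*}
v_i^\top &= e_i^\top (I - X), \qquad
\operatorname{cov}(v_i) = (I - X)^\top \left(\sigma^2 I\right) (I - X) = \sigma^2 C(X), \\
-\log p(Y \mid X) &= \frac{m}{2} \log \left| \sigma^2 C(X) \right|
  + \frac{1}{2} \sum_{i=1}^{m} v_i^\top \left(\sigma^2 C(X)\right)^{-1} v_i + \text{const} \\
&= \frac{m}{2} \log \left| \sigma^2 C(X) \right|
  + \frac{1}{2} \operatorname{trace}\!\left( (Y - YX) \left(\sigma^2 C(X)\right)^{-1} (Y - YX)^\top \right) + \text{const}.
\end{align*}
```

The trace term is the squared norm of $Y - YX$ weighted by $(\sigma^2 C(X))^{-1}$; adding the group-lasso penalty gives the criterion above.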
NGLUP iterative solution
❖ The noise variance $\sigma^2$ that minimizes the loss function for fixed $X$ is
$$\sigma^2(X) = \frac{1}{nm} \mathrm{trace}\big( (Y - YX)\, C(X)^{-1} (Y - YX)^\top \big)$$
with $Y$ the noisy observations.
❖ At iteration $k + 1$, as in Iteratively Reweighted Least Squares (IRLS):
• Inject the weight estimate: $W_k = (\sigma^2(X_k)\, C(X_k))^{-1}$
• Solve the optimization problem for fixed weight (same ADMM steps):
$$\min_X \frac{1}{2} \|Y - YX\|^2_{W_k} + \mu \sum_{k=1}^{n} \|X(k, \cdot)\|_2 \quad \text{subject to } X \succeq 0, \quad 1^\top X = 1^\top$$
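The closed-form variance follows from a one-line stationarity condition. As a sketch, write $T(X) = \mathrm{trace}\big((Y - YX)\, C(X)^{-1} (Y - YX)^\top\big)$ (a shorthand introduced here) and use the fact that $C(X)$ is $n \times n$, so that $|\sigma^2 C(X)| = \sigma^{2n} |C(X)|$:

```latex
\begin{align*}
J(\sigma^2) &= \frac{mn}{2} \log \sigma^2 + \frac{m}{2} \log |C(X)| + \frac{1}{2\sigma^2}\, T(X), \\
\frac{\partial J}{\partial \sigma^2} &= \frac{mn}{2\sigma^2} - \frac{T(X)}{2\sigma^4} = 0
\quad \Longrightarrow \quad \sigma^2(X) = \frac{T(X)}{nm}.
\end{align*}
```

The group-lasso term does not depend on $\sigma^2$, so it plays no role in this step.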
GLUP experiments: synthetic data
Synthetic data set: • 200 pixels • 8 endmembers • SNR: 40 dB
[Figure: grayscale image of the estimated abundance matrix $X$, SNR = 40 dB; axes: column number and row number (0 to 200); gray scale from 0 to 0.4]
Percentage of identified endmembers (100 realizations):
Algorithm   SNR 40 dB            SNR 20 dB
SDSOMP      100 % (0.023 sec)    72.37 % (0.023 sec)
NFINDR      100 % (0.069 sec)    89.75 % (0.068 sec)
GLUP        100 % (1.490 sec)    94.12 % (3.737 sec)
GLUP experiments: real data
[Figure: abundance maps for water, roof tops 1, roof tops 2, meadow, tree and shadow, estimated by SDSOMP (0.1103) and N-FINDR (0.0290)]
GLUP experiments: real data
[Figure: abundance maps for water, roof tops 1, roof tops 2, meadow, tree and shadow, estimated by GLUP (0.0198) and N-FINDR (0.0290)]
NGLUP experiments: synthetic data
[Figure: mean value of each row of the abundance matrix $X$ versus row number, obtained with 100 pixels and SNR = 20 dB: GLUP (left), NGLUP (right)]
Remark: NGLUP was initialized by the GLUP solution
NGLUP experiments: real data
[Figure: abundance maps determined by NGLUP for the Pavia University data set: shadow, roof metal, meadow, tree]
Algorithm   RMSE     max angle (rad)   avg angle (rad)
N-FINDR     0.0641   1.0592            0.1549
NGLUP       0.0287   0.7468            0.074
Concluding remarks