
DIMENSION REDUCTION FOR HYPERSPECTRAL IMAGING: FIRST SEMESTER PROGRESS REPORT

Yiran Li, AMSC 663

Advisors: John Benedetto, Wojtek Czaja, Department of Mathematics


Background

Light is described in terms of its wavelength

A reflectance spectrum shows the reflectance of a material measured across a range of wavelengths; it can identify certain materials uniquely

Hyperspectral images are three dimensional (x-coordinate, y-coordinate, spectrum)

Each pixel has a different spectrum that may represent different materials

An image sometimes has over 100 spectral bands and a large number of pixels


Spectrum and hyperspectral imagery

Left: Reflectance spectra measured by laboratory spectrometers for three materials: a green bay laurel leaf, the mineral talc, and a silty loam soil.

Right: The concept of hyperspectral imagery. (Shippert, 2003)


Project Goal

Reduce dimensionality of hyperspectral imaging data

Because hyperspectral imaging data contains a large amount of (possibly redundant) information, we want to reduce the dimensionality (and thus the size) of the data while preserving key features of the original data

Two algorithms, Laplacian eigenmaps and randomized Principal Component Analysis, are tested and compared


Laplacian eigenmaps: the idea

We view each pixel of the hyperspectral image as a node of a graph G, and the distance between two nodes is the Euclidean distance between their spectra

Distance $= \|x_i - x_j\|$

Because the spectrum is long (usually hundreds of bands), we want to map the graph G to a lower dimensional space so that the size of the data is reduced and connected points stay as close together as possible. Let $y = (y_1, y_2, \dots, y_n)^T$ be such a map. Our goal is to minimize

$$\sum_{i,j} (y_i - y_j)^2 W_{ij}$$

where

$$W_{ij} = e^{-\|x_i - x_j\|^2 / t}$$

is the weight on each edge (large if points are close), and t is the heat kernel's time parameter. In my code, I chose t = 10000, due to large distances between nodes.
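As a rough illustration of the weight formula, the Python sketch below computes the Euclidean distance between two spectra and the corresponding heat kernel weight. It is not the MATLAB code used for this project: the function names and the three-band toy spectra are assumptions made for the example, and only t = 10000 is taken from the slide.

```python
import math

def spectral_distance(x_i, x_j):
    """Euclidean distance between two spectra of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x_i, x_j)))

def heat_kernel_weight(x_i, x_j, t=10000.0):
    """W_ij = exp(-||x_i - x_j||^2 / t); near 1 for similar spectra."""
    d = spectral_distance(x_i, x_j)
    return math.exp(-d * d / t)

# Toy three-band spectra (hypothetical values, not from the data sets).
x_a = [100.0, 200.0, 300.0]
x_b = [110.0, 190.0, 300.0]   # close to x_a
x_c = [400.0, 500.0, 900.0]   # far from x_a

w_near = heat_kernel_weight(x_a, x_b)   # squared distance 200
w_far = heat_kernel_weight(x_a, x_c)    # squared distance 540000
print(w_near, w_far)
```

With a much smaller t, even the nearby pair would receive a weight close to zero, which is consistent with the reason given for choosing a large t.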


Laplacian eigenmaps: the idea

Since

$$\sum_{i,j} (y_i - y_j)^2 W_{ij} = 2\, y^T L y,$$

where

$$D_{ii} = \sum_j W_{ji}$$

and

$$L = D - W,$$

the problem of finding $\arg\min_y y^T L y$ given that $y^T D y = 1$ and $y^T D \mathbf{1} = 0$ becomes the minimum eigenvalue problem:

$$L f = \lambda D f$$

(Belkin, Niyogi, 2002)
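The identity between the weighted sum and the quadratic form follows from expanding the square and using the symmetry of W (so that the row sums and column sums of W agree). A short derivation with the same symbols:

```latex
\begin{aligned}
\sum_{i,j} (y_i - y_j)^2 W_{ij}
  &= \sum_{i,j} \left( y_i^2 + y_j^2 - 2 y_i y_j \right) W_{ij} \\
  &= \sum_i y_i^2 D_{ii} + \sum_j y_j^2 D_{jj} - 2 \sum_{i,j} y_i y_j W_{ij} \\
  &= 2\, y^T D y - 2\, y^T W y \\
  &= 2\, y^T (D - W)\, y \;=\; 2\, y^T L y .
\end{aligned}
```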


Laplacian eigenmaps: the algorithm

Step 1: Constructing the Adjacency Graph

Construct a weighted graph with n nodes (n number of data points), and a set of edges connecting neighboring points.

In our context, each node represents one pixel of the image, with position given by its spectrum of length l (an l-dimensional vector).

Two nodes are connected if

$$\|x_i - x_j\|^2 < \varepsilon$$

In my code, I chose $\varepsilon$ to be 1/5 of the maximum distance between nodes.
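A minimal sketch of the neighborhood-graph step in Python, under stated assumptions. The slide writes the test on the squared distance but defines the threshold from the (unsquared) maximum distance; this sketch adopts one reading, comparing the plain Euclidean distance with a threshold equal to 1/5 of the maximum pairwise distance. The function names and toy points are hypothetical, and this is not the project's MATLAB code.

```python
import math
from itertools import combinations

def distance(x_i, x_j):
    """Euclidean distance between two spectra."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x_i, x_j)))

def epsilon_graph(spectra, fraction=0.2):
    """Connect nodes i < j whose distance is below fraction * max distance.

    Assumption: the threshold is applied to the unsquared distance.
    Returns the edge set and the threshold that was used.
    """
    pairs = list(combinations(range(len(spectra)), 2))
    dists = {(i, j): distance(spectra[i], spectra[j]) for i, j in pairs}
    epsilon = fraction * max(dists.values())
    edges = {pair for pair, d in dists.items() if d < epsilon}
    return edges, epsilon

# Toy two-band "spectra": three nearby points and one outlier.
spectra = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [10.0, 10.0]]
edges, epsilon = epsilon_graph(spectra)
print(sorted(edges), epsilon)
```

Computing all pairwise distances costs on the order of n² operations, which may account for a large share of the running time on images with thousands of pixels.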

Laplacian eigenmaps: the algorithm

Step 2: Choosing the weights using the heat kernel:

$$W_{ij} = e^{-\|x_i - x_j\|^2 / t}$$

Step 3: Compute eigenvalues and eigenvectors for the generalized eigenvector problem:

$$L f = \lambda D f \qquad (1)$$

where $W$ is the weight matrix defined earlier, $D$ is the diagonal weight matrix, and $L = D - W$.
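To make the objects in Step 3 concrete, the Python sketch below builds D and L from a small weight matrix and checks by hand-picked vectors that L f = λ D f holds. The 3-node path graph with unit weights and all function names are assumptions chosen for illustration; a real image would need a numerical eigensolver rather than known eigenpairs.

```python
def degree_and_laplacian(W):
    """Return (D, L) with D_ii = sum_j W_ji and L = D - W."""
    n = len(W)
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        D[i][i] = sum(W[j][i] for j in range(n))
    L = [[D[i][j] - W[i][j] for j in range(n)] for i in range(n)]
    return D, L

def matvec(A, v):
    """Matrix-vector product for nested lists."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def is_generalized_eigenpair(L, D, lam, f, tol=1e-12):
    """Check L f = lam * D f componentwise."""
    return all(abs(a - lam * b) <= tol
               for a, b in zip(matvec(L, f), matvec(D, f)))

# Path graph 0 - 1 - 2 with unit weights (toy example).
W = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 1.0],
     [0.0, 1.0, 0.0]]
D, L = degree_and_laplacian(W)

# Candidate eigenpairs (lambda, f), ordered by eigenvalue.
pairs = [(0.0, [1.0, 1.0, 1.0]),
         (1.0, [1.0, 0.0, -1.0]),
         (2.0, [1.0, -1.0, 1.0])]
checks = [is_generalized_eigenpair(L, D, lam, f) for lam, f in pairs]

# The quadratic form identity: sum_ij (y_i - y_j)^2 W_ij = 2 y^T L y.
y = [1.0, 0.0, -1.0]
lhs = sum((y[i] - y[j]) ** 2 * W[i][j] for i in range(3) for j in range(3))
rhs = 2.0 * sum(a * b for a, b in zip(y, matvec(L, y)))
print(checks, lhs, rhs)
```

The eigenvalue 0 belongs to the constant vector, which carries no information about the graph; this is why that eigenvector is discarded when forming the embedding.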

Laplacian eigenmaps: the algorithm

Result:

Let $f_0, f_1, \dots, f_{n-1}$ be the solutions of equation (1), ordered such that

$$0 = \lambda_0 \le \lambda_1 \le \dots \le \lambda_{n-1}$$

Then the first m eigenvectors (excluding $f_0$),

$$\{f_1, f_2, \dots, f_m\}$$

are the desired vectors for embedding in m-dimensional Euclidean space

(Belkin, Niyogi, 2002)
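A small sketch of how the embedding is assembled from the eigenvectors, assuming the eigenpairs have already been computed. Node i is sent to the point whose coordinates are the i-th entries of the chosen eigenvectors. The function name `embed` is hypothetical, and the toy eigenpairs are those of a 3-node path graph with unit weights.

```python
def embed(eigenpairs, m):
    """Map each node to m coordinates taken from f_1, ..., f_m.

    eigenpairs: list of (eigenvalue, eigenvector) in any order.
    The eigenvector of the smallest eigenvalue (f_0) is skipped.
    Assumes 1 <= m <= number of eigenpairs - 1.
    """
    ordered = sorted(eigenpairs, key=lambda pair: pair[0])
    chosen = [f for _, f in ordered[1:m + 1]]
    n = len(chosen[0])
    return [[f[i] for f in chosen] for i in range(n)]

# Eigenpairs of L f = lambda D f for the path graph 0 - 1 - 2 (toy data).
eigenpairs = [(2.0, [1.0, -1.0, 1.0]),
              (0.0, [1.0, 1.0, 1.0]),
              (1.0, [1.0, 0.0, -1.0])]

coords = embed(eigenpairs, m=2)
for node, point in enumerate(coords):
    print(node, point)
```

For a hyperspectral image, each pixel's spectrum of length l would be replaced in the same way by a vector of length m.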


Discussion: dimension reduction

Each eigenvector in $\{f_1, f_2, \dots, f_m\}$ maps the image onto a one-dimensional space.

We pick the m eigenvectors corresponding to the m smallest eigenvalues after $\lambda_0$, so that the minimization goal is met as closely as possible.

The resulting image, viewed as a graph (nodes are the pixels, each located by its m-component vector), lies in m-dimensional space

Key structure of the graph is preserved

Discussion: dimension reduction

Visualization of dimension reduction (Shen-En Qian)

Implementation

Software: MATLAB

Hardware: personal computer

Databases:

• Salinas A scene (SalinasA), 1.5 MB: 86*83 pixels, a subscene of Salinas, which was collected by the 224-band AVIRIS sensor over Salinas Valley, California. Contains 6 classes.

• Indian Pines, 6.0 MB: 145*145 pixels, gathered by the 224-band AVIRIS sensor over the Indian Pines test site in north-western Indiana. Contains 16 classes.

(Hyperspectral Remote Sensing Scenes)


Achievements of the semester

Implementation of Laplacian eigenmaps

Understanding of the math behind Laplacian eigenmaps

Code validation

Results compared with groundtruth images (verification)

Results with error percentage computed (verification)


Code validation

Compared results directly with publicly available Laplacian eigenmaps code (from the DR toolbox from Delft University)

Ran my code and the code from the Delft University toolbox on the same data sets (pseudo)

Compared the results directly by looking at the eigenvalues and eigenvectors generated by each algorithm


Verification: ground truth image

The ground truth image is the classification of the hyperspectral image's pixels based on the real objects in the scene

Left: Indian Pines hyperspectral image

Right: groundtruth image (Hyperspectral Remote Sensing Scenes)

Verification: ground truth image

Groundtruth classes for the Indian Pines scene and the number of samples in each

#    Class                          Samples
1    Alfalfa                        46
2    Corn-notill                    1428
3    Corn-mintill                   830
4    Corn                           237
5    Grass-pasture                  483
6    Grass-trees                    730
7    Grass-pasture-mowed            28
8    Hay-windrowed                  478
9    Oats                           20
10   Soybean-notill                 972
11   Soybean-mintill                2455
12   Soybean-clean                  593
13   Wheat                          205
14   Woods                          1265
15   Buildings-Grass-Trees-Drives   386
16   Stone-Steel-Towers             93

Verification: classification of pixels

In order to verify that the image of reduced dimension preserves key structure of the original data, we classify the vectors produced by the algorithm based on ground truth data

In each category of the ground truth image, pick one pixel as a representative (training data)

In the vector space of reduced dimension, identify the vectors at the location of the training data

Verification: 1NN-classifier

Find the k nearest vectors of a training vector, nearest in terms of distance, and classify them in the same category as the training vector

Example of a KNN classifier (Wikipedia)
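For reference, a minimal Python sketch of the usual 1-NN rule with one training vector per class: every reduced-dimension vector receives the class of its closest training vector. This is an illustration under assumptions, and it may differ in detail from the procedure used to produce the results in these slides; the class names, coordinates, and function names are hypothetical.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def nearest_training_label(vector, training):
    """Return the label whose representative vector is closest."""
    return min(training, key=lambda label: euclidean(vector, training[label]))

def classify_all(vectors, training):
    """Apply the 1-NN rule to every vector."""
    return [nearest_training_label(v, training) for v in vectors]

# One representative (training) vector per class, in the reduced space.
training = {"soil": [0.0, 0.0], "vegetation": [1.0, 1.0]}

# Reduced-dimension vectors for four pixels (toy values).
vectors = [[0.1, -0.2], [0.9, 1.2], [0.4, 0.3], [0.8, 0.6]]
labels = classify_all(vectors, training)
print(labels)
```

Because each class is represented by a single pixel, the outcome depends heavily on which pixel is picked as the representative.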

Results: SalinasA vs groundtruth

Left: my result

Right: ground truth

Quantitative analysis: error percentage

Calculated the percentage of pixels whose assigned class disagrees with the groundtruth data set.

Percentage: ≈7.1%

Possible reasons: unrepresentative training data; dimension is too low; epsilon is too large
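The error percentage can be sketched in Python as the share of pixels whose predicted class differs from the ground truth. The function name, the toy label lists, and the optional `ignore_label` argument (for skipping pixels that have no ground truth class) are assumptions for illustration, not details taken from the project code.

```python
def error_percentage(predicted, ground_truth, ignore_label=None):
    """Percentage of pixels whose predicted class differs from ground truth.

    Pixels whose ground truth equals ignore_label are left out
    (assumption: such pixels are unlabeled).
    """
    pairs = [(p, g) for p, g in zip(predicted, ground_truth)
             if g != ignore_label]
    wrong = sum(1 for p, g in pairs if p != g)
    return 100.0 * wrong / len(pairs)

# Toy labels for eight pixels (hypothetical classes 1 to 3).
predicted    = [1, 1, 2, 2, 3, 3, 1, 2]
ground_truth = [1, 1, 2, 3, 3, 3, 1, 1]

error = error_percentage(predicted, ground_truth)
print(error)
```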

Time analysis

Running time: 15829.7s ≈ 4.40h


Timeline/milestones

October 17th: Project proposal

Now to November, 2014: Implement and test Laplacian eigenmaps, prepare for implementation of randomized PCA

December, 2014: Midyear presentation

January to March: Implement and test randomized PCA, compare the two methods in various situations

April to May: Final presentation and final report


Deliverables

Presentation of data sets with reduced dimensions from both algorithms (presented as images)

Comparison charts in terms of running time and accuracy of the two methods (half completed)

Comparison charts with other methods that are available from the DR MATLAB toolbox

Data sets (available), MATLAB codes (half available), presentations (two available), proposals (completed), mid-year report, final report


Possible modifications

Algorithm: different ways of choosing neighboring nodes (the value of epsilon can be modified)

Verification: use different classification methods, for example K-means

Parallelization in C to improve running time


Summary

Implementation of the algorithm was successful

Validated the code

Verified and compared results to the groundtruth image

The eigenvectors of reduced dimensionality preserved information about the original data set while reducing the complexity of the data and allowing for further processing


Bibliography

Shippert, Peg. Introduction to Hyperspectral Image Analysis. Online Journal of Space Communication, issue No. 3: Remote Sensing of Earth via Satellite. Winter 2003. http://spacejournal.ohio.edu/pdf/shippert.pdf

Hyperspectral Imaging. From Wikipedia. Oct. 6th, 2014. http://en.wikipedia.org/wiki/Hyperspectral_imaging

Belkin, Mikhail; Niyogi, Partha. Laplacian Eigenmaps for Dimensionality Reduction and Data Representation. Neural Computation, vol 15. Dec. 8th, 2002. Web. http://web.cse.ohio-state.edu/~mbelkin/papers/LEM_NC_03.pdf


Rokhlin, Vladimir; Szlam, Arthur; Tygert, Mark. A Randomized Algorithm for Principal Component Analysis. SIAM Journal on Matrix Analysis and Applications Volume 31 Issue 3. August 2009. Web. ftp://ftp.math.ucla.edu/pub/camreport/cam08-60.pdf

Matlab Toolbox for Dimension Reduction. Delft University. Web. Oct. 6th, 2014. http://homepage.tudelft.nl/19j49/Matlab_Toolbox_for_Dimensionality_Reduction.html

IC: Hyperspectral Remote Sensing Scenes. Web. Oct. 6th, 2014. http://www.ehu.es/ccwintco/index.php?title=Hyperspectral_Remote_Sensing_Scenes

Qian, Shen-En. Dimensionality reduction of multidimensional satellite imagery. SPIE Newsroom. 21 March 2011. http://spie.org/x45097.xml
