Manifold Learning Using Geodesic Entropic Graphs

Alfred O. Hero and Jose Costa

Dept. EECS, Dept. Biomed. Eng., Dept. Statistics

University of Michigan - Ann Arbor

http://www.eecs.umich.edu/~hero

Research supported in part by: ARO-DARPA MURI DAAD19-02-1-0262

1. Manifold Learning and Dimension Reduction

2. Entropic Graphs

3. Examples

1. Dimension Reduction and Pattern Matching

• 128x128 images of three vehicles over 1 deg increments of 360 deg azimuth at 0 deg elevation

• The 3(360)=1080 images evolve on a lower dimensional manifold embedded in R^(16384), since each 128x128 image has 16384 pixels

Courtesy of Center for Imaging Science, JHU

[Figure: example images of the three vehicles: HMMV, T62, Truck]

Land Vehicle Image Manifold

Quantities of Interest

Entropy: H

Manifold (intrinsic) Dimension: d

Embedding (extrinsic) Dimension: D

Assumption: the embedding is a conformal mapping

[Figure: a statistical sample drawn from a sampling distribution on a 2-dim manifold; stages labeled Sampling and Embedding]

Sampling on a Domain Manifold

Background on Manifold Learning

1. Manifold intrinsic dimension estimation
   1. Local KLE, Fukunaga, Olsen (1971)
   2. Nearest neighbor algorithm, Pettis, Bailey, Jain, Dubes (1979)
   3. Fractal measures, Camastra and Vinciarelli (2002)
   4. Packing numbers, Kegl (2002)

2. Manifold Reconstruction
   1. Isomap-MDS, Tenenbaum, de Silva, Langford (2000)
   2. Locally Linear Embeddings (LLE), Roweis, Saul (2000)
   3. Laplacian eigenmaps (LE), Belkin, Niyogi (2002)
   4. Hessian eigenmaps (HE), Grimes, Donoho (2003)

3. Characterization of sampling distributions on manifolds
   1. Statistics of directional data, Watson (1956), Mardia (1972)
   2. Data compression on 3D surfaces, Kolarov, Lynch (1997)
   3. Statistics of shape, Kendall (1984), Kent, Mardia (2001)

2. Entropic Graphs

A Planar Sample and its Euclidean MST

MST and Geodesic MST

• For a set of points $X_n = \{x_1, \ldots, x_n\}$ in D-dimensional Euclidean space, the Euclidean MST with edge power weighting gamma is defined as

  $L_\gamma(X_n) = \min_{T} \sum_{e \in T} |e|^\gamma$

• where $|e|$ are the edge lengths of a spanning tree $T$ over $X_n$

• When the pairwise distances are geodesic distances on the manifold, one obtains the Geodesic MST (GMST)

• For dense samplings GMST length = MST length
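The definitions above can be sketched in a few lines of pure Python. This is a minimal illustration, not the authors' implementation. It assumes the geodesic distances are approximated by shortest paths on a symmetric k-nearest-neighbor graph (the Isomap-style construction), and the function names `euclidean_matrix`, `geodesic_matrix`, and `mst_length` are hypothetical. Because raising edge lengths to a power gamma > 0 preserves their ordering, the tree is found on the plain lengths and the power is applied when summing.

```python
import heapq
import math


def euclidean_matrix(points):
    """Dense matrix of pairwise Euclidean distances."""
    return [[math.dist(p, q) for q in points] for p in points]


def geodesic_matrix(points, k):
    """Approximate geodesic distances by shortest paths on a symmetric
    k-nearest-neighbor graph (an assumed, Isomap-style approximation)."""
    n = len(points)
    eu = euclidean_matrix(points)
    adj = [dict() for _ in range(n)]
    for i in range(n):
        others = (j for j in range(n) if j != i)
        for j in sorted(others, key=lambda j: eu[i][j])[:k]:
            adj[i][j] = eu[i][j]
            adj[j][i] = eu[i][j]
    geo = []
    for src in range(n):
        # Dijkstra's algorithm from each source vertex.
        dist = [math.inf] * n
        dist[src] = 0.0
        heap = [(0.0, src)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist[u]:
                continue
            for v, w in adj[u].items():
                if d + w < dist[v]:
                    dist[v] = d + w
                    heapq.heappush(heap, (d + w, v))
        geo.append(dist)
    return geo


def mst_length(dist, gamma=1.0):
    """Prim's algorithm on a dense distance matrix; returns the sum of
    (edge length) ** gamma over the minimal spanning tree (gamma > 0)."""
    n = len(dist)
    in_tree = [False] * n
    best = [math.inf] * n
    best[0] = 0.0
    total = 0.0
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        total += best[u] ** gamma
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v] = dist[u][v]
    return total
```

Passing `geodesic_matrix(points, k)` to `mst_length` gives a GMST length, while passing `euclidean_matrix(points)` gives the Euclidean MST length. On a densely sampled curve the two coincide, because the tree only uses edges between near neighbors.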

Convergence of Euclidean MST

Beardwood, Halton, Hammersley Theorem:
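A sketch of the statement in its usual power-weighted form (the original result covers $\gamma = 1$). The notation is an assumption chosen to match the "edge power weighting gamma" used earlier: $X_n = \{x_1, \ldots, x_n\}$ are i.i.d. points drawn from a compactly supported density $f$ on $\mathbb{R}^d$, with $d \ge 2$ and $0 < \gamma < d$, and $L_\gamma(X_n)$ is the power-weighted MST length.

```latex
\[
  \lim_{n \to \infty} \frac{L_\gamma(X_n)}{n^{(d-\gamma)/d}}
  \;=\; \beta_{d,\gamma} \int f(x)^{(d-\gamma)/d} \, dx
  \qquad \text{(almost surely)}
\]
```

Here $\beta_{d,\gamma}$ is a constant that depends only on $d$ and $\gamma$, not on $f$.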

Convergence Theorem for GMST

Ref: Costa&Hero:TSP2003
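A sketch of the corresponding limit for the GMST, written for the isometric case and following the cited reference as best understood. The notation is assumed rather than taken from the slide: $Y_n$ are i.i.d. samples with density $f$ on a compact $d$-dimensional manifold $\mathcal{M}$ with volume measure $\mu$, and $\alpha = (d-\gamma)/d$.

```latex
\[
  \lim_{n \to \infty} \frac{L_\gamma(Y_n)}{n^{(d-\gamma)/d}}
  \;=\; \beta_{d,\gamma} \int_{\mathcal{M}} f(y)^{\alpha} \, \mu(dy)
  \qquad \text{(almost surely)}
\]
\[
  H_\alpha(f) \;=\; \frac{1}{1-\alpha} \log \int_{\mathcal{M}} f(y)^{\alpha} \, \mu(dy)
  \quad\Longrightarrow\quad
  \log L_\gamma(Y_n) \;\approx\; \frac{d-\gamma}{d} \log n
  \;+\; \log \beta_{d,\gamma} \;+\; \frac{\gamma}{d} H_\alpha(f)
\]
```

The second line uses $1 - \alpha = \gamma/d$, so the growth rate of the GMST length carries the intrinsic dimension and the offset carries the Rényi alpha-entropy.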

Special Cases

• Isometric embedding (distance preserving)

• Conformal embedding (angle preserving)

Joint Estimation Algorithm

• Convergence theorem suggests log-linear model

• Use bootstrap resampling to estimate the mean MST length, then apply least squares (LS) to jointly estimate the slope and intercept from the resulting sequence of lengths

• Extract d and H from slope and intercept
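The steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code. The log-linear model is taken to be log L_n = a log n + b, with a = (d - gamma)/d and b = log(beta) + (gamma/d) H. The constant beta has no closed form, so `log_beta` must be supplied by the caller. The resampling step draws subsamples without replacement, which is one possible reading of "bootstrap". All function names are hypothetical.

```python
import math
import random


def bootstrap_mean_length(data, n, length_fn, trials=10, seed=0):
    """Average length_fn over random size-n subsamples of data.
    Sampling without replacement is an assumption of this sketch."""
    rng = random.Random(seed)
    return sum(length_fn(rng.sample(data, n)) for _ in range(trials)) / trials


def fit_loglinear(ns, mean_lengths):
    """Ordinary least squares fit of log(L_n) = a * log(n) + b."""
    xs = [math.log(n) for n in ns]
    ys = [math.log(length) for length in mean_lengths]
    m = len(xs)
    xbar = sum(xs) / m
    ybar = sum(ys) / m
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = ybar - a * xbar
    return a, b


def dimension_and_entropy(a, b, gamma=1.0, log_beta=None):
    """Invert a = (d - gamma)/d and b = log(beta) + (gamma/d) * H.
    Returns (d_hat, alpha, H); H is None when log_beta is not given."""
    if a >= 1.0:
        raise ValueError("slope must be below 1 for a finite dimension")
    d_hat = max(1, round(gamma / (1.0 - a)))
    alpha = (d_hat - gamma) / d_hat
    entropy = None
    if log_beta is not None:
        entropy = (d_hat / gamma) * (b - log_beta)
    return d_hat, alpha, entropy
```

In use, `length_fn` would be a function that returns the GMST length of a subsample, `bootstrap_mean_length` would be evaluated over a range of n, and the resulting pairs would be passed to `fit_loglinear`.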

3. Examples

Random Samples on the Swiss Roll

• Ref: Tenenbaum et al. (2000)

Bootstrap Estimates of GMST Length

[Figure: Segment n=786:799 of MST sequence (=1, m=10) for uniformly sampled Swiss Roll. x-axis: n (785 to 805); y-axis: E[L_n] (806 to 815); bootstrap SE bars (83% CI).]

Log-log Linear Fit to GMST Length

[Figure: Segment of log MST sequence (=1, m=10) for uniformly sampled Swiss Roll. x-axis: log(n) (6.665 to 6.685); y-axis: log(E[L_n]) (6.692 to 6.704); curves: log(E[L_n]) and LS fit, y = 0.53*x + 3.2.]

Dimension and Entropy Estimates

• From LS fit find:

• Intrinsic dimension estimate

• Alpha-entropy estimate ( )

– Ground truth:
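As a worked illustration, the reported LS fit y = 0.53*x + 3.2 can be inverted by hand. This assumes the log-linear model has slope $a = (d-\gamma)/d$ and intercept $b = \log \beta_{d,\gamma} + (\gamma/d) H_\alpha$, and that the edge power weighting is $\gamma = 1$, which is how the figure caption's "(=1, m=10)" is read here. The constant $\beta_{2,1}$ is left symbolic because its value is not given.

```latex
\begin{align*}
  \hat{a} &= 0.53, \qquad \hat{b} = 3.2, \qquad \gamma = 1 \\
  \hat{d} &= \operatorname{round}\!\left(\frac{\gamma}{1-\hat{a}}\right)
           = \operatorname{round}\!\left(\frac{1}{0.47}\right)
           = \operatorname{round}(2.13) = 2 \\
  \hat{\alpha} &= \frac{\hat{d}-\gamma}{\hat{d}} = \frac{1}{2} \\
  \hat{H}_{\alpha} &= \frac{\hat{d}}{\gamma}\left(\hat{b} - \log \beta_{\hat{d},\gamma}\right)
           = 2\left(3.2 - \log \beta_{2,1}\right)
\end{align*}
```

The dimension estimate of 2 is consistent with the Swiss roll being a two-dimensional surface.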

Dimension Estimation Comparisons

Application to Faces

• Yale face database 2
  – Photographic folios of many people’s faces
  – Each face folio contains images at 585 different illumination/pose conditions
  – Subsampled to 64 by 64 pixels (4096 extrinsic dimensions)

• Objective: determine intrinsic dimension and entropy of a typical face folio

GMST for 3 Face Folios

Ref: Costa&Hero 2003

Conclusions

• Characterizing high-dimensional sampling distributions
  – Standard techniques (histogram, density estimation) fail due to curse of dimensionality
  – Entropic graphs can be used to construct consistent estimators of entropy and information divergence
  – Robustification to outliers via pruning

• Manifold learning and model reduction
  – LLE, LE, HE estimate d by finding local linear representation of manifold
  – Entropic graph estimates d from global resampling
  – Computational complexity of MST is only O(n log n)

Advantages of Geodesic Entropic Graph Methods

References

• A. O. Hero, B. Ma, O. Michel and J. D. Gorman, “Application of entropic graphs,” IEEE Signal Processing Magazine, Sept 2002.

• H. Neemuchwala, A. O. Hero and P. Carson, “Entropic graphs for image registration,” to appear in European Journal of Signal Processing, 2003.

• J. Costa and A. O. Hero, “Manifold learning with geodesic minimal spanning trees,” accepted in IEEE T-SP (Special Issue on Machine Learning), 2004.

• A. O. Hero, J. Costa and B. Ma, “Convergence rates of minimal graphs with random vertices,” submitted to IEEE T-IT, March 2001.

• J. Costa, A. O. Hero and C. Vignat, “On solutions to multivariate maximum alpha-entropy problems,” in Energy Minimization Methods in Computer Vision and Pattern Recognition (EMM-CVPR), Eds. M. Figueiredo, R. Rangarajan, J. Zerubia, Springer-Verlag, 2003.
