Manifold Learning in the Wild: A New Manifold Modeling and Learning Framework for Image Ensembles. Richard G. Baraniuk, Chinmay Hegde, Aswin C. Sankaranarayanan. Rice University.


Post on 24-Feb-2016


TRANSCRIPT

Page 1: Richard G.  Baraniuk Chinmay Hegde

Manifold Learning in the Wild

A New Manifold Modeling and Learning Framework for Image Ensembles

Richard G. Baraniuk, Chinmay Hegde, Aswin C. Sankaranarayanan

Rice University

Page 2: Richard G.  Baraniuk Chinmay Hegde

Sensor Data Deluge

Page 3: Richard G.  Baraniuk Chinmay Hegde

Internet Scale Databases

• Tremendous size of corpus of available data

– Google Image Search of “Notre Dame Cathedral” yields 3M results (about 3 TB of data)

Page 4: Richard G.  Baraniuk Chinmay Hegde

Concise Models

• Efficient processing / compression requires concise representation

• Our interest in this talk: collections of images

Page 6: Richard G.  Baraniuk Chinmay Hegde

Concise Models

• Our interest in this talk: collections of images parameterized by $\theta \in \Theta$

– translations of an object ($\theta$: x-offset and y-offset)

– rotations of a 3D object ($\theta$: pitch, roll, yaw)

– wedgelets ($\theta$: orientation and offset)

• Image articulation manifold

Page 7: Richard G.  Baraniuk Chinmay Hegde

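Image Articulation Manifold

• N-pixel images: $I_\theta \in \mathbb{R}^N$

• K-dimensional articulation space: $\theta \in \Theta \subset \mathbb{R}^K$

• Then $\mathcal{M} = \{ I_\theta : \theta \in \Theta \}$ is a K-dimensional manifold in the ambient space $\mathbb{R}^N$

• Very concise model

– Can be learnt using non-linear dimensionality reduction

[Figure label: articulation parameter space]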

Page 8: Richard G.  Baraniuk Chinmay Hegde

Ex: Manifold Learning

LLE, ISOMAP, LE, HE, Diff. Geo, …

• K = 1: rotation

Page 9: Richard G.  Baraniuk Chinmay Hegde

Ex: Manifold Learning

• K = 2: rotation and scale

Page 10: Richard G.  Baraniuk Chinmay Hegde

Smooth IAMs

• N-pixel images: $I_\theta \in \mathbb{R}^N$

• Local isometry: image distance corresponds to parameter space distance

• Linear tangent spaces are a close approximation locally

• Low dimensional articulation space

[Figure label: articulation parameter space]

Page 13: Richard G.  Baraniuk Chinmay Hegde

Theory/Practice Disconnect: Isometry

• Ex: translation manifold

– all blue images are equidistant from the red image

• Local isometry

– satisfied only when sampling is dense

[Figure: Euclidean distance versus translation $\theta$ in [px], for translations from 0 to 100 px]
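The equidistance effect on this slide can be reproduced with a toy example. The sketch below is an illustration under stated assumptions, not the authors' experiment: it uses a hypothetical 1-D binary box as the "image" (`box_image`, `n_pixels`, and `width` are made-up names and values), so the numbers will not match the plot on the slide.

```python
import math


def box_image(n_pixels, width, offset):
    """Hypothetical 1-D 'image': a unit-intensity box of `width` pixels starting at `offset`."""
    return [1.0 if offset <= i < offset + width else 0.0 for i in range(n_pixels)]


def euclidean_distance(a, b):
    """Plain L2 distance between two pixel vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


n_pixels, width = 200, 20
reference = box_image(n_pixels, width, 0)

# Distance from the reference image to each translated copy, for shifts of 0..100 px.
distances = [
    euclidean_distance(reference, box_image(n_pixels, width, shift))
    for shift in range(0, 101)
]

# While the boxes still overlap, the distance grows with the shift.
# Once the shift reaches the box width, the boxes no longer overlap and
# every further translate is at the same distance from the reference.
for shift in (0, 5, 10, 20, 50, 100):
    print(shift, round(distances[shift], 3))
```

In this toy setting the pixel-space distance stops carrying any information about the translation once the shift exceeds the object width, which is why local isometry only holds when the translations are sampled densely.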

Page 14: Richard G.  Baraniuk Chinmay Hegde

Theory/Practice Disconnect: Nuisance articulations

• Unsupervised data invariably has additional undesired articulations

– Illumination

– Background clutter, occlusions, …

• Image ensemble is no longer low-dimensional

Page 15: Richard G.  Baraniuk Chinmay Hegde

Image representations

• Conventional representation for an image

– A vector of pixels

– Inadequate!

[Figure label: pixel image]

Page 16: Richard G.  Baraniuk Chinmay Hegde

Image representations

• Replace vector of pixels with an abstract bag of features

– Ex: SIFT (Scale Invariant Feature Transform) selects keypoint locations in an image and computes a keypoint descriptor for each keypoint

– Very popular in many vision problems

Page 17: Richard G.  Baraniuk Chinmay Hegde

Image representations

• Replace vector of pixels with an abstract bag of features

– Ex: SIFT (Scale Invariant Feature Transform) selects keypoint locations in an image and computes a keypoint descriptor for each keypoint

– Keypoint descriptors are local; it is very easy to make them robust to nuisance imaging parameters

Page 18: Richard G.  Baraniuk Chinmay Hegde

Loss of Geometrical Info

• Bag of features representations hide potentially useful image geometry

• Goal: make salient image geometrical info more explicit for exploitation

[Figure labels: image space, keypoint space]

Page 20: Richard G.  Baraniuk Chinmay Hegde

Key idea

• Keypoint space can be endowed with a rich low-dimensional structure in many situations

• Mechanism: define kernels between keypoint locations and between keypoint descriptors

Page 21: Richard G.  Baraniuk Chinmay Hegde

Keypoint Kernel

• Keypoint space can be endowed with a rich low-dimensional structure in many situations

• Mechanism: define kernels between keypoint locations and between keypoint descriptors

• Joint keypoint kernel between two images is given by [equation not preserved in this transcript]

Page 22: Richard G.  Baraniuk Chinmay Hegde

Many Possible Kernels

• Euclidean kernel

• Gaussian kernel

• Polynomial kernel

• Pyramid match kernel [Grauman et al. ’07]

• Many others
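The first three kernels in this list have standard textbook forms, sketched below. The slide's own formulas are not preserved, so the exact definitions here are assumptions: "Euclidean kernel" is read as the plain inner product, and the parameter names `sigma`, `degree`, and `offset` are illustrative choices.

```python
import math


def euclidean_kernel(u, v):
    """Inner-product (linear) kernel <u, v>; assumed meaning of 'Euclidean kernel'."""
    return sum(a * b for a, b in zip(u, v))


def gaussian_kernel(u, v, sigma=1.0):
    """Gaussian (RBF) kernel exp(-||u - v||^2 / (2 sigma^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def polynomial_kernel(u, v, degree=2, offset=1.0):
    """Polynomial kernel (<u, v> + offset)^degree."""
    return (euclidean_kernel(u, v) + offset) ** degree


u, v = [1.0, 2.0], [3.0, 4.0]
print(euclidean_kernel(u, v))
print(gaussian_kernel(u, v, sigma=2.0))
print(polynomial_kernel(u, v))
```

Any of these can be applied to keypoint locations or to keypoint descriptors; the pyramid match kernel is omitted because it needs a multi-resolution histogram that does not fit in a short sketch.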

Page 23: Richard G.  Baraniuk Chinmay Hegde

Keypoint Kernel

• Joint keypoint kernel between two images is given by [equation not preserved in this transcript]

• Using Euclidean/Gaussian (E/G) combination yields [equation not preserved in this transcript]
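Since the kernel formula itself is missing from the transcript, the sketch below shows one plausible reading rather than the authors' definition. It assumes the joint kernel sums, over all pairs of keypoints from the two images, the product of a location kernel and a descriptor kernel, with a Gaussian kernel on locations and an inner product on descriptors. The function names, the `(location, descriptor)` tuple layout, and `sigma` are all hypothetical.

```python
import math


def gaussian_kernel(x, y, sigma):
    """Gaussian kernel on keypoint locations."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def inner_product(f, g):
    """Inner-product kernel on keypoint descriptors."""
    return sum(a * b for a, b in zip(f, g))


def joint_keypoint_kernel(image_a, image_b, sigma=10.0):
    """Assumed form: sum over all keypoint pairs of
    (location kernel) * (descriptor kernel).

    Each image is a list of (location, descriptor) tuples.
    """
    total = 0.0
    for location_a, descriptor_a in image_a:
        for location_b, descriptor_b in image_b:
            total += gaussian_kernel(location_a, location_b, sigma) * inner_product(
                descriptor_a, descriptor_b
            )
    return total


# Two toy "images" with two keypoints each; the second is the first shifted by 1 px in x.
image_a = [((0.0, 0.0), (1.0, 0.0)), ((5.0, 5.0), (0.0, 1.0))]
image_b = [((1.0, 0.0), (1.0, 0.0)), ((6.0, 5.0), (0.0, 1.0))]

print(joint_keypoint_kernel(image_a, image_a))
print(joint_keypoint_kernel(image_a, image_b))
```

Because each term only compares local descriptors, a keypoint that is occluded or that belongs to background clutter changes a few terms of the sum instead of the whole image vector.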

Page 24: Richard G.  Baraniuk Chinmay Hegde

From Kernel to Metric

Lemma: The E/G keypoint kernel is a Mercer kernel

– enables algorithms such as SVM

Lemma: The E/G keypoint kernel induces a metric on the space of images

– alternative to conventional L2 distance between images

– keypoint metric robust to nuisance imaging parameters, occlusion, clutter, etc.
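The slides do not show how the metric is built from the kernel. The standard construction for any Mercer kernel $K$, given here as an assumption about what the second lemma refers to, measures distance in the feature space that $K$ implicitly defines (the symbols $I_1$, $I_2$, and $\phi$ are notation introduced for this note):

```latex
% Distance induced by a Mercer kernel K with feature map \phi,
% so that K(I_1, I_2) = \langle \phi(I_1), \phi(I_2) \rangle.
\begin{align*}
d(I_1, I_2)^2
  &= \lVert \phi(I_1) - \phi(I_2) \rVert^2 \\
  &= \langle \phi(I_1), \phi(I_1) \rangle
     - 2 \langle \phi(I_1), \phi(I_2) \rangle
     + \langle \phi(I_2), \phi(I_2) \rangle \\
  &= K(I_1, I_1) - 2 K(I_1, I_2) + K(I_2, I_2).
\end{align*}
```

Only kernel evaluations are needed, so the distance between two images can be computed from their keypoints without ever forming the feature vectors.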

Page 25: Richard G.  Baraniuk Chinmay Hegde

Keypoint Geometry

Theorem: Under the metric induced by the kernel, certain ensembles of articulating images form smooth, isometric manifolds

• Keypoint representation compact, efficient, and …

• Robust to illumination variations, non-stationary backgrounds, clutter, occlusions

Page 26: Richard G.  Baraniuk Chinmay Hegde

Keypoint Geometry

Theorem: Under the metric induced by the kernel, certain ensembles of articulating images form smooth, isometric manifolds

• In contrast: the conventional approach to image fusion via image articulation manifolds (IAMs) is fraught with non-differentiability (due to sharp image edges)

– not smooth

– not isometric

Page 27: Richard G.  Baraniuk Chinmay Hegde

Application: Manifold Learning

2D Translation

Page 28: Richard G.  Baraniuk Chinmay Hegde

Application: Manifold Learning

2D Translation IAM KAM

Page 29: Richard G.  Baraniuk Chinmay Hegde

Manifold Learning in the Wild

• Rice University’s Duncan Hall Lobby

– 158 images

– 360° panorama using handheld camera

– Varying brightness, clutter

Page 30: Richard G.  Baraniuk Chinmay Hegde

Manifold Learning in the Wild

• Duncan Hall Lobby

• Ground truth using state-of-the-art structure-from-motion software

[Figure panels: Ground truth, IAM, KAM]

Page 31: Richard G.  Baraniuk Chinmay Hegde

Manifold Learning in the Wild

• Rice University’s Brochstein Pavilion

– 400 outdoor images of a building

– occlusions, movement in foreground, varying background

Page 32: Richard G.  Baraniuk Chinmay Hegde

Manifold Learning in the Wild

• Brochstein Pavilion

– 400 outdoor images of a building

– occlusions, movement in foreground, varying background

[Figure panels: IAM, KAM]

Page 33: Richard G.  Baraniuk Chinmay Hegde

Internet scale imagery

• Notre Dame Cathedral

– 738 images

– Collected from Flickr

– Large variations in illumination (night/day/saturation), clutter (people, decorations), camera parameters (focal length, field of view, …)

– Non-uniform sampling of the space

Page 34: Richard G.  Baraniuk Chinmay Hegde

Organization

• k-nearest neighbors
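Organizing an image collection by k-nearest neighbors only needs pairwise distances, for example the keypoint metric from the earlier slides. The sketch below is a minimal illustration that assumes the distances are already available as a square matrix; `k_nearest_neighbors` and the toy data are hypothetical and not taken from the talk.

```python
def k_nearest_neighbors(distances, k):
    """Return, for each item i, the indices of its k closest other items.

    `distances` is a square list of lists with distances[i][j] the
    distance between item i and item j.
    """
    neighbors = []
    for i, row in enumerate(distances):
        # Sort every other index by its distance to item i.
        order = sorted((j for j in range(len(row)) if j != i), key=lambda j: row[j])
        neighbors.append(order[:k])
    return neighbors


# Toy example: four "images" placed on a line at these positions.
positions = [0.0, 1.0, 3.0, 7.0]
distances = [[abs(p - q) for q in positions] for p in positions]

print(k_nearest_neighbors(distances, k=2))
```

This brute-force version sorts each row of the matrix, which is adequate for a few hundred images such as the 738-image Notre Dame set.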

Page 35: Richard G.  Baraniuk Chinmay Hegde

Organization

• “geodesics”

3D rotation

“Walk-closer”

“zoom-out”
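One common way to obtain such "geodesics" is to connect each image to its nearest neighbors and take the shortest path through the resulting graph. The slides do not say how the paths were computed, so the sketch below is an assumption: it runs Dijkstra's algorithm on a hypothetical neighborhood graph whose edge lengths stand in for image distances.

```python
import heapq


def geodesic_path(graph, source, target):
    """Shortest path by Dijkstra's algorithm.

    `graph` maps each node to a list of (neighbor, edge_length) pairs.
    Returns (path, length), or (None, inf) if the target is unreachable.
    """
    best = {source: 0.0}
    previous = {}
    visited = set()
    heap = [(0.0, source)]
    while heap:
        distance, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == target:
            break
        for neighbor, length in graph.get(node, []):
            candidate = distance + length
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                previous[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    if target not in best:
        return None, float("inf")
    # Walk the predecessor links back from the target to the source.
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    path.reverse()
    return path, best[target]


# Hypothetical neighborhood graph over four images (nodes 0..3).
graph = {
    0: [(1, 1.0), (2, 2.5)],
    1: [(0, 1.0), (2, 1.0)],
    2: [(0, 2.5), (1, 1.0), (3, 1.5)],
    3: [(2, 1.5)],
}

path, length = geodesic_path(graph, 0, 3)
print(path, length)
```

The intermediate nodes on the returned path are the images one would display in sequence to move smoothly from the source view to the target view.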

Page 36: Richard G.  Baraniuk Chinmay Hegde

Summary

• Challenges for manifold learning in the wild are both theoretical and practical

• Need for novel image representations

– Sparse features

– Robustness to outliers, nuisance articulations, etc.

– Learning in the wild: unsupervised imagery

• Promise lies in fast methods that exploit only neighborhood properties

– No complex optimization required