Recent developments in nonlinear dimensionality reduction
Josh Tenenbaum
MIT
Collaborators
• Vin de Silva
• John Langford
• Mira Bernstein
• Mark Steyvers
• Eric Berger
Outline
• The problem of nonlinear dimensionality reduction
• The Isomap algorithm
• Development #1: Curved manifolds
• Development #2: Sparse approximations
Learning an appearance map
• Given input: . . .
• Desired output:
  – Intrinsic dimensionality: 3
  – Low-dimensional representation:
Linear dimensionality reduction: PCA, MDS
• PCA dimensionality of faces:
• First two PCs:
• Linear manifold: PCA
• Nonlinear manifold: ?
Previous approaches to nonlinear dimensionality reduction
• Local methods seek a set of low-dimensional models, each valid over a limited range of data:
  – Local PCA
  – Mixture of factor analyzers
• Global methods seek a single low-dimensional model valid over the whole data set:
  – Autoencoder neural networks
  – Self-organizing map
  – Elastic net
  – Principal curves & surfaces
  – Generative topographic mapping
A generative model
• Latent space Y ⊆ R^d
• Latent data {yi} ⊂ Y, generated from p(Y)
• Mapping f: Y → R^N for some N > d
• Observed data {xi = f(yi)} ⊂ R^N
Goal: given {xi}, recover f and {yi}.
Chicken-and-egg problem
• We know {xi} . . .
• . . . and if we knew {yi}, we could estimate f.
• . . . or if we knew f, we could estimate {yi}.
• So use EM, right? Wrong.
The problem of local minima
(Figure: embeddings produced by GTM and SOM.)
• Global nonlinear dimensionality reduction + local optimization = severe local minima
A different approach
• Attempt to infer {yi} directly from {xi}, without explicit reference to f.
• Closed-form, non-iterative, globally optimal solution for {yi}.
• Then can approximate f with a suitable interpolation algorithm (RBFs, local linear, ...).
• In other words, finding f becomes a supervised learning problem on the pairs {yi, xi}.
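This supervised step can be sketched with an off-the-shelf RBF interpolator. The latent samples and the map f below are hypothetical stand-ins for the recovered {yi} and the observed {xi}, chosen only for illustration:

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

# Hypothetical latent samples {yi} and observations {xi = f(yi)};
# f here is an arbitrary smooth nonlinear map, for illustration only.
rng = np.random.default_rng(0)
Y = rng.uniform(-1.0, 1.0, size=(100, 2))
X = np.c_[Y[:, 0], Y[:, 1], Y[:, 0] * Y[:, 1]]

# Fit f by RBF interpolation on the (yi, xi) pairs -- a supervised problem.
f_hat = RBFInterpolator(Y, X)

# The fitted map can then be evaluated at new latent coordinates.
x_new = f_hat(np.array([[0.5, -0.5]]))
```

With the default thin-plate-spline kernel and no smoothing, the interpolant passes exactly through the training pairs; a local-linear fit would be an equally valid choice here.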
When does this work?
• Only given some assumptions on the nature of f and the distribution of the {yi}.
• The trick: exploit some invariant of f, a property of the {yi} that is preserved in the {xi}, and that allows the {yi} to be read off uniquely*.
* up to some isomorphism (e.g., rotation).
The assumptions behind three algorithms
No free lunch: weaker assumptions on f ⇒ stronger assumptions on p(Y).

|      | Distribution: p(Y)      | Mapping: f       | Algorithm     |
|------|-------------------------|------------------|---------------|
| i)   | arbitrary               | linear isometric | Classical MDS |
| ii)  | convex, dense           | isometric        | Isomap        |
| iii) | convex, uniformly dense | conformal        | C-Isomap      |
The assumptions behind three algorithms
|      | Distribution: p(Y)      | Mapping: f       | Algorithm     |
|------|-------------------------|------------------|---------------|
| **i)** | **arbitrary**         | **linear isometric** | **Classical MDS** |
| ii)  | convex, dense           | isometric        | Isomap        |
| iii) | convex, uniformly dense | conformal        | C-Isomap      |
Classical MDS
• Invariant: Euclidean distance
• Algorithm:
  – Calculate the Euclidean distance matrix D.
  – Convert D to the canonical inner-product matrix B by "double centering":

    b_ij = −(1/2) [ d_ij² − (1/n) Σ_j d_ij² − (1/n) Σ_i d_ij² + (1/n²) Σ_i Σ_j d_ij² ]

  – Compute {yi} from the top eigenvectors of B.
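These steps can be sketched in a few lines of NumPy. The function name and test data are illustrative; the matrix form −½ J D² J (with centering matrix J = I − 11ᵀ/n) is equivalent to the per-entry double-centering formula above:

```python
import numpy as np

def classical_mds(D, d=2):
    """Recover {yi} from a Euclidean distance matrix D by classical MDS."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ (D ** 2) @ J              # double centering: inner products
    w, V = np.linalg.eigh(B)                 # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:d]            # keep the top-d eigenpairs
    return V[:, idx] * np.sqrt(np.clip(w[idx], 0.0, None))
```

If D really is a Euclidean distance matrix of d-dimensional points, the embedding reproduces it exactly (up to rotation and reflection).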
The assumptions behind three algorithms
|      | Distribution: p(Y)      | Mapping: f       | Algorithm     |
|------|-------------------------|------------------|---------------|
| i)   | arbitrary               | linear isometric | Classical MDS |
| **ii)** | **convex, dense**    | **isometric**    | **Isomap**    |
| iii) | convex, uniformly dense | conformal        | C-Isomap      |
Isomap
• Invariant: geodesic distance
The Isomap algorithm
• Construct neighborhood graph G.
  – ε method
  – K method
• Compute shortest paths in G, with edge ij weighted by the Euclidean distance |xi − xj|.
  – Floyd
  – Dijkstra (+ Fibonacci heaps)
• Reconstruct low-dimensional latent data {yi}.
  – Classical MDS on graph distances
  – Sparse MDS with landmarks
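The three steps above can be sketched as follows — a minimal version using the K method for the graph and SciPy's Dijkstra for shortest paths; the function name and parameter defaults are illustrative:

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path
from scipy.spatial.distance import pdist, squareform

def isomap(X, k=7, d=2):
    """Minimal Isomap: K-method neighbor graph -> Dijkstra -> classical MDS."""
    n = X.shape[0]
    D = squareform(pdist(X))
    # K method: connect each point to its k nearest neighbors.
    G = np.full((n, n), np.inf)              # inf marks a non-edge
    for i in range(n):
        nbrs = np.argsort(D[i])[1:k + 1]
        G[i, nbrs] = D[i, nbrs]
        G[nbrs, i] = D[nbrs, i]
    geo = shortest_path(G, method='D')       # all-pairs graph geodesics
    # Classical MDS on the geodesic distance matrix.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (geo ** 2) @ J
    w, V = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:d]
    return V[:, idx] * np.sqrt(np.clip(w[idx], 0.0, None))
```

On points sampled along a curve, the graph geodesics approximate arc length, so the 1-D embedding orders the points along the curve.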
Illustration on swiss roll
Discovering the dimensionality
• Measure residual variance in geodesic distances . . .
• . . . and find the elbow.
(Figure: residual variance vs. dimensionality for MDS/PCA and Isomap.)
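The residual variance used here is 1 − R² between the graph distances and the Euclidean distances of the d-dimensional embedding; a minimal sketch (the function name is illustrative):

```python
import numpy as np

def residual_variance(D_geo, Z):
    """1 - R^2 between graph distances and distances in the embedding Z."""
    Dz = np.linalg.norm(Z[:, None] - Z[None, :], axis=-1)
    iu = np.triu_indices_from(Dz, k=1)       # each pair counted once
    r = np.corrcoef(D_geo[iu], Dz[iu])[0, 1]
    return 1.0 - r ** 2
```

Computing this for d = 1, 2, 3, … and plotting gives the curve whose elbow estimates the intrinsic dimensionality.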
Theoretical analysis of asymptotic convergence
• Conditions for PAC-style asymptotic convergence
  – Geometric:
    • Mapping f is isometric to a subset of Euclidean space (i.e., zero intrinsic curvature).
  – Statistical:
    • Latent data {yi} are a "representative" sample* from a convex domain.

* Minimum distance from any point on the manifold to a sample point < ε (satisfied, e.g., by a variable-density Poisson process).
Theoretical results on the rate of convergence
• Upper bound on the number of data points required.
• Rate of convergence depends on several geometric parameters of the manifold:
  – Intrinsic:
    • dimensionality
  – Embedding-dependent:
    • minimal radius of curvature
    • minimal branch separation
Face under varying pose and illumination
(Figure: dimensionality estimates and embeddings, MDS/PCA vs. Isomap.)
Hand under nonrigid articulation
(Figure: dimensionality estimates and embeddings, MDS/PCA vs. Isomap.)
Apparent motion
Digits
(Figure: dimensionality estimates and embeddings, MDS/PCA vs. Isomap.)
Summary of Isomap
A framework for global nonlinear dimensionality reduction that preserves the crucial features of PCA and classical MDS:
• A noniterative, polynomial-time algorithm.
• Guaranteed to construct a globally optimal Euclidean embedding.
• Guaranteed to converge asymptotically for an important class of nonlinear manifolds.

Plus, good results on real and nontrivial synthetic data sets.
Outline
• The problem of nonlinear dimensionality reduction
• The Isomap algorithm
• Development #1: Curved manifolds
• Development #2: Sparse approximations
Locally Linear Embedding (LLE)
• Roweis and Saul (2000)
Comparing LLE and Isomap
• Both start with only local metric information.
• Isomap first estimates global metric structure, then finds an embedding that optimally preserves global structure.
• LLE finds an embedding that optimally preserves only local structure.
• LLE may be more efficient, but may also introduce unpredictable global distortions.
• No asymptotic convergence results for LLE.
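For reference, LLE itself can be sketched in a few lines: solve a local least-squares problem for reconstruction weights at each point, then take the bottom nontrivial eigenvectors of (I − W)ᵀ(I − W). The function name and the regularization constant are illustrative choices, not part of the original algorithm's specification:

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def lle(X, k=7, d=2, reg=1e-3):
    """Minimal LLE sketch: local weights, then bottom eigenvectors."""
    n = X.shape[0]
    D = squareform(pdist(X))
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(D[i])[1:k + 1]
        Z = X[nbrs] - X[i]                        # neighbors centered on x_i
        C = Z @ Z.T                               # local covariance
        C += reg * np.trace(C) * np.eye(k)        # regularize (illustrative)
        w = np.linalg.solve(C, np.ones(k))
        W[i, nbrs] = w / w.sum()                  # weights sum to one
    M = (np.eye(n) - W).T @ (np.eye(n) - W)
    vals, vecs = np.linalg.eigh(M)
    return vecs[:, 1:d + 1]                       # skip the constant eigenvector
```

Note that only the k-neighbor weights enter the objective: this is the sense in which LLE preserves purely local structure.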
(Figure: LLE vs. Isomap embeddings.)
Outline
• The problem of nonlinear dimensionality reduction
• The Isomap algorithm
• Development #1: Curved manifolds
• Development #2: Sparse approximations
The assumptions behind three algorithms
|      | Distribution: p(Y)      | Mapping: f       | Algorithm     |
|------|-------------------------|------------------|---------------|
| i)   | arbitrary               | linear isometric | Classical MDS |
| ii)  | convex, dense           | isometric        | Isomap        |
| **iii)** | **convex, uniformly dense** | **conformal** | **C-Isomap** |
Isometric vs. conformal mapping
• Isometric map: preserves the Euclidean metric at each point y.
• Conformal map: preserves the Euclidean metric at each point y, up to an arbitrary scale factor λ(y) > 0.
• Properties of conformal maps:
  – Angle-preserving.
  – Any subset topologically equivalent to a disk can be conformally mapped onto a disk.
C-Isomap
• Invariant: the mean distance from each point to its neighbors,

  M_X(i) = mean_j |xi − xj|   (in the observed space X)
  M_Y(i) = mean_j |yi − yj|   (in the latent space Y)

  For a conformal map f, M_X(i) = λ(yi) M_Y(i), and under uniform density M_Y(i) is independent of i.
The Isomap algorithm
• Construct neighborhood graph G.
  – ε method
  – K method
• Compute shortest paths in G, with edge ij weighted by the Euclidean distance |xi − xj|.
  – Floyd
  – Dijkstra (+ Fibonacci heaps)
• Reconstruct low-dimensional latent data {yi}.
  – Classical MDS on graph distances
  – Sparse MDS with landmarks
The C-Isomap algorithm
• Construct neighborhood graph G.
  – ε method
  – K method
• Compute shortest paths in G, with edge ij weighted by the rescaled distance

  |xi − xj| / √( M_X(i) M_X(j) )

  – Floyd
  – Dijkstra (+ Fibonacci heaps)
• Reconstruct low-dimensional latent data {yi}.
  – Classical MDS on graph distances
  – Sparse MDS with landmarks
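The rescaling step can be sketched as follows, estimating M_X(i) as the mean distance from xi to its k nearest neighbors; the function name and defaults are illustrative:

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def c_isomap_weights(X, k=7):
    """C-Isomap edge weights |xi - xj| / sqrt(M(i) M(j)), with M(i)
    the mean distance from xi to its k nearest neighbors (est. M_X(i))."""
    D = squareform(pdist(X))
    n = X.shape[0]
    M = np.sort(D, axis=1)[:, 1:k + 1].mean(axis=1)   # mean k-NN distance
    W = np.full((n, n), np.inf)                       # inf marks a non-edge
    for i in range(n):
        nbrs = np.argsort(D[i])[1:k + 1]
        w = D[i, nbrs] / np.sqrt(M[i] * M[nbrs])
        W[i, nbrs] = w
        W[nbrs, i] = w
    return W
```

Shortest paths and classical MDS then proceed exactly as in plain Isomap, just on these rescaled weights.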
Conformal fishbowl
(Panels: Data, MDS, Isomap, C-Isomap, LLE, GTM.)
Uniform fishbowl
(Panels: Data, MDS, Isomap, C-Isomap, LLE, GTM.)
Conformal fishbowl, Gaussian density
(Panels: Latent data, C-Isomap, LLE.)
Conformal fishbowl, offset Gaussian density
(Panels: Latent data, C-Isomap, LLE.)
Wavelet
(Panels: Data, MDS, Isomap, C-Isomap, LLE, GTM.)
Images of Tom’s face
• Two intrinsic degrees of freedom:
  – Translation: left/right
  – Zoom: in/out
• Scale variables (e.g., zoom) introduce conformal distortion.
Face under translation and zoom
(Panels: Data, MDS, Isomap, C-Isomap, LLE, GTM.)
Curvature in LLE vs. Isomap
• LLE:
  +/− Approach: look only at local structure, ignoring global structure.
  − Asymptotics: unknown.
  + Nonconformal maps: good for some, but not all.
• Isomap:
  +/− Approach: explicitly estimate, and factor out, local metric distortion (assuming uniform density).
  + Asymptotics: succeeds for all conformal mappings.
  + Nonconformal maps: good for some, but not all.