![Page 1: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/1.jpg)
Similarities, Distances and Manifold Learning
Prof. Richard C. Wilson
Dept. of Computer Science, University of York
![Page 2: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/2.jpg)
Part I: Euclidean Space
– Position, Similarity and Distance
– Manifold Learning in Euclidean space
– Some famous techniques
Part II: Non-Euclidean Manifolds
– Assessing Data
– Nature and Properties of Manifolds
– Data Manifolds
– Learning some special types of manifolds
Part III: Advanced Techniques
– Methods for intrinsically curved manifolds
Thanks to Edwin Hancock, Eliza Xu and Bob Duin for contributions, and for support from the EU SIMBAD project
![Page 3: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/3.jpg)
Part I: Euclidean Space
![Page 4: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/4.jpg)
Position
The main arena for pattern recognition and machine learning problems is vector space
– A set of n well-defined features collected into a vector x ∈ ℝⁿ
Also defined are addition of vectors and multiplication by a scalar
Feature vector → position
![Page 5: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/5.jpg)
Similarity
To make meaningful progress, we need a notion of similarity: the inner product
• The inner product ⟨x,y⟩ can be considered to be a similarity between x and y
$$\langle \mathbf{x}, \mathbf{y} \rangle = \sum_i x_i y_i$$
In Euclidean space, position, similarity and distance are all neatly connected
Position: $\mathbf{x}$
Similarity: $\langle \mathbf{x}, \mathbf{y} \rangle = \sum_i x_i y_i$
(squared) Distance: $d^2(\mathbf{x}, \mathbf{y}) = \langle \mathbf{x}-\mathbf{y}, \mathbf{x}-\mathbf{y} \rangle$
![Page 6: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/6.jpg)
The Golden Trio
• In Euclidean space, the concepts of position, similarity and distance are elegantly connected
[Diagram: Position X, Similarity K, Distance D]
![Page 7: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/7.jpg)
Point position matrix
• In a normal manifold learning problem, we have a set of samples X={x1,x2,...,xm}
• These can be collected together in a matrix X
$$\mathbf{X} = \begin{pmatrix} \mathbf{x}_1^T \\ \mathbf{x}_2^T \\ \vdots \\ \mathbf{x}_m^T \end{pmatrix}$$
I use this convention (one sample per row), but others may write them vertically
![Page 8: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/8.jpg)
Centring
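A common and important operation is centring: moving the mean to the origin
– Centred points behave better
– $\mathbf{J}\mathbf{X}/m$ is the mean matrix, so $\mathbf{X} - \mathbf{J}\mathbf{X}/m$ is the centred matrix
– J is the all-ones matrix
This can be done with C
$$\mathbf{C} = \mathbf{I} - \mathbf{J}/m, \qquad \mathbf{X} - \mathbf{J}\mathbf{X}/m = \mathbf{C}\mathbf{X}$$
– C is the centring matrix (and is symmetric, $\mathbf{C} = \mathbf{C}^T$)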
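The centring operation can be checked numerically. This is a minimal NumPy sketch, assuming samples are stored one per row; the function name `centring_matrix` and the toy data are illustrative and not from the slides.

```python
import numpy as np

def centring_matrix(m: int) -> np.ndarray:
    """Return C = I - J/m, where J is the m-by-m all-ones matrix."""
    return np.eye(m) - np.ones((m, m)) / m

# Hypothetical toy data: m = 4 samples (rows), n = 2 features (columns).
X = np.array([[1.0, 2.0], [3.0, 0.0], [5.0, 4.0], [7.0, 2.0]])
C = centring_matrix(X.shape[0])
X_centred = C @ X  # equivalent to subtracting the column means from X
```

For large m it is usually cheaper to subtract the column means directly than to build C explicitly.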
![Page 9: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/9.jpg)
Position-Similarity
• The similarity matrix K is defined as
$$K_{ij} = \langle \mathbf{x}_i, \mathbf{x}_j \rangle$$
• From the definition of X, we simply get
$$\mathbf{K} = \mathbf{X}\mathbf{X}^T$$
• The Gram matrix is the similarity matrix of the centred points (from the definition of X)
$$\mathbf{K}_c = \mathbf{C}\mathbf{X}\mathbf{X}^T\mathbf{C}^T = \mathbf{C}\mathbf{K}\mathbf{C}$$
– i.e. a centring operation on K
• K_c is really a kernel matrix for the points (linear kernel)
[Diagram: Position X, Similarity K]
![Page 10: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/10.jpg)
Position-Similarity
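• To go from K to X, we need to consider the eigendecomposition of K
$$\mathbf{K} = \mathbf{U}\boldsymbol{\Lambda}\mathbf{U}^T = \mathbf{X}\mathbf{X}^T$$
• As long as we can take the square root of Λ then we can find X as
$$\mathbf{X} = \mathbf{U}\boldsymbol{\Lambda}^{1/2}$$
[Diagram: Position X, Similarity K]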
![Page 11: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/11.jpg)
Kernel embedding
Finds a Euclidean manifold from object similarities
$$\mathbf{K} = \mathbf{U}\boldsymbol{\Lambda}\mathbf{U}^T, \qquad \mathbf{X} = \mathbf{U}\boldsymbol{\Lambda}^{1/2}$$
• Embeds a kernel matrix into a set of points in Euclidean space (the points are automatically centred)
• K must have no negative eigenvalues, i.e. it is a kernel matrix (Mercer condition)
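A minimal NumPy sketch of kernel embedding, X = UΛ^{1/2}, applied to a centred linear kernel. The function name `kernel_embedding`, the tolerance `tol` and the toy data are assumptions made for illustration.

```python
import numpy as np

def kernel_embedding(K: np.ndarray, tol: float = 1e-10) -> np.ndarray:
    """Return X = U Lambda^{1/2} for a symmetric positive semidefinite K."""
    eigvals, U = np.linalg.eigh(K)  # eigenvalues in ascending order
    if eigvals.min() < -tol * max(1.0, np.abs(eigvals).max()):
        raise ValueError("K has negative eigenvalues, so it is not a kernel matrix")
    order = np.argsort(eigvals)[::-1]        # largest eigenvalue first
    eigvals, U = eigvals[order], U[:, order]
    eigvals = np.clip(eigvals, 0.0, None)    # remove tiny negative round-off
    return U * np.sqrt(eigvals)              # scales column k by sqrt(lambda_k)

# Hypothetical toy data: 4 samples in 2 dimensions, one sample per row.
X = np.array([[1.0, 2.0], [3.0, 0.0], [5.0, 4.0], [7.0, 2.0]])
m = X.shape[0]
C = np.eye(m) - np.ones((m, m)) / m
K_c = C @ X @ X.T @ C                        # centred linear kernel
Y = kernel_embedding(K_c)                    # embedded points, one per row
```

The embedded points reproduce the kernel (Y Yᵀ equals K_c up to round-off) but may differ from the original centred points by a rotation or reflection.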
![Page 12: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/12.jpg)
Similarity-Distance
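[Diagram: Similarity K, Distance D]
$$d^2(\mathbf{x}_i, \mathbf{x}_j) = \langle \mathbf{x}_i - \mathbf{x}_j, \mathbf{x}_i - \mathbf{x}_j \rangle = \langle \mathbf{x}_i, \mathbf{x}_i \rangle + \langle \mathbf{x}_j, \mathbf{x}_j \rangle - 2\langle \mathbf{x}_i, \mathbf{x}_j \rangle$$
$$D_{s,ij} = K_{ii} + K_{jj} - 2K_{ij}$$
• We can easily determine D_s (the matrix of squared distances) from K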
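The relation D_s,ij = K_ii + K_jj − 2K_ij maps directly onto array broadcasting. A small sketch, with an illustrative function name and toy points that are not from the slides:

```python
import numpy as np

def squared_distances_from_kernel(K: np.ndarray) -> np.ndarray:
    """Return D_s with D_s[i, j] = K[i, i] + K[j, j] - 2 K[i, j]."""
    diag = np.diag(K)
    return diag[:, None] + diag[None, :] - 2.0 * K

# Hypothetical toy data: three collinear points, 5 units apart.
X = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
K = X @ X.T
D_s = squared_distances_from_kernel(K)
```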
![Page 13: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/13.jpg)
Similarity-Distance
What about finding K from D_s?
$$D_{s,ij} = K_{ii} + K_{jj} - 2K_{ij}$$
Looking at the top equation, we might imagine that K = −½ D_s is a suitable choice
• Not centred; the relationship is actually
$$\mathbf{K} = -\tfrac{1}{2}\mathbf{C}\mathbf{D}_s\mathbf{C}$$
![Page 14: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/14.jpg)
Classic MDS
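• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean space
• Using what we have so far, the algorithm is simple
– Compute the kernel: $\mathbf{K} = -\tfrac{1}{2}\mathbf{C}\mathbf{D}_s\mathbf{C}$
– Eigendecompose the kernel: $\mathbf{K} = \mathbf{U}\boldsymbol{\Lambda}\mathbf{U}^T$
– Embed the kernel: $\mathbf{X} = \mathbf{U}\boldsymbol{\Lambda}^{1/2}$
• This is MDS
[Diagram: Position X, Distance D]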
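The three MDS steps (compute the kernel from the squared distances, eigendecompose it, embed it) can be sketched in NumPy as follows. The function name `classical_mds`, the choice to keep only the top `n_components` eigenvalues, and the toy data are assumptions for illustration.

```python
import numpy as np

def classical_mds(D_s: np.ndarray, n_components: int) -> np.ndarray:
    """Embed a squared-distance matrix D_s into n_components dimensions."""
    m = D_s.shape[0]
    C = np.eye(m) - np.ones((m, m)) / m
    K = -0.5 * C @ D_s @ C                       # 1. compute the kernel
    eigvals, U = np.linalg.eigh(K)               # 2. eigendecompose the kernel
    top = np.argsort(eigvals)[::-1][:n_components]
    lam = np.clip(eigvals[top], 0.0, None)       # guard against round-off
    return U[:, top] * np.sqrt(lam)              # 3. embed: X = U Lambda^{1/2}

# Hypothetical toy data: four points in the plane.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 1.0]])
D_s = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
Y = classical_mds(D_s, n_components=2)
D_s_recovered = ((Y[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
```

For Euclidean input the recovered squared distances match the originals; the points themselves are recovered only up to rotation, reflection and translation.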
![Page 15: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/15.jpg)
The Golden Trio
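[Diagram: Position X, Similarity K and Distance D; kernel embedding leads from K to X and MDS leads from D to X]
$$D_{s,ij} = K_{ii} + K_{jj} - 2K_{ij}$$
$$\mathbf{K} = -\tfrac{1}{2}\mathbf{C}\mathbf{D}_s\mathbf{C}$$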
![Page 16: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/16.jpg)
Kernel methods
• A kernel is a function k(i,j) which computes an inner product
$$k(i,j) = \langle \mathbf{x}_i, \mathbf{x}_j \rangle$$
– But without needing to know the actual points (the space is implicit)
• Using a kernel function we can directly compute K without knowing X
[Diagram: Position X, Similarity K, Distance D, with a kernel function supplying K directly]
![Page 17: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/17.jpg)
Kernel methods
• The implied space may be very high dimensional, but a true kernel will always produce a positive semidefinite K and the implied space will be Euclidean
• Many (most?) PR algorithms can be kernelized
– Made to use K rather than X or D
• The trick is to note that any interesting vector should lie in the space spanned by the examples we are given
• Hence it can be written as a linear combination
$$\mathbf{u} = \alpha_1\mathbf{x}_1 + \alpha_2\mathbf{x}_2 + \dots + \alpha_m\mathbf{x}_m = \mathbf{X}^T\boldsymbol{\alpha}$$
• Look for α instead of u
![Page 18: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/18.jpg)
Kernel PCA
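• What about PCA? PCA solves the following problem (over unit-length u)
$$\mathbf{u}^* = \arg\max_{\|\mathbf{u}\|=1} \mathbf{u}^T\boldsymbol{\Sigma}\mathbf{u} = \arg\max_{\|\mathbf{u}\|=1} \frac{1}{n}\mathbf{u}^T\mathbf{X}^T\mathbf{X}\mathbf{u}$$
• Let’s kernelize:
$$\frac{1}{n}\mathbf{u}^T\mathbf{X}^T\mathbf{X}\mathbf{u} = \frac{1}{n}(\mathbf{X}^T\boldsymbol{\alpha})^T\mathbf{X}^T\mathbf{X}(\mathbf{X}^T\boldsymbol{\alpha}) = \frac{1}{n}\boldsymbol{\alpha}^T\mathbf{X}\mathbf{X}^T\mathbf{X}\mathbf{X}^T\boldsymbol{\alpha} = \frac{1}{n}\boldsymbol{\alpha}^T\mathbf{K}^2\boldsymbol{\alpha}$$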
![Page 19: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/19.jpg)
Kernel PCA
• K² has the same eigenvectors as K, so the eigenvectors of PCA are the same as the eigenvectors of K
• The eigenvalues of PCA are related to the eigenvalues of K by
$$\lambda_{\text{PCA}} = \frac{1}{n}\lambda_K^2$$
• Kernel PCA is a kernel embedding with an externally provided kernel matrix
![Page 20: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/20.jpg)
Kernel PCA
• So kernel PCA gives the same solution as kernel embedding
– The eigenvalues are modified a bit
• They are essentially the same thing in Euclidean space
• MDS uses the kernel and kernel embedding
• MDS and PCA are essentially the same thing in Euclidean space
• Kernel embedding, MDS and PCA all give the same answer for a set of points in Euclidean space
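The equivalence can be checked on a small example. This sketch, using hypothetical random data, compares PCA scores with the kernel embedding of the centred linear kernel, and confirms that the kernel MDS derives from the squared distances is that same kernel. Columns agree only up to sign, because eigenvectors have an arbitrary sign.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))            # toy data: 6 samples, 3 features
Xc = X - X.mean(axis=0)                # centred points
m = Xc.shape[0]

# PCA: project the centred points onto the covariance eigenvectors.
cov = Xc.T @ Xc / m
_, V = np.linalg.eigh(cov)
pca_scores = Xc @ V[:, ::-1]           # largest-variance direction first

# Kernel embedding of the linear kernel K = Xc Xc^T.
K = Xc @ Xc.T
lam, U = np.linalg.eigh(K)
top = np.argsort(lam)[::-1][:3]
embedding = U[:, top] * np.sqrt(lam[top])

# MDS starts from squared distances and recovers the same kernel.
diag = np.diag(K)
D_s = diag[:, None] + diag[None, :] - 2.0 * K
C = np.eye(m) - np.ones((m, m)) / m
K_from_D = -0.5 * C @ D_s @ C

same_up_to_sign = np.allclose(np.abs(pca_scores), np.abs(embedding))
```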
![Page 21: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/21.jpg)
Some useful observations
• Your similarity matrix is Euclidean iff it has no negative eigenvalues (i.e. it is a kernel matrix and PSD)
• By similar reasoning, your distance matrix is Euclidean iff the similarity matrix derived from it is PSD
• If the feature space is small but the number of samples is large, then the covariance matrix is small and it is better to do normal PCA (on the covariance matrix)
• If the feature space is large and the number of samples is small, then the kernel matrix will be small and it is better to do kernel embedding
![Page 22: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/22.jpg)
Part II: Non-Euclidean Manifolds
![Page 23: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/23.jpg)
Non-linear data
• Much of the data in computer vision lies in a high-dimensional feature space but is constrained in some way
– The space of all images of a face is a subspace of the space of all possible images
– The subspace is highly non-linear but low dimensional (described by a few parameters)
![Page 24: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/24.jpg)
Non-linear data
• This cannot be exploited by linear subspace methods like PCA
– These assume that the subspace is a Euclidean space as well
• A classic example is the ‘swiss roll’ data:
![Page 25: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/25.jpg)
‘Flat’ Manifolds
• Fundamentally different types of data, for example:
• The embedding of this data into the high-dimensional space is highly curved
– This is called extrinsic curvature, the curvature of the manifold with respect to the embedding space
• Now imagine that this manifold was a piece of paper; you could unroll the paper into a flat plane without distorting it
– No intrinsic curvature, in fact it is homeomorphic to Euclidean space
![Page 26: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/26.jpg)
Curved manifold
• This manifold is different:
• It must be stretched to map it onto a plane
– It has non-zero intrinsic curvature
• A flatlander living on this manifold can tell that it is curved, for example by measuring the ratio of the radius to the circumference of a circle
• In the first case, we might still hope to find a Euclidean embedding
• We can never find a distortion-free Euclidean embedding of the second (in the sense that the distances will always have errors)
![Page 27: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/27.jpg)
Intrinsically Euclidean Manifolds
• We cannot use the previous methods on the second type of manifold, but there is still hope for the first
• The manifold is embedded in Euclidean space, but Euclidean distance is not the correct way to measure distance
• The Euclidean distance ‘shortcuts’ the manifold
• The geodesic distance calculates the shortest path along the manifold
![Page 28: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/28.jpg)
Geodesics
• The geodesic generalizes the concept of distance to curved manifolds
– The shortest path joining two points which lies completely within the manifold
• If we can correctly compute the geodesic distances, and the manifold is intrinsically flat, we should get Euclidean distances which we can plug into our Euclidean geometry machine
[Diagram: geodesic distances supply Distance D, which connects to Similarity K and Position X]
![Page 29: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/29.jpg)
ISOMAP
• ISOMAP is exactly such an algorithm
• Approximate geodesic distances are computed for the points from a graph
• Nearest neighbours graph
– For neighbours, Euclidean distance ≈ geodesic distance
– For non-neighbours, geodesic distance is approximated by the shortest distance in the graph
• Once we have distances D, we can use MDS to find a Euclidean embedding
![Page 30: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/30.jpg)
ISOMAP
• ISOMAP:
– Neighbourhood graph
– Shortest path algorithm
– MDS
• ISOMAP is distance-preserving: embedded distances should be close to geodesic distances
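The three ISOMAP stages (neighbourhood graph, shortest paths, MDS) can be sketched with NumPy alone. This is an illustrative sketch, not a reference implementation: the function name `isomap`, the use of a symmetrised k-nearest-neighbour graph and of a vectorised Floyd-Warshall step for the shortest paths are all assumptions, and the O(m³) cost makes it suitable only for small data sets.

```python
import numpy as np

def isomap(X: np.ndarray, n_neighbours: int, n_components: int):
    """Return (embedding, graph_distances) for samples stored one per row."""
    m = X.shape[0]
    D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1))

    # 1. Neighbourhood graph: keep Euclidean distances between neighbours only.
    G = np.full((m, m), np.inf)
    np.fill_diagonal(G, 0.0)
    nearest = np.argsort(D, axis=1)[:, 1:n_neighbours + 1]
    for i in range(m):
        G[i, nearest[i]] = D[i, nearest[i]]
        G[nearest[i], i] = D[i, nearest[i]]      # make the graph symmetric

    # 2. Shortest paths (Floyd-Warshall) approximate the geodesic distances.
    for k in range(m):
        G = np.minimum(G, G[:, k:k + 1] + G[k:k + 1, :])
    if np.isinf(G).any():
        raise ValueError("graph is disconnected; try a larger n_neighbours")

    # 3. Classic MDS on the squared graph distances.
    C = np.eye(m) - np.ones((m, m)) / m
    K = -0.5 * C @ (G ** 2) @ C
    lam, U = np.linalg.eigh(K)
    top = np.argsort(lam)[::-1][:n_components]
    return U[:, top] * np.sqrt(np.clip(lam[top], 0.0, None)), G

# Hypothetical toy data: 20 points on a semicircle (a curved 1D manifold in 2D).
t = np.linspace(0.0, np.pi, 20)
X_arc = np.column_stack([np.cos(t), np.sin(t)])
Y_arc, G_arc = isomap(X_arc, n_neighbours=2, n_components=1)
```

On this arc the Euclidean distance between the two end points is 2, while the graph distance is close to the arc length π, and the 1D embedding places the end points roughly that far apart.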
![Page 31: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/31.jpg)
Laplacian Eigenmap
• The Laplacian Eigenmap is another graph-based method of embedding non-linear manifolds into Euclidean space
• As with ISOMAP, form a neighbourhood graph for the datapoints
• Find the graph Laplacian as follows
• The adjacency matrix A is
$$A_{ij} = \begin{cases} e^{-d_{ij}^2/t} & \text{if } i \text{ and } j \text{ are connected} \\ 0 & \text{otherwise} \end{cases}$$
• The ‘degree’ matrix D is the diagonal matrix
$$D_{ii} = \sum_j A_{ij}$$
• The normalized graph Laplacian is
$$\mathbf{L} = \mathbf{I} - \mathbf{D}^{-1/2}\mathbf{A}\mathbf{D}^{-1/2}$$
![Page 32: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/32.jpg)
Laplacian Eigenmap
• We find the Laplacian eigenmap embedding using the eigendecomposition of L
$$\mathbf{L} = \mathbf{U}\boldsymbol{\Lambda}\mathbf{U}^T$$
• The embedded positions are
$$\mathbf{X} = \mathbf{D}^{-1/2}\mathbf{U}$$
• Similar to ISOMAP
– Structure preserving not distance preserving
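A minimal NumPy sketch of the Laplacian eigenmap: heat-kernel adjacency on a neighbourhood graph, the normalized Laplacian, and X = D^{-1/2}U. The slides do not say which eigenvectors to keep; taking those with the smallest non-zero eigenvalues (skipping the trivial one with eigenvalue 0) is the usual convention and is an assumption here, as are the function name, the symmetrised k-nearest-neighbour graph and the default t.

```python
import numpy as np

def laplacian_eigenmap(X, n_neighbours, n_components, t=1.0):
    """Return (embedding, L) for samples stored one per row."""
    m = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)

    # Symmetric k-nearest-neighbour graph.
    nearest = np.argsort(d2, axis=1)[:, 1:n_neighbours + 1]
    connected = np.zeros((m, m), dtype=bool)
    connected[np.repeat(np.arange(m), n_neighbours), nearest.ravel()] = True
    connected = connected | connected.T

    # Adjacency A_ij = exp(-d_ij^2 / t) on edges, 0 elsewhere.
    A = np.where(connected, np.exp(-d2 / t), 0.0)
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))    # diagonal of D^{-1/2}

    # Normalized Laplacian L = I - D^{-1/2} A D^{-1/2}.
    L = np.eye(m) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    _, U = np.linalg.eigh(L)                     # ascending eigenvalues

    # Assumption: skip the trivial eigenvector, keep the next n_components.
    return d_inv_sqrt[:, None] * U[:, 1:n_components + 1], L

# Hypothetical toy data: 20 points on a semicircle.
t_arc = np.linspace(0.0, np.pi, 20)
X_arc = np.column_stack([np.cos(t_arc), np.sin(t_arc)])
Y_lap, L_lap = laplacian_eigenmap(X_arc, n_neighbours=2, n_components=1)
lam_lap = np.linalg.eigvalsh(L_lap)
```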
![Page 33: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/33.jpg)
Locally-Linear Embedding
• Locally-linear Embedding is another classic method which also begins with a neighbourhood graph
• We make point i (in the original data) from a weighted sum of the neighbouring points
$$\hat{\mathbf{x}}_i = \sum_j W_{ij}\mathbf{x}_j$$
• W_ij is 0 for any point j not in the neighbourhood (and for i=j)
• We find the weights by minimising the reconstruction error
$$\min_{\mathbf{W}} \sum_i |\mathbf{x}_i - \hat{\mathbf{x}}_i|^2$$
– Subject to the constraints that the weights are non-negative and sum to 1
$$W_{ij} \ge 0, \qquad \sum_j W_{ij} = 1$$
• Gives a relatively simple closed-form solution
![Page 34: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/34.jpg)
Locally-Linear Embedding
• These weights encode how well a point j represents a point i and can be interpreted as the adjacency between i and j
• A low dimensional embedding is found by then finding points to minimise the error
$$\min_{\mathbf{y}} \sum_i |\mathbf{y}_i - \hat{\mathbf{y}}_i|^2, \qquad \hat{\mathbf{y}}_i = \sum_j W_{ij}\mathbf{y}_j$$
• In other words, we find a low-dimensional embedding which preserves the adjacency relationships
• The solution to this embedding problem turns out to be simply the eigenvectors of the matrix M
$$\mathbf{M} = (\mathbf{I} - \mathbf{W})^T(\mathbf{I} - \mathbf{W})$$
• LLE is scale-free: the final points have the covariance matrix I
– Unit scale
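A rough NumPy sketch of LLE under stated assumptions. It uses the common closed-form weight solution, which enforces only the sum-to-one constraint (the non-negativity constraint is dropped here), adds a small regularisation term `reg` to the local Gram matrix for numerical stability, and keeps the eigenvectors of M with the smallest non-zero eigenvalues. The function name and toy data are illustrative.

```python
import numpy as np

def lle(X, n_neighbours, n_components, reg=1e-3):
    """Return (embedding, W, M) for samples stored one per row."""
    m = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    nearest = np.argsort(d2, axis=1)[:, 1:n_neighbours + 1]

    W = np.zeros((m, m))
    for i in range(m):
        Z = X[nearest[i]] - X[i]                 # neighbours relative to x_i
        G = Z @ Z.T                              # local Gram matrix
        G = G + reg * max(np.trace(G), 1e-12) * np.eye(n_neighbours)
        w = np.linalg.solve(G, np.ones(n_neighbours))
        W[i, nearest[i]] = w / w.sum()           # weights sum to 1

    I = np.eye(m)
    M = (I - W).T @ (I - W)
    _, U = np.linalg.eigh(M)                     # ascending eigenvalues
    # Assumption: skip the constant eigenvector (eigenvalue 0).
    return U[:, 1:n_components + 1], W, M

# Hypothetical toy data: 20 points on a semicircle.
t_arc = np.linspace(0.0, np.pi, 20)
X_arc = np.column_stack([np.cos(t_arc), np.sin(t_arc)])
Y_lle, W_lle, M_lle = lle(X_arc, n_neighbours=2, n_components=1)
```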
![Page 35: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/35.jpg)
Comparison
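• LLE might seem like quite a different process to the previous two, but it is actually very similar
• We can interpret the process as producing a kernel matrix followed by scale-free kernel embedding
$$\mathbf{K} = (\lambda_k - 1)\mathbf{I} - \frac{\lambda_k}{n}\mathbf{J} + \mathbf{W} + \mathbf{W}^T - \mathbf{W}^T\mathbf{W}$$
$$\mathbf{K} = \mathbf{U}\boldsymbol{\Lambda}\mathbf{U}^T, \qquad \mathbf{X} = \mathbf{U}$$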
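| | ISOMAP | Lap. Eigenmap | LLE |
|---|---|---|---|
| Representation | Neighbourhood graph | Neighbourhood graph | Neighbourhood graph |
| Similarity matrix | From geodesic distances | Graph Laplacian | Reconstruction weights |
| Embedding | $\mathbf{X} = \mathbf{U}\boldsymbol{\Lambda}^{1/2}$ | $\mathbf{X} = \mathbf{D}^{-1/2}\mathbf{U}$ | $\mathbf{X} = \mathbf{U}$ |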
![Page 36: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/36.jpg)
Comparison
• ISOMAP is the only method which directly computes and uses the geodesic distances
– The other two depend indirectly on the distances through local structure
• LLE is scale-free, so the original distance scale is lost, but the local structure is preserved
• Computing the necessary local dimensionality to find the correct nearest neighbours is a problem for all such methods
![Page 37: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/37.jpg)
Part II: Indefinite Similarities
![Page 38: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/38.jpg)
Non-Euclidean data
• Data is Euclidean iff K is PSD
• Unless you are using a kernel function, this is often not true
• Why does this happen?
![Page 39: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/39.jpg)
What type of data do I have?
• Starting point: distance matrix
• However we do not know a priori if our measurements are representable on a manifold
– We will call them dissimilarities
• Our starting point to answer the question “What type of data do I have?” will be a matrix of dissimilarities D between objects
• Types of dissimilarities
– Euclidean (no intrinsic curvature)
– Non-Euclidean, metric (curved manifold)
– Non-metric (no point-like manifold representation)
![Page 40: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/40.jpg)
Causes
• Example: Chicken pieces data
• Distance by alignment
• Global alignment of everything could find Euclidean distances
• Only local alignments are practical
![Page 41: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/41.jpg)
Causes
Dissimilarities may also be non-metric
The data is metric if it obeys the metric conditions
1. Dij ≥ 0 (non-negativity)
2. Dij = 0 iff i=j (identity of indiscernibles)
3. Dij = Dji (symmetry)
4. Dij ≤ Dik + Dkj (triangle inequality)
Reasonable dissimilarities should meet 1 & 2
![Page 42: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/42.jpg)
Causes
• Symmetry Dij= Dji
• May not be symmetric by definition
• Alignment: i→j may find a better solution than j→i
![Page 43: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/43.jpg)
Causes
• Triangle violations Dij ≤ Dik + Dkj
• ‘Extended objects’
[Figure: three extended objects i, j and k]
$$D_{ik} = 0, \qquad D_{kj} = 0, \qquad D_{ij} > 0$$
• Finally, noise in the measure of D can cause all of these effects
![Page 44: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/44.jpg)
Tests(1)
• Find the similarity matrix
$$\mathbf{K} = -\tfrac{1}{2}\mathbf{C}\mathbf{D}_s\mathbf{C}$$
• The data is Euclidean iff K is positive semidefinite (no negative eigenvalues)
– K is a kernel, explicit embedding from kernel embedding
• We can then use K in a kernel algorithm
• Negative eigenfraction (NEF)
$$\text{NEF} = \frac{\sum_{\lambda_i < 0} |\lambda_i|}{\sum_i |\lambda_i|}$$
• Between 0 and 0.5
– 0 for Euclidean similarities
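The NEF test can be sketched in a few lines of NumPy. The function below takes a matrix of (unsquared) dissimilarities, forms K = −½ C D_s C and returns the fraction of eigenvalue magnitude that is negative. The function name and both toy matrices are illustrative assumptions.

```python
import numpy as np

def negative_eigenfraction(D: np.ndarray) -> float:
    """NEF of the similarity matrix derived from dissimilarities D."""
    m = D.shape[0]
    C = np.eye(m) - np.ones((m, m)) / m
    K = -0.5 * C @ (D ** 2) @ C
    lam = np.linalg.eigvalsh(K)
    return float(np.abs(lam[lam < 0]).sum() / np.abs(lam).sum())

# Euclidean example: distances between four points in the plane.
P = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 1.0]])
D_euclid = np.sqrt(((P[:, None, :] - P[None, :, :]) ** 2).sum(axis=-1))
nef_euclid = negative_eigenfraction(D_euclid)

# Non-Euclidean example: shortest-path distances around a 4-cycle.
D_cycle = np.array([[0.0, 1.0, 2.0, 1.0],
                    [1.0, 0.0, 1.0, 2.0],
                    [2.0, 1.0, 0.0, 1.0],
                    [1.0, 2.0, 1.0, 0.0]])
nef_cycle = negative_eigenfraction(D_cycle)
```

For the 4-cycle the non-zero eigenvalues of K are 2, 2 and −1, so the NEF is 1/5; for the planar points it is zero up to round-off.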
![Page 45: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/45.jpg)
Tests(2)
3. Dij = Dji (symmetry)
– Mean, maximum asymmetry
– Easy to check by looking at pairs
4. Dij ≤ Dik + Dkj (triangle inequality)
– Number, maximum violation
– Check these for your data (the triangle test involves checking all triples, which is possibly expensive)
• Metric data is embeddable on a (curved) Riemannian manifold
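Both checks can be written compactly with NumPy broadcasting. This is a sketch with illustrative function names; the triangle test builds an m×m×m array, so it is only practical for small m, and it counts ordered triples (i, j, k).

```python
import numpy as np

def asymmetry(D: np.ndarray):
    """Return (mean, maximum) of |D_ij - D_ji| over all pairs."""
    diff = np.abs(D - D.T)
    return float(diff.mean()), float(diff.max())

def triangle_violations(D: np.ndarray, tol: float = 1e-12):
    """Return (number, maximum) of violations of D_ij <= D_ik + D_kj."""
    # excess[i, j, k] = D_ij - (D_ik + D_kj); positive means a violation.
    excess = D[:, :, None] - (D[:, None, :] + D.T[None, :, :])
    return int((excess > tol).sum()), float(max(excess.max(), 0.0))

# Hypothetical asymmetric dissimilarities.
D_asym = np.array([[0.0, 1.0], [3.0, 0.0]])
mean_asym, max_asym = asymmetry(D_asym)

# Hypothetical 'extended objects' case: objects 0 and 1 both touch object 2.
D_ext = np.array([[0.0, 1.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 0.0, 0.0]])
n_viol, max_viol = triangle_violations(D_ext)
```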
![Page 46: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/46.jpg)
Determining the causes
• The negative eigenvalues
[Figure: three plots of the negative eigenvalues (vertical axis: Eigenvalue), with panels titled Noise, Extended Objects and Spherical manifold]
![Page 47: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/47.jpg)
Corrections
• If the data is non-metric or non-Euclidean, we can ‘correct it’
• Symmetry violations
– Average: $D_{ij} \leftarrow D_{ji} \leftarrow \tfrac{1}{2}(D_{ij} + D_{ji})$
– For min-cost distances, $D_{ij} \leftarrow D_{ji} \leftarrow \min(D_{ij}, D_{ji})$ may be more appropriate
• Triangle violations
– Constant offset: $D_{ij} \leftarrow D_{ij} + c \quad (i \ne j)$
– This will also remove non-Euclidean behaviour for large enough c
• Euclidean violations
– Discard negative eigenvalues
– Even when the violations are caused by noise, some information is still lost
• There are many other approaches*
* “On Euclidean corrections for non-Euclidean dissimilarities”, Duin, Pekalska, Harol, Lee and Bunke, S+SSPR 08
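The three kinds of correction are simple matrix operations. A minimal NumPy sketch with illustrative function names; how to choose the offset c is not covered here.

```python
import numpy as np

def symmetrise(D: np.ndarray, mode: str = "average") -> np.ndarray:
    """Symmetry correction: average of, or minimum over, D_ij and D_ji."""
    if mode == "average":
        return 0.5 * (D + D.T)
    if mode == "min":
        return np.minimum(D, D.T)
    raise ValueError("mode must be 'average' or 'min'")

def add_offset(D: np.ndarray, c: float) -> np.ndarray:
    """Triangle correction: add a constant c to every off-diagonal entry."""
    D = np.asarray(D, dtype=float)
    out = D + c
    np.fill_diagonal(out, np.diag(D))            # leave the diagonal unchanged
    return out

def discard_negative_eigenvalues(K: np.ndarray) -> np.ndarray:
    """Euclidean correction: rebuild K from its non-negative eigenvalues."""
    lam, U = np.linalg.eigh(K)
    return (U * np.clip(lam, 0.0, None)) @ U.T

# Hypothetical examples.
D_sym = symmetrise(np.array([[0.0, 1.0], [3.0, 0.0]]))
D_fixed = add_offset(np.array([[0.0, 1.0, 0.0],
                               [1.0, 0.0, 0.0],
                               [0.0, 0.0, 0.0]]), c=1.0)
K_psd = discard_negative_eigenvalues(np.array([[1.0, 2.0], [2.0, 1.0]]))
```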
![Page 48: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/48.jpg)
Part III: Techniques for non-Euclidean Embeddings
![Page 49: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/49.jpg)
Known Manifolds
• Sometimes we have data which lies on a known but non-Euclidean manifold
• Examples in Computer Vision
– Surface normals
– Rotation matrices
– Flow tensors (DT-MRI)
• This is not Manifold Learning, as we already know what the manifold is
• What tools do we need to be able to process data like this?
– As before, distances are the key
![Page 50: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/50.jpg)
Example: 2D direction
Direction of an edge in an image, encoded as a unit vector
[Figure: unit vectors x_1 and x_2 on the circle, and their average x̄]
$$\bar{\mathbf{x}} = \frac{1}{n}\sum_i \mathbf{x}_i$$
The average of the direction vectors isn’t even a direction vector (not unit length), let alone the correct ‘average’ direction
The normal definition of mean is not correct
– Because the manifold is curved
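A two-vector example makes the problem concrete. The angles below are hypothetical; the point is only that the arithmetic mean of unit vectors has length less than 1.

```python
import math

# Two edge directions, 90 degrees apart, encoded as unit vectors.
angles = [0.0, math.pi / 2]
vectors = [(math.cos(a), math.sin(a)) for a in angles]

# Ordinary (Euclidean) mean of the vectors.
mean = (sum(v[0] for v in vectors) / len(vectors),
        sum(v[1] for v in vectors) / len(vectors))
mean_length = math.hypot(*mean)  # about 0.707, so not a unit vector
```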
![Page 51: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/51.jpg)
Tangent space
• The tangent space (T_P) is the Euclidean space which is parallel to the manifold (M) at a particular point (P)
[Figure: manifold M with the tangent space T_P at the point P]
• The tangent space is a very useful tool because it is Euclidean
![Page 52: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/52.jpg)
Exponential Map
• Exponential map:
$$\text{Exp}_P : T_P \to M, \qquad A = \text{Exp}_P X$$
• Exp_P maps a point X on the tangent plane onto a point A on the manifold
– P is the centre of the mapping and is at the origin on the tangent space
– The mapping is one-to-one in a local region of P
– The most important property of the mapping is that the distances to the centre P are preserved
$$d_{T_P}(X, P) = d_M(A, P)$$
– The geodesic distance on the manifold equals the Euclidean distance on the tangent plane (for distances to the centre only)
![Page 53: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/53.jpg)
Exponential map
• The log map goes the other way, from manifold to tangent plane
$$\text{Log}_P : M \to T_P, \qquad X = \text{Log}_P A$$
![Page 54: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/54.jpg)
Exponential Map
• Example on the circle: Embed the circle in the complex plane
• A point on the circle is represented by a complex number with magnitude 1 and can be written $x + iy = \exp(i\theta)$
[Figure: unit circle in the complex plane (axes Re, Im) with the point $P = e^{i\theta_P}$]
![Page 55: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/55.jpg)
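• In this case it turns out that the map is related to the normal exp and log functions
[Figure: circle $M$ with tangent space $T_P$ at the point $P = e^{i\theta_P}$]
$$X = \mathrm{Log}_P A = -i\log\frac{A}{P} = -i\log\frac{e^{i\theta_A}}{e^{i\theta_P}} = \theta_A - \theta_P$$
$$A = \mathrm{Exp}_P X = P\exp(iX) = \exp(i\theta_P)\exp\big(i(\theta_A - \theta_P)\big) = e^{i\theta_A}$$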
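The circle maps can be tried out directly with Python's complex arithmetic. This is a minimal sketch: the function names `log_map` and `exp_map` and the two angles are illustrative choices, and `cmath.log` uses the principal branch, so the result is only the intended one when the two angles differ by less than π.

```python
import cmath

def log_map(P, A):
    """Tangent coordinate X of A about centre P: X = -i log(A / P)."""
    return (-1j * cmath.log(A / P)).real

def exp_map(P, X):
    """Point on the circle reached from P by tangent coordinate X: A = P exp(iX)."""
    return P * cmath.exp(1j * X)

theta_P, theta_A = 0.3, 1.1
P = cmath.exp(1j * theta_P)
A = cmath.exp(1j * theta_A)

X = log_map(P, A)        # equals theta_A - theta_P
A_back = exp_map(P, X)   # recovers A
```

Here `X` is the arc length from `P` to `A`, which is the distance-preservation property of the map in its simplest form.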
![Page 56: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/56.jpg)
Intrinsic mean
• The mean of a set of samples is usually defined as the sum of the samples divided by the number
– This is only true in Euclidean space
• A more general formula
$$\bar{x} = \arg\min_{x} \sum_i d_g^2(x, x_i)$$
• Minimises the squared distances from the mean to the samples (equivalent in Euclidean space)
![Page 57: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/57.jpg)
Intrinsic mean
• We can compute this intrinsic mean using the exponential map
• If we knew what the mean was, then we could use the mean as the centre of a map
$$X_i = \mathrm{Log}_M A_i$$
• From the properties of the Exp-map, the distances are the same
$$d_e(X_i, M) = d_g(A_i, M)$$
• So the mean on the tangent plane is equal to the mean on the manifold
![Page 58: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/58.jpg)
Intrinsic mean
• Start with a guess at the mean and move towards the correct answer
• This gives us the following algorithm
– Guess at a mean $M_0$
1. Map on to the tangent plane using $M_i$
$$X_k = \mathrm{Log}_{M_i} A_k$$
2. Compute the mean on the tangent plane to get the new estimate $M_{i+1}$
$$M_{i+1} = \mathrm{Exp}_{M_i}\left(\frac{1}{n}\sum_k X_k\right)$$
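As a sketch of the iteration, here it is on the unit circle, where the Log and Exp maps reduce to the complex log and exp. The sample angles, the fixed iteration count and the function names are illustrative assumptions, not part of the slides.

```python
import cmath

def log_map(M, A):
    """X = Log_M A on the unit circle (principal branch of the complex log)."""
    return (-1j * cmath.log(A / M)).real

def exp_map(M, X):
    """A = Exp_M X on the unit circle."""
    return M * cmath.exp(1j * X)

def intrinsic_mean(samples, guess, iterations=20):
    M = guess
    for _ in range(iterations):
        # 1. Map the samples onto the tangent plane at the current estimate.
        tangent = [log_map(M, A) for A in samples]
        # 2. Average there and map the result back onto the manifold.
        M = exp_map(M, sum(tangent) / len(tangent))
    return M

angles = [2.8, 3.0, -3.1]   # directions straddling the +/- pi wrap-around
samples = [cmath.exp(1j * a) for a in angles]
M = intrinsic_mean(samples, guess=samples[0])
mean_angle = cmath.phase(M)
```

The arithmetic mean of the raw angles would be 0.9 rad, pointing almost the opposite way; the iteration instead settles close to 2.99 rad, in the middle of the cluster.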
![Page 59: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/59.jpg)
Intrinsic Mean
• For many manifolds, this procedure will converge to the intrinsic mean
– Convergence not always guaranteed
• Other statistics and probability distributions on manifolds are problematic.
– Can hypothesise a normal distribution on the tangent plane, but distortions are inevitable
![Page 60: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/60.jpg)
Some useful manifolds and maps
• Some useful manifolds and exponential maps
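• Directional vectors (surface normals etc.)
$$\langle a, a\rangle = 1$$
$$x = \frac{\theta}{\sin\theta}\,(a - p\cos\theta) \qquad \text{(Log map)}$$
$$a = p\cos\theta + \frac{\sin\theta}{\theta}\,x \qquad \text{(Exp map)}$$
• $a$, $p$ unit vectors, $x$ lies in an $(n-1)$D space; $\theta$ is the angle between $a$ and $p$, equal to $\lVert x\rVert$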
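A minimal NumPy sketch of the directional-vector maps, taking the Log map as x = (θ / sin θ)(a − p cos θ) and the Exp map as a = p cos θ + (sin θ / θ) x, with θ the angle between a and p. The function names and test vectors are illustrative; the antipodal case θ = π, where the Log map is not unique, is not handled.

```python
import numpy as np

def sphere_log(p, a):
    """Tangent vector x at p pointing towards a, with |x| = geodesic distance."""
    cos_t = np.clip(np.dot(a, p), -1.0, 1.0)
    theta = np.arccos(cos_t)
    if theta < 1e-12:
        return np.zeros_like(p)          # a coincides with p
    return (theta / np.sin(theta)) * (a - p * cos_t)

def sphere_exp(p, x):
    """Point on the unit sphere reached from p along tangent vector x."""
    theta = np.linalg.norm(x)
    if theta < 1e-12:
        return p.copy()
    return p * np.cos(theta) + (np.sin(theta) / theta) * x

p = np.array([0.0, 0.0, 1.0])
a = np.array([1.0, 1.0, 1.0]) / np.sqrt(3.0)

x = sphere_log(p, a)
a_back = sphere_exp(p, x)
geodesic = np.arccos(np.dot(a, p))
```

The tangent vector `x` is orthogonal to `p`, its length equals the geodesic distance from `p` to `a`, and mapping it back recovers `a`.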
![Page 61: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/61.jpg)
Some useful manifolds and maps
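• Symmetric positive definite matrices (covariance, flow tensors etc)
$$u^T A u > 0 \quad \forall\, u \neq 0$$
$$X = P^{\frac{1}{2}} \log\!\left(P^{-\frac{1}{2}} A P^{-\frac{1}{2}}\right) P^{\frac{1}{2}} \qquad \text{(Log map)}$$
$$A = P^{\frac{1}{2}} \exp\!\left(P^{-\frac{1}{2}} X P^{-\frac{1}{2}}\right) P^{\frac{1}{2}} \qquad \text{(Exp map)}$$
• $A$ is symmetric positive definite, $X$ is just symmetric
• log is the matrix log defined as a generalized matrix function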
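For symmetric matrices the generalized matrix functions can be computed through an eigendecomposition. The sketch below assumes the maps take the form X = P^{1/2} log(P^{-1/2} A P^{-1/2}) P^{1/2} and A = P^{1/2} exp(P^{-1/2} X P^{-1/2}) P^{1/2}; the helper `sym_fun` and the two example matrices are introduced here for illustration.

```python
import numpy as np

def sym_fun(S, fun):
    """Apply a scalar function to a symmetric matrix through its eigenvalues."""
    w, V = np.linalg.eigh(S)
    return (V * fun(w)) @ V.T

def spd_log(P, A):
    """X = P^{1/2} log(P^{-1/2} A P^{-1/2}) P^{1/2}."""
    P_half = sym_fun(P, np.sqrt)
    P_inv_half = sym_fun(P, lambda w: 1.0 / np.sqrt(w))
    return P_half @ sym_fun(P_inv_half @ A @ P_inv_half, np.log) @ P_half

def spd_exp(P, X):
    """A = P^{1/2} exp(P^{-1/2} X P^{-1/2}) P^{1/2}."""
    P_half = sym_fun(P, np.sqrt)
    P_inv_half = sym_fun(P, lambda w: 1.0 / np.sqrt(w))
    return P_half @ sym_fun(P_inv_half @ X @ P_inv_half, np.exp) @ P_half

P = np.array([[2.0, 0.5], [0.5, 1.0]])
A = np.array([[1.0, 0.2], [0.2, 3.0]])

X = spd_log(P, A)        # symmetric, not necessarily positive definite
A_back = spd_exp(P, X)   # recovers A
```

Whatever symmetric `X` is supplied, `spd_exp` returns a positive definite matrix, which is what makes averaging in the tangent space safe for covariance-like data.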
![Page 62: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/62.jpg)
Some useful manifolds and maps
• Orthogonal matrices (rotation matrices, eigenvector matrices)
$$A A^T = I$$
$$X = \log\!\left(P^T A\right) \qquad \text{(Log map)}$$
$$A = P \exp(X) \qquad \text{(Exp map)}$$
• $A$ orthogonal, $X$ antisymmetric ($X + X^T = 0$)
• These are the matrix exp and log functions as before
• In fact there are multiple solutions to the matrix log
– Only one is the required real antisymmetric matrix; not easy to find
– Rest are complex
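For the special case of 3D rotations, the real antisymmetric logarithm has a closed form, so the maps can be sketched without a general matrix log. This example assumes the maps are X = log(PᵀA) and A = P exp(X), uses Rodrigues' formula for the exponential, and is only valid for relative rotation angles below π; all function names are illustrative.

```python
import numpy as np

def hat(v):
    """Antisymmetric matrix built from a 3-vector."""
    x, y, z = v
    return np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])

def so3_exp(X):
    """Matrix exponential of a 3x3 antisymmetric matrix (Rodrigues' formula)."""
    theta = np.sqrt(0.5 * np.sum(X * X))     # rotation angle
    if theta < 1e-12:
        return np.eye(3) + X
    return (np.eye(3) + (np.sin(theta) / theta) * X
            + ((1.0 - np.cos(theta)) / theta**2) * (X @ X))

def so3_log(R):
    """Real antisymmetric logarithm of a rotation matrix, for angles in [0, pi)."""
    cos_t = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    theta = np.arccos(cos_t)
    if theta < 1e-12:
        return np.zeros((3, 3))
    return (theta / (2.0 * np.sin(theta))) * (R - R.T)

def rot_log(P, A):
    return so3_log(P.T @ A)       # X = log(P^T A)

def rot_exp(P, X):
    return P @ so3_exp(X)         # A = P exp(X)

P = so3_exp(hat([0.1, -0.2, 0.3]))
A = so3_exp(hat([0.4, 0.1, -0.5]))
X = rot_log(P, A)
A_back = rot_exp(P, X)
```

In higher dimensions no such shortcut is available in general, which is the difficulty the last bullet refers to.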
![Page 63: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/63.jpg)
![Page 64: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/64.jpg)
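Embedding on $S^n$
• On $S^2$ (surface of a sphere in 3D) the following parameterisation is well known
$$x = (r\sin\theta\cos\phi,\; r\sin\theta\sin\phi,\; r\cos\theta)^T$$
• The distance between two points (the length of the geodesic) is
$$d_{xy} = r\cos^{-1}\!\left[\sin\theta_x\sin\theta_y\cos(\phi_x - \phi_y) + \cos\theta_x\cos\theta_y\right]$$
[Figure: sphere with points $x$ and $y$ joined by the geodesic of length $d_{xy}$]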
![Page 65: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/65.jpg)
More Spherical Geometry
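• But on a sphere, the distance is the highlighted arc-length
– Much neater to use inner-product
– And works in any number of dimensions
$$\langle x, y\rangle = r^2\cos\theta_{xy}$$
$$d_{xy} = r\,\theta_{xy} = r\cos^{-1}\frac{\langle x, y\rangle}{r^2}$$
[Figure: points $x$ and $y$ on a circle of radius $r$, separated by angle $\theta_{xy}$, with the arc of length $r\theta_{xy}$ highlighted]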
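A short sketch of the inner-product form of the distance, d = r cos⁻¹(⟨x, y⟩ / r²). The helper `from_angles` builds test points from the polar parameterisation purely for illustration; the distance function itself works in any dimension.

```python
import numpy as np

def geodesic_distance(x, y):
    """Arc length between x and y on the sphere of radius |x| = |y| = r."""
    r2 = np.dot(x, x)                       # r^2, assuming both points share it
    cos_t = np.clip(np.dot(x, y) / r2, -1.0, 1.0)
    return np.sqrt(r2) * np.arccos(cos_t)

def from_angles(r, theta, phi):
    """Point on S^2 from polar angle theta and azimuth phi."""
    return r * np.array([np.sin(theta) * np.cos(phi),
                         np.sin(theta) * np.sin(phi),
                         np.cos(theta)])

r = 2.0
x = from_angles(r, 0.5 * np.pi, 0.0)          # on the equator
y = from_angles(r, 0.5 * np.pi, 0.5 * np.pi)  # a quarter turn further round
d = geodesic_distance(x, y)                   # quarter of a great circle
```

The two points are a quarter of a great circle apart, so `d` comes out as r·π/2.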
![Page 66: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/66.jpg)
Spherical Embedding
• Say we had the distances between some objects ($d_{ij}$), measured on the surface of a [hyper]sphere of dimension $n$
• The sphere (and objects) can be embedded into an $n+1$ dimensional space
– Let $X$ be the matrix of point positions
• $Z = XX^T$ is a kernel matrix
• But
$$Z_{ij} = \langle x_i, x_j\rangle$$
• And
$$d_{ij} = r\cos^{-1}\frac{\langle x_i, x_j\rangle}{r^2} \quad\Rightarrow\quad Z_{ij} = r^2\cos\frac{d_{ij}}{r}$$
• We can compute $Z$ from $D$ and find the spherical embedding!
![Page 67: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/67.jpg)
Spherical Embedding
• But wait, we don’t know what r is!
• The distances $D$ are non-Euclidean, and if we use the wrong radius, $Z$ is not a kernel matrix
– Negative eigenvalues
• Use this to find the radius
– Choose $r$ to minimise the negative eigenvalues of $Z(r)$ ($\lambda_0$: smallest eigenvalue)
$$r^* = \arg\min_r \left|\lambda_0[Z(r)]\right|$$
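The radius search and embedding can be sketched as follows. Several choices here are assumptions rather than the slides' method: the kernel is taken as Z_ij = r² cos(d_ij / r), the score is the magnitude of the smallest eigenvalue of Z(r), the search is a plain grid over candidate radii, and the data are synthetic points on a sphere of radius 2.

```python
import numpy as np

def kernel_from_distances(D, r):
    """Z_ij = r^2 cos(d_ij / r) for a trial radius r."""
    return r**2 * np.cos(D / r)

def find_radius(D, candidates):
    """Pick the trial radius whose smallest eigenvalue of Z(r) is closest to zero."""
    scores = [abs(np.linalg.eigvalsh(kernel_from_distances(D, r))[0])
              for r in candidates]
    return candidates[int(np.argmin(scores))]

def spherical_embedding(D, r, dim):
    """Coordinates X (one row per object) with X X^T ~ Z(r), keeping `dim` axes."""
    w, V = np.linalg.eigh(kernel_from_distances(D, r))
    top = np.argsort(w)[::-1][:dim]
    return V[:, top] * np.sqrt(np.clip(w[top], 0.0, None))

# Synthetic check: 20 points on a sphere of radius 2 in 3D.
rng = np.random.default_rng(0)
U = rng.normal(size=(20, 3))
U /= np.linalg.norm(U, axis=1, keepdims=True)
true_r = 2.0
D = true_r * np.arccos(np.clip(U @ U.T, -1.0, 1.0))

candidates = np.linspace(1.0, 4.0, 31)
r_star = find_radius(D, candidates)
X = spherical_embedding(D, r_star, dim=3)
```

With noise-free distances the correct radius makes Z(r) positive semidefinite, so its smallest eigenvalue is zero up to rounding. With real data a continuous optimiser and a score summing all negative eigenvalues may be more appropriate.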
![Page 68: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/68.jpg)
Example: Texture Mapping
• As an alternative to unwrapping object onto a plane and texture-mapping the plane
• Embed onto a sphere and texture-map the sphere
[Figure: texture mapping results, panels "Plane" and "Sphere"]
![Page 69: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/69.jpg)
Backup slides
![Page 70: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/70.jpg)
Laplacian and related processes
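• As well as embedding objects onto manifolds, we can model many interesting processes on manifolds
• Example: the way ‘heat’ flows across a manifold can be very informative
$$\frac{du}{dt} = \nabla^2 u \qquad \text{(heat equation)}$$
• $\nabla^2$ is the Laplacian and in 3D Euclidean space it is
$$\nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}$$
• On a sphere it is
$$\nabla^2 = \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2}{\partial\phi^2}$$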
![Page 71: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/71.jpg)
Heat flow
• Heat flow allows us to do interesting things on a manifold
• Smoothing: Heat flow is a diffusion process (will smooth the data)
• Characterising the manifold (heat content, heat kernel coefficients...)
• The Laplacian depends on the geometry of the manifold
– We may not know this
– It may be hard to calculate explicitly
• Graph Laplacian
![Page 72: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/72.jpg)
Graph Laplacian
• Given a set of datapoints on the manifold, describe them by a graph
– Vertices are datapoints, edges are adjacency relation
• Adjacency matrix (for example)
$$A_{ij} = \exp\!\left(-d_{ij}^2/\sigma^2\right)$$
• Then the graph Laplacian is
$$V_{ii} = \sum_j A_{ij}, \qquad L = V - A$$
• The graph Laplacian is a discrete approximation of the manifold Laplacian
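A small sketch of building a graph Laplacian from pairwise distances. It assumes a Gaussian adjacency A_ij = exp(−d_ij² / σ²) with a width parameter `sigma`, and L = V − A with V the diagonal matrix of row sums; the four datapoints are illustrative.

```python
import numpy as np

def graph_laplacian(D, sigma):
    """L = V - A with A_ij = exp(-d_ij^2 / sigma^2) and V the diagonal degree matrix."""
    A = np.exp(-(D ** 2) / sigma ** 2)
    V = np.diag(A.sum(axis=1))
    return V - A            # diagonal entries of A cancel in V - A

# Four datapoints on a line, one unit apart.
points = np.array([[0.0], [1.0], [2.0], [3.0]])
D = np.abs(points - points.T)
L = graph_laplacian(D, sigma=1.0)
```

Each row of `L` sums to zero and the matrix is symmetric positive semidefinite, the discrete counterparts of the properties expected of a Laplacian.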
![Page 73: Similarities, Distances and Manifold Learningsimbad-fp7.eu/images/tutorial/02-ECCV2012Tutorial.pdf• Classic Multidimensional Scaling embeds a (squared) distance matrix into Euclidean](https://reader034.vdocuments.mx/reader034/viewer/2022050310/5f721fa4c5180773994e0729/html5/thumbnails/73.jpg)
Heat Kernel
• Using the graph Laplacian, we can easily implement heat-flow methods on the manifold using the heat-kernel
$$\frac{du}{dt} = -Lu \qquad \text{(heat equation)}$$
$$H = \exp(-Lt) \qquad \text{(heat kernel)}$$
• Can diffuse a function on the manifold by
$$f' = Hf$$
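Because the graph Laplacian is symmetric, the heat kernel H = exp(−Lt) can be formed from its eigendecomposition. The sketch below uses a 5-vertex path graph with unit edge weights and a diffusion time t = 0.5; both are illustrative assumptions.

```python
import numpy as np

def heat_kernel(L, t):
    """H = exp(-L t) for a symmetric Laplacian, via its eigendecomposition."""
    w, V = np.linalg.eigh(L)
    return (V * np.exp(-w * t)) @ V.T

# Laplacian of a 5-vertex path graph with unit edge weights.
A = np.diag(np.ones(4), 1) + np.diag(np.ones(4), -1)
L = np.diag(A.sum(axis=1)) - A

f = np.array([0.0, 0.0, 1.0, 0.0, 0.0])   # a spike of 'heat' on the middle vertex
H = heat_kernel(L, t=0.5)
f_smooth = H @ f
```

The spike spreads to neighbouring vertices while the total amount of heat stays the same; larger `t` gives stronger smoothing.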