TRANSCRIPT
Spherical Embedding and Classification
Richard C. Wilson
Edwin R. Hancock
Dept. of Computer Science
University of York
Background
Dissimilarities are a common starting point in pattern recognition
• Dissimilarity matrix D
• Find the similarity matrix S
If S is PSD then
– S is a kernel matrix K
– Can use a kernel machine
– Or embed points in a Euclidean space
– And use standard vector-based PR
We can identify D with the (squared Euclidean) distances between points:

$D_{ij} = d(x_i, x_j)^2 = x_i \cdot x_i + x_j \cdot x_j - 2\,x_i \cdot x_j$

$S = -\tfrac{1}{2} J D J, \qquad J = I - \tfrac{1}{n}\mathbf{1}\mathbf{1}^T$

$K = X^T X = U \Lambda U^T, \qquad X = \Lambda^{1/2} U^T$
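The pipeline from D to an embedded point set can be sketched numerically. This is a minimal illustration on a made-up four-point dataset; the variable names are mine, not from the slides:

```python
import numpy as np

# Hypothetical example: four collinear points at 0, 1, 2, 3
pts = np.array([0.0, 1.0, 2.0, 3.0])
D = (pts[:, None] - pts[None, :]) ** 2   # squared-distance matrix

n = len(pts)
J = np.eye(n) - np.ones((n, n)) / n      # centering matrix J = I - (1/n) 1 1^T
S = -0.5 * J @ D @ J                     # similarity matrix S = -(1/2) J D J

# Euclidean data gives a PSD S, i.e. a kernel; embed via X = Lambda^{1/2} U^T
lam, U = np.linalg.eigh(S)
assert lam.min() > -1e-9                 # no (significant) negative eigenvalues
X = np.diag(np.sqrt(np.clip(lam, 0, None))) @ U.T   # columns are embedded points

# The embedding reproduces the original squared distances
D_rec = ((X[:, :, None] - X[:, None, :]) ** 2).sum(axis=0)
assert np.allclose(D_rec, D)
```

With indefinite similarities (the case motivating this talk), `lam.min()` is significantly negative and this construction fails, which is exactly the problem the spherical embedding addresses.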
Background
Commonly, when comparing structural representations, we do some kind
of alignment
The similarity matrix is usually indefinite (negative eigenvalues) and we
do not get a kernel
There is no representation as points in a Euclidean space
Two basic approaches
– Modify the dissimilarities to make them Euclidean
– Use a non-Euclidean space
Ideally, any non-Euclidean space should be metric and feasible to
compute distances in
$K = U \Lambda U^T, \qquad X = \Lambda^{1/2} U^T \ \text{ if } \lambda_i \ge 0 \ \forall i$
Riemannian space
We want to embed the objects in a space where we can compute statistics of
them
Need a metric distance measure
Space cannot be Euclidean (normal, flat space)
Riemannian space fulfils all these requirements
• Space is curved – distances are not Euclidean, indefinite similarities
• Metric space
• Distances are measured by geodesics (shortest curve joining two
points in the space)
– Can be difficult to compute the geodesics
– Need to use a space where geodesics are easy
• Obvious choice is space of constant curvature everywhere
Spherical space
• The elliptic manifold has constant positive curvature everywhere
– Can visualise as the surface of a hypersphere embedded in Euclidean
space
– Embedding equation: $\sum_i x_i^2 = r^2$
– Has positive definite metric tensor, so it is Riemannian
– Sectional curvature is $K = 1/r^2$
Well-known example: the sphere
$P(u, v) = (x, y, z) = (r\sin u\sin v,\ r\cos u\sin v,\ r\cos v)$
$x^2 + y^2 + z^2 = r^2$
$ds^2 = r^2\sin^2 v\,du^2 + r^2\,dv^2$
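The parametrisation above is easy to sanity-check numerically. This sketch (toy values of my own choosing) verifies the embedding equation and the line element by finite differences:

```python
import numpy as np

def P(u, v, r):
    """Spherical parametrisation from the slide: (r sin u sin v, r cos u sin v, r cos v)."""
    return np.array([r * np.sin(u) * np.sin(v), r * np.cos(u) * np.sin(v), r * np.cos(v)])

r, u, v = 2.0, 0.7, 1.1
p = P(u, v, r)
assert np.isclose(p @ p, r ** 2)     # embedding equation: x^2 + y^2 + z^2 = r^2

# Line element via finite differences: a step in u (v fixed) should give
# ds^2 ~ r^2 sin^2(v) du^2, and a step in v should give ds^2 ~ r^2 dv^2
du = 1e-6
ds2_u = np.sum((P(u + du, v, r) - p) ** 2)
assert np.isclose(ds2_u, (r * np.sin(v) * du) ** 2, rtol=1e-4)
dv = 1e-6
ds2_v = np.sum((P(u, v + dv, r) - p) ** 2)
assert np.isclose(ds2_v, (r * dv) ** 2, rtol=1e-4)
```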
Non-Euclidean geometry
Previous work
• Lindman and Caelli (1978)
– Embedding of psychological similarity data in elliptic and
hyperbolic (negatively curved) space
– Optimisation method suitable for small datasets
• Cox and Cox (1991)
– Define the stress of a configuration on the sphere and minimise
– Difficult optimisation problem
– Not practical on large datasets
• Shavitt and Tankel (2008)
– Embedding of internet connectivity into hyperbolic space
– Physics-based simulation of particle dynamics
In this paper we use the exponential map to develop an
efficient solution for large datasets
Geodesics on the sphere
• A geodesic curve on a manifold is the curve of shortest length joining
two points
– The geodesic distance between two points is the length of the geodesic
joining them
• Geodesics are 'great circles' of the hypersphere
• Distance between points dependent on the angle and curvature
$d_{ij} = r\,\theta_{ij}, \qquad \cos\theta_{ij} = \frac{\langle x_i, x_j \rangle}{r^2}$

$d_{ij} = r\cos^{-1}\!\left(\frac{\langle x_i, x_j \rangle}{r^2}\right)$

[Figure: points $x_i$, $x_j$ on a sphere of radius $r$ subtending angle $\theta_{ij}$]
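As a sketch, the geodesic distance formula above amounts to the following (the function name is mine):

```python
import numpy as np

def geodesic_distance(xi, xj, r):
    """Great-circle distance on a hypersphere of radius r:
    d_ij = r * theta_ij, with cos(theta_ij) = <xi, xj> / r^2."""
    c = np.clip(np.dot(xi, xj) / r ** 2, -1.0, 1.0)  # guard against rounding outside [-1, 1]
    return r * np.arccos(c)

# Two points a quarter great-circle apart on a radius-2 sphere
r = 2.0
xi = np.array([r, 0.0, 0.0])
xj = np.array([0.0, r, 0.0])
d = geodesic_distance(xi, xj, r)
assert np.isclose(d, r * np.pi / 2)   # quarter of the great circle
```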
Elliptic space embedding
Problem:
Find points on the surface of a hypersphere such that the
geodesic distances are given by D
Non-linear constrained optimisation problem
Computationally expensive for large datasets
Our strategy is to update each point position separately
$\min_{X} \sum_{i,j} \left( d^{*2}_{ij} - d^{2}_{ij} \right)^2, \qquad \text{where } d_{ij} = r\cos^{-1}\!\frac{\langle x_i, x_j \rangle}{r^2}, \quad \|x_i\| = r$
Exponential Map
One of the difficulties of the embedding is that it is constrained to a
hypersphere
The exponential map is a tool from differential geometry which allows us
to map between a manifold and the tangent space at a point
• The tangent plane is a Euclidean subspace
• Log part: from the manifold to the tangent plane
• The exp part goes in the opposite direction
• The map is defined relative to a centre M
$Y = \mathrm{Log}_M X, \qquad X = \mathrm{Exp}_M Y$

[Figure: manifold point $X$, centre $M$, tangent plane $T_M$ containing $Y$; Log maps $X \mapsto Y$, Exp maps $Y \mapsto X$]
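A minimal sketch of the two maps for the sphere follows. The exact scaling convention is my assumption (the standard one, which preserves geodesic distance to the centre); the slides only show the sin/cos structure:

```python
import numpy as np

def log_map(m, x, r):
    """Manifold -> tangent plane at m (|m| = r); |result| equals geodesic distance."""
    theta = np.arccos(np.clip(np.dot(m, x) / r ** 2, -1.0, 1.0))
    if np.isclose(theta, 0.0):
        return np.zeros_like(x)
    return (theta / np.sin(theta)) * (x - m * np.cos(theta))

def exp_map(m, y, r):
    """Tangent plane at m -> manifold (inverse of log_map)."""
    theta = np.linalg.norm(y) / r
    if np.isclose(theta, 0.0):
        return m.copy()
    return m * np.cos(theta) + (np.sin(theta) / theta) * y

r = 1.5
m = np.array([0.0, 0.0, r])                              # map centre
x = np.array([0.0, r * np.sin(0.8), r * np.cos(0.8)])    # 0.8 rad away from m
y = log_map(m, x, r)
assert np.isclose(np.linalg.norm(y), r * 0.8)            # OX = OX': distance preserved
assert np.allclose(exp_map(m, y, r), x)                  # Exp inverts Log
```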
Exponential map for sphere
Map points on the sphere onto the tangent plane
– Map has an origin where the sphere touches the tangent plane (O)
– Distances to the origin are preserved (OX = OX′)
Optimise on the tangent plane

[Figure: point X on the sphere and its image X′ on the tangent plane]
Exponential map for the sphere
Given a centre m, a point x on the sphere and a point x′ on the tangent
plane:
The tangent plane is flat, so distances are Euclidean:
When xi is the centre of the map (xi=m) then the distances are exact,
distortion will occur as xi moves away from the centre
• Project current positions onto tangent plane using xi as centre
• Compute gradient of embedding error on the tangent plane
• Update position xi
$x' = \frac{\theta}{\sin\theta}\,(x - m\cos\theta)$ (to tangent plane)

$x = m\cos\theta + \frac{\sin\theta}{\theta}\,x'$ (to sphere)

$d^2_{ij} = (x'_i - x'_j)^T (x'_i - x'_j)$
Updating procedure
$E = \sum_{i,j} \left( d^{*2}_{ij} - d^{2}_{ij} \right)^2$

$\frac{\partial E}{\partial x_i} = \sum_j 4\left( d^{2}_{ij} - d^{*2}_{ij} \right)(x_i - x_j)$

$x_i^{(k+1)} = x_i^{(k)} - \eta\,\frac{\partial E}{\partial x_i}$
For large datasets, computation of second derivatives is
expensive, so we use a simple gradient descent
Can however choose an optimal step size as the smallest root
of the cubic:
$n\,\eta^3 + 3\eta^2\sum_j n_j + \eta\sum_j \left( 2n_j^2 - E_j \right) - \sum_j E_j n_j = 0$

with $E_j = d^{*2}_{ij} - d^{2}_{ij}$ and $n_j = (x_i - x_j)^T \hat{g}$, where $\hat{g}$ is the unit step direction
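The update on the tangent plane can be sketched as follows. This toy example (my own data and names) runs gradient descent on $E$ with a simple backtracking step in place of the cubic's optimal step:

```python
import numpy as np

# Gradient descent on E = sum_j (d*_ij^2 - d_ij^2)^2, where on the flat
# tangent plane d_ij^2 = (x_i - x_j)^T (x_i - x_j).

def embedding_error(xi, others, d_star):
    d2 = np.sum((xi - others) ** 2, axis=1)
    return np.sum((d_star ** 2 - d2) ** 2)

def grad(xi, others, d_star):
    # dE/dx_i = sum_j 4 (d_ij^2 - d*_ij^2)(x_i - x_j)
    diffs = xi - others
    d2 = np.sum(diffs ** 2, axis=1)
    return 4.0 * ((d2 - d_star ** 2)[:, None] * diffs).sum(axis=0)

rng = np.random.default_rng(1)
others = rng.normal(size=(6, 2))          # fixed points on the tangent plane
d_star = rng.uniform(0.5, 2.0, size=6)    # toy target distances
xi = rng.normal(size=2)

e0 = embedding_error(xi, others, d_star)
for _ in range(20):                        # x^(k+1) = x^(k) - eta * dE/dx
    g, eta = grad(xi, others, d_star), 1.0
    while embedding_error(xi - eta * g, others, d_star) >= embedding_error(xi, others, d_star):
        eta *= 0.5                         # backtrack until the error decreases
        if eta < 1e-12:
            break
    xi = xi - eta * g

assert embedding_error(xi, others, d_star) < e0   # error reduced from the start
```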
Initialisation
• Need a good initialisation
• Method presented in CVPR10
• If λ0=0 the result is exact
• Otherwise, this is a good starting point for our optimisation
$Z(r) = r^2 \cos(D / r)$

$r^* = \arg\min_r |\lambda_0(r)|$, where $\lambda_0$ is the smallest eigenvalue of $Z(r)$

$X = U \Lambda^{1/2}$ from the eigendecomposition $Z(r^*) = U \Lambda U^T$
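As a sketch: on a sphere of radius $r$, $\langle x_i, x_j \rangle = r^2\cos(d_{ij}/r)$, so $Z(r)$ is a PSD Gram matrix at the right radius. Choosing $r$ to minimise the magnitude of the smallest eigenvalue is my reading of the CVPR10 criterion; the data and grid here are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
r_true = 1.5
X = rng.normal(size=(6, 3))
X = r_true * X / np.linalg.norm(X, axis=1, keepdims=True)   # 6 points on the sphere

G = np.clip(X @ X.T / r_true ** 2, -1.0, 1.0)
D = r_true * np.arccos(G)                                    # geodesic distance matrix

def lambda0(r):
    Z = r ** 2 * np.cos(D / r)         # candidate Gram matrix Z(r) = r^2 cos(D/r)
    return np.linalg.eigvalsh(Z)[0]    # smallest eigenvalue

candidates = [1.0, 1.25, 1.5, 1.75, 2.0]
r_star = min(candidates, key=lambda r: abs(lambda0(r)))

assert abs(lambda0(r_true)) < 1e-8     # at the true radius, Z is (numerically) PSD
assert r_star == r_true                # the scan recovers the radius

# Initial embedding from the eigendecomposition Z(r*) = U Lambda U^T
lam, U = np.linalg.eigh(r_star ** 2 * np.cos(D / r_star))
X0 = U[:, -3:] * np.sqrt(np.clip(lam[-3:], 0, None))         # X = U Lambda^{1/2}
```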
Algorithm

Find $r^* = \arg\min_r |\lambda_0(Z(r))|$, $X = U\Lambda^{1/2}$, for the initial embedding
Iterate:
For each point $x_i$:
Map points onto the tangent plane at $x_i$: $x' = \frac{\theta}{\sin\theta}(x - m\cos\theta)$
Optimise on the tangent plane: $x_i^{(k+1)} = x_i^{(k)} - \eta \sum_j 4\,(d^2_{ij} - d^{*2}_{ij})(x_i - x_j)$
Map points back onto the manifold: $x = m\cos\theta + \frac{\sin\theta}{\theta}\,x'$
Classifiers in Elliptic space
In practical applications, we want to do some kind of learning on the data,
for example classification
• NN classifier is straightforward, as we can compute distances
• In principle, we can implement any geometric classifier as we have a
smooth metric space
– But the classifier must respect the geometry of the manifold
• A simple classifier we can use is the nearest mean classifier (NMC)
– Can compute the generalised mean on the manifold for each class
Generalised mean:
$\hat{x} = \arg\min_{x} \sum_i d^2(x, x_i)$

Iterative process:
$\hat{x}^{(k+1)} = \mathrm{Exp}_{\hat{x}^{(k)}}\!\left( \frac{1}{n} \sum_i \mathrm{Log}_{\hat{x}^{(k)}}\, x_i \right)$
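The generalised-mean iteration can be sketched directly from the Log/Exp maps. This toy example (symmetric points of my own construction, standard Log/Exp scaling assumed) computes the mean of three points placed symmetrically about a pole, so the mean should be the pole itself:

```python
import numpy as np

def log_map(m, x, r):
    theta = np.arccos(np.clip(np.dot(m, x) / r ** 2, -1.0, 1.0))
    if np.isclose(theta, 0.0):
        return np.zeros_like(x)
    return (theta / np.sin(theta)) * (x - m * np.cos(theta))

def exp_map(m, y, r):
    theta = np.linalg.norm(y) / r
    if np.isclose(theta, 0.0):
        return m.copy()
    return m * np.cos(theta) + (np.sin(theta) / theta) * y

def spherical_mean(points, r, iters=100):
    """Generalised mean: repeatedly Exp-map the tangent-plane average of Log-mapped points."""
    mean = points[0].copy()
    for _ in range(iters):   # mean <- Exp_mean( (1/n) sum_i Log_mean(x_i) )
        mean = exp_map(mean, np.mean([log_map(mean, p, r) for p in points], axis=0), r)
    return mean

r = 1.0
# Three points at colatitude 0.5 rad, symmetric about the pole (0, 0, 1)
pts = [np.array([np.cos(a) * np.sin(0.5), np.sin(a) * np.sin(0.5), np.cos(0.5)])
       for a in (0.0, 2 * np.pi / 3, 4 * np.pi / 3)]
mean = spherical_mean(pts, r)
assert np.allclose(mean, [0.0, 0.0, 1.0], atol=1e-5)   # mean is the pole, by symmetry
```

A nearest-mean classifier then assigns a query point to the class whose mean has the smallest geodesic distance to it.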
Some examples
*Reproduced from "Classification of silhouettes using contour fragments",
Daliri and Torre, CVIU 113(9), 2009
Conclusions
• We can use Riemannian spaces to represent data from
dissimilarity measures when they cannot be represented in
Euclidean space
• Showed an efficient method for embedding in elliptic space
which works on large datasets
– Produces embeddings of low distortion
• Can define simple classifiers which respect the manifold
– NN, NMC
• Need to extend to more sophisticated geometric classifiers