Page 1: Non-Euclidean Problems in Human Centered Information Processing

7/2/07 R.P.W. Duin 1

Robert P.W. Duin, Delft University of Technology

(in cooperation with Elżbieta Pękalska)

London, 7 February 2007

[email protected], http://ict.ewi.tudelft.nl/~duin/

Page 2: Contents

The Problem

Possible (initial) solutions

Further analysis

Conclusion

Page 3: The Problem

Pages 4-6: Measuring human relevant information

The nearest neighbors sorted:

Euclidean distances used

Page 7: The Connectivity Problem in the Pixel Representation

Spatial connectivity is lost

[Figure: pixels x1, x2, x3 and feature-space axes x1, x2, x3]

Dependent (connected) measurements are represented independently. The dependency has to be recovered from the data.

Pages 8-10: The Connectivity Problem in the Pixel Representation

Spatial connectivity is lost

[Figure: training set and test object mapped into the feature space, without and with "Reshuffle Pixels"]

Reshuffling pixels will not change the classification
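The reshuffling claim can be checked with a small experiment. The following is a minimal sketch, assuming a plain 1-NN rule on flattened pixel vectors; the random 4x4 "images" and all function names are hypothetical and not taken from the talk.

```python
import random

def sq_euclid(a, b):
    """Squared Euclidean distance between two pixel vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_index(train, x):
    """Index of the training vector closest to x (1-NN rule)."""
    return min(range(len(train)), key=lambda i: sq_euclid(train[i], x))

def reshuffle(vec, perm):
    """Apply one fixed pixel permutation to a vector."""
    return [vec[p] for p in perm]

# Hypothetical data: six flattened 4x4 "images" with integer grey values.
rng = random.Random(0)
train = [[rng.randrange(256) for _ in range(16)] for _ in range(6)]
test_obj = [rng.randrange(256) for _ in range(16)]

# One fixed permutation, applied to every image in the same way.
perm = list(range(16))
rng.shuffle(perm)

before = nearest_index(train, test_obj)
after = nearest_index([reshuffle(t, perm) for t in train],
                      reshuffle(test_obj, perm))
print(before == after)  # the nearest neighbour is the same object
```

The Euclidean distance is a sum over pixels, so any fixed permutation of the pixel order leaves all distances unchanged. This is exactly why the pixel representation cannot exploit spatial connectivity.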

Page 11: The Connectivity Problem in the Pixel Representation

Interpolation does not yield valid objects

[Figure: image_1, image_2 and image_3 shown as images and as points in the feature space, together with the class subspace]

Pages 12-13: Another Metric is Needed

Learn from Human Observed Similarities

Pages 14-18: Examples

[Diagram, built up over five slides: World; sensor; human expert; representation (feature 1, feature 2); representation in a non-Euclidean (non-metric) space]

Image database retrieval

Man-machine interaction

Consumer preferences

Computer vision

Pattern recognition

Machine learning

Page 19: Deformable Templates

A.K. Jain, D. Zongker, Representation and recognition of handwritten digits using deformable templates, IEEE-PAMI, vol. 19, no. 12, 1997, 1386-1391.

Matching new objects x to various templates y:

$\mathrm{class}(x) = \mathrm{class}\left(\arg\min_y D(x, y)\right)$

Examples of deformed templates

Dissimilarity measure appears to be non-metric

Page 20: The Problem

Non-Metric Data

Page 21: Possible Solution

A New Representation

Page 22: The Pattern Recognition System

[Diagram: Objects; Training Set (Class A, Class B); Representation: Feature Space with axes x1 (perimeter) and x2 (area); Generalization: Classifier]

Test object classified as 'B'

Page 23: Dissimilarity Based Representation

[Figure: dissimilarity space with axes d1 and d2]

Page 24: Dissimilarity - Assumptions

1. Positivity: dij ≥ 0

2. Reflexivity: dii = 0

3. Definiteness: dij = 0 ⇔ objects i and j are identical

4. Symmetry: dij = dji

5. Triangle inequality: dij ≤ dik + dkj

6. Compactness: if the objects i and j are very similar then dij < δ.

7. True representation: if dij < δ then the objects i and j are very similar.

8. Continuity of d

Properties 1-5 together define a metric.
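Some of these assumptions can be tested directly on a given dissimilarity matrix. The sketch below is one possible check, not a method from the talk: the function name is hypothetical, and the assignment of the three distances 14.9, 7.8 and 4.1 (quoted later in the talk for Bunke's chicken data) to particular object pairs is an assumption made for illustration.

```python
def check_dissimilarity(D, tol=1e-12):
    """Check positivity, reflexivity, symmetry and the triangle inequality.

    D is a square list of lists. Definiteness cannot be checked from D
    alone, because it needs to know which objects are identical.
    Returns (report, violations); violations lists triples (i, k, j)
    with D[i][j] > D[i][k] + D[k][j].
    """
    idx = range(len(D))
    report = {
        "positivity": all(D[i][j] >= -tol for i in idx for j in idx),
        "reflexivity": all(abs(D[i][i]) <= tol for i in idx),
        "symmetry": all(abs(D[i][j] - D[j][i]) <= tol
                        for i in idx for j in idx),
    }
    violations = [(i, k, j) for i in idx for j in idx for k in idx
                  if D[i][j] > D[i][k] + D[k][j] + tol]
    report["triangle"] = not violations
    return report, violations

# Assumed pairing of the three quoted distances (illustration only).
D = [[0.0, 7.8, 14.9],
     [7.8, 0.0, 4.1],
     [14.9, 4.1, 0.0]]
report, violations = check_dissimilarity(D)
print(report)
print(violations)
```

With this pairing the largest distance exceeds the sum of the other two (14.9 > 7.8 + 4.1), so the triangle test fails while the other three checks pass.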

Page 25: Dissimilarity Representation

Define dissimilarity measure dij between raw data of objects i and j

Given labeled training set T (classes A and B):

$$D_T = \begin{pmatrix} d_{11} & d_{12} & \cdots & d_{17} \\ d_{21} & d_{22} & \cdots & d_{27} \\ \vdots & \vdots & \ddots & \vdots \\ d_{71} & d_{72} & \cdots & d_{77} \end{pmatrix}$$

Unlabeled object x to be classified:

$d_x = (d_1, d_2, d_3, d_4, d_5, d_6, d_7)$

The traditional Nearest Neighbour rule (template matching) just finds label(argmin_trainset(d_i)), without using D_T. Can we do any better?
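For reference, the nearest neighbour rule on a dissimilarity vector takes only a few lines. This is a minimal sketch with hypothetical values for d_x and for the labels; note that it never looks at the training matrix D_T.

```python
def nn_label(d_x, labels):
    """Label of the training object with the smallest dissimilarity to x."""
    nearest = min(range(len(d_x)), key=lambda i: d_x[i])
    return labels[nearest]

# Hypothetical dissimilarities of x to seven training objects.
labels = ["A", "A", "A", "B", "B", "B", "B"]
d_x = [0.9, 0.4, 0.7, 0.2, 0.8, 0.6, 0.5]
print(nn_label(d_x, labels))
```

Only the single smallest entry of d_x decides the label; the remaining dissimilarities and the relations between training objects are discarded, which is the information the following approaches try to use.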

Page 26: Approach 1: Dissimilarity Space

[Figure: the 7 x 7 matrix D_T = (d11 ... d77), rows labelled "objects", columns labelled "features"]

Consider dissimilarities as 'features'

⇒ n objects given by n features ⇒ overtrained

⇒ select 'features', i.e. representation objects, by

- regularisation

- systematic selection

- random selection

Page 27: Approach 1: Dissimilarity Space

Given labeled training set T (classes A and B), with the 7 x 7 dissimilarity matrix D_T = (d11 ... d77)

Unlabeled object x to be classified: dx = (d1 d2 d3 d4 d5 d6 d7)

Selection of 3 objects for representation: r1 (d2), r2 (d4), r3 (d7)

Classification in a 3D dissimilarity space

[Figure: dissimilarity space R3 with axes r1, r2, r3 (dissimilarities)]
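The construction on this slide can be sketched as follows. The selected columns 2, 4 and 7 follow the slide (r1, r2, r3); everything else is an assumption for illustration: the objects are hypothetical points on a line, the dissimilarity is their absolute difference, and a nearest-mean classifier stands in for whatever classifier is actually trained in the dissimilarity space.

```python
def dissimilarity_space(D_rows, rep_idx):
    """Keep only the dissimilarities to the selected representation objects."""
    return [[row[j] for j in rep_idx] for row in D_rows]

def class_means(X, labels):
    """Mean vector per class in the dissimilarity space."""
    means = {}
    for c in sorted(set(labels)):
        rows = [x for x, lab in zip(X, labels) if lab == c]
        means[c] = [sum(col) / len(rows) for col in zip(*rows)]
    return means

def nearest_mean_label(x, means):
    """Assign x to the class with the closest mean (stand-in classifier)."""
    return min(means,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(x, means[c])))

# Hypothetical objects on a line; dissimilarity = absolute difference.
positions = [0.0, 1.0, 2.0, 5.0, 6.0, 7.0, 8.0]
labels = ["A", "A", "A", "B", "B", "B", "B"]
D_T = [[abs(p - q) for q in positions] for p in positions]

rep_idx = [1, 3, 6]  # objects 2, 4 and 7 (zero-based indices)
means = class_means(dissimilarity_space(D_T, rep_idx), labels)

d_x = [abs(5.5 - q) for q in positions]      # new object at position 5.5
x = dissimilarity_space([d_x], rep_idx)[0]   # its 3D representation
predicted = nearest_mean_label(x, means)
print(x, predicted)
```

Unlike the nearest neighbour rule, the classifier here is trained on all rows of D_T, so every training object contributes even though only three of them serve as axes.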

Page 28: Example Dissimilarity Space: NIST Digits 3 and 8

Examples of the raw data

Page 29: Example of Dissimilarity Space

NIST digits: Hamming distances of 2 x 200 digits

[Figure: scatter plot of the dissimilarity space with axes d10 and d300]

Page 30: Dissimilarity Space Classification ⇔ Nearest Neighbour rule

[Figure: classification error (0.1 to 0.3) versus training size per class (0 to 100), modified Hausdorff distance on contour digits. Curves: Fisher LD, labelled by the size of the representation set; nearest neighbour results (*), including CNN selection. Numeric labels in the plot: 10, 20, 30, 31, 45, 90.]

Dissimilarity based classification outperforms the Nearest Neighbour rule

Page 31: Approach 2: Embedding the Dissimilarity Representation

Training set (classes A and B); dissimilarity matrix D; X?

Is there a feature space X for which Dist(X,X) = D?

[Figure: points in Rk with axes x1, x2 and their Euclidean distances D]

Page 32: Euclidean Embedding: Multidimensional Scaling

13 x 13 Distance Matrix

[Figure: 2D embedding (horizontal axis -300 to 500, vertical axis -400 to 200) of ALEXANDRA, BALCLUTHA, BLENHEIM, CHRISTCHURCH, DUNEDIN, FRANZ JOSEF, GREYMOUTH, INVERCARGILL, MILFORD, NELSON, QUEENSTOWN, TE ANAU, TIMARU]
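Classical (metric) multidimensional scaling recovers coordinates from a distance matrix by an eigendecomposition of the centred squared distances. The following is a minimal NumPy sketch; the five planar points are hypothetical stand-ins for the city distances of the slide, which are not given in the transcript.

```python
import numpy as np

def classical_mds(D, k):
    """Embed a distance matrix D into k dimensions by classical scaling."""
    D = np.asarray(D, dtype=float)
    m = D.shape[0]
    J = np.eye(m) - np.ones((m, m)) / m        # centring matrix
    B = -0.5 * J @ (D ** 2) @ J                # inner-product (Gram) matrix
    evals, evecs = np.linalg.eigh(B)           # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:k]        # indices of the k largest
    lam = np.clip(evals[order], 0.0, None)     # guard against round-off
    return evecs[:, order] * np.sqrt(lam)

# Hypothetical planar configuration and its Euclidean distance matrix.
pts = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0], [3.0, 4.0], [1.0, 1.0]])
D = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)

X = classical_mds(D, 2)
D_hat = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
print(np.allclose(D, D_hat))
```

The recovered coordinates differ from the original ones by a rotation, reflection and translation, but the pairwise distances are reproduced because the input distances are truly Euclidean in two dimensions.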

Page 33: (Linear) Euclidean Embedding

Training set (classes A and B); dissimilarity matrix D

[Figure: triangle of objects i, j, k with sides dij, dik, dkj]

If the dissimilarity matrix cannot be explained from a vector space (e.g. Hausdorff and Hamming distance), or if sometimes dij > dik + dkj (triangle inequality not satisfied), embedding in a Euclidean space is not possible → Pseudo-Euclidean embedding

Page 34: Metric non-Euclidean Example

[Figure: four points A, B, C, D; the three distances among A, B and C are 10, and D lies at distance 5.1 from each of A, B and C]

Triangle inequality is satisfied, but Euclidean embedding is impossible: a point at equal distance from the corners of an equilateral triangle with side 10 must be at least 10/√3 ≈ 5.77 away from them, which is more than 5.1.

Examples: Hausdorff, L1-norm
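The four-point example can be verified numerically. This sketch assumes the reading of the figure in which A, B and C are mutually 10 apart and D is 5.1 from each of them; it checks the triangle inequality over all triples and then looks at the eigenvalues of the centred Gram matrix, which are all non-negative exactly when a Euclidean embedding exists.

```python
import numpy as np

# Rows and columns in the order A, B, C, D (assumed reading of the figure).
D = np.array([[0.0, 10.0, 10.0, 5.1],
              [10.0, 0.0, 10.0, 5.1],
              [10.0, 10.0, 0.0, 5.1],
              [5.1, 5.1, 5.1, 0.0]])
m = D.shape[0]

# Triangle inequality for every triple (i, k, j).
triangle_ok = all(D[i, j] <= D[i, k] + D[k, j] + 1e-12
                  for i in range(m) for j in range(m) for k in range(m))

# Gram matrix of a hypothetical embedding; must be PSD if D is Euclidean.
J = np.eye(m) - np.ones((m, m)) / m
B = -0.5 * J @ (D ** 2) @ J
eigenvalues = np.linalg.eigvalsh(B)

print(triangle_ok)          # metric
print(eigenvalues.min())    # negative, so not Euclidean
```

A negative eigenvalue means that one coordinate of the embedding would have to be imaginary, which is what the pseudo-Euclidean construction later in the talk makes explicit.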

Page 35: Non-Metric Distances

Bunke's Chicken Dataset, weighted edit-distance for strings:

[Figure: object 78, object 419 and object 425 with pairwise distances 14.9, 7.8 and 4.1]

Single Linkage Clustering:

[Figure: clusters A, B, C with D(A,B), D(B,C) and D(A,C)]

D(A,C) > D(A,B) + D(B,C)

The Fisher Criterion:

[Figure: two class distributions over x with means µA, µB and widths σA, σB]

$$J(A,B) = \frac{|\mu_A - \mu_B|^2}{\sigma_A^2 + \sigma_B^2}$$

[Figure: classes A, B, C]

J(A,C) = 0; J(A,B) = large; J(C,B) = small ≠ J(A,B)
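The behaviour of the Fisher criterion as a class dissimilarity can be reproduced with three small samples. This is an illustrative sketch only: the samples are hypothetical, chosen so that A and C share a mean while C is much wider, and population variances are used for the σ² terms.

```python
from statistics import mean, pvariance

def fisher_criterion(a, b):
    """J(A,B) = |mu_A - mu_B|^2 / (sigma_A^2 + sigma_B^2) for 1D samples."""
    return (mean(a) - mean(b)) ** 2 / (pvariance(a) + pvariance(b))

# Hypothetical samples: A and B narrow and far apart, C wide around A's mean.
A = [-0.1, 0.0, 0.1]
B = [4.9, 5.0, 5.1]
C = [-10.0, 0.0, 10.0]

J_AC = fisher_criterion(A, C)
J_AB = fisher_criterion(A, B)
J_CB = fisher_criterion(C, B)
print(J_AC, J_AB, J_CB)
```

Here J(A,C) vanishes although A and C are clearly different classes, and J(A,B) is far larger than J(A,C) + J(C,B), so the criterion violates both definiteness and the triangle inequality.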

Page 36: (Pseudo-)Euclidean Embedding

m×m D is a given, imperfect dissimilarity matrix of training objects.

Construct inner-product matrix: $B = -\frac{1}{2} J D^{(2)} J$, $J = I - \frac{1}{m}\mathbf{1}\mathbf{1}^T$, where $D^{(2)}$ holds the element-wise squared dissimilarities.

Eigenvalue Decomposition: $B = Q \Lambda Q^T$

Select k eigenvectors: $X = Q_k \Lambda_k^{1/2}$ (problem: $\Lambda_k < 0$)

Let ℑk be a k x k diag. matrix, ℑk(i,i) = sign(Λk(i,i))

Λk(i,i) < 0 → Pseudo-Euclidean: $X = Q_k |\Lambda_k|^{1/2}$

n×m Dz is the dissimilarity matrix between n new objects and the training set.

The inner-product matrix: $B_z = -\frac{1}{2}\left(D_z^{(2)} J - \frac{1}{m}\mathbf{1}\mathbf{1}^T D^{(2)} J\right)$

The embedded objects: $Z = B_z Q_k |\Lambda_k|^{-1/2} \Im_k$
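The embedding steps can be sketched in NumPy as below. This is a minimal sketch under stated assumptions, not the authors' implementation: all eigenvectors with non-negligible eigenvalue are kept instead of a selected k, new objects are rows of an n x m matrix, the function names are hypothetical, and the 3 x 3 test matrix reuses the three chicken-data distances with an assumed pairing.

```python
import numpy as np

def pe_embed(D, tol=1e-9):
    """Pseudo-Euclidean embedding of a symmetric dissimilarity matrix D."""
    D2 = np.asarray(D, dtype=float) ** 2
    m = D2.shape[0]
    J = np.eye(m) - np.ones((m, m)) / m
    B = -0.5 * J @ D2 @ J                       # inner-product matrix
    lam, Q = np.linalg.eigh(B)
    keep = np.abs(lam) > tol * np.abs(lam).max()  # drop (near-)zero eigenvalues
    lam, Q = lam[keep], Q[:, keep]
    X = Q * np.sqrt(np.abs(lam))                # X = Q_k |Lambda_k|^(1/2)
    model = {"Q": Q, "lam": lam, "sign": np.sign(lam), "D2": D2, "J": J}
    return X, model

def pe_project(Dz, model):
    """Embed new objects given their dissimilarities Dz (n x m) to the training set."""
    Dz2 = np.atleast_2d(np.asarray(Dz, dtype=float)) ** 2
    col_means = model["D2"].mean(axis=0, keepdims=True)   # (1/m) 1^T D^(2)
    Bz = -0.5 * (Dz2 - col_means) @ model["J"]
    return Bz @ model["Q"] / np.sqrt(np.abs(model["lam"])) * model["sign"]

def pe_sqdist(X, Y, sign):
    """Squared pseudo-Euclidean distances: positive part minus negative part."""
    diff = X[:, None, :] - Y[None, :, :]
    return (sign * diff ** 2).sum(axis=-1)

# Assumed pairing of the three quoted distances (non-metric example).
D = np.array([[0.0, 7.8, 14.9],
              [7.8, 0.0, 4.1],
              [14.9, 4.1, 0.0]])
X, model = pe_embed(D)
print(model["sign"])                                           # mixed signs
print(np.allclose(pe_sqdist(X, X, model["sign"]), D ** 2))     # D is reproduced
print(np.allclose(pe_project(D, model), X))                    # consistent projection
```

Keeping all non-zero eigenvalues makes the embedding exact: the signed squared distances reproduce the original matrix, and projecting the training objects as if they were new returns their own coordinates.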

Page 37: Pseudo-Euclidean Embedding

If D is non-Euclidean, B has p positive and q negative eigenvalues.

Solutions:

• Remove all eigenvectors with small and negative eigenvalues

• Take absolute values of eigenvalues and proceed

• Construct a pseudo-Euclidean space

[Figure: eigenvalues (-50 to 200) plotted against their index (0 to 1000)]

Page 38: Pseudo-Euclidean Space (Krein Space)

If D is non-Euclidean, B has p positive and q negative eigenvalues.

A pseudo-Euclidean space ε with signature (p,q), k = p+q, is a non-degenerate inner product space ℜk = ℜp ⊕ ℜq such that:

$$\langle x, y\rangle_\varepsilon = x^T \Im_{pq}\, y = \sum_{i=1}^{p} x_i y_i - \sum_{i=p+1}^{k} x_i y_i, \qquad \Im_{pq} = \begin{pmatrix} I_{p\times p} & 0 \\ 0 & -I_{q\times q} \end{pmatrix}$$

$$d_\varepsilon^2(x,y) = \langle x-y, x-y\rangle_\varepsilon = d_p^2(x,y) - d_q^2(x,y)$$

[Figure: the plane ℜp × ℜq with both axes from -4 to 4, points A to F, the regions ⟨x,x⟩ε > 0 and ⟨x,x⟩ε < 0, a vector v with w = J11 v and the line vT J11 x = 0; annotated d²(x,y) = dp²(x,y) - dq²(x,y)]
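The indefinite inner product defined on this slide is easy to experiment with. The sketch below assumes vectors stored as plain lists whose first p coordinates belong to the positive part; the function names are hypothetical.

```python
def pe_inner(x, y, p):
    """Pseudo-Euclidean inner product with signature (p, len(x) - p)."""
    positive = sum(a * b for a, b in zip(x[:p], y[:p]))
    negative = sum(a * b for a, b in zip(x[p:], y[p:]))
    return positive - negative

def pe_sqdist(x, y, p):
    """Squared pseudo-Euclidean distance d_p^2(x, y) - d_q^2(x, y)."""
    diff = [a - b for a, b in zip(x, y)]
    return pe_inner(diff, diff, p)

# Signature (1, 1): one positive and one negative direction.
print(pe_sqdist([0.0, 0.0], [1.0, 2.0], p=1))   # 1 - 4, a negative "square"
print(pe_inner([3.0, 3.0], [3.0, 3.0], p=1))    # non-zero vector, zero "length"
```

Squared distances can be negative and non-zero vectors can have zero self-product, which corresponds to the two regions ⟨x,x⟩ε > 0 and ⟨x,x⟩ε < 0 and the boundary between them in the figure.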

Page 39: Dissimilarity Representation ⇔ Kernels

Let R = {x1, x2, ... , xn}

Kernel classifier: $f(x) = w^T K(x, R) + w_0 = \sum_{w_i \in SV} w_i K(x, x_i) + w_0$

Linear classifier in a dissimilarity space: $f(D(x,R)) = w^T D(x,R) + w_0$

If D has a Euclidean behavior, then

$K = -\frac{1}{2} J D^{*2} J$ or $K = \exp\{-D^{*2}/\sigma^2\}$

are Mercer kernels: SVM can be directly applied. Otherwise:

• Transform D appropriately or regularize K

• Treat K as kernels in pseudo-Euclidean space: indefinite SVM
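Whether the two kernels built from D are Mercer kernels can be tested by checking positive semi-definiteness. This is a minimal NumPy sketch; the function names, the tolerance and both test matrices are assumptions (hypothetical planar points for the Euclidean case, and the three chicken-data distances with an assumed pairing for the non-metric case).

```python
import numpy as np

def centered_kernel(D):
    """K = -1/2 J D*2 J, with D*2 the element-wise squared dissimilarities."""
    D = np.asarray(D, dtype=float)
    m = D.shape[0]
    J = np.eye(m) - np.ones((m, m)) / m
    return -0.5 * J @ (D ** 2) @ J

def gaussian_kernel(D, sigma):
    """K = exp(-D*2 / sigma^2)."""
    return np.exp(-(np.asarray(D, dtype=float) ** 2) / sigma ** 2)

def is_psd(K, tol=1e-9):
    """True if the symmetric part of K has no eigenvalue below -tol."""
    return np.linalg.eigvalsh((K + K.T) / 2).min() >= -tol

# Euclidean case: distances between hypothetical points in the plane.
pts = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0], [1.0, 1.0]])
D_euclid = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)

# Non-metric case (assumed pairing of the quoted distances).
D_nonmetric = np.array([[0.0, 7.8, 14.9],
                        [7.8, 0.0, 4.1],
                        [14.9, 4.1, 0.0]])

print(is_psd(centered_kernel(D_euclid)), is_psd(gaussian_kernel(D_euclid, 2.0)))
print(is_psd(centered_kernel(D_nonmetric)))
```

When the check fails, the options listed on the slide apply: transform D or regularize K until the test passes, or keep the indefinite K and use a method designed for pseudo-Euclidean spaces.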

Page 40: Dissimilarity Based Classification

Training set (classes A and B): dissimilarity matrix D

Test object x: dissimilarities dx with training set

1. Nearest neighbour rule

2. Reduce training set to representation set ⇒ dissimilarity space

3. Embedding:

Select large Λii > 0 ⇒ Euclidean space

Select large |Λii| > 0 ⇒ pseudo-Euclidean space

} discriminant function

Page 41: Three Approaches Compared for the Zongker Data

[Figure: "Digit data", averaged generalization error (0 to 0.2) versus size of the representation set R (0 to 1500). Legend: RLDC; Rep. Set, LP; Rep. Set, RLDC; Embed., 1-NN, 3-NN. Curve groups annotated as Nearest neighbour Rule, Dissimilarity Space, Embedding.]

Dissimilarity Space equivalent to Embedding, better than Nearest Neighbour Rule

Page 42: Example 2: Polygons

Convex pentagons and heptagons: [Figure: example polygons of both classes]

Minimum edge length: 0.1 of maximum edge length

No class overlap, zero error

Polygons are scaled and centered.

Distance measures, with dij = distance between vertex i of polygon_1 and vertex j of polygon_2:

Hausdorff: D = max{ max_i(min_j(dij)), max_j(min_i(dij)) } (find the largest from the smallest vertex distance)

Modified Hausdorff: D = max{ mean_i(min_j(dij)), mean_j(min_i(dij)) } (no metric!)
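Both distance measures follow directly from the vertex distance matrix dij. The sketch below is an illustration under assumptions: polygons are given as lists of (x, y) vertices, the scaling and centring mentioned on the slide is left out, and the two small vertex sets are hypothetical.

```python
from math import dist

def vertex_distances(P, Q):
    """Matrix d[i][j] = distance between vertex i of P and vertex j of Q."""
    return [[dist(p, q) for q in Q] for p in P]

def directed_minima(P, Q):
    """Smallest vertex distance seen from each vertex of P and of Q."""
    d = vertex_distances(P, Q)
    from_p = [min(row) for row in d]        # min_j(dij) for every i
    from_q = [min(col) for col in zip(*d)]  # min_i(dij) for every j
    return from_p, from_q

def hausdorff(P, Q):
    """max{ max_i(min_j(dij)), max_j(min_i(dij)) }"""
    from_p, from_q = directed_minima(P, Q)
    return max(max(from_p), max(from_q))

def modified_hausdorff(P, Q):
    """max{ mean_i(min_j(dij)), mean_j(min_i(dij)) }"""
    from_p, from_q = directed_minima(P, Q)
    return max(sum(from_p) / len(from_p), sum(from_q) / len(from_q))

# Hypothetical vertex sets (not scaled or centred).
P = [(0.0, 0.0), (1.0, 0.0)]
Q = [(0.0, 0.0), (1.0, 0.0), (1.0, 3.0)]
print(hausdorff(P, Q), modified_hausdorff(P, Q))
```

The Hausdorff distance is dominated by the single outlying vertex, while the modified version averages it away; this averaging is also what breaks the triangle inequality.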

Page 43: Dissimilarity Based Classification of Polygons

[Figure: "Polygon data", averaged generalization error (0 to 0.2) versus size of the representation set R (0 to 1500). Legend: RLDC; Rep. Set, LP; Rep. Set, LDC; Embed., 1-NN, 3-NN. Curve groups annotated as Nearest neighbour Rule, Dissimilarity Space, Embedding.]

Zero error difficult to reach

Page 44: Prototype Selection: Polygon Dataset

[Figure: "Polydistm; #Objects: 1000; Classifier: BayesNQ", average classification error (in %, 0 to 9) versus number of prototypes (4 to 200). Legend: 1-NN-final, k-NN-final, k-NN, k-NN-DS, EdiCon-1-NN, Random *, RandomC *, ModeSeek *, KCentres *, FeatSel *, KCentres-LP *, LinProg *, EdiCon *.]

The classification performance of the quadratic Bayes Normal classifier and the k-NN in dissimilarity spaces and the direct k-NN, as a function of the number of selected prototypes. Note that with only 10-20 prototypes, better results are already obtained than by using 1000 objects in the NN rules.

Pages 45-46: Possible Solution

The Dissimilarity Representation

E. Pekalska et al., The Dissimilarity Representation for Pattern Recognition, Foundations and Applications, World Scientific, Singapore, 2005, ISBN 981-256-530-2

Page 47: An Analysis

Causes of Non-Metric Data

Page 48: Non-Metric Measurements

Bunke's Chicken Dataset, weighted edit-distance for strings

[Figure: object 78, object 419 and object 425 with pairwise distances 14.9, 7.8 and 4.1]

Page 49: Representation by Ordered Sets (Strings); Edit Distance

X = (x1, x2, .... , xk) Y = (y1, y2, .... , yn)

DE(X,Y): Σ edit operations X ⇒ Y (insertions, deletions, substitutions)

DE(snert, meer) = 3: snert ⇒ seert ⇒ seer ⇒ meer

DE(ner, meer) = 2: ner ⇒ mer ⇒ meer

[Figure: example with the symbols a, b, u, v and the string a u a v v b]

Possibly weighted.

Triangle inequality ⇒ computationally feasible.

Length normalisation problem: D(aa,bb) < D(abcdef,bcdd)

See Marzal & Vidal, IEEE PAMI-15, 1993, 926-932
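The unweighted edit distance used in the examples can be computed with the standard dynamic programme. This is a minimal sketch with unit costs for insertion, deletion and substitution; the weighted and normalised variants mentioned on the slide are not covered.

```python
def edit_distance(x, y):
    """Minimum number of insertions, deletions and substitutions turning x into y."""
    prev = list(range(len(y) + 1))          # distances from the empty prefix of x
    for i, a in enumerate(x, start=1):
        curr = [i]                          # deleting the first i symbols of x
        for j, b in enumerate(y, start=1):
            curr.append(min(prev[j] + 1,                # delete a
                            curr[j - 1] + 1,            # insert b
                            prev[j - 1] + (a != b)))    # substitute or match
        prev = curr
    return prev[-1]

print(edit_distance("snert", "meer"))
print(edit_distance("ner", "meer"))
print(edit_distance("aa", "bb"), edit_distance("abcdef", "bcdd"))
```

The last line illustrates the length normalisation problem: "aa" and "bb" share no symbol yet are closer (2 edits) than "abcdef" and "bcdd" (3 edits), which have three symbols in common.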

Page 50: Better Measurements are Sometimes more Non-Euclidean

E. Pekalska et al., Non-Euclidean or non-metric measures can be informative, SSSPR 2006, Springer, LNCS 4109.

[Figure: two plots with vertical axes "Classification Error" and "Non-Euclideaness"]

Pages 51-55:

[Map, built up over five slides, with the locations V, J and X]

1800:

Crossing the Jostedalsbreen was impossible.

Travelling around (200 km) lasted 5 days.

Until the shared point X was found.

People could visit each other in 8 hours.

D(V,J) = 5 days

D(V,X) = 4 hours

D(X,J) = 4 hours

⇒ Non-Metric: D(V,J) > D(V,X) + D(X,J)

Pages 56-57: Cause of non-metric data - 1

Bunke's Chicken Dataset, weighted edit-distance for strings

[Figure: object 78, object 419 and object 425 with pairwise distances 14.9, 7.8 and 4.1]

Large distances are overestimated, due to computational problems

Pages 58-63: Cause of non-metric data - 2

[Figure sequence, built up over six slides, ending with a "?"]

Non-metric data due to partially observed projections

Small distances are underestimated

Page 64: Cause of non-metric data - 2

Example: consumer preferences for recommendation systems

S = (labels: "Consumers", "Preferences for items")

5 3 1 2 4    2 1 3 4 5

5 3 1 4 2    5 2 4 1 3

4 2 3 5 1    1 3 2 5 4

1 4 3 5 2    3 5 1 4 2

Page 65: Cause of non-metric data - 3

Distance(Table,Book) = 0

Distance(Table,Cup) = 0

Distance(Book,Cup) = 0.75

Single Linkage Clustering:

[Figure: clusters A, B, C with D(A,B), D(B,C) and D(A,C)]

D(A,C) > D(A,B) + D(B,C)
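The table, book and cup distances are what a single-linkage (minimum) distance between extended objects produces. The following sketch models each object as a set of points on a line; the coordinates are hypothetical and chosen only so that the three distances of the slide come out.

```python
def single_link(P, Q):
    """Single-linkage distance: the smallest distance between any two points."""
    return min(abs(p - q) for p in P for q in Q)

# Hypothetical 1D footprints: the table spans 0 to 2, book and cup lie on it.
table = [i * 0.25 for i in range(9)]
book = [0.25, 0.5]
cup = [1.25]

d_table_book = single_link(table, book)
d_table_cup = single_link(table, cup)
d_book_cup = single_link(book, cup)
print(d_table_book, d_table_cup, d_book_cup)
```

Both small objects touch the large one, so their distances to it are zero, yet they are 0.75 apart from each other: D(Book,Cup) > D(Book,Table) + D(Table,Cup).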

Pages 66-67: Cause of non-metric data - 3

Non-Euclidean Human Relations

Pages 68-69: Causes of non-metric data

1. Overestimated large distances (too difficult to compute)

2. Underestimated small distances (one-sided view of objects)

caused by the construction of complicated measures, needed to correspond with human observations.

3. Essential non-metric distance definitions

as the human concept of distance differs from the mathematical one.

Page 70: Conclusion

The Human World is Non-Metric

Page 71: Solutions

Neglect it: Use Euclidean Distances only

Page 72: Solutions

Live with it: Heuristic Corrections

Page 73: Solutions

Adapt to it: Dissimilarity Space

[Figure: dissimilarity space with axes d1 and d2]

Page 74: Solutions

Adapt to it: Krein Space

[Figure: the plane ℜp × ℜq with both axes from -4 to 4, points A to F, the regions ⟨x,x⟩ε > 0 and ⟨x,x⟩ε < 0, a vector v with w = J11 v and the line vT J11 x = 0; annotated d²(x,y) = dp²(x,y) - dq²(x,y)]

Page 75: Solutions

Adapt to it: Non-Point Representations

Blob Representation; String Representation

Page 76: Thank You!