fast algorithm for nearest neighbor search based on a lower bound tree yong-sheng chen yi-ping hung...

18
Fast Algorithm for Nearest Neighbor Search Based on a Lower Bound Tree Yong-Sheng Chen Yi-Ping Hung Chiou-S hann Fuh 8 th International Conference on Computer Vi sion

Upload: andrew-morrison

Post on 17-Dec-2015

222 views

Category:

Documents


4 download

TRANSCRIPT

Page 1: Fast Algorithm for Nearest Neighbor Search Based on a Lower Bound Tree Yong-Sheng Chen Yi-Ping Hung Chiou-Shann Fuh 8 th International Conference on Computer

Fast Algorithm for Nearest Neighbor Search Based on a

Lower Bound Tree

Yong-Sheng Chen Yi-Ping Hung Chiou-Shann Fuh

8th International Conference on Computer Vision

Page 2: Fast Algorithm for Nearest Neighbor Search Based on a Lower Bound Tree Yong-Sheng Chen Yi-Ping Hung Chiou-Shann Fuh 8 th International Conference on Computer

2

OutlineIntroductionMultilevel Structure and LB-TreeAgglomerative ClusteringData TransformationWinner-Update SearchConclusions

Page 3: Fast Algorithm for Nearest Neighbor Search Based on a Lower Bound Tree Yong-Sheng Chen Yi-Ping Hung Chiou-Shann Fuh 8 th International Conference on Computer

3

Nearest Neighbor Search Problem

isi

i pq 1min argˆ

q

ip

Given a fixed set of s points in Rd, P={p1,p2,…,ps}

For a query point q, find in P the point, pî, closest to q

1p

2p3p

sp

D1

D2

Dd

Page 4: Fast Algorithm for Nearest Neighbor Search Based on a Lower Bound Tree Yong-Sheng Chen Yi-Ping Hung Chiou-Shann Fuh 8 th International Conference on Computer

4

Applications of Nearest Neighbor Search

Object (pattern) recognition Image matching

Stereo matchingData compression

Block motion estimationVector quantization

Information retrieval in database systems Image and video databasesDNA sequence databases

Page 5: Fast Algorithm for Nearest Neighbor Search Based on a Lower Bound Tree Yong-Sheng Chen Yi-Ping Hung Chiou-Shann Fuh 8 th International Conference on Computer

5

Space partition methodsk-d tree, 1975R-tree, 1984SS-tree, 1996SR-tree, 1997Pyramid, 1998

Elimination-based methodsBranch and bound, 1975Projection elimination, 1975, 1987Threshold rejection, 1997

Literature Review

Page 6: Fast Algorithm for Nearest Neighbor Search Based on a Lower Bound Tree Yong-Sheng Chen Yi-Ping Hung Chiou-Shann Fuh 8 th International Conference on Computer

6

Central Idea of This WorkWe skip distance calculation for a point in data

base by calculating and comparing its distance lower bound.Distance lower bound was derive from Minkowski’s

inequality.Computational cost of the distance lower bound is l

ess than that of the distance itself.Optimal search is guaranteed.

Page 7: Fast Algorithm for Nearest Neighbor Search Based on a Lower Bound Tree Yong-Sheng Chen Yi-Ping Hung Chiou-Shann Fuh 8 th International Conference on Computer

7

Central Idea of This Work Reduce the number of distance lower bounds

actually calculated by using an LB-tree constructed in the preprocessing stage the winner-update search strategy for traversing the

constructed LB-tree

Tighten distance lower bounds by constructing the LB-tree with an agglomerative

clustering method transforming each data point with

Wavelet transform KL transform (principal component analysis)

Page 8: Fast Algorithm for Nearest Neighbor Search Based on a Lower Bound Tree Yong-Sheng Chen Yi-Ping Hung Chiou-Shann Fuh 8 th International Conference on Computer

8

Proposed Algorithm for NN Search

Sample points in database

data transformation

Query point

data transformation

search by using winner-update strategy

agglomerative clustering

multilevel structure

Lower-bound tree

Page 9: Fast Algorithm for Nearest Neighbor Search Based on a Lower Bound Tree Yong-Sheng Chen Yi-Ping Hung Chiou-Shann Fuh 8 th International Conference on Computer

9

Multilevel Structure of a Point

For a point p=[p1, p2, …, pd] in Rd, d=2L, we denote its multilevel structure: {p0, p1, …, pL}

876543213 ppppppppp

43212 ppppp

211 ppp

10 pp

43212 qqqq q

876543213 qqqqqqqqq

211 qq q

10 q q

Llll ,...,022

,qpqp

2

00 qp

2

11 qp

2

22 qp

2

33 qp

EX

Page 10: Fast Algorithm for Nearest Neighbor Search Based on a Lower Bound Tree Yong-Sheng Chen Yi-Ping Hung Chiou-Shann Fuh 8 th International Conference on Computer

10

Lower Bound TreeMultilevel structure

s of every points in the database, p1,p2,…,ps

Idea of node reductionSelect representati

vesHierarchical, agglo

merative clustering

Page 11: Fast Algorithm for Nearest Neighbor Search Based on a Lower Bound Tree Yong-Sheng Chen Yi-Ping Hung Chiou-Shann Fuh 8 th International Conference on Computer

11

Hierarchical, Agglomerative Clustering

D1

D2

Ex:

r1 r2

Page 12: Fast Algorithm for Nearest Neighbor Search Based on a Lower Bound Tree Yong-Sheng Chen Yi-Ping Hung Chiou-Shann Fuh 8 th International Conference on Computer

12

Lower Bound Tree Representative fo

r each cluster mean point ml

j*

distance rlj* of the

farthest point in the cluster from ml

j*

Each point p*l satisfies

l

j

l

j

lr **

2

* mp

Page 13: Fast Algorithm for Nearest Neighbor Search Based on a Lower Bound Tree Yong-Sheng Chen Yi-Ping Hung Chiou-Shann Fuh 8 th International Conference on Computer

13

Distance Lower Bound Using LB-Tree For each point p*, we can derive the lower bound

of its distance to the query point q:

l

j

ll

jr **

2 qm

2

*

2

* llqpqp

p*l

ql

2

*

2**

l

j

lll

jmpqm

mlj*

rlj*

l

j

ll

j

ll

jrd ***

2),( qmqmLB

Page 14: Fast Algorithm for Nearest Neighbor Search Based on a Lower Bound Tree Yong-Sheng Chen Yi-Ping Hung Chiou-Shann Fuh 8 th International Conference on Computer

14

Data TransformationWhy Data Transformation?

Make anterior dims more discriminative than posterior dims.

Lower bound can be tightened.By data content, two transformations used i

n this work:Haar wavelet transform(autocorrelated data)KL transform (principal component analysis)

(object recognition)

Page 15: Fast Algorithm for Nearest Neighbor Search Based on a Lower Bound Tree Yong-Sheng Chen Yi-Ping Hung Chiou-Shann Fuh 8 th International Conference on Computer

15

Winner-Update Search for LB-Tree Traversal

8 2

3 2

9 63

4 7

Given a query point q

Calculate dLB for all the nodes in level 0Choose the node having the minimum dLB as the temporary winner

While the winner is not at the bottom level

Replace the winner node with its childrenCalculate dLB for each new child nodeChoose the node having the minimum dLB as the temporary winner

Output the final winner

Page 16: Fast Algorithm for Nearest Neighbor Search Based on a Lower Bound Tree Yong-Sheng Chen Yi-Ping Hung Chiou-Shann Fuh 8 th International Conference on Computer

16

Other Query TypesWinner-update algorithm can be easily

extended to support other useful query types:Progressive search for k-nearest neighborsSearch for k-nearest neighbors within a

distance thresholdSearch for neighbors close to the nearest

neighbor

Page 17: Fast Algorithm for Nearest Neighbor Search Based on a Lower Bound Tree Yong-Sheng Chen Yi-Ping Hung Chiou-Shann Fuh 8 th International Conference on Computer

17

Conclusions Fast nearest neighbor search techniques, includin

g Lower bound tree Winner-update search strategy Data transformation

Wavelet transform Principal component analysis

Various useful query types According to our experiments on Nayar’s object re

cognition database, our algorithm can be more than one thousand faster than the full search algorithm.

Page 18: Fast Algorithm for Nearest Neighbor Search Based on a Lower Bound Tree Yong-Sheng Chen Yi-Ping Hung Chiou-Shann Fuh 8 th International Conference on Computer

18

Thank you.