Reverse Furthest Neighbors in Spatial Databases

Bin Yao, Feifei Li, Piyush Kumar (Florida State University, USA)


Post on 30-Dec-2015



TRANSCRIPT

Page 1: Reverse Furthest Neighbors in Spatial Databases

Reverse Furthest Neighbors in Spatial Databases

Bin Yao, Feifei Li, Piyush Kumar

Florida State University, USA

Page 2: Reverse Furthest Neighbors in Spatial Databases

A Novel Query Type: Reverse Furthest Neighbors (RFN)

Given a point q and a data set P, find the set of points in P that take q as their furthest neighbor.

Two versions: Monochromatic Reverse Furthest Neighbors (MRFN) and Bichromatic Reverse Furthest Neighbors (BRFN).

Page 3: Reverse Furthest Neighbors in Spatial Databases

Motivation and Related Work

Motivation: inspired by Reverse Nearest Neighbor (RNN) queries.

RNN: the set of points that take the query point as their nearest neighbor (NN). It comes in monochromatic and bichromatic versions.

Many applications that motivate the study of RNN have corresponding “furthest” versions.

Page 4: Reverse Furthest Neighbors in Spatial Databases

MRFN Application

P: a set of sites of interest in a region.

Any site can find the sites that take it as their furthest neighbor.

This implies that visitors to the RFN of a site are unlikely to visit this site because of the long distance.

Ideally, the site should put more effort into advertising itself at those sites.

Page 5: Reverse Furthest Neighbors in Spatial Databases

BRFN Application

P: a set of customers.

Q: a set of business competitors offering similar products.

A distance measure reflects the rating of customer p for competitor q's product. A larger distance indicates a lower preference.

For any competitor in Q, an interesting query is to discover the customers that dislike its product the most among all competing products in the market.

Page 6: Reverse Furthest Neighbors in Spatial Databases

BRFN Example

[Figure: customers p1 to p8 and products q1 to q3 plotted in the plane.]

RFN of q1: p3, p5, p6, p7, p8

RFN of q2: ∅

RFN of q3: p1, p2, p4

Page 7: Reverse Furthest Neighbors in Spatial Databases

MRFN and BRFN

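MRFN for q and P:

$MRFN(q, P) = \{\, p \mid p \in P,\ fn(p, P \cup \{q\}) = q \,\}$

BRFN for a point q in Q and P:

$BRFN(q, Q, P) = \{\, p \mid p \in P,\ fn(p, Q) = q \,\}$

Here fn(p, S) denotes the furthest neighbor of p in the set S.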
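The two definitions can be checked directly with a brute-force scan. The following is a minimal sketch, not the authors' method: the function names `fn`, `mrfn`, and `brfn` and the sample coordinates are illustrative assumptions, points are plain 2D tuples, and distance ties are resolved in favor of the earlier point in the list.

```python
from math import dist

def fn(p, points):
    """Furthest neighbor of p among `points` (first one wins on ties)."""
    return max(points, key=lambda x: dist(p, x))

def mrfn(q, P):
    """Points p in P whose furthest neighbor in P ∪ {q} is q."""
    return [p for p in P if fn(p, P + [q]) == q]

def brfn(q, Q, P):
    """Points p in P whose furthest neighbor in Q is q."""
    return [p for p in P if fn(p, Q) == q]

# Hypothetical sample data.
P = [(0, 0), (1, 0), (9, 1), (10, 0)]
Q = [(0, 5), (10, 5)]
print(mrfn((-5, 0), P))     # [(9, 1), (10, 0)]
print(brfn((0, 5), Q, P))   # [(9, 1), (10, 0)]
```

Each query costs O(|P|^2) for MRFN and O(|P| |Q|) for BRFN in this form, which is the cost the index-based algorithms in the following slides aim to avoid.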

Page 8: Reverse Furthest Neighbors in Spatial Databases

Outline

MRFN
    Progressive Furthest Cell Algorithm
    Convex Hull Furthest Cell Algorithm
    Dynamic updates to the dataset

BRFN

Page 9: Reverse Furthest Neighbors in Spatial Databases

MRFN: Progressive Furthest Cell Algorithm (first algorithm)

Lemma: Any point in the furthest Voronoi cell (fvc) of p takes p as its furthest neighbor among all points in P.

[Figure: points p1 to p5 with the furthest Voronoi cell fvc(p1).]
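The lemma amounts to a membership test: x lies in fvc(p) exactly when no other point of P is farther from x than p. A minimal sketch of that test, assuming 2D tuples and a hypothetical helper name `in_fvc`:

```python
from math import dist

def in_fvc(x, p, P):
    """True if x lies in fvc(p, P): no point of P is strictly farther from x than p."""
    return all(dist(x, p) >= dist(x, other) for other in P)

# Hypothetical sample data: the corners of a 4 x 3 rectangle.
P = [(0, 0), (4, 0), (4, 3), (0, 3)]
x = (5, 4)
owners = [p for p in P if in_fvc(x, p, P)]
print(owners)   # [(0, 0)]
```

The point x sits just outside the corner (4, 3), so it falls in the cell of the opposite corner (0, 0), which is its furthest neighbor.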

Page 10: Reverse Furthest Neighbors in Spatial Databases

Progressive Furthest Cell Algorithm (PFC)

PFC(Query q; R-tree T)
    Initialize two empty vectors V_C and V_P; priority queue L with T's root node; fvc(q) = S;
    While L is not empty do
        Pop the head entry e of L;
        If e is a point then
            Update fvc(q);
            If fvc(q) is empty, return;
            If e is in fvc(q), then push e into V_C;
        Else
            If e ∩ fvc(q) is empty, then push e into V_P;
            Else for every child u of node e
                If u ∩ fvc(q) is empty, insert u into V_P;
                Else insert u into L;
    Update fvc(q) using points contained by entries in V_P;
    Filter points in V_C using fvc(q);

[Figure: points p1 to p4 with fvc(p1).]
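The core of PFC is the repeated "update fvc(q)" step, which shrinks a convex region by one half-plane per point. The sketch below illustrates only that step and the final filtering; it is a simplification under stated assumptions, not the slide's algorithm as given: it scans a flat list of points instead of traversing an R-tree with a priority queue, it assumes the initial region S is a bounding box (given counter-clockwise) that contains all of P, and the names `clip`, `inside`, and `pfc_points` are hypothetical.

```python
def clip(poly, q, e):
    """Clip convex polygon `poly` to the points at least as far from q as from e."""
    ax, ay = 2 * (e[0] - q[0]), 2 * (e[1] - q[1])
    b = e[0] ** 2 + e[1] ** 2 - q[0] ** 2 - q[1] ** 2

    def side(v):                      # >= 0 means v is kept
        return ax * v[0] + ay * v[1] - b

    out = []
    for i, cur in enumerate(poly):
        prev = poly[i - 1]
        sp, sc = side(prev), side(cur)
        if (sp < 0) != (sc < 0):      # edge prev -> cur crosses the bisector
            t = sp / (sp - sc)
            out.append((prev[0] + t * (cur[0] - prev[0]),
                        prev[1] + t * (cur[1] - prev[1])))
        if sc >= 0:
            out.append(cur)
    return out

def inside(poly, x, eps=1e-9):
    """True if x lies in the counter-clockwise convex polygon `poly`."""
    for i, b in enumerate(poly):
        a = poly[i - 1]
        if (b[0] - a[0]) * (x[1] - a[1]) - (b[1] - a[1]) * (x[0] - a[0]) < -eps:
            return False
    return True

def pfc_points(q, P, box):
    """Simplified PFC over a flat point list (no R-tree, no node pruning)."""
    fvc, candidates = box, []
    for e in P:
        fvc = clip(fvc, q, e)         # update fvc(q)
        if len(fvc) < 3:              # fvc(q) is empty: q has no RFN
            return []
        if inside(fvc, e):
            candidates.append(e)      # plays the role of V_C
    return [p for p in candidates if inside(fvc, p)]   # final filter

# Hypothetical sample data.
P = [(0, 0), (1, 0), (9, 1), (10, 0)]
box = [(-20, -20), (20, -20), (20, 20), (-20, 20)]
print(pfc_points((-5, 0), P, box))    # [(9, 1), (10, 0)]
```

The early return is what makes the approach progressive: for a query point in the middle of the data, the cell usually becomes empty after a few clips and the remaining points are never examined.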

Page 11: Reverse Furthest Neighbors in Spatial Databases

Outline

MRFN
    Progressive Furthest Cell Algorithm
    Convex Hull Furthest Cell Algorithm
    Dynamic updates to the dataset

BRFN

Page 12: Reverse Furthest Neighbors in Spatial Databases

MRFN: Convex Hull Furthest Cell Algorithm (second algorithm)

Lemma: the furthest point for p from P is always a vertex of the convex hull of P (i.e., only vertices of the CH have RFN).

CHFC(Query q; R-tree T (on P))
    Find the convex hull C_P of P; // compute only once
    If q ∈ C_P (q lies within the hull), then return empty;
    Else
        Compute C*_P using C_P ∪ {q};
        Set fvc(q, P*) equal to fvc(q, C*_P);
        Execute a range query using fvc(q, P*) on T;
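The CHFC steps can be sketched as follows. This is an illustration under assumptions rather than the authors' implementation: the hull is built with Andrew's monotone chain instead of an R-tree based method, the final range query on T is replaced by a linear scan of P, points are 2D tuples, and the names `convex_hull` and `chfc` are hypothetical.

```python
from math import dist

def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half(seq):
        chain = []
        for p in seq:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()           # drops interior and collinear points
            chain.append(p)
        return chain[:-1]

    return half(pts) + half(reversed(pts))

def chfc(q, P, hull_P=None):
    """Sketch of CHFC with a linear scan standing in for the range query on T."""
    hull_P = hull_P or convex_hull(P)         # C_P, reusable across queries
    hull_star = convex_hull(hull_P + [q])     # C*_P, built from C_P ∪ {q}
    if q not in hull_star:                    # q is not a hull vertex: no RFN
        return []
    # p is in fvc(q, C*_P) iff no hull vertex is farther from p than q.
    return [p for p in P if all(dist(p, q) >= dist(p, v) for v in hull_star)]

# Hypothetical sample data.
P = [(0, 0), (1, 0), (9, 1), (10, 0)]
print(chfc((-5, 0), P))      # [(9, 1), (10, 0)]
print(chfc((5, 0.2), P))     # []
```

Because only hull vertices are compared against, each candidate costs O(h) distance checks for a hull of h vertices, instead of O(|P|).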

Page 13: Reverse Furthest Neighbors in Spatial Databases

Outline

MRFN
    Progressive Furthest Cell Algorithm
    Convex Hull Furthest Cell Algorithm
    Dynamic updates to the dataset

BRFN

Page 14: Reverse Furthest Neighbors in Spatial Databases

Dynamic updates to the dataset

PFC: update the R-tree.

CHFC: update the R-tree and re-compute the convex hull (expensive).

Qhull algorithm

Page 15: Reverse Furthest Neighbors in Spatial Databases

Dynamically Maintaining CH: insertion

[Figure: points p1 to p6 and a newly inserted point p7.]

$C_{P \cup \{p_7\}} = C_{C_P \cup \{p_7\}}$
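The insertion rule says the new hull can be built from the old hull vertices plus the inserted point, without touching the rest of P. A small sketch that checks this on sample data, assuming a monotone-chain hull in place of whatever hull routine the system uses; `convex_hull` and `insert_point` are hypothetical names.

```python
def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half(seq):
        chain = []
        for p in seq:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]

    return half(pts) + half(reversed(pts))

def insert_point(hull, p):
    """Hull of P ∪ {p}, computed from the old hull vertices and p only."""
    return convex_hull(hull + [p])

# Hypothetical sample data: a rectangle with one interior point.
P = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1)]
hull = convex_hull(P)
new_hull = insert_point(hull, (6, 1))
print(new_hull == convex_hull(P + [(6, 1)]))   # True
```

Inserting a point that falls inside the current hull leaves the hull unchanged, so only insertions outside it cost more than a containment check.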

Page 16: Reverse Furthest Neighbors in Spatial Databases

Dynamically Maintaining CH: deletion

[Figure: points p1 to p9, illustrating a deletion.]

The qhull algorithm

Page 17: Reverse Furthest Neighbors in Spatial Databases

Dynamically Maintaining CH

[Figure: points p1 to p3 and entries e1 to e3, with minVdist and maxVdist marked.]

Adapt qhull to the R-tree

Page 18: Reverse Furthest Neighbors in Spatial Databases

Outline

MRFN
    Progressive Furthest Cell Algorithm
    Convex Hull Furthest Cell Algorithm
    Dynamic updates to the dataset

BRFN

Page 19: Reverse Furthest Neighbors in Spatial Databases

BRFN

After resolving all the difficulties for the MRFN problem, solving the BRFN problem becomes almost immediate.

Observations:

All points in P that are contained by fvc(q, Q) have q as their furthest neighbor.

Only the vertices of the convex hull of Q have a non-empty fvc.

Page 20: Reverse Furthest Neighbors in Spatial Databases

BRFN algorithm

BRFN(Query q, Q; R-tree T)
    Compute the convex hull C_Q of Q;
    If q is not a vertex of C_Q, then return empty;
    Else
        Compute fvc(q, C_Q);
        Execute a range query using fvc(q, C_Q) on T;
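The BRFN procedure can be sketched in the same style. As before this is a hedged illustration, not the authors' code: the hull of Q is computed with Andrew's monotone chain, the range query on T is replaced by a scan of P, and `convex_hull` and `brfn` are hypothetical names operating on 2D tuples.

```python
from math import dist

def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half(seq):
        chain = []
        for p in seq:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]

    return half(pts) + half(reversed(pts))

def brfn(q, Q, P):
    """Sketch of BRFN with a linear scan standing in for the range query on T."""
    hull_Q = convex_hull(Q)                   # C_Q
    if q not in hull_Q:                       # only hull vertices have RFN
        return []
    # p is in fvc(q, C_Q) iff no vertex of C_Q is farther from p than q.
    return [p for p in P if all(dist(p, q) >= dist(p, v) for v in hull_Q)]

# Hypothetical sample data: (5, 5) lies between the other two competitors.
Q = [(0, 5), (10, 5), (5, 5)]
P = [(0, 0), (1, 0), (9, 1), (10, 0)]
print(brfn((0, 5), Q, P))    # [(9, 1), (10, 0)]
print(brfn((5, 5), Q, P))    # []
```

The hull of Q depends only on the query group, so it can be computed once and shared by queries for every q in Q.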

Page 21: Reverse Furthest Neighbors in Spatial Databases

BRFN: Disk-Resident Query Group

Limitation: the query group Q may be too large to fit in memory

Solution: Approximate convex hull of Q (Dudley’s approximation)

Page 22: Reverse Furthest Neighbors in Spatial Databases

Experiment Setup

Datasets:
    Real datasets (Map: USA, CA, SF)
    Synthetic datasets (UN, CB, R-Cluster)

Measurement:
    Computation time
    Number of IOs
    Average of 1000 queries

Page 23: Reverse Furthest Neighbors in Spatial Databases

MRFN algorithm

[Figures: CPU computation; number of IOs]

Page 24: Reverse Furthest Neighbors in Spatial Databases

BRFN algorithms

[Figures: CPU, varying A, Q=1000; IOs, varying A, Q=1000]

Page 25: Reverse Furthest Neighbors in Spatial Databases

Scalability of various algorithms

[Figures: MRFN number of IOs; BRFN number of IOs]

Page 26: Reverse Furthest Neighbors in Spatial Databases

Conclusion

Introduced a novel query (RFN) for spatial databases.

Presented R-tree based algorithms for both versions of RFN that feature excellent pruning capability.

Conducted a comprehensive experimental evaluation.

Page 27: Reverse Furthest Neighbors in Spatial Databases

Thank you! Questions?

Page 28: Reverse Furthest Neighbors in Spatial Databases

Datasets: San Francisco

Page 29: Reverse Furthest Neighbors in Spatial Databases

Datasets: California

Page 30: Reverse Furthest Neighbors in Spatial Databases

Datasets: North America

Page 31: Reverse Furthest Neighbors in Spatial Databases

Datasets: uncorrelated uniform

Page 32: Reverse Furthest Neighbors in Spatial Databases

Datasets: correlated bivariate

Page 33: Reverse Furthest Neighbors in Spatial Databases

Datasets: random clusters