Hypersphere Dominance: An Optimal Approach
Cheng Long, Raymond Chi-Wing Wong, Bin Zhang, Min Xie
The Hong Kong University of Science and Technology
Prepared by Cheng Long
Presented by Cheng Long
24 June, 2014
Hyperspheres
A hypersphere in a d-dimensional space is specified by a (center, radius) pair
It is the set of all points whose distance from the center is bounded by the radius
2D: a disk
3D: a ball
Hyperspheres are commonly used
  Uncertain databases: the location of an uncertain object
  Spatial databases: SS-tree, SS+-tree, M-tree, VP-tree and SR-tree
SS-tree: similar to R-tree with hyperrectangles replaced by hyperspheres
[Figure: layout of 8 objects, A-H, and the SS-tree based on A-H]
Motivating example
Scenario
  Ada's location is uncertain, but constrained to a disk Sa.
  Bob's location is uncertain, but constrained to a disk Sb.
  Connie's location is uncertain, but constrained to a disk Sq.
Question
  Is Ada always closer to Connie than Bob?
[Figure: two layouts of the disks Sa (Ada), Sb (Bob), and Sq (Connie)]
No
For this specification of the locations, Ada is closer to Connie than Bob
In fact, for all specifications of the locations, Ada is closer to Connie than Bob
Yes
Hypersphere dominance: definition
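Definition 1: Hypersphere dominance
Given three hyperspheres Sa, Sb, and Sq, it decides whether Sa dominates Sb with respect to Sq
Dominance condition: for all points a in Sa, b in Sb, and q in Sq, Dist(a, q) < Dist(b, q)
  Yes: Dom(Sa, Sb, Sq) = true
  No: Dom(Sa, Sb, Sq) = false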
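A small sketch may help make the definition concrete. It assumes dominance means Dist(a, q) < Dist(b, q) for every a in Sa, b in Sb, and q in Sq (the "always closer" reading of the motivating example), and it represents a hypersphere as a hypothetical (center, radius) tuple. Random sampling can only disprove dominance by finding a counterexample; finding none proves nothing, which is why an exact test is needed. The function names are illustrative and not taken from the paper.

```python
import math
import random


def random_point_in_sphere(center, radius, rng):
    """Draw a point uniformly at random from the ball (center, radius)."""
    d = len(center)
    # A normalized Gaussian vector gives a uniformly random direction.
    direction = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in direction)) or 1.0
    # Scaling by U^(1/d) makes the point uniform in volume.
    scale = radius * rng.random() ** (1.0 / d) / norm
    return [c + scale * x for c, x in zip(center, direction)]


def find_counterexample(sa, sb, sq, trials=10000, seed=0):
    """Search for (a, b, q) with Dist(a, q) >= Dist(b, q).

    Returns such a triple if one is sampled, otherwise None.
    None does NOT prove dominance; it only means no violation was sampled.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        a = random_point_in_sphere(*sa, rng)
        b = random_point_in_sphere(*sb, rng)
        q = random_point_in_sphere(*sq, rng)
        if math.dist(a, q) >= math.dist(b, q):
            return a, b, q
    return None
```

For example, with sa = ((0, 0), 1), sb = ((10, 0), 1), and sq = ((0, 0), 1), every point of Sa is within distance 2 of every point of Sq while every point of Sb is at least 8 away, so no counterexample exists.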
Basic operator used in many queries
  Probabilistic RkNN query [Lian and Chen, VLDBJ'09]
  AkNN query [Emrich et al., SSDBM'10]
  kNN query [Long et al., SIGMOD'14]
Hypersphere dominance: existing solutions--overview
MinMax [Roussopoulos et al., SIGMOD Record'95; Hjaltason and Samet, TODS'99]
MBR [Emrich et al., SIGMOD'10]
GP [Lian and Chen, VLDBJ'09]
Trigonometric [Emrich et al., SSDBM'10]
Hypersphere dominance: existing solutions--MinMax (1)
MaxDist(Sa, Sb)
  Definition: the maximum distance between a point in Sa and a point in Sb
  MaxDist(Sa, Sb) = Dist(ca, cb) + ra + rb
MinDist(Sa, Sb)
  Definition: the minimum distance between a point in Sa and a point in Sb
  MinDist(Sa, Sb) = 0 (Sa and Sb overlap)
  MinDist(Sa, Sb) = Dist(ca, cb) - ra - rb (Sa and Sb do not overlap)
[Figure: Sa (center ca, radius ra) and Sb (center cb, radius rb), in the overlapping and non-overlapping cases]
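The two distance bounds between hyperspheres can be sketched in a few lines. This assumes each hypersphere is a hypothetical (center, radius) tuple and that Dist is the Euclidean distance; the names `max_dist` and `min_dist` are illustrative. Both run in O(d) time for d dimensions.

```python
import math


def max_dist(s1, s2):
    """Largest distance between a point of s1 and a point of s2."""
    (c1, r1), (c2, r2) = s1, s2
    return math.dist(c1, c2) + r1 + r2


def min_dist(s1, s2):
    """Smallest distance between a point of s1 and a point of s2.

    The value is 0 when the hyperspheres overlap, which is what the
    clamp with max(0, ...) expresses.
    """
    (c1, r1), (c2, r2) = s1, s2
    return max(0.0, math.dist(c1, c2) - r1 - r2)
```

With centers 5 apart and radii 1 and 2, the bounds are 5 + 1 + 2 = 8 and 5 - 1 - 2 = 2.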
Hypersphere dominance: existing solutions--MinMax (2)
MinMax
  Compute MaxDist(Sa, Sq)
  Compute MinDist(Sb, Sq)
  If MaxDist(Sa, Sq) < MinDist(Sb, Sq)
    Return true
  Else
    Return false
[Figure, case 1: MaxDist(Sa, Sq) < MinDist(Sb, Sq) and Dom(Sa, Sb, Sq) = true; MinMax returns true (correct)]
[Figure, case 2: MaxDist(Sa, Sq) > MinDist(Sb, Sq) but Dom(Sa, Sb, Sq) = true; MinMax returns false ("false negative")]
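The MinMax test, and the kind of false negative it produces, can be sketched as follows. The sketch assumes hyperspheres are hypothetical (center, radius) tuples with Euclidean distance, and that MinMax compares MaxDist(Sa, Sq) against MinDist(Sb, Sq); the function names are illustrative.

```python
import math


def max_dist(s1, s2):
    """Largest distance between a point of s1 and a point of s2."""
    (c1, r1), (c2, r2) = s1, s2
    return math.dist(c1, c2) + r1 + r2


def min_dist(s1, s2):
    """Smallest distance between a point of s1 and a point of s2."""
    (c1, r1), (c2, r2) = s1, s2
    return max(0.0, math.dist(c1, c2) - r1 - r2)


def minmax_dominates(sa, sb, sq):
    """MinMax test: true only if the farthest Sa-Sq pair is still
    closer than the nearest Sb-Sq pair."""
    return max_dist(sa, sq) < min_dist(sb, sq)


# A configuration where dominance holds but MinMax says false.
# Sa and Sb are single points, so "a is closer to q than b" holds
# exactly for points q with x < 5 (the perpendicular bisector).
# Sq spans x in [-18, -2], so it lies entirely on Sa's side.
sa = ((0.0, 0.0), 0.0)
sb = ((10.0, 0.0), 0.0)
sq = ((-10.0, 0.0), 8.0)
false_negative = minmax_dominates(sa, sb, sq)  # 18 < 12 fails
```

Here MaxDist(Sa, Sq) = 10 + 8 = 18 and MinDist(Sb, Sq) = 20 - 8 = 12, so the test fails even though every point of Sq is closer to Sa's point than to Sb's point.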
Hypersphere dominance: existing solutions--Insufficiency
Criteria of a method:
1. Correctness: No false positive
2. Soundness: No false negative
3. Efficiency: runs in O(d) time, where d is the number of dimensions
Methods                   Correct?  Sound?  Efficient?
MinMax                    Yes       No      Yes
MBR                       Yes       No      Yes
GP                        Yes       No      Yes
Trigonometric             No        Yes     Yes
Our approach (Hyperbola)  Yes       Yes     Yes
Our approach is the only one which is correct, sound and efficient!
Our approach: major idea
Step 1: pre-checking
  For cases where it is easy to decide whether the dominance condition is true
  Make the decision directly
Step 2: dominance checking
  For cases where it is difficult to decide whether the dominance condition is true directly
  Derive an equivalent condition of the dominance condition which is easier to decide
  Make the decision
Our approach: pre-checking
Step 1: Pre-checking:
  If Sa and Sb overlap
    Return false
  If Sb and Sq overlap
    Return false
[Figure, left: Sa and Sb overlap, so Dom(Sa, Sb, Sq) = false]
[Figure, right: Sb and Sq overlap, so Dom(Sa, Sb, Sq) = false]
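The pre-checking step can be sketched as below. It assumes hyperspheres are hypothetical (center, radius) tuples, and it treats two hyperspheres that merely touch as overlapping; that boundary convention is an assumption of this sketch, not something the slides state. The intuition: a point shared by Sa and Sb, or by Sb and Sq, immediately yields a case where Ada is not strictly closer.

```python
import math


def overlaps(s1, s2):
    """True if the two hyperspheres share at least one point.

    Assumption: touching hyperspheres count as overlapping.
    """
    (c1, r1), (c2, r2) = s1, s2
    return math.dist(c1, c2) <= r1 + r2


def pre_check(sa, sb, sq):
    """Return False when pre-checking already settles the answer,
    or None when the dominance-checking step is still needed."""
    if overlaps(sa, sb):
        return False  # some point lies in both Sa and Sb
    if overlaps(sb, sq):
        return False  # some b in Sb coincides with some q in Sq
    return None
```

A caller would fall through to the dominance-checking step only when `pre_check` returns None.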
Our approach: dominance checking (1)
Dominance condition: for all points a in Sa, b in Sb, and q in Sq, Dist(a, q) < Dist(b, q)
Equivalent condition (1):
Proof of the equivalence between Condition (1) and Condition (2): "=>": By contradiction "<=":
Step 2: Dominance checking: Derive an equivalent condition of the dominance condition and check whether the derived condition is true
Our approach: dominance checking (2)
Equivalent condition (2):
Equivalent condition (3):
MaxDist(q, Sa) = Dist(q, ca) + ra + 0 = Dist(q, ca) + ra
MinDist(q, Sb) = Dist(q, cb) - rb - 0 = Dist(q, cb) - rb
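As a worked step, the two expansions above can be combined. This is a reading of those expansions rather than a transcription of the slide's own formulas, and it assumes the per-point requirement is that the farthest point of Sa is closer to q than the nearest point of Sb, with q outside Sb (which pre-checking guarantees for every q in Sq):

```latex
\begin{align*}
\mathrm{MaxDist}(q, S_a) &< \mathrm{MinDist}(q, S_b) \\
\iff\quad \mathrm{Dist}(q, c_a) + r_a &< \mathrm{Dist}(q, c_b) - r_b \\
\iff\quad \mathrm{Dist}(q, c_b) - \mathrm{Dist}(q, c_a) &> r_a + r_b
\end{align*}
```

The set of points whose distances to two fixed points differ by a constant is one branch of a hyperbola in 2D, which fits the name of the method (Hyperbola).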
Our approach: dominance checking (3)
Equivalent condition (3):
Space partitioning: a Boundary splits the space into Region Ra and Region Rb
[Figure: Sa (center ca), Sb (center cb), and Sq (center cq); the Boundary separates Region Ra from Region Rb]
Equivalent condition (4): Sq is in Region Ra
Our approach: dominance checking (4)
Equivalent condition (4): Sq is in Region Ra
Equivalent condition (5): cq is in Region Ra and min_{x on the Boundary} Dist(cq, x) > rq
[Figure: Sq (center cq, radius rq) in Region Ra; min_{x on the Boundary} Dist(cq, x) is the distance from cq to the Boundary]
Our approach (2)
Compute min_{x on the Boundary} Dist(cq, x)
  constraint: x is on the Boundary
  objective: minimize Dist(cq, x)
We use the Lagrange Multiplier (LM) method. Details can be found in the paper
Correct and sound: the condition (3) is equivalent to the dominance condition
Efficient: each condition transformation takes O(d) time and the cost of LM is also O(d)
Equivalent condition (5): cq is in Region Ra and min_{x on the Boundary} Dist(cq, x) > rq
Space partitioning: Boundary, Region Ra, Region Rb
Empirical study: set-up
Datasets:
  Real datasets: NBA, Color, Texture, and Forest
  Synthetic datasets
Algorithms: MinMax, MBR, GP, Trigonometric, Hyperbola (our method)
Measures:
  precision = TP/(TP+FP)
  recall = TP/(TP+FN)
  running time
A correct method has the precision always equal to 1
A sound method has the recall always equal to 1
Criteria of a method:
1. Correctness: No false positive (FP)
2. Soundness: No false negative (FN)
3. Efficiency: runs in O(d) time, where d is the number of dimensions
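The two quality measures can be computed from a method's answers and the ground truth as in the sketch below. The function name and the list-of-booleans input format are assumptions for illustration, as is the convention of reporting 1.0 when a denominator is zero.

```python
def precision_recall(predicted, actual):
    """Compute (precision, recall) for boolean dominance decisions.

    predicted[i] is the method's answer for test case i and
    actual[i] is the true answer.
    """
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)        # true positives
    fp = sum(1 for p, a in pairs if p and not a)    # false positives
    fn = sum(1 for p, a in pairs if not p and a)    # false negatives
    # Assumption: an empty denominator is reported as 1.0.
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return precision, recall
```

A method that never answers true wrongly (no FP) but misses one of three true cases would get precision 1 and recall 2/3, matching the profile of a correct but unsound method.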
Empirical study: results (precision, NBA)
All algorithms except Trigonometric have precision = 1.
Empirical study: results (recall, NBA)
Only our approach (Hyperbola) and Trigonometric have recall = 1.
Empirical study: results (running time, NBA)
MinMax < GP < Hyperbola (our method) < MBR < Trigonometric
Conclusion
First solution for the hypersphere dominance problem which is correct, sound and efficient for any dimension
An application study: kNN
Experiments
Q & A
The following slides are for backup use only
Hyperspheres in uncertain databases
Song and Roussopoulos [SSTD'01]
Cheng et al. [TKDE'04]
Chen and Cheng [ICDE'07]
Beskales et al. [PVLDB'08]
Our approach (1)
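Major idea: Derive an equivalent condition of the dominance condition and check whether the derived condition is true
Dominance condition: for all points a in Sa, b in Sb, and q in Sq, Dist(a, q) < Dist(b, q)
Equivalent condition (1):
Equivalent condition (2):
Equivalent condition (3):
Definition 1: Hypersphere dominance
Given three hyperspheres Sa, Sb, and Sq, it decides whether Sa dominates Sb with respect to Sq
  Yes: Dom(Sa, Sb, Sq) = true
  No: Dom(Sa, Sb, Sq) = false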
An application study: kNN query
kNN query: Given a set D of hyperspheres, a query hypersphere Sq, and an integer k, the query finds the set of hyperspheres in D each of which is not dominated, wrt Sq, by the hypersphere in D with the k-th smallest maximum distance from Sq.
Solution:
  A best-first search algorithm based on SS-tree
  Some pruning strategies
Illustration 1: 2D space, Sa and Sb are two points (i.e., ra = 0, rb = 0)
[Figure: points Sa (ca) and Sb (cb), and Sq with center cq; the Boundary separates Region Ra from Region Rb]