1
Query Processing in Spatial Network Databases
presented by Hao Hong
Dimitris Papadias Jun Zhang
Hong Kong University of Science and Technology
Nikos Mamoulis University of Hong Kong
Yufei Tao City University of Hong Kong
2
Outline
  Introduction
  Related work
    Spatial query processing in Euclidean Space
    Disk-based graph representations- CCAM structure
  Spatial query in network databases
    Architecture
    Spatial queries:
      Nearest neighbor query
      Other queries
  Performances
  Summary
3
Introduction
  Motivation
    Euclidean distance vs. network distance
    Euclidean distance <= network distance
  Conventional spatial queries:
    k nearest neighbor query: retrieves the k points closest to the query location
    Range query: retrieves the points covered by a given range
    Intersection join: retrieves all pairs of objects from two datasets that intersect
5
Spatial query processing in Euclidean Space
  An R-tree index
    Multidimensional extension of the B-tree (quoted from Query Processing in Spatial Network Databases)
    MBR: Minimum Bounding Rectangle
    Spatial points are clustered by proximity, and each cluster is enclosed in an MBR
    A search can skip every MBR that cannot contain a result, so the R-tree is fast for spatial queries
6
Spatial query processing in Euclidean Space
An example
[Figure: points n1 to n9 enclosed in MBRs E1 to E6, shown next to the corresponding R-tree]
7
Disk-based graph representations- CCAM structure
  A graph can be represented as
    An adjacency list
    A two-dimensional matrix
  The Connectivity-Clustered Access Method (CCAM) structure
    Stores the one-dimensional adjacency lists
    Stores the lists of neighboring nodes together on the same disk page
8
Disk-based graph representations- CCAM structure
An example
[Figure: a graph with nodes n1 to n5 and weighted edges; a B-tree ordered by node id that points to the adjacency lists (List 1 to List 5), which are stored on disk pages page1 and page2. The adjacency list of n1 is: n2 (weight 2), n4 (weight 7), n5 (weight 5), null.]
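The adjacency-list idea can be sketched in a few lines of Python. This is only an illustration, not the CCAM implementation: it uses the four edges whose weights are legible in the slides (n1-n2, n1-n4, n1-n5, n4-n5), and `pack_pages` is a hypothetical helper that groups lists in breadth-first order as a crude stand-in for CCAM's connectivity clustering.

```python
from collections import defaultdict, deque

def build_adjacency(edges):
    """Build an undirected adjacency list: node -> [(neighbor, weight), ...]."""
    adj = defaultdict(list)
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    return dict(adj)

def pack_pages(adj, start, page_size):
    """Hypothetical helper: place adjacency lists on pages in BFS order,
    so that the lists of neighboring nodes tend to share a page."""
    order, seen, queue = [], {start}, deque([start])
    while queue:
        u = queue.popleft()
        order.append(u)
        for v, _ in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return [order[i:i + page_size] for i in range(0, len(order), page_size)]

# Only the edges that are legible in the example; the rest of the graph is omitted.
EDGES = [("n1", "n2", 2), ("n1", "n4", 7), ("n1", "n5", 5), ("n4", "n5", 4)]
ADJ = build_adjacency(EDGES)
PAGES = pack_pages(ADJ, "n1", page_size=2)
```

Looking up `ADJ["n1"]` then corresponds to reading the adjacency list of n1 from its disk page.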
10
Architecture
The above example
[Figure: the network (nodes n1 to n5 with weighted edges), the network R-tree with entries E1 and E2, and the adjacency component with List 1 to List 5 stored on disk pages page1 and page2. The adjacency list of n1 holds one entry per neighbor with the edge weight and the segment MBR: 2, MBR(n1, n2); 7, MBR(n1, n4); 5, MBR(n1, n5).]
11
Architecture
  Adjacency component:
    Enables fast access to neighbor nodes and the corresponding polylines
  Polyline component:
    Stores the endpoints and the MBR of each segment
[Figure: the network R-tree (entries E1, E2) over nodes n1 to n5; the adjacency component on disk pages page1 and page2, where each entry carries a listPtr, the edge weight and the segment MBR, e.g. 2, MBR(n1, n2); 7, MBR(n1, n4); 5, MBR(n1, n5); 4, MBR(n4, n5); and the polyline component on pages pageN and pageM, which holds the polyline and MBR of each segment, e.g. the polylines of (n1, n2), (n1, n4) and (n1, n5).]
12
Primitive operations
  Check_entity(seg, p): returns true if entity p is covered by segment seg
  Find_segment(p): returns the segment that covers entity p
    If more than one segment covers p, returns the first one
    If no segment covers p, returns the closest one
  Find_entities(seg): returns the entities covered by the specified segment seg
  Compute_ND(p1, p2): returns the network distance between the specified entities p1 and p2
    p1 and p2 are arbitrary points in the network
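The four primitives can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's disk-based implementation: the toy network and entities are hypothetical, each entity is stored directly as (segment, offset from the segment's first node) instead of being located through the network R-tree, and `compute_nd` uses Dijkstra's algorithm seeded at both endpoints of p1's segment.

```python
import heapq

# Hypothetical toy network: edge lengths keyed by (u, v).
EDGES = {("a", "b"): 4.0, ("b", "c"): 3.0, ("a", "c"): 10.0}
# Hypothetical entities: name -> (segment, offset from the segment's first node).
ENTITIES = {"p1": (("a", "b"), 1.0), "p2": (("b", "c"), 2.0), "p3": (("a", "c"), 5.0)}

def find_entities(seg):
    """Entities covered by segment seg."""
    return [p for p, (s, _) in ENTITIES.items() if s == seg]

def check_entity(seg, p):
    """True if entity p is covered by segment seg."""
    return ENTITIES[p][0] == seg

def find_segment(p):
    """Segment that covers entity p (a plain lookup in this sketch)."""
    return ENTITIES[p][0]

def _neighbors(u):
    for (x, y), w in EDGES.items():
        if x == u:
            yield y, w
        elif y == u:
            yield x, w

def compute_nd(p1, p2):
    """Network distance between two entities lying anywhere on the network."""
    (u1, v1), o1 = ENTITIES[p1]
    (u2, v2), o2 = ENTITIES[p2]
    len1, len2 = EDGES[(u1, v1)], EDGES[(u2, v2)]
    dist = {}
    heap = [(o1, u1), (len1 - o1, v1)]  # leave p1's segment via either endpoint
    heapq.heapify(heap)
    while heap:
        d, u = heapq.heappop(heap)
        if u in dist:
            continue
        dist[u] = d
        for v, w in _neighbors(u):
            if v not in dist:
                heapq.heappush(heap, (d + w, v))
    best = min(dist[u2] + o2, dist[v2] + len2 - o2)  # enter p2's segment via either endpoint
    if (u1, v1) == (u2, v2):
        best = min(best, abs(o1 - o2))  # both entities on the same segment
    return best
```

In this toy network, `compute_nd("p1", "p2")` follows the path p1 to b to p2.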
14
Nearest neighbor query
Incremental Euclidean Restriction (IER) algorithm: find the first nearest neighbor of location q
  Find the Euclidean nearest neighbor, n3
  Compute the network distance: dN(q, n3) = Compute_ND(q, n3)
  Set dEmax = dN(q, n3)
  Repeat by retrieving the next Euclidean nearest neighbor nk. If dN(q, nk) < dEmax, nk becomes the current result and dEmax = dN(q, nk)
  Stop when the next Euclidean neighbor lies farther than dEmax from q in Euclidean distance, and return the node that last set dEmax
[Figure: example network with nodes n1 to n5, weighted edges and the query location q]
15
Nearest neighbor query
Incremental Euclidean Restriction (IER) algorithm: find the k nearest neighbors of location q
  Retrieve the k Euclidean nearest neighbors from the R-tree and sort them in ascending order of their network distance to q
  If the kth node is nk, set dEmax = dN(q, nk)
  Repeat by retrieving further nodes (nk+1, nk+2, ..., nm). For node ni, if dN(q, ni) <= dEmax:
    Insert ni into the queue of k nearest neighbors
    Remove the former kth node
    Set dEmax to the network distance of the new kth node
  Stop when the next node's Euclidean distance to q exceeds dEmax, and return the k nodes in the queue
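The IER loop can be sketched as below. This is a minimal illustration under several assumptions: entities sit on network nodes (as in the 1-NN example), a sorted list stands in for incremental nearest-neighbor retrieval from the R-tree, one Dijkstra run from q stands in for the per-candidate Compute_ND calls, and the coordinates and edge weights are hypothetical values chosen so that every edge is at least as long as the Euclidean distance between its endpoints.

```python
import heapq
import math

def network_distances(adj, source):
    """Dijkstra from source; returns node -> network distance."""
    dist, heap = {}, [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in dist:
            continue
        dist[u] = d
        for v, w in adj[u]:
            if v not in dist:
                heapq.heappush(heap, (d + w, v))
    return dist

def ier_knn(coords, adj, q, entities, k):
    """Return the k network nearest neighbors of q as (distance, entity) pairs."""
    dn = network_distances(adj, q)  # stand-in for Compute_ND(q, p)
    # Stand-in for incremental Euclidean NN retrieval from the R-tree.
    by_euclid = sorted(entities, key=lambda p: math.dist(coords[q], coords[p]))
    best = []  # at most k pairs, ascending by network distance
    for p in by_euclid:
        d_e = math.dist(coords[q], coords[p])
        if len(best) == k and d_e > best[-1][0]:
            break  # d_e exceeds dEmax: no later candidate can be closer in the network
        best.append((dn[p], p))
        best.sort()
        best = best[:k]  # drop the former kth neighbor if it was displaced
    return best

# Hypothetical example: a is the Euclidean NN of q, but b is the network NN.
COORDS = {"q": (0, 0), "a": (1, 0), "b": (0, 2), "c": (6, 0)}
ADJ = {"q": [("b", 2)], "b": [("q", 2), ("a", 3), ("c", 7)],
       "a": [("b", 3)], "c": [("b", 7)]}
```

With these values, `ier_knn(COORDS, ADJ, "q", ["a", "b", "c"], 1)` examines a first (a false hit), then b, and stops before computing anything for c.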
16
Nearest neighbor query
[Figure: example network with nodes n1 to n9 and weighted edges, entities p1 to p5 and the query location q]
In this case IER retrieves p5 last, because it has the largest Euclidean distance to q
Quoted from Query Processing in Spatial Network Databases
17
Nearest neighbor query
Incremental Network Expansion (INE) algorithm
  Checks nodes in the order in which they are encountered while expanding the network from q
  Start from the segment (n1, n2), which covers q, and initialize Q with its endpoints
  Since (n1, n2) does not cover any entity, expand n1, which reaches n7: Q = <(n2, 5), (n7, 12)>
  Repeat the expansion: expand n2, which reaches n4 and n3: Q = <(n4, 7), (n3, 9), (n7, 12)>
  Here p5 is covered by segment (n2, n4), so set the threshold dNmax = dN(q, p5) = 6
  Since the next node in Q has dN(q, n4) = 7 > dNmax, the algorithm terminates and returns p5
[Figure: example network with nodes n1 to n9 and weighted edges, entities p1 to p5 and the query location q]
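The expansion described on this slide can be sketched as follows. It is a simplified in-memory illustration, not the paper's implementation. The edge lengths are chosen to reproduce the queue values on the slide; the distance of 3 from q to n1 is an assumption (the slide only implies it is below 5), and only the fragment of the network that the walkthrough touches is modeled.

```python
import heapq

def ine_knn(edges, entities, q_seg, q_off, k):
    """edges: (u, v) -> length; entities: name -> ((u, v), offset from u).
    The query lies on q_seg at distance q_off from the segment's first node."""
    adj = {}
    for (u, v), w in edges.items():
        adj.setdefault(u, []).append((v, w, (u, v)))
        adj.setdefault(v, []).append((u, w, (u, v)))

    found = {}  # entity -> best network distance seen so far

    def scan(seg, node, d_node):
        # Entities on seg, reached through `node`, which is d_node away from q.
        for p, (s, off) in entities.items():
            if s == seg:
                along = off if node == seg[0] else edges[seg] - off
                found[p] = min(found.get(p, float("inf")), d_node + along)

    def dn_max():
        ds = sorted(found.values())
        return ds[k - 1] if len(ds) >= k else float("inf")

    # Entities on the query's own segment are reached directly.
    for p, (s, off) in entities.items():
        if s == q_seg:
            found[p] = abs(off - q_off)

    u, v = q_seg
    heap = [(q_off, u), (edges[q_seg] - q_off, v)]
    heapq.heapify(heap)
    done = set()
    while heap:
        d, n = heapq.heappop(heap)
        if d >= dn_max():
            break  # no unexplored node can lead to a closer entity
        if n in done:
            continue
        done.add(n)
        for m, w, seg in adj[n]:
            scan(seg, n, d)
            if m not in done:
                heapq.heappush(heap, (d + w, m))
    return sorted((d, p) for p, d in found.items())[:k]

# Fragment of the slide's example; dN(q, n1) = 3 is an assumed value.
EDGES = {("n1", "n2"): 8, ("n1", "n7"): 9, ("n2", "n4"): 2, ("n2", "n3"): 4}
ENTITIES = {"p5": (("n2", "n4"), 1)}
RESULT = ine_knn(EDGES, ENTITIES, ("n1", "n2"), 3, k=1)
```

The run expands n1, then n2, finds p5 at distance 6, and stops when n4 (distance 7) comes off the queue, matching the walkthrough above.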
18
Other queries
  Other queries:
    Range query
    Closest pairs
    e-distance joins
  The paper provides algorithms that process these queries in both Euclidean space and network space
20
Performances
  Experiments comparing Euclidean Restriction (ER) and Network Expansion (NE)
  Experimental metrics:
    Page accesses
    CPU cost
21
Nearest Neighbor Queries Experiments
|S| = the number of entities and |N| = the number of segments
When |S| / |N| decreases, the entities are relatively sparse among the segments, which increases the number of false hits. Therefore, the amount of network distance computation increases.
[Figure: two charts comparing IER and INE for cardinality ratios |S|/|N| = 0.1, 0.5, 1, 2, 10. One shows page accesses, split into R-tree and network accesses; the other shows CPU time (msecs).]
Quoted from Query Processing in Spatial Network Databases
23
Relating to DE3 project
  Our idea:
    Spatial network is represented by Oracle Spatial (SDO_Geometry)
    Indexing with R-tree
    Tracking moving objects with segment-based policy
  We have in common:
    Processing NN query and range query in Euclidean space and network space
    Using R-tree index
  We can borrow the idea:
    The network expansion policy