
Upload: patience-jennings

Post on 05-Jan-2016


Spatial Query Processing

• Spatial DBs do not have a set of operators that are considered to be basic elements in a query evaluation.

• Spatial DBs handle large sets of complex data that have no natural sort order along a single dimension.

• Complex algorithms are needed for evaluating spatial predicates.

• The computational cost of query processing cannot be assumed to come only from I/O; CPU cost matters as well.


Spatial Operations

• Update operations

• Selection operations (O.G denotes the geometry of object O):

– Point query (PQ): given a query point p, find all objects O that contain it:

PQ(p) = { O | p ∩ O.G ≠ Ø }

– Range or region query (WQ): given a query polygon P, find all objects O that intersect P. When P is rectangular, it is called a window query.

WQ(P) = { O | O.G ∩ P.G ≠ Ø }

• Spatial aggregation: a variant of nearest-neighbor search. Given an object o’, find the objects o that have minimum distance to o’.

NNQ(o’) = { o | ∀o’’: dist(o’.G, o.G) ≤ dist(o’.G, o’’.G) }
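The three selection operations can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: every geometry is an axis-aligned rectangle (xlo, ylo, xhi, yhi), the query object itself is excluded from its own nearest-neighbor result, and the object names and helper functions are hypothetical.

```python
from math import hypot

# Hypothetical objects: name -> geometry, simplified to a rectangle (xlo, ylo, xhi, yhi).
objects = {
    "a": (0, 0, 2, 2),
    "b": (1, 1, 4, 3),
    "c": (6, 6, 8, 8),
}

def contains_point(g, p):
    """True if point p lies inside rectangle g or on its boundary."""
    return g[0] <= p[0] <= g[2] and g[1] <= p[1] <= g[3]

def intersects(g, h):
    """True if rectangles g and h share at least one point."""
    return g[0] <= h[2] and h[0] <= g[2] and g[1] <= h[3] and h[1] <= g[3]

def dist(g, h):
    """Minimum Euclidean distance between two rectangles (0 if they intersect)."""
    dx = max(g[0] - h[2], h[0] - g[2], 0)
    dy = max(g[1] - h[3], h[1] - g[3], 0)
    return hypot(dx, dy)

def point_query(objs, p):
    """PQ(p): all objects whose geometry contains p."""
    return {o for o, g in objs.items() if contains_point(g, p)}

def window_query(objs, w):
    """WQ(P) for a rectangular P: all objects whose geometry intersects w."""
    return {o for o, g in objs.items() if intersects(g, w)}

def nearest_neighbor_query(objs, q):
    """NNQ(q): objects (other than q) at minimum distance from q."""
    others = {o: g for o, g in objs.items() if o != q}
    best = min(dist(objs[q], g) for g in others.values())
    return {o for o, g in others.items() if dist(objs[q], g) == best}
```

All three functions scan every object, which corresponds to the "no index" case; a spatial index would replace the scan, not the predicates.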


Spatial Operations

• Spatial JOIN: The join is one of the most important operators in relational databases. When two tables R and S are joined based on a spatial predicate θ, the join is called a spatial join. A variant of this operator in GIS is the map overlay. This operator combines two sets of spatial objects to create a new set. The boundaries of the new objects are determined by the nonspatial attributes assigned by the overlay operation. For example, if the operation assigns the same value of a nonspatial attribute to two adjacent objects, they are merged.

R ⋈θ S = { (o, o’) | o ∈ R, o’ ∈ S, θ(o.G, o’.G) }

Some spatial predicates are: intersection, northeast, distance, overlap, meets, adjacent, contains, and so on.


Techniques of Query Processing

• Selection:

– Unsorted data and no index: every object must be scanned and tested against the predicate

– Spatial indexing

– Rank = (selectivity − 1) / differential cost

selectivity(p) = cardinality(output(p)) / cardinality(input(p))

The differential cost is the cost of evaluating the predicate.
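As a small worked example of the rank, the sketch below orders two predicates by ascending rank, so that cheap and highly selective predicates run first. The selectivities and costs are invented for illustration, and the rank is read as (selectivity − 1) / differential cost.

```python
def selectivity(n_output, n_input):
    """Fraction of input tuples that satisfy the predicate."""
    return n_output / n_input

def rank(sel, differential_cost):
    """Rank of a predicate; lower (more negative) ranks are applied first."""
    return (sel - 1) / differential_cost

# Hypothetical predicates: (description, selectivity, cost per tuple).
predicates = [
    ("Area(L.Geometry) > 20", selectivity(500, 1000), 10.0),  # spatial, expensive
    ("Fa.nombre = 'camping'", selectivity(100, 1000), 1.0),   # nonspatial, cheap
]

# Ascending rank: the cheap, selective nonspatial predicate comes first.
ordered = sorted(predicates, key=lambda p: rank(p[1], p[2]))
```

With these numbers the nonspatial predicate has rank −0.9 and the spatial one −0.05, which agrees with the heuristic of evaluating nonspatial selections before spatial ones.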


Techniques of Query Processing

• Nearest Neighbor:

An approach to this type of query uses two distance measures, search pruning criteria, and a search algorithm.

Min-distance(P,R) is zero if P is inside R or on its boundary. If P is outside R, min-distance(P,R) is the Euclidean distance between P and the nearest point on the boundary of R.

Min-max distance(P,R) is the smallest, over the faces of R, of the distance from P to the farthest point on that face. The construction of the R-tree guarantees that there is an object O inside R such that distance(O,P) ≤ min-max distance(P,R).

Some search pruning strategies are:

• An MBR M can be eliminated if there is another MBR M’ such that min-distance(P,M) > min-max distance(P,M’)

• An MBR M can be eliminated if there is an object O such that distance(P,O) < min-distance(P,M)

• An object O can be eliminated if there is an MBR M such that distance(P,O) > min-max distance(P,M)
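The two distance measures and the first pruning rule can be sketched as follows. This is a minimal illustration, assuming an MBR is given as a pair of corner points (lo, hi) and that the min-max distance is computed per dimension from the nearer face and the farther corners; the function names are hypothetical.

```python
from math import sqrt, inf

def min_distance(p, rect):
    """Distance from point p to the nearest point of rect (0 if p is inside)."""
    lo, hi = rect
    return sqrt(sum(max(lo[k] - p[k], 0.0, p[k] - hi[k]) ** 2
                    for k in range(len(p))))

def min_max_distance(p, rect):
    """Smallest, over the faces of rect, of the distance to the farthest point on that face."""
    lo, hi = rect
    dims = range(len(p))
    mid = [(lo[k] + hi[k]) / 2 for k in dims]
    near = [lo[k] if p[k] <= mid[k] else hi[k] for k in dims]  # nearer face per dimension
    far = [lo[k] if p[k] >= mid[k] else hi[k] for k in dims]   # farther face per dimension
    best = inf
    for k in dims:
        # Farthest corner of the nearer face in dimension k.
        d2 = (p[k] - near[k]) ** 2 + sum((p[i] - far[i]) ** 2 for i in dims if i != k)
        best = min(best, d2)
    return sqrt(best)

def can_prune_mbr(p, m, m_other):
    """First pruning rule: drop m if it is farther than the guarantee given by m_other."""
    return min_distance(p, m) > min_max_distance(p, m_other)
```

For P = (0, 0) and R = ((1, 1), (3, 2)), min_distance gives √2 and min_max_distance gives √5, so any MBR whose min-distance from P exceeds √5 can be discarded.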


Techniques of Query Processing

• Join: A join is defined as the cross product followed by a selection condition. This is especially expensive for spatial databases, so a spatial join is processed with a filter step followed by a refinement step. The following algorithms concentrate on the filter step, that is, on spatial operations over rectangles (MBRs).


JOIN Algorithms

• Nested loop

for all tuples f ∈ F
    for all tuples r ∈ R
        if overlap(f.Geom, r.Geom) then add <f,r> to result

If F needs M pages with pf tuples per page, and R needs N pages with pr tuples per page, the computational cost is prohibitive: scanning R once for every tuple of F reads about M + M·pf·N pages. With B buffers in memory, one can load B-2 pages of F at a time, leaving one buffer for R and one for the results <f,r>, which reduces the cost to about M + ⌈M/(B-2)⌉·N page reads.

An alternative is to use each tuple in F as a window query over an indexed R.
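The nested loop and its page-I/O cost can be sketched as below. The join works on in-memory lists of (oid, MBR) pairs, and the two cost functions are the usual textbook estimates for tuple-at-a-time and block nested loops, stated here as assumptions, not as figures from the slides.

```python
from math import ceil

def overlap(a, b):
    """True if rectangles a and b, given as (xlo, ylo, xhi, yhi), intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def nested_loop_join(F, R):
    """Compare every tuple of F with every tuple of R; tuples are (oid, mbr)."""
    return [(f_id, r_id) for f_id, f_geom in F
                         for r_id, r_geom in R
                         if overlap(f_geom, r_geom)]

def tuple_nested_loop_io(M, N, pf):
    """Page reads when R is scanned once per tuple of F."""
    return M + M * pf * N

def block_nested_loop_io(M, N, B):
    """Page reads when B-2 pages of F are held in memory per scan of R."""
    return M + ceil(M / (B - 2)) * N
```

For example, with M = 100, N = 50, pf = 10 and B = 12, the estimates are 50,100 page reads for the tuple-at-a-time loop and 600 for the block variant.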


JOIN Algorithms

• Tree matching: Both tables are indexed.

SJ(R1, R2: nodes)
    forall er2 in R2 [
        forall er1 in R1 [
            if overlap(er1.rect, er2.rect) then [
                if R1 and R2 are leaf pages then
                    output(er1.oid, er2.oid)
                else if R1 is leaf page then [
                    ReadPage(er2.ptr); SJ(er1.ptr, er2.ptr) ]
                else if R2 is leaf page then [
                    ReadPage(er1.ptr); SJ(er1.ptr, er2.ptr) ]
                else [
                    ReadPage(er1.ptr); ReadPage(er2.ptr); SJ(er1.ptr, er2.ptr) ]
            ]
        ]
    ]
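A runnable sketch of the tree-matching idea is given below. It is not the exact procedure above: the `Node` class is a hypothetical in-memory stand-in for an R-tree page (so there is no ReadPage), and when only one of the two nodes is a leaf, the leaf entry is wrapped in a one-entry leaf so that the recursion can continue down the other tree.

```python
from dataclasses import dataclass

@dataclass
class Node:
    """Hypothetical R-tree page: entries are (rect, child Node) or, in a leaf, (rect, oid)."""
    is_leaf: bool
    entries: list

def overlap(a, b):
    """True if rectangles a and b, given as (xlo, ylo, xhi, yhi), intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def spatial_join(n1, n2, out):
    """Descend both trees together, following only pairs of overlapping entries."""
    for rect2, ref2 in n2.entries:
        for rect1, ref1 in n1.entries:
            if not overlap(rect1, rect2):
                continue
            if n1.is_leaf and n2.is_leaf:
                out.append((ref1, ref2))
            elif n1.is_leaf:
                # Assumption: keep the single leaf entry and descend the second tree.
                spatial_join(Node(True, [(rect1, ref1)]), ref2, out)
            elif n2.is_leaf:
                # Assumption: keep the single leaf entry and descend the first tree.
                spatial_join(ref1, Node(True, [(rect2, ref2)]), out)
            else:
                spatial_join(ref1, ref2, out)
    return out

# Small example: a two-level tree joined with a single leaf page.
leaf_a1 = Node(True, [((0, 0, 2, 2), "a1"), ((3, 3, 4, 4), "a2")])
leaf_a2 = Node(True, [((10, 10, 12, 12), "a3")])
root_a = Node(False, [((0, 0, 4, 4), leaf_a1), ((10, 10, 12, 12), leaf_a2)])
root_b = Node(True, [((1, 1, 3, 3), "b1"), ((11, 11, 13, 13), "b2"), ((20, 20, 21, 21), "b3")])
pairs = spatial_join(root_a, root_b, [])
```

Subtrees whose bounding rectangles do not overlap are never visited, which is where the savings over the nested loop come from.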


JOIN Algorithms

• Partition-Based Spatial Merge Join

• Filter step: Given two relations F and R:

– For each tuple in F and R, form a key-pointer element consisting of the unique identifier (OID) and the MBR. Call the resulting relations F_kp and R_kp.

– If both F_kp and R_kp fit in main memory, the join can be processed with a plane-sweep algorithm.

– If the relations do not fit in memory, partition both relations into P parts.

• Partition: The partition must satisfy the following constraints:

– For each F_kp^i, the elements of R_kp that overlap it lie in R_kp^i.

– Both F_kp^i and R_kp^i fit in main memory.
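The in-memory case of the filter step can be sketched with a simple plane sweep over the key-pointer elements. The sketch assumes each element is an (oid, MBR) pair with the MBR given as (xlo, ylo, xhi, yhi), and it keeps the active sets in plain lists, which is enough to show the idea but not tuned for large inputs.

```python
def y_overlap(a, b):
    """True if the y-extents of rectangles a and b intersect."""
    return a[1] <= b[3] and b[1] <= a[3]

def plane_sweep_join(f_kp, r_kp):
    """Report all (f_oid, r_oid) pairs whose MBRs intersect."""
    # One event per rectangle, placed at its left edge; the sweep moves left to right.
    events = [(mbr[0], "F", oid, mbr) for oid, mbr in f_kp]
    events += [(mbr[0], "R", oid, mbr) for oid, mbr in r_kp]
    events.sort(key=lambda e: e[0])

    active = {"F": [], "R": []}  # rectangles currently cut by the sweep line
    result = []
    for x, side, oid, mbr in events:
        other = "R" if side == "F" else "F"
        # Drop rectangles of the other relation that end before the sweep line.
        active[other] = [(o, m) for o, m in active[other] if m[2] >= x]
        # The remaining ones overlap in x, so only the y-extent needs a test.
        for o, m in active[other]:
            if y_overlap(mbr, m):
                result.append((oid, o) if side == "F" else (o, oid))
        active[side].append((oid, mbr))
    return result
```

Each pair is reported exactly once, at the moment the sweep line reaches the left edge of whichever of the two rectangles starts later.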


Sweep plane: intersection of polygons

[Figure: sweep line l]


Optimization

In a traditional DB, the computational cost of a query is defined in terms of I/O. In a spatial DB, in contrast, the system deals with complex data, which makes the definition of a query plan and its optimization more relevant.

The query optimizer generates different evaluation plans and selects one. The selected plan is often not the best, but at least it is not the worst. The activities of the optimizer can be classified into logical transformation and dynamic programming.


Schema of Query Optimizer

[Figure: schema of the query optimizer. Components: Parser, Logical Transformation, Decomposition, Dynamic Programming, Evaluation, Merge. Inputs: SQL Grammar and Abstract Data Types; Hybrid Architecture Specification; Cost Function (nonspatial, spatial); System Catalog (selectivity, index, CPU, Bfr); Heuristic Rule (nonspatial, spatial).]


Query Optimizer

• Parsing: Before the optimizer can operate, the high-level declarative statement must be scanned by a parser. In a traditional DB, the data types and functions are fixed and the parsers are relatively simple. Spatial DBs are extended with user-defined types, so their parsers are more complicated.

SELECT L.nombre
FROM Lago L, Servicio Fa
WHERE Area(L.Geometry) > 20 AND
      Fa.nombre = 'camping' AND
      Distance(Fa.Geometry, L.Geometry) < 50


Query Tree

π L.nombre
    σ Area(L.Geometry) > 20
        σ Fa.nombre = 'camping'
            ⋈ Distance(Fa.Geometry, L.Geometry) < 50
                Lago L
                Servicio Fa


Query Optimizer

• Logical transformation: The strategy derived directly from the parser can be very inefficient. The join operation is very expensive, and its cost grows with the size of its inputs. Thus, it is better to reduce the size of the inputs of the join. One option is to move the selection on the nonspatial attribute down in the query tree.

π L.nombre
    σ Area(L.Geometry) > 20
        ⋈ Distance(Fa.Geometry, L.Geometry) < 50
            Lago L
            σ Fa.nombre = 'camping'
                Servicio Fa


Transformations

• In this step, the tree is mapped onto equivalent trees by using a set of formal rules inherited from relational algebra.

• The trees are ranked using heuristics in order to filter out candidates that are obviously not recommended. The general rule in this case is "move the nonspatial operators SELECT and PROJECT down in the tree." A rank can be defined for each alternative:

Rank = (selectivity − 1) / differential cost

selectivity(p) = cardinality(output(p)) / cardinality(input(p))

The space of alternatives is generated with rules of relational algebra based on the notions of commutativity, associativity, and distributivity.


Equivalence Rules

• Selection

σc1∧c2∧…∧cn(R) ≡ σc1(σc2(…(σcn(R))…)) All nonspatial selections are moved to the right.

σc1(σc2(R)) ≡ σc2(σc1(R)) Nonspatial selection is applied before spatial selection.

• Projection

If a1, …, an are sets of attributes such that ai ⊆ ai+1 for i = 1, …, n-1, then

πa1(R) ≡ πa1(πa2(…(πan(R))…))


Equivalence Rules

• Cross Product and Join

Commutativity:

R ⋈ S ≡ S ⋈ R

Associativity:

R ⋈ (S ⋈ T) ≡ (S ⋈ R) ⋈ T

Implication:

(R ⋈ T) ⋈ S ≡ (T ⋈ R) ⋈ S

• Selection, Projection and Join

If the selection condition c involves only attributes retained by the projection operator:

πa(σc(R)) ≡ σc(πa(R))


Equivalence Rules

• Selection, Projection and Join

If a selection condition c involves only attributes that appear in R and not in S, then:

σc(R ⋈ S) ≡ σc(R) ⋈ S

Projection can be pushed below the join, provided the join condition uses only attributes in a:

πa(R ⋈ S) ≡ πa1(R) ⋈ πa2(S)

where a1 is the subset of a that appears in R, and a2 is the subset of a that appears in S.
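The selection pushdown rule σc(R ⋈ S) ≡ σc(R) ⋈ S can be checked on a toy example. The relations, attribute names, and helper functions below are hypothetical; tuples are plain dictionaries and the join is a naive nested loop.

```python
# Hypothetical relations: R holds lakes with an area, S holds facilities linked by "rid".
R = [{"id": 1, "area": 30}, {"id": 2, "area": 10}]
S = [{"rid": 1, "name": "camping"}, {"rid": 2, "name": "hotel"}, {"rid": 1, "name": "hotel"}]

def select(rel, c):
    """σ_c(rel): keep the tuples that satisfy condition c."""
    return [t for t in rel if c(t)]

def join(r, s, theta):
    """r ⋈_θ s: concatenate every pair of tuples that satisfies θ."""
    return [{**a, **b} for a in r for b in s if theta(a, b)]

c = lambda t: t["area"] > 20             # uses only an attribute of R
theta = lambda a, b: a["id"] == b["rid"]

late = select(join(R, S, theta), c)      # selection after the join
early = join(select(R, c), S, theta)     # selection pushed below the join
```

Both plans return the same two tuples, but the second one feeds only one tuple of R into the join instead of two, which is the point of moving selections down the tree.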


Query Optimizer

• Dynamic programming: the technique that selects an evaluation plan. The selection is carried out with the goal of minimizing the computational cost. The factors to consider are:

– Access cost

– Storage cost

– CPU cost

– Communication cost

• Catalogs: they keep the information needed for computing the cost.

• Cost function:

Cost = Expression(records-examined) + K * Expression(pages-read)

K is a constant that weights I/O cost relative to CPU cost.
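The cost function can be illustrated with a short sketch that compares two candidate plans. The record and page counts and the value of K are invented for the example; in a real system they would come from the catalog.

```python
def plan_cost(records_examined, pages_read, k):
    """Cost = records examined (CPU term) + K * pages read (I/O term)."""
    return records_examined + k * pages_read

# Hypothetical plans: name -> (records examined, pages read).
plans = {
    "nested loop": (1_000_000, 5_000),
    "tree matching": (50_000, 800),
}

K = 100  # assumed weight of one page read relative to one record examined
best = min(plans, key=lambda name: plan_cost(*plans[name], k=K))
```

With these numbers the nested loop costs 1,500,000 and tree matching costs 130,000, so the optimizer would keep the tree-matching plan.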


Execution Plan

Example:

SELECT F.nombre
FROM Bosque F, Rios R
WHERE Intersect(F.Geometry, :WINDOW) AND
      Overlap(F.Geometry, R.Geometry)

π F.nombre (on-the-fly)
    σ Intersect(F.Geometry, :WINDOW) (R-Tree index)
        ⋈ Overlap(F.Geometry, R.Geometry) (Tree-Matching Join)
            Bosque F
            Rios R