Answering Queries and Hypertree Decompositions
Conjunctive Queries
The problem BCQ:
Instance: <DB, Q>
Question: Does Q have a nonempty result over DB?
Combined Complexity (Vardi ’82)
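To make the BCQ definition concrete, here is a minimal Python sketch of a naive backtracking evaluator. The representation (a database as a dict from relation names to sets of tuples, a query as a list of atoms) and the function name `bcq` are assumptions made for illustration; they do not come from the slides.

```python
from typing import Dict, List, Set, Tuple

Atom = Tuple[str, Tuple[str, ...]]   # (relation name, variables); assumed encoding
Database = Dict[str, Set[Tuple]]     # relation name -> set of tuples


def bcq(db: Database, query: List[Atom]) -> bool:
    """Return True iff the Boolean conjunctive query has a nonempty result."""

    def extend(i: int, binding: Dict[str, object]) -> bool:
        if i == len(query):
            return True                      # every atom is satisfied
        rel, variables = query[i]
        for tup in db.get(rel, set()):
            new = dict(binding)
            # setdefault binds a fresh variable or returns its existing value,
            # so the comparison rejects tuples that clash with earlier choices.
            if all(new.setdefault(v, c) == c for v, c in zip(variables, tup)):
                if extend(i + 1, new):
                    return True
        return False

    return extend(0, {})


# Triangle query e(X,Y), e(Y,Z), e(Z,X) over a small edge relation.
triangle = [("e", ("X", "Y")), ("e", ("Y", "Z")), ("e", ("Z", "X"))]
has_cycle = bcq({"e": {(1, 2), (2, 3), (3, 1)}}, triangle)
no_cycle = bcq({"e": {(1, 2), (2, 3)}}, triangle)
```

The search tries up to n tuples for each of the m atoms, which is where the n^m worst-case behaviour of classical methods comes from.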
Problems Equivalent to BCQ
Conjunctive Query Containment: $Q_1 \subseteq Q_2 \iff \forall db:\ Q_1(db) \subseteq Q_2(db)$
Query of Tuple Problem: $t \in Q(db)$ ?
Constraint Satisfaction in AI: BCQ $\equiv$ CSP
Clause Subsumption in Theorem Proving: $\exists \theta$ s.t. $C\theta \subseteq D$ ?
Example of CSP: Crossword Puzzle
Complexity of BCQ
NP-complete in the general case (Chandra and Merlin ’77)
NP-hard even for fixed database
Polynomial if Q has an acyclic hypergraph (Yannakakis ’81)
LOGCFL-complete (in $\mathrm{NC}^2$) (G.L.S. ’98)
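The polynomial bound for acyclic queries rests on semijoin reduction along a join tree. The following is a minimal sketch of the bottom-up pass for the Boolean case only. It assumes the join tree is already given and that no atom repeats a variable; the names `semijoin` and `acyclic_bcq` and the data layout are illustrative, not taken from the slides.

```python
from typing import Dict, List, Set, Tuple

Relation = Tuple[Tuple[str, ...], Set[Tuple]]   # (variables, tuples); assumed encoding


def semijoin(parent: Relation, child: Relation) -> Relation:
    """Keep the parent tuples that agree with some child tuple on shared variables."""
    p_vars, p_rows = parent
    c_vars, c_rows = child
    shared = [v for v in p_vars if v in c_vars]
    p_idx = [p_vars.index(v) for v in shared]
    c_idx = [c_vars.index(v) for v in shared]
    keys = {tuple(row[i] for i in c_idx) for row in c_rows}
    return p_vars, {row for row in p_rows if tuple(row[i] for i in p_idx) in keys}


def acyclic_bcq(relations: Dict[str, Relation],
                children: Dict[str, List[str]],
                root: str) -> bool:
    """Bottom-up semijoin pass over a join tree; nonempty root means nonempty result."""

    def reduce(node: str) -> Relation:
        rel = relations[node]
        for child in children.get(node, []):
            rel = semijoin(rel, reduce(child))
        return rel

    return len(reduce(root)[1]) > 0


# Path query r(X,Y), s(Y,Z), t(Z,W) with join tree s -> {r, t}.
rels = {
    "r": (("X", "Y"), {(1, 2)}),
    "s": (("Y", "Z"), {(2, 3), (5, 6)}),
    "t": (("Z", "W"), {(3, 4)}),
}
tree = {"s": ["r", "t"]}
answer = acyclic_bcq(rels, tree, "s")
```

Each semijoin only shrinks a relation, so no intermediate result is ever larger than the input, in contrast to evaluating the joins in an arbitrary order.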
Interest in larger tractable classes of CQs
Is this query hard?
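$$
\begin{aligned}
ans \leftarrow{} & a(S,X,X',C,F) \wedge b(S,Y,Y',C',F') \wedge c(C,C',Z) \wedge d(X,Z) \wedge {} \\
& e(Y,Z) \wedge f(F,F',Z') \wedge g(X',Z') \wedge h(Y',Z') \wedge {} \\
& j(J,X,Y,X',Y') \wedge p(B,X',F) \wedge q(B',X',F)
\end{aligned}
$$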
n: size of the database
m: number of atoms in the query
• Classical methods worst-case complexity: $O(n^m)$
Here m = 11!
• Despite its appearance, this query is nearly acyclic
It can be evaluated in $O(m \cdot n^2 \cdot \log n)$
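One way to confirm that the 11-atom query is not acyclic is the GYO reduction, a standard acyclicity test that the slides do not name. The sketch below is an illustration under that assumption. The function name `is_acyclic` and the encoding of each atom as a set of variable names are hypothetical.

```python
from typing import Dict, Set


def is_acyclic(hyperedges: Dict[str, Set[str]]) -> bool:
    """GYO reduction: True iff the hypergraph can be reduced to nothing."""
    edges = {name: set(vs) for name, vs in hyperedges.items()}
    changed = True
    while changed:
        changed = False
        # Rule 1: drop variables that occur in exactly one hyperedge.
        for vs in edges.values():
            lonely = {v for v in vs
                      if sum(v in other for other in edges.values()) == 1}
            if lonely:
                vs -= lonely
                changed = True
        # Rule 2: drop a hyperedge that is empty or contained in another one.
        for name in list(edges):
            vs = edges[name]
            if not vs or any(o != name and vs <= edges[o] for o in edges):
                del edges[name]
                changed = True
    return not edges


def hypergraph(spec: Dict[str, str]) -> Dict[str, Set[str]]:
    """Helper: build hyperedges from space-separated variable lists."""
    return {name: set(vs.split()) for name, vs in spec.items()}


# Hypergraph of the 11-atom query: one hyperedge per atom.
Q = hypergraph({
    "a": "S X X' C F", "b": "S Y Y' C' F'", "c": "C C' Z",
    "d": "X Z", "e": "Y Z", "f": "F F' Z'", "g": "X' Z'", "h": "Y' Z'",
    "j": "J X Y X' Y'", "p": "B X' F", "q": "B' X' F",
})
PATH = hypergraph({"r": "X Y", "s": "Y Z", "t": "Z W"})

print(is_acyclic(PATH))  # prints True
print(is_acyclic(Q))     # prints False: the query hypergraph is cyclic
```

On the query, the reduction removes J, B and B', then absorbs p and q into a, and stops with nine hyperedges left. That is consistent with the query being cyclic but close to acyclic.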
Nearly Acyclic Queries
Bounded Treewidth (tw)
a measure of the cyclicity of graphs
for queries: tw(Q) = tw(G(Q))
For fixed k: checking tw(Q) ≤ k and computing a tree decomposition take linear time (Bodlaender ’96)
Answering BCQ of treewidth k: $O(n^k \log n)$ (Chekuri & Rajaraman ’97)
LOGCFL-complete (G.L.S. ’98)
Beyond treewidth
Bounded Degree of Cyclicity (Gyssens & Paredaens ’84)
Bounded Query width (Chekuri & Rajaraman ’97)
Group together query atoms (hyperedges) instead of variables
Hypertree Decomposition
[Figure: hypertree decomposition of an example query; node labels as shown on the slide]
p(X,Y,_), c(T,W)
d(X,T)
a(X,U,W), b(Y,V,W)
c(Y,T)
p(X,Y,Z), q(U,V,Z)
We use p(X,Y,Z) partially: p(X,Y,_)
We group atoms: p(X,Y,Z), q(U,V,Z)
[Figure: hypertree decomposition of the query; node labels, separated by "|"]
a(S,X,X’,C,F), b(S,Y,Y’,C’,F’)
j(J,X,Y,X’,Y’)
j(_,X,Y,_,_), c(C,C’,Z) | j(_,_,_,X’,Y’), f(F,F’,Z’)
d(X,Z) | e(Y,Z) | h(Y’,Z’) | g(X’,Z’), f(F,_,Z’)
p(B,X’,F) | q(B’,X’,F)
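$$
\begin{aligned}
ans \leftarrow{} & a(S,X,X',C,F) \wedge b(S,Y,Y',C',F') \wedge c(C,C',Z) \wedge d(X,Z) \wedge {} \\
& e(Y,Z) \wedge f(F,F',Z') \wedge g(X',Z') \wedge h(Y',Z') \wedge {} \\
& j(J,X,Y,X',Y') \wedge p(B,X',F) \wedge q(B',X',F)
\end{aligned}
$$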
Connectedness Condition
[Figure: the same decomposition; node labels, separated by "|"]
a(S,X,X’,C,F), b(S,Y,Y’,C’,F’)
j(J,X,Y,X’,Y’)
j(_,X,Y,_,_), c(C,C’,Z) | j(_,_,_,X’,Y’), f(F,F’,Z’)
d(X,Z) | e(Y,Z) | h(Y’,Z’) | g(X’,Z’), f(F,_,Z’)
p(B,X’,F) | q(B’,X’,F)
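The connectedness condition requires that, for every variable, the decomposition nodes containing it form a connected subtree. The sketch below checks this for the node labels listed above. Two things are assumptions: the variables of a node are taken to be the non-underscore arguments of its atoms, and the parent/child arrangement in `PARENT` is a guess, since the transcript preserves only the labels and not the edges of the tree. The node names (`j`, `ab`, `jc`, ...) are made up for the example.

```python
from typing import Dict, Optional, Set


def connected(chi: Dict[str, Set[str]], parent: Dict[str, Optional[str]]) -> bool:
    """True iff, for every variable, the nodes containing it form a subtree."""
    variables = set().union(*chi.values())
    for v in variables:
        nodes = {n for n, vs in chi.items() if v in vs}
        # In a rooted tree, a node set is connected iff exactly one member
        # has its parent outside the set (that member is the top of the set).
        tops = [n for n in nodes if parent[n] not in nodes]
        if len(tops) != 1:
            return False
    return True


# Variables of each node (underscore positions are left out).
CHI = {name: set(vs.split()) for name, vs in {
    "j": "J X Y X' Y'",
    "ab": "S X X' C F Y Y' C' F'",
    "jc": "X Y C C' Z",
    "jf": "X' Y' F F' Z'",
    "d": "X Z", "e": "Y Z", "h": "Y' Z'",
    "gf": "X' Z' F",
    "p": "B X' F", "q": "B' X' F",
}.items()}

# Assumed tree shape: j at the root, the a/b node below it.
PARENT = {"j": None, "ab": "j", "jc": "ab", "jf": "ab",
          "d": "jc", "e": "jc", "h": "jf", "gf": "jf", "p": "gf", "q": "gf"}

ok = connected(CHI, PARENT)
```

Under the assumed arrangement the check succeeds. Swapping the top two nodes, so that the j node sits between the a/b node and the c node, breaks it: C then occurs in two nodes separated by a node that does not contain it.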
Evaluating queries having bounded hypertree width
k fixed
Given: a database db
a query Q over db such that hw(Q) ≤ k
a width k hypertree decomposition of Q
Deciding whether Q(db) is not empty is in $O(n^{k+1} \log n)$ and complete for LOGCFL
Computing Q(db) is feasible in output-polynomial time
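A rough sketch of how a width-k decomposition can be used for the Boolean case: join the at most k atoms of each node, project onto the node's variables, then run a bottom-up semijoin pass over the tree. This is an illustration under stated assumptions, not the algorithm from the slides. It assumes that every query atom appears in some node whose variables cover it and that atoms do not repeat variables. It uses nested loops for clarity, so it does not attain the stated $O(n^{k+1} \log n)$ bound. All names (`lam`, `chi`, `bcq_with_decomposition`) are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Row = Dict[str, object]                       # one variable binding
Atom = Tuple[str, Tuple[str, ...]]            # (relation name, variables)


def join(left: List[Row], right: List[Row]) -> List[Row]:
    """Natural join of two lists of bindings."""
    return [{**l, **r} for l in left for r in right
            if all(l[v] == r[v] for v in l.keys() & r.keys())]


def project(rows: List[Row], variables: Set[str]) -> List[Row]:
    """Project onto the given variables and remove duplicates."""
    unique = {frozenset((v, row[v]) for v in variables) for row in rows}
    return [dict(items) for items in unique]


def semijoin(parent: List[Row], child: List[Row]) -> List[Row]:
    """Keep parent bindings that agree with some child binding."""
    return [p for p in parent
            if any(all(p[v] == c[v] for v in p.keys() & c.keys()) for c in child)]


def bcq_with_decomposition(db: Dict[str, Set[Tuple]],
                           lam: Dict[str, List[Atom]],
                           chi: Dict[str, Set[str]],
                           children: Dict[str, List[str]],
                           root: str) -> bool:
    def reduce(node: str) -> List[Row]:
        rows: List[Row] = [{}]
        for rel, variables in lam[node]:      # join the atoms of this node
            rows = join(rows, [dict(zip(variables, t)) for t in db[rel]])
        rows = project(rows, chi[node])
        for child in children.get(node, []):  # filter by each reduced subtree
            rows = semijoin(rows, reduce(child))
        return rows

    return bool(reduce(root))


# Triangle query r(X,Y), s(Y,Z), t(Z,X): cyclic, with a width-2 decomposition.
lam = {"top": [("r", ("X", "Y")), ("s", ("Y", "Z"))], "bottom": [("t", ("Z", "X"))]}
chi = {"top": {"X", "Y", "Z"}, "bottom": {"Z", "X"}}
children = {"top": ["bottom"]}
db = {"r": {(1, 2)}, "s": {(2, 3)}, "t": {(3, 1)}}
nonempty = bcq_with_decomposition(db, lam, chi, children, "top")
```

Each node relation holds at most n^k bindings, which is the source of the exponent in the bound when k is fixed.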
Comparison results
[Figure: hierarchy of decomposition methods; labels as shown on the slide]
Hypertree Decomposition
Hinge Decomposition + Tree Clustering
Cycle Hypercutset
Tree Clustering, w*, treewidth
Cycle Cutset
Hinge Decomposition
Biconnected Components
Characterizations of Hypertree width
Logical characterization: Loosely guarded logic
Game characterization: The robber and marshals game
Work in progress
Answering queries and hypertree decompositions:
• A query-planner based on hypertree decompositions
• Choosing the best query plan (i.e., the best decomposition) exploiting data on tables, attribute selectivity, indices, etc.
Further possible applications:
• Answering queries using views