bottom-up evaluation of xpath queries stephanie h. li zhiping zou
TRANSCRIPT
![Page 1: Bottom-up Evaluation of XPath Queries Stephanie H. Li Zhiping Zou](https://reader036.vdocuments.mx/reader036/viewer/2022062223/551c0c0c5503469e4f8b5188/html5/thumbnails/1.jpg)
Bottom-up Evaluation of XPath Queries
Stephanie H. Li
Zhiping Zou
![Page 2: Bottom-up Evaluation of XPath Queries Stephanie H. Li Zhiping Zou](https://reader036.vdocuments.mx/reader036/viewer/2022062223/551c0c0c5503469e4f8b5188/html5/thumbnails/2.jpg)
Outline
Overview of XPath Motivation Algorithms : bottom-up evaluation Design and implementation
![Page 3: Bottom-up Evaluation of XPath Queries Stephanie H. Li Zhiping Zou](https://reader036.vdocuments.mx/reader036/viewer/2022062223/551c0c0c5503469e4f8b5188/html5/thumbnails/3.jpg)
Introduction- Overview
Overview of Xpath XPath is a querying language and is designed for
addressing nodes of XML documents.Data modelSyntaxExpressions
Location paths Operators Functions
Evaluation(context)
![Page 4: Bottom-up Evaluation of XPath Queries Stephanie H. Li Zhiping Zou](https://reader036.vdocuments.mx/reader036/viewer/2022062223/551c0c0c5503469e4f8b5188/html5/thumbnails/4.jpg)
Data Model
Data Model XML document = tree of nodes 7 kinds of nodes:
Element Attribute Text Namespace Processing-instruction Comment Document (root) nodes.
![Page 5: Bottom-up Evaluation of XPath Queries Stephanie H. Li Zhiping Zou](https://reader036.vdocuments.mx/reader036/viewer/2022062223/551c0c0c5503469e4f8b5188/html5/thumbnails/5.jpg)
Data Model(Example)
r
a
b b
The root node
The root element
bb
<a><b/><b/><b/><b/>
</a>
![Page 6: Bottom-up Evaluation of XPath Queries Stephanie H. Li Zhiping Zou](https://reader036.vdocuments.mx/reader036/viewer/2022062223/551c0c0c5503469e4f8b5188/html5/thumbnails/6.jpg)
Expression
XPath uses expressions to select nodes from XML documents
The main types of expressions are
Location Paths, Functions and operators
![Page 7: Bottom-up Evaluation of XPath Queries Stephanie H. Li Zhiping Zou](https://reader036.vdocuments.mx/reader036/viewer/2022062223/551c0c0c5503469e4f8b5188/html5/thumbnails/7.jpg)
Location Paths
Although there are many different kinds of XPath expressions, the one that’s of primary use in Java programs is the location path.
Location Path: /child::movies/child::movie[position()=5]
step axis nodetest predicate location path
![Page 8: Bottom-up Evaluation of XPath Queries Stephanie H. Li Zhiping Zou](https://reader036.vdocuments.mx/reader036/viewer/2022062223/551c0c0c5503469e4f8b5188/html5/thumbnails/8.jpg)
Location Step
Axis::Nodetest[predicts] Axis: chooses the direction to move from the
context node Node test: determines what kinds of nodes will
be selected along that axis Predicts: further filter the node-set.
![Page 9: Bottom-up Evaluation of XPath Queries Stephanie H. Li Zhiping Zou](https://reader036.vdocuments.mx/reader036/viewer/2022062223/551c0c0c5503469e4f8b5188/html5/thumbnails/9.jpg)
XPath Axis
Axis---main navigator for a XML docancestor : nodes along the path to the root
ancestor-or-self : same but including the context node
child : children of the context node
descendant : descendants of the context node
descendant-or-self : same but including the context node
following : nodes after the context node in document order, excluding descendants
following-sibling : following sibling of the context node
parent : the parent of the context node
preceding : nodes before the context node in document order,excluding ancestors
preceding-sibling : preceding sibling of the context node
![Page 10: Bottom-up Evaluation of XPath Queries Stephanie H. Li Zhiping Zou](https://reader036.vdocuments.mx/reader036/viewer/2022062223/551c0c0c5503469e4f8b5188/html5/thumbnails/10.jpg)
Node Test
Node Type test Example
T(root()) = {r}, T(element()) = {a; b1; : : : ; b4} T(element(a))= {a}T(element(b)) = {b1; : : : ; b4}
Node Name test Element node name
![Page 11: Bottom-up Evaluation of XPath Queries Stephanie H. Li Zhiping Zou](https://reader036.vdocuments.mx/reader036/viewer/2022062223/551c0c0c5503469e4f8b5188/html5/thumbnails/11.jpg)
Operators and Functions
Arithmetic Ops
Ops for comparisons and boolean logic: {<,>,<=,>=,=,!=} {or, and}
Functions Position() Last()
![Page 12: Bottom-up Evaluation of XPath Queries Stephanie H. Li Zhiping Zou](https://reader036.vdocuments.mx/reader036/viewer/2022062223/551c0c0c5503469e4f8b5188/html5/thumbnails/12.jpg)
Xpath Query Evalutation
Query evaluation is a major algorithmic problem Main construct is the expression Each expression is evaluated to yield an object one of
these four types: Node-set (an unordered collection of nodes without
duplicates ) Boolean(true or false) Number(a floating-point number ) String
![Page 13: Bottom-up Evaluation of XPath Queries Stephanie H. Li Zhiping Zou](https://reader036.vdocuments.mx/reader036/viewer/2022062223/551c0c0c5503469e4f8b5188/html5/thumbnails/13.jpg)
Context
All XPath expressions are evaluated w.r.t. a Context,which consists of A context node A context position(int) A context size(int)
The input context for query evaluation is chosen by the user.
![Page 14: Bottom-up Evaluation of XPath Queries Stephanie H. Li Zhiping Zou](https://reader036.vdocuments.mx/reader036/viewer/2022062223/551c0c0c5503469e4f8b5188/html5/thumbnails/14.jpg)
Motivation
Claim: The way XPath is defined in W3C XPath
recommendation motivates an inefficient implementation (exponential-time).
This paper propose more efficient way (polynomial-time)
![Page 15: Bottom-up Evaluation of XPath Queries Stephanie H. Li Zhiping Zou](https://reader036.vdocuments.mx/reader036/viewer/2022062223/551c0c0c5503469e4f8b5188/html5/thumbnails/15.jpg)
Basic query evaluation strategy
Procedure process-location-step(n0, Q)/* n0 is the context node;
query Q is a list of location steps */Begin
node set S := apply Q.first to node n0;if (Q.tail is not empty) then
for each node n ∈ S doprocess-location-step(n, Q.tail);
End
Time(|Q|) = |D| * Time(|Q|-1) or |D||Q| when |Q| > 0 1 when |Q| = 0
The algorithm recursively evaluates each remaining step for each matching node of the current step
![Page 16: Bottom-up Evaluation of XPath Queries Stephanie H. Li Zhiping Zou](https://reader036.vdocuments.mx/reader036/viewer/2022062223/551c0c0c5503469e4f8b5188/html5/thumbnails/16.jpg)
Xpath Evaluate in PTime
Theorem: Let e be an arbitrary XPath expression. Then, for context node x, position k, and size n, the value of e is v, where v is the unique value such that <x,k,n,v>∈ E↑[e]
The main principle that the paper propose to obtain an XPath evaluation algorithm with PTime complexity is the notion of a context-value table(CVT)
![Page 17: Bottom-up Evaluation of XPath Queries Stephanie H. Li Zhiping Zou](https://reader036.vdocuments.mx/reader036/viewer/2022062223/551c0c0c5503469e4f8b5188/html5/thumbnails/17.jpg)
Context-value table Principle
Given an expression e, the CVT of e specifies all valid combinations of contexts c<x,k,n> and values v, s.t. e evaluates to v in context c<x,k,n>
Such a table for expression e is obtained by first computing the CVTs of the direct subexpressions of e and then combining them into the CVT for e.
The size of each of the CVTs has a polynomial bound
Each of the combination steps can be effected in PTime
Thus, query evaluation in total under our principle also has a PTime bound
![Page 18: Bottom-up Evaluation of XPath Queries Stephanie H. Li Zhiping Zou](https://reader036.vdocuments.mx/reader036/viewer/2022062223/551c0c0c5503469e4f8b5188/html5/thumbnails/18.jpg)
Bottom-up evaluation of XPath
![Page 19: Bottom-up Evaluation of XPath Queries Stephanie H. Li Zhiping Zou](https://reader036.vdocuments.mx/reader036/viewer/2022062223/551c0c0c5503469e4f8b5188/html5/thumbnails/19.jpg)
Bottom-up evaluation of XPath
Algorithm (Bottom-up algorithm for XPath)Input: An XPath query Q;Output: E↑[Q]Method:
Let Tree(Q) be the parse tree of query Q;R:=Ø;For each atomic expression l ∈ leaves(Tree(Q)) do
compute table E↑[l] and add it to R; [Note: we use JDom to do this]
While E↑[root(Tree(Q))]! ∈ R doBegin
take an Op(l1,…ln) nodes(Tree(Q))s.t. E↑[l1],… E↑[ln] ∈ R;
compute E↑[Op(l1,…ln)] using E↑[l1],…, E↑[ln];add E↑[Op(l1,…ln)] to R;
End;Return E↑[root(Tree(Q))]
By a bottom-up algorithm we mean a method of processing XPath while traversing the parse tree of the query from its leaves up to its root.
![Page 20: Bottom-up Evaluation of XPath Queries Stephanie H. Li Zhiping Zou](https://reader036.vdocuments.mx/reader036/viewer/2022062223/551c0c0c5503469e4f8b5188/html5/thumbnails/20.jpg)
Bottom-up evaluation of XPath
Example XML :
<?xml version="1.0"?><people> <person born="1912" died="1954" id="p342">
<name> Alan Turing </name> <!-- Did the word computer scientist exist in Turing's day? --> <profession>computer scientist</profession> <profession>mathematician</profession> <profession>cryptographer</profession> <homepage>href="http://www.turing.org.uk/"</homepage>
</person> <person born="1918" died="1988" id="p4567">
<name>Richard M. Feynman</name> <profession>physicist</profession> <hobby>Playing the bongoes</hobby>
</person> </people>
![Page 21: Bottom-up Evaluation of XPath Queries Stephanie H. Li Zhiping Zou](https://reader036.vdocuments.mx/reader036/viewer/2022062223/551c0c0c5503469e4f8b5188/html5/thumbnails/21.jpg)
Example: XML Doc Tree
![Page 22: Bottom-up Evaluation of XPath Queries Stephanie H. Li Zhiping Zou](https://reader036.vdocuments.mx/reader036/viewer/2022062223/551c0c0c5503469e4f8b5188/html5/thumbnails/22.jpg)
Example: XPath Query tree
Parse tree XPath query:
descendant:: profession/following-sibling::*[position()!= last()]
![Page 23: Bottom-up Evaluation of XPath Queries Stephanie H. Li Zhiping Zou](https://reader036.vdocuments.mx/reader036/viewer/2022062223/551c0c0c5503469e4f8b5188/html5/thumbnails/23.jpg)
Example: Evaluate subexpressions
![Page 24: Bottom-up Evaluation of XPath Queries Stephanie H. Li Zhiping Zou](https://reader036.vdocuments.mx/reader036/viewer/2022062223/551c0c0c5503469e4f8b5188/html5/thumbnails/24.jpg)
Example: Evaluate subexpressions
![Page 25: Bottom-up Evaluation of XPath Queries Stephanie H. Li Zhiping Zou](https://reader036.vdocuments.mx/reader036/viewer/2022062223/551c0c0c5503469e4f8b5188/html5/thumbnails/25.jpg)
Example: Evaluate subexpressions
![Page 26: Bottom-up Evaluation of XPath Queries Stephanie H. Li Zhiping Zou](https://reader036.vdocuments.mx/reader036/viewer/2022062223/551c0c0c5503469e4f8b5188/html5/thumbnails/26.jpg)
Design and Implementaion
Environment Java,JDK1.5.0 Jdom1.0 XPath1.0 Features:
Only Element nodes are queriedNot support abbreviated xpath expressionsNot support format of location steps in predicts.
![Page 27: Bottom-up Evaluation of XPath Queries Stephanie H. Li Zhiping Zou](https://reader036.vdocuments.mx/reader036/viewer/2022062223/551c0c0c5503469e4f8b5188/html5/thumbnails/27.jpg)
System Structure
Query Parser(Parser.java BinaryTree.java,Node.java)
User input(MyDriver.java)
Query tree
Evaluator( QueryEval.java)
JDom XML parser(org.jdom.input.SAXBuilder)
Context value tables (ContextValTable.java and others)
XML document tree
Result for the full xpath query
XML fileQuery
Context node
![Page 28: Bottom-up Evaluation of XPath Queries Stephanie H. Li Zhiping Zou](https://reader036.vdocuments.mx/reader036/viewer/2022062223/551c0c0c5503469e4f8b5188/html5/thumbnails/28.jpg)
Conclusion
XPath query evaluation algorithm that runs in polynomial time with respect to the size of both the data and the query (linear in the size of queries and quadratic in the size of data)
No optimization, strictly coheres to the specification given in the paper
![Page 29: Bottom-up Evaluation of XPath Queries Stephanie H. Li Zhiping Zou](https://reader036.vdocuments.mx/reader036/viewer/2022062223/551c0c0c5503469e4f8b5188/html5/thumbnails/29.jpg)
References
G. Gottlob, C. Koch, and R. Pichler. "Xpath Processing in a Nutshell". In Proceedings of the 19th IEEE International Conference on Data Engineering (ICDE'03), Bangalore, India, Mar. 2003.
G. Gottlob, C. Koch, and R. Pichler. "Efficient Algorithms for Processing XPath Queries". In Proceedings of the 28th International Conference on Very Large Data Bases (VLDB'02), Hong Kong, China, Aug. 2002.
G. Gottlob, C. Koch, and R. Pichler. "XPath Query Evaluation: Improving Time and Space Efficiency". In Proceedings of the 19th IEEE International Conference on Data Engineering (ICDE'03), Bangalore, India, Mar. 2003.
http://www.ibiblio.org/xml/books/xmljava/chapters/ch16.html