xpathlog - institut f¼r informatik
TRANSCRIPT
XPathLog: A Declarative, Native XMLData Manipulation Language
Wolfgang MayInstitut fur InformatikUniversitat Freiburg
Germanymay�informatik�uni�freiburg�de
IDEASGrenoble, 16.7.2001
XPathLog
EXAMPLE: MONDIAL
�mondial��country car code=“B” capital=“cty-Brussels”
memberships=“org-eu org-nato . . . ”��name�Belgium�/name��population�10170241�/population��city id=“cty-Brussels” country=“B”��name�Belgium�/name��population year=“95”�951580�/population�
�/city�
:�/country�
�country car code=“D” capital=“cty-Berlin”memberships=“org-eu org-nato . . . ”�
:�/country�
�organization id=“org-eu” seat=“cty-Brussels”��name�European Union�/name� �abbrev�EU�/abbrev��members type=“member” country=“GR F E A D I B L . . . ”/��members type=“membership applicant” country=“AL CZ . . . ”/��/organization�
�organization id=“org-nato” seat=“cty-Brussels” . . .�
:�/organization�
:�/mondial�
1
EXAMPLE: MONDIAL XML TREE
document(“mondial.xml”)
mondialmondial
country
belgium
@car code=“B”
@memberships=“org-eu org-natoeu”@capital=“city-brussels”
country
germany
@car code=“D”
@memberships=“org-eu org-nato”
name
b-name
population
b-popcity
brussels@country=“B”@id=“city-brussels”
“Belgium” 10170241name
brus-namepopulationbrus-pop @year=1995
“Brussels” 951580
organization
eu @seat=“city-brussels”@id=“org-eu”
organization
nato @seat=“city-brussels”@id=“org-nato”
name
eu-name
abbrev
eu-abbrev
member
eu-mem1 @type=“member”@country=“B D”
member
eu-mem2 @type=“mem.appl.”@country=“CZ . . . ”
“Europ.Union” “EU”
XPathLog
LANGUAGES FOR XML
� two proposals of querying languages:XML-QL, W3C
� W3C XML Query Requirements/Data Model/Algebra
� no XML update language“Updating XML” @ SIGMOD 2001
Selection
� Navigation: XPath, XQuery
� XML Patterns: XML-QL
Generation
� instantiating XML patterns: XML-QL and XQuery
Languages
� Why not use XPath for updating and generating?
3
XPathLog
DESIGN DECISIONS
� experiences with F-Logic for semi-structured data and dataintegration
� declarative rule-based language with bottom-up semantics
� extend XPath with variable bindings
� graph data model, overlapping trees, multiple parents
4
XPathLog
XPATHLOG: SYNTAX
Extends the XPath syntax
XPathLog reference expressions are XPath location paths
��� ReferenceExpr ��� AbsLocPath
j ConstLocPath
��b� ConstLocPath ��� constant ��� RelLocPath
j variable ��� RelLocPath
Extend LocationSteps
�� Step ��� AxisSpec NodeTest Pred
j AxisSpec NodeTest Pred ���� Var Pred
j AxisSpec Var Pred
j AxisSpec Var Pred ���� Var Pred
� navigation by dereferencing IDREF attributes.
� Predicates over reference expressions
XPathLog 5
XPathLog
XPATHLOG BY EXAMPLES
Pure XPath expressions
?- //country[name/text() = “Belgium”]//city/name/text().
true
Output Result Set
?- //country[name/text() = “Belgium”]//city/name/text()�N.
N/“Brussels”
:
Additional Variables
?- //country[name/text()�N1 and
@car code�C]//city/name/text()�N2.
N2/“Brussels” C/“B” N1/“Belgium”
:
Local Variables
?- //country[name/text()�N1]//city[population/text()� P]
/name/text()�N2,
P � 100000.
Dereferencing
?- //organization[@seat = members/@country/@capital]
/@seat/name/text()�N.
XPathLog 6
XPathLog
XPATHLOG BY EXAMPLES
Metadata: Tag Variables and Schema querying
Navigation Variables
?- //Type�X[name/text()�“Monaco”].
Type/country X/country-monaco
Type/city X/city-monaco
Schema Querying
?- //city/N.
N/name
N/population
:
XPathLog 7
XPathLog
DATA MODEL: XTREEGRAPH
� extends DOM/XML Query Data Model
� Tailored to data integration
� Graph “database”
– XML element nodes: vertices
– subelement relationship,
– text children are leaf literals
– XML attribute nodes: literal values
– multivalued attributes (NMTOKENS, IDREFS) are split
– reference attributes (IDREF) are resolved
� element may be subelement of several vertices
� multiple trees, overlapping,no global “document” order
� (names are elements of the universe)
Data Model
For every vertex x:
� AX �child� x�
� AX �attribute� x�
� derived axes as views
Mapping: XML instance � XTreeGraph
XPathLog 8
XPathLog
SEMANTICS OF XPATHLOG
� Extends semantics for XPath by P.Wadler (1999)
XPathLog expressions: annotated result list
(i) a result list, and
(ii) with every element of the result list, a list of variablebindings is associated.
Example
expr =
//organization�O[member/@country[@car code�C and
name/text()�N]]
/abbrev/text()�A.
results in
list((“UN”, f(O/un, A/“UN”, C/“AL”, N/“Albania”),
(O/un, A/“UN”, C/“GR”, N/“Greece”),
: g),
(“EU”, f(O/eu, A/“EU”, C/“D”, N/“Germany”),
(O/eu, A/“EU”, C/“F”, N/“France”),
: g),
: )
XPathLog 9
XPathLog
XPATHLOG RULES
head(V�,. . . ,Vn) :- body(V�,. . . ,Vn)
� Evaluation of rule bodies = queries
� Constructive semantics for XPathLog atoms in rule heads
� Definite XPathLog atoms:
– uses only the child and sibling axes
– no negation, function applications, aggregation, andproximity position predicates
“/” and “[. . . ]” act as constructors:
� host�property�value� modifies host
� host�property remainder
inserts new element host�property which satisfiesremainder
� property of the form
– child��name
– child�i���name
– preceding�following�sibling��name
– preceding�following�sibling�i���name
– attribute��name
� unambiguous insertions
XPathLog: Rule Heads 10
XPathLog
RULE HEADS: UPDATES
Attributes
C[@datacode�“be”], C[@memberships�O] :-
//country�C[@car code=“B”],
//organization�O[abbrev/text()�“EFTA”].
�
�country datacode=“be” car code=“B”
memberships=“org-eu org-un org-efta . . . ”�...
�/country�
� C: host element
� O: target of reference
� extends AX �attribute�belgium�
XPathLog: Rule Heads 11
XPathLog
SEMANTICS OF RULE HEADS
Create “free” elements
/country[@car code�“BAV”].
�
�country car code=“BAV”� �/country�
XPathLog: Rule Heads 12
XPathLog
SEMANTICS OF RULE HEADS
Add subelement relationships
C[@capital�X and city�X and city�Y] :-
//country�C[@car code�“BAV”],
//city�X[name/text()=“Munich”],
//city�Y[name/text()=“Nurnberg”].
� city elements are linked as subelements.
� extends AX �child�bavaria� with �munich� city� and�nurnberg� city�.
� crucial for efficient in-place restructuring and integration
XPathLog: Rule Heads 13
XPathLog
Example: Linking
root
germany@car code=“D”@capital=� bavaria
@car code=“BAV”@capital=�
berlin nurnberg munich
b-name b-pop n-name n-pop m-name m-pop
“Berlin” 3472009 “Nurnberg” 495845 “Munich” 1244676
country country
citycity
city
city city
name population name populationname population
text() text() text() text() text() text()
� elements may have multiple parents
XPathLog: Rule Heads 14
XPathLog
SEMANTICS OF RULE HEADS
Generation of Elements by Path Expressions
C/name[text()�“Bavaria”] :-
//country�C[@car code=“BAV”].
�
�country car code=“BAV” capital=“city-munich”��city�. . . �/city��city�. . . �/city�
�name�Bavaria�/name��/country�
Atomized:C[name� N], N[text()�“Bavaria”] :-
root[descendant::country�C], C[@car code=“BAV”].
� variable binding C�bavaria.
� insert bavaria�child��name�x��:add an (empty) name subelement �x�� name� toAX �bavaria� child�.
� bind the local variable N to x�.
� generate the text contents to x�:add (“Bavaria”,text) to AX �x�� child�.
XPathLog: Rule Heads 15
XPathLog
DATA INTEGRATION
� objects of different sources represent the same real-worldobject
� Fusing objects, merging their properties
� synonyms
� not compatible with XML Data Model (DOM, XML QueryData Model)
� graph vs. tree
Integration 16
XPathLog
DATA INTEGRATION: “THREE-LEVEL” MODEL
access multiple sources
� “basic” layer: source(s) provide tree structures,
� optionally with namespaces
merge data from different sources
� “internal” layer: XTreeGraph
– overlapping trees
– multiple parents
– references
� fuse elements/merge subtrees
� add subelement links
� generate elements
� synonyms for properties
“export” layer: result trees views defined as projections
� root node
� signature
Integration 17
XPathLog
IMPLEMENTATION: LOPIX
Object
Manager
Sto
rage
OM AccessWebAccessDTD
ParserXML
Parser
Eva
luat
ion
Algebraic
Evaluation (S)
Algebraic
Insertion
Logic Evaluation
(Bottom-up) TXP
Exe
cutio
n
SystemCommands
XPathLog Parser(Programs and Queries)
User Interface Pretty PrinterBindings/XML
Inte
rnet
XMLurl�
DTDurl�
interactive
Output
XMLoutput
Internet in- and output
inserts to internal storage
internal information flow
querying internal storage
Implementation 18
XPathLog
RESULTS
� declarative semantics for generating XML with XPath
� graph data model suitable & necessary for integration
� firstly implemented XML updates
� DOM/XML Query Data Model not suitable:unique-parent, no fusion, synonyms, linking
� Updates in XML: requires to copy treesProblems with IDREFs
Conclusion 19
XPathLog
CONCLUSION
� practicable approach(implementation + case study)
– Faster than Kweelt, XML-QL, Excelon, and Tamino forqueries
� data model
� powerful, flexible language
– (metadata/schema querying)
– element creation
– element fusion
� extension concepts(classes, signatures)
� data-driven Web access,extensible for XLinks
Conclusion 20
XPathLog
Contents
1 Example: Mondial 1
2 Example: Mondial XML Tree 2
Design & Basics 3
3 Languages for XML 3
4 Design Decisions 4
5 XPathLog: Syntax 5
XPathLog 6
6 XPathLog by Examples 6
7 XPathLog by Examples 7
8 Data Model: XTreeGraph 8
9 Semantics of XPathLog 9
11 XPathLog Rules 11
12 Rule Heads: Updates 12
13 Semantics of Rule Heads 13
14 Semantics of Rule Heads 14
16 Semantics of Rule Heads 16
17 Data Integration 17
18 Data Integration: “Three-level” model 18
Implementation 19
19 Implementation: LoPiX 19
Results 20
20 Results 20
21 Conclusion 21
Conclusion 21