Datenbank-Stammtisch 2006
TRANSCRIPT
© 2006 AG DBIS

Database-Based Processing of XML Documents Is Different!
Theo Härder, www.haerder.de

• Modeling and processing concepts of RDBMS
• Architecture and storage concepts in the XML world
• Numbering schemes for XML trees
• Indexing of XML documents
• Algorithms for path processing
• Locking protocols in XML trees

Outline
• Conventional Models & DBMS
• Flexibility and Evolution
• XML DBMS - the Big Picture
• Labeling of XML Tree Nodes
• XML Indexing
• Path Processing Algorithms
• Concurrency Control
• Summary

Bird's Eye View to Conventional Database Management

[Figure: system types arranged by complexity of data types (simple to complex) and complexity of queries (simple to complex).
 System types: file systems, video server, relational, object oriented (OODBS), object relational (Universal Server).
 Application areas: sequential processing, …; CAD, SE, GIS, …; Multimedia, …; (R)OLAP, business appl.
 Trends and market dominance: n*10^8 $ and k*10^10 $.]

Conventional Data Models

Support of structured data
• separation of schema (structure information) and data

DB schema
• complete description of structure (structural meta-data)
• has to be specified before storing data objects
• needed to interpret and manipulate the data

Data
• always are instances of the schema
  - structure is fixed, no flexibility/deviation possible
• do not carry structure information
  - must be interpreted and manipulated by means of the schema

Employee
Name     Address             Age
Schmidt  Opernplatz 5, …     35
Maier    Badstraße 3, …      20
Müller   Schlossallee 1, …   55

RDBMS Architecture – Mapping Hierarchy

[Figure: mapping hierarchy of an RDBMS, top to bottom.
 • data system: tables/views (T1, T2) with set-oriented operations
 • access system: variety of record types (R1, R2, …, Rn) and access paths with record-oriented operations
 • storage system: S1, S2, Sm and the DB buffer in memory providing page access (P1, P2, P3, P17)
 • external storage with files of different types]

Query Processing in SQL is Well Known

Declarative queries on tables/views
Optimized DBMS model and query at runtime

SQL query Q1:
  Select E.Name, E.Age, D.Name
  From Employee E, Department D
  Where E.Dno = D.Dno
    And E.Name = "S*" And D.Loc = "Dresden"

  Open Scan (ID(Loc), Loc='Dresden', Loc>'Dresden')   /*SCB1*/
  Sort Access (SCB1) ASC Dno Into T1 (Dno, …)
  Close Scan (SCB1)
  Open Scan (IE(Name), Name>='S', Name>'S')           /*SCB2*/
  Sort Access (SCB2) ASC Dno Into T2 (Dno, …)
  Close Scan (SCB2)
  Open Scan (T1, BOF, EOF)                            /*SCB3*/
  Open Scan (T2, BOF, EOF)                            /*SCB4*/
  While Not Finished
  Do
    Fetch Tuple (SCB3, Next, None)
    Fetch Tuple (SCB4, Next, None)
    . . .
  End

[Figure: Q1 is mapped through the layers L4, L3, L2 onto a large DB buffer.]

Essential Mechanisms for Internal Processing

1. Addressing rows of tables: TIDs (stable, but not order preserving!)

2. Value-based indexes:
   spec.: CREATE [UNIQUE] INDEX index
          ON base-table (column [ORDER] [,column [ORDER]] ...)
          [CLUSTER] [PCTFREE]
   • indexing and clustering are the performance-boosting mechanisms!
   • conceptually simple operations: equality-, range-based selection, equi-joins, different scans, …
   [Figure: two index trees with root keys 25, 61 and leaf keys 8, 13 | 33, 45 | 77, 85: IDept(Dno) as clustered index and IEmp(Dno) as unclustered index.]

3. Multi-granular locking:
   • locking along the organizational hierarchy: database, segment, table/index, row
     - use of intention locks
     - implicit locks for subtrees
   • compatibility matrix (+ compatible, - incompatible)

          -    IR   IX   R    RIX  U    X
     IR   +    +    +    +    +    -    -
     IX   +    +    +    -    -    -    -
     R    +    +    -    +    -    -    -
     RIX  +    +    -    -    -    -    -
     U    +    +    -    +    -    -    -
     X    +    -    -    -    -    -    -
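
The multi-granular locking scheme above can be sketched in a few lines. This is only an illustration: the matrix values are read off the slide, while the reading "row = requested mode, column = mode already held", the `INTENTION` mapping, and the function names are assumptions made for the example.

```python
# Compatibility matrix as read off the slide ('+' compatible, '-' incompatible).
# Assumption: rows are the requested mode, columns the mode already held;
# the column "-" means that no lock is held.
MODES = ["-", "IR", "IX", "R", "RIX", "U", "X"]
MATRIX = {
    "IR":  "+++++--",
    "IX":  "+++----",
    "R":   "++-+---",
    "RIX": "++-----",
    "U":   "++-+---",
    "X":   "+------",
}

def compatible(requested, held):
    """True if `requested` can be granted while `held` is set on the same object."""
    return MATRIX[requested][MODES.index(held)] == "+"

# Assumption: the usual convention for which intention lock an ancestor needs.
INTENTION = {"IR": "IR", "R": "IR", "IX": "IX", "RIX": "IX", "U": "IX", "X": "IX"}

HIERARCHY = ["database", "segment", "table", "row"]

def lock_requests(target_level, mode):
    """Lock requests along the organizational hierarchy, root first."""
    idx = HIERARCHY.index(target_level)
    path = [(level, INTENTION[mode]) for level in HIERARCHY[:idx]]
    return path + [(target_level, mode)]

print(lock_requests("row", "X"))
```

Locking a single row in X mode thus places IX locks on the database, the segment, and the table first; a lock on a table implicitly covers all rows below it.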

Why XML Database Systems (XDBMS)?

Properties wanted:
• instances carry structure and data (integrated meta-data)
• modeling flexibility and schema evolution
• DB-controlled or application-controlled consistency
• in addition to the salient DBMS features
  - ACID transactions
  - declarative query processing
  - efficient and parallel processing of large data volumes
  - high availability and fault tolerance
  - scalability w.r.t. transaction workload and data volumes

Examples:
• document-centric view: document collections
  - books, articles, web pages, …
  → application: structure-sensitive information retrieval
• data-centric view: semi-structured data model
  - messages, configuration files, semi-structured data per se
  → application: healthcare information management

Mapping Flexibility: XML Relational Model

[Figure: XML document tree with integrated meta-data. The root D has the child Employees with three Employee subtrees, each with Name, Address, Age:
 (Schmidt; Opernplatz 5, …; 35), (Maier; Badstr. 3, …; 20), (Müller; Schlossallee 1, …; 55).]

Perfect mapping to RM only, if
- uniform objects (entities)
- description exactly in three levels O/A/V
- atomic values
- no aspects (attributes of XML elements)
- order is insignificant.

Employee
Name     Address             Age
Schmidt  Opernplatz 5, …     35
Maier    Badstraße 3, …      20
Müller   Schlossallee 1, …   55

Mapping Flexibility: XML Relational Model (2)

Mapping to RM across several tables
- is at least very complex and becomes quickly incomprehensible,
- additionally must preserve order,
- provides no solution for 'aspects'
- is called "shredding"

[Figure: XML document tree. The root D has the child Employees with three heterogeneous Employee subtrees:
 • Schmidt: Name, Address (Opernplatz 5, …), and the value 180 carrying the attributes Measured_at="10.01.2006" and Unit="cm" (Height)
 • Müller: Name, Address (Schlossallee 1, …), Age 55
 • Maier: Name, Age 20, and two structured Address elements with Street, ZIP, City: (Badstr. 3; 67663; KL) and (F-W-Str. 5; 67657; KL)]

Employee
CNo  Name     Address             ANo  Age  Height
1    Müller   Schlossallee 1, …   --   55   --
2    Maier    --                  1    20   --
3    Schmidt  Opernplatz 5, …     --   --   180

Address
ANo  CNo  Street       ZIP    City
1    1    Bachstr. 3   67663  KL
1    2    F-W-Str. 5   67657  KL
…

Flexibility and Schema Evolution

Data + algorithms, but data first
• DB people are a "strong schema" community
  - that has pros and cons
  - views enable a certain kind of schema evolution

Opposite view: extreme schema evolution
• "everyone starts with the same schema <stuff/>. Then they refine it." (J. Widom)

Simpler data integration
• heterogeneous data sources and messages
• file directories are becoming databases
  - pivot on any attribute
  - folders are standing queries
  - freetext + schema search (better precision/recall)

Jim Gray's opinion
• XSD (XML Schema) and XQuery are transitional, but we have to do them to get to the real answer
• cohabit with row stores
• challenge: figure out what comes after XSD+XQuery

Bird's Eye View to XML Database Management

[Figure: XML DBMS approaches arranged by data organization (isolated single docs to collections of docs) and by schema use (schema-less documents to schema-based, 1 XSD to 10^5 XSDs). Systems shown: Timber, XTC, Natix, DB2 Viper, MS SQL Server, …]

"Entire world" a single doc
• heterogeneous subtrees
• typical index-supported ops: XPath processing, TWIGs, …
• variety of language models: DOM, SAX, XPath, XQuery

Flexibility and schema evolution
• schema-less approach (consistency guarantees, search?)
• therefore with multiple or evolving schemas
• or schemas including extensibility points (Really Simple Syndication, Atom, …)

Collection of moderate sized (<1MB) docs
• heterogeneous
• filtering by index support
• XQuery, XSQL, SQL/XML

DB-controlled consistency
• XSD design first
• schema-supported search
• strong consistency checking (hardly any serious work)
• how to handle 10^5 schemas

Relational, Native, or Hybrid Storage?

LOBs don't enable fine-granular management, content-based search, or multi-user operation

Mapping onto relational tables?
• many solutions: "shredding"
• XML query language (e.g., XQuery, XPath, DOM, SAX) must be mapped to SQL
• use of the SQL optimizer!
• but: concurrency control (locking) very cumbersome, because a document is distributed over n tables

Hybrid approach
• example: Viper
• SQL with XQuery subqueries from SQL's functions xmlQuery, xmlExists, and xmlTable
• stand-alone XQuery interface
  - provides access to XML columns
  - db2-fn:xmlcolumn imports an entire XML column as a sequence of items

DB2 Storage
  create table department (ID char(8), …, xmldoc xml)
  [Figure: table department with the columns ID, …, xmldoc; the row with ID "V18" holds <employee> … <name> … </name> </employee> in its xmldoc column.]

Relational, Native, or Hybrid Storage? (2)

<department>
  <employee id="450">
    <name>Tom Meier</name>
    <tel>0211-126812</tel>
    <office>119</office>
  </employee>
  <employee id="426">
    <name>Tina Lint</name>
    <tel>0211-679088</tel>
    <office>113</office>
  </employee>
</department>

Transformation into an internal XML tree
[Figure: tree with the root department and two employee nodes; each employee has id, name, tel, office with the values (450; Tom Meier; 0211-126812; 119) and (426; Tina Lint; 0211-679088; 113).]

Native: conceptual XML mapping to a fine-grained storage structure
• Element names are replaced by means of a dictionary
• String table (SYSIBM:SYSXMLSTRINGS): name = 1, office = 3, id = 5, employee = 6, tel = 7, department = 9
[Figure: the same tree with all element and attribute names replaced by their dictionary numbers.]
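
The dictionary replacement of element names can be illustrated with a small sketch. It is not DB2's storage format: the function `encode_names`, the tuple layout, and the numbering (IDs are handed out in order of first occurrence instead of the values shown in the string table) are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

def encode_names(xml_text):
    """Replace element and attribute names by integer IDs from a dictionary.

    Returns the encoded tree and the dictionary (name -> ID).
    Each node becomes (name_id, [(attr_id, value), ...], text, [children]).
    """
    names = {}  # the "string table": every distinct name is stored once

    def name_id(name):
        return names.setdefault(name, len(names) + 1)

    def encode(elem):
        tag = name_id(elem.tag)
        attrs = [(name_id(key), value) for key, value in elem.attrib.items()]
        text = (elem.text or "").strip()
        return (tag, attrs, text, [encode(child) for child in elem])

    return encode(ET.fromstring(xml_text)), names

DOC = """<department>
  <employee id="450"><name>Tom Meier</name><tel>0211-126812</tel><office>119</office></employee>
  <employee id="426"><name>Tina Lint</name><tel>0211-679088</tel><office>113</office></employee>
</department>"""

tree, names = encode_names(DOC)
print(names)
```

Every name is stored once in the dictionary, so the tree nodes only carry small integers, which pays off when the same names recur for many nodes.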

Approaches to Coexistence with SQL

• XOR: "XML over Relational"
  - "Shredding" XML -> Tables
• ROX: "Relational over XML"
  - "Native" XML storage
  - SQL systems become legacy

[Figure: XOR places an XQuery Rewriter (XQuery in, XML out) on top of an SQL DBMS over tables (SQL in, tuples out); ROX places an SQL Rewriter (SQL in, tuples out) on top of an XQuery DBMS over a native XML store (XQuery in, XML out).]

XML Transaction Coordinator (XTC)

[Figure: architecture of the XTCserver, organized in the layers L1 to L5, listed here from top to bottom.
 • clients: Browser, FTP Client, DOM, SAX, XTCconnection (via XTCdriver)
 • Interface Services: Http Agent, Ftp Agent, DOM RMI, SAX RMI, API RMI
 • XML Services: XML Manager, XSLT Processor, XQuery Processor
 • Node Processing Services: Node Manager (path processing)
 • Access Services: Record Mgr, Index Mgr, Catalog Mgr
 • Propagation Control: Buffer Manager
 • File Services: I/O Manager, Temp File Mgr
 • Transaction Services: Transaction Manager, Lock Manager, Deadlock Detector
 • OS File System: Transaction Log, Container Files, Container Logs, Temporary Files]

Architectural overview
- reuse of the layer model is possible, but needs substantial adjustments and, in particular, new functionality in the higher layers
- focus on key mechanisms for efficient internal processing

Example of an XML Document

<bib>
  <book year="1994" id="1">
    <title>TCP/IP Illustrated</title>
    <author>
      <last>Stevens</last><first>W.</first>
    </author>
    <price>65.95</price>
  </book>
  <book year="2000" id="2">
    <title>Data on the Web</title>
    <author>
      <last>Abiteboul</last><first>Serge</first>
    </author>
    <author>
      <last>Buneman</last><first>Peter</first>
    </author>
    <author>
      <last>Suciu</last><first>Dan</first>
    </author>
    <price>39.95</price>
  </book>
  <book year="1999" id="3">
    <title>The Economics of . . . </title>
    <editor>
      <last>Gerbarg</last><first>Darcy</first><affiliation>CITI</affiliation>
    </editor>
    <price>129.95</price>
  </book>
</bib>

Example of a DOM Tree: Conceptual Representation

[Figure: DOM tree of the bib document. Node kinds: element, attribute, text node (T). The root bib has three book elements; each book carries the attributes year and id and has the child elements title, one or more author/editor elements with last and first (and affiliation for the third book), and price; the text nodes hold the values.]

Node Labeling – Use of k-ary Trees

Concept of virtual nodes
• parent (cn, k) = ceil ((cn - 1)/k)
• child (cn, k) = cn*k - (k-1) + 1, cn*k - (k-2) + 1, …, cn*k - (k-i) + 1, …, cn*k - 1 + 1, cn*k + 1
• ancestor (cn, k) = parent (cn, k), parent (parent (cn, k), k), …
• descendant (cn, k) = child (cn, k), child (child (cn, k), k), …
• sibling (cn, k) = child (parent (cn, k), k), …
• previous/following …

KO criterion
• any computed label may correspond to a virtual node
• tree representation has to be accessed to check whether a node is real or virtual
• a document may have a very large k and very many levels

[Figure: 3-ary tree with the real nodes 1; 2, 3, 4; 5, 10, 13; 14, 31, 40.]
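
The label arithmetic of the k-ary scheme is easy to try out. The sketch below implements the parent and child formulas with integer arithmetic; the function names and the set `REAL` (the node numbers visible in the figure, with k = 3) are assumptions for illustration.

```python
def parent(cn, k):
    """parent(cn, k) = ceil((cn - 1) / k), computed without floats."""
    return -(-(cn - 1) // k)

def children(cn, k):
    """All k child labels: cn*k - (k-i) + 1 for i = 1..k."""
    return [cn * k - (k - i) + 1 for i in range(1, k + 1)]

def ancestors(cn, k):
    """Labels on the path from the parent of cn up to the root (label 1)."""
    path = []
    while cn > 1:
        cn = parent(cn, k)
        path.append(cn)
    return path

# Nodes that really exist in the figure's 3-ary tree; every other label is virtual.
REAL = {1, 2, 3, 4, 5, 10, 13, 14, 31, 40}

def real_children(cn, k):
    """Computed labels must be checked against the stored tree (the KO criterion)."""
    return [c for c in children(cn, k) if c in REAL]

print(children(13, 3), real_children(13, 3), ancestors(40, 3))
```

The last function shows the weakness named on the slide: the formulas produce labels for all k potential children, and only a lookup in the stored tree tells which of them are real.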

DeweyIDs Embody a Special Prefix Labeling Scheme

Labels must
• be immutable for the lifetime of the nodes
• preserve the document order when inserting new nodes
• easily reveal the level and the ID for all ancestor nodes

DeweyID consists of several divisions separated by dots
• example: d1 = 1.3.17.2.2.3.4.9
• overflow mechanism: even division values
• level determination
• ancestor IDs: a0 = 1; a1 = 1.3; a2 = 1.3.17; a3 = 1.3.17.2.2.3
• ordering d2 ? d1, with d2 = 1.3.17.2.3.7
  d1 < d2 : 1.3.17.2.2.3.4.9 < 1.3.17.2.3.7
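
The three operations on the example label can be sketched directly on the division values. The function names are hypothetical, and the sketch assumes that the root label 1 is level 0 and that document order equals the lexicographic order of the division sequences.

```python
def parse(label):
    """Split a DeweyID such as '1.3.17' into its integer divisions."""
    return tuple(int(d) for d in label.split("."))

def level(label):
    """Only odd divisions mark a level transition; even ones are overflow values.
    Assumption: the root '1' is at level 0."""
    return sum(1 for d in parse(label) if d % 2 == 1) - 1

def ancestors(label):
    """Every proper prefix that ends in an odd division is an ancestor ID."""
    divs = parse(label)
    return [".".join(map(str, divs[:i]))
            for i in range(1, len(divs)) if divs[i - 1] % 2 == 1]

def precedes(a, b):
    """Document order: compare the division sequences lexicographically."""
    return parse(a) < parse(b)

d1, d2 = "1.3.17.2.2.3.4.9", "1.3.17.2.3.7"
print(level(d1), ancestors(d1), precedes(d1, d2))
```

None of the three functions touches the document: level, ancestor IDs, and relative order are derived from the label alone.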

Initial Assignment of DeweyIDs (SPLIDs)

• assignment of division values is affected by parameter distance (= 4)
• on initial loading, only odd division values are assigned
• odd division value indicates level transition

[Figure: DOM tree of the bib example labeled with DeweyIDs (node kinds: element, attribute, text node T).
 • bib 1
 • book 1.5, book 1.9, book 1.13
 • attributes of book 1.5: year 1.5.1.3 (1994), id 1.5.1.5 (1)
 • children of book 1.5: title 1.5.5, author 1.5.9, price 1.5.13
 • title text node 1.5.5.5 (TCP/IP...), price text node 1.5.13.5 (65.95)
 • children of author: last 1.5.9.5 with text node 1.5.9.5.5 (Stevens), first 1.5.9.9 with text node 1.5.9.9.5 (W.)]
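
The element labels in the figure follow a simple pattern: the i-th child of a node receives the division value distance*i + 1. The sketch below reproduces that pattern for element nodes only; the function `label_tree` and the nested-tuple input format are assumptions for illustration, and attributes and text nodes (which the figure labels separately) are left out.

```python
def label_tree(node, label="1", distance=4, out=None):
    """Assign initial DeweyIDs to a tree given as (name, [children]).

    The i-th child (i = 1, 2, ...) gets the odd division distance*i + 1,
    which leaves room for later insertions between siblings.
    """
    out = {} if out is None else out
    name, kids = node
    out[label] = name
    for i, kid in enumerate(kids, start=1):
        label_tree(kid, f"{label}.{distance * i + 1}", distance, out)
    return out

# Element skeleton of the bib example (attributes and text nodes omitted).
BIB = ("bib", [
    ("book", [("title", []),
              ("author", [("last", []), ("first", [])]),
              ("price", [])]),
    ("book", []),
    ("book", []),
])

labels = label_tree(BIB)
for dewey_id, name in labels.items():
    print(dewey_id, name)
```

With distance = 4 this yields 1.5, 1.9, 1.13 for the three books and 1.5.5, 1.5.9, 1.5.13 for title, author, and price, as in the figure.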

DeweyIDs: Insertion of Subtrees

[Figure, distance = 4: the labeled bib tree after insertions below book 1.5.
 • type 1.5.3 inserted before title 1.5.5
 • author 1.5.11 inserted between author 1.5.9 and price 1.5.13
 • author 1.5.12.5 inserted between author 1.5.11 and price 1.5.13
 • all existing labels (1.5.1.3, 1.5.1.5, 1.5.5, 1.5.9, 1.5.13, …) remain unchanged]

[Figure, worst-case considerations with distance = 8: bib 1 with the book nodes 1.9 and, repeatedly inserted in front of the first book, 1.5, 1.3, 1.2.9, 1.2.5, 1.2.3, 1.2.2.9.]
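
One possible insertion policy that reproduces every label shown in the two figures is sketched below. It is not taken from the talk: the midpoint rule, the use of an even overflow division followed by distance + 1, and the names `_between` and `new_sibling` are assumptions chosen so that the results agree with the slide.

```python
def _between(left, right, distance):
    """Return divisions n with left < n < right (lexicographically).

    left/right are the division suffixes of the neighbors below the common
    parent; an empty tuple means "no neighbor on that side". Division 1 is
    treated as the reserved lower bound.
    """
    lo = left[0] if left else 1
    if not right:                       # append after the last sibling
        v = lo + distance
        return (v if v % 2 else v + 1,)
    hi = right[0]
    if left and lo == hi:               # same overflow division: descend
        return (lo,) + _between(left[1:], right[1:], distance)
    mid = (lo + hi) // 2                # prefer an odd value near the middle
    if mid % 2 == 0:
        mid = mid + 1 if mid + 1 < hi else mid - 1
    if lo < mid < hi:
        return (mid,)
    if hi - lo == 2:                    # only an even value fits: overflow
        return (lo + 1, distance + 1)
    if lo % 2 == 0:                     # stay below the left overflow division
        return (lo,) + _between(left[1:], (), distance)
    return (hi,) + _between((), right[1:], distance)

def new_sibling(parent, left, right, distance=4):
    """Label for a node inserted between siblings left and right (or None)."""
    p = len(parent.split("."))
    l = tuple(map(int, left.split(".")[p:])) if left else ()
    r = tuple(map(int, right.split(".")[p:])) if right else ()
    return parent + "." + ".".join(map(str, _between(l, r, distance)))

print(new_sibling("1.5", "1.5.9", "1.5.13"))    # between author and price
print(new_sibling("1.5", "1.5.11", "1.5.13"))   # no odd value left: overflow
```

Existing labels are never changed; when the gap between two odd divisions is exhausted, an even overflow division opens a new range without adding a level.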

Characteristics of XML Documents Considered

No. | file name       | description                                          | size (bytes) | number of element nodes | number of attributes | number of text nodes | max. depth | ∅-depth | max. fanout | ∅-fanout of elements
1)  | treebank_e.xml  | Encoded DB of English records of Wall Street Journal | 86,082,517   | 2,437,666               | 1                    | 1,391,845            | 37         | 8.44    | 56,385      | 1.58
2)  | nasa.xml        | Astronomical data                                    | 25,050,288   | 476,646                 | 56,317               | 303,676              | 9          | 6.08    | 2,435       | 1.76
3)  | psd7003.xml     | DB of protein sequences                              | 716,853,016  | 21,305,818              | 1,290,647            | 15,955,109           | 8          | 5.68    | 262,529     | 1.81
4)  | SwissProt.xml   | DB of protein sequences                              | 114,820,211  | 2,977,031               | 2,189,859            | 2,013,844            | 6          | 4.07    | 50,000      | 2.41
5)  | dblp.xml        | Computer Science Index                               | 284,994,162  | 6,662,623               | 1,375,832            | 6,013,355            | 7          | 3.39    | 649,080     | 2.11
6)  | customer.xml    | Customers from TPC-H benchmark                       | 515,660      | 13,501                  | 1                    | 12,000               | 4          | 3.41    | 1,501       | 1.89
7)  | ebay.xml        | Ebay auction data                                    | 35,562       | 156                     | 0                    | 107                  | 6          | 4.26    | 12          | 1.90
8)  | lineitem.xml    | Line items from TPC-H benchmark                      | 32,295,475   | 1,022,976               | 1                    | 962,800              | 4          | 3.45    | 60,176      | 1.94
9)  | mondial-3.0.xml | Geographical DB of diverse sources                   | 1,784,825    | 22,423                  | 47,423               | 7,467                | 6          | 4.15    | 955         | 3.45
10) | orders.xml      | Orders from TPC-H benchmark                          | 5,378,845    | 150,001                 | 1                    | 135,000              | 4          | 3.42    | 15,001      | 1.90
11) | uwm.xml         | Courses of a University Website                      | 2,337,522    | 66,729                  | 6                    | 40,234               | 6          | 4.37    | 2,112       | 1.91

Encoding of DeweyIDs

Huffman code | Li | value range of Oi
0            | 3  | 1-7
100          | 4  | 8-23
101          | 6  | 24-87
1100         | 8  | 88-343
1101         | 12 | 344-4,439
11100        | 16 | 4,440-69,975
11101        | 20 | 69,976-1,118,551
11110        | 24 | 1,118,552-17,895,767
11111        | 31 | 17,895,768-2,165,379,414

[Figure annotation on the value ranges: range expansion]

Optimization and adaptivity potential
- cut prefix 1.
- apply prefix compression to DeweyIDs
- analysis phase, if possible: determine DOM tree parameters for optimized Huffman code assignment (even level-wise applicable)
- adjust Huffman code to the size and anticipated modification operations of each tree
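
The code table can be turned into a small encoder and decoder. The codes, lengths Li, and range bounds are taken from the table; that each division Oi is stored as its offset from the lower bound of its range in Li bits, as well as all function names, are assumptions made for this sketch.

```python
# (Huffman code, Li in bits, lower bound of the value range) from the table.
RANGES = [
    ("0", 3, 1),
    ("100", 4, 8),
    ("101", 6, 24),
    ("1100", 8, 88),
    ("1101", 12, 344),
    ("11100", 16, 4_440),
    ("11101", 20, 69_976),
    ("11110", 24, 1_118_552),
    ("11111", 31, 17_895_768),
]
MAX_VALUE = 2_165_379_414  # upper bound of the last range in the table

def encode_division(value):
    """Code of the range, followed by the offset within the range in Li bits."""
    if not 1 <= value <= MAX_VALUE:
        raise ValueError(f"division value out of range: {value}")
    for code, length, low in reversed(RANGES):
        if value >= low:
            return code + format(value - low, f"0{length}b")

def encode(label):
    """Bit string for a whole DeweyID; no separators are needed."""
    return "".join(encode_division(int(d)) for d in label.split("."))

def decode(bits):
    """The codes are prefix-free, so each one tells how many bits follow."""
    divisions, pos = [], 0
    while pos < len(bits):
        for code, length, low in RANGES:
            if bits.startswith(code, pos):
                start = pos + len(code)
                divisions.append(low + int(bits[start:start + length], 2))
                pos = start + length
                break
        else:
            raise ValueError("invalid code")
    return ".".join(map(str, divisions))

print(encode("1.5.9"), decode(encode("1.3.17.2.2.3.4.9")))
```

Small division values, which dominate in practice, cost only four bits each, while the scheme still covers very large values after many insertions.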

DeweyIDs – Comparison of Avg. Sizes to Max. Sizes

                  ∅-size                          max-size
Document          dist(2)  dist(32)  dist(256)    dist(2)  dist(32)  dist(256)
1. treebank_e     6.67     11.57     15.94        22       46        72
2. nasa           5.19     8.54      11.30        8        13        18
3. psd7003        5.61     8.84      11.30        8        13        17
4. SwissProt      5.10     7.04      8.14         8        11        13
5. dblp           4.58     6.12      7.16         7        10        13
6. customer       3.17     5.04      6.19         4        6         7

XTC Document Index

• document index is a B-tree for the document(s) stored in the doubly-chained pages of the document container
• text values exceeding a given threshold are stored in referenced mode
• prefix compression works!

[Figure: document index and document container. Each container entry consists of a DeweyID and the node data (byte representation); the pages are chained in document order:
 • 1, 1.3, 1.3.1, 1.3.1.3
 • 1.3.1.3.1, 1.3.1.5, 1.3.1.5.1, 1.3.3
 • 1.3.3.3, 1.3.3.3.1, 1.3.5, 1.3.5.3
 • 1.3.5.3.3, 1.3.5.3.3.1, 1.3.5.5, 1.3.5.5.3
 • 1.3.5.5.3.1, 1.3.7, 1.3.7.3, 1.3.7.3.1
 The document index holds the keys 1, 1.3.1.3.1, 1.3.3.3, 1.3.5.3.3, 1.3.5.5.3.1.]

XML Indexing Methods

Path indexing
• effective support for
  - query rewriting
  - query pruning based on structural summaries, e.g., to check the existence of paths
• examples
  - DataGuide, 1-Index, APEX, D(k)-Index, A(k)-Index

Many types of path fragments in queries to be supported
• parent/child: a/b/c
• ancestor/descendants: a//x//y
• enhanced by predicates: a/b[pred1]//x[pred2]
• twigs: a/b[.//c]//x[d]

What is the ubiquitous index – the silver bullet – for XML queries?
• use of DeweyIDs in element indexes helps a lot!
• path processing ops often needed

Genealogy of XML Indexes not yet Known

What is the "B*-tree" for XML queries?
Design of XML indexes ahead of us

Data Guide – Supporting Query Optimization and Evaluation

• A data guide d has for every label path in the source document s a path class in d, and every path class of d occurs as one or more label paths of s
• A data guide provides a structural summary
• XTC stores the DeweyIDs in separate index structures

Data guide of the bib example, with the DeweyIDs belonging to each path class:
  bib {1}
    book {1.5, 1.9, 1.13, …}
      year {1.5.1.3, …}
      id {1.5.1.5, …}
      title {1.5.5, 1.9.5, …}
      author {1.5.9, 1.5.11, …}
        last {1.5.9.5.5, …}
        first {1.5.9.9.5, …}
      price {1.5.13, …}

[Figure: DOM tree example, showing bib with the book for the year 2000 (id 2): title (Data …), three author elements with last and first (Abiteboul, Serge; Buneman, Peter; Suciu, Dan), and price (39.95).]
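
A structural summary of this kind can be derived in one pass over a document. The sketch below collects one path class per distinct label path and, for brevity, counts the instances instead of storing their DeweyIDs; the function `data_guide`, the `/@name` notation for attributes, and the shortened sample document are assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def data_guide(xml_text):
    """Map every distinct label path to the number of nodes reached by it."""
    guide = defaultdict(int)

    def visit(elem, path):
        path = path + "/" + elem.tag
        guide[path] += 1
        for name in elem.attrib:          # attributes form path classes too
            guide[path + "/@" + name] += 1
        for child in elem:
            visit(child, path)

    visit(ET.fromstring(xml_text), "")
    return dict(guide)

DOC = """<bib>
  <book year="1994" id="1"><title>TCP/IP Illustrated</title>
    <author><last>Stevens</last><first>W.</first></author><price>65.95</price></book>
  <book year="2000" id="2"><title>Data on the Web</title>
    <author><last>Abiteboul</last><first>Serge</first></author>
    <author><last>Buneman</last><first>Peter</first></author><price>39.95</price></book>
</bib>"""

guide = data_guide(DOC)
for path, count in guide.items():
    print(path, count)
```

A query such as /bib/book/editor can be rejected by a lookup in the guide without touching the document, which is the pruning use mentioned for path indexes.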

XTC Element Index

[Figure: a name directory (B-tree) with entries such as author, book, price; each entry refers to a node-reference index (B*-tree) holding the DeweyIDs of the elements with that name (e.g., 1.3.5, 1.3, 1.3.7), each of them sorted in document order.]

In combination with a data guide, the XTC element index can be used like a path index
- unique occurrence of element names: DeweyIDs are path indexes
- homonyms of element names: additional tests allow the separation of the paths (DeweyIDs)

XML Indexing Methods (2)

Hybrid Indexing
• goal: evaluation of queries on structure and content

IndexFabric as an example (root to leaf)
• designators: special characters for encoding paths
• a unique designator is assigned to each tag
  - bib = bi, type = ty, author = a, last = l
  - book = bo, title = ti, price = p, first = f
  - example: "bibotiTCP/IP"
• all strings are stored in a (layered) PATRICIA trie
• query evaluation: translate query to designator string and lookup in the trie

[Figure: PATRICIA trie over designator strings. Below the common prefix "bibo", the branches are labeled with designators (t, l, p, i, f, a, …) and then with normal characters (T, X); the leaves hold entries such as "bibotiTCP/IP" and "bibotiXML", each with DOCID . . .]

XML Indexing Methods (3)

Content or text indexing (known from IR)
• goals:
  - support of predicates on textual content of a document (contains, near, matches, …)
  - indexing of (typed) values such as Integer, Double, …
• examples: inverted lists, value index (VIndex in Lore)

Potential indexes (leaf to root!)
• I (/Emps/Emp/Age)
• I (//Emp/Age)
• I (//Age)
• I (//[Integer])

[Figure: index tree with the keys 25, 61 | 8, 13 | 33, 45 | 77, 85 over the document tree D/Emps with three Emp subtrees, each with Name, Address, Age: (Schmidt; Opernplatz 5, …; 35), (Maier; Badstr. 3, …; 20), (Müller; Schlossallee 1, …; 55).]

Query predicate must match index predicate: Q (/Emps/Emp/Age=20) could not be used with I (//Age)
Use of DeweyIDs would allow filtering out false positives!

Evaluation Example – FLWOR Expression

for $p in //Employee
let $n := $p/Name, $a := $p/Age
where $a < 25
order by $n
return <youngEmp>
         {$n, $a }
       </youngEmp>

1) $p iterates over all "Employee" elements and binds the children "Name" and "Age" to $n and $a, respectively.
2) Each iteration step checks whether the "where" clause matches.
3) Matches are stored in a sequence and sorted according to $n (Names).
4) Output is generated.

[Figure: document D with the root Enterprise and two Department elements (Name "Controlling" and Name "Personal"), each with Employees; one carries the attribute WorksFor = "Maier". Three Employee elements with Name, Age, Address, Salary:
 • Oid = "Maier": "Maier", "20", "Badstr.", "17.000" ($p1, $n1, $a1: match)
 • Oid = "Müller": "Müller", "55", "Schlossallee", "35.000" ($p2, $n2, $a2: no match)
 • Oid = "Schmidt": "Schmidt", "35", "Opernplatz", "40.000" ($p3, $n3, $a3: no match)
 Output: youngEmp elements with Name and Age, here ("Maier", "20"), . . .]
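
The four evaluation steps can be mirrored one to one in a few lines of Python. This is only a tuple-at-a-time illustration over a list of dictionaries (an assumed stand-in for the Employee elements), not how an XQuery processor works internally.

```python
# Hypothetical in-memory stand-in for the three Employee elements of the figure.
EMPLOYEES = [
    {"Name": "Maier", "Age": 20},
    {"Name": "Müller", "Age": 55},
    {"Name": "Schmidt", "Age": 35},
]

def young_employees(employees):
    matches = []
    for p in employees:                    # 1) for $p in //Employee
        n, a = p["Name"], p["Age"]         #    let $n := $p/Name, $a := $p/Age
        if a < 25:                         # 2) where $a < 25
            matches.append((n, a))
    matches.sort(key=lambda t: t[0])       # 3) order by $n
    return [                               # 4) return <youngEmp>{$n, $a}</youngEmp>
        f"<youngEmp><Name>{n}</Name><Age>{a}</Age></youngEmp>"
        for n, a in matches
    ]

result = young_employees(EMPLOYEES)
print(result)
```

Only the employee aged 20 satisfies the where clause, so the output contains a single youngEmp element.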

XQuery – Path Processing Steps

Problem: XQuery is really complex
• order-preserving joins, implicit grouping, result construction …

Example fragment
  doc("sample.xml")//autor[count(parent::buch/preceding-sibling::element())>3]/vname[../nname = "Adams"]

• concentration on a sequence of path processing steps: Axis::name_test
• evaluation order: bottom-up, top-down, starting in the middle
• algorithms: selective access via DeweyIDs, locking-aware
Child Axis
[Figure: XML tree labeled with DeweyIDs, level by level: 1; 1.3; 1.3.3, 1.3.5, 1.3.7, 1.3.9; 1.3.5.3, 1.3.5.5, 1.3.9.3; 1.3.5.3.3, 1.3.5.3.5, 1.3.9.3.3, 1.3.9.3.5, 1.3.9.3.7. Legend: input node (dark); node satisfying name test n.]
Input sequence (dark nodes): 1.3, 1.3.5.3, 1.3.9.3
Insertion of the input sequence into a hash table: HT(1.3), HT(1.3.5.3), HT(1.3.9.3)
Element index access for name n: 1.3.3, 1.3.5.3.5, 1.3.5.5, 1.3.7, 1.3.9.3.3, 1.3.9.3.7
"Probing" of the parent DeweyIDs (observe level information): 1.3.5.5 is dropped, since its parent 1.3.5 is not in the hash table
Result: 1.3.3, 1.3.5.3.5, 1.3.7, 1.3.9.3.3, 1.3.9.3.7
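The hash-based child step can be sketched in a few lines of Python. This is a simplified illustration, not the XTC implementation. It assumes that the parent label is obtained by dropping the last division of a DeweyID; real DeweyID schemes may need the level information mentioned on the slide to do this correctly. The function names are hypothetical.

```python
def parse(dewey):
    """Turn '1.3.5' into the tuple (1, 3, 5)."""
    return tuple(int(part) for part in dewey.split("."))

def parent(label):
    """Assumption: the parent label is the child label without its last division."""
    return label[:-1]

def child_axis(input_sequence, name_index_hits):
    """Keep the index hits whose parent DeweyID is in the input sequence."""
    hash_table = {parse(d) for d in input_sequence}   # insert the input sequence
    return [d for d in name_index_hits                # probe the parent of each hit
            if parent(parse(d)) in hash_table]

# Values taken from the slide.
inputs = ["1.3", "1.3.5.3", "1.3.9.3"]
hits = ["1.3.3", "1.3.5.3.5", "1.3.5.5", "1.3.7", "1.3.9.3.3", "1.3.9.3.7"]
result = child_axis(inputs, hits)
```

Because the index returns its hits in document order and probing only filters them, the result stays in document order without an extra sort.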
Path Processing Algorithms
[Figure: classification of path processing algorithms. Axis operators: ancestor, descendant, parent, child, ... Evaluation sequence: top-down, bottom-up, middle. Algorithms: navigational (node-at-a-time) versus set-oriented (sequence-at-a-time), based on merge, hash, index, and stack-tree techniques; binary structural joins and holistic twig joins (branching path queries); use of DeweyIDs. Two sample query trees are shown, one with nodes x, y, z connected by "and" and one with nodes a, b, c, d, e.]
taDOM Storage Model – View of Lock Mgr
XML document
<?xml version="1.0"?>
<bib>
  <book year="2004" id="book1">
    <title>The Title</title>
    <author>
      <last>Lastname</last>
      <first>Firstname</first>
    </author>
    <price>49.99</price>
  </book>
</bib>
[Figure: taDOM tree of the document. Element nodes: bib, book, title, author, last, first, price. An attribute root node below book carries the attribute nodes id and year. Text nodes (T) hang below title, last, first, and price. String nodes hold the values "The Title", "Lastname", "Firstname", "49.99", "book1", "2004".]
DOM API (> 20 ops)
• Navigation: getFirst/LastChild, getNextSibling, getPreviousSibling, getAttributes/Value
• Modification: appendChild, insertBefore, removeChild
• Query: getElementById, getChildNodes
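To make the three groups of DOM operations concrete, the following sketch runs a few of them on the slide's document using Python's standard `xml.dom.minidom`. This is only an illustration of the DOM interface, not of the taDOM storage model or its lock manager. Python exposes several operations as attributes (`firstChild`, `nextSibling`) rather than as `get...` methods, and the `note` element is a made-up example.

```python
from xml.dom.minidom import parseString

XML = ('<?xml version="1.0"?><bib><book year="2004" id="book1">'
       '<title>The Title</title><author><last>Lastname</last>'
       '<first>Firstname</first></author><price>49.99</price></book></bib>')

doc = parseString(XML)

# Navigation
book = doc.documentElement.firstChild      # getFirstChild
title = book.firstChild
author = title.nextSibling                 # getNextSibling
price = book.lastChild                     # getLastChild
year = book.getAttribute("year")           # getAttributes/Value

# Query
child_names = [child.tagName for child in book.childNodes]   # getChildNodes

# Modification: insert a hypothetical <note> element before <price>, then remove it
note = doc.createElement("note")
book.insertBefore(note, price)
names_after_insert = [child.tagName for child in book.childNodes]
book.removeChild(note)
names_after_remove = [child.tagName for child in book.childNodes]
```

Each of these calls touches a context node plus its children, siblings, or attributes, which is the access pattern the node locks on the following slides are tailored to.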
Tailored Node Locks for XML – taDOM2
Node locks and compatibility matrix
• refined URIX protocol with extensions to lock a complete level in a subtree
  - well known: IR/IX and R/X (here SR/SX)
• edge locks not discussed (3 modes)

Read locks
lock                  effect
IR                    intention read lock on a node
NR (node read)        locks only a context node
LR (level read)       read lock on a context node and all direct-child nodes
SR (subtree read)     read lock on an entire subtree

Write locks
lock                  effect
IX (intent. excl.)    intention of a write lock on a non-direct child node
CX (child excl.)      indicates existence of a write lock on a direct child node
SU (update option)    read lock for intended update operations on an entire subtree
X (exclusive)         write lock on an entire subtree (row SX in the matrix)

Compatibility matrix
        -   IR  NR  LR  SR  IX  CX  U   X
IR      +   +   +   +   +   +   +   -   -
NR      +   +   +   +   +   +   +   -   -
LR      +   +   +   +   +   +   -   -   -
SR      +   +   +   +   +   -   -   -   -
IX      +   +   +   +   -   +   +   -   -
CX      +   +   +   -   -   +   +   -   -
SU      +   +   +   +   +   -   -   -   -
SX      +   -   -   -   -   -   -   -   -
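A lock manager consults such a matrix on every lock request. The sketch below encodes the matrix as strings and checks a request against the locks already held on a node. It assumes that rows denote the requested mode and columns the held mode, which the slide does not state; that reading matters only for the asymmetric SU/SR pair. The names `compatible` and `can_grant` are hypothetical.

```python
# Column order of held modes; the slide's header row writes SU and SX as U and X.
HELD = ["IR", "NR", "LR", "SR", "IX", "CX", "SU", "SX"]

# One string per requested mode (assumption: rows = requested, columns = held).
ROWS = {
    "IR": "++++++--",
    "NR": "++++++--",
    "LR": "+++++---",
    "SR": "++++----",
    "IX": "+++-++--",
    "CX": "++--++--",
    "SU": "++++----",
    "SX": "--------",
}

def compatible(requested, held):
    """True if the requested mode can coexist with one held mode."""
    return ROWS[requested][HELD.index(held)] == "+"

def can_grant(requested, held_modes):
    """True if the request is compatible with every lock held by other transactions.

    An empty list corresponds to the '-' column: every mode is grantable.
    """
    return all(compatible(requested, held) for held in held_modes)
```

For example, `can_grant("LR", ["IX"])` succeeds while `can_grant("LR", ["CX"])` fails, which is the conflict CX was introduced to detect.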
Node Locks (1)
Node read lock (NR)
• requires IR locks on the ancestor path
Level read lock (LR)
• requested for reading the context node and all nodes located at the level below (all direct-child nodes)
Child exclusive lock (CX)
• indicates an X lock on a child
• defined, in addition to IX, to detect conflicts with LR
[Figure: taDOM tree of the bib document (bib, book, title, author, last, first, price, with text nodes T and their string values), annotated with the locks of three transactions]
Transaction T1 is reading <price>: IR, IR, NR
Transaction T2 is reading <book> and all direct-child nodes (<title>, <author>, and <price>): LR, IR
Transaction T3 is modifying the book title: IX, IX, IX, CX, X
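The lock sets in the figure follow a simple pattern along the path from the root to the context node. The sketch below derives them from a path of node names. It assumes that a read request places IR on every ancestor and that a write request places CX on the parent and IX on all higher ancestors. `lock_requests` is a hypothetical helper, and node names stand in for the node identifiers a real system would use.

```python
def lock_requests(path, mode):
    """Return (node, lock) pairs from the root down to the context node.

    path: node names from the root to the context node.
    mode: lock wanted on the context node: 'NR', 'LR', or 'X'.
    """
    *ancestors, target = path
    if mode in ("NR", "LR"):
        # Read access: intention read locks on the whole ancestor path.
        return [(node, "IR") for node in ancestors] + [(target, mode)]
    if mode == "X":
        # Write access: IX on higher ancestors, CX on the direct parent.
        requests = [(node, "IX") for node in ancestors[:-1]]
        if ancestors:
            requests.append((ancestors[-1], "CX"))
        return requests + [(target, "X")]
    raise ValueError(f"unsupported lock mode: {mode}")

t1 = lock_requests(["bib", "book", "price"], "NR")   # T1 reads <price>
t2 = lock_requests(["bib", "book"], "LR")            # T2 reads <book> and its children
```

Since the ancestors are known from the path alone, such requests can be computed without touching the stored document, provided the node labels encode the ancestor path.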
Node Locks (2)
Locking subtrees exclusively: intention exclusive lock (IX), child exclusive lock (CX), and exclusive lock (X)
• requested for updating the context node's content or deleting the context node and its entire subtree
• requires a CX lock on the parent and IX locks on the ancestors
[Figure: taDOM tree of the bib document (bib, book, title, author, last, first, price, with text nodes T and their string values), annotated with the locks of four transactions]
Transaction T1 is deleting the <first> node and its content: X, CX, IX, IX
Transaction T2 is deleting the <last> node and its content: X, CX, IX, IX
Transaction T3 is reading the <author> node: IR, IR, NR
Transaction T4 is reading all direct-child nodes of <book>: LR, IR
but is blocked when reading all child nodes of <author>: LR
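The blocking of T4 can be replayed with a small lock table. The sketch assumes the lock placement suggested by the figure (X on the deleted node, CX on <author>, IX on <book> and <bib>) and uses only the three read rows of the taDOM2 compatibility matrix, read as requested mode versus held mode. `blockers` is a hypothetical helper.

```python
# Held-mode column order and the read rows of the compatibility matrix.
HELD = ["IR", "NR", "LR", "SR", "IX", "CX", "SU", "X"]
COMPAT = {"IR": "++++++--", "NR": "++++++--", "LR": "+++++---"}

def compatible(requested, held):
    return COMPAT[requested][HELD.index(held)] == "+"

def blockers(lock_table, node, requested, txn):
    """List the (transaction, mode) pairs on a node that conflict with the request."""
    return [(holder, mode) for holder, mode in lock_table.get(node, [])
            if holder != txn and not compatible(requested, mode)]

# Assumed lock table after T1 and T2 have requested their deletes.
lock_table = {
    "bib":    [("T1", "IX"), ("T2", "IX")],
    "book":   [("T1", "IX"), ("T2", "IX")],
    "author": [("T1", "CX"), ("T2", "CX")],
    "first":  [("T1", "X")],
    "last":   [("T2", "X")],
}

t4_on_book = blockers(lock_table, "book", "LR", "T4")      # level read of <book>
t4_on_author = blockers(lock_table, "author", "LR", "T4")  # level read of <author>
t3_on_author = blockers(lock_table, "author", "NR", "T3")  # node read of <author>
```

LR on <book> meets only IX locks and is granted, LR on <author> meets the CX locks of T1 and T2 and must wait, and the NR request of T3 on <author> is compatible with CX.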
taDOM* Group – Lock Protocol Optimization
Optimization steps
• locking nodes without accessing the document
• taDOM2+: LRIX, SRIX, LRCX, SRCX
• taDOM3, taDOM3+
Example: lock conversion in taDOM3
[Table: lock conversion matrix of taDOM3 with 13 rows, built from the modes IR, NR, LR, SR, IX, CX, NU, NX, SU, and SX and from combined entries written with these modes; row and column headers and cell boundaries are lost in the extracted text.]
Performance Measurement
Data base (DataGuide)
Size
• ~8 MB
• 580,000 taDOM nodes
[Figure: DataGuide of the benchmark document, element names listed in the order shown]
Bank
Customers, Accounts
Customer: id, Address (Street, No, Zip, City), Name (Fname, Name)
Account: id, Owner, Balance, Credit, Standing_Orders, Protocols, Postings
Protocol, Posting, Standing_Order
Day, Receiver, AccountNo, ABA_No, Amount, Disposition
Performance Measurement (2)
Number of committed transactions in Banking benchmark
[Figure: number of committed transactions (y-axis, 100 to 400) versus lock depth (x-axis, 0 to 5) for the lock protocols taDOM3+, taDOM3, taDOM2+, taDOM2, URIX, IRIX, IRIX+, RIX, RIX+, OO2PL, NO2PL, Node2PL. Curve groups labeled in the plot: taDOM3+, taDOM2+; taDOM3, taDOM2; URIX; RIX(+), IRIX(+); Node2PL, NO2PL, OO2PL.]
Performance Measurement (3)
Number of aborted transactions in Banking benchmark
[Figure: number of aborted transactions (y-axis, 0 to 700) versus lock depth (x-axis, 0 to 5) for the lock protocols taDOM3+, taDOM3, taDOM2+, taDOM2, URIX, IRIX, IRIX+, RIX, RIX+, OO2PL, NO2PL, Node2PL. Curve groups labeled in the plot: RIX(+), IRIX(+); taDOM, URIX; Node2PL, NO2PL, OO2PL.]
Conclusions and Outlook
XTC is used as a test vehicle for empirical DB research
• DeweyIDs are the concept for efficient internal XML processing
• index structures and a toolbox with PPOs greatly support XML query evaluation
• effectiveness of XML concurrency control
  - fine-granular locking on nodes and edges
  - taDOM* protocols
    multiplicity of lock modes
    intention locks are important
    indexed document access is frequent
    ancestor path locking without accessing the storage engine is desirable
• performance evaluation revealed
  - use of tailored lock modes pays off
  - influence of node numbering schemes (insertions at arbitrary positions)
Outlook
• work on logical XML algebra and selectivity estimation for flexible query optimization
• adaptivity aspects in layered DBMS architectures