8/17/20151 querying xml database using relational database system rucha patel ms cs (spring 2008)...
04/19/23
Querying XML Database Using Relational Database System
Rucha Patel
MS CS (Spring 2008)
Advanced Database Systems CSc 8712
Instructor: Dr. Yingshu Li

Outline of Presentation
1. Background Information regarding XML
2. Storing XML documents in relational DB system
3. Querying & Manipulating XML data
  1. XML Data Models for Query Processing
  2. XML Labeling Schemes
  3. Structural Joins
4. General Technique for Querying XML Documents using Relational DB System
5. XQL (XML Query Language)
6. Conclusion

Background Information - XML
• Evolved from a document markup language
• For exchange of structured and semi-structured data
• For self-describing data exchanged between heterogeneous data sources
• XML Data Management Systems
  • Specialized system: only for XML documents
  • General system: manages XML along with other data formats

Background Information - XML (Contd...)
• XML is a recommendation of W3C
• XML Schema: type system for XML
• XPath: a language for navigating within XML documents
• XSLT: an XML transformation language
• XQuery: a general-purpose XML query language
  • Based on XML Schema types
  • Includes XPath as a subset

Storing XML Documents in RDB System
1) The simplest approach is to use a long character string data type, such as CLOB in SQL
• Stores the entire document as a character string
• Provides textual fidelity
• Fails to take advantage of the structural information available in the XML markup
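
As a minimal sketch of the character-string approach, the Python snippet below stores a document in a single TEXT column of an in-memory SQLite table, which stands in for a CLOB column. The table name `docs` and the sample document are assumptions made for illustration; they do not come from the slides.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document; not taken from the slides.
XML_TEXT = "<book><title>XML Basics</title><price>30</price></book>"

conn = sqlite3.connect(":memory:")
# SQLite's TEXT type stands in for a CLOB column here (an assumption).
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")
conn.execute("INSERT INTO docs (body) VALUES (?)", (XML_TEXT,))

# Textual fidelity: the stored string comes back character for character.
(stored,) = conn.execute("SELECT body FROM docs WHERE id = 1").fetchone()

# The database sees only an opaque string, so any structural question
# (here: the title) requires the application to parse the document itself.
title = ET.fromstring(stored).findtext("title")
print(stored == XML_TEXT, title)
```

The database returns the text unchanged, but it cannot answer a question such as "which title?" without help from the application, which is the drawback the last bullet describes.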

Storing XML Documents in RDB System (Contd...)
2) Shredding
• Distributes XML information across one or more columns of tables, preserving both data values and structural relationships
• XML schema => tables
  • Levels of elements
  • At each level, different tables for the elements in the hierarchy
• Schema-based shredding
  • Not efficient with
    • Sparse elements with varying contents
    • Mixed contents: text + child elements
  • Fails to preserve
    • Document ordering
    • Processing instructions of XML documents
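
The shredding idea can be sketched as follows, assuming a small hypothetical order document and a hand-written two-table schema (one table per level of the hierarchy). The table and column names are illustrative and are not taken from the slides.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical document and schema, assumed for illustration only.
XML_TEXT = """
<orders>
  <order id="1"><customer>Ann</customer>
    <item>pen</item><item>ink</item></order>
  <order id="2"><customer>Bob</customer>
    <item>pad</item></order>
</orders>
""".strip()

conn = sqlite3.connect(":memory:")
# One table per level of the element hierarchy.
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT)")
conn.execute("CREATE TABLE items (order_id INTEGER, name TEXT)")

for order in ET.fromstring(XML_TEXT).findall("order"):
    order_id = int(order.get("id"))
    conn.execute("INSERT INTO orders VALUES (?, ?)",
                 (order_id, order.findtext("customer")))
    for item in order.findall("item"):
        # The order_id column keeps the parent/child relationship.
        conn.execute("INSERT INTO items VALUES (?, ?)", (order_id, item.text))

# The items table has no position column, so the original order of the
# <item> elements is not recorded; the query sorts by name instead.
rows = conn.execute(
    "SELECT o.customer, i.name FROM orders o "
    "JOIN items i ON i.order_id = o.order_id "
    "ORDER BY o.order_id, i.name").fetchall()
print(rows)
```

Data values and parent/child relationships survive, but nothing in these tables records that "pen" came before "ink" in the document, which illustrates the document-ordering limitation listed above.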

Storing XML Documents in RDB System (Contd...)
3) XML Publishing
• To reconstruct XML documents from relational tables, systems usually provide the inverse operation, called "XML publishing"
• Systems with shredding + XML publishing are said to provide relational fidelity
  • The authoritative form of the data is relational, not XML
4) Native XML storage with XML fidelity
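
A minimal sketch of XML publishing, assuming two hypothetical shredded tables (`orders` and `items`) and a hypothetical helper `publish` that rebuilds a document from their rows. None of these names come from the slides.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
# Hypothetical shredded tables, assumed for illustration.
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT)")
conn.execute("CREATE TABLE items (order_id INTEGER, name TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])
conn.executemany("INSERT INTO items VALUES (?, ?)",
                 [(1, "pen"), (1, "ink"), (2, "pad")])

def publish(conn):
    """Build an <orders> document from the relational rows."""
    root = ET.Element("orders")
    order_rows = conn.execute(
        "SELECT order_id, customer FROM orders ORDER BY order_id").fetchall()
    for order_id, customer in order_rows:
        order = ET.SubElement(root, "order", id=str(order_id))
        ET.SubElement(order, "customer").text = customer
        item_rows = conn.execute(
            "SELECT name FROM items WHERE order_id = ? ORDER BY name",
            (order_id,))
        for (name,) in item_rows:
            ET.SubElement(order, "item").text = name
    return ET.tostring(root, encoding="unicode")

xml_out = publish(conn)
print(xml_out)
```

The published document is derived entirely from the tables, so the relational rows remain the authoritative copy of the data.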

Querying & Manipulating XML Data
• XML storage facility -> interface to access and manipulate stored data
• XPath: good navigation within documents, but
  • Cannot transform structures
  • Cannot construct new elements
• XSLT: transformation + construction, but
  • Recursive, template-driven nature makes it unsuitable for optimization
• XQuery: complete set of query facilities

Querying & Manipulating XML Data (Contd...)
• XML Data Model
  • XML documents as ordered, labeled, finite, unranked trees
  • Relative order of nodes: order of siblings
• Region encoding labeling scheme: <doc, start, end, level>
  • doc: the document to which the node belongs
  • start & end: position of the element in the document
  • level: level of the node in the tree
  • x is an ancestor of y (within the same document) if and only if x.start < y.start and x.end > y.end
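
The region encoding and its ancestor test can be sketched in Python as below. The slides do not say how start and end positions are assigned; this sketch assumes a simple counter that is incremented at each opening and closing tag during a depth-first walk. `Label`, `label_tree` and `is_ancestor` are hypothetical names.

```python
import xml.etree.ElementTree as ET
from typing import NamedTuple

class Label(NamedTuple):
    doc: int
    start: int
    end: int
    level: int

def label_tree(root, doc_id=0):
    """Assign <doc, start, end, level> labels in one depth-first pass."""
    labels = {}
    counter = 0

    def visit(node, level):
        nonlocal counter
        counter += 1
        start = counter           # position of the opening tag (assumed scheme)
        for child in node:
            visit(child, level + 1)
        counter += 1              # position of the closing tag
        labels[node] = Label(doc_id, start, counter, level)

    visit(root, 0)
    return labels

def is_ancestor(x, y):
    """x is an ancestor of y iff x's region strictly contains y's region."""
    return x.doc == y.doc and x.start < y.start and x.end > y.end

root = ET.fromstring("<a><b><c/></b><d/></a>")
labels = label_tree(root)
a = labels[root]
b = labels[root.find("b")]
c = labels[root.find("b/c")]
d = labels[root.find("d")]
print(a, b, c, d)
print(is_ancestor(a, c), is_ancestor(b, d))
```

With these labels the ancestor check is two integer comparisons, with no tree traversal needed at query time.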

Querying & Manipulating XML Data (Contd...)
• XML Labeling Schemes
• To evaluate queries in XPath, XSLT & XQuery, either
  1. Maintain results in document order throughout the evaluation
    • Restricts the choice of query plans
    • Impossible if the query requires data to be re-sorted along a different axis at some point
  2. Sort operator: applied at appropriate times
    • Assign each node a label denoting its relative order
    • For example, the region encoding scheme
    • Ancestor-descendant problem
    • Variable-size labeling scheme
      • No need to relabel a node on update
      • Difficult to allocate a fixed portion of each record for the label
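
The sort-operator option can be illustrated with a short sketch: intermediate results may arrive in any order, and a sort on the node labels restores document order when it is needed. The node list and the helper `sort_document_order` are hypothetical, and the labels assume the <doc, start, end, level> region encoding.

```python
from typing import NamedTuple

class Node(NamedTuple):
    name: str
    doc: int
    start: int
    end: int
    level: int

# Hypothetical labeled nodes, listed in the order some query plan
# happened to produce them (not document order).
results = [
    Node("d", 0, 6, 7, 1),
    Node("c", 0, 3, 4, 2),
    Node("a", 0, 1, 8, 0),
    Node("b", 0, 2, 5, 1),
]

def sort_document_order(nodes):
    """Sort operator: order by document, then by start position."""
    return sorted(nodes, key=lambda n: (n.doc, n.start))

ordered = [n.name for n in sort_document_order(results)]
print(ordered)
```

Because the label carries the ordering, the plan is free to process nodes in whatever order is cheapest and to sort only at the points where document order matters.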

Querying & Manipulating XML Data (Contd...)
• Storing XML in RDBMS
  • Labeling scheme + edge shredding = a single relation for storing an XML doc
  • Edge relation
    1. Global Encoding Scheme
      • Edge(id, parent-id, end, path-id, value)
    2. Local Encoding Scheme
      • Edge(id, parent-id, sIndex, path-id, value)
      • sIndex: position of a node among its siblings
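
A minimal sketch of the local encoding scheme, using SQLite from Python. Column names follow the Edge relation above with hyphens replaced by underscores. Two simplifications are assumptions of this sketch: `path_id` holds the root-to-node path as a string rather than a reference to a separate path table, and only elements (not attributes) are shredded. The function `shred` is a hypothetical name.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE edge (
    id INTEGER PRIMARY KEY, parent_id INTEGER, s_index INTEGER,
    path_id TEXT, value TEXT)""")

def shred(node, parent_id=None, s_index=1, path=""):
    """Insert one row per element; s_index is the position among siblings."""
    path = f"{path}/{node.tag}"
    text = (node.text or "").strip() or None
    cur = conn.execute(
        "INSERT INTO edge (parent_id, s_index, path_id, value) "
        "VALUES (?, ?, ?, ?)", (parent_id, s_index, path, text))
    node_id = cur.lastrowid
    for position, child in enumerate(node, start=1):
        shred(child, node_id, position, path)

shred(ET.fromstring(
    "<book><title>XML</title><author>Ann</author><author>Bob</author></book>"))

# Sibling order is recovered from s_index rather than from a global position.
authors = conn.execute(
    "SELECT value FROM edge WHERE path_id = '/book/author' "
    "ORDER BY parent_id, s_index").fetchall()
print(authors)
```

Since `s_index` is local to a parent, inserting a new sibling only affects the rows that share that parent, whereas a global position would shift for every later node in the document.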

General Technique for Querying XML Doc in RDBMS
• To store and query an XML doc
  1. Relational schema generation: table creation
  2. Shredding: storing the XML doc
  3. Converting queries over the stored XML into SQL queries over the created tables
• Each relational schema generation technique requires its own query processor to convert the queries
• But the same query processor can be used...
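
The three steps listed above can be sketched end to end as follows. This is a simplified illustration, not the translation algorithm of any particular system: the `edge` table layout and the function `path_to_sql` are assumptions, and only child-only paths such as `book/author` are handled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Step 1: relational schema generation (a hypothetical edge-style table).
conn.execute("CREATE TABLE edge (id INTEGER PRIMARY KEY, parent_id INTEGER, "
             "tag TEXT, value TEXT)")
# Step 2: shredding, written out by hand for
# <book><title>XML</title><author>Ann</author></book>.
conn.executemany("INSERT INTO edge VALUES (?, ?, ?, ?)", [
    (1, None, "book", None),
    (2, 1, "title", "XML"),
    (3, 1, "author", "Ann"),
])

def path_to_sql(path):
    """Step 3: turn a child-only path such as 'book/author' into SQL.

    Each step becomes one alias of the edge table, joined to the
    previous alias through parent_id.
    """
    tags = path.strip("/").split("/")
    joins = ["edge e0"]
    conditions = ["e0.parent_id IS NULL"]
    for i in range(1, len(tags)):
        joins.append(f"JOIN edge e{i} ON e{i}.parent_id = e{i - 1}.id")
    for i in range(len(tags)):
        conditions.append(f"e{i}.tag = ?")
    last = len(tags) - 1
    sql = (f"SELECT e{last}.value FROM " + " ".join(joins)
           + " WHERE " + " AND ".join(conditions))
    return sql, tags

sql, params = path_to_sql("book/author")
result = conn.execute(sql, params).fetchall()
print(sql)
print(result)
```

A translator written this way is tied to one table layout, which is the limitation the slide points out: a different schema generation technique would need a different translator.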

Contd...
• To use the same query processor for relational schema generation and query conversion:
  • Along with shredding, a Reconstruction XML View is created over the relational tables
  • It virtually reconstructs the stored XML doc from the shredded rows
  • It behaves just like a normal view over the stored XML doc
  • Queries on the stored XML = queries over the Reconstruction XML View

Contd...
• For relational schema generation, a program that
  • Generates the desired relational schema
  • Produces an XML shredder object
  • Creates the reconstruction XML view
• For either a
  • Shared relational schema
  • Edge relational schema

Contd...
Shared Relational Schema
• Steps to generate the relational schema
  • Create a DTD graph with nodes for XML elements, attributes, and operators
  • Create a relation for the root element of the graph
  • All children of an element are represented in the same relation as the element, EXCEPT
    • A *-node, which is a set of values and cannot be captured by relational expressions
    • So, create a separate relation for these nodes
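
The schema generation steps can be sketched as follows. This is a simplified illustration under stated assumptions: the DTD graph is given as a Python dictionary, it is acyclic, each element has a single parent, and attributes are left out. `DTD_GRAPH` and `generate_schema` are hypothetical names.

```python
# Hypothetical DTD graph for:
#   <!ELEMENT book (title, author*)>
#   <!ELEMENT author (name, email)>
# Each entry maps an element to (child, operator) pairs; "*" marks a
# set-valued child and "" marks a child that occurs exactly once.
DTD_GRAPH = {
    "book": [("title", ""), ("author", "*")],
    "title": [],
    "author": [("name", ""), ("email", "")],
    "name": [],
    "email": [],
}

def generate_schema(graph, root):
    """Return {relation name: [column names]} for the DTD graph."""
    schema = {}

    def build_relation(element, parent_relation):
        columns = [f"{element}_id"]
        if parent_relation is not None:
            columns.append(f"parent_{parent_relation}_id")
        schema[element] = columns
        inline_children(element, element, prefix="")

    def inline_children(element, relation, prefix):
        for child, operator in graph[element]:
            if operator == "*":
                # Set-valued child: give it a relation of its own.
                build_relation(child, relation)
            else:
                # Single-valued child: store it as a column of the parent.
                column = f"{prefix}{child}"
                schema[relation].append(column)
                inline_children(child, relation, prefix=f"{column}_")

    build_relation(root, None)
    return schema

schema = generate_schema(DTD_GRAPH, "book")
print(schema)
```

Here `title` is inlined into the `book` relation, while `author`, which sits below a *-node, gets its own relation with a reference back to its parent.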

XQL (XML Query Language)
• Structured queries: relational / OO DB
• Unstructured queries: documents
• Semi-structured queries: XML documents
• Features like
  • Allows the user to combine information from multiple sources
  • Uses links as a part of a query
  • Search based on text containment
• E.g.
  • Doc 1: recommended books
  • Doc 2: books + prices
  • Doc 3: reviews of books
  • Then, a query -> list recommended books, prices and reviews
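
The three-document example can be sketched in Python to show what such a combining query has to compute. This is not XQL syntax; the document contents and element names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents matching the three sources in the example.
RECOMMENDED = ("<recommended><book>XML Basics</book>"
               "<book>SQL Primer</book></recommended>")
PRICES = ("<prices><book><title>XML Basics</title><price>30</price></book>"
          "<book><title>SQL Primer</title><price>25</price></book>"
          "<book><title>Other</title><price>10</price></book></prices>")
REVIEWS = ("<reviews><review title='XML Basics'>Clear.</review>"
           "<review title='SQL Primer'>Thorough.</review></reviews>")

# Index documents 2 and 3 by book title.
prices = {b.findtext("title"): b.findtext("price")
          for b in ET.fromstring(PRICES).findall("book")}
reviews = {r.get("title"): r.text
           for r in ET.fromstring(REVIEWS).findall("review")}

# Drive the result from document 1 and look up the other two.
combined = [(b.text, prices.get(b.text), reviews.get(b.text))
            for b in ET.fromstring(RECOMMENDED).findall("book")]
print(combined)
```

The book title acts as the join key across the three sources, and books that are not recommended (such as "Other") drop out of the result.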

XQL (XML Query Language) Contd...
Difference between SQL & XQL Query
SQL | XQL
The database is a set of tables. | The database is a set of one or more XML documents.
Uses the structure of tables as a basic model. | Uses the structure of XML documents as a basic model.
The FROM clause determines the tables which are examined by the query. | A query is given a list of input nodes from one or more documents.
The result of a query is a table containing a set of rows; this table may serve as the basis for further queries. | The result of a query is a list of XML document nodes, which may serve as the basis for further queries.

XQL (XML Query Language) Contd...
• Basic Concepts of XQL
  • Simple string: element name
    • E.g. table
  • '/': child operator, indicates hierarchy
    • E.g. front/author
  • front/author='Theodore Seuss Geisel'
  • front/author/address/@type='email'
  • front//address
  • //address
  • front/author/address[@type='email']
  • front/author='Theodore Seuss Geisel'[@gender='male' and shoesize='9EEEE']
  • section[1,3 to 5, 8, -1]
  • section[@level='3'][1 to 2]
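
Several of these path expressions have close counterparts in the limited XPath subset of Python's `xml.etree.ElementTree`, which gives a way to try the ideas out. The sketch below is an approximation, not an XQL implementation: the sample document is hypothetical, the author's name is placed in a `name` child element (an assumption), and the `//address` form is written `.//address` as ElementTree requires.

```python
import xml.etree.ElementTree as ET

# Hypothetical document shaped to fit the path expressions above.
doc = ET.fromstring(
    "<doc><front><author><name>Theodore Seuss Geisel</name>"
    "<address type='email'>ted@example.org</address>"
    "<address type='postal'>La Jolla</address>"
    "</author></front>"
    "<back><address type='postal'>New York</address></back></doc>")

authors = doc.findall("front/author")             # child operator '/'
front_addresses = doc.findall("front//address")   # descendants of front
all_addresses = doc.findall(".//address")         # '//address' anywhere
emails = doc.findall("front/author/address[@type='email']")   # attribute filter
named = doc.findall("front/author[name='Theodore Seuss Geisel']")

print(len(authors), len(front_addresses), len(all_addresses))
print([e.text for e in emails], len(named))
```

The subscript forms such as `section[1,3 to 5, 8, -1]` have no direct equivalent in this subset and are left out of the sketch.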

XQL (XML Query Language) Contd...
Example Queries

XQL (XML Query Language) Contd...
• Grouping of results
  • Query: a query that lists the products on invoices might want to group products by invoice, placing each group of products within an invoice tag
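
The grouping behavior described above can be sketched in Python. This shows the shape of the grouped result rather than XQL syntax; the flat input rows, the element names and the function `group_by_invoice` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from itertools import groupby

# Hypothetical flat query result: (invoice number, product) pairs.
rows = [(1, "pen"), (1, "ink"), (2, "pad")]

def group_by_invoice(rows):
    """Wrap each invoice's products in one <invoice> element."""
    root = ET.Element("invoices")
    # groupby needs input sorted on the grouping key; sorted() is stable,
    # so products keep their original relative order within an invoice.
    ordered = sorted(rows, key=lambda r: r[0])
    for number, group in groupby(ordered, key=lambda r: r[0]):
        invoice = ET.SubElement(root, "invoice", number=str(number))
        for _, product in group:
            ET.SubElement(invoice, "product").text = product
    return root

grouped = ET.tostring(group_by_invoice(rows), encoding="unicode")
print(grouped)
```

Each group of products ends up nested inside its own invoice tag, which is the structure the slide describes.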

XQL (XML Query Language) Contd...
• Join
  • Combines information from multiple sources to create one unified view
  • Queries can be written like,

Conclusion
• XML documents can be stored efficiently in a relational database system using a number of approaches.
• The General Technique for storing and querying XML documents using an RDBMS eliminates the need for separate query processors for XML query translation.
• Using the General Technique, a Reconstruction XML View can be generated for both shared and edge-based relational schemas.
• Stored XML documents can be queried effectively through the use of XQuery, XPath, XSLT or XQL.

References
• "XML and Relational Database Management Systems: the Inside Story" by Michael Rys, Don Chamberlin & Daniela Florescu.
• "A General Technique for Querying XML Documents using a Relational Database System" by Jayavel Shanmugasundaram, Rajasekar Krishnamurthy & Igor Tatarinov.
• "Querying and Maintaining Ordered XML Data Using Relational Databases" by William Shui, Franky Lam, Damien Fisher & Raymond Wong.
• "Querying Structured Text in an XML Database" by Shurug Al-Khalifa, Cong Yu & H. V. Jagadish.
• "Structured Materialized Views for XML Queries" by Andrei Arion, Veronique Benzaken & Ioana Manolescu.

Thank You.
Any Questions?