Querying XML Database Using Relational Database System

Rucha Patel
MS CS (Spring 2008)
Advanced Database Systems CSc 8712
Instructor: Dr. Yingshu Li




Outline of Presentation

1. Background Information regarding XML
2. Storing XML documents in relational DB system
3. Querying & Manipulating XML data
   1. XML Data Models for Query Processing
   2. XML Labeling Schemes
   3. Structural Joins
4. General Technique for Querying XML Documents using Relational DB System
5. XQL (XML Query Language)
6. Conclusion


Background Information - XML

• Evolved from a document markup language
• For exchange of structured and semi-structured data
• For self-describing data -> between heterogeneous data sources
• XML Data Management Systems
   • Specialized system – only for XML documents
   • General system – manages XML along with other data formats


Background Information – XML (Contd…)

• XML is a recommendation of W3C
• XML Schema – type system for XML
• XPath – a language for navigating within XML documents
• XSLT – an XML transformation language
• XQuery – a general-purpose XML query language
   • Based on XML Schema types
   • Includes XPath as a subset


Storing XML Documents in RDB System

1) Simplest one is to use a Long Character String data type, like CLOB in SQL
   • Will store the entire document as a character string
   • Textual Fidelity
   • Fails to take advantage of structural information available in XML markup
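A minimal sketch of the character-string approach, assuming SQLite as a stand-in RDBMS (it has no CLOB type, so a TEXT column plays that role) and a made-up `docs` table and sample document. It illustrates both points above: the text comes back unchanged, but SQL cannot see inside it.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document, not taken from the slides.
XML_DOC = "<book><title>XML Basics</title><price>30</price></book>"

conn = sqlite3.connect(":memory:")
# SQLite has no CLOB type; TEXT is used here as the long-string column.
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")
conn.execute("INSERT INTO docs (body) VALUES (?)", (XML_DOC,))

# Textual fidelity: the stored string is returned exactly as inserted.
stored = conn.execute("SELECT body FROM docs WHERE id = 1").fetchone()[0]

# The database only sees an opaque string, so the structure has to be
# recovered by parsing outside of SQL.
title = ET.fromstring(stored).findtext("title")
```

Any filter on `title` or `price` would need the application to fetch and parse every document, which is the cost the last bullet refers to.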


Storing XML Documents in RDB System (Contd…)

2) Shredding
   • Distributes XML information across one or more columns of tables, preserving both data values & structural relationships.
   • XML schema => tables
      • Levels of elements: at each level, different tables for the elements in the hierarchy
   • Schema-Based Shredding
      • Not efficient with
         • Sparse elements with varying contents
         • Mixed contents (text + child elements)
      • Fails to preserve
         • Document ordering
         • Processing instructions of XML documents
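Shredding can be sketched as follows, assuming SQLite, a hypothetical `catalog`/`book` document, and a hand-written mapping of one element type to one table. A real schema-based shredder would derive the mapping from the XML schema.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document, not taken from the slides.
XML_DOC = """
<catalog>
  <book id="b1"><title>XML Basics</title><price>30</price></book>
  <book id="b2"><title>SQL Primer</title><price>25</price></book>
</catalog>
"""

def shred_books(conn, xml_text):
    """Store each <book> element as one row of a 'book' table."""
    conn.execute("CREATE TABLE book (id TEXT PRIMARY KEY, title TEXT, price REAL)")
    root = ET.fromstring(xml_text)
    for book in root.findall("book"):
        conn.execute(
            "INSERT INTO book VALUES (?, ?, ?)",
            (book.get("id"), book.findtext("title"), float(book.findtext("price"))),
        )

conn = sqlite3.connect(":memory:")
shred_books(conn, XML_DOC)

# Once shredded, element values can be queried with plain SQL.
cheap = conn.execute("SELECT title FROM book WHERE price < 28").fetchall()
```

The table has no column recording where a book appeared in the document, which is one way document ordering gets lost.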


Storing XML Documents in RDB System (Contd…)

3) XML Publishing
   • To reconstruct XML documents from relational tables, systems usually provide the inverse operation, called "XML Publishing".
   • Systems with shredding + XML publishing are said to provide Relational Fidelity,
      • as the authoritative form of the data is relational, not XML.

4) Native XML with XML Fidelity.
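XML publishing, the inverse of shredding, might look like the sketch below. The `book` table, its rows and the `publish_catalog` helper are illustrative assumptions, not something the slides define.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (id TEXT PRIMARY KEY, title TEXT, price REAL)")
conn.executemany(
    "INSERT INTO book VALUES (?, ?, ?)",
    [("b1", "XML Basics", 30.0), ("b2", "SQL Primer", 25.0)],
)

def publish_catalog(conn):
    """Rebuild a <catalog> document from the rows of the book table."""
    root = ET.Element("catalog")
    rows = conn.execute("SELECT id, title, price FROM book ORDER BY id")
    for book_id, title, price in rows:
        book = ET.SubElement(root, "book", id=book_id)
        ET.SubElement(book, "title").text = title
        ET.SubElement(book, "price").text = str(price)
    return ET.tostring(root, encoding="unicode")

published = publish_catalog(conn)
```

The prices come back as `30.0` and `25.0` because the relational value is authoritative. Whatever the original text looked like, only the relational content is reproduced, which is what relational fidelity means here.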


Querying & Manipulating XML Data

• XML storage facility -> interface to access and manipulate stored data.
• XPath – good navigation within documents, but
   • Cannot transform structures
   • Cannot construct new elements
• XSLT – transformation + construction, but
   • Recursive template-driven nature – unsuitable for optimization
• XQuery – complete set of query facilities.


Querying & Manipulating XML Data (Contd…)

XML Data Model
• XML documents as ordered, labeled, finite, unranked trees.
• Relative order of nodes – order of siblings
• Region encoding labeling scheme: <doc, start, end, level>
   • doc – the document to which the node belongs
   • start & end – position of the element in the document
   • level – level of the node in the tree
• x is an ancestor of y (within the same document) if and only if
   • x.start < y.start and x.end > y.end
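The region encoding can be sketched as below. The numbering rule (one counter tick per opening and per closing tag) and the names `Region`, `label` and `is_ancestor` are assumptions for illustration; the slides only fix the `<doc, start, end, level>` shape and the ancestor test.

```python
import xml.etree.ElementTree as ET
from typing import NamedTuple

class Region(NamedTuple):
    doc: int
    start: int
    end: int
    level: int
    tag: str

def label(xml_text, doc_id=1):
    """Assign a <doc, start, end, level> label to every element."""
    counter = 0
    regions = []

    def visit(elem, level):
        nonlocal counter
        counter += 1                 # position of the opening tag
        start = counter
        slot = len(regions)
        regions.append(None)         # reserve the slot to keep document order
        for child in elem:
            visit(child, level + 1)
        counter += 1                 # position of the closing tag
        regions[slot] = Region(doc_id, start, counter, level, elem.tag)

    visit(ET.fromstring(xml_text), 0)
    return regions

def is_ancestor(x, y):
    """x is an ancestor of y iff x's region strictly contains y's region."""
    return x.doc == y.doc and x.start < y.start and x.end > y.end

a, b, c, d = label("<a><b><c/></b><d/></a>")
```

Here `a` gets the region (1, 8), `b` gets (2, 5), `c` gets (3, 4) and `d` gets (6, 7), so `is_ancestor(a, c)` holds while `is_ancestor(b, d)` does not. No tree traversal is needed, only two integer comparisons.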


Querying & Manipulating XML Data (Contd…)

XML Labeling Schemes
To evaluate queries in XPath, XSLT & XQuery, either:
1. Maintain results in document order throughout the evaluation
   • Restricts choice of query plans
   • Impossible if the query requires data to be re-sorted along a different axis at some point.
2. Sort Operator – applied at appropriate times
   • Assign each node a label denoting its relative order
   • E.g., the region encoding scheme
      • Ancestor-descendant problem
   • Variable-size labeling scheme
      • No need to relabel a node on update.
      • Difficult to allocate a fixed portion of each record for the label.
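The second option can be sketched as a sort on the node labels. The tuples below are hypothetical region labels `(doc, start, end, level, tag)` for an intermediate result that some operator produced out of order.

```python
# Hypothetical intermediate result, deliberately not in document order.
unordered = [
    (1, 6, 7, 1, "d"),
    (1, 2, 5, 1, "b"),
    (1, 3, 4, 2, "c"),
    (1, 1, 8, 0, "a"),
]

def sort_document_order(rows):
    """Sort operator: ordering by (doc, start) restores document order."""
    return sorted(rows, key=lambda row: (row[0], row[1]))

ordered_tags = [row[4] for row in sort_document_order(unordered)]
```

Because order can be restored from the labels at any point, earlier operators are free to use plans that do not preserve document order.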


Querying & Manipulating XML Data (Contd…)

Storing XML in RDBMS
• Labeling scheme + edge shredding = a single relation for storing an XML doc
• Edge relation
   1. Global Encoding Scheme
      • Edge(id, parent-id, end, path-id, value)
   2. Local Encoding Scheme
      • Edge(id, parent-id, sIndex, path-id, value)
      • sIndex – position of a node among its siblings
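A sketch of edge shredding with the local encoding scheme, assuming SQLite. The slides only name `path-id`, so the separate `path` lookup table is an assumption here, as is the choice to number ids in document order and to ignore attributes.

```python
import sqlite3
import xml.etree.ElementTree as ET

def shred_edges(conn, xml_text):
    """Store every element as one row of Edge(id, parent_id, s_index, path_id, value)."""
    conn.execute(
        "CREATE TABLE edge (id INTEGER PRIMARY KEY, parent_id INTEGER, "
        "s_index INTEGER, path_id INTEGER, value TEXT)"
    )
    # Assumed helper table mapping each distinct root-to-node path to an id.
    conn.execute("CREATE TABLE path (path_id INTEGER PRIMARY KEY, path TEXT UNIQUE)")
    path_ids = {}
    next_id = 0

    def visit(elem, parent_id, s_index, parent_path):
        nonlocal next_id
        next_id += 1
        node_id = next_id
        path = parent_path + "/" + elem.tag
        if path not in path_ids:
            path_ids[path] = len(path_ids) + 1
            conn.execute("INSERT INTO path VALUES (?, ?)", (path_ids[path], path))
        text = (elem.text or "").strip() or None
        conn.execute(
            "INSERT INTO edge VALUES (?, ?, ?, ?, ?)",
            (node_id, parent_id, s_index, path_ids[path], text),
        )
        # s_index is the 1-based position of the child among its siblings.
        for position, child in enumerate(elem, start=1):
            visit(child, node_id, position, path)

    visit(ET.fromstring(xml_text), None, 1, "")

conn = sqlite3.connect(":memory:")
shred_edges(conn, "<front><author>Seuss</author><author>Geisel</author></front>")
rows = conn.execute(
    "SELECT id, parent_id, s_index, path_id, value FROM edge ORDER BY id"
).fetchall()
```

The two `author` rows share one `path_id` and differ only in `s_index`, so sibling order survives without a global position.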


General Technique for Querying XML Doc in RDBMS

To store and query an XML doc:
1. Relational schema generation – table creation
2. Shredding – storing the XML doc
3. Converting queries over the stored XML into SQL queries over the created tables

• Each relational schema generation technique requires its own query processor to convert the queries
• But the same query processor can be used..
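Step 3 can be sketched for the simplest case, a child-only path such as `front/author`. The `edge` and `path` tables, their contents and the `path_to_sql` function are hypothetical; a real translator would also cover predicates, descendant steps and result construction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE path (path_id INTEGER PRIMARY KEY, path TEXT UNIQUE);
CREATE TABLE edge (id INTEGER PRIMARY KEY, parent_id INTEGER,
                   s_index INTEGER, path_id INTEGER, value TEXT);
INSERT INTO path VALUES (1, '/front'), (2, '/front/author'), (3, '/front/title');
INSERT INTO edge VALUES (1, NULL, 1, 1, NULL), (2, 1, 1, 2, 'Seuss'),
                        (3, 1, 2, 2, 'Geisel'), (4, 1, 3, 3, 'Hop on Pop');
""")

def path_to_sql(path_query):
    """Translate a child-only path into an SQL string plus its parameters."""
    steps = [step for step in path_query.split("/") if step]
    if not steps or any(not step.isidentifier() for step in steps):
        raise ValueError("only simple child paths are supported in this sketch")
    sql = (
        "SELECT e.value FROM edge AS e "
        "JOIN path AS p ON e.path_id = p.path_id "
        "WHERE p.path = ? ORDER BY e.id"   # ids are assumed to follow document order
    )
    return sql, ("/" + "/".join(steps),)

sql, params = path_to_sql("front/author")
authors = [row[0] for row in conn.execute(sql, params)]
```

The translation is tied to this particular table layout, which is why a different schema generation technique would normally need a different translator.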


Contd...

To use the same query processor for relational schema generation and converting queries:
• Along with shredding, a Reconstruction XML View is created over the relational tables
• It virtually reconstructs the stored XML doc from the shredded rows.
• It behaves just like a normal view over the stored XML doc.
• Queries on stored XML = queries over the Reconstruction XML View


Contd...

For relational schema generation, a program that
• Generates the desired relational schema
• Produces an XML shredder object
• Creates the reconstruction XML view
   • for either
      • Shared relational schema
      • Edge relational schema


Contd...

Shared Relational Schema
Steps to generate the relational schema:
• Create a DTD graph with nodes for XML elements, attributes and operators
• Create a relation for the root element in the graph
• All children of an element are represented in the same relation as the element, EXCEPT
   • a *-node, which is a 'set' of values and can't be captured by relational expressions
   • So, create a separate relation for these nodes.
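The steps above can be sketched as follows. The dictionary form of the DTD graph, the column naming and the `generate_relations` function are assumptions, and the sketch applies only the rule stated on this slide (inline everything except children under a `*`), for an acyclic graph.

```python
# Hypothetical DTD graph: element -> list of (child, occurrence operator).
# '' means exactly one, '?' means optional, '*' means zero or more.
DTD_GRAPH = {
    "book": [("title", ""), ("price", "?"), ("author", "*")],
    "author": [("name", ""), ("email", "*")],
    "title": [], "price": [], "name": [], "email": [],
}

def generate_relations(graph, root):
    """Return {relation name: column list} for an acyclic DTD graph."""
    relations = {}

    def build(element, parent_relation):
        # A new relation gets its own id and a foreign key to its parent.
        columns = ["id"]
        if parent_relation is not None:
            columns.append(parent_relation + "_id")
        if not graph[element]:
            columns.append("value")      # leaf element stored in its own relation
        relations[element] = columns
        inline(element, element, "")

    def inline(element, relation, prefix):
        for child, operator in graph[element]:
            if operator == "*":
                build(child, relation)   # set-valued child: separate relation
            elif graph[child]:
                inline(child, relation, prefix + child + "_")
            else:
                relations[relation].append(prefix + child)

    build(root, None)
    return relations

relations = generate_relations(DTD_GRAPH, "book")
```

With this input, `title` and `price` become columns of `book`, while the repeated `author` and `email` elements each get their own relation with a key back to the parent.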


XQL (XML Query Language)

• Structured queries – relational / OO DB
• Unstructured queries – documents
• Semi-structured queries – XML documents
• Features:
   • Allows the user to combine information from multiple sources
   • Uses links as a part of a query
   • Search based on text containment
• E.g.
   Doc 1 – recommended books
   Doc 2 – books + prices
   Doc 3 – reviews of books
   Then, a query -> list recommended books, prices and reviews.
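The three-document example can be sketched in Python with the standard `xml.etree.ElementTree` module standing in for an XQL engine. The document shapes, element names and the `recommended_with_details` function are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical contents for Doc 1, Doc 2 and Doc 3.
RECOMMENDED = "<recommended><book>XML Basics</book><book>SQL Primer</book></recommended>"
PRICES = (
    "<prices>"
    "<book><title>XML Basics</title><price>30</price></book>"
    "<book><title>SQL Primer</title><price>25</price></book>"
    "<book><title>Other</title><price>10</price></book>"
    "</prices>"
)
REVIEWS = (
    "<reviews>"
    "<review title='XML Basics'>Clear and short.</review>"
    "<review title='SQL Primer'>Good examples.</review>"
    "</reviews>"
)

def recommended_with_details(recommended_xml, prices_xml, reviews_xml):
    """List (title, price, review) for every recommended book."""
    prices = {
        book.findtext("title"): book.findtext("price")
        for book in ET.fromstring(prices_xml).findall("book")
    }
    reviews = {
        review.get("title"): review.text
        for review in ET.fromstring(reviews_xml).findall("review")
    }
    return [
        (book.text, prices.get(book.text), reviews.get(book.text))
        for book in ET.fromstring(recommended_xml).findall("book")
    ]

result = recommended_with_details(RECOMMENDED, PRICES, REVIEWS)
```

The book title acts as the join key across the three sources. The unrecommended book in the price list is dropped.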


XQL (XML Query Language) Contd…

Difference between SQL & XQL Query

SQL | XQL
The database is a set of tables. | The database is a set of one or more XML documents.
Uses the structure of tables as a basic model. | Uses the structure of XML documents as a basic model.
The FROM clause determines the tables which are examined by the query. | A query is given a list of input nodes from one or more documents.
The result of a query is a table containing a set of rows; this table may serve as the basis for further queries. | The result of a query is a list of XML document nodes, which may serve as the basis for further queries.


XQL (XML Query Language) Contd…

Basic Concepts of XQL
• Simple string – element name
   • E.g. table
• '/' – child operator – indicates hierarchy
   • E.g. front/author
• More examples:
   front/author='Theodore Seuss Geisel'
   front/author/address/@type='email'
   front//address
   //address
   front/author/address[@type='email']
   front/author='Theodore Seuss Geisel'[@gender='male' and shoesize='9EEEE']
   section[1,3 to 5, 8, -1]
   section[@level='3'][1 to 2]
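Some of these path expressions can be tried with the limited XPath subset in Python's `xml.etree.ElementTree`, used here only as a stand-in since it is not an XQL engine. The sample document and addresses are made up, and XQL-specific forms such as `section[1,3 to 5, 8, -1]` have no equivalent in this module.

```python
import xml.etree.ElementTree as ET

# Hypothetical document shaped to fit the example paths.
DOC = ET.fromstring(
    "<book>"
    "<front><author gender='male'>"
    "<address type='email'>seuss@example.org</address>"
    "<address type='postal'>La Jolla</address>"
    "</author></front>"
    "<body><address type='email'>editor@example.org</address></body>"
    "</book>"
)

# front/author : child steps from the context node (<book>)
authors = DOC.findall("front/author")

# front//address : any <address> descendant of <front>
front_addresses = [a.text for a in DOC.findall("front//address")]

# //address : any <address> in the document (ElementTree needs a leading '.')
all_addresses = [a.text for a in DOC.findall(".//address")]

# front/author/address[@type='email'] : filter on an attribute value
emails = [a.text for a in DOC.findall("front/author/address[@type='email']")]
```

As in XQL, each result is a list of document nodes, so it can be fed into a further path step.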


XQL (XML Query Language) Contd…

Example Queries


XQL (XML Query Language) Contd…

Grouping of results
• Query –
   A query that lists the products on invoices might want to group products by invoice, placing each group of products within an invoice tag.
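Grouping can be sketched in Python as below. The flat `<items>` input, the attribute names and the `group_by_invoice` function are hypothetical, and the code shows the effect of grouping, not XQL syntax.

```python
import xml.etree.ElementTree as ET

# Hypothetical flat input: one <item> per product, tagged with its invoice number.
ITEMS = ET.fromstring(
    "<items>"
    "<item invoice='17'><product>Pen</product></item>"
    "<item invoice='18'><product>Ink</product></item>"
    "<item invoice='17'><product>Paper</product></item>"
    "</items>"
)

def group_by_invoice(items):
    """Place each invoice's products inside one <invoice> element."""
    root = ET.Element("invoices")
    groups = {}
    for item in items.findall("item"):
        number = item.get("invoice")
        if number not in groups:
            groups[number] = ET.SubElement(root, "invoice", number=number)
        ET.SubElement(groups[number], "product").text = item.findtext("product")
    return root

grouped = ET.tostring(group_by_invoice(ITEMS), encoding="unicode")
```

The result nests `Pen` and `Paper` under invoice 17 and `Ink` under invoice 18, so the output has a different structure from the input.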


XQL (XML Query Language) Contd…

Join
• Combines information from multiple sources to create one unified view.
• Queries can be written like,


Conclusion

XML documents can be stored efficiently in a relational database system using a number of approaches.

The General Technique for storing and querying XML documents using an RDBMS eliminates the need for separate query processors for XML query translation.

Using the General Technique, a Reconstruction XML View can be generated for both shared and edge-based relational schemas.

Stored XML documents can be queried effectively through the use of XQuery, XPath, XSLT or XQL.




Thank You.

Any Questions ???