end of sql xml

40
End of SQL XML April 22 th , 2002

Upload: keaira

Post on 25-Feb-2016

70 views

Category:

Documents


1 download

DESCRIPTION

End of SQL XML. April 22 th , 2002. Null Values. If x=Null then 4*(3-x)/7 is still NULL If x=Null then x=“Joe” is UNKNOWN Three boolean values: FALSE = 0 UNKNOWN = 0.5 TRUE = 1. Null Value Logic. C1 AND C2 = min(C1, C2) C1 OR C2 = max(C1, C2) - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: End of SQL XML

End of SQLXML

April 22th, 2002

Page 2: End of SQL XML

Null Values

• If x=Null then 4*(3-x)/7 is still NULL

• If x=Null then x=“Joe” is UNKNOWN• Three boolean values:

– FALSE = 0– UNKNOWN = 0.5– TRUE = 1

Page 3: End of SQL XML

Null Value Logic• C1 AND C2 = min(C1, C2)• C1 OR C2 = max(C1, C2)• NOT C1 = 1 – C1

SELECT *FROM PersonWHERE (age < 25) AND (height > 6 OR weight > 190)

Semantics of SQL: include only tuples that yield TRUE

Page 4: End of SQL XML

Null ValuesUnexpected behavior:

SELECT *FROM PersonWHERE age < 25 OR age >= 25

Some Persons are not included !

Page 5: End of SQL XML

Testing for NullCan test for NULL explicitly:

– x IS NULL– x IS NOT NULL

SELECT *FROM PersonWHERE age < 25 OR age >= 25 OR age IS NULL

Now it includes all Persons

Page 6: End of SQL XML

OuterjoinsExplicit joins in SQL:

Product(name, category) Purchase(prodName, store)

SELECT Product.name, Purchase.store FROM Product JOIN Purchase ON Product.name = Purchase.prodName

Same as:SELECT Product.name, Purchase.store

FROM Product, Purchase WHERE Product.name = Purchase.prodName

But Products that never sold will be lost !

Page 7: End of SQL XML

Left OuterjoinsLeft outer joins in SQL:

Product(name, category) Purchase(prodName, store)

SELECT Product.name, Purchase.store FROM Product LEFT OUTER JOIN Purchase ON Product.name = Purchase.prodName

Page 8: End of SQL XML

Name Category

Gizmo gadget

Camera Photo

OneClick Photo

ProdName Store

Gizmo Wiz

Camera Ritz

Camera Wiz

Name Store

Gizmo Wiz

Camera Ritz

Camera Wiz

OneClick -

Product Purchase

Page 9: End of SQL XML

Outer Joins

• Left outer join:– Include the left tuple even if there’s no match

• Right outer join:– Include the right tuple even if there’s no match

• Full outer join:– Include the both left and right tuples even if

there’s no match

Page 10: End of SQL XML

XML

Page 11: End of SQL XML

More Facts About XML

• Every database vendor has an XML page:– www.oracle.com/xml– www.microsoft.com/xml– www.ibm.com/xml

• Many applications are just fancier Websites• But, most importantly, XML enables data

sharing on the Web – hence our interest

Page 12: End of SQL XML

What is XML ?From HTML to XML

HTML describes the presentation: easy for humans

Page 13: End of SQL XML

HTML

<h1> Bibliography </h1><p> <i> Foundations of Databases </i> Abiteboul, Hull, Vianu <br> Addison Wesley, 1995<p> <i> Data on the Web </i> Abiteboul, Buneman, Suciu <br> Morgan Kaufmann, 1999

HTML is hard for applications

Page 14: End of SQL XML

XML<bibliography>

<book> <title> Foundations… </title> <author> Abiteboul </author> <author> Hull </author> <author> Vianu </author> <publisher> Addison Wesley </publisher> <year> 1995 </year> </book> …

</bibliography>

XML describes the content: easy for applications

Page 15: End of SQL XML

XML

• eXtensible Markup Language• Roots: comes from SGML

– A very nasty language• After the roots: a format for sharing data• Emerging format for data exchange on the

Web and between applications

Page 16: End of SQL XML

XML Applications

• Sharing data between different components of an application.

• Archive data in text files.• EDI: electronic data exchange:

– Transactions between banks– Producers and suppliers sharing product data (auctions)– Extranets: building relationships between companies

• Scientists sharing data about experiments.

Page 17: End of SQL XML

Web Services

• A new paradigm for creating distributed applications?

• Systems communicate via messages, contracts.

• Example: order processing system.• MS .NET, J2EE – some of the platforms• XML – a part of the story; the data format.

Page 18: End of SQL XML

XML Syntax

• Very simple:<db> <book> <title>Complete Guide to DB2</title> <author>Chamberlin</author> </book> <book> <title>Transaction Processing</title> <author>Bernstein</author> <author>Newcomer</author> </book> <publisher> <name>Morgan Kaufman</name> <state>CA</state> </publisher></db>

Page 19: End of SQL XML

XML Terminology• tags: book, title, author, …• start tag: <book>, end tag: </book>• start tags must correspond to end tags, and

conversely

Page 20: End of SQL XML

XML Terminology• an element: everything between tags

– example element: <title>Complete Guide to DB2</title>

– example element:

• elements may be nested• empty element: <red></red> abbreviated <red/>• an XML document has a unique root element

well formed XML document: if it has matching tags

<book> <title> Complete Guide to DB2 </title> <author>Chamberlin</author> </book>

Page 21: End of SQL XML

The XML Treedb

book book publisher

title author title author author name state“CompleteGuideto DB2”

“Chamberlin” “TransactionProcessing”

“Bernstein” “Newcomer”“MorganKaufman”

“CA”

Tags on nodesData values on leaves

Page 22: End of SQL XML

More XML Syntax: Attributes

<book price = “55” currency = “USD”> <title> Complete Guide to DB2 </title> <author> Chamberlin </author> <year> 1998 </year></book>

price, currency are called attributes

Page 23: End of SQL XML

Replacing Attributes with Elements

<book> <title> Complete Guide to DB2

</title> <author> Chamberlin </author> <year> 1998 </year> <price> 55 </price> <currency> USD </currency></book>

attributes are alternative ways to represent data

Page 24: End of SQL XML

“Types” (or “Schemas”) for XML

• Document Type Definition – DTD• Define a grammar for the XML document,

but we use it as substitute for types/schemas• Will be replaced by XML-Schema (will

extend DTDs)

Page 25: End of SQL XML

An Example DTD

• PCDATA means Parsed Character Data (a mouthful for string)

<!DOCTYPE db [ <!ELEMENT db ((book|publisher)*)> <!ELEMENT book (title,author*,year?)> <!ELEMENT title (#PCDATA)> <!ELEMENT author (#PCDATA)> <!ELEMENT year (#PCDATA)> <!ELEMENT publisher (#PCDATA)>]>

Page 26: End of SQL XML

More on DTDs: Attributes<!DOCTYPE db [ <!ELEMENT db ((book|publisher)*)> <!ELEMENT book (title,author*,year?)> . . . <!ATTLIS book price CDATA #REQUIRED language CDATA #IMPLIED> <!ATTLIS author phone CDATA #IMPLIED> ]>

<db> <book price=“55” language=“English”> <title> Complete Guide to DB2 </title> <author> Chamberlin </author> </book>…</db>

The type:CDATA = stringID = a keyIDREF = a foreign keyothers=rarely used

Default declaration:#REQUIRED=required#IMPLIED=optional#FIXED=fixed (rarely used)

Page 27: End of SQL XML

DTDs as Grammars

Same thing as:

• A DTD is a EBNF (Extended BNF) grammar• An XML tree is precisely a derivation tree

XML Documents that have a DTD and conform to it are called valid

db ::= (book|publisher)*book ::= (title,author*,year?)title ::= stringauthor ::= stringyear ::= stringpublisher ::= string

Page 28: End of SQL XML

More on DTDs as Grammars<!DOCTYPE paper [ <!ELEMENT paper (section*)> <!ELEMENT section ((title,section*) | text)> <!ELEMENT title (#PCDATA)> <!ELEMENT text (#PCDATA)>]>

<paper> <section> <text> </text> </section> <section> <title> </title> <section> … </section> <section> … </section> </section></paper>

XML documents can be nested arbitrarily deep

Page 29: End of SQL XML

XML for Representing Data

<persons><row> <name>John</name> <phone> 3634</phone></row> <row> <name>Sue</name> <phone> 6343</phone> <row> <name>Dick</name> <phone> 6363</phone></row>

</persons>

n a m e p h o n e

J o h n 3 6 3 4

S u e 6 3 4 3

D i c k 6 3 6 3

row row row

name name namephone phone phone

“John” 3634 “Sue” “Dick”6343 6363

persons XML: persons

Page 30: End of SQL XML

XML vs Data Models

• XML is self-describing• Schema elements become part of the data

– Relational schema: persons(name,phone)– In XML <persons>, <name>, <phone> are part

of the data, and are repeated many times• Consequence: XML is much more flexible• XML = semistructured data

Page 31: End of SQL XML

Semi-structured Data Explained

• Missing attributes:

• Repeated attributes

<person> <name> John</name> <phone>1234</phone> </person>

<person> <name>Joe</name></person> no phone !

<person> <name> Mary</name> <phone>2345</phone> <phone>3456</phone></person>

two phones !

Page 32: End of SQL XML

Semistructured Data Explained

• Attributes with different types in different objects

• Nested collections (no 1NF)• Heterogeneous collections:

– <db> contains both <book>s and <publisher>s

<person> <name> <first> John </first> <last> Smith </last> </name> <phone>1234</phone></person>

structured name !

Page 33: End of SQL XML

XML Data v.s. E/R, ODL, Relational

• Q: is XML better or worse ?• A: serves different purposes

– E/R, ODL, Relational models:• For centralized processing, when we control the data

– XML:• Data sharing between different systems• we do not have control over the entire data• E.g. on the Web

• Do NOT use XML to model your data ! Use E/R, ODL, or relational instead.

Page 34: End of SQL XML

Data Sharing with XML: Easy

Data source(e.g. relational

Database)

ApplicationWeb

XML

Page 35: End of SQL XML

Exporting Relational Data to XML

• Product(pid, name, weight)• Company(cid, name, address)• Makes(pid, cid, price)

product companymakes

Page 36: End of SQL XML

Export data grouped by companies

<db><company> <name> GizmoWorks </name> <address> Tacoma </address> <product> <name> gizmo </name> <price> 19.99 </price> </product> <product> …</product> …</company><company> <name> Bang </name> <address> Kirkland </address> <product> <name> gizmo </name> <price> 22.99 </price> </product> …</company>…

</db>

Redundantrepresentationof products

Page 37: End of SQL XML

The DTD

<!ELEMENT db (company*)><!ELEMENT company (name, address, product*)><!ELEMENT product (name,price)><!ELEMENT name (#PCDATA)><!ELEMENT address (#PCDATA)><!ELEMENT price (#PCDATA)>

Page 38: End of SQL XML

Export Data by Products<db> <product> <name> Gizmo </name> <manufacturer> <name> GizmoWorks </name> <price> 19.99 </price> <address> Tacoma </address> </manufacturer> <manufacturer> <name> Bang </name> <price> 22.99 </price> <address> Kirkland

</address> </manufacturer> … </product> <product> <name> OneClick </name> …</db>

RedundantRepresentationof companies

Page 39: End of SQL XML

Which One Do We Choose ?

• The structure of the XML data is determined by agreement, with our partners, or dictated by committees– Many XML dialects (called applications)

• XML Data is often nested, irregular, etc• No normal forms for XML

Page 40: End of SQL XML

Storing XML Data

• We got lots of XML data from the Web, how do we store it ?

• Ideally: convert to relational data, store in RDBMS

• Much harder than exporting relations to XML (why ?)

• DB Vendors currently work on tools for loading XML data into an RDBMS