Data Modeling using XML Schemas

Murali Mani

Extreme 2002

What this talk is not about

This talk is not about marking up text, as in:

<review><reviewer>X</reviewer> gave a <rating>two thumbs up</rating> for the <movie>Fugitive, The</movie></review>

We talk about data modeling from a database perspective.

What is the database perspective?

Our world consists of:
- Entities
- Relationships
  - binary: 1:1, 1:many, many:many
  - n-ary
  - recursive
- Attributes for entities
- Attributes for relationships

Outline of the talk

- How XML can contribute to the DB community
- Introduction of the ER model
- How ER concepts are modeled using the relational model
- Mapping ER concepts to the XML model
- Constraint specification for XML: what are the options?
- Subtyping for XML processing: do we need it, and what are the options?

How XML can contribute to the DB community

- Standard exchange format
- Superior data model?
  - Recursive relationships
  - Union types, for example:
    Person (name | (lastname, firstname), age, address)
- Friendlier representation of relationships?

person (person*)
person (person?)

Person  Age  Father
X       25   Y
Y       55   null

<person Y, 55>
  <person X, 25/>
</person>
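
The correspondence between the nested person element and the Person/Age/Father table can be sketched in a few lines of Python. This is only an illustration: the slide's `<person Y, 55>` is shorthand, so the attribute names `name` and `age` below are assumptions, as is the helper `flatten`.

```python
import xml.etree.ElementTree as ET

# Hypothetical well-formed version of the nested example; the attribute
# names "name" and "age" are assumptions, not taken from the slides.
DOC = '<person name="Y" age="55"><person name="X" age="25"/></person>'

def flatten(person, father=None):
    """Yield (name, age, father) rows for a person element and its descendants."""
    yield (person.get("name"), int(person.get("age")), father)
    for child in person.findall("person"):
        yield from flatten(child, father=person.get("name"))

rows = list(flatten(ET.fromstring(DOC)))
```

The nesting carries the Father column implicitly: a person's father is simply the enclosing person element, and the outermost person gets `None` where the table has null.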

Data Modeling

What is a data model?
- Structural specification
- Specification of constraints
- Operations to retrieve/update the data

Stages in database design
- Conceptual model
- Logical model
- Physical model

Conceptual model and logical model: absolutely NO (almost no) redundancy

Database Design and Redundancy

Prof   Age
Muntz  60

Student  BS  Prof
MM       CS  Muntz
YC       EE  Muntz

In the single-table design below, Muntz's age (60) is repeated for every student:

Student  BS  Prof   Age
MM       CS  Muntz  60
YC       EE  Muntz  60
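
As a small illustration of where the redundancy comes from, the sketch below rebuilds the wide table by joining the two smaller ones. The variable and function names are assumptions chosen for this example.

```python
# Two-table design: the professor's age is stored exactly once.
prof_age = {"Muntz": 60}
student = [("MM", "CS", "Muntz"), ("YC", "EE", "Muntz")]

def join(student_rows, ages):
    """Rebuild the single wide table (Student, BS, Prof, Age)."""
    return [(s, bs, p, ages[p]) for (s, bs, p) in student_rows]

wide = join(student, prof_age)

# In the wide table the same fact (Muntz is 60) appears once per student.
copies_of_muntz_age = sum(1 for row in wide if row[2] == "Muntz")
```

Changing the age in the two-table design touches one entry; in the wide table every copy would have to be updated consistently.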

Database design and redundancy

Person  Address  City  State  zip
X       A1       LAX   CA     90066
Y       A2       LAX   CA     90066

Entity Relationship (ER model)

Consider students and professors in a department, with a relationship advisor:

Student  Prof   since
MM       Muntz  1998
YC       Muntz  2000

ER Model (contd…)

N-ary relationship

Relational Model

- Every relation has a key
- Relationships are represented using foreign keys
- A foreign key from A to B represents an A (_, 1) : B (_, _) relationship

Supplier  Part  City  lastShipment

Relational Model (contd…)

Supplier  Part  City  lastShipment

PName
Muntz

Student  Professor  since
MM       Muntz      1998
YC       Muntz      2000
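
A minimal sketch of the foreign key between the two relations above, using Python's built-in `sqlite3` module. The table names `professor` and `student` and the column name `SName` are assumptions made for the example; the slides only show the column headings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE professor (PName TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE student ("
    " SName TEXT PRIMARY KEY,"
    " Professor TEXT NOT NULL REFERENCES professor(PName),"
    " since INTEGER)"
)
conn.execute("INSERT INTO professor VALUES ('Muntz')")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [("MM", "Muntz", 1998), ("YC", "Muntz", 2000)],
)

def try_insert(row):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

# A student whose professor does not exist violates the foreign key.
accepted = try_insert(("ZZ", "NoSuchProf", 2001))
student_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
```

Because each student row holds exactly one Professor value, the foreign key gives every student exactly one professor, which is the (_, 1) side of the relationship.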

Relationships in XML model

A (1, 1) : B (_, _) can be represented using parent-child relationships as B (A*):

prof (@PName, student*)

<prof PName="Muntz">
  <student SName="MM" since="1998"/>
  <student SName="YC" since="2000"/>
</prof>
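
To show that the nested form carries the same information as the advisor table, here is a short sketch that reads the rows back out with `xml.etree.ElementTree`. The `since` values follow the advisor table (1998 for MM, 2000 for YC).

```python
import xml.etree.ElementTree as ET

DOC = """
<prof PName="Muntz">
  <student SName="MM" since="1998"/>
  <student SName="YC" since="2000"/>
</prof>
"""

prof = ET.fromstring(DOC)

# Each child student yields one (Student, Prof, since) row; the professor
# comes from the parent element rather than from a stored column.
advisor_rows = [
    (s.get("SName"), prof.get("PName"), int(s.get("since")))
    for s in prof.findall("student")
]
```

The relationship attribute `since` sits on the child element, which works because each student has exactly one parent prof.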

Using ID/IDREF to represent relationships

A (_, 1) : B (_, _) can be represented using ID/IDREF as follows:
- Define an ID attribute for B
- Define an IDREF attribute for A referring to B

prof (@PName, @id), student (@SName, @since, @idref::prof)

<prof PName="Muntz" id="P1"/>
<student SName="MM" since="1998" idref="P1"/>
<student SName="YC" since="2000" idref="P1"/>
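
The ID/IDREF encoding can be checked and resolved with a small sketch like the following. It treats the attributes named `id` and `idref` as ID and IDREF purely by name (a real parser would take those types from a DTD or schema), and the wrapping `<dept>` root is an assumption needed to make a single document.

```python
import xml.etree.ElementTree as ET

# The <dept> wrapper is hypothetical; XML requires a single root element.
DOC = """
<dept>
  <prof PName="Muntz" id="P1"/>
  <student SName="MM" since="1998" idref="P1"/>
  <student SName="YC" since="2000" idref="P1"/>
</dept>
"""

root = ET.fromstring(DOC)

# Index every element that carries an id attribute.
by_id = {e.get("id"): e for e in root.iter() if e.get("id") is not None}

# An idref that points at no id is the XML analogue of a dangling foreign key.
dangling = [
    e.get("idref")
    for e in root.iter()
    if e.get("idref") is not None and e.get("idref") not in by_id
]

advisor_rows = [
    (s.get("SName"), by_id[s.get("idref")].get("PName"), int(s.get("since")))
    for s in root.findall("student")
]
```

Since each student has a single `idref`, the relationship attribute `since` can live on the student element.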

Using ID/IDREFS to represent relationships: not really…

ID/IDREFS can represent any binary relationship A (_, _) : B (_, _), but cannot represent attributes for the relationship.

A (@id)
B (@idrefs::A*)

student (@SName, @id)
professor (@PName, @idrefs::student*)

<student SName="MM" id="S1"/>
<student SName="YC" id="S2"/>
<professor PName="Muntz" idrefs="S1 S2"/>
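
A short sketch of why relationship attributes have no home in the IDREFS encoding: the references are just whitespace-separated tokens inside one attribute value. As before, the `<dept>` wrapper is an assumption, and `idrefs` is treated as IDREFS by name only.

```python
import xml.etree.ElementTree as ET

# The <dept> wrapper is hypothetical; XML requires a single root element.
DOC = """
<dept>
  <student SName="MM" id="S1"/>
  <student SName="YC" id="S2"/>
  <professor PName="Muntz" idrefs="S1 S2"/>
</dept>
"""

root = ET.fromstring(DOC)
student_by_id = {s.get("id"): s.get("SName") for s in root.findall("student")}

# Each reference is only a token in a whitespace-separated list, so a
# (professor, student) pair has no element or attribute of its own on which
# a value such as "since" could be recorded.
pairs = [
    (p.get("PName"), student_by_id[ref])
    for p in root.findall("professor")
    for ref in p.get("idrefs").split()
]
```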

Using foreign keys to represent relationships

student (SName, Professor, since)

professor (PName)

Summary so far…

- XML schemas allow us to represent relationships in a friendlier way…
- All foreign key constraints can be represented using parent-child or ID/IDREF; we do not really need foreign keys.
- IDREFS is not recommended for representing relationships.

Constraint specification in XML – questions to be asked

- Node equality vs value equality (or) can a path field produce an element?
- Can a path field produce a set of elements/values? If so, with what semantics?
- Should a path field exist? (or) can a path field return empty?
- Should path expressions traverse only down the tree?
- Should our constraints be based on type selectors or on path expression selectors?
- If we use path expression or type selectors, do we need relative keys?

Node Equality

- Makes it easier, but…
- When are two elements equal? When their serialized string values, ignoring the order of attributes, are the same.
- We have used order among child nodes in defining node equality…
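
One way to approximate this definition in code is a structural comparison rather than a literal string comparison. The function `node_equal` below is a hypothetical helper, and ignoring surrounding whitespace in text is an added assumption.

```python
import xml.etree.ElementTree as ET

def node_equal(a, b):
    """Compare two elements: same tag, same attributes regardless of their
    order, same text (whitespace-trimmed), and equal children in the same order."""
    if a.tag != b.tag or a.attrib != b.attrib:
        return False
    if (a.text or "").strip() != (b.text or "").strip():
        return False
    if len(a) != len(b):
        return False
    return all(node_equal(x, y) for x, y in zip(a, b))

first = ET.fromstring('<p x="1" y="2"><q/><r/></p>')
attrs_reordered = ET.fromstring('<p y="2" x="1"><q/><r/></p>')
children_reordered = ET.fromstring('<p x="1" y="2"><r/><q/></p>')

same_despite_attr_order = node_equal(first, attrs_reordered)
same_despite_child_order = node_equal(first, children_reordered)
```

Attribute order is ignored because attributes are compared as a dictionary, while swapping child elements changes the result, matching the point that child order is part of the definition.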

Can a path field produce a set of values?

professor (Pname, Age)

<professor>
  <Age>60</Age>
  <Pname>Muntz</Pname>
  <Pname>Chu</Pname>
</professor>

If a type X has a key (X1, X2, …, Xn), then the set Y1 × Y2 × … × Yn should be unique (reading Yi as the set of values that field Xi produces).
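
The sketch below shows one possible reading of this rule: each element contributes the Cartesian product of its field value sets, and no resulting tuple may be shared by two different elements. This interpretation, the second `professor` element, and the `<dept>` wrapper are all assumptions added for illustration.

```python
import itertools
import xml.etree.ElementTree as ET

# The first professor is the slide's example; the second one and the
# <dept> wrapper are hypothetical, added to provoke a clash.
DOC = """
<dept>
  <professor><Age>60</Age><Pname>Muntz</Pname><Pname>Chu</Pname></professor>
  <professor><Age>60</Age><Pname>Chu</Pname></professor>
</dept>
"""

def key_tuples(elem, fields):
    """Cartesian product of the value sets produced by each key field."""
    value_sets = [{child.text for child in elem.findall(f)} for f in fields]
    return set(itertools.product(*value_sets))

def key_violations(root, type_name, fields):
    """Return key tuples that occur in more than one element of the type."""
    owner = {}
    clashes = []
    for index, elem in enumerate(root.iter(type_name)):
        for tup in sorted(key_tuples(elem, fields)):
            if tup in owner and owner[tup] != index:
                clashes.append(tup)
            owner.setdefault(tup, index)
    return clashes

root = ET.fromstring(DOC)
first_tuples = key_tuples(root[0], ["Pname", "Age"])
clashes = key_violations(root, "professor", ["Pname", "Age"])
```

Under this reading the first professor owns both (Muntz, 60) and (Chu, 60), so a second professor named Chu with age 60 violates the key.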

Should a path expression traverse only down the tree?

- The trade-off is relative keys vs traversing up the tree.
- For example, consider student and professor with a difference: a student can have multiple professors. Consider the same design:
  Professor (PName, Student*)
  Student (Sname)
- The key for student can be specified as either
  (professor, Sname), or
  key for student relative to professor is (Sname)
- But this is bad design anyway…
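
A small sketch of the difference between a relative key and a document-wide key for this design. The sample data, the `<dept>` wrapper, and both helper functions are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance: student MM is advised by two professors.
DOC = """
<dept>
  <Professor>
    <PName>Muntz</PName>
    <Student><Sname>MM</Sname></Student>
    <Student><Sname>YC</Sname></Student>
  </Professor>
  <Professor>
    <PName>Chu</PName>
    <Student><Sname>MM</Sname></Student>
  </Professor>
</dept>
"""

def relative_key_holds(root):
    """Sname is unique among the Student children of each Professor."""
    for prof in root.findall("Professor"):
        names = [s.findtext("Sname") for s in prof.findall("Student")]
        if len(names) != len(set(names)):
            return False
    return True

def global_key_holds(root):
    """Sname is unique among all Student elements in the document."""
    names = [s.findtext("Sname") for s in root.iter("Student")]
    return len(names) == len(set(names))

root = ET.fromstring(DOC)
relative_ok = relative_key_holds(root)
global_ok = global_key_holds(root)
```

The relative check only walks down from each Professor, whereas expressing the same constraint as a key on Student with fields (professor, Sname) would require reaching up from a Student to its parent.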

Three different constraint specifications

- UCM (WWW10): type selectors, no relative keys, path expressions can produce a set of values.
- Keys for XML (WWW10): path selectors, relative keys specified through paths, path expressions cannot produce a set of values.
- W3C XML Schema: path selectors, relative keys specified through types, path expressions cannot produce a set of values.

Commonalities across the 3 specifications

- No concept of node equality
- Path expressions traverse only down the tree
- A path field should exist

Summary about Data Modeling

- Entity types map to element types.
- Some relationship types map to element types.
- Ability to define element types:
  - RELAX NG provides the ability for us to define element types.
  - In XML Schema, this is not so easy.
- Key constraints based on type selectors seem the right way to go.

XML Processing and Subtyping

Subtyping is essential for static type checking

function f1 : a{A} B*,C* {
  for $x in a//name return <b/>;
  for $x in a//name return <c/>;
}

function f2 : d{(B, B)*, (C, C)* | B, (B,B)*, C, (C, C)*} { … }

Is this type-safe? Type-inferencing vs type-checking problem.
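
One way to read this example, offered here as an interpretation rather than as the talk's own analysis: f1 emits one B per name followed by one C per name, so its real outputs always have equal numbers of Bs and Cs, yet its declared type is only B*,C*. The sketch below models the content types as regular expressions over child names and checks both facts up to a bound.

```python
import re

# Content models written as regular expressions over child-element names.
F1_DECLARED_OUTPUT = re.compile(r"B*C*")
F2_INPUT = re.compile(r"(?:BB)*(?:CC)*|B(?:BB)*C(?:CC)*")

def f1_output(name_count):
    """Child sequence f1 builds for an input containing name_count name
    elements: one B per name, then one C per name."""
    return "B" * name_count + "C" * name_count

# Every sequence f1 can really produce (checked up to a bound) fits f2's input type.
actual_outputs_accepted = all(
    F2_INPUT.fullmatch(f1_output(n)) is not None for n in range(50)
)

# The declared type B*,C* also admits sequences that f2's input type rejects.
witness = "BCC"
declared_type_too_wide = (
    F1_DECLARED_OUTPUT.fullmatch(witness) is not None
    and F2_INPUT.fullmatch(witness) is None
)
```

Under this reading, passing f1's result to f2 never fails at run time, but a checker that compares only the declared type B*,C* against f2's input type would reject the call.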

Two techniques for subtyping

- Implicit: tree/hedge language inclusion
  - A type A is a subtype of type B iff L (A) is a sublanguage of L (B); used in XDuce
- Explicit: user specifies type hierarchy
  - As in XML Schema
- Explicit subtyping "implicitly" solves the type-inferencing vs type-checking problem.
- Implicit subtyping poses several interesting research problems.