XML Validation I DTDs Robin Burke ECT 360 Winter 2004


Page 1: XML Validation I DTDs

XML Validation I: DTDs

Robin Burke

ECT 360

Winter 2004

Page 2: XML Validation I DTDs

Outline

History
Grammars / Regular expressions
DTDs
  • elements
  • attributes
  • entities

Declarations

Page 3: XML Validation I DTDs

Validation

Why bother?

Page 4: XML Validation I DTDs

The idea

Language consists of terminals
  • a, b, c
Set of productions
  • beginning with non-terminals
    • A, B, C
  • rules specifying how to generate sequences of terminals

Page 5: XML Validation I DTDs

Example

A → aB
A → aBA
B → b
generates strings
  • ab, abab, ababab, etc.

Page 6: XML Validation I DTDs

Grammar

Can be used to efficiently parse a language
  • basis of all modern programming language parsing since Algol-60
  • Java Language Specification is completely in EBNF grammar

Page 7: XML Validation I DTDs

Grammar

XML
  • grammar-based syntax
  • adheres to EBNF
SGML
  • SGML had a more complex language definition syntax
  • HTML is defined the SGML way

Page 8: XML Validation I DTDs

Regular expressions

Language for expressing patterns
Basic components
  • pattern elements
  • optional element = ?
  • repetition (1 or more) = +
  • repetition (0 or more) = *
  • choice = |
  • grouping = ( )
  • sequence = ,

Page 9: XML Validation I DTDs

Examples

(a, b)*
  • all strings "ab" "abab" etc.
(a | b | c)+, q, (b, c)*
  • aaqbbqbqcccccccc

Page 10: XML Validation I DTDs

Note

Regular expressions are different in different applications
  • Perl
  • JavaScript
  • XML Schemas
DTDs only support
  • ? + * | , ( )
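The operators listed above are the ones a DTD content model is built from. A minimal sketch, assuming hypothetical element names (list, a, b, c, q) that mirror the pattern from the Examples slide; in a DTD the whole content model has to be wrapped in parentheses:

```xml
<?xml version="1.0"?>
<!-- Hypothetical element names chosen to mirror the pattern (a | b | c)+, q, (b, c)* -->
<!DOCTYPE list [
  <!-- one or more of a, b or c; then exactly one q; then zero or more (b, c) pairs -->
  <!ELEMENT list ((a | b | c)+, q, (b, c)*)>
  <!ELEMENT a EMPTY>
  <!ELEMENT b EMPTY>
  <!ELEMENT c EMPTY>
  <!ELEMENT q EMPTY>
]>
<list><a/><a/><q/><b/><c/></list>
```

Here the children read a a q b c: two matches for the first group, the mandatory q, and one (b, c) pair.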

Page 11: XML Validation I DTDs

EBNF

EBNF is a more compact version of BNF
  • it uses regular expressions to simplify grammar expression
  • A → aB and A → aBA turn into A → aB(A)?
  • only one production per non-terminal allowed

Page 12: XML Validation I DTDs

DTDs

Use EBNF to specify structure of XML documents

Plus
  • attributes
  • entities
Syntax
  • holdover from SGML
  • Ugly

Page 13: XML Validation I DTDs

DTD Syntax

<!ELEMENT element-name content_model>

Content model contains the RHS of the production rule

Example
  • <!ELEMENT name (firstName, lastName)>
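To see the declaration in context, here is a minimal sketch of a complete document with the DTD placed in the internal subset. The declarations for firstName and lastName and the sample values are assumptions; the slide only gives the parent element.

```xml
<?xml version="1.0"?>
<!DOCTYPE name [
  <!-- name must contain exactly one firstName followed by exactly one lastName -->
  <!ELEMENT name (firstName, lastName)>
  <!-- assumed child declarations: plain text content -->
  <!ELEMENT firstName (#PCDATA)>
  <!ELEMENT lastName (#PCDATA)>
]>
<name>
  <firstName>Robin</firstName>
  <lastName>Burke</lastName>
</name>
```

Swapping the two children, or leaving one out, would make the document invalid while it stays well-formed.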

Page 14: XML Validation I DTDs

DTD Syntax cont'd

Not XML
  • <! begins a declaration
  • No "content"
  • Empty elements not indicated with />

Page 15: XML Validation I DTDs

Simple content models

Content can be any text
  • #PCDATA
Content can be anything at all (useful for debugging)
  • ANY
Element has no content
  • EMPTY
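A small sketch of the three simple content models side by side. The element names (sample, note, scratch, br) are hypothetical. Note that #PCDATA is written inside parentheses, while ANY and EMPTY are bare keywords.

```xml
<?xml version="1.0"?>
<!-- Hypothetical element names: sample, note, scratch, br -->
<!DOCTYPE sample [
  <!ELEMENT sample (note, scratch, br)>
  <!ELEMENT note (#PCDATA)>   <!-- text only -->
  <!ELEMENT scratch ANY>      <!-- any mix of text and declared elements -->
  <!ELEMENT br EMPTY>         <!-- no content at all -->
]>
<sample>
  <note>Any text goes here.</note>
  <scratch>text, <br/> or a <note>nested note</note></scratch>
  <br/>
</sample>
```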

Page 16: XML Validation I DTDs

Example

<grades>
  <grade>
    <student>Jane Doe</student>
    <assigned-grade>A</assigned-grade>
  </grade>
  <grade>
    <student>John Doe</student>
    <assigned-grade>A-</assigned-grade>
  </grade>
</grades>

Page 17: XML Validation I DTDs

Example

<grades>
  <grade>
    <student>Jane Doe</student>
    <assigned-grade>A</assigned-grade>
  </grade>
  <grade>
    <student>John Doe</student>
    <assigned-grade>A-</assigned-grade>
  </grade>
  <grade>
    <student>Wayne Doe</student>
    <assigned-grade>I</assigned-grade>
    <reason>Alien abduction</reason>
  </grade>
</grades>

Page 18: XML Validation I DTDs

DTD?
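One possible answer, offered as a sketch rather than the answer given in class: make reason optional so that grades with and without it both validate. Allowing an empty list with grade* (rather than grade+) is an assumption.

```xml
<?xml version="1.0"?>
<!DOCTYPE grades [
  <!-- assumption: an empty grade list is acceptable, hence * rather than + -->
  <!ELEMENT grades (grade*)>
  <!-- reason is optional, so two-child and three-child grades both validate -->
  <!ELEMENT grade (student, assigned-grade, reason?)>
  <!ELEMENT student (#PCDATA)>
  <!ELEMENT assigned-grade (#PCDATA)>
  <!ELEMENT reason (#PCDATA)>
]>
<grades>
  <grade>
    <student>Jane Doe</student>
    <assigned-grade>A</assigned-grade>
  </grade>
  <grade>
    <student>Wayne Doe</student>
    <assigned-grade>I</assigned-grade>
    <reason>Alien abduction</reason>
  </grade>
</grades>
```

A validating parser can check a file like this, for example with `xmllint --valid --noout grades.xml` (the file name is hypothetical).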

Page 19: XML Validation I DTDs

Mixed content

Legal to have a content model with text and element data
<story category="national" byline="Karen Wheatley">
  <headline>President Meets with Congress</headline>
  The President met with Congressional leaders today in an effort to jump-start faltering budget negotiations. Sources described the mood of the meeting as "cordial".
  <full_text ref="news801" />
  <image src="img2071.jpg" />
  <image src="img2072.jpg" />
  <image src="img2073.jpg" />
</story>

Page 20: XML Validation I DTDs

Mixed content, cont'd

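<!ELEMENT story (#PCDATA | headline | full_text | image)*>
  • XML only allows mixed content in this form: #PCDATA first, alternatives separated by |, the whole group followed by *
  • the order and number of child elements cannot be constrained
Mixed content makes handling XML complex
  • necessary for many applications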
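A sketch of a complete document whose story element has mixed content, written in the (#PCDATA | ...)* form that XML 1.0 accepts. The attribute declarations and the shortened story text are assumptions, added so the document validates on its own.

```xml
<?xml version="1.0"?>
<!DOCTYPE story [
  <!-- text may be interleaved with any number of the listed elements, in any order -->
  <!ELEMENT story (#PCDATA | headline | full_text | image)*>
  <!ELEMENT headline (#PCDATA)>
  <!ELEMENT full_text EMPTY>
  <!ELEMENT image EMPTY>
  <!-- attribute types and defaults below are assumptions -->
  <!ATTLIST story category CDATA #IMPLIED
                  byline   CDATA #IMPLIED>
  <!ATTLIST full_text ref CDATA #REQUIRED>
  <!ATTLIST image src CDATA #REQUIRED>
]>
<story category="national" byline="Karen Wheatley">
  <headline>President Meets with Congress</headline>
  Sources described the mood of the meeting as "cordial".
  <full_text ref="news801" />
  <image src="img2071.jpg" />
</story>
```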

Page 21: XML Validation I DTDs

Recursion

Unlike grammars
  • recursive formulation ≠ repetition
Difference between
  • <!ELEMENT students (student+)>
  • <!ELEMENT students (student, students?)>
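The difference shows up in the shape of the documents each declaration accepts. A sketch that puts both in one DTD; because an element can be declared only once, the recursive form is given the hypothetical name student-chain, and roster is a hypothetical wrapper.

```xml
<?xml version="1.0"?>
<!-- Hypothetical names: roster (wrapper) and student-chain (the recursive variant) -->
<!DOCTYPE roster [
  <!ELEMENT roster (students, student-chain)>
  <!ELEMENT students (student+)>                      <!-- repetition -->
  <!ELEMENT student-chain (student, student-chain?)>  <!-- recursion -->
  <!ELEMENT student (#PCDATA)>
]>
<roster>
  <!-- repetition: all students are siblings -->
  <students>
    <student>Jane</student>
    <student>John</student>
    <student>Wayne</student>
  </students>
  <!-- recursion: each additional student adds a level of nesting -->
  <student-chain>
    <student>Jane</student>
    <student-chain>
      <student>John</student>
      <student-chain>
        <student>Wayne</student>
      </student-chain>
    </student-chain>
  </student-chain>
</roster>
```

In a string grammar both rules generate the same sequences. In XML, the recursive rule forces nested elements, which is rarely what is wanted for a simple list.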

Page 22: XML Validation I DTDs

Restriction

The grammar cannot be ambiguous
  • A → (a, b) | (a, c)
  • this makes the parser implementation difficult
Usually easy to make non-ambiguous
  • A → a, (b | c)
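Written as a DTD, the non-ambiguous form looks like the sketch below. The element names A, a, b and c come from the production above, and their EMPTY declarations are assumptions. XML names are case-sensitive, so A and a are different elements.

```xml
<?xml version="1.0"?>
<!DOCTYPE A [
  <!-- Ambiguous form, which validators enforcing the rule reject: ((a, b) | (a, c))
       After reading a, the parser cannot tell which branch it is in. -->
  <!-- Factored form: a is read once, then the choice is made -->
  <!ELEMENT A (a, (b | c))>
  <!ELEMENT a EMPTY>
  <!ELEMENT b EMPTY>
  <!ELEMENT c EMPTY>
]>
<A><a/><c/></A>
```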

Page 23: XML Validation I DTDs

Attribute lists

Declared separately from elements
  • can be anywhere in the DTD
Specification includes
  • name of the element
  • name of the attribute
  • attribute type
  • default

Page 24: XML Validation I DTDs

Attribute types

Character data
  • CDATA
  • different from XML CDATA section!
Enumerated
  • (yes|no)
ID
  • must be unique in the document
IDREF
  • must refer to an id in the document
NMTOKEN
  • a restriction of CDATA to single "word"
Also
  • IDREFS and NMTOKENS
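A sketch that uses several of these types together. The elements (team, member) and attribute names (id, reports-to, role, active) are hypothetical.

```xml
<?xml version="1.0"?>
<!-- Hypothetical elements and attributes, chosen to show one attribute per type -->
<!DOCTYPE team [
  <!ELEMENT team (member+)>
  <!ELEMENT member (#PCDATA)>
  <!ATTLIST member
    id         ID       #REQUIRED
    reports-to IDREF    #IMPLIED
    role       NMTOKEN  #IMPLIED
    active     (yes|no) "yes">
]>
<team>
  <!-- id values must be unique and must be XML names, so they cannot start with a digit -->
  <member id="m1" role="lead">Jane Doe</member>
  <!-- reports-to must match some id in the same document -->
  <member id="m2" reports-to="m1" active="no">John Doe</member>
</team>
```

Changing reports-to to a value that no id carries, or reusing m1 as a second id, would make the document invalid.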

Page 25: XML Validation I DTDs

Default declaration

#REQUIRED
#IMPLIED
  • means optional
Value
  • this becomes the default
#FIXED
  • value provided

Page 26: XML Validation I DTDs

Examples

<!ATTLIST img
  src   CDATA               #REQUIRED
  alt   CDATA               #REQUIRED
  align (left|right|center) "left"
  id    ID                  #IMPLIED
>
<!ATTLIST timestamp
  time-zone NMTOKEN #IMPLIED>
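To show how the defaults behave, here is a sketch of a document validated against the img attribute list. The wrapper element page, its #FIXED version attribute and the file names are assumptions added for illustration.

```xml
<?xml version="1.0"?>
<!-- Hypothetical wrapper element (page) with a #FIXED attribute added as an illustration -->
<!DOCTYPE page [
  <!ELEMENT page (img+)>
  <!-- #FIXED: if version is written at all, it must be exactly "1.0" -->
  <!ATTLIST page version CDATA #FIXED "1.0">
  <!ELEMENT img EMPTY>
  <!ATTLIST img
    src   CDATA               #REQUIRED
    alt   CDATA               #REQUIRED
    align (left|right|center) "left"
    id    ID                  #IMPLIED
  >
]>
<page>
  <!-- align is omitted, so a validating parser supplies align="left" -->
  <img src="logo.png" alt="Logo"/>
  <!-- all four attributes given explicitly -->
  <img src="photo.jpg" alt="Photo" align="right" id="photo1"/>
</page>
```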

Page 27: XML Validation I DTDs

Entities

Like macros
  • content to be inserted
  • indicated with &name;
Predefined general entities
  • &amp; &lt;
  • essential part of XML
User-defined general entities
  • &disclaimer;
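A sketch showing one predefined and one user-defined general entity in use. The elements (book, title, notice) and the title text are hypothetical; the disclaimer text is the one used as the entity example in this deck.

```xml
<?xml version="1.0"?>
<!-- Hypothetical elements: book, title, notice -->
<!DOCTYPE book [
  <!ELEMENT book (title, notice)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT notice (#PCDATA)>
  <!-- user-defined general entity: the replacement text is quoted -->
  <!ENTITY disclaimer "This is a work of fiction. Any resemblance to persons living or dead is unintentional.">
]>
<book>
  <!-- &amp; is predefined and needs no declaration -->
  <title>Cats &amp; Dogs</title>
  <!-- &disclaimer; is replaced by the declared text when the document is parsed -->
  <notice>&disclaimer;</notice>
</book>
```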

Page 28: XML Validation I DTDs

Entities, cont'd

Parameter entities
  • can also be used to simplify DTD creation
  • or to combine DTDs
  • indicated with a %
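A minimal sketch of a parameter entity, assuming a hypothetical entity name (name-parts). Inside the internal subset a parameter entity reference may only stand where a whole declaration could stand, so the entity here holds complete declarations. Using one inside a declaration requires an external DTD.

```xml
<?xml version="1.0"?>
<!DOCTYPE name [
  <!-- hypothetical parameter entity holding two complete element declarations -->
  <!ENTITY % name-parts
    "<!ELEMENT firstName (#PCDATA)>
     <!ELEMENT lastName (#PCDATA)>">
  <!ELEMENT name (firstName, lastName)>
  <!-- the reference expands to the two declarations above -->
  %name-parts;
]>
<name><firstName>Robin</firstName><lastName>Burke</lastName></name>
```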

More on this next week

Page 29: XML Validation I DTDs

Defining general entities

<!ENTITY name "content">
Example
  • <!ENTITY disclaimer "This is a work of fiction. Any resemblance to persons living or dead is unintentional.">

Page 30: XML Validation I DTDs

In-class exercise

Business cards

Page 31: XML Validation I DTDs

Next week

More DTDs
  • Entities
  • Modularization and parameterization
  • pg. 129-148

Page 32: XML Validation I DTDs

Lab