
Upload: binh-trong-an

Post on 10-May-2015


XML DTD and Namespaces

Chapter 2


Core XML / Chapter 2 / Slide 2 of 25

Review-1

A markup language defines a set of rules that add meaning to the content and structure of documents.

XML is extensible: we can define our own set of tags and make it possible for other parties (people or programs) to know and understand these tags. This makes XML much more flexible than HTML.

XML is derived from SGML and, like HTML, marks up content with tags and attributes.

XML can be generated from existing databases using a scalable three-tier model.

XML-based data does not contain information about how the data should be displayed.

An XML document is composed of a set of "entities" identified by unique names.


Review-2

A well-formed document is one that conforms to the basic rules of XML; a valid document is a well-formed document that conforms to the rules of a DTD (Document Type Definition).

The parser helps the computer to interpret an XML file.

The steps involved in building an XML document are: stating an XML declaration, creating a root element, creating the XML code, and verifying the document.

Character data is classified into PCDATA and CDATA.
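The difference between well-formed and valid can be illustrated with a small document that carries its own DTD. This is only a sketch: the element names greeting, to, and text are hypothetical and do not come from the slides.

```xml
<?xml version="1.0"?>
<!-- Hypothetical example: the element names are illustrative only. -->
<!DOCTYPE greeting [
  <!ELEMENT greeting (to, text)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT text (#PCDATA)>
]>
<greeting>
  <to>Reader</to>
  <text>Hello</text>
</greeting>
```

Deleting the to element would leave the document well-formed, because every tag is still properly nested and closed, but it would no longer be valid, because the DTD requires to followed by text.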


Review-3

Entities are used to avoid typing long pieces of text repeatedly in a document.

The two types of entities are general entities and parameter entities.

The <!DOCTYPE [...]> declaration follows the XML declaration in an XML document.

An attribute gives information about an element.


Chapter Objectives

Explain Document Type Definition

Create Document Type Definitions: declaring an element and declaring attributes

Explain the use of DTD

Describe namespaces


Document Type Definition (DTD)

It is a feature of SGML, which is inherited by XML.

It contains the list of tags that specifies the grammatical structure of an XML document.

DTD defines the way elements relate to one another within the document’s tree structure, and specifies the attributes.

DTDs are of two types: an external DTD and an internal DTD.


Why Use a DTD

DTDs are used by XML to provide an application-independent way of sharing data.

A common DTD can be used to interchange data between independent groups of people.

A DTD can be used by the application to verify that valid data has been entered.

It defines the legal building blocks of an XML document.


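Structure of a DTD

<!DOCTYPE root-element-name
[
<!ELEMENT element-name (element-content type)>
<!ATTLIST element-name attribute-name attribute-type default-value>
]>

The first line opens the DOCTYPE declaration; the name it gives must match the document's root element.

The <!ELEMENT ...> line is an ELEMENT declaration.

The <!ATTLIST ...> line is an ATTLIST (attribute-list) declaration.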
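A filled-in version of this skeleton might look as follows. It is a minimal sketch: the memo element and its priority attribute are hypothetical names chosen for illustration.

```xml
<?xml version="1.0"?>
<!-- DOCTYPE declaration: the name after DOCTYPE matches the root element -->
<!DOCTYPE memo [
  <!-- ELEMENT declaration: memo contains parsed character data -->
  <!ELEMENT memo (#PCDATA)>
  <!-- ATTLIST declaration: memo may carry an optional priority attribute -->
  <!ATTLIST memo priority CDATA #IMPLIED>
]>
<memo priority="high">Review the DTD chapter.</memo>
```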


Declaring an Element

XML elements are declared with an element declaration.

Syntax

<!ELEMENT element-name (element-content type)>

Example

<!ELEMENT SHOWROOM (TV|LAPTOP)+>


Empty Element

The EMPTY element-content type specifies that the element has no child elements or character data. EMPTY is a keyword and is written without parentheses.

Syntax

<!ELEMENT element-name EMPTY>

Example

<!ELEMENT img EMPTY>

Empty elements with attributes are possible:

<img src="Tittle.gif"></img>
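As a sketch of how an empty element with an attribute fits into a complete document, the example below wraps img in a hypothetical page root element. It assumes EMPTY is written without parentheses, as the XML grammar requires, and that src is declared as a required CDATA attribute; neither declaration appears on the slide.

```xml
<?xml version="1.0"?>
<!DOCTYPE page [
  <!ELEMENT page (img)>
  <!-- EMPTY is a keyword, so it is not enclosed in parentheses -->
  <!ELEMENT img EMPTY>
  <!ATTLIST img src CDATA #REQUIRED>
]>
<page>
  <!-- An EMPTY element may be written as a start/end pair or in the short form used here -->
  <img src="Tittle.gif"/>
</page>
```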


Elements with Data

Syntax

<!ELEMENT element-name (#PCDATA)>

or

<!ELEMENT element-name ANY>

Where:

#PCDATA = element contains character data that is to be parsed

ANY = element with any content

CDATA (character data that is not parsed) is not an element-content type: in a DTD it is an attribute type, and inside an element unparsed text is written as a CDATA section, <![CDATA[ ... ]]>.
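The contrast between parsed and unparsed character data can be sketched in an instance document. The element names examples, parsed, and literal are hypothetical; both text-bearing elements are declared as (#PCDATA), and the unparsed text is supplied through a CDATA section.

```xml
<?xml version="1.0"?>
<!DOCTYPE examples [
  <!ELEMENT examples (parsed, literal)>
  <!ELEMENT parsed (#PCDATA)>
  <!ELEMENT literal (#PCDATA)>
]>
<examples>
  <!-- Parsed text: markup characters must be escaped as entity references -->
  <parsed>Use &lt;b&gt; for bold, and write &amp; for an ampersand.</parsed>
  <!-- CDATA section: the parser passes the enclosed text through unchanged -->
  <literal><![CDATA[Use <b> for bold, and write & for an ampersand.]]></literal>
</examples>
```

After parsing, both elements carry the same text; only the way it is written in the source differs.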


Elements with Child Elements

Elements with one or more children are defined with the name of the child element inside the parentheses.

Syntax

<!ELEMENT element-name (child-element-name)>

or

<!ELEMENT element-name (child-element-name, child-element-name, ...)>

Example

<!ELEMENT note (to, from, heading, body)>

<!ELEMENT to (#PCDATA)>

<!ELEMENT from (#PCDATA)>

<!ELEMENT heading (#PCDATA)>

<!ELEMENT body (#PCDATA)>
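A document that conforms to the note declarations could look like the sketch below. It assumes the four leaf elements are declared as (#PCDATA), the only text content type a DTD accepts for elements; the text values are made up for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE note [
  <!ELEMENT note (to, from, heading, body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT heading (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<note>
  <!-- The children must appear exactly once each, in the declared order -->
  <to>Student</to>
  <from>Instructor</from>
  <heading>Reminder</heading>
  <body>Chapter 2 covers DTDs and namespaces.</body>
</note>
```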


Declaring According to the Occurrences of Elements

Element occurrences

Only one occurrence

<!ELEMENT element-name (child-name)>

Minimum one occurrence

<!ELEMENT element-name (child-name+)>

Zero or more occurrences

<!ELEMENT element-name (child-name*)>

Zero or one occurrence

<!ELEMENT element-name (child-name?)>
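The four occurrence forms can be combined in one content model. The following is a sketch with hypothetical element names (order, customer, item, note, coupon), not an example from the slides.

```xml
<?xml version="1.0"?>
<!DOCTYPE order [
  <!-- exactly one customer, one or more item, zero or more note, zero or one coupon -->
  <!ELEMENT order (customer, item+, note*, coupon?)>
  <!ELEMENT customer (#PCDATA)>
  <!ELEMENT item (#PCDATA)>
  <!ELEMENT note (#PCDATA)>
  <!ELEMENT coupon (#PCDATA)>
]>
<order>
  <customer>A. Buyer</customer>
  <item>TV</item>
  <item>Laptop</item>
  <!-- note and coupon are omitted, which * and ? both allow -->
</order>
```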


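Declaring Mixed Content

An element can contain character data mixed with child elements.

Example

<!ELEMENT note (#PCDATA | to | from | header | message)*>

In a mixed-content declaration, #PCDATA must come first, the alternatives are separated by |, and the group ends with *; the number and order of the child elements cannot be constrained.

In element-only content, the sub-elements and subgroups can be declared in sequence (separated by ,) or as a choice (separated by |), for example (to+, from, header, message*).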
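Sequence, choice, and mixed content can be seen side by side in a short sketch. The element names message, header, body, and em are hypothetical, and the mixed-content model follows the (#PCDATA | name)* form that the XML grammar requires.

```xml
<?xml version="1.0"?>
<!DOCTYPE message [
  <!-- Sequence: header first, then body -->
  <!ELEMENT message (header, body)>
  <!-- Choice: header holds either a to element or a from element -->
  <!ELEMENT header (to | from)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!-- Mixed content: text interleaved with em elements, in any order -->
  <!ELEMENT body (#PCDATA | em)*>
  <!ELEMENT em (#PCDATA)>
]>
<message>
  <header><to>Class</to></header>
  <body>Read the <em>DTD</em> chapter before Friday.</body>
</message>
```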


Declaring Attributes

Elements can have attributes.

Syntax

<!ATTLIST element-name attribute-name attribute-type default-value>

Example: 1

<!DOCTYPE Book [
<!ELEMENT Book (Title, Chapter+)>
<!ATTLIST Book Author CDATA #REQUIRED>
<!ELEMENT Chapter (#PCDATA)>
<!ATTLIST Chapter id (4 | 7) #REQUIRED>
<!ELEMENT Title (#PCDATA)>
]>
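An instance document that satisfies the Book DTD of Example 1 might look like this. The attribute and text values are placeholders made up for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE Book [
  <!ELEMENT Book (Title, Chapter+)>
  <!ATTLIST Book Author CDATA #REQUIRED>
  <!ELEMENT Chapter (#PCDATA)>
  <!ATTLIST Chapter id (4 | 7) #REQUIRED>
  <!ELEMENT Title (#PCDATA)>
]>
<Book Author="Sample Author">
  <Title>Sample Title</Title>
  <!-- id is enumerated, so only the values 4 and 7 are accepted -->
  <Chapter id="4">Text of the chapter with id 4.</Chapter>
  <Chapter id="7">Text of the chapter with id 7.</Chapter>
</Book>
```

A Chapter with id="5", or a Book without an Author attribute, would make the document invalid.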


Declaring Attributes

Example: 2


Attribute (Attribute-Type Values)

Value               Explanation
CDATA               The value is character data
(value1|value2|..)  The value must be one of an enumerated list
ID                  The value is a unique id
IDREF               The value is the id of another element
IDREFS              The value is a list of other ids
NMTOKEN             The value is a valid XML name token
NMTOKENS            The value is a list of valid XML name tokens
ENTITY              The value is the name of an entity
ENTITIES            The value is a list of entity names
NOTATION            The value is the name of a notation
xml:                The value is predefined (for example, xml:lang)
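Several of these attribute types can be sketched together in one small document. The staff and employee elements and all attribute names below are hypothetical.

```xml
<?xml version="1.0"?>
<!DOCTYPE staff [
  <!ELEMENT staff (employee+)>
  <!ELEMENT employee (#PCDATA)>
  <!ATTLIST employee
    empid   ID              #REQUIRED
    manager IDREF           #IMPLIED
    grade   (junior|senior) #REQUIRED
    tags    NMTOKENS        #IMPLIED>
]>
<staff>
  <!-- empid values must be unique within the document -->
  <employee empid="e1" grade="senior">Head of department</employee>
  <!-- manager must match the ID of some element, here e1 -->
  <employee empid="e2" manager="e1" grade="junior" tags="new part-time">Assistant</employee>
</staff>
```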


Attribute (Attribute-Default-Value)

Value            Explanation
"value"          The attribute has the given default value.
#REQUIRED        The attribute value must be included in the element.
#IMPLIED         The attribute does not have to be included.
#FIXED "value"   The attribute value is fixed.
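The four kinds of default can be compared in a single attribute list. This is a sketch with a hypothetical payment element; the attribute names and values are assumptions made for illustration.

```xml
<?xml version="1.0"?>
<!-- method has a literal default, id is required, note is optional, currency is fixed -->
<!DOCTYPE payment [
  <!ELEMENT payment (#PCDATA)>
  <!ATTLIST payment
    method   (cash|card) "cash"
    id       CDATA       #REQUIRED
    note     CDATA       #IMPLIED
    currency CDATA       #FIXED "USD">
]>
<payment id="p-1">100</payment>
```

Only id is written in the instance. A parser that reads the DTD supplies method="cash" and currency="USD" on its own, and note is simply absent.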


Internal DTD

It is written directly in the XML document after the XML declaration.

Writing the DTD within the DOCTYPE declaration is called wrapping.

The file with the DTD and XML code has a .xml extension.

<!DOCTYPE SHOWROOM
[
<!ELEMENT SHOWROOM (TV|LAPTOP)+>
<!ELEMENT TV (#PCDATA)>
<!ATTLIST TV count CDATA #REQUIRED>
<!ELEMENT LAPTOP (#PCDATA)>
<!ATTLIST LAPTOP count CDATA #REQUIRED>
]>
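Put together with an XML declaration and some content, a complete .xml file using this internal DTD might look like the sketch below. The count values and element text are invented sample data.

```xml
<?xml version="1.0"?>
<!DOCTYPE SHOWROOM [
  <!ELEMENT SHOWROOM (TV|LAPTOP)+>
  <!ELEMENT TV (#PCDATA)>
  <!ATTLIST TV count CDATA #REQUIRED>
  <!ELEMENT LAPTOP (#PCDATA)>
  <!ATTLIST LAPTOP count CDATA #REQUIRED>
]>
<SHOWROOM>
  <!-- (TV|LAPTOP)+ allows any mix of TV and LAPTOP, as long as there is at least one -->
  <TV count="10">Sample TV model</TV>
  <LAPTOP count="5">Sample laptop model</LAPTOP>
</SHOWROOM>
```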


External DTD

It exists outside the content of a document.

The DTD file has a .dtd extension.

The reference to the DTD file is added at the beginning of the XML file.

The DTD reference in the XML document file:

<!DOCTYPE SHOWROOM SYSTEM "show.dtd">

The show.dtd file:

<!ELEMENT SHOWROOM (TV|LAPTOP)+>
<!ELEMENT TV (#PCDATA)>
<!ATTLIST TV count CDATA #REQUIRED>
<!ELEMENT LAPTOP (#PCDATA)>
<!ATTLIST LAPTOP count CDATA #REQUIRED>
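The XML file that refers to show.dtd could then be as short as the sketch below. It assumes show.dtd sits in the same directory as the XML file, since the system identifier is a relative URI; the content is sample data.

```xml
<?xml version="1.0"?>
<!-- The declarations live in show.dtd, so the DOCTYPE has no [ ... ] block -->
<!DOCTYPE SHOWROOM SYSTEM "show.dtd">
<SHOWROOM>
  <TV count="10">Sample TV model</TV>
  <LAPTOP count="5">Sample laptop model</LAPTOP>
</SHOWROOM>
```

Keeping the DTD in its own file lets several documents share one set of rules.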


Internal Entity Declaration

Entities that have their contents within the XML document are called internal entities.

Syntax

<!ENTITY entity-name "entity-value">

Example

<!ENTITY writer "Nicole D.">
<!ENTITY copyright "Copyright Aptech Ltd.">

In the XML document, the entities are referenced as follows:

<author>&writer;&copyright;</author>
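Entity declarations belong inside the DOCTYPE declaration. A minimal sketch of a complete document that uses the two entities follows; the (#PCDATA) declaration for author is an assumption added so that the document is valid.

```xml
<?xml version="1.0"?>
<!DOCTYPE author [
  <!ELEMENT author (#PCDATA)>
  <!ENTITY writer "Nicole D.">
  <!ENTITY copyright "Copyright Aptech Ltd.">
]>
<!-- Each reference is replaced by its entity value when the document is parsed -->
<author>&writer; &copyright;</author>
```

After parsing, the content of author is the text Nicole D. Copyright Aptech Ltd.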


External Entity Declaration

Entities whose contents are found outside the XML document are called external entities. They are declared using the SYSTEM keyword.

Syntax

<!ENTITY entity-name SYSTEM "URI/URL">

Example

<!ENTITY writer SYSTEM "http://www.xml101.com/entities/entities.xml">
<!ENTITY copyright SYSTEM "http://www.xml101.com/entities/entities.dtd">
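An external entity is referenced in the same way as an internal one. The sketch below assumes a hypothetical file writer.txt, stored next to the XML document, that contains the replacement text.

```xml
<?xml version="1.0"?>
<!DOCTYPE author [
  <!ELEMENT author (#PCDATA)>
  <!-- writer.txt is a hypothetical file holding the text to be inserted -->
  <!ENTITY writer SYSTEM "writer.txt">
]>
<author>&writer;</author>
```

Whether the file is actually fetched depends on the parser; non-validating parsers are allowed to skip external entities.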


XML Namespaces - 1

Two or more applications on the Internet may use some of the same element names. Namespaces help avoid the ambiguity that may arise.

Namespaces also make it possible to combine documents from different sources and to identify which elements or attributes come from which source.

A namespace name is only an identifier: it does not tell the user agent where to find the DTD against which the document is validated. That is the job of the DOCTYPE declaration.
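A name clash of this kind can be sketched with two vocabularies that both define a table element. The prefixes h and f and the example.com URIs are hypothetical and serve only as distinct identifiers.

```xml
<?xml version="1.0"?>
<catalog xmlns:h="http://example.com/html-like"
         xmlns:f="http://example.com/furniture">
  <!-- h:table is a grid of rows -->
  <h:table>
    <h:row>Row of cells</h:row>
  </h:table>
  <!-- f:table is a piece of furniture -->
  <f:table>
    <f:material>Oak</f:material>
  </f:table>
</catalog>
```

Both elements have the local name table, but a namespace-aware program tells them apart by the URI bound to each prefix.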


XML Namespaces - 2

A URI (Uniform Resource Identifier) is used to identify namespaces in XML.

URIs include Uniform Resource Names (URNs) and Uniform Resource Locators (URLs).

A URL contains the reference for a document or an HTML page on the web.

A URN is a universally unique name that identifies an Internet resource.


Needs of a Namespace

Namespaces are used to overcome the conflicts that arise when DTDs are reused and extended.

Namespaces help standardize and uniquely brand elements and attributes.

Namespaces employ a URI purely as a unique name; the URI is not used to locate the DTD against which the XML document is checked for validity.

Namespaces ensure that element names do not conflict and clarify their origins.


Syntax for Namespace

A prefix is associated with the URI that is used as the namespace.

Syntax

xmlns:[prefix]="[URI of namespace]"

xmlns is a reserved attribute name and prefix.

Example

xmlns:ins="http://www.Aptech_edu.ac"

A namespace prefix must be declared on the element that uses it or on one of that element's ancestors.

It is usually declared in the root element of the document.
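In a document, the declaration appears as an attribute of a start tag. The sketch below combines a default namespace with the ins prefix from the example; the courses, course, title, and batch elements and the http://example.com/courses URI are hypothetical.

```xml
<?xml version="1.0"?>
<!-- xmlns="..." sets the default namespace; xmlns:ins="..." binds the prefix ins -->
<courses xmlns="http://example.com/courses"
         xmlns:ins="http://www.Aptech_edu.ac">
  <course>
    <!-- title has no prefix, so it is in the default namespace -->
    <title>Core XML</title>
    <!-- ins:batch is in the namespace bound to ins -->
    <ins:batch>Evening Batch</ins:batch>
  </course>
</courses>
```

Because both declarations sit on the root element, they are in scope for every element in the document.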


Attributes and Namespaces

An attribute without a prefix is associated with its element; an attribute with a prefix (including predefined ones such as xml:lang) is in the namespace bound to that prefix.

We can also incorporate attributes from two domains:

<sample xmlns="http://www.Aptech_edu.ac"
  xmlns:tea_batch="http://www.tea.org">
<batch-list>
  <batch type="thirdbatch">Evening Batch</batch>
  <batch tea_batch:type="thirdbatch">Tea batch III</batch>
  <batch>Afternoon Batch</batch>
</batch-list>
</sample>


Namespace Application

The new XSL syntax makes use of namespaces to identify both its own tags and the formatting vocabulary tags.

Elements with the xsl: prefix are in the http://www.w3.org/TR/WD-xsl namespace.

Elements with the fo: prefix are in the http://www.w3.org/TR/WD-xsl/FO namespace.

XSL is written in XML syntax and uses tags, elements, and attributes.


Namespace Example

<book xmlns:html="http://www.w3.org/TR/WD-xsl/FO">

<index>

<chapter>this is chapter 1</chapter>

<html:br/>

<chapter>this is chapter 1</chapter>

</index>

</book>


Summary-1

A well-formed document is one that conforms to the basic rules of XML. A valid document is well formed and is also validated against a DTD.

The DTD specifies the grammatical structure of an XML document, thereby allowing XML parsers to understand and interpret the document's contents.

The SYSTEM keyword indicates to the parser that this is an external declaration, and that the set of rules for this XML document can be found in a specified file.

The EMPTY element-content type specifies that the element has no child elements or character data.


Summary-2

CDATA denotes character data that is not to be parsed by a parser; inside an element it is written as a CDATA section. #PCDATA means that the element contains character data that is to be parsed by a parser.

Specifying a default value for an attribute in the DTD ensures that the attribute will get a value, even if the author of the XML document does not include it.

Specifying an attribute as #IMPLIED means that the attribute is not mandatory: it may be included in the XML document or left out.

Specifying an attribute as #REQUIRED means that the attribute is mandatory (that is, its value must be provided in the XML document).


Summary-3

ID is the identifier type, and its value should be unique. This attribute value is used to search for a particular instance of an element. Each element can have only one attribute of type ID.

A DTD can be either external or internal.

Entities allow us to create an alias for some large piece of text, so that the same piece of text can be referred to in the document simply by referring to the alias.

Namespaces allow us to combine documents from different sources and to identify which elements or attributes come from which source.