XML Mukesh N Tekwani [email protected] www.myexamnotes.com

Upload: mukesh-tekwani

Post on 17-May-2015

DESCRIPTION

Introduction to XML

TRANSCRIPT

Page 1: XML

XML

Mukesh N Tekwani

[email protected]

www.myexamnotes.com

Page 2: XML

Disadvantages of HTML – Need for XML

HTML lacks syntax checking
HTML lacks structure
HTML is not suitable for data interchange
HTML is not context aware: it does not allow us to describe the information content or the semantics of the document
HTML is not object-oriented
HTML is not reusable
HTML is not extensible

Page 3: XML

Introduction to XML

XML: Extensible Markup Language
Extensible: capable of being extended
Markup: a way of adding information to the text, indicating the logical components of a document

How is it different from HTML?
HTML was designed to display data
XML was designed to store, describe and transport data
XML is also a markup language like HTML
XML tags are not predefined; we must design our own tags

Page 4: XML

Differences between HTML and XML

HTML | XML
1. Designed to display data | 1. Designed to store and transport data between applications and databases
2. Focus is on how data looks | 2. Focus is on what data is
3. Has predefined tags such as <B>, <LI>, etc. | 3. No predefined tags; all tags are defined by the user, e.g., <TO>, <FROM>, <BOOKNAME>
4. Used to display information | 4. Used to describe information
5. Not every tag needs a closing tag | 5. Every element must have a closing tag (or be written as an empty-element tag ending in />)
6. Not case sensitive | 6. Case sensitive
7. HTML is for humans | 7. XML is for computers

Page 5: XML

Advantages (Features) of XML - 1

XML simplifies data sharing
  Since XML data is stored in plain text format, data can be easily shared among different hardware and software platforms.

XML separates data from HTML
  To display dynamic data in HTML, the code must be rewritten each time the data changes. With XML, data can be stored in separate files so that whenever the data changes it is automatically displayed correctly. We have to design the HTML for layout only once.

Page 6: XML

Advantages (Features) of XML - 2

XML simplifies data transport
  With XML, data can be easily exchanged between different platforms.

XML makes data more available
  Since XML is independent of hardware, software and application, XML can make your data more available and useful.
  Different applications, not only HTML pages, can access your data.
  XML provides a means to package almost any type of information (binary, text, voice, video) for delivery to a receiving end.

Page 7: XML

Advantages (Features) of XML - 3

Internationality
  HTML relies heavily on ASCII, which makes using non-English characters difficult. XML uses Unicode, so many European and Asian languages are handled easily.

Page 8: XML

XML Document – Example 1

<?xml version="1.0" encoding="ISO-8859-1"?>
<class_list>
  <student>
    <name>Anamika</name>
    <grade>A+</grade>
  </student>
  <student>
    <name>Veena</name>
    <grade>B+</grade>
  </student>
</class_list>

Page 9: XML

XML Document–Example 1 - Explained

The first line is the XML declaration: <?xml version="1.0" encoding="ISO-8859-1"?>
It defines the XML version (1.0).
It gives the encoding used (ISO-8859-1 = Latin-1/West European character set).
The XML declaration has the same form as a processing instruction (PI): it begins with <? and ends with ?>.
The next line opens the root element of the document (like saying: "this document is a class_list").
The following lines describe 2 child elements of the root (student), each of which contains a name and a grade element.
And finally the last line defines the end of the root element: </class_list>
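The structure described above can be checked with a parser. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the helper name read_class_list is hypothetical and not part of the slides.

```python
import xml.etree.ElementTree as ET

# The class list from Example 1, passed as bytes so the parser
# honours the encoding named in the XML declaration.
CLASS_LIST = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<class_list>
  <student><name>Anamika</name><grade>A+</grade></student>
  <student><name>Veena</name><grade>B+</grade></student>
</class_list>"""


def read_class_list(xml_bytes):
    """Return the root tag and a list of (name, grade) pairs."""
    root = ET.fromstring(xml_bytes)
    students = [
        (student.findtext("name"), student.findtext("grade"))
        for student in root.findall("student")  # direct children of the root
    ]
    return root.tag, students


root_tag, students = read_class_list(CLASS_LIST)
print(root_tag)
for name, grade in students:
    print(name, grade)
```

The root element is reached first, and each student child is then read through its own name and grade children, which mirrors the tree described in the explanation.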

Page 10: XML

Logical Structure

XML uses its start tags and end tags as containers.

The start tag, the content and the end tag form an element

Elements are the building blocks out of which an XML document is assembled.

An XML document has a tree-like structure, with the root element at the top and all the other elements nested inside it.

Page 11: XML

Tree structure

XML documents form a tree structure.
XML documents must contain a root element. This element is "the parent" of all other elements.
The elements in an XML document form a document tree. The tree starts at the root and branches to the lowest level of the tree.
All elements can have sub elements (child elements):

<root>
  <child>
    <subchild>.....</subchild>
  </child>
</root>

Page 12: XML

XML – Example 2

Page 13: XML

XML – Example 2

<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="CHILDREN">
    <title lang="en">Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category="WEB">
    <title lang="en">Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore>
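To see how an application might read both attributes and element content from a document like Example 2, here is a small sketch using Python's standard xml.etree.ElementTree module. The function name summarize and the shortened two-book document are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# A shortened copy of the bookstore document (two of the three books).
BOOKSTORE = """<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="WEB">
    <title lang="en">Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore>"""


def summarize(xml_text):
    """Collect attribute values and element text for every book."""
    root = ET.fromstring(xml_text)
    rows = []
    for book in root.findall("book"):
        title = book.find("title")
        rows.append({
            "category": book.get("category"),   # attribute on <book>
            "lang": title.get("lang"),          # attribute on <title>
            "title": title.text,                # element content
            "price": float(book.findtext("price")),
        })
    return rows


books = summarize(BOOKSTORE)
for row in books:
    print(row)
```

Attributes are read with get(), while element content is read through text or findtext(); the two kinds of information live in different places in the tree.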

Page 14: XML

Important Definitions

XML Element
  An element is a start tag, content, and an end tag.
  E.g., <greeting>Hello World</greeting>

XML Attribute
  An attribute provides additional information about elements.
  E.g., <note priority="high">

Page 15: XML

Important Definitions

Child elements: XML elements may have child elements.

<employee id="100">
  <name>
    <first>Anita</first>
    <initial>D</initial>
    <last>Singh</last>
  </name>
</employee>

Parent element: name
Children of the parent element: first, initial, last

Page 16: XML

XML Element

An XML element is everything from the element's start tag to the element's end tag.

An element can contain other elements, simple text or a mixture of both.

Elements can also have attributes.

Page 17: XML

XML Syntax Rules

All XML elements must have a closing tag.
XML tags are case sensitive. The tag <Book> is different from the tag <book>.
Opening and closing tags must be written with the same case:

<Message>This is incorrect</message>
<message>This is correct</message>

Page 18: XML

XML Syntax Rules

XML elements must be properly nested.
HTML permits this:

<B><I>This text is bold and italic</B></I>

But in XML this is invalid. All elements must be properly nested within one another:

<B><I>This text is bold and italic</I></B>

XML documents must have a root element. It is the parent of all other elements.

<root>
  <child>
    <subchild>.....</subchild>
  </child>
</root>
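The syntax rules so far (matching case, proper nesting, a single root element) can be tried out against a real parser. This is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the sample labels are hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False on a syntax error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


samples = {
    "mismatched case": "<Message>This is incorrect</message>",
    "matching case": "<message>This is correct</message>",
    "overlapping tags": "<B><I>This text is bold and italic</B></I>",
    "properly nested": "<B><I>This text is bold and italic</I></B>",
    "two root elements": "<child/><child/>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(label, "->", "well formed" if ok else "NOT well formed")
```

A conforming XML parser stops at the first syntax error instead of guessing what the author meant, which is why the incorrect samples are rejected outright.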

Page 19: XML

XML Syntax Rules

XML Entity References
Some characters have a special meaning in XML. E.g., if you place a character like "<" inside an XML element, it will generate an error because the parser interprets it as the start of a new element.

<message>if salary < 1000 then</message>

To avoid this error, replace the "<" character with an entity reference:

<message>if salary &lt; 1000 then</message>
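When XML is generated by a program, the replacement is usually done by a library function instead of by hand. The sketch below assumes Python's standard xml.sax.saxutils and xml.etree.ElementTree modules; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, unescape

raw = "if salary < 1000 then"

# escape() replaces &, < and > with their entity references.
escaped = escape(raw)
document = "<message>" + escaped + "</message>"

# The parser turns the entity reference back into the original character.
parsed_text = ET.fromstring(document).text

print(escaped)
print(parsed_text)
print(unescape(escaped) == raw)
```

The entity reference exists only in the serialized document; once parsed, the application sees the plain "<" character again.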

Page 20: XML

XML Syntax Rules

XML Entity References
There are 5 predefined entity references in XML:

Entity | Symbol | Description
&lt; | < | Less than
&gt; | > | Greater than
&amp; | & | Ampersand
&apos; | ' | Apostrophe
&quot; | " | Quotation mark

Page 21: XML

XML Syntax Rules

Comments in XML (similar to HTML):
<!-- This is a comment -->

White space is preserved in XML but not in HTML.

XML Naming Rules
Names can contain letters, numbers, and other characters
Names cannot start with a number or punctuation character
Names cannot start with the letters xml (or XML, or Xml, etc)
Names cannot contain spaces
Any name can be used; no words are reserved.

Page 22: XML

XML Markup Delimiters

Every XML element is made up of the following parts:

Symbol | Description
< | Start tag open delimiter
</ | End tag open delimiter
something | Element name
> | Tag close delimiter
/> | Empty tag close delimiter

Page 23: XML

Different Types of XML Markups

5 Types of Markup in XML:
Elements
Entities
Comments
Processing Instructions
Ignored Sections

Page 24: XML

Element Markup

It is composed of 3 parts: the start tag, the content, and the end tag.
Example: <name>Neetu</name>
The start tag and the end tag can be treated as wrappers.
The element name that appears in the start tag must be exactly the same as the name that appears in the end tag.
Incorrect example: <Name>Neetu</name> (Name and name differ in case)

Page 25: XML

Different Types of XML Markups

Attribute Markup
Attributes are used to attach information to the information contained in an element.
General syntax for attributes is:
<elementname property='value'>
or
<elementname property="value">
Attribute values must be enclosed within quotation marks.
Use either single quotes or double quotes, but don't mix them.

Page 26: XML

Attribute Markup

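Each element carries its own set of attributes, and an attribute name may appear only once in a given start tag. (In a DTD, if attributes for the same element type are declared more than once, the declarations are merged.)

<?xml version="1.0"?>
<myparas>
  <para num="first">This is Para 1</para>
  <para num='second' color="red">This is Para 2</para>
</myparas>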

Page 27: XML

Attribute Markup

When the XML processor encounters line 3, it will record the fact that para element has the num attribute

When it encounters the 4th line it will record the fact that para element has the color attribute

Page 28: XML

Reserved Attribute

The xml:lang attribute is reserved to identify the human language in which the element's content is written.
The value of the attribute is a language code, for example:
en English
fr French
de German

Page 29: XML

XML Attributes

Attributes provide additional information about the element, similar to attributes in HTML.
E.g., in <IMG SRC="sky.jpg">, SRC is the attribute.
XML attribute values must be quoted.
XML elements can have attributes in name/value pairs just like in HTML. In XML the attribute value must always be quoted.

<note date=01/01/2010>      (invalid: the value is not quoted)
  <to>Priya</to>
  <from>Deeali</from>
</note>

<note date="01/01/2010">    (OK: enclosed in double quotes)
<note date='01/01/2010'>    (also OK: enclosed in single quotes)

Page 30: XML

XML Attributes and Elements
Consider the following example:

Gender is an attribute:

<person gender="female">
  <firstname>Geeta</firstname>
  <lastname>Shah</lastname>
</person>

Gender is an element:

<person>
  <gender>female</gender>
  <firstname>Geeta</firstname>
  <lastname>Shah</lastname>
</person>
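The two forms carry the same information but are read differently by an application. The following sketch, assuming Python's standard xml.etree.ElementTree module and a hypothetical helper gender_of, accepts either form.

```python
import xml.etree.ElementTree as ET

AS_ATTRIBUTE = """<person gender="female">
  <firstname>Geeta</firstname>
  <lastname>Shah</lastname>
</person>"""

AS_ELEMENT = """<person>
  <gender>female</gender>
  <firstname>Geeta</firstname>
  <lastname>Shah</lastname>
</person>"""


def gender_of(xml_text):
    """Read gender from the attribute if present, else from the child element."""
    person = ET.fromstring(xml_text)
    value = person.get("gender")            # attribute form
    if value is None:
        value = person.findtext("gender")   # element form
    return value


from_attribute = gender_of(AS_ATTRIBUTE)
from_element = gender_of(AS_ELEMENT)
print(from_attribute, from_element)
```

Supporting both forms costs extra code in every consumer, so a document design normally picks one form and uses it consistently.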

Page 31: XML

Problems with XML Attributes
Attributes cannot contain multiple values, whereas elements can
Attributes cannot contain tree structures
Attributes are not easily expandable (for future changes)
Attributes are difficult to read and maintain
Use elements for data.
Use attributes for information that is not relevant to the data.

Page 32: XML

Illustrating Problematic Attributes
Consider the following example:

<note day="03" month="02" year="2010" to="Tina" from="Yasmin" heading="Reminder" body="Happy Birthday!"></note>

Better way:

<note>
  <date>
    <day>03</day>
    <month>02</month>
    <year>2010</year>
  </date>
  <to>Tina</to>
  <from>Yasmin</from>
  <heading>Reminder</heading>
  <body>Happy Birthday!</body>
</note>

Page 33: XML

When to use Attributes?

XML attributes can be used to assign ID references to elements. The ID can then be used to identify the XML element.
Metadata (data about data) should be stored as attributes.

<messages>
  <note id="501">
    <to>Tina</to>
    <from>Yasmin</from>
    <heading>Reminder</heading>
    <body>Happy Birthday!</body>
  </note>
  <note id="502">
    <to>Yasmin</to>
    <from>Tina</from>
    <heading>Re: Reminder</heading>
    <body>Thank you, my dear</body>
  </note>
</messages>
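As an illustration of how an ID attribute identifies an element, the sketch below looks up a note by its id. It assumes Python's standard xml.etree.ElementTree module and its limited XPath support; the function name find_note is hypothetical.

```python
import xml.etree.ElementTree as ET

MESSAGES = """<messages>
  <note id="501">
    <to>Tina</to><from>Yasmin</from>
    <heading>Reminder</heading><body>Happy Birthday!</body>
  </note>
  <note id="502">
    <to>Yasmin</to><from>Tina</from>
    <heading>Re: Reminder</heading><body>Thank you, my dear</body>
  </note>
</messages>"""


def find_note(xml_text, note_id):
    """Return the <note> element whose id attribute matches, or None."""
    root = ET.fromstring(xml_text)
    # [@id='...'] selects child elements by attribute value.
    return root.find("note[@id='{}']".format(note_id))


reply = find_note(MESSAGES, "502")
missing = find_note(MESSAGES, "999")
print(reply.findtext("heading"))
print(missing)
```

The id is not part of the message itself; it only serves to pick out one note among several, which is the kind of metadata the slide recommends storing in an attribute.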

Page 34: XML

What does Extensible mean in XML?

Consider the following XML example:

<note>
  <to>Anita</to>
  <from>Veena</from>
  <body>You have an exam tomorrow</body>
</note>

Suppose we create an application that extracts the <to>, <from> and <body> elements from the XML document to produce the result:

MESSAGE
To: Anita
From: Veena
You have an exam tomorrow

Page 35: XML

What does Extensible mean in XML?

Now suppose the author of the XML document added some extra information to it:

<note>
  <date>2008-01-10</date>
  <to>Anita</to>
  <from>Veena</from>
  <heading>Reminder</heading>
  <body>You have an exam tomorrow</body>
</note>

Page 36: XML

What does Extensible mean in XML?

This application will not crash because it will still find the <to>, <from> and <body> elements in the XML document and produce the same output.
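A small program makes this concrete. The sketch below is one possible version of such an application, assuming Python's standard xml.etree.ElementTree module; the function name render is hypothetical. It looks elements up by name, so the added <date> and <heading> elements are simply ignored.

```python
import xml.etree.ElementTree as ET

ORIGINAL = """<note>
  <to>Anita</to>
  <from>Veena</from>
  <body>You have an exam tomorrow</body>
</note>"""

EXTENDED = """<note>
  <date>2008-01-10</date>
  <to>Anita</to>
  <from>Veena</from>
  <heading>Reminder</heading>
  <body>You have an exam tomorrow</body>
</note>"""


def render(xml_text):
    """Build the message text from the <to>, <from> and <body> elements only."""
    note = ET.fromstring(xml_text)
    return "MESSAGE\nTo: {}\nFrom: {}\n{}".format(
        note.findtext("to"), note.findtext("from"), note.findtext("body")
    )


before = render(ORIGINAL)
after = render(EXTENDED)
print(after)
print(before == after)
```

An application that relied on element position (for example "the first child is <to>") would break on the extended document, so looking elements up by name is what keeps it working.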

Page 37: XML

XML Validation

What is a "well formed" XML document?
A "well formed" XML document has correct XML syntax:
XML documents must have a root element
XML elements must have a closing tag
XML tags are case sensitive
XML elements must be properly nested
XML attribute values must be quoted

Page 38: XML

Wellformed Document - Rule 1
Elements are case-sensitive.
If you define your language to use lowercase elements, then all instances of those elements must be in lowercase.

Page 39: XML

Bad Examples…

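<H1>Sample Heading</H1>   (well formed: the start and end tags match)
<h1>Sample Heading</H1>   (not well formed: h1 does not match H1)
<H1>Sample Heading</h1>   (not well formed: H1 does not match h1)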

Page 40: XML

Rule 2:

All elements that contain text or other elements must have both start and ending tags.

Page 41: XML

Rule 3:

All empty elements (commonly known as standalone tags) must have a slash (/) before the end of the tag.

Page 42: XML

Rule 4:

All attribute values must be contained in quotes, either single or double – no exceptions!

Page 43: XML

Rule 5:

Elements may not overlap.
Elements must be nested properly within other elements and cannot start before a sub-element and end within the sub-element.

Page 44: XML

Rule 6:

Isolated markup characters (characters essential to creating markup documents) may not appear in parsed content as is.

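The isolated markup characters are <, [, ], >, ', " and &, and they should be represented as character entities. Strictly, only < and & must always be escaped in parsed content; the others need escaping only in particular contexts, such as a quote character inside an attribute value delimited by the same quote.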

Page 45: XML

Isolated Markup Characters

< &lt;

[ &#91;

] &#93;

> &gt;

' &apos;

" &quot;

& &amp;

Page 46: XML

Bad Examples…

<h1>Jack &amp Jill</h1>

<equation>5 &lt 2</equation>

Both examples are not well formed because the semicolon that terminates the entity reference is missing.

Page 47: XML

Good Examples…

<h1>Jack &amp; Jill</h1>

<equation>5 &lt; 2</equation>

Page 48: XML

Rule 7:

Element (and attribute) names must start with either a letter (uppercase or lowercase) or an underscore.
Element names may contain letters, numbers, hyphens, periods and underscores.

BAD EXAMPLES
<bad*characters>
<illegal space>
<99number-start>

GOOD EXAMPLES
<example-one>
<_example2>
<Example.Three>
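The naming rule can be approximated with a regular expression. This is a simplified, ASCII-only sketch of the rule as stated on the slides, not the full XML name grammar (which also allows colons and many non-ASCII characters); the pattern and the function name is_simple_xml_name are assumptions made for illustration.

```python
import re

# First character: a letter or underscore.
# Remaining characters: letters, digits, periods, underscores or hyphens.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def is_simple_xml_name(name):
    """Check a name against the simplified rule from the slides."""
    if name.lower().startswith("xml"):
        return False  # names beginning with xml (any case) are not allowed
    return NAME_PATTERN.fullmatch(name) is not None


candidates = [
    "bad*characters", "illegal space", "99number-start",
    "example-one", "_example2", "Example.Three", "xmlData",
]
checked = {name: is_simple_xml_name(name) for name in candidates}
for name, ok in checked.items():
    print(name, "->", "OK" if ok else "rejected")
```

A check like this is useful when element names are built from user input, because a bad name would otherwise surface only later as a parse error.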

Page 49: XML

XML Validation

A "valid" XML document is well formed and also conforms to the rules of a Document Type Definition (DTD).

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE note SYSTEM "Note.dtd">
<note>
  <to>Tina</to>
  <from>Yasmin</from>
  <heading>Reminder</heading>
  <body>Happy Birthday!</body>
</note>

Page 50: XML

Viewing XML Files - 1

Page 51: XML

Viewing XML Files - 2

The XML document will be displayed with color-coded root and child elements.

A plus (+) or minus sign (-) to the left of the elements can be clicked to expand or collapse the element structure. 

To view the raw XML source (without the + and - signs), select "View Page Source" or "View Source" from the browser menu.

Page 52: XML

Viewing XML Files - 3

Why do XML documents display like this?
XML documents do not carry information about how to display the data.
Since XML tags are created by the user of the XML document, browsers do not know if a tag like <table> describes an HTML table or a dining table.
Without any information about how to display the data, most browsers will just display the XML document as it is.

Page 53: XML

Using CSS to display XML Files
CSS (Cascading Style Sheets) can be used to format an XML document.
Consider this XML document:

Page 54: XML

Displaying Formatted XML document-1

<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/css" href="birthdate.css"?>
<birthdate>
  <person>
    <name>
      <first>Anokhi</first>
      <last>Parikh</last>
    </name>
    <date>
      <month>01</month>
      <day>21</day>
      <year>1992</year>
    </date>
  </person>
</birthdate>

Page 55: XML

Displaying Formatted XML document-2

birthdate {
  background-color: #ffffff;
  width: 100%;
}
person {
  margin-left: 0;
}
name {
  color: #FF0000;
  font-size: 20pt;
}
month, day, year {
  display: block;
  color: #000000;
  margin-left: 20pt;
}

Stylesheet: birthdate.css

Page 56: XML

Final Output

Page 57: XML

XSLT

XSL is a language for style sheets.
An XSL style sheet is a file that describes how to display an XML document.
XSL contains a transformation language for XML documents: XSLT.
XSLT stands for eXtensible Stylesheet Language Transformations.
XSLT is used to transform an XML document into an HTML document, for example to generate HTML web pages from XML data.
XSLT is the recommended style sheet language for XML.