introduction to xml

Upload: fazli-kabashi

Post on 26-May-2015

Page 1: Introduction to XML

Introduction to XML

Chapter 1

Page 2: Introduction to XML

2 Core XML / Chapter 1 / Slide 2 of 35

Chapter Objectives - 1

Discuss markup languages
List and explain the drawbacks of HTML
Discuss the architecture of XML documents
List the benefits of XML
Discuss parsers

Page 3: Introduction to XML

Chapter Objectives - 2

Build a complete XML document:
  Character data
  Comments
  Processing instructions
  Entities
    General entities
    Parameter entities
  The DOCTYPE declaration

Page 4: Introduction to XML

History of Markup

Documents recorded using paper and pen

Typesetters formatting documents

Tools used by typesetters to format a document

Page 5: Introduction to XML

Markup Language

A markup language defines the rules that add meaning to the content and structure of documents. Markup is classified as:

Stylistic markup: determines the presentation of the document

Structural markup: defines the structure of the document

Semantic markup: describes the meaning of the document's content

Page 6: Introduction to XML

SGML

Generalized Markup Language (GML) is a system for formatting documents.
GML was refined and came to be known as Standard Generalized Markup Language (SGML).
SGML is the origin of many markup languages, including HTML and XML.

Page 7: Introduction to XML

Features of SGML

It describes markup languages, which allows authors to create their own tags that relate to their content.
It needs a separate file containing all the rules of the language, so that the markup can be interpreted.
An SGML application is a markup language derived from SGML.

Page 8: Introduction to XML

HTML

HTML is the best-known markup language derived from SGML.
It was created to mark up technical papers so that the scientific community could transfer them across different platforms.
It is now also used by non-scientific users who are concerned about the presentation of their documents.

Page 9: Introduction to XML

Drawbacks of HTML

Fixed tag set
Presentation technology does not relate to the contents
It is flat
Clogging
HTML is not international
Data interchange is impossible
Does not have a robust linking mechanism
HTML is not reusable

Page 10: Introduction to XML

HTML and XML Code Examples

HTML code:

<UL>
  <LI> TOM CRUISE
  <UL>
    <LI> CLIENT ID : 100
    <LI> COMPANY : XYZ Corp.
    <LI> Email : [email protected]
    <LI> Phone : 3336767
    <LI> Street Address : 25th St.
    <LI> City : Toronto
    <LI> State : Toronto
    <LI> Zip : 20056
  </UL>
</UL>

XML code:

<Details>
  <CONTACT>
    <PERSON_NAME>TOM CRUISE</PERSON_NAME>
    <ID>100</ID>
    <Company>XYZ Corp.</Company>
    <Email>[email protected]</Email>
    <Phone>3336767</Phone>
    <Street>25th St.</Street>
    <City>Toronto</City>
    <State>Toronto</State>
    <ZIP>20056</ZIP>
  </CONTACT>
</Details>
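Because the XML version labels each value with a tag, a program can pick out fields by name. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the function name contact_fields is an assumption made for illustration, and the Email field is left out because its value is not legible in the source.

```python
import xml.etree.ElementTree as ET

# A subset of the contact record from the slide (Email omitted).
CONTACT_XML = """
<Details>
  <CONTACT>
    <PERSON_NAME>TOM CRUISE</PERSON_NAME>
    <ID>100</ID>
    <Company>XYZ Corp.</Company>
    <Phone>3336767</Phone>
    <City>Toronto</City>
    <ZIP>20056</ZIP>
  </CONTACT>
</Details>
"""

def contact_fields(xml_text):
    """Return the child elements of <CONTACT> as a {tag: text} dictionary."""
    root = ET.fromstring(xml_text)
    contact = root.find("CONTACT")
    return {child.tag: (child.text or "").strip() for child in contact}

fields = contact_fields(CONTACT_XML)
print(fields["PERSON_NAME"], fields["ID"])  # prints: TOM CRUISE 100
```

The same lookup against the HTML list would need to split strings such as "CLIENT ID : 100", since the list items carry no field names of their own.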

Page 11: Introduction to XML

XML - 1

XML stands for Extensible Markup Language.
It addresses many of the drawbacks of HTML.
It allows users to define their own set of tags, and also makes it possible for others (people or programs) to understand them.
It is more flexible than HTML.
It inherits features of SGML and combines them with features of HTML.
It is a simplified subset of SGML.

Page 12: Introduction to XML

XML - 2

XML is a metalanguage: it is used to describe other languages.
The data contained in an XML file can be displayed in different ways.
It can also be offered to other applications for further processing.
Style sheets help transform structured data into different HTML views. This enables data to be displayed on different browsers.
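The idea of one data source feeding several views can be sketched without a real style sheet. The example below is only a hand-written stand-in for what a style sheet would do, not XSLT; the function names as_html_list and as_plain_text and the sample record are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

DATA = "<CONTACT><PERSON_NAME>TOM CRUISE</PERSON_NAME><City>Toronto</City></CONTACT>"

def as_html_list(root):
    """Render each child element as an HTML list item."""
    items = "".join(
        f"<li>{escape(child.tag)}: {escape(child.text or '')}</li>" for child in root
    )
    return f"<ul>{items}</ul>"

def as_plain_text(root):
    """Render the same data as name = value lines."""
    return "\n".join(f"{child.tag} = {child.text or ''}" for child in root)

root = ET.fromstring(DATA)
print(as_html_list(root))
print(as_plain_text(root))
```

The XML data is untouched by either view, which is the separation of data and presentation that the slide describes.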

Page 13: Introduction to XML

XML Architecture - 1

XML supports a three-tier architecture for handling and manipulating data.
It can be generated from existing databases using a scalable three-tier model.
XML tags represent the logical structure of data that can be interpreted and used in various ways by different applications.
The middle tier is used to access multiple databases and translate data into XML.
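As a rough illustration of the middle tier's job, the sketch below reads rows from a database and translates them into XML. It assumes an in-memory SQLite database with a hypothetical clients table; the table, column names, and the CLIENTS/CLIENT element names are invented for the example and do not come from the slides.

```python
import sqlite3
import xml.etree.ElementTree as ET

def rows_to_xml(conn):
    """Middle tier: read rows from the database and return them as XML text."""
    root = ET.Element("CLIENTS")
    query = "SELECT id, name, city FROM clients ORDER BY id"
    for client_id, name, city in conn.execute(query):
        client = ET.SubElement(root, "CLIENT", id=str(client_id))
        ET.SubElement(client, "NAME").text = name
        ET.SubElement(client, "CITY").text = city
    return ET.tostring(root, encoding="unicode")

# Data tier: a throwaway in-memory database with one row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clients (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.execute("INSERT INTO clients VALUES (100, 'TOM CRUISE', 'Toronto')")

xml_text = rows_to_xml(conn)
print(xml_text)
```

A client tier would then receive xml_text and decide how to display or process it.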

Page 14: Introduction to XML

XML Architecture -2

Page 15: Introduction to XML

XML – A Universal data format

HTML is a single markup language, but XML is a family of markup languages.
Any type of data can be easily defined in XML.
XML is popular because it supports a wide range of applications and is easy to use.
XML has a structured data format, which allows it to store complex data.

Page 16: Introduction to XML

Benefits of XML

The three-tier architecture offers easier scalability and better security.
The benefits of XML are classified into the following:
  Business benefits
  Technological benefits

Page 17: Introduction to XML

Business Benefits

Information sharing:
  Allows businesses to define data formats in XML
  Provides tools to read, write and transform data between XML and other formats
XML inside a single application:
  Powerful, flexible and extensible language
Content delivery:
  Supports different users and channels, like digital TV, phone, web and multimedia kiosks

Page 18: Introduction to XML

Technological Benefits

Re-use of data
Separation of data and presentation
Extensibility
Semantic information

Page 19: Introduction to XML

XML Document Structure

An XML document is composed of sets of “entities” identified by unique names.
All documents begin with a root or document entity.
Entities act as named aliases for longer or more complex content.
Documents are logically composed of declarations, elements, comments, character references, and processing instructions.

Page 20: Introduction to XML

Well-Formed and Valid Documents

An XML document is considered well formed if it satisfies the minimum set of requirements defined in the XML 1.0 specification.

The requirements ensure that the correct language terms are used in the right manner.

A valid XML document is a well-formed XML document that also conforms to the rules of a Document Type Definition (DTD).

A DTD defines the rules that the markup in the XML document must follow.
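The difference between the two terms can be seen with a non-validating parser. The sketch below uses Python's xml.etree.ElementTree, which checks well-formedness only; the note, to, and from element names are invented for the example. The first document breaks its own DTD (so it is not valid) yet still parses, while the second is rejected because its tags are not nested properly.

```python
import xml.etree.ElementTree as ET

# The DTD says <note> must contain a single <to>, but the body uses <from>.
WELL_FORMED_BUT_INVALID = """<?xml version="1.0"?>
<!DOCTYPE note [
  <!ELEMENT note (to)>
  <!ELEMENT to (#PCDATA)>
]>
<note><from>Tom</from></note>"""

# Closing tags appear in the wrong order.
MALFORMED = "<note><to>Tom</note></to>"

def is_well_formed(xml_text):
    """Return True if a non-validating parser accepts the text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

print(is_well_formed(WELL_FORMED_BUT_INVALID))  # prints: True
print(is_well_formed(MALFORMED))                # prints: False
```

Detecting that the first document is invalid would require a validating parser, which the Python standard library does not provide.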

Page 21: Introduction to XML

Parsers - 1

Parsers help the computer interpret an XML file.

[Figure: an XML document (<?xml version="1.0"?> <nxn> </nxn>) in an editor is parsed by the parser, and the parsed document is viewed in the browser]

There are two types of parsers:
  Non-validating parser
  Validating parser

Page 22: Introduction to XML

Parsers - 2

Parsers load the XML file and other related files (like a DTD file) to check whether the XML document is well formed and valid, and produce a data tree.
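The data tree that a parser produces can be inspected directly. This is a small sketch, assuming Python's xml.etree.ElementTree as the parser; the outline helper and the BOOK sample document are made up for illustration.

```python
import xml.etree.ElementTree as ET

def outline(element, depth=0):
    """Return one indented line per element in the parsed tree."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

root = ET.fromstring(
    "<BOOK><TITLE>XML</TITLE><AUTHOR><NAME>A</NAME></AUTHOR></BOOK>"
)
print("\n".join(outline(root)))
```

Each level of indentation in the printed outline corresponds to one level of nesting in the document.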

Page 23: Introduction to XML

Data versus Markup

<NAME> Tom Cruise </NAME>

Markup: the tags <NAME> and </NAME>
Data: the text Tom Cruise

Page 24: Introduction to XML

Creating an XML Document

To create an XML document:
  State an XML declaration
  Create a root element
  Create the XML code
  Verify the document

Page 25: Introduction to XML

Stating an XML Declaration

Syntax:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>

The 'standalone' and 'encoding' attributes are optional; only the version number is mandatory. When present, they must appear in the order version, encoding, standalone.
'standalone' indicates whether the document relies on external markup declarations ("no" means it may, "yes" means it does not).
'encoding' specifies the character encoding used by the author.
XML version 1.0 is the default.
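The parts of an XML declaration can be read back after parsing. The sketch below relies on Python's xml.dom.minidom, which records the declared version, encoding, and standalone flag on the document object (behaviour worth confirming on your Python version); the BOOK element and the parses helper are assumptions made for illustration. It also shows that a declaration with encoding placed after standalone is rejected.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

GOOD = b'<?xml version="1.0" encoding="UTF-8" standalone="no"?><BOOK/>'
# Same attributes, but encoding comes after standalone.
BAD_ORDER = b'<?xml version="1.0" standalone="no" encoding="UTF-8"?><BOOK/>'

def parses(xml_bytes):
    """Return True if the parser accepts the document."""
    try:
        minidom.parseString(xml_bytes)
        return True
    except ExpatError:
        return False

doc = minidom.parseString(GOOD)
print(doc.version, doc.encoding, doc.standalone)
print(parses(BAD_ORDER))
```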

Page 26: Introduction to XML

Creating a Root Element

There can be only one root element.
It describes the function of the document.
Every XML document must have a root element.

Example:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<BOOK>
</BOOK>

Page 27: Introduction to XML

Creating the XML Code - 1

This is the process of creating our own elements and attributes as required by our application.
Elements are the basic units of XML content.
Tags tell the user agent how to treat the content enclosed between the start and end tags.

Parts of an element:
<TITLE> Aptech Ltd </TITLE>
  Opening tag: <TITLE>
  Content: Aptech Ltd
  Closing tag: </TITLE>
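The three parts of an element map directly onto how an element is built in code. A minimal sketch with Python's xml.etree.ElementTree, reusing the TITLE element from the slide:

```python
import xml.etree.ElementTree as ET

title = ET.Element("TITLE")   # the name used for the opening and closing tags
title.text = "Aptech Ltd"     # the content between the tags

serialized = ET.tostring(title, encoding="unicode")
print(serialized)  # prints: <TITLE>Aptech Ltd</TITLE>
```

Parsing serialized again yields an element whose tag is "TITLE" and whose text is "Aptech Ltd", so the round trip preserves all three parts.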

Page 28: Introduction to XML

Creating the XML Code - 2

Rules that govern elements:
  At least one element is required
  XML tags are case sensitive
  End the tags correctly
  Nest tags properly
  Use legal tags
  Length of markup names
  Define valid attributes
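Several of these rules can be checked by feeding small documents to a parser. The sketch below uses Python's xml.etree.ElementTree; the sample documents and the check helper are invented for illustration, and the exact error messages depend on the underlying parser, so they are reported but not relied on.

```python
import xml.etree.ElementTree as ET

SAMPLES = {
    "ok": "<BOOK><TITLE>XML</TITLE></BOOK>",
    "no element": "just text",
    "two roots": "<BOOK/><BOOK/>",
    "case mismatch": "<BOOK></book>",
    "unclosed tag": "<BOOK><TITLE>XML</BOOK>",
    "bad nesting": "<BOOK><B><I>XML</B></I></BOOK>",
    "illegal tag name": "<1BOOK></1BOOK>",
    "unquoted attribute": "<BOOK id=1></BOOK>",
}

def check(xml_text):
    """Return 'well formed' or the parser's error message."""
    try:
        ET.fromstring(xml_text)
        return "well formed"
    except ET.ParseError as err:
        return f"error: {err}"

results = {name: check(text) for name, text in SAMPLES.items()}
for name, outcome in results.items():
    print(f"{name}: {outcome}")
```

Only the first sample is accepted; every other sample breaks one of the rules and is rejected by the parser.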

Page 29: Introduction to XML

Verify the document

The document should follow the XML rules; otherwise it will not be read by the browser or by any other XML reader.

Page 30: Introduction to XML

Comments

A comment is information intended for the reader of the document, and is ignored by the processor.

Syntax:
<!-- Write the comment here -->

Example:
<!-- don't show these
<NAME>KATE WINSLET</NAME>
<NAME>NICOLE KIDMAN</NAME>
<NAME>ARNOLD</NAME>
-->
<NAME>TOM CRUISE</NAME>

The example displays only the name TOM CRUISE; the other names are treated as part of the comment.
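The effect of the comment can be confirmed by parsing the example. In this sketch the fragment is wrapped in a hypothetical NAMES root element (an assumption, since the slide shows no root), and Python's xml.etree.ElementTree is used, whose default parser discards comments.

```python
import xml.etree.ElementTree as ET

# <NAMES> is a hypothetical root element added so that the fragment parses.
DOC = """<NAMES>
<!-- don't show these
<NAME>KATE WINSLET</NAME>
<NAME>NICOLE KIDMAN</NAME>
<NAME>ARNOLD</NAME>
-->
<NAME>TOM CRUISE</NAME>
</NAMES>"""

names = [name.text for name in ET.fromstring(DOC).iter("NAME")]
print(names)  # prints: ['TOM CRUISE']
```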

Page 31: Introduction to XML

Processing Instruction

A processing instruction is a piece of information meant for the application using the XML document.
The parser passes these instructions directly to the application.
The XML declaration uses the same syntax as a processing instruction.

<?xml-stylesheet type="text/xsl"?>
  Name of application: xml-stylesheet
  Instruction information: type="text/xsl"
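The two parts of a processing instruction are exposed separately by a DOM parser. The sketch below uses Python's xml.dom.minidom; the href value "style.xsl" and the BOOK element are hypothetical and were added only to make a complete document.

```python
from xml.dom import minidom

# The href value "style.xsl" is a hypothetical file name.
DOC = '<?xml-stylesheet type="text/xsl" href="style.xsl"?><BOOK/>'

doc = minidom.parseString(DOC)
pi = doc.firstChild  # the processing instruction precedes the root element

print(pi.target)  # name of the application: xml-stylesheet
print(pi.data)    # instruction information: type="text/xsl" href="style.xsl"
```

The parser does not interpret the instruction itself; it only hands the target and data to the application.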

Page 32: Introduction to XML

Character Data

The text between the start and end tags is defined as 'character data'.
Character data may be any legal (Unicode) character.
Character data is classified into:
  PCDATA
  CDATA

Page 33: Introduction to XML

PCDATA

It stands for parsed character data.
PCDATA is text that will be parsed by a parser.
Tags inside the text will be treated as markup and entities will be expanded.

Predefined entities:

Entity name    Character
&lt;           <
&gt;           >
&amp;          &
&quot;         "
&apos;         '
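The predefined entities matter whenever text containing markup characters has to be placed inside an element. A short sketch, assuming Python's xml.sax.saxutils.escape for writing and xml.etree.ElementTree for reading; the CODE and QUOTE element names are invented for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "if a < b && b > c"
escaped = escape(raw)  # replaces &, < and > with predefined entities
print(escaped)         # prints: if a &lt; b &amp;&amp; b &gt; c

# The parser expands the entities again when it reads the PCDATA.
element = ET.fromstring(f"<CODE>{escaped}</CODE>")
print(element.text)    # prints: if a < b && b > c

quoted = ET.fromstring("<QUOTE>&quot;hi&quot; &apos;there&apos;</QUOTE>")
print(quoted.text)     # prints: "hi" 'there'
```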

Page 34: Introduction to XML

CDATA

It means character data.
Its content is not parsed as markup by the parser.
CDATA sections make it convenient to include large blocks of special characters.
The character string ]]> is not allowed within a CDATA block, as it signals the end of the CDATA block.

Example:
<SAMPLE>
<![CDATA[<DOCUMENT>
<NAME>TOM CRUISE</NAME>
<EMAIL>[email protected]</EMAIL>
</DOCUMENT>]]>
</SAMPLE>
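Parsing a CDATA section shows that its contents arrive as plain text rather than as elements. A minimal sketch with Python's xml.etree.ElementTree, using a shortened version of the sample (the EMAIL line is omitted because its value is not legible in the source):

```python
import xml.etree.ElementTree as ET

DOC = "<SAMPLE><![CDATA[<DOCUMENT><NAME>TOM CRUISE</NAME></DOCUMENT>]]></SAMPLE>"

sample = ET.fromstring(DOC)
print(len(sample))   # prints: 0  (no child elements were created)
print(sample.text)   # prints: <DOCUMENT><NAME>TOM CRUISE</NAME></DOCUMENT>
```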

Page 35: Introduction to XML

Entities

Entities are used to avoid typing long pieces of text repeatedly within a document.
There are two categories of entities:

General entities
Syntax: <!ENTITY ADDRESS "text that is to be represented by an entity">

Parameter entities
Syntax: <!ENTITY % ADDRESS "text that is to be represented by an entity">

Page 36: Introduction to XML

Examples of Entities

An example of parameter entities

< CLIENT = "&APTECH;" PRODUCT = "&PRODUCT_ID;" QUANTITY = "15">

Entity reference syntax (parameter entities are referenced inside the DTD):
%PARAMETER_ENTITY_NAME;

Example:
%address;

An example of a general entity

<!ENTITY full_address " My Address 12 Tenth Ave. Suite 12 Paris, France">

Entity reference syntax:
&ENTITY_NAME;

Example:
&address;

Page 37: Introduction to XML

The DOCTYPE Declaration

The <!DOCTYPE [..]> declaration follows the XML declaration in an XML document.

Syntax:
<?xml version="1.0"?>
<!DOCTYPE myDoc [
  ...declare the entities here....
]>
<myDoc>
  ...body of the document....
</myDoc>

Example:
<!DOCTYPE CUSTOMERS [
  <!ENTITY firstFloor "15 Downing St Floor 1">
  <!ENTITY secondFloor "15 Downing St Floor 2">
  <!ENTITY thirdFloor "15 Downing St Floor 3">
]>
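Entities declared in the DOCTYPE are expanded by the parser wherever they are referenced in the body. The sketch below completes the CUSTOMERS example with hypothetical ADDRESS elements (the slide shows only the declarations) and parses it with Python's xml.etree.ElementTree.

```python
import xml.etree.ElementTree as ET

# The <ADDRESS> elements are hypothetical; the slide shows only the DOCTYPE.
DOC = """<?xml version="1.0"?>
<!DOCTYPE CUSTOMERS [
  <!ENTITY firstFloor "15 Downing St Floor 1">
  <!ENTITY secondFloor "15 Downing St Floor 2">
]>
<CUSTOMERS>
  <ADDRESS>&firstFloor;</ADDRESS>
  <ADDRESS>&secondFloor;</ADDRESS>
</CUSTOMERS>"""

addresses = [a.text for a in ET.fromstring(DOC).findall("ADDRESS")]
print(addresses)  # prints: ['15 Downing St Floor 1', '15 Downing St Floor 2']
```

Each general entity reference such as &firstFloor; is replaced with the text given in its declaration, so the long address is typed only once.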

Page 38: Introduction to XML

Attributes

An attribute gives information about an element.
Attributes are embedded in the element start tag.
An attribute consists of an attribute name and an attribute value.

Example:
<TV count="8">SONY</TV>
<LAPTOP count="10">IBM</LAPTOP>
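Attribute values can be read by name once the document is parsed. In this sketch the two elements are wrapped in a hypothetical STOCK root element (an assumption, since the slide shows no root), and Python's xml.etree.ElementTree is used.

```python
import xml.etree.ElementTree as ET

# <STOCK> is a hypothetical root element wrapping the two elements.
DOC = '<STOCK><TV count="8">SONY</TV><LAPTOP count="10">IBM</LAPTOP></STOCK>'

root = ET.fromstring(DOC)
counts = {item.tag: int(item.get("count")) for item in root}
brands = {item.tag: item.text for item in root}

print(counts)  # prints: {'TV': 8, 'LAPTOP': 10}
print(brands)  # prints: {'TV': 'SONY', 'LAPTOP': 'IBM'}
```

Attribute values are always strings in XML, which is why the example converts count with int().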

Page 39: Introduction to XML

Summary - 1

A markup language defines a set of rules that adds meaning to the content and structure of documents.
XML is extensible, which means that we can define our own set of tags and make it possible for other parties (people or programs) to know and understand these tags. This makes XML much more flexible than HTML.
XML inherits features from SGML and includes features of HTML.
XML can be generated from existing databases using a scalable three-tier model.
XML-based data does not contain information about how data should be displayed.
An XML document is composed of a set of “entities” identified by unique names.

Page 40: Introduction to XML

Summary - 2

A well-formed document is one that conforms to the basic rules of XML; a valid document is a well-formed document that conforms to the rules of a DTD (Document Type Definition).
The parser helps the computer to interpret an XML file.
The steps involved in building an XML document are:
  Stating an XML declaration
  Creating a root element
  Creating the XML code
  Verifying the document
Character data is classified into PCDATA and CDATA.

Page 41: Introduction to XML

Summary - 3

Entities are used to avoid typing long pieces of text repeatedly in a document. The two types of entities are:
  General entities
  Parameter entities
The <!DOCTYPE […]> declaration follows the XML declaration in an XML document.
An attribute gives information about an element.