what information can we see…

29
What information can we see… WWW2002 The eleventh international world wide web conference Sheraton waikiki hotel Honolulu, hawaii, USA 7-11 may 2002 1 location 5 days learn interact Registered participants coming from australia, canada, chile denmark, france, germany, ghana, hong kong, india, ireland, italy, japan, malta, new zealand, the netherlands, norway, singapore, switzerland, the united kingdom, the united states, vietnam, zaire Register now On the 7 th May Honolulu will provide the backdrop of the eleventh international world wide web conference. This prestigious event … Speakers confirmed Tim berners-lee Tim is the well known inventor of the Web, … Ian Foster Ian is the pioneer of the Grid, the next generation internet

Upload: kirima

Post on 15-Jan-2016

36 views

Category:

Documents


0 download

DESCRIPTION

What information can we see…. WWW2002 The eleventh international world wide web conference Sheraton waikiki hotel Honolulu, hawaii, USA 7-11 may 2002 1 location 5 days learn interact Registered participants coming from - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: What information can we see…

What information can we see…

WWW2002The eleventh international world wide web conferenceSheraton waikiki hotelHonolulu, hawaii, USA7-11 may 20021 location 5 days learn interactRegistered participants coming fromaustralia, canada, chile denmark, france, germany, ghana, hong kong,

india, ireland, italy, japan, malta, new zealand, the netherlands, norway, singapore, switzerland, the united kingdom, the united states, vietnam, zaire

Register nowOn the 7th May Honolulu will provide the backdrop of the eleventh

international world wide web conference. This prestigious event …Speakers confirmedTim berners-lee Tim is the well known inventor of the Web, …Ian FosterIan is the pioneer of the Grid, the next generation internet …

Page 2: What information can we see…

What information can a machine see…

Page 3: What information can we see…

Solution: markup with “meaningful” tags?

<name> </

name><location> </location>

<date> </date><slogan> </slogan><participants>

</participants>

<introduction>

</introduction>

<speaker> </speaker><bio> </

bio>…

Page 4: What information can we see…

Structured Web Documents in XML

XML, a language that lets one write structured Web documents with a user-defined vocabulary

Page 5: What information can we see…

Web page which contains information about a particular book in html

<h2>Nonmonotonic Reasoning: Context-Dependent Reasoning</h2>

<i>by <b>V. Marek</b> and <b>M. Truszczynski</b></i><br>

Springer 1993<br>ISBN 0387976892

Page 6: What information can we see…

A typical representation in xml

<book><title>

Nonmonotonic Reasoning: Context-Dependent Reasoning</title><author>V. Marek</author><author>M. Truszczynski</author><publisher>Springer</publisher><year>1993</year><ISBN>0387976892</ISBN></book>

Page 7: What information can we see…

Imagine an intelligent agent trying to retrieve the authors of the particular book

From html From xml

Page 8: What information can we see…

XML allows to represent information that is also machine-accessible.

XML separates content from use and presentation.

Page 9: What information can we see…

Another Example

<h2>Relationship matter-energy</h2><i> E = M × c2 </i>

<equation><meaning>Relationship matter-energy</meaning><leftside> E </leftside><rightside> M × c2 </rightside></equation>

Page 10: What information can we see…

XML is a meta-language: it does not have a fixed set of tags, but allows users to define tags of their own.

applications on the WWW must agree on common vocabularies if they need to communicate and collaborate

Communities and business sectors are in the process of defining their specialized vocabularies, creating XML applications

Page 11: What information can we see…

mathematics (MathML) bioinformatics (BSML) human resources (HRML) astronomy (AML) news (NewsML) investment (IRML) SBML (System Biology) Bioinformatic Sequence Markup Language (BSML) MicroArray and Gene Expression Markup Language

(MAGE-ML)

Page 12: What information can we see…

Chemical Markup Language

Molecular Dynamics [Markup] Language (MoDL)StarDOM - Transforming Scientific Data into XMLBioinformatic Sequence Markup Language (BSML)BIOpolymer Markup Language (BIOML)CellML

Gene Expression Markup Language (GEML)GeneX Gene Expression Markup Language (GeneXML)Genome Annotation Markup Elements (GAME)Microarray Markup Language (MAML)XML for Multiple Sequence Alignments (MSAML)Systems Biology Markup Language (SBML)OMG Gene Expression RFPProtein Extensible Markup Language (PROXIML)

Page 13: What information can we see…

The XML Language

An XML document consists of a prolog a number of elements and attributes

Page 14: What information can we see…

Prolog of an XML Document

The prolog consists of an XML declaration and an optional reference to external structuring

documents

<?xml version="1.0" encoding="UTF-16"?>

<!DOCTYPE book SYSTEM "book.dtd">

Page 15: What information can we see…

XML Elements

The “things” the XML document talks about E.g. books, authors, publishers

An element consists of: an opening tag the content a closing tag

<lecturer>David Billington</lecturer>

Page 16: What information can we see…

XML Elements (continue)

Tag names can be chosen almost freely. The first character must be a letter, an

underscore, or a colon No name may begin with the string

“xml” in any combination of cases E.g. “Xml”, “xML”

Page 17: What information can we see…

Content of XML Elements Content may be text, or other elements, or nothing

<lecturer><name>David Billington</name><phone> +61 − 7 − 3875 507 </phone>

</lecturer>

If there is no content, then the element is called empty; it is abbreviated as follows:<lecturer/> for <lecturer></lecturer>

Page 18: What information can we see…

XML Attributes An empty element is not necessarily

meaningless It may have some properties in terms of attributes

An attribute is a name-value pair inside the opening tag of an element<lecturer

name="David Billington" phone="+61 − 7 − 3875 507“

/>

Page 19: What information can we see…

XML Attributes: An Example

<order orderNo="23456" customer="John Smith" date="October 15, 2002“>

<item itemNo="a528" quantity="1"/><item itemNo="c817" quantity="3"/>

</order>

Page 20: What information can we see…

The Same Example without Attributes

<order><orderNo>23456</orderNo><customer>John Smith</customer><date>October 15, 2002</date><item>

<itemNo>a528</itemNo><quantity>1</quantity>

</item><item>

<itemNo>c817</itemNo><quantity>3</quantity></item>

</order>

Page 21: What information can we see…

XML Elements vs Attributes

Attributes can be replaced by elements

When to use elements and when attributes is a matter of taste and need

But attributes cannot be nested

Page 22: What information can we see…

Further Components of XML Docs

Comments A piece of text that is to be ignored by

parser <!-- This is a comment -->

Processing Instructions (PIs) Define procedural attachments <?stylesheet type="text/css"

href="mystyle.css"?>

Page 23: What information can we see…

Well-Formed XML Documents

Syntactically correct documents Some syntactic rules:

Only one outermost element (called root element)

Each element contains an opening and a corresponding closing tag

Tags may not overlap <author><name>Lee Hong</author></name>

Attributes within an element have unique names Element and tag names must be permissible

Page 24: What information can we see…

The Tree Model of XML Documents: An Example

<email><head>

<from name="Michael Maher"

address="[email protected]"/><to name="Grigoris Antoniou"

address="[email protected]"/><subject>Where is your draft?</subject>

</head><body>

Grigoris, where is the draft of the paper you promised me

last week?</body>

</email>

Page 25: What information can we see…

The Tree Model of XML Documents: An Example

Page 26: What information can we see…

The Tree Model of XML Docs

The tree representation of an XML document is an ordered labeled tree: There is exactly one root There are no cycles Each non-root node has exactly one parent Each node has a label. The order of elements is important … but the order of attributes is not important

Page 27: What information can we see…

XML is not enough to ensure valid data structure!

Any XML document which conforms to the XML syntax (such as every tag must have a corresponding closing tag is considered) to be well-formed

However, this does not mean that all the structure of the data is what you wanted. For instance you may want to enforce: That a particular data field is present for each child Data fields in each child appear in the same order That a data field may not be present more than once in a child

node How machines know about structure they process?

Page 28: What information can we see…

Issues Validation and Interoperability

How application can verify whether the data you receive from the outside world?

Is xml document follows the specified structure?

Is xml document follows the restrictions on the elements and attributes?

How all the xml document follows that particular structure?

Page 29: What information can we see…

Structuring XML Documents

An XML document is valid if it is well-formed respects the structuring information it

uses There are two ways of defining the

structure of XML documents: DTDs (the older and more restricted way) XML Schema (offers extended

possibilities)