Processing XML
Overview
Parsing XML documents
Document Object Model (DOM)
Simple API for XML (SAX)
Class generation
What's the Problem?
<?xml version="1.0"?>
<books>
  <book>
    <title>The XML Handbook</title>
    <author>Goldfarb</author>
    <author>Prescod</author>
    <publisher>Prentice Hall</publisher>
    <pages>655</pages>
    <isbn>0130811521</isbn>
    <price currency="USD">44.95</price>
  </book>
  <book>
    <title>XML Design</title>
    <author>Spencer</author>
    <publisher>Wrox Press</publisher>
    ...
  </book>
</books>
[Figure: the XML document above and a "Book" object, connected by question marks]
Parsing XML Documents
[Figure: a parser reads the document together with its DTD / Schema. DOM: the parser builds a document tree. SAX: the parser calls startDocument, startElement, endElement, and endDocument on an application that implements DocumentHandler.]
Parser
Project X (Sun Microsystems)
Ælfred (Microstar Software)
XML4J (IBM)
Lark (Tim Bray)
MSXML (Microsoft)
XJ (Data Channel)
Xerces (Apache)
...
The Document Object Model
[Figure: the books document from "What's the Problem?" shown next to its XML document structure, a tree with the root "books", its "book" children, and their "title", "author", "publisher", "pages", and "isbn" children holding text such as "The XML Handbook", "Goldfarb", "Prescod", "Prentice Hall", and "655"]
The Document Object Model
Provides a standard interface for access to and manipulation of XML structures.
Represents documents in the form of a hierarchy of nodes.
Is platform- and programming-language-neutral.
Is a W3C Recommendation (DOM Level 1, October 1, 1998).
Is implemented by many parsers.
DOM - Structure Model
[Figure: the books tree ("books", "book", "title", "author", "publisher", "pages", "isbn") annotated with the DOM interfaces Document, Node, NodeList, and Element]
The Document Interface
Method                          Result
doctype                         DocumentType
implementation                  DOMImplementation
documentElement                 Element
getElementsByTagName(String)    NodeList
createTextNode(String)          Text
createComment(String)           Comment
createElement(String)           Element
createCDATASection(String)      CDATASection
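The members listed above can be tried out with any DOM implementation. The following is a minimal sketch using Python's standard `xml.dom.minidom` module, chosen here only because it runs without a browser; the sample document and the `DOCTYPE` line are assumptions made for illustration.

```python
from xml.dom.minidom import parseString

# Hypothetical sample input; the DOCTYPE is present only so that doctype is not None.
doc = parseString(
    "<!DOCTYPE books>"
    "<books><book><title>The XML Handbook</title></book>"
    "<book><title>XML Design</title></book></books>"
)

print(doc.doctype.name)             # books
print(doc.documentElement.tagName)  # books

# getElementsByTagName searches the whole document and returns a NodeList.
titles = doc.getElementsByTagName("title")
print(titles.length)                # 2

# Factory methods return new nodes owned by the document but not yet attached to the tree.
text = doc.createTextNode("Spencer")
comment = doc.createComment("draft")
print(text.nodeName, comment.nodeName)  # #text #comment
print(text.parentNode)                  # None
```

Note that `createTextNode` returns a Text node rather than a plain string; the node still has to be attached with a method such as `appendChild`.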
The Node Interface
Method                               Result
nodeName                             String
nodeValue                            String
nodeType                             short
parentNode                           Node
childNodes                           NodeList
firstChild                           Node
lastChild                            Node
previousSibling                      Node
nextSibling                          Node
attributes                           NamedNodeMap
insertBefore(Node new, Node ref)     Node
replaceChild(Node new, Node old)     Node
removeChild(Node)                    Node
hasChildNodes()                      Boolean
Node Types / Node Names
Result: nodeType / nodeName
Node Fields                    Node Type   Node Name
ELEMENT_NODE                   1           tagName
ATTRIBUTE_NODE                 2           name of attribute
TEXT_NODE                      3           "#text"
CDATA_SECTION_NODE             4           "#cdata-section"
ENTITY_REFERENCE_NODE          5           name of entity referenced
ENTITY_NODE                    6           entity name
PROCESSING_INSTRUCTION_NODE    7           target
COMMENT_NODE                   8           "#comment"
DOCUMENT_NODE                  9           "#document"
DOCUMENT_TYPE_NODE             10          document type name
DOCUMENT_FRAGMENT_NODE         11          "#document-fragment"
NOTATION_NODE                  12          notation name
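To see the node type and node name values on a real tree, the sketch below walks a small document and records the constant name, its number, and `nodeName` for every node. It uses Python's `xml.dom.minidom`; the sample document and the helper name `describe` are assumptions for illustration.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

# Map the numeric constants (1..12) back to their field names.
TYPE_NAMES = {getattr(Node, n): n for n in dir(Node) if n.endswith("_NODE")}

def describe(node, rows):
    """Append (constant name, nodeType, nodeName) for node, its attributes, and its subtree."""
    rows.append((TYPE_NAMES[node.nodeType], node.nodeType, node.nodeName))
    if node.nodeType == Node.ELEMENT_NODE:
        # Attributes are not children; they are reached through the attributes map.
        for i in range(node.attributes.length):
            attr = node.attributes.item(i)
            rows.append((TYPE_NAMES[attr.nodeType], attr.nodeType, attr.nodeName))
    for child in node.childNodes:
        describe(child, rows)
    return rows

# Hypothetical sample containing a comment, a processing instruction, an attribute, and text.
doc = parseString('<!--catalog--><books><?render fast?><book id="b1">text</book></books>')
rows = describe(doc, [])
for name, number, node_name in rows:
    print(f"{name:28} {number:2} {node_name}")
```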
The NodeList Interface
Method       Result
length       int
item(int)    Node
The Element Interface
Method                                     Result
tagName                                    String
getAttribute(String)                       String
setAttribute(String name, String value)    (none)
removeAttribute(String)                    (none)
getAttributeNode(String)                   Attr
setAttributeNode(Attr)                     Attr
removeAttributeNode(Attr)                  Attr
getElementsByTagName(String)               NodeList
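A short sketch of the attribute methods, again with Python's `xml.dom.minidom` and the `price` element from the books example. The replacement value "EUR" is made up for illustration.

```python
from xml.dom.minidom import parseString

doc = parseString('<book><price currency="USD">44.95</price></book>')
price = doc.getElementsByTagName("price").item(0)

print(price.tagName)                    # price
print(price.getAttribute("currency"))   # USD

# setAttribute overwrites an existing attribute or creates a new one.
price.setAttribute("currency", "EUR")
attr = price.getAttributeNode("currency")
print(attr.name, attr.value)            # currency EUR

# After removal, getAttribute returns an empty string rather than raising an error.
price.removeAttribute("currency")
missing = price.getAttribute("currency")
print(repr(missing))                    # ''
```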
DOM Methods for Navigation
firstChild, lastChild
nextSibling, previousSibling
parentNode
getElementsByTagName
childNodes (length, item())
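The navigation members can be combined to move in every direction through the tree. The sketch below uses Python's `xml.dom.minidom` on a single `book` element written without whitespace between tags; with indented input, many parsers (minidom included) would also report whitespace-only text nodes as children and siblings.

```python
from xml.dom.minidom import parseString

doc = parseString(
    "<book><title>The XML Handbook</title>"
    "<author>Goldfarb</author><author>Prescod</author></book>"
)
book = doc.documentElement

title = book.firstChild            # first child element: <title>
last_author = book.lastChild       # last child element: second <author>
first_author = title.nextSibling   # sibling to the right of <title>

print(first_author.firstChild.data)               # Goldfarb
print(last_author.firstChild.data)                # Prescod
print(last_author.previousSibling is first_author)  # True
print(last_author.parentNode is book)             # True
print(title.previousSibling)                      # None

# childNodes is a NodeList: length plus item(index).
names = [book.childNodes.item(i).nodeName for i in range(book.childNodes.length)]
print(names)                                      # ['title', 'author', 'author']
```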
DOM Methods for Manipulation
appendChild, insertBefore, replaceChild, removeChild
createElement, createAttribute, createTextNode
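As a rough illustration of how the create and insert methods work together, the sketch below builds a new `book` element, inserts it before an existing one, replaces a title, and finally removes the old book. It uses Python's `xml.dom.minidom`; the starting document and the "Draft title" text are assumptions.

```python
from xml.dom.minidom import parseString

doc = parseString("<books><book><title>XML Design</title></book></books>")
root = doc.documentElement
old_book = root.firstChild

# Build <book isbn="0130811521"><title>Draft title</title></book> from scratch.
new_book = doc.createElement("book")
isbn = doc.createAttribute("isbn")
isbn.value = "0130811521"
new_book.setAttributeNode(isbn)
draft_title = doc.createElement("title")
draft_title.appendChild(doc.createTextNode("Draft title"))
new_book.appendChild(draft_title)

# Insert the new book in front of the existing one.
root.insertBefore(new_book, old_book)
count_after_insert = root.childNodes.length   # 2

# Swap the draft title for the final one.
final_title = doc.createElement("title")
final_title.appendChild(doc.createTextNode("The XML Handbook"))
new_book.replaceChild(final_title, draft_title)

# Remove the original book again.
root.removeChild(old_book)
result = root.toxml()
print(result)
```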
Example
[Figure: DOM tree with the root "books" and two "book" children; the first book has the "author" children "Goldfarb" and "Prescod", the second has the "author" child "Spencer"]
doc.documentElement.childNodes.item(0).getElementsByTagName("author").item(1).childNodes.item(0).data
doc                               DOM object
.documentElement                  root node
.childNodes                       books
.item(0)                          first book
.getElementsByTagName("author")   authors thereof
.item(1)                          second author
.childNodes                       subnodes
.item(0)                          first
.data                             text
Script
<HTML>
<HEAD><TITLE>DOM Example</TITLE></HEAD>
<BODY>
<H1>DOM Example</H1>
<SCRIPT LANGUAGE="JavaScript">
  var doc, root, book1, authors, author2;
  // MSXML DOM parser; available only in Internet Explorer
  doc = new ActiveXObject("Microsoft.XMLDOM");
  doc.async = false;   // load synchronously so the tree is ready after load()
  doc.load("books.xml");
  if (doc.parseError.errorCode != 0)
    alert(doc.parseError.reason);
  else {
    root = doc.documentElement;
    document.write("Name of Root node: " + root.nodeName + "<BR>");
    document.write("Type of Root node: " + root.nodeType + "<BR>");
    book1 = root.childNodes.item(0);   // first book
    authors = book1.getElementsByTagName("author");
    document.write("Number of authors: " + authors.length + "<BR>");
    author2 = authors.item(1);         // second author
    document.write("Name of second author: " + author2.childNodes.item(0).data);
  }
</SCRIPT>
</BODY>
</HTML>
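Because the script above depends on Internet Explorer and MSXML, here is a rough equivalent that runs anywhere Python does, using `xml.dom.minidom`. It is a sketch under two assumptions: the books document is supplied inline instead of being loaded from `books.xml`, and it is written without whitespace between tags, since minidom keeps whitespace-only text nodes and `childNodes.item(0)` would otherwise be a text node rather than the first book.

```python
from xml.dom.minidom import parseString

# Inline stand-in for books.xml, reduced to the elements the script reads.
BOOKS_XML = (
    "<books>"
    "<book><title>The XML Handbook</title>"
    "<author>Goldfarb</author><author>Prescod</author></book>"
    "<book><title>XML Design</title><author>Spencer</author></book>"
    "</books>"
)

doc = parseString(BOOKS_XML)
root = doc.documentElement
print("Name of root node:", root.nodeName)   # books
print("Type of root node:", root.nodeType)   # 1

book1 = root.childNodes.item(0)              # first book
authors = book1.getElementsByTagName("author")
print("Number of authors:", authors.length)  # 2

author2 = authors.item(1)                    # second author
second_author = author2.childNodes.item(0).data
print("Name of second author:", second_author)  # Prescod
```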
SAX - Simple API for XML
[Figure: the parser reads the document and its DTD and reports startDocument, startElement, endElement, and endDocument events to the application]
SAX - Simple API for XML
Event-driven parsing model
"Don't call the DOM, the parser calls you."
Developed by the members of the XML-DEV mailing list
Released on May 11, 1998
Supported by many parsers ...
... but Ælfred is the saxon king.
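The event-driven model can be sketched with Python's standard `xml.sax` package. The handler below only records the order in which the parser calls it; the class name `EventLogger` and the sample document are assumptions for illustration, and the method names follow the SAX handler interface as exposed by Python.

```python
import xml.sax

class EventLogger(xml.sax.ContentHandler):
    """Records every callback the parser makes, in order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startDocument(self):
        self.events.append("startDocument")

    def endDocument(self):
        self.events.append("endDocument")

    def startElement(self, name, attrs):
        self.events.append("startElement " + name)

    def endElement(self, name):
        self.events.append("endElement " + name)

handler = EventLogger()
# The parser drives the handler; the application never asks for nodes.
xml.sax.parseString(b"<books><book><title>XML Design</title></book></books>", handler)
for event in handler.events:
    print(event)
```

No tree is built at any point, so memory use stays flat regardless of document size; the price is that the application has to keep track of context itself.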
Procedure
DOM
  Creating a parser instance
  Parsing the whole document
  Processing the DOM tree
SAX
  Creating a parser instance
  Registering event handlers with the parser
  Parser calls the event handlers during parsing
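The two procedures can be compared on one small task, counting `author` elements. This is a sketch with Python's `xml.dom.minidom` and `xml.sax`; the class name `AuthorCounter` and the inline document are assumptions. In minidom the parser instance is created implicitly by `parseString`, whereas the SAX half creates it explicitly.

```python
import io
import xml.sax
from xml.dom.minidom import parseString

BOOKS_XML = (
    b"<books>"
    b"<book><author>Goldfarb</author><author>Prescod</author></book>"
    b"<book><author>Spencer</author></book>"
    b"</books>"
)

# DOM: parse the whole document first, then process the finished tree.
doc = parseString(BOOKS_XML)
dom_count = doc.getElementsByTagName("author").length

# SAX: create a parser, register a handler, and let the parser call it while parsing.
class AuthorCounter(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        if name == "author":
            self.count += 1

parser = xml.sax.make_parser()
counter = AuthorCounter()
parser.setContentHandler(counter)
parser.parse(io.BytesIO(BOOKS_XML))

print(dom_count, counter.count)  # 3 3
```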
Namespace Support
<?xml version="1.0"?>
<order xmlns="http://www.net-standard.com/namespaces/order"
       xmlns:bk="http://www.net-standard.com/namespaces/books"
       xmlns:cust="http://www.net-standard.com/namespaces/customer">
  ...
  <bk:book>
    <bk:title>XML Handbook</bk:title>
    <bk:isbn>0130811521</bk:isbn>
  </bk:book>
  ...
</order>
Access to Qualified Elements
Node "book"                                     DOM Level 2 (interface "Node")   SAX 2.0 (startElement)
bk:book                                         nodeName                         qName
http://www.net-standard.com/namespaces/books    namespaceURI                     uri
bk                                              prefix
book                                            localName                        localName
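Both access paths can be tried with Python's standard library, as sketched below. The document is a shortened version of the order example; the class name `QualifiedNames` is an assumption. One difference from the SAX 2.0 signature is worth noting: Python delivers the URI and local name as one tuple in `startElementNS`, and its default parser usually passes `None` for the qualified name.

```python
import io
import xml.sax
from xml.sax.handler import feature_namespaces
from xml.dom.minidom import parseString

BOOKS_NS = "http://www.net-standard.com/namespaces/books"
ORDER_XML = (
    b'<order xmlns="http://www.net-standard.com/namespaces/order" '
    b'xmlns:bk="http://www.net-standard.com/namespaces/books">'
    b"<bk:book><bk:title>XML Handbook</bk:title>"
    b"<bk:isbn>0130811521</bk:isbn></bk:book></order>"
)

# DOM Level 2: namespace information is available on every element node.
doc = parseString(ORDER_XML)
book = doc.getElementsByTagNameNS(BOOKS_NS, "book").item(0)
print(book.nodeName)       # bk:book
print(book.namespaceURI)   # http://www.net-standard.com/namespaces/books
print(book.prefix)         # bk
print(book.localName)      # book

# SAX 2: with namespace processing enabled, the parser calls startElementNS.
class QualifiedNames(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.names = []

    def startElementNS(self, name, qname, attrs):
        uri, local_name = name   # (namespace URI, local name)
        self.names.append((uri, local_name))

parser = xml.sax.make_parser()
parser.setFeature(feature_namespaces, True)
handler = QualifiedNames()
parser.setContentHandler(handler)
parser.parse(io.BytesIO(ORDER_XML))
print(handler.names[1])
```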
Generation of Data Structures
[Figure: a generation step turns the DTD / Schema 'yacht' into a class; a processing step turns a 'yacht' XML document into an object of that class]
Class
01 yacht
  05 name
  05 details
    10 type
XML document
<?xml version="1.0"?>
<yacht yachtid='147'>
  <name>Mona Lisa</name>
  <image file='yacht147.jpg'/>
  <description>Any text describing this yacht 147</description>
  <details>
    <type>GULFSTAR 55</type>
    <length>1700</length>
    <width>480</width>
    <draft>170</draft>
    <sailsurface>112</sailsurface>
    <motor>84</motor>
    <headroom>202</headroom>
    <bunks>8</bunks>
  </details>
</yacht>
Object
01 yacht
  05 VENTANA
  05 details
    10 GULFSTAR 55
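The idea can be approximated by hand to show what generated code might look like. In the sketch below, the `Yacht` and `Details` classes stand in for what a generator could derive from the 'yacht' DTD or schema, and `yacht_from_xml` plays the role of the processing step. All three names, the choice of fields, and the shortened document are assumptions; no actual generator tool is shown.

```python
from dataclasses import dataclass
from xml.dom.minidom import parseString

@dataclass
class Details:
    type: str
    length: int
    bunks: int

@dataclass
class Yacht:
    yachtid: int
    name: str
    details: Details

def text_of(parent, tag):
    """Return the text of the first descendant element named tag."""
    return parent.getElementsByTagName(tag).item(0).firstChild.data.strip()

def yacht_from_xml(xml_text):
    """Fill a Yacht object from a 'yacht' document."""
    root = parseString(xml_text).documentElement
    details = root.getElementsByTagName("details").item(0)
    return Yacht(
        yachtid=int(root.getAttribute("yachtid")),
        name=text_of(root, "name"),
        details=Details(
            type=text_of(details, "type"),
            length=int(text_of(details, "length")),
            bunks=int(text_of(details, "bunks")),
        ),
    )

yacht = yacht_from_xml(
    "<yacht yachtid='147'><name>Mona Lisa</name>"
    "<details><type>GULFSTAR 55</type><length>1700</length>"
    "<bunks>8</bunks></details></yacht>"
)
print(yacht.name, yacht.details.type, yacht.details.length)
```

The application then works with typed fields such as `yacht.details.length` instead of navigating nodes, which is the main appeal of class generation.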
Summary
To avoid expensive text processing, applications use an XML parser that creates a DOM tree of a document.
The DOM provides a standardized API to access the content of documents and to manipulate them.
Alternatively or additionally, applications can work event-based using the SAX interface, which is provided by many parsers.