
Java Web Services, Software Park Thailand, 2004

Dr. Kanda Runapongsa, Khon Kaen University

JAXP (Java API for XML Processing)

Dr. Kanda Runapongsa
Department of Computer Engineering
Khon Kaen University

Overview

- What are XML Parsers?
- What is JAXP?
- SAX: Simple API for XML
- DOM: Document Object Model
- SAX vs. DOM
  - When to Use DOM?
  - When to Use SAX?
- Transforming with XSLT


What are XML Parsers?

- To process XML data, every program or server process needs an XML parser
- The parser extracts the actual data from the textual representation
- It is essential for the automatic processing of XML documents

What are XML Parsers? (Cont.)

- Parsers also check whether documents conform to the XML standard and have a correct structure
- There are two types of XML parsers
  - Validating: check documents against a DTD or an XML Schema
  - Non-validating: do not check documents against a DTD or an XML Schema

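The difference between the two parser types can be sketched with JAXP itself. This is a minimal illustration, not part of the original slides: the class name ValidationDemo and the sample document are hypothetical, and the sketch assumes the default parser shipped with the JDK. The document is well-formed but breaks its own DTD, so only the validating parser should report an error.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class ValidationDemo {
    // Hypothetical sample: the internal DTD says <note> must contain <to>,
    // but the document leaves <note> empty. It is well-formed, yet invalid.
    static final String XML =
          "<?xml version=\"1.0\"?>"
        + "<!DOCTYPE note ["
        + "  <!ELEMENT note (to)>"
        + "  <!ELEMENT to (#PCDATA)>"
        + "]>"
        + "<note/>";

    // Parses the sample and returns how many validity errors were reported.
    static int countValidityErrors(boolean validating) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(validating);   // true = check against the DTD
        SAXParser parser = factory.newSAXParser();

        final List<String> errors = new ArrayList<>();
        parser.parse(
            new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8)),
            new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    // Validity violations arrive here as recoverable errors.
                    errors.add(e.getMessage());
                }
            });
        return errors.size();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("non-validating errors: " + countValidityErrors(false));
        System.out.println("validating errors:     " + countValidityErrors(true));
    }
}
```

A document that is not well-formed is a different case: both parser types must reject it, and the parse call throws a SAXParseException.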

What is JAXP?

- JAXP: The Java API for XML Processing
- JAXP = SAX + DOM + XSLT (Java API)
- Now (24/09/2004), JAXP v.1.2.6
- JAXP allows you to use any XML-compliant parser from within your application
- A thin and lightweight Java API for parsing and transforming XML documents
- Allows for pluggable parsers and transformers
- Allows parsing of XML documents using:
  - Event-driven (SAX 2.0)
  - Tree-based (DOM Level 2)

JAXP: Pluggable Framework for Parsers and Transformers

[Diagram: the User Application calls the JAXP Interface, which plugs in either the Reference Parsers or an Other Parser]


Packages in JAXP

- The JAXP API model is quite easy to understand and simple to use
- javax.xml.parsers
  - Provides a common interface for different vendors' SAX and DOM parsers
- org.w3c.dom
  - Defines the Document interface (DOM) as well as interfaces for all of the components of a DOM

Packages in JAXP (Cont.)

- org.xml.sax
  - Defines the basic SAX APIs
- javax.xml.transform
  - Defines the XSLT APIs that let you transform XML into other forms


Current Parsing Approaches

- SAX (Simple API for XML) and DOM (Document Object Model) allow programmers to access the information stored in XML documents
  - Using any programming language and a parser for that language
- The two take very different approaches to giving you access to your information


SAX (Simple API for XML)

Overview

- What is SAX?
- SAX Operational Model
- Processing XML with JAXP SAX
- Callback Interfaces
- Handling SAX events
  - startDocument, endDocument, characters
  - startElement, endElement
- What the ContentHandler Doesn't Tell You


SAX (Simple API for XML)

- The SAX API is based on an event-driven processing model where
  - The data elements are interpreted on a sequential basis
  - The callbacks are called based on selected constructs
- It uses a sequential read-only approach and does not support random access to the XML elements

SAX Operational Model

[Diagram: the XML Document is the Input to the Parser, which sends Events to the Provided Handler]


Processing XML with JAXP SAX

- Major steps for parsing using JAXP:
  - Getting the Factory and Parser classes to perform XML parsing
  - Setting options such as namespaces, validation, and features
  - Creating a DefaultHandler implementation class

Getting a Factory Class

- Obtain a factory object using SAXParserFactory's static newInstance() method
  - SAXParserFactory factory = SAXParserFactory.newInstance();


Getting and Using a SAXParser Class

- Obtain the SAX parser object from the factory by calling the factory's newSAXParser() method (an instance method, not a static one)
  - SAXParser parser = factory.newSAXParser();
- Parse the XML data by calling the parse method
  - parser.parse("methodCall.xml", handler);
  - The second argument is the handler, a DefaultHandler, which implements ContentHandler
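The steps above (factory, options, parser, handler, parse) can be put together as follows. This is a sketch under stated assumptions: the class name SaxSteps and the helper elementNames are hypothetical, and the XML is read from a string rather than from the methodCall.xml file so that the example is self-contained.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxSteps {
    // Returns the local names of all elements in document order.
    static List<String> elementNames(String xml) throws Exception {
        // Step 1: get a factory through the static newInstance() method.
        SAXParserFactory factory = SAXParserFactory.newInstance();

        // Step 2: set options on the factory before creating the parser.
        factory.setNamespaceAware(true);
        factory.setValidating(false);

        // Step 3: ask the factory instance for a parser.
        SAXParser parser = factory.newSAXParser();

        // Step 4: supply a DefaultHandler subclass that reacts to events.
        final List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String namespaceURI, String localName,
                                     String qualifiedName, Attributes atts) {
                names.add(localName);
            }
        };

        // Step 5: parse; the parser calls back into the handler as it reads.
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<methodCall><methodName>getQuote</methodName></methodCall>";
        System.out.println(elementNames(xml)); // [methodCall, methodName]
    }
}
```

To read a file instead, parser.parse(new java.io.File("methodCall.xml"), handler) or the URI form shown on the slide can be used with the same handler.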

Callback Interfaces

- SAX uses the Observer design pattern to tell client applications what's in a document
- Java developers are most familiar with this pattern from the event architecture of the AWT and Swing
  - MouseListener as the Observer
  - Button as the Subject


Callback Interfaces (Cont.)

- In SAX, XMLReader plays the role of the Subject and org.xml.sax.ContentHandler plays the role of the Observer
- The biggest difference between the AWT and SAX is that SAX does not allow more than one listener to be registered with each XMLReader

ContentHandler and DefaultHandler

- There are eleven methods declared in the ContentHandler interface
- Few SAX programs actually use all eleven methods
- SAX includes the org.xml.sax.helpers.DefaultHandler class that implements the ContentHandler interface
- By extending DefaultHandler, we only have to override the methods we actually care about


Extending from Class DefaultHandler

- The following methods are often overridden when defining a class that extends DefaultHandler
  - public void startDocument()
  - public void endDocument()

Extending from Class DefaultHandler (Cont.)

- Methods that are often overridden
  - public void characters(char[] text, int start, int length)
  - public void startElement(String namespaceURI, String localName, String qualifiedName, Attributes atts)
  - public void endElement(String namespaceURI, String localName, String qualifiedName)


Receiving Documents

- The parser invokes startDocument() as soon as it begins parsing a new document, before it invokes any other methods in ContentHandler (except setDocumentLocator())
- It calls endDocument() after it has finished parsing the document and will not report any further content from that document

Receiving Elements

- When the parser encounters a start tag, it calls the startElement() method
- When the parser encounters an end tag, it calls the endElement() method
- When the parser encounters an empty-element tag, it calls the startElement() method and then the endElement() method

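The order of the document and element callbacks can be observed with a handler that simply records each event. This is an illustrative sketch: the class name EventTrace and the sample document are hypothetical. The sample contains an empty-element tag to show that it produces a start event immediately followed by an end event.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class EventTrace extends DefaultHandler {
    // Events in the order the parser reported them.
    final List<String> events = new ArrayList<>();

    @Override
    public void startDocument() {
        events.add("startDocument");
    }

    @Override
    public void endDocument() {
        events.add("endDocument");
    }

    @Override
    public void startElement(String namespaceURI, String localName,
                             String qualifiedName, Attributes atts) {
        events.add("start:" + qualifiedName);
    }

    @Override
    public void endElement(String namespaceURI, String localName,
                           String qualifiedName) {
        events.add("end:" + qualifiedName);
    }

    // Parses the given XML text and returns the recorded event sequence.
    static List<String> trace(String xml) throws Exception {
        EventTrace handler = new EventTrace();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        // prints [startDocument, start:order, start:item, end:item, end:order, endDocument]
        System.out.println(trace("<order><item/></order>"));
    }
}
```

Only the methods of interest are overridden; DefaultHandler supplies empty implementations for the rest.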

Handling Attributes

- Attributes are not reported through separate callbacks
- Instead, an Attributes object containing all the attributes of an element is passed to the startElement() method for the start-tag or empty-element tag of the element that possesses the attributes
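Reading attributes therefore happens inside startElement(). The following sketch assumes a hypothetical helper, AttributeReader.attributesOf, that collects the attributes of the first matching element into a map. SAX does not guarantee the order in which attributes are listed, so the example looks values up by name.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class AttributeReader {
    // Returns name/value pairs for every attribute of elements named elementName.
    static Map<String, String> attributesOf(String xml, final String elementName)
            throws Exception {
        final Map<String, String> found = new LinkedHashMap<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String namespaceURI, String localName,
                                     String qualifiedName, Attributes atts) {
                if (!qualifiedName.equals(elementName)) {
                    return;
                }
                // The Attributes object is indexed from 0 to getLength() - 1.
                for (int i = 0; i < atts.getLength(); i++) {
                    found.put(atts.getQName(i), atts.getValue(i));
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return found;
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> atts = attributesOf("<book id=\"42\" lang='en'/>", "book");
        System.out.println(atts.get("id"));   // 42
        System.out.println(atts.get("lang")); // en
    }
}
```

The values are copied out during the callback because the Attributes object is only guaranteed to be valid for the duration of the startElement() call.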

Receiving Characters

- When the parser reads #PCDATA, it passes this text to the characters() method as an array of chars
- You must not assume that the parser will pass you the maximum contiguous run of text in a single call to characters()

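Because one run of text may arrive in several characters() calls, a common approach is to append each chunk to a buffer and read the buffer only when the element ends. The sketch below assumes a hypothetical class TextCollector and a hypothetical title element. It resets the buffer at every start tag, which is a simplification that suits elements containing only text.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class TextCollector extends DefaultHandler {
    private final StringBuilder buffer = new StringBuilder();
    final List<String> titles = new ArrayList<>();

    @Override
    public void startElement(String namespaceURI, String localName,
                             String qualifiedName, Attributes atts) {
        buffer.setLength(0); // start collecting text for the new element
    }

    @Override
    public void characters(char[] text, int start, int length) {
        // Only the slice [start, start + length) belongs to this event.
        buffer.append(text, start, length);
    }

    @Override
    public void endElement(String namespaceURI, String localName,
                           String qualifiedName) {
        if (qualifiedName.equals("title")) {
            titles.add(buffer.toString()); // all chunks are in by now
        }
    }

    // Returns the text content of every <title> element in the document.
    static List<String> titles(String xml) throws Exception {
        TextCollector handler = new TextCollector();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return handler.titles;
    }

    public static void main(String[] args) throws Exception {
        // The entity reference may split the text across several calls.
        System.out.println(titles("<books><title>SAX &amp; DOM</title></books>")); // [SAX & DOM]
    }
}
```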

What the ContentHandler Doesn't Tell You

- The type of quotes that surround attribute values
- Whether empty elements are represented as <name></name> or <name/>
- Whether an attribute was specified in the instance document or defaulted in from the DTD or schema


DOM (Document Object Model)

Overview

- DOM and Programming Languages
- The Evolution of DOM
- Trees
- DOM in Action
- DOM Parsers for Java
- Parsing Documents with a DOM Parser
- The Node Interface
- The NodeList Interface


DOM and Programming Languages

- DOM is defined in the Interface Definition Language (IDL) so that it's language neutral
- DOM bindings exist for most object-oriented languages including Java, JavaScript, C++, Python, and Perl

The Evolution of DOM

- The first version wasn't an official specification, just the object model that Netscape Navigator 3 and Internet Explorer 3 implemented in their browsers. This is sometimes called DOM Level 0
- DOM Level 0 only applied to HTML documents and only in the context of JavaScript


The Evolution of DOM (Cont.)

- The growing incompatibility between the two browser object models made it obvious that something more standard was needed
- Hence, the W3C launched the W3C DOM Activity and began working on DOM Level 1

The Evolution of DOM (Cont.)

- DOM Level 2 cleaned up DOM Level 1
  - The big change was namespace support in the Element and Attr interfaces
- DOM2 also added a number of supplementary interfaces for events, traversal, ranges, views, and style sheets
- From this point, we will learn about DOM2


Trees

- According to DOM, an XML document is a tree made up of nodes of several types
- The tree has a single root node, and all nodes in this tree except for the root have a single parent node
- Each node has a list of child nodes
- What do we call a node whose list of children is empty?
  - A leaf node

Tree and Nodes

- There can also be nodes that are not part of the tree structure
  - Each attribute node belongs to one element node but is not considered to be a child of that element
- A full DOM document is composed of a tree of nodes, various nodes that are somehow associated with other nodes but are not themselves part of the tree, and a random assortment of disconnected nodes


Tree Nodes

- Besides its tree connections, each node has a local name, a namespace URI, and a prefix, though for several kinds of nodes these may be null
  - For instance, the local name, namespace URI, and prefix of a comment are always null

Tree Nodes (Cont.)

- Each node has a string value
  - For text-ish things like text nodes and comments, this tends to be the text of the node
  - For attributes, it is the normalized value of the attribute
  - For everything else, including elements and documents, the value is null

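The string value of each kind of node can be checked with Node.getNodeValue(). This is a small sketch, not from the original slides: the class name NodeValues and the sample document are hypothetical, and the document is parsed from a string with the default JAXP DocumentBuilder.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class NodeValues {
    // Parses XML text into a DOM Document using the default JAXP builder.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<greeting lang=\"en\">hello<!--note--></greeting>");
        Element root = doc.getDocumentElement();
        Node text = root.getFirstChild();    // the text node "hello"
        Node comment = root.getLastChild();  // the comment node

        System.out.println(text.getNodeValue());                          // hello
        System.out.println(comment.getNodeValue());                       // note
        System.out.println(root.getAttributeNode("lang").getNodeValue()); // en
        System.out.println(root.getNodeValue());                          // null
        System.out.println(doc.getNodeValue());                           // null
    }
}
```

To get the text inside an element, read the value of its text node children (or call getTextContent(), available from DOM Level 3) rather than the element's own node value.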

Tree Node Types

- DOM divides nodes into twelve types, seven of which can potentially be part of a DOM tree
  - Document nodes
  - Element nodes
  - Text nodes
  - Attribute nodes
  - Processing instruction nodes

Tree Node Types (Cont.)

- Types of a node in DOM
  - Comment nodes
  - Document type nodes
  - Document fragment nodes
  - Notation nodes
  - CDATA section nodes
  - Entity nodes
  - Entity reference nodes


Document Nodes

- Each DOM tree has a single root document node
- This node has children
  - Since all documents have root elements, a document node always has exactly one element node child
- If the document has a document type declaration, then it will also have one document type node child

Document Nodes (Cont.)

- If the document contains any comments or processing instructions before or after the root element, then these will also be child nodes of the document node
- The order of all children is maintained

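The children of the document node can be listed by type to see this in practice. The sketch below assumes a hypothetical class DocumentChildren and a hypothetical sample document that has a document type declaration, a comment before the root element, and a processing instruction after it. It relies on the default JDK parser, which keeps comments unless told otherwise.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DocumentChildren {
    // Returns a label for each child of the document node, in document order.
    static List<String> childTypes(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        List<String> kinds = new ArrayList<>();
        NodeList children = doc.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            switch (children.item(i).getNodeType()) {
                case Node.DOCUMENT_TYPE_NODE:
                    kinds.add("doctype");
                    break;
                case Node.COMMENT_NODE:
                    kinds.add("comment");
                    break;
                case Node.ELEMENT_NODE:
                    kinds.add("element");
                    break;
                case Node.PROCESSING_INSTRUCTION_NODE:
                    kinds.add("processing-instruction");
                    break;
                default:
                    kinds.add("other");
            }
        }
        return kinds;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<!DOCTYPE memo [<!ELEMENT memo (#PCDATA)>]>"
                   + "<!--draft--><memo>hi</memo><?audit done?>";
        System.out.println(childTypes(xml));
    }
}
```

The same getNodeType() constants exist for all twelve node types listed on the previous slides.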

Element Nodes

- Each element node has a name, a local name, a namespace URI (which may be null if the element is not in any namespace), and a prefix (which may also be null)
- An element node can contain other element nodes, text nodes, comment nodes, and processing instruction nodes
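The four naming properties of an element map onto four Node methods. This sketch uses a hypothetical class ElementNames and turns on namespace awareness, since the JAXP factory is not namespace aware by default and the namespace properties are only filled in when it is.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class ElementNames {
    // Returns { name, local name, namespace URI, prefix } of the root element.
    static String[] describeRoot(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // needed for the namespace properties
        Element root = factory.newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)))
            .getDocumentElement();
        return new String[] {
            root.getNodeName(),      // qualified name, e.g. svg:rect
            root.getLocalName(),     // name without the prefix
            root.getNamespaceURI(),  // null when not in any namespace
            root.getPrefix()         // null when there is no prefix
        };
    }

    public static void main(String[] args) throws Exception {
        String[] parts = describeRoot(
            "<svg:rect xmlns:svg=\"http://www.w3.org/2000/svg\"/>");
        for (String part : parts) {
            System.out.println(part);
        }
    }
}
```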

Attribute Nodes

- An attribute node has a name, a local name, a prefix, a namespace URI, and a string value
- The attribute value is normalized
  - Each white-space character (tab, line feed, carriage return) is converted into a single space
- Attributes are not considered to be children of the element they are attached to

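Both points, normalization and the fact that attributes sit outside the child list, can be checked directly. The class name AttrNotChild and the sample element are hypothetical. The alt attribute contains a literal line feed, which the parser is expected to report as a space.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class AttrNotChild {
    // Parses XML text and returns its root element.
    static Element parseRoot(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)))
            .getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        // The "\n" below is a real line feed inside the attribute value.
        Element img = parseRoot("<img src=\"a.png\" alt=\"line1\nline2\"/>");
        Attr src = img.getAttributeNode("src");

        System.out.println(img.getChildNodes().getLength()); // 0
        System.out.println(src.getParentNode());             // null
        System.out.println(src.getOwnerElement() == img);    // true
        System.out.println(img.getAttribute("alt"));         // line1 line2
    }
}
```

The link from an attribute back to its element is getOwnerElement(), not getParentNode(), which is the practical consequence of attributes not being children.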

Leaf Nodes

- Only document, document fragment, element, attribute, entity, and entity reference nodes can have children
- The remaining node types do not have children
- There are several types of leaf nodes, such as text nodes, comment nodes, processing instruction nodes, and CDATA section nodes

DOM Parsers for Java

- JAXP, the Java API for XML Processing, provides a standard, parser-independent means to parse existing documents, create documents, and serialize in-memory DOM trees to XML files
- JAXP is a standard part of Java 1.4 or higher

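Creating a document and serializing it can both be done through JAXP alone. The following is a sketch rather than the only way to do it: the class name DomRoundTrip and the element names are hypothetical, and serialization uses an identity Transformer from javax.xml.transform, writing to a string instead of a file to keep the example self-contained.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class DomRoundTrip {
    // Builds a small DOM tree in memory and returns it as XML text.
    static String buildAndSerialize() throws Exception {
        // Create an empty document, then add nodes through its factory methods.
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder().newDocument();
        Element root = doc.createElement("methodCall");
        Element name = doc.createElement("methodName");
        name.appendChild(doc.createTextNode("getQuote"));
        root.appendChild(name);
        doc.appendChild(root);

        // A Transformer created without a stylesheet copies input to output.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        identity.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(buildAndSerialize());
    }
}
```

Passing new StreamResult(new java.io.File("out.xml")) instead of the StringWriter would write the tree to a file.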

Parsing Documents with a DOM Parser

- Unlike SAX, DOM does not have a class or interface that represents the XML parser
- Each parser vendor provides its own unique class (org.apache.xerces.parsers.DOMParser, oracle.xml.parser.v2.DOMParser)
- Since these classes do not share a common interface or superclass, the methods they use to parse documents vary too

JAXP DOM Parser

- The lack of a standard means of parsing an XML document is one of the holes that JAXP fills
- If your parser implements JAXP, then instead of using the parser-specific classes, you can use the javax.xml.parsers.DocumentBuilderFactory and javax.xml.parsers.DocumentBuilder classes to parse the documents

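With JAXP, the vendor-neutral way to obtain a DOM tree looks like the sketch below. The class name DomParse is hypothetical, and the input is a string so the example runs without any file. The code never names a vendor class; printing the builder's class shows which implementation was plugged in on the current platform.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class DomParse {
    // Parses XML text into an org.w3c.dom.Document through JAXP only.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<methodCall><methodName>getQuote</methodName></methodCall>");
        System.out.println(doc.getDocumentElement().getTagName()); // methodCall

        // The concrete class depends on which parser is installed.
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        System.out.println(builder.getClass().getName());
    }
}
```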

The Node Interface

- Once you've parsed the document and formed an org.w3c.dom.Document object, you can forget the differences between the various parsers and just work with the standard DOM interfaces

Common DOM Methods

- When you're working with the DOM, you'll often use the following methods
  - Document.getDocumentElement(): returns the root element of the DOM tree
  - Node.getFirstChild() and Node.getLastChild()
  - Node.getNextSibling()
  - Element.getAttribute(String attrName)

51

The NodeList Interface

• DOM stores the list of children of each node in a NodeList object

• Indexes start from 0 and continue to one less than the length of the list, just like Java arrays

• package org.w3c.dom;
  public interface NodeList {
      public Node item(int index);
      public int getLength();
  }
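A short sketch of index-based iteration over a NodeList, using only item(int) and getLength() from the interface above. The class name NodeListSketch and the sample document are assumptions for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NodeListSketch {

    // Returns the node names of the root element's children, in document order.
    static List<String> childNodeNames(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList children = doc.getDocumentElement().getChildNodes();
        List<String> names = new ArrayList<>();
        // Valid indexes run from 0 to getLength() - 1, as with a Java array.
        for (int i = 0; i < children.getLength(); i++) {
            names.add(children.item(i).getNodeName());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample document with no whitespace between tags.
        System.out.println(childNodeNames("<order><item>pen</item><note>gift</note></order>"));
    }
}
```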

52

Overview

• XML Parsers
• What is JAXP?
• SAX: Simple API for XML
• DOM: Document Object Model
• SAX vs. DOM
  - When to Use DOM?
  - When to Use SAX?
• Transforming with XSLT

53

SAX vs. DOM

• In the case of DOM, the parser does almost everything
  - Reads the XML document in
  - Creates an object model on top of it
  - Gives you a reference to this object model (a Document object) so that you can manipulate it

• SAX doesn't expect the parser to do much

54

SAX vs. DOM (Cont.)

• For SAX, the parser should
  - Read in the XML document
  - Fire a series of events depending on what tags it encounters in the XML document

• Then, the programmer needs to make sense of all the tag events and create objects in their own object model
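To illustrate the division of labor, here is a sketch in which the parser only fires events and the handler builds the application's own objects. The Item class, the "qty" attribute, and the sample document are hypothetical; the SAX callbacks are the standard ones on org.xml.sax.helpers.DefaultHandler.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxObjectModelSketch {

    // Hypothetical application object; SAX itself knows nothing about it.
    static final class Item {
        final String name;
        final int quantity;

        Item(String name, int quantity) {
            this.name = name;
            this.quantity = quantity;
        }
    }

    // Turns tag events into Item objects.
    static final class ItemHandler extends DefaultHandler {
        final List<Item> items = new ArrayList<>();
        private final StringBuilder text = new StringBuilder();
        private int quantity;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if ("item".equals(qName)) {
                text.setLength(0); // start collecting this item's text
                quantity = Integer.parseInt(atts.getValue("qty"));
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // Text may arrive in several chunks, so accumulate it.
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("item".equals(qName)) {
                items.add(new Item(text.toString().trim(), quantity));
            }
        }
    }

    static List<Item> parse(String xml) throws Exception {
        ItemHandler handler = new ItemHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.items;
    }

    public static void main(String[] args) throws Exception {
        List<Item> items = parse("<order><item qty=\"2\">pen</item><item qty=\"5\">ink</item></order>");
        for (Item item : items) {
            System.out.println(item.name + " x " + item.quantity);
        }
    }
}
```

All knowledge of what an item is lives in the handler; with DOM, the same information would be read out of a generic tree after parsing finishes.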

55

Overview

• XML Parsers
• What is JAXP?
• SAX: Simple API for XML
• DOM: Document Object Model
• SAX vs. DOM
• When to Use DOM?
• When to Use SAX?

56

When to Use DOM?

• DOM is quite easy to program with
  - Good when development must be done in a short amount of time

• DOM creates a tree of nodes in memory
  - Use it when you need to quickly access the children and parent of the current node
  - Use it when you need to modify an XML structure

• What are the disadvantages of using DOM?

57

When to Use SAX?

• SAX requires little memory because
  - It does not construct an internal representation of the XML data

• It works well when you simply want to read data and have the application act on it
  - You see the data as it streams in, but you can't go back to an earlier position or leap ahead to a different position
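A sketch of the read-and-act style described above: the handler keeps a single counter and never stores the document, so memory use should not grow with input size. The class name SaxStreamingCountSketch and the sample document are assumptions for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxStreamingCountSketch {

    // Counts elements with the given tag name in a single forward pass.
    static int countElements(String xml, String tagName) throws Exception {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                // Act on each event as it arrives; nothing earlier is retained.
                if (tagName.equals(qName)) {
                    count[0]++;
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample document.
        String xml = "<order><item>pen</item><item>ink</item><note>gift</note></order>";
        System.out.println(countElements(xml, "item"));
    }
}
```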

58

Overview

• XML Parsers
• What is JAXP?
• SAX: Simple API for XML
• DOM: Document Object Model
• SAX vs. DOM
  - When to Use DOM?
  - When to Use SAX?
• Transforming with XSLT

59

XSL - The Style Sheet of XML

• XML does not use predefined tags (we can use any tags we want)

• <table> could mean an HTML table, a piece of furniture, or something else

• XSL: something in addition to the XML document that describes how the document should be displayed

60

What is XSLT?

• XSLT transforms an XML document into another XML document, such as an XHTML document

• XSLT can
  - Add new elements into the output file
  - Remove elements
  - Rearrange and sort elements
  - Test and make decisions about which elements to display

61

How Does XSLT Work?

• XSLT transforms an XML source tree into an XML result tree

• XSLT uses XPath to define parts of the source document that match one or more predefined templates

62

How Does XSLT Work? (Cont.)

• When a match is found, XSLT transforms the matching part of the source document into the result document

• Parts of the source document that match no template are handled by XSLT's built-in template rules, which copy their text content to the result document but drop the element tags
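A sketch of template matching, run through the JAXP transformation API. The stylesheet, the element names (order, customer, item, summary, product), and the class name XsltTemplateMatchSketch are all assumptions for illustration. The stylesheet has templates for /order and item but none for customer, so customer falls through to XSLT's built-in rules, which pass along its text without its tags.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltTemplateMatchSketch {

    // Hypothetical stylesheet: explicit templates for /order and item only.
    static final String STYLESHEET =
            "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
          + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
          + "<xsl:template match=\"/order\"><summary><xsl:apply-templates/></summary></xsl:template>"
          + "<xsl:template match=\"item\"><product><xsl:value-of select=\".\"/></product></xsl:template>"
          + "</xsl:stylesheet>";

    // Applies the stylesheet to the given XML text and returns the result as text.
    static String apply(String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // item matches a template and becomes product; customer matches none.
        System.out.println(apply("<order><customer>Ann</customer><item>pen</item></order>"));
    }
}
```

In the result, the item element should appear rewritten as a product element, while only the text of customer survives.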

63

XPath and XSLT

• XPath is a standard that provides the mechanism for accessing the elements of an XML document

• XPath identifies the parts of the input document to be transformed

• XPath enables you to traverse an XML document and select a set of elements
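XPath selection can also be tried outside a stylesheet. The sketch below uses the javax.xml.xpath package, which was added to JAXP in a later release (JAXP 1.3, Java 5) than these slides may assume. The class name XPathSelectSketch, the "qty" attribute, and the sample document are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathSelectSketch {

    // Evaluates an XPath expression and returns the text of each selected node.
    static List<String> select(String xml, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
                expression,
                new InputSource(new StringReader(xml)),
                XPathConstants.NODESET);
        List<String> texts = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            texts.add(nodes.item(i).getTextContent());
        }
        return texts;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample document.
        String xml = "<order><item qty=\"1\">pen</item><item qty=\"3\">ink</item><item qty=\"2\">pad</item></order>";
        // Select only the item elements whose qty attribute exceeds 1.
        System.out.println(select(xml, "/order/item[@qty > 1]"));
    }
}
```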

64

Transforming with XSLT

• XSL provides the syntax and semantics for specifying formatting

• XSLT is the language that describes the transformation; an XSLT processor applies the stylesheet to perform the formatting task

• XSLT is often used to generate various output formats for an application that serves heterogeneous client types

65

XSLT Supported in JAXP

• A transformation follows these logical steps
  - Obtain a transformer factory, used for instantiating a transformer
  - Create a new transformer instance
  - Use the transformer to transform the data by specifying the XML input source and the output result

66

Getting the Factory and Transformer Class

• Use the factory class for instantiating a transformer implementation class

  TransformerFactory factory = TransformerFactory.newInstance();

• Use the transformer class for applying the stylesheet to the input XML data

  Transformer transformer = factory.newTransformer(new StreamSource("order.xsl"));

67

Transforming the XML

• The application then calls the transformer's transform method to invoke the transformation process.

• The parameters required by the transform method are the input source and the output result

• transformer.transform(new StreamSource("PurchaseOrder.xml"), new StreamResult(System.out));
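The three steps can be put together in one self-contained sketch. To keep it runnable without external files, the stylesheet and the input document are held in memory as strings rather than read from files such as order.xsl; the stylesheet content, element names, and the class name XsltTransformSketch are assumptions for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltTransformSketch {

    static String transform(String stylesheet, String xml) throws Exception {
        // Step 1: obtain a transformer factory.
        TransformerFactory factory = TransformerFactory.newInstance();
        // Step 2: create a transformer bound to the stylesheet.
        Transformer transformer =
                factory.newTransformer(new StreamSource(new StringReader(stylesheet)));
        // Step 3: transform the input source into the output result.
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical stylesheet that renders each item as a list entry.
        String stylesheet =
                "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
              + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
              + "<xsl:template match=\"/order\">"
              + "<ul><xsl:for-each select=\"item\"><li><xsl:value-of select=\".\"/></li></xsl:for-each></ul>"
              + "</xsl:template>"
              + "</xsl:stylesheet>";
        String xml = "<order><item>pen</item><item>ink</item></order>";
        System.out.println(transform(stylesheet, xml));
    }
}
```

To work with files instead, the StringReader-based sources could be replaced by new StreamSource("order.xsl") and similar, and the StringWriter result by new StreamResult(System.out), as on the slides.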

68

Resources

• MSXML: Microsoft XML Parser: http://msdn.microsoft.com/xml/
• Apache Xerces: XML parsers in Java and C++: http://xml.apache.org
• IBM AlphaWorks: http://www.alphaworks.ibm.com/tech/xml4j
• expat: http://www.jclark.com/xml/expat.html
• XP: http://www.jclark.com/xml/xp/
• Other sources
  - XML.com web site
  - Cover Pages: XML web site

69

Exercises - Compile and run these files

• SAX
  ex1.java, ex2.java, ex3.java, ex4.java, ex5.java, ex6.java, ex7.java

• DOM
  dom1.java, dom2.java, dom3.java, dom4.java, dom5.java

• Download Exercises
  http://gear.kku.ac.th/~krunapon/178375/exercises/jaxp_exer.zip