Java WWW Week 10
Version 2.1 Mar 2008
Slide [email protected]
Java (JSP) and XML
Format of lecture:
  What is XML?
  A sample XML file
  How to use XML with JSP
  Example code for parsing an XML file in JSP
What is XML?
XML is principally concerned with the description and structure of data (content and presentation are separate)
Traditional methods of data storage and exchange employ a variety of schemes
Electronically, these usually take the form of simple text files (Comma Separated Values etc.) or binary files
Both methods have their own advantages and disadvantages
  CSV files contain data that can be easily read but lack a description of their own format
  Binary files contain data and a description of its format (as a Word document does) but are proprietary schemes that require specific applications to read them
What is XML?
XML stands for eXtensible Markup Language
Developed by the World Wide Web Consortium (W3C)
Aims to provide the best of both worlds
  Stores data in an easy-to-read text format
  Also contains a description of the data
Open standard - looks similar to HTML (large user base) except the extensible nature of XML allows for the creation of user-defined tags
It is not a replacement for HTML (although it may eventually supplant it)
What is XML?
XML offers a standard method for storing structured data
Language independence (English, Chinese etc.)
Hierarchical structure allows for simple and efficient querying/parsing of the document
Simple data interchange between applications and/or distributed objects
What is XML?
Can improve upon and replace existing EDI (Electronic Data Interchange) solutions such as EDIFACT
Websites gain by having content and presentation separate
  A site could be developed purely in XML
Cost benefits
  No need for private EDI networks: the Internet is used as the exchange medium
  Reduced time to implement and reduced maintenance
Illustrative Example
Paper-based or Simple Text File
Jonathan Westlake
Staffordshire University
[email protected]
Comma Separated Values
Jonathan,Westlake,Staffordshire University,[email protected]
Binary: a string of 0s and 1s
0101 1001 0111 0110 0110 0001 0110 1110 0010 0000 etc.
Illustrative Example
Simple ‘Freeform’ (i.e. without an XML schema) XML File
<contact>
  <name>
    <firstname>Jonathan</firstname>
    <lastname>Westlake</lastname>
  </name>
  <workplace>Staffordshire University</workplace>
  <email>[email protected]</email>
</contact>
How to use XML with JSP
XML is a big topic! We are just going to look at one of the more fundamental aspects of XML
An XML parser checks that your XML document is syntactically correct (well-formed) and, when validation is switched on, that it conforms to its schema or DTD (valid)
Once an XML document is parsed, the information it contains is accessible inside our web application
There are two main ways of parsing an XML document: Simple API for XML (SAX) <reference link> and Document Object Model (DOM) <reference link>
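To make the idea of a well-formed document concrete, here is a minimal sketch that asks the standard JAXP parser whether a piece of text parses at all. The class name WellFormedCheck, the method isWellFormed() and the example.org address are made up for illustration; they are not part of the lecture code. Validation against a schema is not attempted here.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the lecture code
public class WellFormedCheck {

    // Returns true if the text parses as well-formed XML, false otherwise
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Replace the default error handler so parse errors are not printed to System.err
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser rejected the document (for example, a missing end tag)
            return false;
        } catch (Exception e) {
            throw new IllegalStateException("Could not set up or run the parser", e);
        }
    }

    public static void main(String[] args) {
        // The second string never closes its <email> element
        String good = "<contact><email>someone@example.org</email></contact>";
        String bad = "<contact><email>someone@example.org</contact>";
        System.out.println("good is well-formed: " + isWellFormed(good));
        System.out.println("bad is well-formed: " + isWellFormed(bad));
    }
}
```

A document can be well-formed without being valid; checking validity needs a schema or DTD as well.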
SAX Parser
Event driven: an event is triggered each time the parser encounters a start tag or an end tag
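As a rough illustration of the event-driven style, the sketch below records each event the SAX parser reports for a small document. The class name SaxEvents and the method listEvents() are hypothetical names chosen for this example. A parser is allowed to deliver one run of text in several characters() calls, so real code should not assume one call per text value.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of the lecture code
public class SaxEvents {

    // Parses the XML text and returns the events in the order the parser reported them
    static List<String> listEvents(String xml) {
        final List<String> events = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes) {
                events.add("start " + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("end " + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Ignore whitespace-only text such as indentation between tags
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    events.add("text " + text);
                }
            }
        };
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("SAX parsing failed", e);
        }
        return events;
    }

    public static void main(String[] args) {
        String xml = "<contact><workplace>Staffordshire University</workplace></contact>";
        for (String event : listEvents(xml)) {
            System.out.println(event);
        }
    }
}
```

Nothing is kept in memory unless the handler stores it, which is why SAX suits large documents that only need to be read once.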
DOM Parser
A tree representing the document is built in memory.
DOM Tree Structure (for our contacts example)
contacts
└---contact
    |---name
    |   |---firstname
    |   |   └---Text
    |   └---lastname
    |       └---Text
    |---workplace
    |   └---Text
    └---email
        └---Text
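The tree can be inspected directly once a document has been parsed. The following sketch walks a DOM tree recursively and writes one indented line per node, which shows that each value sits in a Text node beneath its element. DomTreePrinter, describe() and outline() are illustrative names only, and the contact data in main() is a placeholder.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the lecture code
public class DomTreePrinter {

    // Appends one line per node, indented by depth, to the output buffer
    static void describe(Node node, int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        if (node.getNodeType() == Node.TEXT_NODE) {
            out.append("Text: ").append(node.getNodeValue()).append('\n');
        } else {
            out.append(node.getNodeName()).append('\n');
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            describe(children.item(i), depth + 1, out);
        }
    }

    // Builds the DOM tree for the XML text and returns its outline
    static String outline(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            StringBuilder out = new StringBuilder();
            describe(doc.getDocumentElement(), 0, out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not build the DOM tree", e);
        }
    }

    public static void main(String[] args) {
        // Placeholder data in the same shape as the contacts example
        String xml = "<contacts><contact><name><firstname>Ada</firstname>"
                + "<lastname>Example</lastname></name>"
                + "<workplace>Example University</workplace>"
                + "<email>ada@example.org</email></contact></contacts>";
        System.out.print(outline(xml));
    }
}
```

If the XML file is indented, the whitespace between tags also appears as Text nodes, which is one reason hand-written tree walking gets awkward.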
Standard DOM Parsing
Searching for a contact based on their email address
Create a Document object
Load the XML file into the Document
Using iterations over the Document nodes, you can access any of the values or attributes that are stored in the XML file
When you find the record that contains the email address that you are looking for, do something with the information
Standard DOM Parsing
import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Get a factory object (there are many ways to build a Document; the factory lets us choose)
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
// Ask the factory for a parser that validates the XML data against an XML Schema
docBuilderFactory.setAttribute("http://java.sun.com/xml/jaxp/properties/schemaLanguage", "http://www.w3.org/2001/XMLSchema");
docBuilderFactory.setValidating(true);
docBuilderFactory.setNamespaceAware(true);
// Get the DocumentBuilder that we will use to build the DOM tree in memory
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
// Load the XML file into a new Document object
Document doc = docBuilder.parse(new File("myfile.xml"));
// Tidy the tree: merges adjacent text nodes and removes empty ones
doc.normalize();
Standard DOM Parsing
// Get every <contact> element in the document
NodeList contacts = doc.getElementsByTagName("contact");
for (int index = 0; index < contacts.getLength(); index++) {
    Element contact = (Element) contacts.item(index);
    // Each value is held in a Text node beneath its element, so read it with getTextContent()
    String firstname = contact.getElementsByTagName("firstname").item(0).getTextContent();
    String lastname = contact.getElementsByTagName("lastname").item(0).getTextContent();
    String workplace = contact.getElementsByTagName("workplace").item(0).getTextContent();
    String email = contact.getElementsByTagName("email").item(0).getTextContent();
    if (email.equals("[email protected]")) {
        // displayRecord() is assumed to be defined elsewhere in the application
        displayRecord(firstname, lastname, workplace, email);
        break;
    }
}
Standard DOM Parsing Problems
The previous procedure is ok if you have a fairly ‘flat’ XML structure that does not contain many different node types
If you have many contacts stored then it may be very slow to iterate through the nodes until you find the contact you are looking for
For more complex XML documents, you can’t use a simple FOR loop to iterate
You end up with code that looks more like…
Not Nice!
Element root = doc.getDocumentElement();
Node configNode = root.getFirstChild();
NodeList childNodes = configNode.getChildNodes();
for (int childNum = 0; childNum < childNodes.getLength(); childNum++) {
    if (childNodes.item(childNum).getNodeType() == Node.ELEMENT_NODE) {
        Element child = (Element) childNodes.item(childNum);
        if (child.getTagName().equals("header")) {
            // Do something with the header
            System.out.print("Got a header!\n");
        }
    }
}
XPath
XPath (XML Path Language) is a terse (short, non-XML) syntax for addressing portions of an XML document
A typical XPath expression is a Location Path consisting of a string of element or attribute qualifiers separated by forward slashes ("/"), similar in appearance to a file system path
E.g. this gets the email field of the first contact:
//contact[1]/email
NB would you expect to have seen //contact[0]/...? XPath positions start at 1, not 0
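The 1-based numbering is easy to check with the XPath support in the JDK. This is only a sketch: the class name XPathPositions, the method evaluate() and the two example.org addresses are invented for the demonstration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the lecture code
public class XPathPositions {

    // Placeholder document with two contacts
    static final String XML = "<contacts>"
            + "<contact><email>first@example.org</email></contact>"
            + "<contact><email>second@example.org</email></contact>"
            + "</contacts>";

    // Evaluates the expression against the placeholder document and returns the result as text
    static String evaluate(String expression) {
        try {
            XPath xPath = XPathFactory.newInstance().newXPath();
            return xPath.evaluate(expression, new InputSource(new StringReader(XML)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Bad XPath expression: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Position 1 is the first contact
        System.out.println("[1] -> '" + evaluate("//contact[1]/email") + "'");
        // Position 0 matches nothing, so the string result is empty
        System.out.println("[0] -> '" + evaluate("//contact[0]/email") + "'");
    }
}
```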
Using XPath
We can use the XPath syntax to retrieve any information we like from the Document
XPath is a new(ish) specification and initially it was only available via Java extensions
As of version 1.5 of the JDK, Java now natively supports XPath
Search Using XPath
// Create the XPath object
XPath xPath = XPathFactory.newInstance().newXPath();
// Execute the XPath expression to get the contact that matches the email address
Node contact = (Node) xPath.evaluate("//contact[email='[email protected]']", doc, XPathConstants.NODE);
// Evaluate relative expressions against the matching contact to read each field as a String
String firstname = xPath.evaluate(".//firstname", contact);
String lastname = xPath.evaluate(".//lastname", contact);
String workplace = xPath.evaluate(".//workplace", contact);
String email = xPath.evaluate(".//email", contact);
XMLHelper.java
package xmlhelper;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
So JSP has access to a set of packages which include DOM and XPath
getNodeListXPath()
public static NodeList getNodeListXPath(String expression, Document target) throws XPathExpressionException {
    // Create the XPath object
    XPath xPath = XPathFactory.newInstance().newXPath();
    // Execute the XPath expression, asking for a node set
    NodeList nodeList = (NodeList) xPath.evaluate(expression, target, XPathConstants.NODESET);
    // Return the resulting node list
    return nodeList;
}
getBooleanXPath()
public static boolean getBooleanXPath(String expression, Document target) throws XPathExpressionException {
    // Create the XPath object
    XPath xPath = XPathFactory.newInstance().newXPath();
    // Execute the XPath expression, asking for a boolean
    Boolean nodeBoolean = (Boolean) xPath.evaluate(expression, target, XPathConstants.BOOLEAN);
    // Return the resulting boolean
    return nodeBoolean.booleanValue();
}
getNumberXPath()
public static double getNumberXPath(String expression, Document target) throws XPathExpressionException {
    // Create the XPath object
    XPath xPath = XPathFactory.newInstance().newXPath();
    // Execute the XPath expression, asking for a number
    Double nodeNumber = (Double) xPath.evaluate(expression, target, XPathConstants.NUMBER);
    // Return the resulting number
    return nodeNumber.doubleValue();
}
getStringXPath()
public static String getStringXPath(String expression, Document target) throws XPathExpressionException {
    // Create the XPath object
    XPath xPath = XPathFactory.newInstance().newXPath();
    // Execute the XPath expression, asking for a string
    String nodeText = (String) xPath.evaluate(expression, target, XPathConstants.STRING);
    // Return the resulting text
    return nodeText;
}
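The four helper methods differ only in the return type they request from XPath. As a sketch of how such helpers might be called, the self-contained class below uses a single generic method in their place so that it runs on its own. XmlHelperDemo, parse(), query() and the contact data are all assumptions made for this example; in the web application you would call the XMLHelper methods instead.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class standing in for XMLHelper
public class XmlHelperDemo {

    // Placeholder contacts, not real people or addresses
    static final String XML = "<contacts>"
            + "<contact><name><firstname>Ada</firstname><lastname>Example</lastname></name>"
            + "<workplace>Example University</workplace><email>ada@example.org</email></contact>"
            + "<contact><name><firstname>Bob</firstname><lastname>Sample</lastname></name>"
            + "<workplace>Sample College</workplace><email>bob@example.org</email></contact>"
            + "</contacts>";

    // Builds a DOM Document from XML text
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the XML", e);
        }
    }

    // Evaluates the expression and returns the result in the requested XPath type
    static Object query(String expression, Document target, QName returnType) {
        try {
            return XPathFactory.newInstance().newXPath().evaluate(expression, target, returnType);
        } catch (Exception e) {
            throw new IllegalStateException("Bad XPath expression: " + expression, e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        // STRING: the workplace of the contact with a given email address
        String workplace = (String) query(
                "//contact[email='ada@example.org']/workplace", doc, XPathConstants.STRING);
        // NUMBER: how many contacts are stored
        Double total = (Double) query("count(//contact)", doc, XPathConstants.NUMBER);
        // BOOLEAN: is there any contact with this address?
        Boolean known = (Boolean) query(
                "boolean(//contact[email='nobody@example.org'])", doc, XPathConstants.BOOLEAN);
        // NODESET: every firstname element in the document
        NodeList firstNames = (NodeList) query("//firstname", doc, XPathConstants.NODESET);

        System.out.println("workplace = " + workplace);
        System.out.println("total = " + total);
        System.out.println("known = " + known);
        System.out.println("firstnames found = " + firstNames.getLength());
    }
}
```

Note that XPath numbers always come back as a Double, even for whole-number results such as count().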
Summary
XML is important as it offers:
  Neutral data exchange
  Can be built into a web application
  Can be searched for content using XPath
Java (and therefore JSP) can use XML and XPath
Used widely in enterprise-scale information systems
No lecture next week; instead there is the first revision session in preparation for the module class test (short exam)