dom interfaces
8/19/2019 DOM Interfaces
DOM interfaces
The DOM defines several Java interfaces. Here are the most common interfaces:
• Node - The base datatype of the DOM.
• Element - The vast majority of the objects you'll deal with are Elements.
• Attr - Represents an attribute of an element.
• Text - The actual content of an Element or Attr.
• Document - Represents the entire XML document. A Document object is often referred to as a DOM tree.
Common DOM methods
When you are working with the DOM, there are several methods you'll use often:
• Document.getDocumentElement() - Returns the root element of the document.
• Node.getFirstChild() - Returns the first child of a given Node.
• Node.getLastChild() - Returns the last child of a given Node.
• Node.getNextSibling() - Returns the next sibling of a given Node.
• Node.getPreviousSibling() - Returns the previous sibling of a given Node.
• Element.getAttribute(attrName) - For a given Element, returns the value of the attribute with the requested name. (This method is defined on Element, not on Node.)
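The methods listed above can be combined as in the following minimal sketch. The class name DomNavigationSketch, the helpers parse and childElementNames, and the inline XML string are assumptions made for illustration; they are not part of the original material.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class DomNavigationSketch {

    // Hypothetical helper: parses an XML string into a DOM tree.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Walks the children with getFirstChild()/getNextSibling() and keeps only
    // element nodes (whitespace between tags shows up as Text nodes).
    static List<String> childElementNames(Element parent) {
        List<String> names = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                names.add(n.getNodeName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Illustrative document with no whitespace between tags.
        String xml = "<class><student rollno=\"1\"><firstname>A</firstname>"
                   + "<lastname>B</lastname></student></class>";
        Document doc = parse(xml);
        Element root = doc.getDocumentElement();
        Element student = (Element) root.getFirstChild();
        System.out.println(root.getNodeName());              // class
        System.out.println(student.getAttribute("rollno"));  // 1
        System.out.println(childElementNames(student));      // [firstname, lastname]
        System.out.println(student.getLastChild().getPreviousSibling().getNodeName()); // firstname
    }
}
```

In a pretty-printed file, getFirstChild() often returns a whitespace Text node rather than an element, which is why the helper checks the node type before using a child.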
Steps to Using DOM
Following are the steps used while parsing a document using a DOM parser.
• Import XML-related packages
• Create a DocumentBuilder
• Create a Document from a file or stream
• Extract the root element
• Examine attributes
• Examine sub-elements
Import XML-related packages
   import org.w3c.dom.*;
   import javax.xml.parsers.*;
   import java.io.*;
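Demo Example
Here is the input xml file we need to parse (element values and the remainder of the file were lost in this copy and are shown as ...):
   <?xml version="1.0"?>
   <class>
      <student rollno="...">
         <firstname>...</firstname>
         <lastname>...</lastname>
         <nickname>...</nickname>
         <marks>...</marks>
      </student>
      <student rollno="...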
   public class DomParserDemo {
      public static void main(String[] args) {
         try {
            File inputFile = new File("input.txt");
            DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
            Document doc = dBuilder.parse(inputFile);
            // Merge adjacent text nodes so each element's text reads as one piece
            doc.getDocumentElement().normalize();
            System.out.println("Root element :" + doc.getDocumentElement().getNodeName());
            NodeList nList = doc.getElementsByTagName("student");
            System.out.println("----------------------------");
            for (int temp = 0; temp < nList.getLength(); temp++) {
               Node nNode = nList.item(temp);
               System.out.println("\nCurrent Element :" + nNode.getNodeName());
               if (nNode.getNodeType() == Node.ELEMENT_NODE) {
                  Element eElement = (Element) nNode;
                  System.out.println("Student roll no : "
                     + eElement.getAttribute("rollno"));
                  System.out.println("First Name : "
                     + eElement.getElementsByTagName("firstname").item(0).getTextContent());
                  System.out.println("Last Name : "
                     + eElement.getElementsByTagName("lastname").item(0).getTextContent());
                  System.out.println("Nick Name : "
                     + eElement.getElementsByTagName("nickname").item(0).getTextContent());
                  System.out.println("Marks : "
                     + eElement.getElementsByTagName("marks").item(0).getTextContent());
               }
            }
         } catch (Exception e) {
            e.printStackTrace();
         }
      }
   }
This would produce the following result (values and some lines were lost in this copy and are shown as ...):
   Root element :class
   ----------------------------

   Current Element :student
   Student roll no : ...
   ...

   Current Element :student
   Student roll no : ...
   ...
SAX (the Simple API for XML) is an event-based parser for XML documents. Unlike a DOM parser, a SAX parser creates no parse tree. SAX is a streaming interface for XML, which means that applications using SAX receive event notifications about the XML document being processed, one element and attribute at a time, in sequential order, starting at the top of the document and ending with the closing of the root element.
• Reads an XML document from top to bottom, recognizing the tokens that make up a well-formed XML document
• Tokens are processed in the same order that they appear in the document
• Reports to the application program the nature of the tokens that the parser has encountered as they occur
• The application program provides an "event" handler that must be registered with the parser
• As the tokens are identified, callback methods in the handler are invoked with the relevant information
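The event flow described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the class name SaxEventLogSketch, the nested LoggingHandler, and the sample XML string are assumptions. The handler extends DefaultHandler (from org.xml.sax.helpers) and is handed to the parser, which then invokes its callbacks in document order.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxEventLogSketch {

    // Hypothetical handler: records each callback in the order the parser fires it.
    static class LoggingHandler extends DefaultHandler {
        final List<String> events = new ArrayList<>();

        @Override
        public void startDocument() { events.add("startDocument"); }

        @Override
        public void endDocument() { events.add("endDocument"); }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            events.add("start:" + qName);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            events.add("end:" + qName);
        }
    }

    // Parses the XML string and returns the recorded event sequence.
    static List<String> events(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        LoggingHandler handler = new LoggingHandler();
        parser.parse(new InputSource(new StringReader(xml)), handler); // handler registered here
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(events("<class><student/><student/></class>"));
        // [startDocument, start:class, start:student, end:student,
        //  start:student, end:student, end:class, endDocument]
    }
}
```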
When to use?
You should use a SAX parser when:
• You can process the XML document in a linear fashion from the top down
• The document is not deeply nested
• You are processing a very large XML document whose DOM tree would consume too much memory. Typical DOM implementations use ten bytes of memory to represent one byte of XML
• The problem to be solved involves only part of the XML document
• Data is available as soon as it is seen by the parser, so SAX works well for an XML document that arrives over a stream
Disadvantages of SAX
• We have no random access to an XML document since it is processed in a forward-only manner
• If you need to keep track of data the parser has seen or change the order of items, you must write the code and store the data on your own
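The second point can be illustrated with a small sketch in which the handler itself remembers where it is in the document. The class name SaxStateSketch, the handler FirstNameHandler, and the element names in the sample XML are assumptions chosen for illustration. The handler keeps a stack of open elements and a text buffer, because SAX offers no way to look back at a parent or at earlier text.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxStateSketch {

    // Hypothetical handler: collects <firstname> text, but only inside <student>.
    static class FirstNameHandler extends DefaultHandler {
        final Deque<String> openElements = new ArrayDeque<>(); // our own "where am I" state
        final StringBuilder text = new StringBuilder();        // text of the current element
        final List<String> firstNames = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            openElements.push(qName);
            text.setLength(0);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            openElements.pop();
            // After the pop, the top of the stack is the parent of the element just closed.
            if ("firstname".equals(qName) && "student".equals(openElements.peek())) {
                firstNames.add(text.toString().trim());
            }
        }
    }

    static List<String> studentFirstNames(String xml) throws Exception {
        FirstNameHandler handler = new FirstNameHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.firstNames;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<class><student><firstname>Ann</firstname></student>"
                   + "<teacher><firstname>Bob</firstname></teacher></class>";
        System.out.println(studentFirstNames(xml)); // [Ann]
    }
}
```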
ContentHandler Interface
This interface specifies the callback methods that the SAX parser uses to notify an application program of the components of the XML document that it has seen.
• void startDocument() - Called at the beginning of a document.
• void endDocument() - Called at the end of a document.
• void startElement(String uri, String localName, String qName, Attributes atts) - Called at the beginning of an element.
• void endElement(String uri, String localName, String qName) - Called at the end of an element.
• void characters(char[] ch, int start, int length) - Called when character data is encountered.
• void ignorableWhitespace(char[] ch, int start, int length) - Called when a DTD is present and ignorable whitespace is encountered.
• void processingInstruction(String target, String data) - Called when a processing instruction is recognized.
• void setDocumentLocator(Locator locator) - Provides a Locator that can be used to identify positions in the document.
• void skippedEntity(String name) - Called when an unresolved entity is encountered.
• void startPrefixMapping(String prefix, String uri) - Called when a new namespace mapping is defined.
• void endPrefixMapping(String prefix) - Called when a namespace definition ends its scope.
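Some of the less common callbacks in the list above can be seen in action in the following sketch. The class name ContentHandlerSketch, the Recorder handler, and the sample document (with a made-up "note" processing instruction and the namespace "urn:x") are assumptions for illustration. The factory is made namespace-aware, which is assumed here to be what makes the prefix-mapping callbacks and localName values available.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class ContentHandlerSketch {

    // Hypothetical handler: records selected ContentHandler callbacks.
    static class Recorder extends DefaultHandler {
        final List<String> events = new ArrayList<>();
        final StringBuilder text = new StringBuilder();

        @Override
        public void processingInstruction(String target, String data) {
            events.add("pi:" + target + "=" + data);
        }

        @Override
        public void startPrefixMapping(String prefix, String uri) {
            events.add("prefix:" + prefix + "=" + uri);
        }

        @Override
        public void endPrefixMapping(String prefix) {
            events.add("endPrefix:" + prefix);
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            events.add("start:{" + uri + "}" + localName);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // The parser may deliver one run of text in several calls, so append.
            text.append(ch, start, length);
        }
    }

    static Recorder record(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        Recorder recorder = new Recorder();
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), recorder);
        return recorder;
    }

    public static void main(String[] args) throws Exception {
        Recorder r = record("<?note hello?><r xmlns:p=\"urn:x\"><p:a>hi</p:a></r>");
        System.out.println(r.events);
        System.out.println(r.text);
    }
}
```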
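Attributes Interface
This interface specifies methods for processing the attributes connected to an element.
• int getLength() - Returns the number of attributes.
• String getQName(int index) - Returns the qualified name of the attribute at the given index.
• String getValue(int index) - Returns the value of the attribute at the given index.
• String getValue(String qName) - Returns the value of the attribute with the given qualified name.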
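As a minimal sketch of how these methods are typically used inside startElement, the following example copies an element's attributes into a map and also looks one up by name. The class name SaxAttributesSketch, the helper methods, and the sample XML are assumptions for illustration.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxAttributesSketch {

    // Collects all attributes of the named element using the index-based accessors.
    static Map<String, String> attributesOf(String xml, String elementName) throws Exception {
        Map<String, String> found = new LinkedHashMap<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if (!qName.equals(elementName)) {
                    return;
                }
                for (int i = 0; i < atts.getLength(); i++) {
                    found.put(atts.getQName(i), atts.getValue(i));
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return found;
    }

    // Looks up a single attribute by qualified name; the result is null if it is absent.
    static String attributeValue(String xml, String elementName, String attrName) throws Exception {
        String[] holder = new String[1];
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if (qName.equals(elementName) && holder[0] == null) {
                    holder[0] = atts.getValue(attrName);
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return holder[0];
    }

    public static void main(String[] args) throws Exception {
        String xml = "<class><student rollno=\"7\" house=\"red\"/></class>";
        System.out.println(attributesOf(xml, "student"));
        System.out.println(attributeValue(xml, "student", "rollno")); // 7
    }
}
```

The Attributes object is only meant to be read during the startElement call, which is why the values are copied out rather than the object being stored.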
DOM
   import java.io.*;
   import javax.xml.parsers.*;
   import org.w3c.dom.*;
   import org.xml.sax.*;

   public class parsing_DOMDemo {
      public static void main(String[] args) {
         try {
            System.out.println("enter the name of XML document");
            BufferedReader input = new BufferedReader(new InputStreamReader(System.in));
            String file_name = input.readLine();
            File fp = new File(file_name);
            if (fp.exists()) {
               try {
                  DocumentBuilderFactory Factory_obj = DocumentBuilderFactory.newInstance();
                  DocumentBuilder builder = Factory_obj.newDocumentBuilder();
                  // InputSource takes a system ID (URI), so convert the file path first
                  InputSource ip_src = new InputSource(fp.toURI().toString());
                  // parse() throws if the document is not well-formed
                  Document doc = builder.parse(ip_src);
                  System.out.println(file_name + " is well-formed.");
               } catch (Exception e) {
                  System.out.println(file_name + " is not well-formed.");
                  System.exit(1);
               }
            } else {
               System.out.println("file not found: " + file_name);
            }
         } catch (IOException ex) {
            ex.printStackTrace();
         }
      }
   }
SAX (Simple API for XML)
   import java.io.*;
   import org.xml.sax.*;
   import org.xml.sax.helpers.*;

   public class parsing_SAXDemo {
      public static void main(String[] args) throws IOException {
         try {
            System.out.println("enter the name of XML document");
            BufferedReader input = new BufferedReader(new InputStreamReader(System.in));
            String file_name = input.readLine();
            File fp = new File(file_name);
            if (fp.exists()) {
               try {
                  // XMLReaderFactory is deprecated since Java 9; SAXParserFactory is the suggested replacement
                  XMLReader reader = XMLReaderFactory.createXMLReader();
                  // parse(String) expects a system ID (URI); it throws if the document is not well-formed
                  reader.parse(fp.toURI().toString());
                  System.out.println(file_name + " is well-formed.");
               } catch (Exception e) {
                  System.out.println(file_name + " is not well-formed.");
                  System.exit(1);
               }
            } else {
               System.out.println("file not found: " + file_name);
            }
         } catch (IOException ex) {
            ex.printStackTrace();
         }
      }
   }
DOM                                              | SAX
Stores the entire XML document into memory       | Parses node by node
before processing                                |
Occupies more memory                             | Doesn't store the XML in memory
We can insert or delete nodes                    | We can't insert or delete a node
DOM is a tree model parser                       | SAX is an event based parser
Document Object Model (DOM) API                  | SAX is a Simple API for XML
Preserves comments                               | Doesn't preserve comments
Generally runs a little slower than SAX          | SAX generally runs a little faster than DOM
Traverse in any direction                        | Top to bottom traversing is done in this approach
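The "insert or delete nodes" row can be illustrated with a short sketch of editing an in-memory DOM tree, something a SAX handler has no tree to do on. The class name DomEditSketch, its helper methods, and the sample XML are assumptions for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class DomEditSketch {

    // Hypothetical helper: parses an XML string into a DOM tree.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Lists the rollno attribute of every <student>, in document order.
    static List<String> rollNumbers(Document doc) {
        List<String> out = new ArrayList<>();
        NodeList students = doc.getElementsByTagName("student");
        for (int i = 0; i < students.getLength(); i++) {
            out.add(((Element) students.item(i)).getAttribute("rollno"));
        }
        return out;
    }

    // Deletes the first <student> (if any) and appends a new one to the root element.
    static Document replaceFirstStudent(Document doc, String newRollNo) {
        Node first = doc.getElementsByTagName("student").item(0);
        if (first != null) {
            first.getParentNode().removeChild(first);   // delete a node
        }
        Element added = doc.createElement("student");
        added.setAttribute("rollno", newRollNo);
        doc.getDocumentElement().appendChild(added);    // insert a node
        return doc;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<class><student rollno=\"1\"/><student rollno=\"2\"/></class>");
        System.out.println(rollNumbers(doc));                           // [1, 2]
        System.out.println(rollNumbers(replaceFirstStudent(doc, "9"))); // [2, 9]
    }
}
```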